A Text Classification-based Approach for Evaluating and Enhancing the Machine Interpretability of Building Codes

Engineering Applications of Artificial Intelligence, 2023

Citation: Zheng, Z., Zhou, Y.C., Chen, K.Y., Lu, X.Z., She, Z.T., Lin, J.R.* (2024). A Text Classification-based Approach for Evaluating and Enhancing the Machine Interpretability of Building Codes. Engineering Applications of Artificial Intelligence, 127, 107207. doi: 10.1016/j.engappai.2023.107207 http://doi.org/10.1016/j.engappai.2023.107207

Abstract

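Automatically parsing building and engineering codes and standards into a computer-processable format is essential for the intelligent design and construction of buildings and infrastructure. Although automated rule interpretation (ARI) methods have been studied for years, most still rely heavily on manual screening of the clauses that a computer can interpret. Moreover, almost no work has examined the machine interpretability of individual clauses or of code documents as a whole, that is, how readily a single clause or an entire document can be converted into a computer-processable format. This study therefore proposes a new method, based on a domain-specific large pretrained model, for automatically evaluating and enhancing the machine interpretability of both individual clauses and whole code documents. First, taking the requirements of rule interpretation into account, the study defines classification rules for code clauses and builds a corresponding training dataset. Next, it builds an efficient clause classification algorithm on top of the team's domain-specific pretrained model and transfer learning. On this basis, the study proposes, for the first time, a method for quantitatively evaluating the interpretability of code text. The results show that the proposed text classification algorithm outperforms existing CNN- or RNN-based methods, raising the F1 score from 72.16% to 93.60%, and that the proposed classification method effectively enhances downstream ARI algorithms, improving their accuracy by 4%. In addition, an analysis of more than 150 Chinese building and engineering codes shows that the average interpretability of a code document as a whole is only 34.40%, which means that fully converting an entire regulatory document into a computer-processable format remains very challenging. Combined innovation is urgently needed on both the human side (observing certain constraints when writing building codes) and the machine side (developing more powerful algorithms and tools) to further advance code digitization technology that computers can fully interpret and reason over.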
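As a rough illustration of how clause-level classification results could be rolled up into a document-level score, the sketch below computes the share of clauses whose predicted category is treated as machine-interpretable. This is only an assumption made for illustration: the function name `document_interpretability`, the label names, and the unweighted ratio are hypothetical and are not taken from the paper, whose actual categories and scoring formula are not given on this page.

```python
from collections import Counter
from typing import Iterable, Set


def document_interpretability(labels: Iterable[str], interpretable: Set[str]) -> float:
    """Return the fraction of clauses whose predicted label is in `interpretable`.

    Hypothetical, unweighted score; the paper's own metric may differ.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # An empty document has no clauses to interpret.
        return 0.0
    hits = sum(n for label, n in counts.items() if label in interpretable)
    return hits / total


if __name__ == "__main__":
    # Placeholder predictions for five clauses of one code document.
    predicted = [
        "interpretable",
        "not_interpretable",
        "interpretable",
        "not_interpretable",
        "not_interpretable",
    ]
    score = document_interpretability(predicted, {"interpretable"})
    print(f"{score:.2%}")
```

In practice the `labels` would come from a trained clause classifier, and the set of categories counted as interpretable would follow whatever classification rules the study defines.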

graphical abstract

Paper download link

Preprint download link

The authors are grateful for the financial support received from the 2023 Open Selection Project of Tsinghua University-Tsingshang Joint Institute for Smart Scene Innovation Design and the National Natural Science Foundation of China (no. 52378306 and no. 51908323).

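Related project funding:

2020.1-2022.12: Performance-based fire protection design review method integrating knowledge reasoning and performance simulation

2024.1-2027.12: Mechanisms and optimization methods for identifying spatiotemporal conflicts in complex mechanical, electrical, and plumbing (MEP) pipelines, driven by the fusion of data and knowledge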

Jia-Rui Lin (林佳瑞)
