Standard-Driven Chinese Knowledge Extraction in Highway Domains Using Machine Learning and NLP Approach
2024 Proceedings of the ASCE International Conference on Computing in Civil Engineering (i3CE 2024), 2025
Recommended citation: Ching, W.L., Zhang, X.B., Lin, J.R., Hu, Z.Z.* (2025). Standard-Driven Chinese Knowledge Extraction in Highway Domains Using Machine Learning and NLP Approach. Proceedings of the ASCE International Conference on Computing in Civil Engineering (i3CE 2024), 96-106. Pittsburgh, PA, USA. doi: 10.1061/9780784486115.010 http://doi.org/10.1061/9780784486115.010
Abstract
Advances in informatics have produced numerous data-driven strategies for improving transportation efficiency. However, data exchange between transportation systems is often hindered by system complexity, ambiguous definitions, and heterogeneous data sets. To address these issues, this article introduces a method for building an automatic knowledge extraction model for highway standards. Machine learning models, including BiLSTM-CRF, TextCNN-BiLSTM-CRF, BERT, and BERT-CRF, are applied to small training sets for two tasks: Named Entity Recognition, with ISO 12006-3 as the upper-level ontology, and relationship classification. Additionally, post-prediction and manual corrections are used to refine the training data sets for iterative learning. The best results are formatted as graphs and saved as OWL ontologies. The approach yielded 158 graphs from Chinese standards, linked via referenced ISO 12006-3 classes. The outcome links highway domain concepts and shows promise for better data management and interdisciplinary collaboration in highway projects, furthering data-driven progress in the field.
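
The final step described in the abstract, formatting extracted results as graphs and saving them as OWL ontologies, can be illustrated with a minimal sketch. It is not the authors' implementation. The function `to_turtle`, the `hw:` and `iso:` namespaces, the example entities and relation, and the upper-level class names (`xtdSubject`, `xtdActivity`, modeled on the xtd naming of ISO 12006-3) are all assumptions made for illustration. The sketch uses only the Python standard library and writes Turtle syntax by hand.

```python
# Illustrative sketch only. Namespaces, entity names, relation names, and the
# upper-level class labels are hypothetical and not taken from the paper.
BASE = "http://example.org/highway#"      # hypothetical domain namespace
UPPER = "http://example.org/iso12006-3#"  # placeholder, not an official IRI


def to_turtle(entities, relations):
    """Serialize typed entities and relations as a small OWL graph in Turtle.

    entities:  dict mapping an entity name to its upper-level class name
    relations: list of (subject, property, object) tuples between entities
    """
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        f"@prefix hw: <{BASE}> .",
        f"@prefix iso: <{UPPER}> .",
        "",
    ]
    # Declare each upper-level class once.
    for upper in sorted(set(entities.values())):
        lines.append(f"iso:{upper} a owl:Class .")
    # Each recognized entity becomes a class under its upper-level class.
    for name, upper in sorted(entities.items()):
        lines.append(f"hw:{name} a owl:Class ; rdfs:subClassOf iso:{upper} .")
    # Each distinct relation type becomes an object property.
    for prop in sorted({p for _, p, _ in relations}):
        lines.append(f"hw:{prop} a owl:ObjectProperty .")
    # Each classified relation becomes an existential restriction on the subject.
    for s, p, o in relations:
        lines.append(
            f"hw:{s} rdfs:subClassOf [ a owl:Restriction ; "
            f"owl:onProperty hw:{p} ; owl:someValuesFrom hw:{o} ] ."
        )
    return "\n".join(lines) + "\n"


# Hypothetical output of the entity recognition and relation classification steps.
entities = {
    "Pavement": "xtdSubject",
    "BaseCourse": "xtdSubject",
    "Compaction": "xtdActivity",
}
relations = [("Pavement", "hasPart", "BaseCourse")]

ttl = to_turtle(entities, relations)
print(ttl)
```

Because every domain class is attached to a shared upper-level class, graphs produced from different standards can be connected through those shared classes, which is the linking idea the abstract describes.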

The authors are grateful for the financial support received from the National Key R&D Program of China (No. 2022YFC3801100).