Disaster Tweets Classification Method based on Pretrained BERT Model

Journal of Graphics, 2022

Recommended citation: Lin, J.R.*, Cheng, Z.G., Han, Y., Yin, Y.P. (2022). Disaster Tweets Classification Method based on Pretrained BERT Model. Journal of Graphics, 43(3), 530-536. doi: 10.11996/JG.j.2095-302X.2022030530 http://www.txxb.com.cn/EN/10.11996/JG.j.2095-302X.2022030530

Abstract

Social media has become an important medium for releasing and disseminating disaster information, and identifying and using that information effectively is of great significance to disaster emergency management. To address the shortcomings of traditional text classification models, this study proposes a disaster tweet classification method based on the pretrained bidirectional encoder representations from transformers (BERT) model. After data cleaning and preprocessing, a text classification model that builds on BERT with a long short-term memory-convolutional neural network (LSTM-CNN) was constructed through comparative analysis. Experiments on the tweet datasets of the Kaggle competition platform showed that the proposed model outperforms both the traditional naive Bayes classifier and the common fine-tuning model, with a recognition rate of up to 85%. The results can help improve the accuracy of identifying real disaster information and the efficiency of disaster emergency response.
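The abstract mentions data cleaning and preprocessing but does not list the steps. As a rough illustration only, the sketch below shows one common way to normalize raw tweets before tokenization. The function name `clean_tweet`, the regular expressions, and the choice of steps (dropping links and mentions, keeping hashtag words, lowercasing) are assumptions for this example, not the procedure used in the paper.

```python
import html
import re

# Hypothetical patterns chosen for illustration; the paper's rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
NON_TEXT_RE = re.compile(r"[^a-z0-9\s']")
SPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Normalize one raw tweet into lowercase, space-separated words."""
    text = html.unescape(text)         # decode entities such as "&amp;"
    text = URL_RE.sub(" ", text)       # drop links
    text = MENTION_RE.sub(" ", text)   # drop @user mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    text = text.lower()
    text = NON_TEXT_RE.sub(" ", text)  # drop punctuation, emoji, other symbols
    return SPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "Forest fire near La Ronge Sask. Canada http://t.co/abc #wildfire @user &amp; more!"
    print(clean_tweet(raw))
```

Cleaned text of this kind would then be passed to a tokenizer and classifier. How aggressive the cleaning should be is a design choice: pretrained models such as BERT can often use casing and punctuation, so lighter cleaning may be preferable in practice.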


The authors are grateful for the financial support received from the National Natural Science Foundation of China (No. 72091512, No. 51908323).

Financial Sources:

2020.1-2022.12: Automatic Compliance Checking of Performance-based Fire Protection Design by Integrating Reasoning and Simulation

2021.1-2025.12: Resilience Assessment and Management of City Infrastructures


Named Entity Recognition for Automatic Compliance Checking

Published in the 7th National Conference on Building Information Modeling, 2021

In this work, we developed several semantic labels and regulatory datasets for automatic compliance checking, and used them to introduce a deep learning-based named entity recognition algorithm for rule interpretation.

Pretrained Domain-Specific Language Model for Natural Language Processing Tasks in the AEC Domain

Published in Computers in Industry, 2022

This research develops the first domain corpora and proposes the first domain-specific pretrained language model for AEC. Experiments showed that the proposed model outperforms existing methods in all typical NLP tasks, with a maximum improvement of 8.1% in F1-score.

Integrating NLP and Context-Free Grammar for Complex Rule Interpretation towards Automated Compliance Checking

Published in Computers in Industry, 2022

This research integrates natural language processing (NLP) and context-free grammar (CFG) to propose a novel generalized rule interpretation approach, which outperforms the state-of-the-art methods and achieves accuracies of 99.6% and 91.0% for parsing single- and multi-requirement sentences, respectively. This research also publishes the first regulation dataset for future exploration, validation, and benchmarking in the ARC area.

Jia-Rui Lin