
A visual–textual fused approach to automated tagging of flood-related tweets during a flood event

Journal contribution, posted 2019-10-09. Authors: Xiao Huang, Cuizhen Wang, Zhenlong Li, Huan Ning.

In recent years, social media platforms such as Twitter have received much attention as a new data source for rapid flood awareness. The timely response and large coverage provided by citizen sensors significantly compensate for the limitations of non-timely remote sensing data and spatially isolated river gauges. However, automatically extracting flood tweets from a massive tweet pool remains a challenge. Taking the 2017 Houston Flood as a case study, this paper presents an automated flood-tweet extraction approach that mines both the visual and the textual information a tweet contains. A CNN architecture was designed to classify the visual content of pictures posted during the Houston Flood. A sensitivity test was then applied to extract flood-sensitive keywords, which were further used to refine the CNN classification results. A duplication test was finally performed to trim the database by removing duplicated pictures, producing the flood tweet pool for the event. The results indicate that coupling CNN classification results with flood-sensitive words in tweets yields a significant increase in precision while keeping the recall rate at a high level. Eliminating tweets containing duplicated pictures further contributes to higher spatio-temporal relevance to the flood.
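To make the visual-textual fusion concrete, below is a minimal Python sketch of the three-stage pipeline the abstract describes: a CNN score for the attached picture, a flood-sensitive keyword check on the text, and a duplication test on the picture. The keyword list, the score threshold, the Tweet fields, and the hash-based duplicate check are all illustrative assumptions; the paper's actual CNN architecture, sensitivity test, and duplication test are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical flood-sensitive keyword list; the paper derives its list
# from a sensitivity test, which is not reproduced here.
FLOOD_KEYWORDS = {"flood", "flooding", "flooded", "rescue", "water level"}


@dataclass
class Tweet:
    text: str
    image_score: float         # assumed CNN probability that the picture shows flooding
    image_hash: Optional[str]  # assumed perceptual hash of the picture; None if no picture


def contains_flood_keyword(text: str) -> bool:
    """Textual refinement: does the tweet mention a flood-sensitive keyword?"""
    lowered = text.lower()
    return any(kw in lowered for kw in FLOOD_KEYWORDS)


def extract_flood_tweets(tweets: List[Tweet], cnn_threshold: float = 0.5) -> List[Tweet]:
    """Keep tweets whose picture the CNN labels as flood AND whose text
    contains a flood-sensitive keyword, then drop tweets whose pictures
    duplicate one already kept (same hash). The AND coupling and the
    0.5 threshold are illustrative choices, not the paper's settings."""
    seen_hashes = set()
    kept = []
    for tw in tweets:
        if tw.image_score < cnn_threshold:
            continue  # picture not classified as flood by the CNN
        if not contains_flood_keyword(tw.text):
            continue  # keyword refinement rejects the tweet
        if tw.image_hash is not None:
            if tw.image_hash in seen_hashes:
                continue  # duplication test: picture already in the pool
            seen_hashes.add(tw.image_hash)
        kept.append(tw)
    return kept


# Example usage with made-up tweets:
pool = [
    Tweet("Streets flooded near downtown Houston", 0.91, "a1b2"),
    Tweet("Great tacos today!", 0.88, "c3d4"),         # no flood keyword -> dropped
    Tweet("Flooding again, stay safe", 0.90, "a1b2"),  # duplicated picture -> dropped
]
print(len(extract_flood_tweets(pool)))  # prints 1
```

Requiring both signals mirrors the abstract's finding: the keyword filter raises precision over the CNN alone, while the CNN keeps recall high by catching flood pictures; the final hash-based deduplication is one simple way to realize the duplication test.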
