TY - JOUR
T1 - Evaluating the application of K-mean clustering in Earthquake vulnerability mapping of Istanbul, Turkey
AU - Shafapourtehrany, Mahyat
AU - Yariyan, Peyman
AU - Özener, Haluk
AU - Pradhan, Biswajeet
AU - Shabani, Farzin
PY - 2022/9
Y1 - 2022/9
N2 - Performing the most up-to-date and accurate vulnerability assessment is key to effective earthquake disaster management. In cities like Istanbul (Turkey) with a high rate of urban expansion, the safety of the residents must not be neglected. The challenges in such studies are related to the lack of a training dataset: some areas are highly prone to earthquakes, yet no earthquakes have occurred there recently. This research proposes and tests the ability of the k-means clustering method to create the training dataset for earthquake vulnerability analysis. Subsequently, the derived sample dataset was used in four state-of-the-art models, i.e. Decision Tree (DT), Support Vector Machine (SVM), Self-Organizing Map (SOM) and Logistic Regression (LR), for assessing earthquake vulnerability in Istanbul, Turkey. Multicollinearity among the variables was assessed using tolerance (TOL) and variance inflation factor (VIF), which revealed no multicollinearity. The highest VIF belonged to the “distance to faults” factor. Vulnerability-related variables were classified and weighted, and a training database was constructed using k-means clustering. Then, the standardized variables were used as input alongside the training site maps in DT, SVM, SOM and LR to construct an Earthquake Vulnerability Map (EVM). EVMs were created for all four samples and graded as very-low, relatively-low, moderate, high, or extremely-high. Several statistical metrics such as area under the ROC curve (AUC), sensitivity (SST), specificity (SPF), root-mean-squared error (RMSE), positive predictive value (PPV), and negative predictive value (NPV) were used to evaluate the accuracy of the resultant maps. The highest and lowest AUC prediction rates were 0.962 and 0.912 from the K-means-SOM and K-means-LR models, respectively. The lowest RMSE on the testing dataset (0.329) belonged to the K-means-SVM model. The most vulnerable areas of the region were found in districts 9, 13, 20, 21 and 35. Finally, an analysis of the buildings and population distribution was carried out among the 39 districts of Istanbul considering the SOM outcomes. The research outcome could help in laying out strategies for earthquake preparedness in the city of Istanbul.
AB - Performing the most up-to-date and accurate vulnerability assessment is key to effective earthquake disaster management. In cities like Istanbul (Turkey) with a high rate of urban expansion, the safety of the residents must not be neglected. The challenges in such studies are related to the lack of a training dataset: some areas are highly prone to earthquakes, yet no earthquakes have occurred there recently. This research proposes and tests the ability of the k-means clustering method to create the training dataset for earthquake vulnerability analysis. Subsequently, the derived sample dataset was used in four state-of-the-art models, i.e. Decision Tree (DT), Support Vector Machine (SVM), Self-Organizing Map (SOM) and Logistic Regression (LR), for assessing earthquake vulnerability in Istanbul, Turkey. Multicollinearity among the variables was assessed using tolerance (TOL) and variance inflation factor (VIF), which revealed no multicollinearity. The highest VIF belonged to the “distance to faults” factor. Vulnerability-related variables were classified and weighted, and a training database was constructed using k-means clustering. Then, the standardized variables were used as input alongside the training site maps in DT, SVM, SOM and LR to construct an Earthquake Vulnerability Map (EVM). EVMs were created for all four samples and graded as very-low, relatively-low, moderate, high, or extremely-high. Several statistical metrics such as area under the ROC curve (AUC), sensitivity (SST), specificity (SPF), root-mean-squared error (RMSE), positive predictive value (PPV), and negative predictive value (NPV) were used to evaluate the accuracy of the resultant maps. The highest and lowest AUC prediction rates were 0.962 and 0.912 from the K-means-SOM and K-means-LR models, respectively. The lowest RMSE on the testing dataset (0.329) belonged to the K-means-SVM model. The most vulnerable areas of the region were found in districts 9, 13, 20, 21 and 35. Finally, an analysis of the buildings and population distribution was carried out among the 39 districts of Istanbul considering the SOM outcomes. The research outcome could help in laying out strategies for earthquake preparedness in the city of Istanbul.
KW - Earthquake vulnerability mapping
KW - GIS
KW - K-means clustering
KW - Machine learning
KW - Turkey
UR - http://www.scopus.com/inward/record.url?scp=85134614324&partnerID=8YFLogxK
U2 - 10.1016/j.ijdrr.2022.103154
DO - 10.1016/j.ijdrr.2022.103154
M3 - Article
AN - SCOPUS:85134614324
SN - 2212-4209
VL - 79
JO - International Journal of Disaster Risk Reduction
JF - International Journal of Disaster Risk Reduction
M1 - 103154
ER -