Paradox
Impact-Metrics
Indian Institute of Technology (Banaras Hindu University)
Goals:
• Identify high-risk zones.
• Build predictive models for tsunami occurrences.
• Visualize key insights for stakeholders.
Project deliverables:
- A unique identifier added to the dataset for tracking individual earthquake events
- Predictive model
- Classification of regions into risk zones (high, medium, low)
- Dashboard for analysis
Dataset Features
Methodology
Data Cleaning Steps:
• Downloaded the data and preprocessed it in Google Colab.
• Removed the year component from the date-time field.
• Removed unnecessary columns from the data set.
• One-hot encoded categorical features.
• Standardized mixed units into a single consistent format.
• Handled outliers to improve data integrity.
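Three of the cleaning steps above (unit standardization, one-hot encoding, outlier handling) can be sketched as follows. This is a minimal standard-library illustration, not the project's notebook code: the records, the field names, the metres-to-kilometres conversion, and the 1.5 × IQR clipping rule are all assumptions. In Colab the same steps would more typically be done with pandas (for example `pandas.get_dummies` for the encoding).

```python
from statistics import quantiles

# Hypothetical raw records: (magnitude type, depth value, depth unit).
records = [
    ("mw", 10.0, "km"), ("mb", 12000.0, "m"), ("mw", 15.0, "km"),
    ("ms", 18.0, "km"), ("mb", 20.0, "km"), ("mw", 22.0, "km"),
    ("ms", 25.0, "km"), ("mw", 600.0, "km"),
]


def to_km(value, unit):
    """Standardize mixed depth units into kilometres."""
    return value / 1000.0 if unit == "m" else value


def one_hot(values, prefix):
    """Turn a categorical column into one 0/1 column per category."""
    categories = sorted(set(values))
    return [
        {f"{prefix}_{cat}": int(value == cat) for cat in categories}
        for value in values
    ]


def clip_outliers(values, k=1.5):
    """Clip values to [Q1 - k*IQR, Q3 + k*IQR], one common outlier rule."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [min(max(v, low), high) for v in values]


depths_km = [to_km(depth, unit) for _, depth, unit in records]
clipped = clip_outliers(depths_km)
encoded = one_hot([mag_type for mag_type, _, _ in records], "magType")
```

Clipping is only one way to handle outliers; dropping or flagging the rows are alternatives, and very deep events may be genuine observations rather than errors.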
Classification:
• Applied DBSCAN to cluster data points by spatial density.
• Focused on spatial patterns to improve understanding of regional risk zones.
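To show what density-based clustering of epicentres involves, here is a small self-contained DBSCAN sketch using great-circle distance. It is a simplified reimplementation for illustration only; the coordinates, `eps_km`, and `min_samples` are hypothetical, and the project most likely used a library implementation such as `sklearn.cluster.DBSCAN`.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def dbscan(points, eps_km, min_samples):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical epicentres: two dense groups and one isolated event.
epicentres = [
    (38.0, 142.0), (38.1, 142.2), (38.2, 141.9),
    (-33.0, -72.0), (-33.1, -71.8), (-32.9, -72.1),
    (0.0, 0.0),
]
labels = dbscan(epicentres, eps_km=50.0, min_samples=3)
```

The two dense groups receive separate cluster labels and the isolated event is marked as noise (`-1`), which is the behaviour that makes DBSCAN suitable for finding regional concentrations without fixing the number of clusters in advance.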
Risk Zone Classification
Method:
Weighted-score classification, where each region's score combines:
• Median magnitude
• Frequency over 5 years (freq_5years)
• Major earthquake frequency (freq_major_5years)
• Ratio of occurrences (ratio_5years)
• Depth
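A weighted-score classification over these parameters might look like the sketch below. The weights, the min-max normalization, the thresholds, and the sample regions are all hypothetical, since the slides do not give the actual values; `freq_5years`, `freq_major_5years`, and `ratio_5years` are the feature names from the slide, while `median_magnitude` and `depth` are assumed column names.

```python
# Hypothetical weights; the actual values used in the project are not stated.
WEIGHTS = {
    "median_magnitude": 0.3,
    "freq_5years": 0.2,
    "freq_major_5years": 0.2,
    "ratio_5years": 0.2,
    "depth": 0.1,
}
INVERTED = {"depth"}  # assumption: shallower events count as higher risk


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def risk_scores(regions):
    """Weighted sum of normalized features, one score per region."""
    scores = [0.0] * len(regions)
    for feature, weight in WEIGHTS.items():
        scaled = min_max([region[feature] for region in regions])
        for i, s in enumerate(scaled):
            scores[i] += weight * ((1.0 - s) if feature in INVERTED else s)
    return scores


def risk_zone(score, high=0.66, medium=0.33):
    """Map a score in [0, 1] to a zone using hypothetical thresholds."""
    if score >= high:
        return "High"
    if score >= medium:
        return "Medium"
    return "Low"


# Made-up regions for illustration only.
regions = [
    {"median_magnitude": 6.5, "freq_5years": 120, "freq_major_5years": 10,
     "ratio_5years": 0.8, "depth": 15},
    {"median_magnitude": 5.5, "freq_5years": 60, "freq_major_5years": 4,
     "ratio_5years": 0.5, "depth": 60},
    {"median_magnitude": 4.5, "freq_5years": 10, "freq_major_5years": 0,
     "ratio_5years": 0.1, "depth": 150},
]
zones = [risk_zone(score) for score in risk_scores(regions)]
```

Inverting depth reflects the result that high-risk zones are associated with low depth; every other feature contributes positively to the score.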
Results:
• High Risk: Frequent, high-magnitude, shallow events with a high ratio of occurrence.
• Medium Risk: Moderate seismic activity, with mid-range values for each parameter.
• Low Risk: Infrequent or low-impact zones, with lower values for every parameter except depth.
Predictive Modeling
Models Used:
• XGBoost: Optimized gradient-boosting algorithm for high-performance tasks.
• Sequential ANN: 3-layer feedforward neural network for binary classification.
Key Features:
• Input: seismic parameters. Output: whether a tsunami will occur or not.
• Class 0 precision is 0.96 and class 1 recall is 0.92.
• Accuracy of our model: 92.8%.
Architecture:
• Input Layer: Takes features like magnitude and depth.
• Hidden Layers: 3 layers with ReLU activation for non-linear patterns.
• Output Layer: Predicts tsunami occurrence.
XGBoost:
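The per-class precision and recall figures quoted for the model are computed from true and predicted labels as sketched below. The label vectors here are toy data chosen for illustration and are not the project's predictions; class 1 is assumed to mean "tsunami" and class 0 "no tsunami".

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels: 0 = no tsunami, 1 = tsunami (assumed encoding).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]

class0_precision, class0_recall = precision_recall(y_true, y_pred, positive=0)
class1_precision, class1_recall = precision_recall(y_true, y_pred, positive=1)
overall_accuracy = accuracy(y_true, y_pred)
```

Class 0 precision answers "of the events predicted as no tsunami, how many really had none", while class 1 recall answers "of the actual tsunami events, how many were caught", which is usually the more safety-relevant number.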
ANN Model:
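The feedforward architecture described for the ANN (an input layer, three ReLU hidden layers, and a single output predicting tsunami occurrence) can be illustrated with a plain-Python forward pass. This is a sketch with random, untrained weights; the hidden layer sizes, the sigmoid output, and the two scaled input features are assumptions, and the project's model was presumably built with a deep-learning library such as Keras.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def build_network(n_features, hidden_sizes=(16, 8, 4), seed=0):
    """Random (untrained) weights for hidden layers plus one output unit."""
    rng = random.Random(seed)
    sizes = [n_features, *hidden_sizes, 1]
    network = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        network.append((weights, [0.0] * n_out))
    return network


def predict_proba(network, features):
    """Forward pass: ReLU on hidden layers, sigmoid on the output unit."""
    x = list(features)
    for weights, biases in network[:-1]:
        x = dense(x, weights, biases, relu)
    weights, biases = network[-1]
    return dense(x, weights, biases, sigmoid)[0]


network = build_network(n_features=2)
# Hypothetical scaled inputs, e.g. magnitude and depth.
probability = predict_proba(network, [0.8, 0.1])
prediction = int(probability >= 0.5)  # 1 = tsunami, 0 = no tsunami
```

Training would adjust the weights by backpropagation on labelled events; only the shape of the computation is shown here.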
Exploratory Data Analysis (EDA) & Results
EDA Techniques:
• Univariate and bivariate analysis to identify patterns.
• Correlational analysis (correlation matrix).
• Classification of regions into High, Medium, and Low risk based on key features.
Results:
• Established strong correlations between seismic parameters and tsunami occurrences.
• Identified high-risk zones through effective classification techniques.
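The correlation matrix used in the EDA can be sketched with a hand-written Pearson coefficient. The column names and values below are toy data for illustration, not the project's data set; with pandas the same matrix would normally come from `DataFrame.corr()`.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Toy columns; tsunami is a 0/1 flag.
columns = {
    "magnitude": [5.0, 5.5, 6.0, 6.5, 7.0, 7.5],
    "depth": [120.0, 90.0, 70.0, 40.0, 25.0, 10.0],
    "tsunami": [0, 0, 0, 1, 1, 1],
}
matrix = correlation_matrix(columns)
```

In this toy data magnitude correlates positively with the tsunami flag and depth correlates negatively, which is the kind of relationship the matrix is meant to surface.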
Visualization Insights
[Figures: High-Medium-Low Risk Distribution; Cluster; Magnitude]
Resources
Dashboard Link:
https://public.tableau.com/app/profile/parth.nuwal4883/viz/EarthquakeAnalysis_17354827858960/Dashboard1?publish=yes
Data Set:
https://docs.google.com/spreadsheets/d/1BXelepsoGZHqAjTq_4ug72LjmvA7_m9JFRhYGlVP8rQ/edit?usp=sharing
GitHub:
https://github.com/Parthnuwal7/Tsunami-risk-prediction-and-Risk-zone-classification/tree/main