# Disaster Response Pipeline Project

![Web app screenshot](https://github.com/msmohankumar/Disaster_Response_App/assets/153971484/1af397f7-7751-4353-b9be-0d0948c30ee6)
## Table of Contents

1. Description
2. Getting Started
   - Dependencies
   - Installing
   - Executing Program
3. Authors
4. License
5. Acknowledgement
6. Screenshots
## Description

This project is part of the Data Science Nanodegree Program by Udacity. It involves building a Natural Language Processing (NLP) model to categorize messages from real-life disaster events in real time. The dataset consists of pre-labelled tweets and messages.

The project is divided into the following key sections:

1. **Processing data:** building an ETL pipeline to extract, clean, and store the data in a SQLite database.
2. **Building a machine learning pipeline:** training a classifier to categorize text messages into various categories.
3. **Running a web app:** displaying model results in real time.
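The data-processing step can be illustrated with a minimal sketch. This is an assumed shape of the cleaning logic in `process_data.py` (the function name `clean_categories` and the exact column handling are illustrative, not taken from the actual script):

```python
# Minimal sketch of the ETL cleaning step (assumed; the real process_data.py
# may differ in table names, column handling, and error checks).
import pandas as pd

def clean_categories(df):
    """Split the single 'categories' column into one 0/1 column per category."""
    # 'categories' holds strings such as "related-1;request-0;offer-0"
    cats = df["categories"].str.split(";", expand=True)
    # derive column names from the first row: "related-1" -> "related"
    cats.columns = cats.iloc[0].str.rsplit("-", n=1).str[0]
    # keep only the trailing digit of each cell and convert to int
    for col in cats.columns:
        cats[col] = cats[col].str.rsplit("-", n=1).str[1].astype(int)
    # drop the raw column and any duplicate rows
    return pd.concat([df.drop(columns="categories"), cats], axis=1).drop_duplicates()
```

The cleaned frame is then written to the SQLite database (e.g. with `DataFrame.to_sql` on a SQLAlchemy engine) at the path given on the command line.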
## Getting Started

### Dependencies

- Python 3.5+
- NumPy, SciPy, Pandas, Scikit-Learn
- NLTK
- SQLAlchemy
- Pickle (standard library)
- Flask, Plotly
### Installing

Clone the git repository from `https://github.com/msmohankumar/Disaster_Response_App`.

The repository is organized as follows:

- `data/`: data files and data-processing scripts
  - `disaster_categories.csv`: categories data file
  - `disaster_messages.csv`: messages data file
  - `DisasterResponse.db`: SQLite database produced by the ETL pipeline
  - `process_data.py`: ETL pipeline script to clean and process the data
- `models/`: machine learning model scripts
  - `train_classifier.py`: script to train the classifier and save the model
  - `classifier.pkl`: trained model saved as a pickle file
- `screenshots/`: screenshots of the web app
  - `intro.png`: introduction screenshot
  - `sample_input.png`: sample input screenshot
  - `sample_output.png`: sample output screenshot
  - `main_page.png`: main page screenshot
  - `process_data.png`: process data screenshot
  - `train_classifier_data.png`: train classifier screenshot
- `app/`: web application files
  - `run.py`: script to run the web app
  - `templates/`: HTML templates for the web app
    - `master.html`: main page template
    - `go.html`: classification result template
- `README.md`: project README file
### Executing Program

1. Run the ETL pipeline to clean the data and store it in the database:

   ```
   python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
   ```

2. Run the ML pipeline to load data from the database, train the classifier, and save it as a pickle file:

   ```
   python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
   ```

3. Run the web app from the `app` directory:

   ```
   python run.py
   ```

Access the web app at `http://127.0.0.1:3001` (or the LAN address printed in the console, e.g. `http://192.168.29.170:3001`).
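The training step above can be sketched as a scikit-learn pipeline. This is an assumed outline of `train_classifier.py` (the real script likely adds an NLTK tokenizer, a different estimator, and grid search; `build_model` is an illustrative name):

```python
# Minimal sketch of the ML pipeline assumed in train_classifier.py.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.linear_model import LogisticRegression

def build_model():
    # bag-of-words -> TF-IDF weighting -> one classifier per category column
    return Pipeline([
        ("vect", CountVectorizer()),
        ("tfidf", TfidfTransformer()),
        ("clf", MultiOutputClassifier(LogisticRegression(max_iter=1000))),
    ])
```

After fitting on the messages and the 0/1 category matrix from `DisasterResponse.db`, the model is serialized with pickle to `models/classifier.pkl`.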
## Authors

- M S Mohan Kumar

## License

This project is licensed under the MIT License.

## Acknowledgement

- Udacity for providing an excellent Data Science Nanodegree Program.
## Screenshots

- Sample Input
- Sample Output
- Main Page
- Process Data
- Train Classifier (without category-level precision/recall)