Disaster Response Classification Using NLP

This project addresses disaster management using Natural Language Processing (NLP). NLP is a field of Artificial Intelligence that gives machines the ability to read, understand, and derive meaning from human language. Disaster response is the most critical part of disaster management: every second of delay in response can make the difference between life and death.


DISASTER RESPONSE CLASSIFICATION USING NLP
Under the Supervision of Mrs. Sonali Mathur

Kumar Shantanu (1709110081)
Hardik Sharma (1709110065)
Karanveer Singh (1709110074)
Nikhil Vats (1709110096)
PRESENTATION OVERVIEW

Project Introduction
Project Implementation (DFD)
i. Research Paper 1
ii. Research Paper 2
iii. Research Paper 3
iv. Research Paper 4
Snapshot of Project
PROJECT INTRODUCTION

Our project aims at building a disaster response web application that classifies messages sent by users into categories such as earthquake, typhoon, etc., with the help of Python, Flask, and NLP.

The project consists of three major parts: an ETL pipeline, an ML pipeline, and a web application dashboard.
IMPLEMENTATION PROCESS (DFD)
LITERATURE REVIEW
RESEARCH PAPER 1

Title : SMS Classification Method for Disaster Response using Naïve Bayes Algorithm. 

Publication Year : 2019

Introduction : This research paper is focused on extraction of information from SMS with the help of data mining and
training a model with that information.

Methodology : The paper focused on a performance comparison of Naïve Bayes and J48 classification. Their technique used pre-processing to eliminate insignificant words, also known as stop words, before building a trained classifier.

Conclusion : The Naïve Bayes algorithm is probabilistic in nature and is well suited to the study of text extraction. The technique achieved 89% accuracy with 11% false-negative results.
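As a minimal sketch of the approach the paper describes, the snippet below trains a Multinomial Naïve Bayes classifier with stop-word removal using scikit-learn; the sample messages and labels are invented purely for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy SMS data, invented for illustration only.
messages = [
    "Flood water rising near the river bank, need rescue",
    "Happy birthday! See you at the party tonight",
    "Earthquake felt downtown, buildings shaking",
    "Lunch at noon tomorrow?",
]
labels = [1, 0, 1, 0]  # 1 = disaster-related, 0 = not

# Stop words are dropped during vectorization, as in the paper.
model = make_pipeline(
    CountVectorizer(stop_words="english"),
    MultinomialNB(),
)
model.fit(messages, labels)

print(model.predict(["flood rescue needed near the bank"]))  # expected: [1]
```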
RESEARCH PAPER 2

Title : Identifying And Categorizing Disaster-related Tweets

Publication Year : 2016

Introduction : This paper presents a system for classifying disaster-related tweets. The focus is on Twitter data
generated before, during, and after Hurricane Sandy, which impacted New York in the fall of 2012.
Methodology : Three classification models (support vector machines (SVMs), maximum entropy (MaxEnt) models, and Naive Bayes) were assessed; the SVM achieved the best F1 score and was therefore used.
Conclusion : Their proposed classifiers are both more general (identifying all relevant tweets, not just situational
awareness) and richer (with fine-grained categorizations).
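A hedged sketch of the kind of model comparison the paper describes, scoring candidate classifiers by cross-validated F1; the tweet data and feature extraction here are placeholders, and scikit-learn's LogisticRegression stands in for MaxEnt (maximum entropy classification is equivalent to multinomial logistic regression):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression  # stands in for MaxEnt
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder tweets; the paper used Hurricane Sandy data.
tweets = ["power lines down on main street", "great concert last night",
          "shelter open at the high school", "new phone arrived today"] * 5
labels = [1, 0, 1, 0] * 5

candidates = {
    "SVM": LinearSVC(),
    "MaxEnt": LogisticRegression(max_iter=1000),
    "NaiveBayes": MultinomialNB(),
}
for name, clf in candidates.items():
    pipe = make_pipeline(TfidfVectorizer(), clf)
    f1 = cross_val_score(pipe, tweets, labels, scoring="f1", cv=5).mean()
    print(f"{name}: F1 = {f1:.3f}")
```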
RESEARCH PAPER 3
Title : Twitter as a Lifeline: Human-annotated Twitter Corpora for NLP of Crisis-related Messages.

Publication Year : 2016

Introduction : Three well-known learning algorithms are evaluated: Naïve Bayes (NB), Support Vector Machines (SVM), and Random Forest (RF). The authors created a large corpus of 52 million crisis-related tweets, collected during 19 different crisis events, to train their models.
Methodology : The main technique implemented in the paper is the use of unigrams and bigrams as the primary features. Initial vocabularies consisting of lexical variations are built to identify out-of-vocabulary (OOV) words.

Conclusion : Training the models on a large corpus (52 million tweets) yields high accuracy. Microblogging sites can be very helpful for information gathering. Word embeddings are generated using the Continuous Bag of Words (CBOW) model.
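A small illustration of the unigram-and-bigram feature extraction the paper's methodology describes, using scikit-learn (the sample text is invented):

```python
from sklearn.feature_extraction.text import CountVectorizer

# ngram_range=(1, 2) extracts both unigrams and bigrams.
vectorizer = CountVectorizer(ngram_range=(1, 2))
vectorizer.fit_transform(["bridge collapsed after the earthquake"])

print(vectorizer.get_feature_names_out())
# ['after' 'after the' 'bridge' 'bridge collapsed' 'collapsed'
#  'collapsed after' 'earthquake' 'the' 'the earthquake']
```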
IMPLEMENTATION OF THE PROJECT

ETL Pipeline

ML Pipeline

Web Application
1. Data Visualization Screen
2. Classification Screen
ETL Pipeline: the set of processes for extracting data from an input source, transforming it, and loading it into an output destination (for example, a database or a data warehouse) for reporting, analysis, and data synchronization.

Figure 1: ETL Pipeline

Two important steps in Figure 1:

Load_data() function: loads the messages and categories datasets and merges them.
Clean_data() function: removes useless data from the dataset and cleans it.
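A minimal sketch of what load_data() and clean_data() might look like with pandas; the CSV inputs, the shared id column, and the semicolon-separated categories format are assumptions inferred from the figures:

```python
import pandas as pd

def load_data(messages_filepath, categories_filepath):
    """Load the messages and categories CSVs and merge them on a shared id."""
    messages = pd.read_csv(messages_filepath)
    categories = pd.read_csv(categories_filepath)
    return messages.merge(categories, on="id")

def clean_data(df):
    """Split the raw 'categories' column into ~35 binary category columns."""
    # Assumed raw format: "related-1;request-0;offer-0;..."
    categories = df["categories"].str.split(";", expand=True)
    categories.columns = [c.split("-")[0] for c in categories.iloc[0]]
    for column in categories:
        # Keep only the trailing 0/1 flag and convert it to an integer.
        categories[column] = categories[column].str[-1].astype(int)
    df = df.drop(columns=["categories"]).join(categories)
    return df.drop_duplicates()
```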


Figure 2: This figure shows the uncleaned merged data from the messages and categories datasets.
Figure 3: This figure shows the cleaned data. Columns with category names are generated so that it becomes easy to process the numbers 0 and 1 instead of processing language data. In this process around 35 category columns are created.
ML Pipeline

A machine learning pipeline is a means of automating the machine learning workflow by enabling data to be transformed and correlated into a model that can then be analyzed to produce outputs. This type of ML pipeline makes the process of feeding data into the ML model fully automated.

A typical machine learning pipeline would consist of the following processes:

1. Data collection.
2. Data cleaning.
3. Feature extraction (labelling and dimensionality reduction).
4. Model validation.
5. Visualization.
ML Pipeline

Machine learning (ML) pipelines consist of several steps to train a model. They are iterative: every step is repeated to continuously improve the accuracy of the model and achieve a successful algorithm. A pipeline consists of a sequence of components, each a compilation of computations. Data is sent through these components and manipulated by their computations. These multiple sequential steps do everything from data extraction and preprocessing to model training and deployment.
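As a concrete illustration, here is a hedged sketch of how such a pipeline could be assembled with scikit-learn's Pipeline, chaining vectorization, tf-idf weighting, and a multi-output classifier; the choice of estimators is an assumption, not the project's confirmed configuration:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Each component transforms the data and hands it to the next one.
pipeline = Pipeline([
    ("vect", CountVectorizer()),           # text -> token counts
    ("tfidf", TfidfTransformer()),         # counts -> tf-idf weights
    ("clf", MultiOutputClassifier(         # one classifier per category column
        RandomForestClassifier())),
])

# X: list of message strings; Y: table of ~35 binary category columns.
# pipeline.fit(X_train, Y_train)
# Y_pred = pipeline.predict(X_test)
```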
Training and building a model

Average accuracy ~95.56%


In order to use textual data for predictive modeling, the text must first be parsed into individual words, or tokens – this process is called tokenization. These words then need to be encoded as integers or floating-point values for use as inputs to machine learning algorithms. This process is called feature extraction (or vectorization).

Scikit-learn's CountVectorizer is used to convert a collection of text documents to a vector of term/token counts. It also enables the pre-processing of text data prior to generating the vector representation, which makes it a highly flexible feature-representation module for text.
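For example, a quick look at what CountVectorizer produces (the two sample sentences are invented):

```python
from sklearn.feature_extraction.text import CountVectorizer

docs = ["water is rising fast", "send water and food"]
vect = CountVectorizer()
counts = vect.fit_transform(docs)

print(vect.get_feature_names_out())
# ['and' 'fast' 'food' 'is' 'rising' 'send' 'water']
print(counts.toarray())
# [[0 1 0 1 1 0 1]
#  [1 0 1 0 0 1 1]]
```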
Tf means term-frequency, while tf-idf means term-frequency times inverse document-frequency. This is a common term-weighting scheme in information retrieval that has also found good use in document classification. (In scikit-learn's SMART notation, idf is "t" when use_idf is enabled and "n" (none) otherwise.)

With TfidfTransformer you systematically compute word counts using CountVectorizer, then compute the inverse document-frequency (IDF) values, and only then compute the tf-idf scores.
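A short sketch of that two-step flow, reusing the counts from CountVectorizer (the sample documents are invented):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

docs = ["water is rising fast", "send water and food"]
counts = CountVectorizer().fit_transform(docs)

# fit() learns the IDF values; transform() then produces the tf-idf scores.
tfidf = TfidfTransformer()
scores = tfidf.fit_transform(counts)
print(scores.toarray().round(2))
```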
Web Application
Screen 1: Data Visualization

This screen contains a text box to input a message to classify that contains words related to a disaster.

Some visualizations are also added, based on the data used to train the classifier.
CATEGORIES WHICH ARE PRESENT IN THE TRAINING DATASET
Web Application

Screen 2: Message Classification

This screen contains the categories to which the message is related. Related categories are highlighted in the list.

QUERY (MESSAGE)
RESULTS (QUERY RELATED TO THESE)
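A minimal sketch of how the classification screen could be wired up in Flask; the route name, the go.html template, and the saved model artifacts are assumptions for illustration, not the project's confirmed layout:

```python
import joblib
from flask import Flask, render_template, request

app = Flask(__name__)

# Hypothetical artifact paths; the trained pipeline and the ~35
# category names would be saved at the end of the ML pipeline step.
model = joblib.load("models/classifier.pkl")
category_names = joblib.load("models/category_names.pkl")

@app.route("/classify")
def classify():
    query = request.args.get("query", "")
    # predict() returns one 0/1 flag per category column.
    labels = model.predict([query])[0]
    results = dict(zip(category_names, labels))
    # The template would highlight categories whose flag is 1.
    return render_template("go.html", query=query, results=results)

if __name__ == "__main__":
    app.run(debug=True)
```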
THANK YOU
