
A Web App to Identify Eyewitness Messages from Twitter using Textual Features

Asim Zubair (BSE173121)
M.Tayyub Khan Ilyas (BSE173001)

Supervised By
DR. SHAHID IQBAL

Department of Computer Science


Capital University of Science & Technology, Islamabad

PROJECT REPORT

Version: V 3.0
Number of Members: 2
Title: A Web App to Identify Eyewitness Messages from Twitter Data using Textual Features
Supervisor's Name: Dr. M. Shahid Iqbal Malik

Member Name | Reg. No. | Email Address
Asim Zubair | BSE173121 | [email protected]
M.Tayyub Khan Ilyas | BSE173001 | [email protected]

Members' Signatures

Supervisor's Signatures

APPROVAL CERTIFICATE
This project, entitled “A Web App to Identify Eyewitness Messages from
Twitter using Textual Features”, has been approved for the award of

Bachelor of Engineering in Software Engineering

Committee Signatures:

Supervisor:

(Dr. M. Shahid Iqbal Malik)

Project Coordinator:

(Mr. Ibrar Arshad)

Head of Department:

(Dr. Nadeem Anjum)

DECLARATION

We hereby declare that “no portion of the work referred to in this project has been
submitted in support of an application for another degree or qualification of this or any other
university/institute or other institution of learning”. It is further declared that this
undergraduate project, neither as a whole nor in part, has been copied from any source
except where references have been provided.

MEMBER’S SIGNATURES

Table of Contents
Chapter 1........................................................................................................................................................10
Introduction....................................................................................................................................................10
1.1. Project Introduction........................................................................................................................10
1.2. Problem Statement.........................................................................................................................10
1.3. Business Scope...............................................................................................................................11
1.4. Objectives.......................................................................................................................................11
1.5. Useful Tools and Technologies......................................................................................................11
1.6. Project Work Break Down.............................................................................................................13
1.7. Project Time Lapse.........................................................................................................................14
Chapter 2........................................................................................................................................................15
Requirement Specification and Analysis.......................................................................................................15
2.1. Functional Requirements................................................................................................................15
2.2. Non-Functional Requirements........................................................................................................17
2.3. Use Case Modeling........................................................................................................................18
2.4. Use Case Diagram:.........................................................................................................................18
2.5.1. Train model Use case Description..........................................................................................19
2.5.2. Test Model Use Case Description..........................................................................................21
Chapter 3........................................................................................................................................................22
System Design................................................................................................................................................22
3.1. Layer Definition.............................................................................................................................22
3.1.1. Presentation Layer..................................................................................................................22
3.1.2. Business Logic Layer.............................................................................................................22
3.2. System Design Diagrams...............................................................................................................23
3.2.1. High Level Design..................................................................................................................23
3.2.2. System Sequence Diagrams...................................................................................................23
3.2.2.1. Train Model SSD..............................................................................................................23
3.2.2.2. Test Model SSD...............................................................................................................25
3.3. Domain Model................................................................................................................................26
3.4. Flow Chart......................................................................................................................................27
3.4.1 View Dataset Flow Chart.......................................................................................................29
3.4.2 Features Computation Flow Chart.........................................................................................29
3.4.3 Pre-Processing Flow Chart.....................................................................................................31
3.4.4 Machine Learning Modeling Flow Chart...............................................................................31
3.4.5 Evaluation Metrics Flow Chart..............................................................................................32
3.4.6 Validation Method Flow Chart...............................................................................................33
3.4.7 Save Model Flow Chart..........................................................................................................34
3.4.8 Test Saved Model Flow Chart................................................................................................35
3.5. User Interface Design.....................................................................................................................35
3.5.1. Homepage interface................................................................................................................35
3.5.2. About Page.............................................................................................................................36
3.5.3. Train Model............................................................................................................................36
3.5.4. View Dataset Interface...........................................................................................................37
3.5.5. Feature Selection Interface.....................................................................................................37
3.5.6. Data Preprocessing Interface..................................................................................................38
3.5.7. Classification Selection Interface...........................................................................................38
3.5.8. Classifier Result Interface......................................................................................................39
3.5.9. Test Model History Interface..................................................................................................39
3.5.10. Unseen Tweet Prediction Interface........................................................................................40
3.5.11. Text Result Interface..............................................................................................................40
Chapter 4........................................................................................................................................................41
Software Development...................................................................................................................................41
4.1. Coding Standards...........................................................................................................................41
4.1.1 Indentation..............................................................................................................................41
4.1.2 Declaration.............................................................................................................................41
4.1.3 Statement Standards...............................................................................................................41
4.1.4 Naming Convention...............................................................................................................41
4.2 Front End Development Environment............................................................................................42
4.3 Back End Development Environment............................................................................................43
4.4 Software Description......................................................................................................................44
4.4.1. Module Classifier Code..........................................................................................................44
4.4.2. Module Feature Computation Code.......................................................................................47
4.4.3. MODULE WEB APP CODE.................................................................................................57
Chapter 5........................................................................................................................................................86
Software Testing.............................................................................................................................................86
5.1 Testing Methodology.....................................................................................................................86
5.2 Test Cases.......................................................................................................................................86
5.2.1 Choose Dataset Test case.......................................................................................................86
5.2.2 Train Model Test Case...........................................................................................................87
5.2.3 Apply Feature Extraction method on Dataset Test Case........................................................88
5.2.4 Apply Part of speech on Dataset Test Case............................................................................89
5.2.5 Remove Special Characters Test Case...................................................................................90
5.2.6 Apply Preprocessing Technique on Dataset Test Case..........................................................91
5.2.7 Apply Lemmatization Technique on Dataset Test Case........................................................91
5.2.8 Apply All Preprocessing Technique on Dataset Test Case....................................................92
5.2.9 Moving to Classifier Test case...............................................................................................93
5.2.10 Machine Learning Model Test case.......................................................................................94

5.2.11 Evaluation Metrics Test case..................................................................................................95
5.2.12 Evaluation Metrics Test case..................................................................................................96
5.2.13 Apply Classifier Test case......................................................................................................97
5.2.14 Save Model Test Case 1.........................................................................................................98
5.2.15 Save Model Test Case 2.........................................................................................................99
5.2.16 Test Model Test case............................................................................................................100
5.2.17 Test Model Test Case 2........................................................................................................101
5.2.18 Unseen Prediction Test Case 1.............................................................................................102
5.2.19 Unseen Prediction Test Case 2.............................................................................................103
5.2.20 Unseen Prediction Test Case 3.............................................................................................103
5.2.21 About Page Test case...........................................................................................................104
Chapter 6......................................................................................................................................................106
Software Deployment...................................................................................................................................106
6.1. Installation / Deployment Process Description.................................................................................106
• GitHub..........................................................................................................................................106
• Heroku..........................................................................................................................................107
Chapter 7......................................................................................................................................................110
REPORT APPROVAL CERTIFICATE......................................................................................................110
References....................................................................................................................................................111
Webpage...............................................................................................................................................111

List of Figures
Figure 1 Work Breakdown Chart................................................................................................................13
Figure 2 Project Time-lapse........................................................................................................................14
Figure 3 Use Case Diagram..........................................................19
Figure 4 Train Model SSD...........................................................24
Figure 5 Test Model SSD............................................................25
Figure 6 Domain Model..............................................................26
Figure 7 Domain Model..............................................................26
Figure 8 Flow Chart................................................................28
Figure 9 Select Dataset Flowchart..................................................29
Figure 10 View Dataset Flow Chart..................................................29
Figure 11 Features Computation Flowchart...........................................30
Figure 12 Preprocessing Flowchart..................................................31
Figure 13 ML Model Flowchart.......................................................32
Figure 14 Evaluation Metric Flowchart..............................................33
Figure 15 Validation Method Flow Chart.............................................33
Figure 16 Save Model Flow Chart....................................................35
Figure 17 Test Saved Model Flow Chart..............................................35
Figure 18 Main Page Interface......................................................35
Figure 19 About Page Interface.....................................................36
Figure 20 Train Model Choose Dataset Interface.....................................36
Figure 21 View Dataset Interface...................................................37
Figure 22 Feature Selection Interface..............................................37
Figure 23 Data Preprocessing Interface.............................................38
Figure 24 Classifier Selection Interface...........................................38
Figure 25 Classifier Result Interface..............................................39
Figure 26 Test Model History Interface.............................................39
Figure 27 Unseen Tweet Prediction Interface........................................40
Figure 28 Text Result Interface....................................................40

List of Tables
Table 1 Functional Requirements....................................................................................................................
Table 2 Non-Functional Requirements............................................................................................................
Table 3 Train Model Use Case Description.....................................................................................................
Table 4 Test Model Use Case Description......................................................................................................
Table 5 Layers Definition................................................................................................................................
Table 6 Choose Dataset Test Case...................................................................................................................
Table 7 Train Model Testcase..........................................................................................................................
Table 8 Apply Feature Extraction method on Dataset.....................................................................................
Table 9 Apply Part of speech on Dataset Test Case........................................................................................
Table 10 Remove special characters Test Case...............................................................................................
Table 11 Apply Preprocessing Technique on Dataset Test Case....................................................................
Table 12 Apply Lemmatization Technique on Dataset Test Case...................................................................
Table 13 Apply All Preprocessing Technique on Dataset Test Case.......................................
Table 14 Moving to Classifier Test Case.........................................................................................................
Table 15 Machine Learning Model Test Case.................................................................................................
Table 16 Evaluation Metrics Test Case...........................................................................................................
Table 17 Evaluation Metrics Test Case...........................................................................................................
Table 18 Apply Classifier Testcase.................................................................................................................
Table 19 Save Model Test Case.......................................................................................................................
Table 20 Save Model Test Case 2....................................................................................................................
Table 21 Test model Test Case......................................................................................................................100
Table 22 Test model Test Case 2...................................................................................................................101
Table 23 Unseen Prediction Test Case..........................................................................................................102
Table 24 Unseen Prediction Test Case 2.......................................................................................................103
Table 25 Unseen Prediction Test Case 3.......................................................................................................103
Table 26 About Page Test Case.....................................................................................................................104
Table 27 Project Evaluation Guidelines........................................................................................................110

Chapter 1

Introduction

This chapter provides a brief summary of the project scope and project specification.
It describes the existing system and the technologies used for development of the
software, and presents the project timeline and the work-breakdown structure of
the project.

1.1. Project Introduction


Social media platforms such as Twitter provide convenient ways to share and consume
important information during disasters and emergencies. Information from bystanders and
eyewitnesses can help law enforcement agencies and humanitarian organizations obtain
firsthand, credible information about an ongoing situation and gain situational
awareness, among other potential uses. However, identifying eyewitness reports on
Twitter is a challenging task. Our work therefore investigates an efficient way to analyze
such messages and solve the problem of classifying eyewitness reports of disasters
from Twitter data. It exploits different machine learning approaches to solve this
tweet classification problem based on different feature engineering techniques.
Classifiers such as Naive Bayes and Random Forest were trained on tweet text to predict
whether a user's tweet is direct-eyewitness, non-eyewitness or don't know for disasters
and emergencies.

1.2. Problem Statement


This work investigates different types of sources in tweets related to eyewitnesses and
classifies them into three types:

• Direct eyewitness
• Non-eyewitness
• Don't know

Moreover, we investigate various characteristics associated with each kind of eyewitness type.
We observe that words related to perceptual senses tend to be present in direct eyewitness
messages, whereas emotions, thoughts, and prayers are more common in indirect witness messages.
We use these characteristics and labeled data to train several machine learning classifiers. Our
results on several real-world Twitter datasets reveal that textual features (bag-of-words),
when combined with domain-expert features, achieve better classification performance.
Our approach contributes a successful example of combining crowdsourced and machine
learning analysis, and increases our understanding of and capability for identifying
valuable eyewitness reports during disasters. [3]
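The classification pipeline described above can be sketched as follows. This is an illustrative example, not the report's actual code: the tiny hand-written tweets and labels are hypothetical stand-ins for the labeled Twitter datasets, and only the bag-of-words features with the two classifiers named above (Naive Bayes and Random Forest) are shown.

```python
# Sketch: three-class eyewitness classification on bag-of-words features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier

# Hypothetical miniature training set; real work uses labeled disaster tweets.
tweets = [
    "I felt the ground shaking, everything fell off the shelves",
    "I can see the flood water rising in my street right now",
    "Praying for everyone affected by the earthquake",
    "Thoughts go out to the victims of the hurricane",
    "Was there an earthquake somewhere today?",
    "Did anyone else hear about a fire near downtown?",
]
labels = ["direct-eyewitness", "direct-eyewitness",
          "non-eyewitness", "non-eyewitness",
          "dont-know", "dont-know"]

vectorizer = CountVectorizer()        # bag-of-words feature matrix
X = vectorizer.fit_transform(tweets)

# Train the two classifiers used in this project.
nb = MultinomialNB().fit(X, labels)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)

# Predict the class of an unseen tweet.
sample = vectorizer.transform(["I saw the flood water in my street"])
print(nb.predict(sample)[0])
```

Domain-expert features (e.g., counts of perceptual-sense words) would be appended to the bag-of-words matrix before training to reproduce the combined setup the paragraph describes.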

1.3. Business Scope


This application will facilitate researchers around the globe in their work on short-text
classification, at sentence level or even sub-sentence level. The business scope is clear:
such a system can enhance the reputation of the university by providing it with a very
intelligent, user-friendly tool, and the application can later be adopted by any other
university.

1.4. Objectives
This project has the following objectives:
• Textual feature computation.
• Applying machine learning algorithms for classification purposes.
• Evaluating model performance using 10-fold cross-validation and the hold-out method.
• Presenting model performance through evaluation metrics such as accuracy, precision,
recall and F-measure.
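The evaluation objectives above can be sketched with scikit-learn. This is a minimal illustration under assumed data: the dataset here is synthetic (generated by `make_classification`), and Random Forest stands in for whichever classifier is being evaluated.

```python
# Sketch: 10-fold cross-validation, hold-out split, and the four metrics.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0)

# 10-fold cross-validation.
cv_scores = cross_val_score(clf, X, y, cv=10)
print("10-fold CV mean accuracy: %.3f" % cv_scores.mean())

# Hold-out method: here an assumed 70/30 train/test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)
prec, rec, f1, _ = precision_recall_fscore_support(y_te, pred, average="macro")
print("accuracy=%.3f precision=%.3f recall=%.3f f-measure=%.3f"
      % (acc, prec, rec, f1))
```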

1.5. Useful Tools and Technologies

PyCharm is a hybrid-platform IDE for Python developed by JetBrains. It is commonly
used for Python application development. Some of the unicorn organizations such as
Twitter, Facebook, Amazon, and Pinterest use PyCharm as their Python IDE. [1]

Visual Studio Code is a streamlined code editor with support for development
operations like debugging, task running, and version control. It aims to provide
just the tools a developer needs for a quick code-build-debug cycle and leaves more
complex workflows to fuller-featured IDEs, such as Visual Studio IDE. [4]

Flask is a micro web framework written in Python. It is classified as a
microframework because it does not require particular tools or libraries.
It has no database abstraction layer, form validation, or any other
components where pre-existing third-party libraries provide common
functions. [8]
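As an illustration of how Flask is used in this project's context, the sketch below exposes a single prediction endpoint. The route name, request shape, and the placeholder label are all hypothetical; the app's actual routes appear in Chapter 4, and a real deployment would load the trained model instead of returning a fixed label.

```python
# Minimal Flask sketch: a JSON endpoint that would classify an unseen tweet.
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    tweet = request.get_json().get("tweet", "")
    # A loaded, trained model would be applied here; this stub returns a
    # fixed placeholder label for illustration only.
    return jsonify({"tweet": tweet, "label": "dont-know"})
```

The app would be started with `flask run` (or via a WSGI server such as gunicorn on Heroku, as in Chapter 6).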

Angular Material is a UI component library for Angular JS developers. Angular
Material components help in constructing attractive, consistent, and functional
web pages and web applications while adhering to modern web design principles
like browser portability, device independence, and graceful degradation. [7]

PostgreSQL, also known as Postgres, is a free and open-source relational database
management system emphasizing extensibility and SQL compliance. It was originally
named POSTGRES, referring to its origins as a successor to the Ingres database
developed at the University of California, Berkeley. [9]

pgAdmin is the leading open-source management tool for PostgreSQL, the world's
most advanced open-source database. pgAdmin 4 is designed to meet the needs of
both novice and experienced Postgres users alike, providing a powerful graphical
interface that simplifies the creation, maintenance, and use of database
objects. [9]

1.6. Project Work Break Down


A work-breakdown structure in project management and systems engineering is a
deliverable-oriented breakdown of a project into smaller components. A work-breakdown
structure is a key project deliverable that organizes the team's work into manageable
sections. The project is divided into several stages, and all members participate in
different stages of development. Figure 1 shows the tasks assigned to different
members.

Figure 1 Work Breakdown Chart

1.7. Project Time Lapse
A timeline is a chronological list of events that have happened or are about to happen.
Project timelines are the same: they tell you what tasks you need to complete and how
much time you have to complete them, as shown in Figure 2.

Chapter 2

Requirement Specification and Analysis

Requirements analysis is the process of determining user expectations for a new or
modified product. These features, called requirements, must be quantifiable, relevant
and detailed. In software engineering, such requirements are often called functional
specifications. In this chapter we enlist the functional and non-functional requirements
and model the functional requirements in the form of a use case model.

2.1. Functional Requirements


Functional requirements define functionalities of a system or its components. Functional
requirements may be calculations, technical details, data manipulation and processing and
other specific functionality that define what a system is supposed to accomplish.

Table 1 Functional Requirements

S. No. | Functional Requirement | Type | Status
1. | User can select earthquake disaster dataset. | Core | Completed
2. | User can select hurricane disaster dataset. | Core | Completed
3. | User can select flood disaster dataset. | Core | Completed
4. | User can select wildfire disaster dataset. | Intermediate | Completed
5. | User can select all datasets. | Core | Completed
6. | User can select Part of Speech feature for Training Model. | Core | Completed
7. | User can select Bag of Words feature for Training Model. | Core | Completed
8. | User can select TF-IDF (term frequency-inverse document frequency) feature for Training Model. | Core | Completed
9. | User can select Word2vec feature for Training Model. | Intermediate | Completed
10. | User can select uni-gram and bi-gram features for Training Model. | Intermediate | Completed
11. | User can select FastText feature for Training Model. | Core | Completed
12. | User can select all features for Training Model. | Core | Completed
13. | User can select lemmatization and stop words removal preprocessing technique. | Core | Completed
14. | User can select stop words removal preprocessing technique. | Core | Completed
15. | User can select special character removal preprocessing technique. | Core | Completed
16. | User can select random forest (RF) machine learning model for Training Model. | Core | Completed
17. | User can select Naïve Bayes machine learning model for Training Model. | Core | Completed
18. | User can select Accuracy evaluation metric for | |

2.2. Non-
Functional Requirements
Non-functional requirements (NFRs) specify the quality attributes of a software system. They
serve as constraints or restrictions on the design of the system across the different backlogs. I.
Jacobson et al. [5] describe an NFR as a requirement that specifies system properties, such as
environmental and implementation constraints, performance, platform dependencies,
maintainability, extensibility and reliability.
Table 2 Non-Functional Requirements

S. No. | Non-Functional Requirement | Category
1. | The user should reach the classified text with one button press if possible. | Usability
2. | The system should also be user-friendly for admins, because anyone can be an admin, not only programmers. | Usability
3. | The system will predict the class label (direct eyewitness / non-eyewitness / don't know) with maximum accuracy. | Accuracy
4. | The application is built on tweet features and machine learning techniques, so there is no fixed, measurable reliability percentage. | Reliability
5. | Computation time and response time should be as low as possible, because one of the software's features is time saving. A whole cycle of classifying a dataset should not take more than 40 seconds. | Performance
6. | After entering unseen tweet text, the system should classify it within the defined time. | Availability

2.3. Use Case Modeling

A use case depicts how actors will interact with the system. A use case is a methodology
used in system analysis to identify, clarify and organize system requirements. It is made up
of the set of possible sequences of interactions between systems and users in a particular
environment, related to a particular goal. The following use case diagrams depict how our
system works.

2.4. Use Case Diagram:


Use-case diagrams describe the high-level functions and scope of a system. These diagrams
also identify the interactions between the system and its actors. The use cases and actors in
use-case diagrams describe what the system does and how the actors use it, but not how the
system operates internally.

[Use case diagram: the User actor is connected to two use cases, Train Model and Test Model.]

Figure 2 Use Case Diagram

2.5. Use case Description

2.5.1. Train model Use case Description


Table 3 Train Model Use Case Description

Use Case ID: UC 1


UC Name Train Model
Actors User
Description The user trains the model by selecting a dataset, then a disaster, then a
feature extraction method, then a preprocessing technique, then a weighting
technique. After this the user selects the ML model, evaluation metrics
and validation technique, then requests to train the model and saves it.
Trigger “Train Model” button.
Pre-condition User must access the website.
Post-condition The model is trained successfully.
Basic Flow
1. User clicks the Train Model option.
2. System displays the datasets to choose from.
3. User selects the dataset.
4. System notifies the user about the dataset selection and displays the disasters to choose from.
5. User selects the disaster and clicks the Next button.
6. System notifies the user about the disaster selection and displays the feature extraction techniques to choose from.
7. User selects the feature extraction technique and clicks the Next button.
8. System notifies the user about the feature extraction technique selection and displays the preprocessing techniques to choose from.
9. User selects the preprocessing technique and clicks the Next button.
10. System notifies the user about the preprocessing technique selection and displays the weighting techniques to choose from.
11. User selects the weighting technique and clicks the Next button.
12. System notifies the user about the weighting technique selection and displays the next page.
13. User selects the machine learning model, then the evaluation metrics, then the validation technique, and clicks the Train Model button.
14. System applies all selections and notifies the user that the model has been trained successfully.
Alternative Flow
1. User clicks the Test Model option.
2. User selects View Dataset.
3. User cancels saving the model.
4. User selects View Result.

Exception
1. The selected dataset contains miscellaneous information.
2. The system stops responding while loading the dataset.
3. The service may not be available.
4. The system may crash while posting the request.
5. The preprocessing or feature extraction method may not be applied.
6. An unknown error occurs while saving the model.

2.5.2. Test Model Use Case Description
Table 4 Test Model Use Case Description

Use Case ID: UC 2


UC Name Test Model
Actors User
Description User enters unseen data and system predicts the class label.
Trigger “Predict” button.
Pre-condition Model must be trained.
Post-condition System will predict the class label.
Basic Flow
1. User selects the Test Model option.
2. System displays the already trained models to select from.
3. User selects an already trained model.
4. System notifies the user about the selected model.
5. User enters unseen data and clicks the Predict button to predict the class label.
6. System displays the predicted class label.

Alternative Flow User selects the train model option.


Exception 1. System may not be responding at the moment.
2. No past data.
3. An unknown error occurred while updating the status.

Chapter 3

System Design

The purpose of this chapter is to provide information that is complementary to the
development phase. Without an adequate design that delivers the required functions as well
as quality attributes, the project will fail. Moreover, communicating the architecture to its
stakeholders is as important a job as creating it in the first place.

3.1. Layer Definition

Table 5 Layers Definition

Layers Description

Presentation Layer This layer will be used for the interaction with the user
through a graphical user interface.

Business Logic Layer This layer contains the business logic. All the
constraints and majority of the functions reside under
this layer.

3.1.1. Presentation Layer

The presentation layer occupies the top level and displays information related to services
available on the website. This tier communicates with the other tiers by sending results to
the browser and to the other tiers in the network.

3.1.2. Business Logic Layer

Also called the middle tier or logic tier, this layer is separated from the presentation tier.
It controls application functionality by performing detailed processing.
3.2. System Design Diagrams

System design is divided into two parts:

3.2.1. High Level Design

High-level design provides a view of the system at an abstract level. It shows how the major
pieces of the finished application will fit together and interact with each other. The high-
level design does not focus on the details of how the pieces of the application will work.
Those details can be worked out later during low-level design and implementation.

3.2.2. System Sequence Diagrams

A system sequence diagram (SSD) is a sequence diagram that shows, for a particular scenario
of a use case, the events that external actors generate, their order, and possible inter-system
events.

3.2.2.1. Train Model SSD

This is the Train Model system sequence diagram. When the user clicks the Train Model
button, the datasets are displayed. The user first selects the dataset, then the disaster
type, then the feature extraction method, then the preprocessing technique, then the
weighting technique, and finally the ML model, evaluation metrics and validation
technique. The user then requests the system to train the model, and the system trains it
successfully. Afterwards the user can view the result as well as save the trained model,
as shown in Figure 3.

Figure 3 Train Model SSD

3.2.2.2. Test Model SSD

When the user selects the Test Model button, the user selects an already trained model and
the system displays the selected trained model in response. The user then enters the unseen
data and clicks the Predict button, after which the system displays the predicted class
label, as shown in Figure 4.

Figure 4 Test Model SSD

3.3. Domain Model

The domain model is your organized and structured knowledge of the problem. It should
represent the vocabulary and key concepts of the problem domain, and it should identify the
relationships among all of the entities within the scope of the domain. Our system has twelve
entities. The User entity is used to register and log in to the system, and the Train Model
entity is used to train the model. To train the model we first need a dataset, so we have a
Choose Dataset entity. After choosing the data we apply a feature extraction method and a
preprocessing technique, so we have a Feature Extraction entity and a Preprocessing entity.
After this we need the Machine Learning Model, Evaluation Metrics and Validation Technique
entities. After getting the result of the trained model, we store that model in our database,
so we have a Save Model entity. We can also test our model by giving it an unseen review, so
we have a Test Model entity. The system then predicts the result, so we also have a
Prediction entity.

Figure 5 Domain Model

3.4. Flow Chart

 First the user will enter the system.
 The system will then take the user to its home page.
 After that the user will select the dataset; the user can also view the dataset.
 Then the user will select one of the feature extraction methods from the given options.
 After selecting the feature, the user will select preprocessing techniques from the given
options.
 The user can also view the data after the feature extraction method and preprocessing
techniques are applied.
 Then the user will select one machine learning model from the given options.
 Then the user will select evaluation metrics from the given options.
 The user will also select one of the validation techniques from the given options.
 When the user applies these to the dataset, the system will train the model and display
the results for the trained model. The user can also save the trained model in the system.
 The user can go to the History tab and view all the saved trained models.
 From the History tab the user can test a trained model saved in the system by giving it
an unseen tweet.
 The system will then predict the class label of the unseen tweet (Eyewitness / Non-
Eyewitness / Don't know) using the selected trained model.

[Flow chart: from Start the user either trains a model (Select Dataset → View Dataset →
Select Features → Select Preprocessing Technique → Select ML Model → Select Evaluation
Metric → Select Validation Technique → Train Model → View Result → Save Model → End) or
tests one (Select Already Trained Model → Enter Unseen Tweet Text → Predict Class Label →
End).]

Figure 6 Flow Chart

3.4.1 View Dataset Flow Chart

The user will view the dataset.

[Flow chart: Start → Select Dataset → View Dataset → End.]

Figure 7 Select Dataset Flowchart

3.4.2 Features Computation Flow Chart

User will select one of the feature extraction methods from the given options, e.g.

 Bag of words.
 Part of speech tagging.
 Unigram.
 Bigram.
 TF-IDF (Term Frequency – Inverse Document Frequency).
 Word2vec
 Fasttext
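The design chapter does not show the n-gram computation itself; as a minimal sketch of the idea behind the uni-gram and bi-gram options (an illustration only, not the project's actual implementation), the counts can be computed directly from a token list:

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count contiguous n-grams (tuples of n adjacent tokens) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


tokens = "flood water is rising in the street".split()
unigrams = ngram_counts(tokens, 1)   # single words
bigrams = ngram_counts(tokens, 2)    # adjacent word pairs
```

A sentence of k tokens yields k uni-grams and k - 1 bi-grams, which is why bi-gram feature vectors are typically much sparser.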
Figure 8 View Dataset Flow Chart

[Flow chart: Start → Select Features → one of Part of Speech Tagging, Bag of Words, Unigram,
Bigram, TF-IDF (Unigram or Bigram), FastText, Word2vec, or All Features → End.]

Figure 9 Features Computation Flowchart

3.4.3 Pre-Processing Flow Chart

User will select preprocessing techniques from the given options, e.g.

 Stopwords Removal.
 Stopwords Removal and Special Character Removal.
 Stopword Removal, Special Character Removal and Lemmatization.
[Flow chart: Start → Select Preprocessing Technique → one of Stopword Removal; Stopword
Removal + Special Character Removal; or Stopword Removal + Special Character Removal +
Lemmatization → End.]
Figure 10 Preprocessing Flowchart

3.4.4 Machine Learning Modeling Flow Chart

User will select one machine learning model from the given options:

 Naïve Bayes
 Random Forest

[Flow chart: Start → Select Machine Learning Model → Naive Bayes or Random Forest → End.]
Figure 11 ML Model Flowchart

3.4.5 Evaluation Metrics Flow Chart

User will select evaluation metrics from the given options:

 Accuracy
 F-measure
 Precision
 Recall
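These four metrics map directly onto scikit-learn helpers, which the classifier module in Chapter 4 also uses; a small sketch with toy labels:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Toy ground-truth and predicted labels (1 = eyewitness, 0 = non-eyewitness)
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]

accuracy = accuracy_score(y_true, y_pred)
precision = precision_score(y_true, y_pred, average='weighted')
recall = recall_score(y_true, y_pred, average='weighted')
f1 = f1_score(y_true, y_pred, average='weighted')
```

With `average='weighted'`, per-class scores are averaged weighted by class support, matching the multi-class setting of this project (eyewitness / non-eyewitness / don't know).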

[Flow chart: Start → Select Evaluation Metrics → one of Accuracy, Precision, Recall,
F-Measure, or All Evaluation Metrics → End.]

Figure 12 Evaluation Metric Flowchart

3.4.6 Validation Method Flow Chart


User will also select one of the validation techniques from the given options, e.g.
 10-Fold Cross Validation Method
 Hold-Out Method
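The difference between the two options can be sketched with scikit-learn on toy data: hold-out evaluates one fixed split, while in k-fold cross validation every sample lands in a test fold exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold, train_test_split

X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# Hold-out method: one fixed 70/30 train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# 10-fold cross validation: across all folds, every sample is tested exactly once
kf = KFold(n_splits=10)
n_test_rows = sum(len(test_idx) for _, test_idx in kf.split(X))
```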

[Flow chart: Start → Select Validation Technique → 10-Fold Cross Validation or Hold-Out
Method → End.]

Figure 13 Validation Method Flow Chart

3.4.7 Save Model Flow Chart

User will save the trained model into the system with the following details:
 Current date and time
 Dataset Name
 Feature Computation
 Pre-Processing Technique
 Machine Learning Model
 Accuracy
 Precision
 Recall
 F-measure
 Validation Technique

[Flow chart: Start → Save Current Date and Time → Save Dataset Name → Save Feature
Computation Methods → Save Preprocessing Techniques → Save Machine Learning Model → Save
Evaluation Metrics → Save Validation Technique → End.]
Figure 14 Save model Flow Chart

3.4.8 Test Saved Model Flow Chart

User will test the saved model by giving it an unseen tweet.

[Flow chart: Start → Enter Unseen Tweet → Predicted Label → End.]
Figure 15 Test Saved Model Flow Chart

3.5. User Interface Design


3.5.1. Homepage interface

The main page shows the tabs Train Model, Test Model, View Dataset and About. The user can
select any of the tabs, as shown in Figure 16.

Figure 16 Main Page Interface

3.5.2. About Page

Figure 17 About Page Interface

3.5.3. Train Model

The user selects the ML model, evaluation metrics and validation technique and clicks the
Train Model button. The user can also click the Back button to make any change, or the View
Result button to check the results, as shown in Figure 18.

Figure 18 Train Model Choose Dataset Interface

3.5.4. View Dataset Interface

Here the user can select the dataset, upload the file from a system location, and press the
View Data button to preview the data, as shown in Figure 19.

Figure 19 View Dataset Interface

3.5.5. Feature Selection Interface

The user can select any of the features and click the Next button to continue, or the Back
button to go back.

Figure 20 Feature Selection Interface


3.5.6. Data Preprocessing Interface

The user can select any of the preprocessing techniques and click the Next button to
continue, or the Back button to change the dataset, disaster or feature, as shown in
Figure 21.
Figure 21 Data Preprocessing Interface

3.5.7. Classification Selection Interface

Figure 22 Classifier Selection Interface

3.5.8. Classifier Result Interface

Figure 23 Classifier Result Interface

3.5.9. Test Model History Interface

Figure 24 Test Model History Interface

3.5.10. Unseen Tweet Prediction Interface

Figure 25 Unseen Tweet Prediction Interface

3.5.11. Text Result Interface

Figure 26 Text Result Interface

Chapter 4
Software Development

4.1. Coding Standards

4.1.1 Indentation

Proper code indentation is used in this project. Indenting blocks of code enhances the
readability, understandability and visible hierarchy of the code.

4.1.2 Declaration

 In this project we have used one declaration per line to increase the clarity and
understanding of the code. The order of declarations is as follows:
 All the widgets are imported at the beginning.
 The sequence of class variables is: first public, then protected, then private.
 Instance variables follow the same sequence: first public, then private.
 Class constructors are then declared with proper names.
 Class methods are grouped by functionality rather than by scope or accessibility, to
make the code easier to read and understand.
 Declarations for local variables appear only at the beginning of the code, after
importing packages and libraries.

4.1.3 Statement Standards

Each line of code contains one declaration at most. Compound statements in this project
contain lines of code enclosed in braces. The inner block of code of compound statements
begins after the opening braces from next line. Proper indentation is also followed for lines
of codes inside the compound statements. Proper braces are used in code around all
statements such as if-else, try-catch etc.

4.1.4 Naming Convention

Proper naming convention rules are followed while implementation of this project which
make programs more understandable by making them easier to read.

While implementing this project, we used words from natural language (English) to assign
understandable names to classes, variables and methods, such as Requests, Document
Collection and Basic Information, instead of cryptic names like myc_method, a1 or b1.
Terminology from the project domain is used; for example, if the user refers to an Email as
a Registration Number, then the term Registration Number is used.
Mixed case is used to make names readable, with lower-case letters in general and the first
letter of class names and interface names capitalized.

4.2 Front End Development Environment


The Hypertext Markup Language, or HTML, is the standard markup
language for documents designed to be displayed in a web browser. It
can be assisted by technologies such as Cascading Style Sheets and
scripting languages such as JavaScript. [3]

JavaScript is a scripting or programming language that allows you to
implement complex features on web pages. Every time a web page does
more than just sit there and display static information, e.g.
displaying timely content updates or interactive maps, JavaScript is
likely involved. [5]

Angular is a platform and framework for building single-page client
applications using HTML and TypeScript. Angular is written in
TypeScript; it implements core and optional functionality as a set of
TypeScript libraries that you import into your apps. Angular Material
is a UI component library for Angular developers. Angular Material components
help in constructing attractive, consistent, and functional web pages and web
applications while adhering to modern web design principles like browser portability,
device independence, and graceful degradation. [7]

4.3 Back End Development Environment

PyCharm is an integrated development environment used in


computer programming, specifically for the Python language. [1]

Python is an interpreted high-level general-purpose programming


language. Python's design philosophy emphasizes code readability
with its notable use of significant indentation. [2]

Flask is a micro web framework written in Python. It is classified
as a microframework because it does not require particular tools or
libraries. It has no database abstraction layer, form validation, or
any other components where pre-existing third-party libraries
provide common functions. [8]
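Flask maps URLs to Python functions; a minimal sketch of how the app's prediction endpoint could be wired up (the `/predict` route name, the JSON payload shape, and the placeholder labels are illustrative assumptions, not the project's actual API):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/predict', methods=['POST'])
def predict():
    # In the real app the saved classifier would label the tweet text;
    # here a placeholder label is returned to show the request/response shape.
    tweet = request.get_json().get('text', '')
    label = 'eyewitness' if tweet else 'unknown'
    return jsonify({'label': label})
```

The Angular front end would POST the unseen tweet to such a route and render the returned label.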

SQLite is a relational database management system contained in
a C library. In contrast to many other database management systems,
SQLite is not a client–server database engine; rather, it is
embedded into the end program. SQLite generally follows PostgreSQL
syntax. [6]
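A short sketch of using SQLite from Python's built-in sqlite3 module (the table name and columns here are illustrative, not the project's actual schema for saved models):

```python
import sqlite3

# In-memory database for illustration; the real app would use a file on disk.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE saved_model (name TEXT, accuracy REAL)")
conn.execute("INSERT INTO saved_model VALUES (?, ?)", ("rf_earthquake", 0.91))
rows = conn.execute("SELECT name, accuracy FROM saved_model").fetchall()
```

Because SQLite is embedded, no separate database server needs to be installed or administered alongside the Flask back end.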

4.4 Software Description
4.4.1. Module Classifier Code
# from mlxtend.classifier import StackingClassifier
import time

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, classification_report, \
    accuracy_score, f1_score
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler, LabelEncoder

def get_preprocessing(pre_processing):
if pre_processing == 'Stopwords Removal':
return "a1"
elif pre_processing == 'Stopwords + Special Characters':
return "a2"
else:
return "a3"

def read_dataset(dataset_name, feature_type, pre_processing):


pre_processing = get_preprocessing(pre_processing)
if feature_type.lower() == 'part of speech tagging':
dataset = pd.read_csv("features/feature/part_of_speech/" +
dataset_name.lower() +
"_pos_" + pre_processing.lower() + '.csv',
encoding="ISO-8859-1")
dataset.drop('text', axis=1, inplace=True)
dataset.drop('Tweet #', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)
print(dataset.head())

elif feature_type.lower() == 'bag of words technique':


dataset = pd.read_csv(
"features/feature/bag_of_words/" + dataset_name.lower() + "_bog_"
+ pre_processing.lower() + '.csv')
dataset.drop('text', axis=1, inplace=True)
dataset.drop('Tweet #', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)

elif feature_type.lower() == 'tf-idf technique':


dataset = pd.read_csv(
"features/feature/tfidf/" + dataset_name.lower() + "_tf_idf_" +
pre_processing.lower() + '.csv')

dataset.drop(dataset.columns[[0, -1]], axis=1, inplace=True)
dataset.drop('text', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)

elif feature_type.lower() == 'unigram':


dataset = pd.read_csv(
"features/feature/unigram/" + dataset_name.lower() + "_uni_gram_"
+ pre_processing.lower() + '.csv')
dataset.drop('text', axis=1, inplace=True)
dataset.drop('Tweet #', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)

elif feature_type.lower() == 'bigram':


dataset = pd.read_csv(
"features/feature/bigram/" + dataset_name.lower() + "_bi_gram_" +
pre_processing.lower() + '.csv')
dataset.drop('text', axis=1, inplace=True)
dataset.drop('Tweet #', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)
elif feature_type.lower() == 'word2vec':
dataset = pd.read_csv(
"features/feature/word2vec/" + dataset_name.lower() + "_word2vec_"
+ pre_processing.lower() + '.csv')
dataset.drop('text', axis=1, inplace=True)
dataset.drop('Tweet #', axis=1, inplace=True)
dataset.reset_index(drop=True, inplace=True)
dataset.dropna(inplace=True)
print(len(dataset.index))

else:
raise Exception('Unknown Feature Type')
return dataset

def generate_random_forest(dataset):
label_Label = LabelEncoder()
# converting text labels into numbers
dataset["label"] = label_Label.fit_transform(dataset['label'])

X = dataset.drop("label", axis=1)
y = dataset['label']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

start = time.time()
classifier = RandomForestClassifier(n_estimators=42, criterion='entropy')
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

cv = ShuffleSplit(n_splits=5, test_size=0.3)

scores = cross_val_score(classifier, X, y, cv=10)
print(classification_report(y_test, y_pred))
print("Random Forest accuracy after 10 fold CV: %0.2f (+/- %0.2f)" %
(scores.mean(), scores.std() * 2) + ", " + str(
round(time.time() - start, 3)) + "s")
print("******************************")
print("******************************")
print("******************************")

# print (' Accuracy:', accuracy_score(y_test, y_pred))


print('scores.mean:', scores.mean())
accuracy = scores.mean()
print(" ")
print('Precision:', precision_score(y_test,
y_pred, average='weighted'))
precision = precision_score(y_test, y_pred, average='weighted')
# print ('Precision:', precision_score(y_test, y_pred))
print(" ")
print('Recall:', recall_score(y_test, y_pred, average='weighted'))
recall = recall_score(y_test, y_pred, average='weighted')
# print ('Recall:', recall_score(y_test, y_pred))
print(" ")
print('F1 score:', f1_score(y_test, y_pred, average='weighted'))
f1score = f1_score(y_test, y_pred, average='weighted')
# print ('F1 score:', f1_score(y_test, y_pred))
print(" ")

print("***********************************************************************
*******************")

return accuracy, precision, recall, f1score, classifier

def generateNaiveBayes(dataset):
start = time.time()
label_Label = LabelEncoder()
print(dataset["label"])
# converting text labels into numbers
dataset["label"] = label_Label.fit_transform(dataset['label'])
print(dataset["label"])
X = dataset.drop("label", axis=1)
y = dataset['label']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
nb = GaussianNB()
print(X.head())
print(y)
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)
cv = ShuffleSplit(n_splits=5, test_size=0.3)
scores = cross_val_score(nb, X, y, cv=10)
print(classification_report(y_test, y_pred))

print("Naive Bayes accuracy after 10 fold CV: %0.2f (+/- %0.2f)" %
(scores.mean(), scores.std() * 2) + ", " + str(

round(time.time() - start, 3)) + "s")
print("******************************")
print("******************************")
print("******************************")

print('Accuracy:', accuracy_score(y_test, y_pred))


accuracy = accuracy_score(y_test, y_pred)

print(" ")
print('Precision:', precision_score(y_test, y_pred, average='weighted'))
precision = precision_score(y_test, y_pred, average='weighted')
# print ('Precision:', precision_score(y_test, y_pred))

print(" ")

print('Recall:', recall_score(y_test, y_pred, average='weighted'))


recall = recall_score(y_test, y_pred, average='weighted')
# print ('Recall:', recall_score(y_test, y_pred))

print(" ")

print('F1 score:', f1_score(y_test, y_pred, average='weighted'))


f1score = f1_score(y_test, y_pred, average='weighted')
# print ('F1 score:', f1_score(y_test, y_pred))

print(" ")
return accuracy, precision, recall, f1score, nb

if __name__ == "__main__":
    print("Start")
    # Illustrative call: the arguments must name an existing dataset, feature
    # type and preprocessing option, and generateNaiveBayes returns five values.
    accuracy, precision, recall, f1score, model = generateNaiveBayes(
        read_dataset('earthquake', 'unigram', 'Stopwords Removal'))
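Both training functions above return the fitted classifier object, and the report's Save Model feature stores trained models for later testing. One common way to persist and reload such a model (a sketch using scikit-learn's joblib convention; the file name and toy data are illustrative, and this is not the project's own persistence code):

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier

# Toy training data standing in for the extracted tweet features
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 1, 0, 1]

clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

path = os.path.join(tempfile.gettempdir(), 'saved_model.pkl')
joblib.dump(clf, path)        # persist the trained model to disk
restored = joblib.load(path)  # reload it later to predict unseen tweets
```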

4.4.2. Module Feature Computation Code

Part-of-Speech Tagging
A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language
and assigns parts of speech to each word (and other token), such as noun, verb, adjective, etc.
Following code is for POS
import re
import nltk
import pandas as pd
from NGRAM import output_to_csv
from NGRAM import stopword_removal

feature_list = ["NN", "NNP", "CD", "VBD", "VBN", "NNS", "JJ", "PRP", "RB",

"VBP", "VBG", "VBZ", "IN", "DT", "NNPS",
"VB", "CC", "JJS", "PRP$", "JJR", "MD", "WRB", "UH", "EX",
"FW", "RBR", "WP", "TO", "RBS", "RP", "WDT",
"PDT"]

def main():
Tweets_df =
pd.read_csv("D:/FYP/Dataset/hurricanes_eyewitness_annotations_2004.csv")
texts_list = Tweets_df['text'].tolist()
print(texts_list)
pos_list = []
    # texts_list[0] = "Playing...."
    # for i in range(len(texts_list)):
    #     texts_list[i] = texts_list[i].lower()
    #     # Match every non-word character (characters NOT between a and z, like "!", "?", whitespace)
    #     texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
    #     # Collapse all whitespace runs into a single space
    #     texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])
    #     print(texts_list[i])
for text in texts_list:
tokens = nltk.word_tokenize(text)
print(tokens)
tokens = stopword_removal(tokens)
print(tokens)
tagged = nltk.pos_tag(tokens)
print(tagged)
counts = nltk.Counter(tag for word, tag in tagged)
print(counts)
pos_list.append(counts)

output_to_csv('POS_output.csv', pos_list, Tweets_df)

def get_features(text, preprocessing):


if preprocessing == "Stopwords + Special Characters":
text = text.lower()
        # Match every non-word character (characters NOT between a and z, like "!", "?", whitespace)
text = re.sub(r'\W', ' ', text)
# Replace all white-space characters with ""
text = re.sub(r'\s+', ' ', text)

tokens = nltk.word_tokenize(text)
tokens = stopword_removal(tokens)
tagged = nltk.pos_tag(tokens)
counts = nltk.Counter(tag for word, tag in tagged)
counts = dict(counts)
result = []
for feature in feature_list:
if feature in counts:

result.append(counts[feature])
else:

result.append(0)

print(result)
return result

if __name__ == "__main__":
    main()
    # get_features("This is good", "")

Bag-of-words
A bag-of-words model, or BoW for short, is a way of extracting features
from text for use in modeling, such as with machine learning algorithms.
The approach is very simple and flexible, and can be used in a myriad of
ways for extracting features from documents.
Code

import re
import nltk
import pandas as pd
from n_gram import output_to_csv
from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization


def main():
    Review_df = pd.read_csv("C:/FYP/POS tagging/bagofwords/abc.csv")
    texts_list = Review_df['text'].tolist()
    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        # Replace every non-word character ("!", "?", etc.) with a space
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        # Collapse runs of whitespace into a single space
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])
        # TODO: remove numbers

    bag_of_words_list = []
    count = 0
    for sentence in texts_list:
        wordfreq = {}
        tokens = nltk.word_tokenize(sentence)
        ftoken = stopword_rem(tokens)
        # Count how often each remaining (lemmatized) token occurs in the sentence
        for token in ftoken:
            token = lemmitization(token)
            if token not in wordfreq.keys():
                wordfreq[token] = 1
            else:
                wordfreq[token] += 1
        count += 1
        bag_of_words_list.append(wordfreq)

    output_to_csv('bag_of_words_output.csv', bag_of_words_list, Review_df)


if __name__ == "__main__":
    main()

TF-IDF
TF-IDF is a statistical measure that evaluates how relevant a word is to a
document in a collection of documents. It has many uses, most importantly
in automated text analysis, and is
very useful for scoring words in machine learning algorithms for Natural
Language Processing (NLP).
CODE
import pandas as pd
import re
import nltk
from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization

punctuations = "?:!.,;"


def compute_tf(token):
    """Term frequency: occurrences of each word divided by the token count."""
    num_of_words = len(token)
    freq = {}
    tf = {}
    for word in token:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1

    for value in freq:
        tf[value] = freq[value] / num_of_words

    return tf, freq


def compute_idf(doc_list):
    """Inverse document frequency over a list of per-document frequency dicts."""
    import math
    idf_dict = {}
    N = len(doc_list)
    for doc in doc_list:
        for word, val in doc.items():
            if val > 0:
                if idf_dict.get(word):
                    idf_dict[word] += 1
                else:
                    idf_dict[word] = 1

    for word, val in idf_dict.items():
        idf_dict[word] = math.log(N / float(val))

    return idf_dict


def compute_tf_idf(tf_list, idf):
    """Multiply each term frequency by the corresponding IDF weight."""
    for tf_dict in tf_list:
        for word in tf_dict:
            tf_dict[word] = tf_dict[word] * idf[word]
    return tf_list


def output_to_csv(file_name, data_list, review_df=None):
    df = pd.DataFrame(data_list)
    df = df.fillna(0)
    df.index.name = "Review #"
    if review_df is not None:
        df['text'] = review_df['text']
        cols = df.columns.tolist()
        cols = cols[-1:] + cols[:-1]
        df = df[cols]
    df.to_csv(file_name)


def main():
    texts_list = ["it is going to rain today",
                  "today i am not going outside",
                  "i am going to watch the season premiere"]

    # reviews_df = pd.read_csv("abc.csv")
    # texts_list = reviews_df['text'].tolist()

    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])

    all_tfs = []
    all_freqs = []
    for text in texts_list:
        token = nltk.word_tokenize(text)
        # Remove punctuation
        for word in token:
            if word in punctuations:
                token.remove(word)
        # Lemmatization
        for i in range(len(token)):
            token[i] = lemmitization(token[i])
        tf, freq = compute_tf(token)
        all_tfs.append(tf)
        all_freqs.append(freq)

    idf = compute_idf(all_freqs)
    tfs_final = compute_tf_idf(all_tfs, idf)

    output_to_csv('tf_idf_output.csv', tfs_final, None)


if __name__ == '__main__':
    main()
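To see what these functions compute on a hand-sized input, here is the IDF step in isolation, mirroring the compute_idf logic above. Note that with this formula a word appearing in every document gets an IDF of log(N/N) = 0, so it contributes nothing to the final score:

```python
import math

# Per-document word frequencies, as produced by compute_tf's second return value
docs = [{"rain": 1, "today": 1}, {"today": 1, "outside": 1}]
N = len(docs)

# Document frequency: how many documents contain each word
df = {}
for doc in docs:
    for word in doc:
        df[word] = df.get(word, 0) + 1

idf = {word: math.log(N / float(count)) for word, count in df.items()}
print(idf["today"])           # 0.0  ("today" occurs in both documents)
print(round(idf["rain"], 4))  # 0.6931
```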

Pre-processing
In pre-processing we perform stop-word removal, special-character removal, and lemmatization.
Code

from nltk import WordNetLemmatizer
from nltk.corpus import stopwords

wordnet_lemmatizer = WordNetLemmatizer()


def stopword_rem(token):
    tokens_without_sw = [word for word in token if word not in stopwords.words()]
    return tokens_without_sw


def lemmitization(token):
    token = wordnet_lemmatizer.lemmatize(token, pos="v")
    return token
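Of these steps, only stop-word removal and lemmatization need NLTK; special-character removal is plain regular expressions. A minimal sketch of that step, using the same patterns as the feature-extraction scripts above:

```python
import re

def remove_special_characters(text):
    """Lower-case the text, drop non-word characters, and collapse white-space."""
    text = text.lower()
    text = re.sub(r'\W', ' ', text)   # non-word characters -> space
    text = re.sub(r'\s+', ' ', text)  # collapse runs of white-space
    return text.strip()

print(remove_special_characters("Fire!!!  Near   #MainSt..."))  # fire near mainst
```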

Word2vec Code

import re
import nltk
import xlrd
import xlsxwriter
from nltk import WordNetLemmatizer
from nltk.corpus import wordnet, stopwords
from gensim.models import Word2Vec

# nltk.download('punkt')
# nltk.download('stopwords')

wordnet_lemmatizer = WordNetLemmatizer()


def get_wordnet_pos(word):
    """Map POS tag to the first character lemmatize() accepts."""
    tag = nltk.pos_tag([word])[0][1][0].upper()
    tag_dict = {"J": wordnet.ADJ,
                "N": wordnet.NOUN,
                "V": wordnet.VERB,
                "R": wordnet.ADV}
    return tag_dict.get(tag, wordnet.NOUN)


lemmatizer = WordNetLemmatizer()
word = 'feet'
print(lemmatizer.lemmatize(word, get_wordnet_pos(word)))


def lemmatize_stemming(text):
    lemmatizer = WordNetLemmatizer()
    return lemmatizer.lemmatize(text, get_wordnet_pos(text))


def tokenize_Words(tokenized_text):
    processed_article = tokenized_text.lower()
    # Remove URLs
    processed_article = re.sub(
        r'''(?i)\b((?:https?://|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))''',
        " ", processed_article)
    # Remove every non-word character ("!", "?", white-space, etc.)
    processed_article = re.sub(r'\W', ' ', processed_article)
    # Remove numbers
    processed_article = re.sub(r"\d+", "", processed_article)
    # Remove articles and a few common function words
    processed_article = re.sub(r'\s+(a|an|and|the|they|them|is|am|are)(\s+)', ' ', processed_article)
    # Collapse runs of white-space into a single space
    processed_article = re.sub(r'\s+', ' ', processed_article)

    # Preparing the dataset
    all_sentences = nltk.sent_tokenize(processed_article)
    all_words = [nltk.word_tokenize(sent) for sent in all_sentences]

    # Removing stop words
    for i in range(len(all_words)):
        all_words[i] = [w for w in all_words[i] if w not in stopwords.words('english')]

    # Lemmatization
    for rw in range(len(all_words)):
        for cl in range(len(all_words[rw])):
            all_words[rw][cl] = lemmatize_stemming(all_words[rw][cl])
    return all_words


loc = "wildfire.xlsx"
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)

tokenized_text = ""
reviewList = []

for i in range(1, sheet.nrows):
    tokenized_text = tokenized_text + " " + sheet.cell_value(i, 0)
    reviewList.append(sheet.cell_value(i, 0))

print(len(reviewList))
tokenized_text = tokenize_Words(tokenized_text)
print(tokenized_text)

# Train a skip-gram Word2Vec model on the tokenized corpus
model = Word2Vec(sentences=tokenized_text, min_count=1, workers=1, sg=1,
                 window=5, seed=128)

vecReview = []
for rvw in reviewList:
    words = tokenize_Words(rvw)
    for wordList in words:
        vec = [model.wv[word] for word in wordList]
        vecReview.append(vec)

# Collapse each review into one 100-dimensional vector by
# element-wise summation of its word vectors
vecfinal = []
for vr in vecReview:
    count = 0
    temp = []
    for wr in vr:
        if count == 0:
            count = 1
            for cnt in range(0, 100):
                temp.append(wr[cnt])
        else:
            for cnt in range(0, 100):
                temp[cnt] = temp[cnt] + wr[cnt]
    vecfinal.append(temp)

print(len(vecfinal))

# Write the review vectors to an Excel sheet (columns D1..D100)
workbook = xlsxwriter.Workbook('res.xlsx')
worksheet = workbook.add_worksheet()

for c in range(0, 100):
    worksheet.write(0, c, 'D' + str(c + 1))

row = 1
for ii in vecfinal:
    col = 0
    for jj in ii:
        worksheet.write(row, col, jj)
        col = col + 1
    row = row + 1

workbook.close()
print(vecfinal)
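The nested loop above reduces each review to a single fixed-length vector by element-wise summation of its word vectors. The same step in compact form (pure Python, with the dimension reduced to 3 for illustration; averaging instead of summing is a common alternative):

```python
def sum_vectors(word_vectors, dim):
    """Element-wise sum of a list of equal-length word vectors."""
    total = [0.0] * dim
    for vec in word_vectors:
        for i in range(dim):
            total[i] += vec[i]
    return total

# Two 3-dimensional "word vectors" standing in for model.wv[word] lookups
review = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
print(sum_vectors(review, 3))  # [1.5, 2.5, 3.5]
```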

4.4.3. MODULE WEB APP CODE

Home Component.html

<mat-drawer-container>
<mat-drawer-content>
<div class="silder">
<mat-icon aria-hidden="false" aria-label="Example home icon" class="desktop-hide"
(click)="drawer.toggle()">
reorder
</mat-icon>
<div class="silder-content">
<h1 style="font-family: flex;">EYEWITNESS IDENTIFICATION</h1>

</div>
</div>
<footer class="footer" style="background-color: gray; padding: auto; height: 155px;">
<div class="container-fluid">

<p class="copyright pull-right" style="font-family: flex; color: white;">

<a href="#" style="font-family: flex;">Supervised By Dr.Shahid Iqbal Malik</a>,


made by Asim Zubair & M.Tayyub Khan Ilyas
</p>
</div>
</footer>
</mat-drawer-content>
</mat-drawer-container>

About Component.html

<div class="main-content">
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<div class="card">
<div class="header">
<h4 class="title" style="font-family: flex;">Identification of Eye Witness
Tweets</h4>
</div>
<div class="content">
<div>Identifying eyewitnesses during a disaster is an important area of research in
cognitive psychology and human memory. Eyewitnesses frequently play a vital role in
uncovering the truth about a disaster: the evidence they provide can be critical in
locating and ultimately rescuing stranded people. That is why it is essential that
eyewitness evidence be accurate and reliable, and Twitter is one such source.

</div>
</div>
</div>

</div>
</div>
</div>
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<div class="card">
<div class="header">
<h4 class="title" style="font-family: flex;" >Major Steps Include</h4>
</div>
<div class="content">
<div>
<ul>
<li>Dataset Selection</li>
<li>Train Model</li>
<li>Data Preprocessing</li>
<li>Feature Extraction</li>
<li>Machine Learning Model</li>
<li>Model Validation Techniques</li>
<li>Evaluation</li> <li>Model Testing</li>
<li>Save Model</li>
</ul>

</div>
</div>
</div>

</div>
</div>
</div>

</div>
<footer class="footer">
<div class="container-fluid">

<p class="copyright pull-right" style="font-family: flex;">


Today is : {{todayDate | customDate}} <br><a href="#" style="font-family: flex;">Supervised By Dr.Shahid Iqbal Malik</a>, made by Asim Zubair & M.Tayyub Khan Ilyas
</p>
</div>
</footer>

Sidebar Component.html

<div class="sidebar-wrapper">

<div class="logo">

<a class="simple-text" style="font-family: Asim; font-size: 500px;"
   href="/src/app/home/home.component.html">

<!-- <div class="logo-img">

<img src="/assets/img/angular2-logo-white.png"/>

</div> -->

EYE WITNESS IDENTIFICATION

</a>

</div>

<ul class="nav responsive-nav">

<li *ngIf="isMobileMenu()">

<a class="dropdown-toggle" data-toggle="dropdown">

<i class="fas fa-columns"></i>

<p class="hidden-lg hidden-md">Dashboard</p>

</a>

</li>

<!-- <li class="dropdown" *ngIf="isMobileMenu()">

<a class="dropdown-toggle" data-toggle="dropdown">

<i class="fa fa-globe"></i>

<b class="caret hidden-sm hidden-xs"></b>

<span class="notification hidden-sm hidden-xs">5</span>

<p class="hidden-lg hidden-md">

5 Notifications

<b class="caret"></b>

</p>

</a>

<ul class="dropdown-menu">

<li><a href="#">Notification 1</a></li>

<li><a href="#">Notification 2</a></li>

<li><a href="#">Notification 3</a></li>

<li><a href="#">Notification 4</a></li>

<li><a href="#">Another notification</a></li>

</ul>

</li> -->

<!-- <li *ngIf="isMobileMenu()">

<a>

<i class="fa fa-search"></i>

<p class="hidden-lg hidden-md">Search</p>

</a>

</li> -->

<!-- <li *ngIf="isMobileMenu()">

<a href="">

<p>Account</p>

</a>

</li> -->

<!-- <li class="dropdown" *ngIf="isMobileMenu()">

<a class="dropdown-toggle" data-toggle="dropdown">

<p>Dropdown
<b class="caret"></b>

</p>

</a>

<ul class="dropdown-menu">

<li><a href="#">Action</a></li>

<li><a href="#">Another action</a></li>

<li><a href="#">Something</a></li>

<li><a href="#">Another action</a></li>

<li><a href="#">Something</a></li>

<li class="divider"></li>

<li><a href="#">Separated link</a></li>

</ul>

</li> -->

<!-- <li *ngIf="isMobileMenu()">

<a>

<p>Log out</p>

</a>

</li> -->

<li class="separator hidden-lg hidden-md" *ngIf="isMobileMenu()"></li>

<li routerLinkActive="active" *ngFor="let menuItem of menuItems" class="{{menuItem.class}}">

<a [routerLink]="[menuItem.path]">

<i class="{{menuItem.icon}}"></i>

<p>{{menuItem.title}}</p>

</a>

</li>

</ul>

</div>

Test Model Component.html

<div class="main-content">

<div class="container-fluid">

<div class="row">
<div class="col-md-12">

<div class="card">

<div class="header" *ngIf='!test'>

<h4 class="title" style="font-family: flex;" >RESULTS</h4>

</div>

<div class="content" *ngIf='dataSource.length == 0' >

<p style="font-family: flex;">No Model Exists</p>

</div>

<div class="content" *ngIf='dataSource.length > 0 && !test' style="max-height: 70vh; overflow: auto; overflow-x: auto;">

<table mat-table [dataSource]="dataSource" style="width: 100%;">

<!--- Note that these columns can be defined in any order.

The actual rendered columns are set as a property on the row definition" -->

<!-- Position Column -->

<ng-container matColumnDef="ID">

<th mat-header-cell *matHeaderCellDef> No. </th>

<td mat-cell *matCellDef="let element"> {{element.ID}} </td>

</ng-container>

<ng-container matColumnDef="Dataset">

<th mat-header-cell *matHeaderCellDef> Dataset </th>

<td mat-cell *matCellDef="let element"> {{element.Dataset}} </td>


</ng-container>

<ng-container matColumnDef="Feature">

<th mat-header-cell *matHeaderCellDef> Feature </th>

<td mat-cell *matCellDef="let element"> {{element.Feature}} </td>

</ng-container>

<ng-container matColumnDef="ClassBalancing">

<th mat-header-cell *matHeaderCellDef> Class Balancing </th>

<td mat-cell *matCellDef="let element"> {{element.classBalancing}} </td>


</ng-container>

<ng-container matColumnDef="Preprocessing">

<th mat-header-cell *matHeaderCellDef> Preprocessing </th>

<td mat-cell *matCellDef="let element"> {{element.Preprocessing}} </td>

</ng-container>

<ng-container matColumnDef="Model">

<th mat-header-cell *matHeaderCellDef> Model </th>

<td mat-cell *matCellDef="let element"> {{element.Model}} </td>

</ng-container>

<ng-container matColumnDef="Validation">

<th mat-header-cell *matHeaderCellDef> Validation </th>

<td mat-cell *matCellDef="let element"> {{element.Validation}} </td>

</ng-container>

<ng-container matColumnDef="Metrics">

<th mat-header-cell *matHeaderCellDef> Metrics </th>

<td mat-cell *matCellDef="let element"> {{element.Metrics}} </td>

</ng-container>

<ng-container matColumnDef="createDate">

<th mat-header-cell *matHeaderCellDef> Created Date </th>

<td mat-cell *matCellDef="let element"> {{element.createDate}} </td>

</ng-container>

<ng-container matColumnDef="Action">

<th mat-header-cell *matHeaderCellDef></th>

<td mat-cell *matCellDef="let element">

<div class="btn-group" role="group" aria-label="Basic outlined example">

<!-- <button class="btn btn-warning" (click)='onTest(element)'>Test</button>
<button type="button" class="btn btn-success" (click)='onEdit(element)'>Edit</button> -->

<a class="button3" (click)='onTest(element)'>Test</a>

<a class="button3" (click)='onEdit(element)' style="background-color:#f14e4e">Edit</a>

</div>

</td>

</ng-container>

<tr mat-header-row *matHeaderRowDef="displayedColumns"></tr>

<tr mat-row *matRowDef="let row; columns: displayedColumns;"></tr>

</table>

</div>

<div class="content" *ngIf='test' style="max-height: 70vh;">

<app-test-prediction [modelId]='selectedModel' [preprocessing]="preprocessing"
  [feature]="feature" [dataset]="dataset" (onBackEvent)='onBack()'></app-test-prediction>

</div>

</div>

</div>

</div>
</div>

</div>

Test Model Prediction Component.html

<nav aria-label="breadcrumb">

<ol class="breadcrumb">

<li class="breadcrumb-item" (click)='onBack()'><a href="/#/test-model">Results</a></li>

<li class="breadcrumb-item active">Prediction</li>

</ol>
</nav>

<!-- <form> -->

<div class="column">

<div>

<div class="form-group">

<label style="font-family: flex;">Tweet</label>

<div class="input-group" style="font-family: flex;">

<input #csvReader type="file" class="form-control" id="customFile" name="filename"
  (change)="uploadListener($event)" accept=".txt">

</div>

</div>

</div>

<div class="form-check">

<input class="form-check-input" type="checkbox" value="" id="defaultCheck1" (click)='onCheck()'>
<label class="form-check-label" for="defaultCheck1" style="padding-left: 2px; font-family:
flex;">

Enable Text

</label>

</div>

<div>

<div class="form-group">

<label for="exampleFormControlTextarea1" style="font-family: flex;">Tweet Text</label>
<textarea class="form-control" style="font-family: flex;" id="exampleFormControlTextarea1" rows="3"
[(ngModel)]='text' [disabled]='!enableText'></textarea>

</div>

</div>

</div>

<div *ngIf='showResult' class="alert alert-danger">

Prediction Result: {{ result }}

</div>

<button mat-raised-button color="warn" (click)='onBack()'>Back</button>

<button type="submit" class="btn btn-info btn-fill pull-right" (click)='onPredict()' [disabled]="disablePredict()">PREDICT</button>

<div class="clearfix"></div>

<!-- </form> -->

Train Model Component.html


<div class="main-content">
<div class="container-fluid">
<div class="row">
<div class="col-md-12">
<div class="card">
<div class="header">
<h4 class="title" style="font-family: flex;">Train Classifier</h4>
</div>
<div class="content">
<mat-horizontal-stepper linear #stepper>
<mat-step [stepControl]="firstFormGroup" [editable]="isEditable">
<form [formGroup]="firstFormGroup">
<ng-template matStepLabel>Choose Dataset</ng-template>
<mat-form-field appearance="outline">
<mat-label>Dataset Type</mat-label>
<mat-select matNativeControl required formControlName="dataset">
<mat-option value="EarthQuake">EarthQuake</mat-option>
<mat-option value="Floods">Floods</mat-option>
<mat-option value="Hurricane">Hurricane</mat-option>
<mat-option value="WildFire">WildFire</mat-option>
<mat-option value="all">All</mat-option>
</mat-select>
</mat-form-field>
<div class="form-group">
<label>Upload Dataset</label>
<div class="input-group">
<input #csvReader type="file" class="form-control" id="customFile"
name="filename" (change)="uploadListener($event)" accept=".csv">
</div>
</div>
<div>
<button mat-raised-button color='accent' [disabled]='!enableViewButton'
style="margin-right: 4px;" (click)='openDialog()'>View</button>

<button mat-raised-button color='primary' matStepperNext
[disabled]='firstFormGroup.invalid || !enableViewButton'>Next</button>
</div>
</form>
</mat-step>

<mat-step [stepControl]="thirdFormGroup" [editable]="isEditable">
<form [formGroup]="thirdFormGroup">
<ng-template matStepLabel>Feature Extraction</ng-template>
<mat-form-field appearance="outline">
<mat-label>Feature Extraction</mat-label>
<mat-select matNativeControl required formControlName="feature">
<mat-option value="Part of Speech Tagging">Part of Speech
Tagging</mat-option>
<mat-option value="Bag of Words Technique">Bag of Words</mat-option>
<mat-option value="unigram">Uni-gram </mat-option>
<mat-option value="bigram">Bi-gram </mat-option>
<mat-option value="Tf-Idf Technique">TF-IDF </mat-option>
<mat-option value="Word2vec">Word2vec</mat-option>
</mat-select>
</mat-form-field>
<mat-checkbox
id="class-balancing"
color="primary"
formControlName="classBalancing">
Class Balancing
</mat-checkbox>
<div>
<button mat-raised-button color='accent' matStepperPrevious
style="margin-right: 4px;">Back</button>
<button mat-raised-button color='primary' matStepperNext
[disabled]='thirdFormGroup.invalid'>Next</button>
</div>
</form>
</mat-step>
<mat-step [stepControl]="secondFormGroup" [editable]="isEditable">
<form [formGroup]="secondFormGroup">
<ng-template matStepLabel>Pre-Processing</ng-template>
<mat-form-field appearance="outline">
<mat-label>Pre-Processing</mat-label>
<mat-select matNativeControl required
formControlName="preprocessing">
<mat-option value="Stopwords Removal">Stopwords Removal</mat-option>
<mat-option value="Stopwords + Special Characters">Stopwords +
Special Characters</mat-option>
<mat-option value="Stopwords + Special Characters + Lemmatization"
[disabled]='disableP3'>Stopwords + Special Characters + Lemmatization</mat-option>
</mat-select>
</mat-form-field>
<div>
<button mat-raised-button color='accent' matStepperPrevious
style="margin-right: 4px;">Back</button>
<button mat-raised-button color='primary' matStepperNext
[disabled]='secondFormGroup.invalid'>Next</button>
</div>
</form>
</mat-step>
<mat-step [stepControl]="forthFormGroup" [editable]="isEditable">
<form [formGroup]="forthFormGroup">
<ng-template matStepLabel>Prediction</ng-template>
<mat-form-field appearance="outline">
<mat-label>Machine Learning Model</mat-label>
<mat-select matNativeControl required
formControlName="machineLearning">
<mat-option value="Random Forest">Random Forest</mat-option>
<mat-option value="Naive Bayes">Naive Bayes</mat-option>
</mat-select>
</mat-form-field>
<mat-form-field appearance="outline">
<mat-label>Validation Techniques</mat-label>
<mat-select matNativeControl required formControlName="validation">
<mat-option value="10-fold Cross Validation">10-fold Cross
Validation</mat-option>
<mat-option value="Hold-Out Method">Hold-Out Method</mat-option>
</mat-select>
</mat-form-field>

<div *ngIf='enableSlider'>
<mat-label>Training Test Split (%)</mat-label>
<mat-slider
class="slider"
[invert]="false"
[max]="90"
[min]="50"
[step]="1"
[thumbLabel]="true"
formControlName="slider"
[vertical]="false">
</mat-slider>
</div>

<div>
<span class="example-list-section">
<mat-checkbox class="example-margin"
[checked]="allComplete"
[color]="task.color"
[indeterminate]="someComplete()"
(change)="setAll($event.checked)">
{{task.name}}
</mat-checkbox>
</span>
<span class="example-list-section">
<ul>

<li *ngFor="let subtask of task.subtasks;">
<mat-checkbox
[ngModelOptions]="{standalone: true}"
[color]="subtask.color"
[(ngModel)]="subtask.completed"
(change)='updateAllComplete()'>
{{subtask.name}}
</mat-checkbox>
</li>
</ul>
</span>
</div>
<div>
<button mat-raised-button color='accent' matStepperPrevious
style="margin-right: 4px;">Back</button>
<button mat-raised-button color='primary' matStepperNext
[disabled]='forthFormGroup.invalid' (click)='trainModel()'>TRAIN</button>
</div>
</form>
</mat-step>
<mat-step>
<ng-template matStepLabel>Results</ng-template>
<div *ngIf='response'>
<p style="font-family: flex;">Model has been trained successfully.</p>
</div>
<div style="margin-top: 16px;" *ngIf='displayResults'>
<div id="successAlert">
<div id='accuracy-div' class="progress"
*ngIf='task.subtasks[0].completed'>
<div id='accuracy-bar' class="progress-bar" role="progressbar" aria-valuenow=response.accuracy aria-valuemin="0" aria-valuemax="100" [style.width]="response.accuracy + '%'">
<span class="sr-only">60% Complete</span>
</div>
<span class="progress-type">Accuracy</span>
<span id='accuracy' class="progress-completed">{{response.accuracy.toFixed(2)}}%</span>
</div>
<div id='precision-div' class="progress"
*ngIf='task.subtasks[1].completed'>
<div id='precision-bar' class="progress-bar progress-bar-success"
role="progressbar" aria-valuenow=response.precision aria-valuemin="0" aria-valuemax="100"
[style.width]="response.precision + '%'">
<span class="sr-only">40% Complete (success)</span>
</div>
<span class="progress-type">Precision</span>
<span id='precision' class="progress-completed">{{response.precision.toFixed(2)}}%</span>
</div>
<div id='recall-div' class="progress" *ngIf='task.subtasks[2].completed'>
<div id='recall-bar' class="progress-bar progress-bar-info"
role="progressbar" aria-valuenow=response.recall aria-valuemin="0" aria-valuemax="100"
[style.width]="response.recall + '%'">
<span class="sr-only">20% Complete (info)</span>
</div>
<span class="progress-type">Recall</span>
<span id='recall' class="progress-completed">{{response.recall.toFixed(2)}}%</span>
</div>
<div id='f1score-div' class="progress" *ngIf='task.subtasks[3].completed'>
<div id='f1score-bar' class="progress-bar progress-bar-warning"
role="progressbar" aria-valuenow=response.f1score aria-valuemin="0" aria-valuemax="100"
[style.width]="response.f1score + '%'">
<span class="sr-only">60% Complete (warning)</span>
</div>
<span class="progress-type">F1-Score</span>
<span id='f1score' class="progress-completed">{{response.f1score.toFixed(2)}}%</span>
</div>
</div>
</div>

<mat-spinner *ngIf='isSpinner' style="margin:0 auto;"></mat-spinner>

<div>
<button mat-raised-button color='warn' (click)="stepper.reset();
fileReset()" style="margin-right: 4px;">Reset</button>
<button mat-raised-button color='accent' style="margin-right: 4px;"
[disabled]='!response' (click)='onDisplay()'>Display Result</button>
<button mat-raised-button color='primary' [disabled]='!response'
(click)='onSave()'>Save Result</button>
</div>
</mat-step>
</mat-horizontal-stepper>
</div>
</div>

</div>
</div>
</div>
</div>

app.component.ts

import { Component, OnInit } from '@angular/core';


import { LocationStrategy, PlatformLocation, Location } from '@angular/common';

@Component({

selector: 'app-root',
templateUrl: './app.component.html',
styleUrls: ['./app.component.css']
})
export class AppComponent implements OnInit {

constructor(public location: Location) {}

ngOnInit(){
}

isMap(path){
var titlee = this.location.prepareExternalUrl(this.location.path());
titlee = titlee.slice( 1 );
if(path == titlee){
return false;
}
else {
return true;
}
}
}

app.module.ts

import { BrowserAnimationsModule } from '@angular/platform-browser/animations';


import { NgModule } from '@angular/core';
import { FormsModule, ReactiveFormsModule } from '@angular/forms';
import { HttpClientModule } from '@angular/common/http';
import { RouterModule } from '@angular/router';

import { AppRoutingModule } from './app.routing';

import { NavbarModule } from './shared/navbar/navbar.module';
import { FooterModule } from './shared/footer/footer.module';
import { SidebarModule } from './sidebar/sidebar.module';
import {CustomDatePipe} from './custom.datepipe';

import { AppComponent } from './app.component';

import { AdminLayoutComponent } from './layouts/admin-layout/admin-layout.component';


import { AboutComponent } from './about/about.component';
import { ViewDataComponent } from './view-data/view-data.component';
import { TrainModelComponent } from './train-model/train-model.component';
import { TestModelComponent } from './test-model/test-model.component';
import { ViewDataDialogComponent } from './train-model/view-data-dialog/view-data-dialog.component';
import { BrowserModule } from '@angular/platform-browser';
import { ModelService } from './services/http.service';

// Material Modules
import { MatStepperModule } from '@angular/material/stepper';
import { MatFormFieldModule } from '@angular/material/form-field';
import { MatInputModule } from '@angular/material/input';
import { MatSelectModule } from '@angular/material/select';
import { MatButtonModule } from '@angular/material/button';
import { MatCheckboxModule } from '@angular/material/checkbox';
import { MatSliderModule } from '@angular/material/slider';
import { MatDialogModule } from '@angular/material/dialog';
import { MatProgressSpinnerModule } from '@angular/material/progress-spinner';
import { MatSnackBarModule } from '@angular/material/snack-bar';
import { MatTableModule } from '@angular/material/table';
import { TestPredictionComponent } from './test-model/test-prediction/test-prediction.component';
@NgModule({
imports: [
BrowserAnimationsModule,
FormsModule,
RouterModule,
HttpClientModule,
NavbarModule,
FooterModule,
SidebarModule,
AppRoutingModule,
MatStepperModule,
BrowserModule,
ReactiveFormsModule,
MatFormFieldModule,
MatInputModule,
MatSelectModule,
MatButtonModule,
MatCheckboxModule,
MatSliderModule,
MatDialogModule,
MatProgressSpinnerModule,
MatSnackBarModule,
MatTableModule
],
declarations: [
AppComponent,
CustomDatePipe,
AdminLayoutComponent,
AboutComponent,

ViewDataComponent,
TrainModelComponent,
TestModelComponent,
ViewDataDialogComponent,
TestPredictionComponent
],
providers: [ModelService],
bootstrap: [AppComponent]
})
export class AppModule { }

app.routing.ts

import { NgModule } from '@angular/core';


import { CommonModule, } from '@angular/common';
import { BrowserModule } from '@angular/platform-browser';
import { Routes, RouterModule } from '@angular/router';

import { AdminLayoutComponent } from './layouts/admin-layout/admin-layout.component';

const routes: Routes =[


{
path: '',
redirectTo: 'dashboard',
pathMatch: 'full',
}, {
path: '',
component: AdminLayoutComponent,
children: [
{
path: '',
loadChildren: './layouts/admin-layout/admin-layout.module#AdminLayoutModule'
}]},
{
path: '**',
redirectTo: 'dashboard'
}
];

@NgModule({
  imports: [
CommonModule,
BrowserModule,
RouterModule.forRoot(routes,{
useHash: true
})
],
exports: [
],
})
export class AppRoutingModule { }

custom.datepipe.ts

import { Pipe, PipeTransform } from '@angular/core';


import { DatePipe } from '@angular/common';

@Pipe({
name: 'customDate'
})
export class CustomDatePipe extends DatePipe implements PipeTransform {
transform(value: any, args?: any): any {
return super.transform(value, "EEEE d MMMM y h:mm a");
}
}

http.service.ts

import {Injectable} from '@angular/core';


import { HttpClient, HttpHeaders } from '@angular/common/http';
import {Observable} from 'rxjs/Observable';
import { map } from 'rxjs/operators';
import { Result } from 'app/train-model/train-model.component';

const httpOptions = {
headers: new HttpHeaders({ 'Content-Type': 'application/json' })
};

@Injectable()
export class ModelService {
base_url = 'http://localhost:5000/'

constructor(private http:HttpClient) {}

postModel(data: any){
let url = this.base_url + 'train';
return this.http.post(url, data, httpOptions)
}

saveModel(data: any){
let url = this.base_url + 'save-model';

return this.http.post(url, data, httpOptions)

}

predictModel(modelId: string, preprocessing: string, feature: string, dataset: string, text: string){
let url = this.base_url + 'predict';
console.log(dataset)
return this.http.get(url, {
params: {
modelId: modelId,
preprocessing: preprocessing,
feature: feature,
dataset: dataset,
text: text
},
headers: httpOptions.headers
})
}
getModels(){
let url = this.base_url + 'get-all';
return this.http.get(url, httpOptions).pipe(
map(response => response['data']),
map(dataList => {
let resultList: Result[] = [];
let count = 1;
dataList.forEach(element => {
const result: Result = {
Dataset: element['dataset'],
Feature: element['feature'],
ID: count++,
Metrics: `Accuracy:${element['accuracy']} Precision:${element['precision']} Recall:${element['recall']} F1-Score:${element['f_score']}`,
Model: element['ml_model'],
Preprocessing: element['pre_processing'],
Validation: element['validation'],
createDate: element['created_date'],
modelId: element['model_uuid'],
classBalancing: element['class_balancing']
}
resultList.push(result);
});
return resultList;
})
)
}
}

result.service.ts

import { Injectable } from '@angular/core';


import { Result } from 'app/train-model/train-model.component';
import { BehaviorSubject } from 'rxjs';

@Injectable({
  providedIn: 'root'
})
export class ResultService {
private results: Result[] = [];
private selectedResult: Result = null;
private resultSubject: BehaviorSubject<Result[]> = new BehaviorSubject(this.results);
private selectedResultSubject: BehaviorSubject<Result> = new
BehaviorSubject(this.selectedResult);

constructor() {}
public addToResult(result: Result){
result.ID = this.results.length + 1;
this.results.push(result);
this.resultSubject.next(this.results);
}

public updateSelectedResult(result: Result){


this.selectedResult = result;
this.selectedResultSubject.next(this.selectedResult);
}

public getResults(){
return this.resultSubject;
}

public getSelectedResults(){
return this.selectedResultSubject;
}

public clearResults(){
this.results = [];
this.resultSubject.next(this.results);
}

}
Chapter 5
Software Testing
This chapter describes the adopted testing procedure, including the selected testing
methodology, the test suite, and the test results of the developed software.

5.1 Testing Methodology


After implementation, the system is tested for functional errors. We perform Black Box
Testing (passing randomly selected values and comparing the results against the expected
output in a normal flow) together with Unit and Integration Testing, i.e., testing the
functional requirements implemented in our system without regard to the code.
The test cases are executed manually, without the use of any tool.
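Although the test cases in this project were executed manually, the black-box idea itself is easy to illustrate: an input is fed in and only the observable output is compared against the expectation, with no reference to the implementation. A minimal sketch, where classify_tweet is a hypothetical stand-in for the trained model's predict call:

```python
def classify_tweet(text):
    # Hypothetical stand-in for the real model; a keyword rule is enough
    # to illustrate the input -> expected-output comparison.
    return "eyewitness" if "i saw" in text.lower() else "non-eyewitness"

# Black-box checks: compare observed output with expected output only.
assert classify_tweet("I saw the flood hit our street") == "eyewitness"
assert classify_tweet("Official report on the flood") == "non-eyewitness"
print("black-box checks passed")
```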

5.2 Test Cases

5.2.1 Choose Dataset Test Case

Table 6: Choose Dataset Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Choose Dataset                      Test ID: 1
Version: 1                                     Test Type: Black Box Testing

Input
Select csv file from system

Expected Output
Dataset should be selected

Actual Output
Dataset selected

Expected Exceptions
Corrupt csv file


5.2.2 Train Model Test Case

Table 7: Train Model Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Train Model                         Test ID: 4
Version: 1                                     Test Type: Black Box Testing

Input
Tweets data

Expected Output
System will show the Feature Extraction step

Actual Output
User is able to select the Feature Extraction method

Expected Exceptions
Backend exception

5.2.3 Apply Feature Extraction Method on Dataset Test Case

Table 8: Apply Feature Extraction Method on Dataset Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Apply Feature Extraction method on Dataset    Test ID: 5
Version: 1                                               Test Type: Black Box Testing

Input
Part of Speech Tagging

Expected Output
System will apply part-of-speech tagging on the dataset

Actual Output
User is able to select a Preprocessing technique

Expected Exceptions
System may not be responding

5.2.4 Apply Part of Speech on Dataset Test Case

Table 9: Apply Part of Speech on Dataset Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Apply Part of Speech on Dataset     Test ID: 6
Version: 1                                     Test Type: Black Box Testing

Input
Part of Speech Tagging

Expected Output
System will apply part-of-speech tagging on the dataset

Actual Output
User is able to select a Preprocessing technique

Expected Exceptions
System may not be responding
Backend exception

5.2.5 Remove Special Characters Test Case
Table 10 Remove special characters Test Case

Date: 4/8/2021

System: Automatic Eye-witness


Identification
Objective: Remove special characters Test ID: 7

Version: 1 Test Type: Black Box


Testing
Input

Special Character removal

Expected Output

System will apply special character removal from the given dataset.

Actual Output

System not responding.

Expected Exceptions

System may not be responding. Backend exception
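Special-character removal is typically a single regular expression. This sketch (with a hypothetical `remove_special_characters` helper) keeps only letters, digits and spaces:

```python
import re

def remove_special_characters(text):
    """Drop punctuation, hashtags, mentions and emoji, then normalize whitespace."""
    cleaned = re.sub(r"[^A-Za-z0-9\s]", "", text)
    return " ".join(cleaned.split())

print(remove_special_characters("OMG!! #earthquake @Lahore :("))  # OMG earthquake Lahore
```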

5.2.6 Apply Preprocessing Technique on Dataset Test Case

Table 11 Apply Preprocessing Technique on Dataset Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Apply Preprocessing Technique on Dataset
Test ID: 8
Version: 1
Test Type: Black Box Testing

Input
Stopword Removal

Expected Output
User is able to select Stopword Removal

Actual Output
System applied Stopword Removal on the dataset

Expected Exceptions
System may not respond
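Stopword removal can be sketched as below. The report does not say which stop-word list the system uses, so scikit-learn's built-in English list is an assumption, and `remove_stopwords` is a hypothetical helper:

```python
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

def remove_stopwords(text):
    """Drop common function words that carry little signal for classification."""
    return " ".join(w for w in text.split() if w.lower() not in ENGLISH_STOP_WORDS)

print(remove_stopwords("I feel an earthquake in the city"))
```

Words like "an", "in" and "the" are discarded, leaving content words such as "earthquake" for feature computation.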

5.2.7 Apply Lemmatization Technique on Dataset Test Case

Table 12 Apply Lemmatization Technique on Dataset Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Apply Lemmatization Technique on Dataset
Test ID: 9
Version: 1
Test Type: Black Box Testing

Input
Lemmatization

Expected Output
User is able to select Lemmatization

Actual Output
System applied Lemmatization on the dataset

Expected Exceptions
Backend exception
System not working
Invalid file

5.2.8 Apply All Preprocessing Techniques on Dataset Test Case

Table 13 Apply All Preprocessing Techniques on Dataset Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Apply All Preprocessing Techniques on Dataset
Test ID: 10
Version: 2
Test Type: Black Box Testing

Input
Stopwords + Special Characters + Lemmatization

Expected Output
User is able to select Stopwords + Special Characters + Lemmatization

Actual Output
System applied Stopwords + Special Characters + Lemmatization on the dataset

Expected Exceptions
Backend exception
System not working
Invalid file

5.2.9 Moving to Classifier Test Case

Table 14 Moving to Classifier Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Moving to Classifier
Test ID: 11
Version: 1
Test Type: Unit Testing

Input
No input

Expected Output
System will display the classification methods

Actual Output
System displayed the Machine Learning Model and Evaluation Metrics options

Expected Exceptions
Backend exception
5.2.10 Machine Learning Model Test Case

Table 15 Machine Learning Model Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Applying Machine Learning Model
Test ID: 12
Version: 1
Test Type: Black Box Testing

Input
Naive Bayes

Expected Output
System will apply the machine learning model

Actual Output
System applied the machine learning model

Expected Exceptions
System may not respond
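Training the Naive Bayes classifier named above can be sketched with scikit-learn. The four labelled tweets below are hypothetical stand-ins for the real dataset, and the pipeline layout is an assumption, not the project's actual code:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical miniature training set; the real system reads labelled tweets from CSV
tweets = [
    "I feel an earthquake right now",
    "the ground is shaking under my feet",
    "earthquake reported by news agency",
    "officials confirm the hurricane made landfall",
]
labels = ["Direct-Eyewitness", "Direct-Eyewitness", "Non-Eyewitness", "Non-Eyewitness"]

# Bag-of-words features feeding a multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(tweets, labels)
print(model.predict(["I feel the ground shaking"]))  # ['Direct-Eyewitness']
```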

5.2.11 Evaluation Metrics Test Case

Table 16 Evaluation Metrics Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Evaluation Metrics
Test ID: 13
Version: 1
Test Type: Black Box Testing

Input
Accuracy, F-measure, Recall and Precision

Expected Output
System will apply the evaluation metrics

Actual Output
System applied the evaluation metrics

Expected Exceptions
System not responding
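The four metrics listed above map directly onto standard scikit-learn calls. The label vectors below are hypothetical predictions, used only to illustrate the computation:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical true vs. predicted labels for five tweets
y_true = ["Eyewitness", "Eyewitness", "Non-Eyewitness", "Non-Eyewitness", "Eyewitness"]
y_pred = ["Eyewitness", "Non-Eyewitness", "Non-Eyewitness", "Non-Eyewitness", "Eyewitness"]

accuracy = accuracy_score(y_true, y_pred)                            # 4 of 5 correct
precision = precision_score(y_true, y_pred, pos_label="Eyewitness")  # no false positives
recall = recall_score(y_true, y_pred, pos_label="Eyewitness")        # one eyewitness missed
f_measure = f1_score(y_true, y_pred, pos_label="Eyewitness")         # harmonic mean of the two
print(accuracy, precision, round(recall, 2), round(f_measure, 2))
```

Here one eyewitness tweet was missed, so recall drops to 2/3 while precision stays at 1.0, and the F-measure balances the two.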

5.2.12 Validation Techniques Test Case

Table 17 Validation Techniques Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Selecting Validation Techniques
Test ID: 14
Version: 1
Test Type: Black Box Testing

Input
10-Fold Cross Validation or Hold-out Method

Expected Output
System will apply the validation technique

Actual Output
System applied the validation technique

Expected Exceptions
System not responding
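Both validation options can be sketched with scikit-learn. The twenty duplicated tweets form a hypothetical, trivially separable dataset, used only so that ten stratified folds exist; the real system would use its tweet dataset here:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical dataset, duplicated so each class has enough samples for 10 folds
tweets = ["I feel the ground shaking", "news agency reports the quake"] * 10
labels = ["Eyewitness", "Non-Eyewitness"] * 10

model = make_pipeline(CountVectorizer(), MultinomialNB())

# Hold-out method: a 70/30 train/test split
X_train, X_test, y_train, y_test = train_test_split(
    tweets, labels, test_size=0.3, random_state=42, stratify=labels)
model.fit(X_train, y_train)
print("hold-out accuracy:", model.score(X_test, y_test))

# 10-fold cross validation over the whole dataset
scores = cross_val_score(model, tweets, labels, cv=10)
print("10-fold mean accuracy:", scores.mean())
```

Hold-out is a single split and is fast; 10-fold cross validation trains ten times but gives a less split-dependent estimate.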

5.2.13 Apply Classifier Test Case

Table 18 Apply Classifier Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Apply Classifier
Test ID: 15
Version: 1
Test Type: Black Box Testing

Input
• Machine learning model
• Evaluation metrics
• Validation techniques

Expected Output
System will apply the machine learning model, evaluation metrics and validation techniques

Actual Output
System displayed the Machine Learning Model and Evaluation Metrics results

Expected Exceptions
System may not respond
Backend exception
Invalid parameters

5.2.14 Save Model Test Case 1

Table 19 Save Model Test Case 1

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Save Model
Test ID: 16
Version: 1
Test Type: Black Box Testing

Input
• Machine learning model
• Preprocessing techniques
• Feature computation

Expected Output
System will save the machine learning model, preprocessing techniques and feature computation with date and time

Actual Output
System did not save the model in the history tab

Expected Exceptions
System may not respond
5.2.15 Save Model Test Case 2

Table 20 Save Model Test Case 2

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Save Model
Test ID: 17
Version: 1
Test Type: Black Box Testing

Input
• Machine learning model
• Preprocessing techniques
• Feature computation

Expected Output
System will save the machine learning model, preprocessing techniques and feature computation with date and time

Actual Output
System did not save the model in the history tab

Expected Exceptions
System may not respond

5.2.16 Test Model Test Case

Table 21 Test Model Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Test Model
Test ID: 18
Version: 1
Test Type: Black Box Testing

Input
Test Model button click

Expected Output
Redirect to unseen tweet prediction

Actual Output
System not responding

Expected Exceptions
Test Model button may not work properly

5.2.17 Test Model Test Case 2

Table 22 Test Model Test Case 2

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: Test Model
Test ID: 19
Version: 2
Test Type: Black Box Testing

Input
Test Model button click

Expected Output
Redirect to unseen tweet prediction

Actual Output
Redirected to unseen tweet prediction

Expected Exceptions
Test Model button may not work properly

5.2.18 Unseen Prediction Test Case 1

Table 23 Unseen Prediction Test Case

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Unseen Prediction
Test ID: 20
Version: 1
Test Type: Black Box Testing

Input
Text field, e.g. "I feel an earthquake"

Expected Output
System will predict the unseen tweet (Direct-Eyewitness)

Actual Output
Direct-Eyewitness

Expected Exceptions
System may not respond
Backend exception
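The unseen-prediction form can be sketched as a Flask route, since Flask appears in the project's references. Everything here is an illustrative assumption, not the project's actual code: the `/predict` endpoint name, the form field, and the stub rule standing in for the trained pipeline:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

class StubModel:
    """Stands in for the trained pipeline, which would be loaded at startup."""
    def predict(self, texts):
        return ["Direct-Eyewitness" if "i feel" in t.lower() else "Non-Eyewitness"
                for t in texts]

model = StubModel()

@app.route("/predict", methods=["POST"])
def predict():
    # The web form posts the tweet typed into the text field
    tweet = request.form.get("tweet", "")
    return jsonify({"prediction": model.predict([tweet])[0]})

# Exercise the route with Flask's built-in test client
with app.test_client() as client:
    response = client.post("/predict", data={"tweet": "I feel an earthquake"})
    print(response.get_json())  # {'prediction': 'Direct-Eyewitness'}
```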

5.2.19 Unseen Prediction Test Case 2

Table 24 Unseen Prediction Test Case 2

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Unseen Prediction
Test ID: 21
Version: 1
Test Type: Black Box Testing

Input
Text field, e.g. "There are no tropical cyclones in the Atlantic at this time."

Expected Output
Non-Eyewitness

Actual Output
Don't Know

Expected Exceptions
System may not respond

5.2.20 Unseen Prediction Test Case 3

Table 25 Unseen Prediction Test Case 3

Date: 4/8/2021
System: Automatic Eye-witness Identification
Objective: Unseen Prediction
Test ID: 22
Version: 1
Test Type: Black Box Testing

Input
Text field, e.g. "The Galveston hurricane of 1900 remains the deadliest natural disaster in U.S. history"

Expected Output
System will predict the unseen tweet (Non-Eyewitness)

Actual Output
Non-Eyewitness

Expected Exceptions
System may not respond

5.2.21 About Page Test Case

Table 26 About Page Test Case

Date: 1/8/2021
System: Automatic Eye-witness Identification
Objective: About Page
Test ID: 23
Version: 1
Test Type: Black Box Testing

Input
About button click

Expected Output
Redirect to About page

Actual Output
Redirected to About page

Expected Exceptions
None

Chapter 6
Software Deployment
6.1. Installation / Deployment Process Description

• GitHub

First, we install Git on the system and create an account on GitHub. Then we install the GitToolBox plugin in PyCharm (under Plugins).

Figure 27 Text Result Interface

• Using this tool, we push the project to GitHub.

• Heroku

Next, we create an account on Heroku (Cloud Application Platform).

• We create an application on Heroku; after creating the application, we select Python as the language.
• We connect our Heroku account with our GitHub account.
• We click Deploy Branch on the Heroku website to deploy the project.

Figure 28 Text Result Interface

Figure 29 Text Result Interface

• We then check the status of the deployment on Heroku and also from PyCharm.

Figure 30 Text Result Interface

• Check the status from PyCharm.

Figure 31 Text Result Interface

• The link to our project: eyewitnessprediction.herokuapp.com
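For reference, a Flask app deployed this way typically ships two small files in the repository root so Heroku knows how to build and start it. The entry-point name `app:app` and the dependency list below are assumptions about this project, not its actual configuration:

```text
# Procfile — tells Heroku how to start the web process
web: gunicorn app:app

# requirements.txt — dependencies Heroku installs at build time
flask
gunicorn
scikit-learn
```

With these files committed, the Deploy Branch step above builds the Python app and starts the `web` process automatically.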

Page 127 of 108


Capital University of Science & Technology, Islamabad Department of Software Engineering
Chapter 7
REPORT APPROVAL CERTIFICATE

The report of the project, “A Web App to Identify Eyewitness Messages from Twitter using Textual Features”, has been approved based on the following evaluation guidelines.

Table 27 Project Evaluation Guidelines

Artifacts Guidelines
Analysis and design artifacts are syntactically correct (use-case model, SSDs, domain model, class diagram, SDs, ERDs, flow charts, activity diagrams, DFDs)
Consistency and traceability have been maintained among the different artifacts

General Guidelines
Formatting (font style, indentation) is according to the FYP template and consistent throughout the document
Captions are added to all figures and tables. Figure captions must be placed below each figure, and table captions must be provided above the table
Each figure or table is followed by some text describing what it represents

Mr. Mudassar Adeel Ahmed (Examiner 1)
Miss Faria Nazeer (Examiner 2)
Mr. Saqib (Examiner 3)

Dr. Shahid Iqbal Malik (Supervisor)
References

Webpage
[1] https://www.jetbrains.com/pycharm/features/, last accessed July 24, 2021.

[2] https://www.python.org/doc/, last accessed July 24, 2021.

[3] https://www.w3schools.com/html/, last accessed July 24, 2021.

[4] https://code.visualstudio.com/, last accessed July 24, 2021.

[5] https://www.javascript.com/about, last accessed July 25, 2021.

[6] https://www.sqlite.org/index.html, last accessed July 26, 2021.

[7] https://angular.io/, last accessed July 26, 2021.

[8] https://flask.palletsprojects.com/en/2.2.x/, last accessed July 25, 2021.

[9] https://www.postgresql.org/, last accessed July 24, 2021.

