Sentiment Analysis Using NLP

The document discusses sentiment analysis using natural language processing (NLP). It describes the process of loading and summarizing a Twitter sentiment dataset, segregating the data into training and test sets, running a machine learning algorithm, and evaluating the results using a confusion matrix. An example confusion matrix is provided for COVID-19 patient data to calculate accuracy, misclassification rate, and other metrics.

Uploaded by

Vicky Nagar

The sentiment analysis process:

1. Finding the problem: identifying the sentiments from the tweet text.
2. Collecting the dataset: a Twitter sentiment dataset.
3. Loading and summarizing the dataset.
4. Segregating the dataset into X and Y.
5. Removing special characters and symbols with regular expressions.
6. Removing stop words and extracting features.
7. Splitting the dataset into train and test sets.
8. Loading the machine learning algorithm.
9. Predicting with the test dataset.
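Steps 5 and 6 above (cleaning with regular expressions, removing stop words, extracting features) can be sketched in a few lines of Python. This is a minimal illustration, not the document's actual code; the sample tweet and the stop-word list are made up:

```python
import re
from collections import Counter

# Illustrative stop-word list (a real pipeline would use a fuller one,
# e.g. from NLTK).
STOP_WORDS = {"the", "is", "a", "an", "this", "i", "it", "so"}

def clean_tweet(text):
    # Step 5: strip special characters and symbols with a regular
    # expression, keeping only letters and whitespace.
    text = re.sub(r"[^a-zA-Z\s]", " ", text).lower()
    # Step 6 (part 1): remove stop words.
    return [w for w in text.split() if w not in STOP_WORDS]

def extract_features(tokens):
    # Step 6 (part 2): simple bag-of-words term counts as features.
    return Counter(tokens)

tweet = "This movie is AMAZING!!! #loved_it :)"
tokens = clean_tweet(tweet)
print(tokens)                    # ['movie', 'amazing', 'loved']
print(extract_features(tokens))
```

The cleaned token counts would then be the X features fed to the train/test split in step 7.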
Example

A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known.

COVID-19 binary prediction (Yes / No)
Total number of patients: 165

Actual data:
Covid Yes: 105 patients
No Covid: 60 patients

Our ML model's predicted data:
Covid Yes: 110 patients
No Covid: 55 patients

Elements of the confusion matrix:

True positives (TP): cases in which we predicted yes (they have COVID), and they do have COVID.

True negatives (TN): we predicted no, and they don't have COVID.

False positives (FP): we predicted yes, but they don't actually have COVID. (Also known as a "Type I error.")

False negatives (FN): we predicted no, but they actually have COVID. (Also known as a "Type II error.")

10. Evaluating with the Confusion Matrix
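The four cells can be counted directly from paired actual/predicted labels. A minimal sketch, with label lists constructed so that they reproduce the COVID-19 example above (105 actual yes, 110 predicted yes):

```python
# Label lists built to match the example: 100 true positives,
# 5 false negatives, 10 false positives, 50 true negatives.
actual    = ["yes"] * 105 + ["no"] * 60
predicted = ["yes"] * 100 + ["no"] * 5 + ["yes"] * 10 + ["no"] * 50

# Count each cell by comparing actual vs. predicted pairwise.
tp = sum(a == "yes" and p == "yes" for a, p in zip(actual, predicted))
tn = sum(a == "no"  and p == "no"  for a, p in zip(actual, predicted))
fp = sum(a == "no"  and p == "yes" for a, p in zip(actual, predicted))
fn = sum(a == "yes" and p == "no"  for a, p in zip(actual, predicted))

print(tp, tn, fp, fn)  # 100 50 10 5
```

In practice a library routine such as scikit-learn's `confusion_matrix` does the same counting.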

Example: accuracy and loss calculation from the confusion matrix. From the data above, the cell counts are TP = 100, FN = 5, FP = 10, TN = 50.

Accuracy (overall correct): (TP + TN) / total = (100 + 50) / 165 = 0.91

Misclassification rate (overall wrong, also called the error rate): (FP + FN) / total = (10 + 5) / 165 = 0.09

Other calculations:

True positive rate (sensitivity or recall): TP / actual yes = 100 / 105 = 0.95

False positive rate: FP / actual no = 10 / 60 = 0.17

True negative rate (specificity): TN / actual no = 50 / 60 = 0.83

Precision: TP / predicted yes = 100 / 110 = 0.91

Prevalence: actual yes / total = 105 / 165 = 0.64
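All of these metrics follow directly from the four cell counts. A minimal sketch using the example's values (TP = 100, TN = 50, FP = 10, FN = 5):

```python
# Cell counts from the COVID-19 confusion-matrix example.
tp, tn, fp, fn = 100, 50, 10, 5
total = tp + tn + fp + fn  # 165

accuracy          = (tp + tn) / total   # overall correct
misclassification = (fp + fn) / total   # overall wrong (error rate)
recall            = tp / (tp + fn)      # true positive rate / sensitivity
fpr               = fp / (fp + tn)      # false positive rate
specificity       = tn / (tn + fp)      # true negative rate
precision         = tp / (tp + fp)      # correct among predicted yes
prevalence        = (tp + fn) / total   # actual yes in the population

print(round(accuracy, 2), round(recall, 2), round(precision, 2))
# 0.91 0.95 0.91
```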
