FND Imp Points

Uploaded by

shreya halaswamy


LIBRARIES :

➢ tkinter – Python's standard library for building graphical user interfaces (GUIs)

➢ ttk – tkinter's themed widget module, providing widgets such as buttons, labels and entry fields
➢ messagebox – tkinter module used to display pop-up dialogs (information, warnings, errors)
➢ PIL – Python Imaging Library – used for loading and processing images (its ImageTk module
converts PIL images into a form that tkinter widgets can display)
➢ NumPy – Numerical Python – provides array objects and fast mathematical operations on
them
➢ pandas – Python Data Analysis library (the name comes from "panel data") – used to load,
clean and analyse tabular data, e.g., finding invalid or missing (NULL) values and computing statistics
➢ itertools – standard-library module of tools for building and combining iterators
➢ sklearn.model_selection – scikit-learn module with tools for splitting data and evaluating models
➢ train_test_split – function from sklearn.model_selection that separates the data into a TRAIN
set (used to fit the model) and a TEST set (used to evaluate it)
➢ TfidfVectorizer – class imported from sklearn.feature_extraction.text – converts the raw text
into a TF-IDF matrix
➢ sklearn.metrics – functions for measuring model performance, such as accuracy and the confusion matrix
➢ Passive Aggressive Classifier (PAC) – an online learning algorithm that updates its model
one example at a time. In FND, the PAC is trained on the TF-IDF vectors of the news text
(TF-IDF is the input representation, not part of the algorithm itself)
a. Passive – if the prediction is correct, leave the model unchanged
b. Aggressive – if the prediction is incorrect, change the model just enough to correct it
c. Classifier – the model assigns each news item to a class (FAKE or REAL)
• The labelled news dataset is split into two parts. The model is trained on one part,
where it sees both the text and the true label. It then predicts the labels of the
remaining part, and those predictions are compared with the true labels.
• For example, if 1000 news items are available, 800 can be used for training.
The remaining 200 are then predicted as true or fake by the model trained on
the 800 items.
➢ TF-IDF – Term Frequency – Inverse Document Frequency
➢ TF (Term Frequency) – number of times a specific term appears in a document
➢ Document Frequency (DF) – number of documents containing a specific term
➢ Inverse Document Frequency (IDF) – a weight that shrinks as DF grows (commonly
log(N / DF) for N documents), so terms that are rare across the documents count as more
important. The TF-IDF score of a term is TF × IDF.
➢ accuracy_score – used to calculate the accuracy of the model’s predictions
➢ Confusion Matrix – summarises the performance of the ML model by counting how its
predictions compare with the actual labels.
• For two classes it has 2 rows and 2 columns holding four counts – TP, TN, FP, FN
(taking true news as the positive class)
1. TP (True Positive) – news which is actually true and predicted as true
2. TN (True Negative) – news which is actually fake and predicted as fake
3. FP (False Positive) – news which is actually fake but predicted as true
4. FN (False Negative) – news which is actually true but predicted as fake
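
The counts in a confusion matrix can be tallied by hand. The sketch below is a from-scratch illustration, not scikit-learn's implementation. It assumes the layout in which rows are actual labels and columns are predicted labels; the helper name confusion_counts and the six example labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, labels):
    """Return a matrix m where m[i][j] counts items whose actual label is
    labels[i] and whose predicted label is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Hypothetical labels for six articles (not taken from news.csv).
y_true = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL"]
y_pred = ["FAKE", "FAKE", "REAL", "REAL", "REAL", "FAKE"]

cnf = confusion_counts(y_true, y_pred, labels=["FAKE", "REAL"])

# With REAL treated as the positive class:
tn, fp = cnf[0]  # actual FAKE: predicted FAKE (TN), predicted REAL (FP)
fn, tp = cnf[1]  # actual REAL: predicted FAKE (FN), predicted REAL (TP)
print(cnf)
print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
```

The diagonal entries (cnf[0][0] and cnf[1][1]) are the correctly classified items; everything off the diagonal is a mistake.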

CODE :
1. Defining a function:

def accuracy( ):

• The function is named accuracy (Python names are case-sensitive).

2. Reading and Loading the Data:

df=pd.read_csv('news.csv')

• This line reads the data from a CSV file named 'news.csv' and stores it in a pandas
DataFrame called df. The DataFrame holds the contents of the CSV file, allowing us
to manipulate and analyse it.
➢ DataFrame – two-dimensional, labelled table (rows and columns) provided by the pandas
library – used for data analysis
➢ .csv file – comma-separated values file – plain-text file that stores tabular data, one row
per line (unlike Excel sheets, it holds no formatting or formulas)
3. Exploring the Data:

df.shape

df.head( )

• The shape attribute of a DataFrame returns a tuple (number of rows, number of
columns). Here the result is not assigned to any variable, so it is only shown
when the line is run interactively.
• The head( ) method returns the first five rows of the DataFrame. The result will not
be visible on the GUI or the console because it is not passed to the print( ) function.

4. Preparing the Data for Machine Learning:

labels = df.label

labels.head( )

• The 'label' column of the DataFrame is extracted and assigned to a variable called
labels. labels.head( ) returns its first five values.

5. Splitting the Data into Training and Testing Sets:

x_train, x_test, y_train, y_test = train_test_split(df['text'], labels, test_size=0.2, random_state=7)

• The train_test_split function from scikit-learn is used to split the dataset into
training and testing sets.
• The 'text' column of the DataFrame is the feature and is split into x_train and
x_test; the 'label' column is the target and is split into y_train and y_test.
• The test_size parameter specifies the fraction of the data to be used for testing (in
this case, 0.2, i.e., 20%), and random_state ensures the same train-test split is
reproduced when the code is run multiple times with the same random_state value.
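
To see what test_size and random_state do, the following sketch performs a seeded shuffle-and-split with the standard library only. It is a simplified stand-in for illustration, not scikit-learn's implementation, and will not reproduce the exact rows that train_test_split selects. The function name simple_train_test_split and the ten example articles are hypothetical.

```python
import random


def simple_train_test_split(features, labels, test_size=0.2, random_state=None):
    """Shuffle the row indices with a seeded generator, then use the first
    `test_size` fraction of the shuffled indices as the test set."""
    indices = list(range(len(features)))
    random.Random(random_state).shuffle(indices)
    n_test = round(len(indices) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical data: ten articles with alternating labels.
texts = [f"article {i}" for i in range(10)]
labels = ["FAKE" if i % 2 else "REAL" for i in range(10)]

first = simple_train_test_split(texts, labels, test_size=0.2, random_state=7)
second = simple_train_test_split(texts, labels, test_size=0.2, random_state=7)
x_train, x_test, y_train, y_test = first

print(len(x_train), len(x_test))  # sizes of the two parts
print(first == second)            # same seed, same split
```

Each text stays paired with its own label because features and labels are indexed with the same shuffled positions.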
6. TF-IDF Vectorization:
tfidf_vectorizer = TfidfVectorizer(stop_words='english', max_df=0.7)
tfidf_train = tfidf_vectorizer.fit_transform(x_train)
tfidf_test = tfidf_vectorizer.transform(x_test)
• TF-IDF vectorization converts the text data (news) into numerical vectors.
TfidfVectorizer from scikit-learn is used for this purpose. The parameter
stop_words='english' removes common English stop words, and max_df=0.7 sets
the maximum document frequency for the words to be included in the vocabulary
(words appearing in more than 70% of the documents will be ignored).
• The fit_transform( ) method is applied to the training data (x_train) to learn the
vocabulary and transform the text into TF-IDF vectors.
• The transform( ) method is applied to the testing data (x_test) to transform it using
the learned vocabulary.
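
The fit-then-transform idea can be sketched from scratch. The example below uses the textbook weighting TF × log(N / DF); scikit-learn's TfidfVectorizer applies smoothing and normalisation by default, so its numbers will differ. The function names fit_idf and transform and the example sentences are assumptions made for illustration, and stop-word removal and max_df filtering are left out.

```python
import math
from collections import Counter


def fit_idf(train_docs):
    """Learn the vocabulary and one IDF weight per term from the training documents."""
    n_docs = len(train_docs)
    doc_freq = Counter()
    for doc in train_docs:
        doc_freq.update(set(doc.lower().split()))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def transform(doc, idf):
    """Weight each known term by (count in this document) * IDF.
    Terms that were not seen during fitting are dropped."""
    counts = Counter(doc.lower().split())
    return {term: counts[term] * idf[term] for term in counts if term in idf}


# Hypothetical training documents.
train_docs = [
    "the senate passed the bill",
    "the aliens passed the senate",
    "the bill shocked aliens",
]
idf = fit_idf(train_docs)

# A test document is transformed with the vocabulary learned from the training data only.
vector = transform("the senate shocked reporters", idf)
print(vector)
```

Here "the" occurs in every training document, so its weight is zero, while "shocked" occurs in only one and gets the largest weight. "reporters" was never seen during fitting, so it does not appear in the vector, which mirrors why transform( ) rather than fit_transform( ) is used on the test data.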

7. Training the Classifier:

pac = PassiveAggressiveClassifier(max_iter=50)

pac.fit(tfidf_train, y_train)

• The max_iter parameter sets the maximum number of passes over the training data
(epochs). The classifier is trained using the TF-IDF vectors of the training data
(tfidf_train) and their corresponding labels (y_train).
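
The passive/aggressive behaviour can be illustrated with a single update step. This is a minimal sketch of the basic passive-aggressive rule for a linear model with labels -1 and +1, not scikit-learn's PassiveAggressiveClassifier (which adds a regularisation parameter and other options). The names dot and pa_update and the example vector are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length lists of numbers."""
    return sum(a * b for a, b in zip(u, v))


def pa_update(weights, x, y):
    """One passive-aggressive step for a label y in {-1, +1}.
    Assumes x has at least one non-zero entry."""
    loss = max(0.0, 1.0 - y * dot(weights, x))  # hinge loss on this example
    if loss == 0.0:
        return weights  # passive: prediction is correct with enough margin
    tau = loss / dot(x, x)  # smallest step that removes the loss
    return [w + tau * y * xi for w, xi in zip(weights, x)]  # aggressive update


# Start from a zero model and show it one example labelled +1.
w = [0.0, 0.0, 0.0]
x, y = [1.0, 0.0, 2.0], 1

w = pa_update(w, x, y)        # aggressive: the zero model has loss 1 here
w_again = pa_update(w, x, y)  # passive: the same example no longer causes a loss
print(w)
print(w_again)
```

After the first call the model classifies the example with a margin of 1, so the second call leaves the weights as they are.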

8. Making Predictions and Calculating Accuracy:

y_pred = pac.predict(tfidf_test)

score = accuracy_score(y_test, y_pred)

• The trained classifier is used to predict the labels for the test data (tfidf_test, the
vectorized form of x_test). The predictions are stored in the y_pred variable.
• The accuracy_score( ) function from scikit-learn is used to compare the predicted
labels (y_pred) with the actual labels (y_test) and calculate the accuracy of the
model. The accuracy value is stored in the score variable.
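
Accuracy is simply the fraction of predictions that match the actual labels. The sketch below computes it by hand for five made-up labels; the helper name accuracy_fraction is hypothetical, and the last line formats the result the same way the notes do in the next step.

```python
def accuracy_fraction(y_true, y_pred):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(actual == predicted for actual, predicted in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical actual and predicted labels for five test articles.
y_test = ["FAKE", "REAL", "REAL", "FAKE", "REAL"]
y_pred = ["FAKE", "REAL", "FAKE", "FAKE", "REAL"]

score = accuracy_fraction(y_test, y_pred)  # 4 of the 5 predictions match
print(f"Accuracy: {round(score * 100, 2)}%")
```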

9. Displaying the Results on the GUI:

print(f'Accuracy: {round(score*100,2)}%')

CNF = confusion_matrix(y_test, y_pred, labels=['FAKE', 'REAL'])

print(CNF)

accu = str(round(score*100, 2)) + "%"

e1.insert(10, accu)

real = str(CNF[0, 0])

e2.insert(10, real)

fake = str(CNF[1, 1])

e3.insert(10, fake)

• The code prints the accuracy of the model and the confusion matrix to the console.
The accuracy is also inserted into an Entry widget e1 on the GUI.
• The confusion matrix is calculated using the confusion_matrix( ) function from
scikit-learn, and its diagonal elements (the correctly classified counts) are inserted
into the e2 and e3 Entry widgets, respectively.
• Because labels=['FAKE', 'REAL'] is passed, CNF[0, 0] counts the FAKE news
predicted as FAKE and CNF[1, 1] counts the REAL news predicted as REAL. The
variable names real and fake are therefore the opposite of what they hold.
• The code assumes that there are 3 Entry widgets (e1, e2, and e3) which are already
defined on the GUI, where the accuracy and confusion matrix values will be
displayed. If these Entry widgets are not defined earlier in the code, this part will
result in an error.
• e1/e2/e3 – variable names of the Entry widgets (any valid name can be used)
• insert( ) – inserts text into the Entry widget
• 10 – index (character position) at which the text is inserted; if the Entry holds
fewer than 10 characters, the text is placed at the end
GUI :
