Fake News Detection System Using Machine Learning
The third group is poorly written news articles, which contain
some real news but are not entirely accurate. In short, this is
news that uses, for example, quotes from political figures to
report a totally fake story. Usually, this type of story is
meant to promote a certain agenda or biased opinion [1]. In the
article published by Kai Shu, Amy Sliva, Suhang Wang,
Jiliang Tang, and Huan Liu [2], the authors explored the fake
news problem by reviewing existing literature in two phases:
characterization and detection. In the characterization phase,
they introduced the basic concepts and principles of fake
news in both traditional media and social media. In the
detection phase, they reviewed existing fake news detection
approaches from a data mining perspective, including feature
extraction and model construction.

3. METHODOLOGY
3.4 Feature Selection
In this module we performed feature extraction and selection
using methods from the scikit-learn Python library.

3.5 Count Feature
The CountVectorizer provides an easy way both to tokenize a set
of text documents and build a vocabulary of known words, and to
encode new documents using that vocabulary. It can be used as
follows:
1. Create an instance of the CountVectorizer class.
2. Call the fit() function to learn a vocabulary from one or
more documents.
3. Call the transform() function on one or more documents as
needed to encode each as a vector.

3.6 Classifier
In this module we built the classifiers for fake news
detection. The extracted features are fed into different
classifiers. We used the Logistic Regression classifier from
sklearn. Each of the extracted features was used in the
classifier. After fitting the model, we compared the F1 score
and checked the confusion matrix. After fitting all the
classifiers, the two best-performing models were selected as
candidate models for fake news classification. Finally, the
selected model was used for fake news detection with the
probability of truth. In addition, we extracted the top 50
features from our TF-IDF vectorizer to see which words are most
important in each of the classes. We also used precision-recall
and learning curves to see how the training and test sets
perform as we increase the amount of data given to our
classifiers.

... feature extraction and machine learning techniques. The
proposed model achieves an accuracy of roughly 92% when using
TF-IDF features and a logistic regression classifier. On the
test data, the result is an accuracy of 0.92 and an F1 score of
0.923.

4.1 Input
Figure 2: User entering a news item
Figure 2 shows the user interface where input values are
submitted to the system.

4.2 Output
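
A minimal sketch of how the steps in Sections 3.5 and 3.6 could
produce a prediction with a probability of truth, assuming
scikit-learn is installed. The toy corpus, the labels, and all
variable names are hypothetical and are not taken from the system
described here. Ranking words by logistic regression coefficients
is also an assumption, since the text does not state how the top
features were obtained.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus (assumption); label 1 = real, 0 = fake.
train_texts = [
    "government publishes official budget report",
    "city council approves new budget report",
    "official report confirms election results",
    "shocking secret cure doctors hate",
    "celebrity reveals shocking alien secret",
    "secret miracle cure shocks doctors",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Section 3.5: create the vectorizer, fit() to learn the vocabulary,
# then transform() to encode a document as a vector of word counts.
count_vec = CountVectorizer()
count_vec.fit(train_texts)
counts = count_vec.transform(["official budget budget report"])

# Section 3.6: TF-IDF features fed into a logistic regression classifier.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# F1 score and confusion matrix, computed here on the training texts
# only because the toy corpus is too small to split.
predicted = model.predict(train_texts)
train_f1 = f1_score(train_labels, predicted)
matrix = confusion_matrix(train_labels, predicted)

# Probability of truth: predict_proba columns follow model.classes_,
# which is [0, 1] here, so column 1 is the "real" class.
news = "official report on city budget"
prob_true = model.predict_proba([news])[0][1]

# Most influential words per class, ranked by coefficient (assumption).
tfidf = model.named_steps["tfidfvectorizer"]
clf = model.named_steps["logisticregression"]
index_to_word = {int(i): word for word, i in tfidf.vocabulary_.items()}
order = clf.coef_[0].argsort()
top_fake = [index_to_word[int(i)] for i in order[:3]]
top_real = [index_to_word[int(i)] for i in order[-3:]]

print("probability of truth:", round(float(prob_true), 3))
print("training F1:", train_f1)
print("words pointing to real:", top_real)
print("words pointing to fake:", top_fake)
```

On a real dataset the F1 score and confusion matrix would be
computed on a held-out test set, and the ranking would be extended
to the top 50 words per class.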