Fake Job Prediction

The document presents a project titled 'Fake Job Prediction' aimed at identifying fraudulent job postings using machine learning and natural language processing techniques. It outlines the increasing prevalence of employment scams, the development of a predictive model to distinguish between legitimate and fake job postings, and the algorithms used for analysis. The project emphasizes the importance of protecting job seekers and improving job market security while suggesting future enhancements and collaborations.

Uploaded by

dummyboy353

FAKE JOB PREDICTION
GROUP NUMBER: 32

BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE & TECHNOLOGY

Submitted By
NAME ENROLLMENT NO. REGISTRATION NO.
ADITYA GHOSH 12020009022288 304202000901063
ANUBHAV SENAPATI 12020009022257 304202000901008
ANNWESHA MAHANTA 12020009022172 304202000900639
RAHUL DAS 12020009022215 304202000900682
SNEHA SARKAR 12020009027009 304202000900828
JOYEE SAHA 12020009022168 304202000900635

Under the guidance of

(Prof.) Dr. Sudipta Basu Pal & (Prof.) Dr. Piyali Chandra
Department of COMPUTER SCIENCE AND TECHNOLOGY

TABLE OF CONTENTS
1 ABSTRACT
2 INTRODUCTION
3 PROBLEM STATEMENT
4 SOLUTION
5 ALGORITHM AND FLOW CHART
6 ANALYSIS
7 RESULT AND OUTPUT
8 FUTURE WORK
9 CONCLUSION
10 REFERENCES
ABSTRACT
Employment scams are on the rise.
According to CNBC, the number of
employment scams doubled in 2018
compared to 2017. The current market
situation has led to high unemployment. This
project, “Fake Job Prediction”, is based
on a guided (supervised) model that predicts
whether a job posting is genuine or not. Based
on the opportunities a posting offers and the
identity it presents, we can use ML to check
where the job originated. Given the
current state of jobs and unemployment,
it is necessary to verify the authenticity of a
job posting.
INTRODUCTION
The prevalence of employment fraud is increasing due to the
current economic situation and the impact of the coronavirus,
which has led to high unemployment rates. This situation
creates an opportunity for scammers to take advantage of
vulnerable individuals. Many people are falling prey to these
scammers. The primary goal of these scammers is to extract
personal information, such as bank account details and
addresses, from their victims. Scammers often lure people
with lucrative work opportunities that seem profitable, only
to request payment later on. This poses a significant danger,
but it can be mitigated through the use of Machine Learning
techniques and Natural Language Processing (NLP) which can
differentiate between legitimate and fake job postings.
PROBLEM STATEMENT

With the rise of online job portals and the increasing
number of remote job opportunities, there has been a
corresponding increase in the number of fake job postings
that are designed to scam job seekers. These fake job
postings can be used to collect personal information,
steal money, or carry out other fraudulent activities.
SOLUTION
The goal of the fake job prediction problem is to
develop a model that can automatically distinguish
between legitimate job postings and fake job
postings. This requires analyzing a variety of
features, including the job description, company
information, and contact details. The model must be
able to accurately identify patterns and indicators of
fraud, such as unrealistic job requirements, vague
or misleading descriptions, or requests for personal
information.
The successful development of a fake job prediction
model has important implications for both job
seekers and job portals. It can help to protect job
seekers from falling victim to scams, and can also
help job portals to maintain the quality and
legitimacy of their job listings.
ALGORITHM

1 Natural Language Processing
2 Naïve Bayes Algorithm
3 SGD Classifier

Naïve Bayes and SGD Classifier are compared on accuracy and F1-scores and
a final model is chosen. These models are used on both the text and numeric
data separately and the final results are combined.
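
The baseline listed above can be illustrated with a minimal sketch of a multinomial Naïve Bayes text classifier. This is a plain-Python stand-in written for illustration, not the project's actual code; the toy postings, the labels and the function names `train_naive_bayes` and `predict` are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class; docs are token lists."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(tokens, class_counts, word_counts, vocab):
    """Return the class with the highest log posterior (up to a constant)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)           # log prior P(class)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue                                 # ignore unseen words
            # Laplace (add-one) smoothed estimate of P(word | class)
            p = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up, already-tokenized postings (illustration only).
docs = [
    ["earn", "money", "fast", "no", "experience"],
    ["send", "bank", "details", "earn", "money"],
    ["software", "engineer", "python", "experience"],
    ["data", "analyst", "sql", "experience", "required"],
]
labels = ["fake", "fake", "real", "real"]

model = train_naive_bayes(docs, labels)
print(predict(["earn", "money", "bank"], *model))
print(predict(["python", "engineer", "experience"], *model))
```

The class prior and the per-word conditional probabilities are multiplied (added in log space), which is the "naïve" independence assumption.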
WHY THIS ALGORITHM?

Naïve Bayes Algorithm
Naïve Bayes is the baseline model. It is used because it can
compute the conditional probabilities of occurrence of two
events based on the probabilities of occurrence of each
individual event; encoding those probabilities is
extremely useful.

SGD Classifier
The SGD Classifier is the comparative model. It is used
since it implements a plain stochastic gradient descent
learning routine which supports different loss functions and
penalties for classification. This classifier incurs high
penalties when it classifies incorrectly.
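
The stochastic gradient descent routine described above can be sketched in a few lines. This sketch assumes a hinge loss with an L2 penalty; the loss, penalty and hyperparameters actually used in the project are not stated. The names `sgd_train` and `sgd_predict` and the toy data are hypothetical.

```python
def sgd_train(X, y, epochs=20, lr=0.1, alpha=0.01):
    """Train a linear classifier with hinge loss and an L2 penalty via SGD.

    X is a list of feature lists; y holds labels in {-1, +1}.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The L2 penalty shrinks the weights at every step.
            w = [wj - lr * alpha * wj for wj in w]
            if margin < 1:
                # Misclassified or inside the margin: the hinge loss is
                # active, so move the weights toward the correct side.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def sgd_predict(w, b, xi):
    """Return +1 or -1 depending on the side of the decision boundary."""
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1


# Made-up two-feature examples: +1 = fraudulent, -1 = real.
X = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]]
y = [1, 1, -1, -1]

w, b = sgd_train(X, y)
print([sgd_predict(w, b, xi) for xi in X])
```

The model is updated one example at a time, and only examples that are wrong or too close to the boundary change the weights.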
FLOW CHART

The following steps are taken for text processing:

Tokenization -> To Lower -> Stop word removal -> Lemmatization

Tokenization: The textual data is split into smaller units.
In this case the data is split into words.
To Lower: The split words are converted to lowercase.
Stop word removal: Stop words are words that do not
add much meaning to sentences. For example: the, a,
an, he, have etc. These words are removed.
Lemmatization: The inflected forms of a word are grouped
together so that they can be treated as a single item.
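
The four text-processing steps above can be sketched as a small pipeline. This is a minimal illustration in plain Python; the stop-word set and the lemma lookup table are tiny made-up stand-ins (real pipelines use much larger resources), and the function names are assumptions.

```python
import re

# Tiny illustrative stop-word list and lemma table (assumptions, not the
# project's actual resources).
STOP_WORDS = {"the", "a", "an", "he", "have", "is", "are", "and", "for", "to", "of"}
LEMMAS = {"jobs": "job", "postings": "posting", "working": "work",
          "worked": "work", "companies": "company"}


def tokenize(text):
    """Split text into word tokens (letters and apostrophes only)."""
    return re.findall(r"[A-Za-z']+", text)


def preprocess(text):
    tokens = tokenize(text)                              # 1. tokenization
    tokens = [t.lower() for t in tokens]                 # 2. to lower
    tokens = [t for t in tokens if t not in STOP_WORDS]  # 3. stop word removal
    tokens = [LEMMAS.get(t, t) for t in tokens]          # 4. lemmatization (table lookup)
    return tokens


print(preprocess("The companies have posted jobs for working remotely"))
```

Words missing from the lookup table, such as "posted", pass through unchanged, which is the main limitation of a table this small.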
ANALYSIS
DATA EXPLORATION
The data for this project is available at Kaggle -
https://www.kaggle.com/shivamb/real-or-fake-fake-jobposting-prediction. The dataset consists of
17,880 observations and 18 features.
An initial assessment of the dataset showed that, since
the job postings were extracted from several countries,
they were in different languages. To simplify the
process, this project uses only data from US-based
locations, which account for nearly 60% of the dataset.
This was done to ensure all the data is in English for
easy interpretability. Also, the location is split into
state and city for further analysis.
The final dataset has 10593 observations and 20
features.
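
The US filter and the split of location into state and city can be sketched as below. This assumes the location field is a comma-separated string of the form "US, NY, New York"; that format, the toy rows and the helper name `split_location` are assumptions for illustration.

```python
def split_location(location):
    """Split 'COUNTRY, STATE, CITY' into three stripped parts.

    Missing parts come back as empty strings.
    """
    parts = [p.strip() for p in location.split(",", 2)]
    parts += [""] * (3 - len(parts))
    return parts[0], parts[1], parts[2]


# Made-up rows; the 'location' format is an assumption about the dataset.
rows = [
    {"title": "Data Analyst", "location": "US, NY, New York"},
    {"title": "Vertriebsleiter", "location": "DE, BE, Berlin"},
    {"title": "Office Assistant", "location": "US, CA, "},
    {"title": "Intern", "location": "US"},
]

us_rows = []
for row in rows:
    country, state, city = split_location(row["location"])
    if country == "US":                       # keep US-based postings only
        us_rows.append({**row, "state": state, "city": city})

for row in us_rows:
    print(row["title"], "|", row["state"], "|", row["city"])
```

Splitting one column into two is consistent with the feature count growing after this step.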
The dataset is highly unbalanced with 9868 (93% of the
jobs) being real and only 725 or 7% of the jobs being
fraudulent. A count plot of the same can show the
disparity very clearly.
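
The class counts above also show why accuracy alone can mislead on this dataset. The short sketch below uses only the reported counts; the "always real" model is a hypothetical reference point, not something the project trained.

```python
real, fake = 9868, 725          # class counts reported above
total = real + fake             # 10593 postings

share_real = real / total
share_fake = fake / total

# A trivial model that labels every posting "real" never flags a fraudulent
# posting, so its recall on the fraudulent class is 0, yet its accuracy
# equals the share of real postings.
majority_accuracy = real / total
majority_fraud_recall = 0 / fake

print(f"real: {share_real:.1%}, fake: {share_fake:.1%}")
print(f"always-'real' accuracy: {majority_accuracy:.3f}")
```

Any useful model therefore has to beat roughly 93% accuracy, and a class-sensitive metric such as the F1-score is needed to see how well fraud is actually detected.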
ANALYSIS CONTD.
EXPLORATORY ANALYSIS

The first step to visualize the dataset in this project
is to create a correlation matrix to study the
relationship between the numeric data.
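
A correlation matrix of this kind can be computed with the Pearson coefficient. The sketch below is plain Python on made-up binary columns; the column names are assumptions modelled on typical numeric flags and the values are not taken from the dataset.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return a nested dict: matrix[a][b] is the correlation of a with b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up values for illustration only.
columns = {
    "has_company_logo": [1, 1, 0, 1, 0, 1, 0, 0],
    "has_questions":    [1, 0, 0, 1, 0, 1, 1, 0],
    "fraudulent":       [0, 0, 1, 0, 1, 0, 0, 1],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, [round(v, 2) for v in row.values()])
```

Values near +1 or -1 indicate a strong linear relationship between two numeric columns, and values near 0 indicate little or none.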
ANALYSIS CONTD.
EXPLORATORY ANALYSIS

After the numeric features, the textual features of this
dataset are explored. We start this exploration from
location.

The graph aside shows which states produce the greatest
number of jobs. California, New York and Texas have the
highest number of job postings.
ANALYSIS CONTD.
EXPLORATORY ANALYSIS

The following formula is used to compute how many fake
jobs are available for every real job:

ratio = number of fake jobs / number of real jobs

Only ratio values greater than or equal to one are
plotted aside.
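
The fake-to-real ratio described above can be sketched as follows, reading it as the count of fake postings divided by the count of real postings within each group. Grouping by state and city is an assumption, and the records are made up.

```python
from collections import defaultdict

# Made-up (state, city, fraudulent) records; grouping by state and city
# is an assumption about how the ratio was computed.
postings = [
    ("TX", "Houston", 1), ("TX", "Houston", 1), ("TX", "Houston", 0),
    ("CA", "Los Angeles", 1), ("CA", "Los Angeles", 0), ("CA", "Los Angeles", 0),
    ("NY", "New York", 0), ("NY", "New York", 0),
    ("CA", "Bakersfield", 1), ("CA", "Bakersfield", 0),
]

counts = defaultdict(lambda: {"fake": 0, "real": 0})
for state, city, fraudulent in postings:
    counts[(state, city)]["fake" if fraudulent else "real"] += 1

# ratio = fake postings per real posting; groups with no real postings
# are skipped to avoid dividing by zero.
ratios = {
    place: c["fake"] / c["real"]
    for place, c in counts.items()
    if c["real"] > 0
}

# Keep only groups with at least one fake posting per real posting.
to_plot = {place: r for place, r in ratios.items() if r >= 1}
print(to_plot)
```

A ratio of 1 or more means a job seeker in that group is at least as likely to meet a fake posting as a real one.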
ANALYSIS CONTD.
EXPLORATORY ANALYSIS

A histogram of character counts is explored to
visualize the difference between real and fake jobs.
It shows that, even though the distribution of
character counts is fairly similar for both real and
fake jobs, real jobs have a higher frequency.
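
A character-count histogram of this kind can be built by binning text lengths. In the sketch below, placeholder strings of chosen lengths stand in for job descriptions; the bin width and the function name `char_count_histogram` are assumptions.

```python
from collections import Counter

BIN_WIDTH = 50  # assumed bin size, in characters


def char_count_histogram(texts, bin_width=BIN_WIDTH):
    """Map each bin's starting character count to the number of texts in it."""
    return Counter((len(t) // bin_width) * bin_width for t in texts)


# Placeholder strings of chosen lengths stand in for job descriptions.
real_texts = ["x" * n for n in (40, 60, 75, 130, 145)]
fake_texts = ["x" * n for n in (55, 140)]

real_hist = char_count_histogram(real_texts)
fake_hist = char_count_histogram(fake_texts)

for start in sorted(set(real_hist) | set(fake_hist)):
    print(f"{start}-{start + BIN_WIDTH - 1}: "
          f"real={real_hist[start]} fake={fake_hist[start]}")
```

With far more real postings than fake ones, the real histogram is taller in every bin even when both classes cover a similar range of lengths.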
RESULT AND OUTPUT
The final model used for this analysis is SGD. This is based on the results of
the metrics as compared to the baseline model. The outcomes of the baseline
model and SGD are presented in the table below:

MODEL                    ACCURACY    F1-SCORE
Naïve Bayes Algorithm    0.971       0.743
SGD Classifier           0.974       0.79

Based on these metrics, SGD has a slightly better
performance than the baseline model. This is how the
final model is chosen to be SGD.
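
The two metrics in the table can be computed as sketched below. The labels and predictions are made up to show the mechanics and do not reproduce the reported scores; selecting the model by F1-score is an assumption, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 1 = fraudulent, 0 = real (made-up values for illustration).
y_true  = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
model_a = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0]   # hypothetical baseline predictions
model_b = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]   # hypothetical comparison predictions

scores = {
    "model_a": (accuracy(y_true, model_a), f1_score(y_true, model_a)),
    "model_b": (accuracy(y_true, model_b), f1_score(y_true, model_b)),
}
best = max(scores, key=lambda name: scores[name][1])  # choose by F1-score
print(scores, best)
```

On unbalanced data the F1-score separates models more clearly than accuracy does, which matches the table: the accuracies differ by 0.003 while the F1-scores differ by about 0.05.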
FUTURE WORK
The future scope for a project on identifying and preventing fake
job postings using machine learning can be vast, depending on
how the project is designed and implemented.
The project can be integrated with job portals to automatically scan
all job postings for any signs of fraud. Machine learning
algorithms can be refined over time by continuously training them
on new data, improving their accuracy in identifying fake job
postings. The project can collaborate with law enforcement
agencies to identify and prosecute individuals or organizations
involved in posting fake job advertisements. The project can be
customized to different regions, languages, and cultures to improve
its effectiveness in identifying fake job postings specific to those
regions.
CONCLUSION

In conclusion, a project focused on identifying and
preventing fake job postings using machine
learning can be an effective solution for improving
job market security and protecting job seekers
from fraudulent activities. Additionally, natural
language processing techniques can be used to
analyze candidate resumes and identify any
inconsistencies between their skills and
qualifications and the job requirements stated in
the job posting. However, it is important to note
that such a project should be developed and
implemented with caution and trained on a
diverse range of data to avoid biases and ensure
fairness.
REFERENCE
THANK YOU
