Final Edit
BY
180404027
This is to certify that this seminar work was carried out by ANTHONY OLUWATOBILOBA
EMMANUEL with matric number 180404027 in the Department of Computer Science, Faculty of
CHAPTER ONE
INTRODUCTION
1.1 Background of the Study
1.2 Statement of the Problem
1.3 Aim and Objectives
1.4 Methodology
1.5 Scope of the Study
1.6 Expected Contribution to Knowledge
1.7 Definition of Terms
CHAPTER TWO
LITERATURE REVIEW
2.0 Introduction
2.1 Research Concepts
2.1.1 Curriculum Vitae (CV)
2.1.2 How a Curriculum Vitae (CV) Works
2.1.3 Differences between Curriculum Vitae (CV) and Resume
2.1.4 Different types of CV
2.1.5 Benefits of CV
2.1.6 An Overview of NLP
2.1.7 Using Machine Learning for analysis and Classification
2.2 Review of related works
REFERENCES
CHAPTER ONE
INTRODUCTION
Whether seeking temporary employment to gain experience or a permanent position with an
organization, an applicant must provide one of these documents with their application
materials: a résumé or a curriculum vitae (CV). These documents serve as important marketing
tools that give a self-portrait or advertisement of the applicant and present their relative
strengths, skills, and experiences to a potential employer (Bhushan Kinge et al., 2022). An
effective resume or CV provides an employer with an overview of who the applicant is as a student
or young professional, what they know and can do in relation to the position of interest, and what
relevant skills, traits, and accomplishments they have achieved at this point in their education or
career. Therefore, the objective of a resume or CV is to catch the eye of a prospective
employer and secure an interview (Anggakusuma et al., 2020).
Every day, a company with a job opening for a particular position may receive thousands of
emails from potential employees. It is exceedingly difficult for recruiters to manually go
through that many resumes to locate the top candidates for the post. About 75% of the resumes
sent to an organization in response to a job posting do not demonstrate the pertinent abilities
needed for the position. As a result, selecting the best individuals from a large pool of
applicants is difficult. The process of reviewing each applicant's CV is also cumbersome; as
such, a company may have to adopt the use of an external recruiter (Amin et al., 2019).
The recruitment industry is worth $200 billion, and it deals with selecting, from an enormous
pool of applicants, the best candidates who have the necessary skills for a certain job description.
Numerous applicants send their resumes to an organization to apply for any job openings that
may exist, and screening the resumes of all job applicants is a core part of the recruiting process
for any recruiter (Bersin, 2017). Because employing an external recruiter can be expensive, there
has always been a search for an automated process in which employers can quickly select eligible
candidates and applicants can show their ingenuity by using a single application format to apply
to several organizations. An automated system is therefore a tried and tested way to carry out the
process. This study adopts the application of machine learning.
In machine learning, a system is developed with a dataset so that it can predict the intended
result from new data. Natural language processing (NLP), which deals with the way humans
communicate with one another, is primarily used to screen the resumes. The goal of NLP is to
enable computers to comprehend spoken and written language in a manner similar to that of humans.
NLP combines rule-based computational linguistics with statistical, machine learning and deep
learning models. Together, these technologies help computers process human language in the form
of text or voice data and to ‘understand’ its full meaning. As the job market is growing in India,
millions of new job seekers are joining the workforce every year, as per LinkedIn (Suhua et al., 2020).
Although the data formats used in CVs/resumes are not entirely unstructured, it is still difficult to
accept them in a standardized format since there is no set of rules for writing a CV/resume.
Machine learning and NLP make it possible to analyze written documents such as resumes, offering
the potential to interpret unstructured data and extract relevant information from it, as well as
the ability to teach
The number of positions available is not enough to cover the staggering number of
applications/resumes a company will receive. Each applicant is unique, as different people with
different experiences apply for jobs. Staff in the human resources division have to review
hundreds to thousands of resumes in order to find the best fit for a job opening. Hence, if a
company hires in bulk, finding the talent it needs among the many applications requires a
considerable amount of resources and time. On an
summary together and adding the data to the database. Executives condense the résumé, enter
the applicant's contact information into their database and call applicants for interviews after
receiving the resume; a machine learning system, by contrast, will rank the resumes that are the
best fit for the job role using NLP algorithms.
The system will also ensure the switch from labor-intensive human resume processing to
incredibly quick and affordable software. The following objectives are identified as necessary in
meeting the set goals.
1.3.1 Aim
The aim of this project is to develop a machine learning model that automates the extraction of
required information from a candidate's resume, without the need to manually go through every
resume submitted by applicants. To achieve the project aim, the following objectives would be considered.
1.3.2 Objectives
a. develop a system that can accept new CV/Resume for review using NLP (Natural
1.4 Methodology
The methodology for the development of a curriculum vitae review system based on natural
language processing is depicted in Figure 1.1 below.
[Figure 1.1: Information Gathering -> Building System -> Testing System -> Deployment]
i. Information gathering: This process involves collecting the necessary tools,
reviewing related works and identifying the best and most effective tools/techniques to
ii. Building the system: This process involves writing the code to achieve the project
iii. Testing the system: This process involves creating and structuring a format that the
system will follow to review the document. The new CV to be reviewed will be
supplied and a similarity score of up to 100% will be produced. The higher the
iv. Deployment: The system can be packaged and a manual will be provided on how the
The difficulty of extracting relevant information from a resume in an organized fashion can be
overcome with the aid of a purpose-built system. This study aims at developing a machine learning
model that can help automate the process. This will be achieved by implementing an NLP
algorithm capable of comparing a submitted resume with the expected format and
information needed by a company. The scope of the model is limited to accepting a company's
resume format and a new resume, and measuring the similarity of the new resume to that
format.
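The comparison described above can be illustrated with a minimal sketch. It assumes that both the company format and the resume are available as plain text, and it uses cosine similarity over simple word counts as the similarity measure; the measure, the function names and the sample texts are illustrative assumptions, not the method fixed by this study.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def similarity_percent(resume_text, format_text):
    """Cosine similarity between word-count vectors, scaled to 0-100."""
    a = Counter(tokenize(resume_text))
    b = Counter(tokenize(format_text))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document shares nothing with the other one
    return 100.0 * dot / (norm_a * norm_b)


# Hypothetical sample texts, for illustration only.
company_format = "education skills python machine learning experience"
new_resume = "Skills: Python, machine learning. Education: BSc Computer Science."
score = similarity_percent(new_resume, company_format)
print(f"Similarity: {score:.1f}%")
```

A higher percentage indicates that the resume shares more of its vocabulary with the company's format. A production system would likely weight terms (for example with TF-IDF) rather than use raw counts.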
1.6 Expected Contribution to Knowledge
The proposed machine learning model is expected to contribute to knowledge by providing the
required tool to extract the crucial data from a CV and find relevant qualifications from a variety
science, and artificial intelligence concerned with the interactions between computers and
human language, in particular how to program computers to process and analyze large
ii. CV: A CV, which stands for curriculum vitae, otherwise known as a resume, is a
document used when applying for jobs. It allows you to summarise your education, skills
and experience, enabling you to successfully sell your abilities to potential employers.
iii. Machine learning (ML) is a field of inquiry devoted to understanding and building
methods that 'learn', that is, methods that leverage data to improve performance on some
LITERATURE REVIEW
2.0 Introduction
Hiring the right talent is a challenge for all businesses. Manually screening a large number of
resumes/CVs takes at least one day. If a recruiter settles on 4-6 appropriate resumes while going
through the initial resumes, chances are that they will not consider the other submitted resumes.
This decreases the likelihood of a strong resume being shortlisted. Going through each
resume is time-consuming, and manually organizing and managing a large number of resumes is
challenging. Wherever there is human involvement, some prejudice is also to be expected
(Naik et al., 2022).
This challenge is magnified by the high volume of applicants if the business is labor-intensive,
growing, and facing high attrition rates. An example of such a business is that IT departments are
technical skills and business domain expertise are hired and assigned to projects to resolve
customer issues (Barrak et al., 2022). This task of selecting the best talent among many is known
as resume screening. Typically, large companies do not have enough time to open each CV, so
they use machine learning algorithms for the resume screening task, and efficient hiring in turn
helps reduce the unemployment rate. Machine learning is a field that involves training a
model with data to anticipate the intended outcome when new data is submitted. Natural
language processing (NLP) is commonly used to screen resumes. Natural language refers to
NLP enables a system to find and interpret text, based on the English dictionary, in the same
way as humans. NLP combines statistical, machine learning, and deep learning models of human
language with rule-based modeling from computational linguistics. The data to be checked may
come in different formats, either as documents or as audio data, and the goal is to understand
its whole meaning (Dimopoulos, 2019). The number of applications is in the millions, making it a
time-consuming chore to sort through them. A machine learning algorithm is therefore needed that
provides a better way of understanding the resumes and fulfils the requirements of the industry.
The proposed system takes as input a CSV file that contains different categories and resumes;
based on the category and features of each resume, accuracy and performance are calculated using
different machine learning classifiers.
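As an illustration of this kind of pipeline, the sketch below reads category/resume pairs from CSV text, trains a small multinomial Naive Bayes classifier with Laplace smoothing, and reports accuracy. The column names (`Category`, `Resume`), the sample rows and the choice of Naive Bayes are assumptions made for the example; the source does not specify them.

```python
import csv
import io
import math
import re
from collections import Counter, defaultdict

# Hypothetical stand-in for the input CSV file; column names are assumptions.
CSV_DATA = """Category,Resume
Data Science,python machine learning pandas statistics models
Data Science,deep learning python neural networks data analysis
Web Development,html css javascript react frontend websites
Web Development,javascript node backend rest api html
"""


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def load_rows(csv_text):
    """Return (category, tokens) pairs parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["Category"], tokenize(row["Resume"])) for row in reader]


def train(rows):
    """Count documents per category and word occurrences per category."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for category, tokens in rows:
        doc_counts[category] += 1
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return doc_counts, word_counts, vocab


def predict(model, tokens):
    """Pick the category with the highest log-posterior (Laplace smoothing)."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best, best_score = None, -math.inf
    for category in sorted(doc_counts):
        total_words = sum(word_counts[category].values())
        score = math.log(doc_counts[category] / total_docs)
        for t in tokens:
            score += math.log(
                (word_counts[category][t] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best, best_score = category, score
    return best


def accuracy(model, rows):
    """Fraction of rows whose predicted category equals the labelled one."""
    correct = sum(predict(model, tokens) == cat for cat, tokens in rows)
    return correct / len(rows)


rows = load_rows(CSV_DATA)
model = train(rows)
print(predict(model, tokenize("python statistics and machine learning")))
print(accuracy(model, rows))
```

Accuracy here is measured on the same rows used for training, so it is optimistic; a real evaluation would hold out a separate test split and compare several classifiers on it.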
In the study Employer's Expectations: A Probabilistic Text Mining Model (Gao and Eldin, 2018),
more than 20,000 job advertisements from various websites were processed, and text
mining was applied to identify information skills derived from web pages of the construction
industry sector. The research named Text Analysis for Job Matching Quality Improvement
(Kinoa et al., 2017) worked in a context of data analysis that includes travel time, job location,
job type, rates, candidate skill set, etc.; by applying keywords in a machine learning process using
text mining tools, effective keywords were discovered for a job matching system. The
research entitled Natural Language Processing and Text Mining to Identify Knowledge Profiles
for Software Engineering Positions (Almada et al., 2017) applied NLP and text mining (TM)
to analyze the unstructured text of resumes and job offers, and it manages to identify the
The research entitled Data Mining Approach to Monitoring the Requirements of the Job
Market: A Case Study (Karakatsanis et al., 2018) presents an approach based on data mining to
identify the most demanded occupations in the modern labor market. To achieve this, the authors
use a latent semantic indexing model that is able to match the job announcement extracted from the
A curriculum vitae works in much the same way as a resume, providing information about an
individual's educational and work history. Often called a CV for short, it's much more
comprehensive than the typical resume and can be much longer. There's no limit to how long a
CV can be, but it must be focused on academic and professional experience. A lengthy CV isn't
any better than a short one if it contains fluff or irrelevant data. A job applicant seeking an
should always use a CV. If you're unsure whether a prospective employer wants a resume or CV,
use the job announcement to guide you. It will usually state which document the institution
A CV begins with your contact information, including your name, address, telephone number,
and email address. You should also indicate your area or areas of academic interest. Your CV
should include a comprehensive account of your academic history, including the title of your
dissertation or thesis. It must also contain details about all publications, research projects, and
presentations to which you have contributed. You should also list any grants, academic awards,
and other related honors you've received. The employment and experience section of your CV
should contain teaching and research positions, both paid and unpaid. In addition to jobs, include
any relevant internships and volunteer experiences here. Following that section, discuss
memberships in scholarly and professional associations and include offices you have held, if any.
Finally, provide a list of references, along with their contact information, on your curriculum
vitae. This is in contrast to a resume, which usually omits this information.
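Because these sections recur in most CVs, an automated reviewer can hold them in a structured record. The following is a minimal sketch, assuming a hypothetical `CVRecord` layout, simple regular expressions for contact details, and the name appearing on the first non-empty line; none of these choices come from the source.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CVRecord:
    """Hypothetical container for the CV sections described in the text."""
    name: str = ""
    email: str = ""
    phone: str = ""
    academic_interests: list = field(default_factory=list)
    publications: list = field(default_factory=list)
    references: list = field(default_factory=list)


# Simplified patterns; real CVs need more robust matching.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def extract_contact(text):
    """Fill the contact fields from raw CV text; other fields stay empty."""
    record = CVRecord()
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if lines:
        record.name = lines[0]  # assumption: name is the first non-empty line
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    if email:
        record.email = email.group()
    if phone:
        record.phone = phone.group()
    return record


# Hypothetical sample CV header, for illustration only.
sample = """Jane Doe
12 Example Street, Lagos
Phone: +234 800 000 0000
Email: jane.doe@example.com
"""
record = extract_contact(sample)
print(record)
```

Only the contact block is extracted here; sections such as publications or references would need their own heading detection before they could be filled in.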
A resume is a summary of your background and experience. Its emphasis is on your work
background. Resumes are typically two pages or less, while CVs can be as long as needed to
convey your academic background and experience. CVs are used for academic positions, and
the format can vary as long as it includes all the information your prospective employer
requires. Resumes are used for most other positions and follow a few standard templates.
CV | RESUME
Comprehensive list of your academic and professional experience | Summary of your relevant work experience
applications
If you're crafting a new CV (or your first CV) you'll need to think about what type of CV you
want to make. This will depend on your experience, circumstances, industry and personal
This is the most traditional type of CV, and is what most employers expect to see. A
chronological CV lays out your professional experience in reverse chronological order so that
your most recent job is at the top of the page. Ideally, a CV should go back around 10-15 years,
Although most CVs are chronological, in certain situations you may decide to order them
differently. For example, if you are changing careers, you might prioritise education and
experience that is most relevant to the role you're applying for, moving less relevant experience
further down the page. However, ensure that your CV is as clear as possible for potential
employers.
Creative CVs heavily use visual elements such as pictures, graphs and colors to represent skills
and experience. Creative CVs are common in fields such as marketing or design, but may not be
a good idea for more formal industries like banking or law. You can get an idea of whether a
creative CV would impress your potential employer by studying their job advert and website:
if it's written very formally, it's probably best to stick to a traditional CV.
2.1.5 Benefits of CV
A CV is important because it serves as an attention-grabbing bridge between you today and the
more successful future of living your dream career. The curriculum vitae is among the most
important documents you will write to secure a job opportunity. It will be your first impression,
and it should leave people wanting more. It should show the best angle of your image to your
i. Boost self-confidence
Writing down your skills and abilities on your CV is a very constructive thing to do for your
self-confidence. With all of your positive traits on a piece of paper (or a screen) in front of you,
you will feel a strength you may not have expected. Well-earned confidence is half the battle
won, and future employers are more likely to hire confident candidates. That is all the more
reason to create your own self-confidence booster.
A good CV presents not only your positive traits but also your certifications, experience, and other
notable achievements. Those are your proof of knowledge and help position you as one of
the self-aware candidates. Employers look for these kinds of credentials when they search for
candidates. They know that someone who has done something before knows how to do it better
Since we're already mentioning your college organizational experience, your curriculum vitae
would also be a good place to display the teamwork skills you've put to good use. In all
fairness, only a handful of jobs out there don't require you to work with a
team to finish your daily tasks. This is why making your teamwork skills easily known is a good
Let's face it, how would you feel if you received a letter without clear information? Would you try
to get to know the person sending the letter? Most of us would just forget that letter and grab the
next letter in the queue. For employers, that clear information is your CV. Make a
memorable resume so that potential employers remember your application and try to get
With a good CV in the hands of your potential employer, there will be less interrogation
directed at you. If they already have the information needed for the job description
you are aiming for, the interview process can focus on what kind of person you are instead of
the details in the official documents. Talking about what you can do then amounts to recounting
your own life, backed up by your CV, which can be a pleasant process. Reciting what you
can do is another confidence booster, at least in the author's experience.
A great CV makes for a great hiring process, both for you as the job seeker and for your
potential employer. With all your suitable traits written down on the CV that they are now
holding, prospective employers can't help but conclude that you're a person with good attention
to detail, who will go and research things that need to be done, an independent yet inquisitive
worker. Add that to a pleasant interview in their office and, at the
very least, you are likely to be on their list of top candidates to hire (Karakatsanis et al., 2018).
2.1.6 An Overview of NLP
Natural language processing (NLP) is a subfield of Artificial Intelligence (AI). It is widely
used in personal assistants deployed across various business fields. The technology takes the
speech provided by the user, breaks it down for proper understanding and processes it
accordingly. It is a recent and effective approach, which is why it is in high demand in today's
market. NLP is an emerging field in which many advances, such as compatibility with smart
devices and interactive conversation with humans, have already been made possible. Knowledge
representation, logical reasoning, and constraint satisfaction were the emphasis of AI
applications in NLP, applied first to semantics and later to grammar
(Anggakusuma et al., 2020).
In the last decade, a significant change in NLP research has resulted in the widespread use of
statistical approaches such as machine learning and data mining on a massive scale. The need for
automation is never-ending, given the amount of work required to be done these days, and NLP
is a very favorable aspect when it comes to automated applications. Its applications
have made it one of the most sought-after methods of implementing machine learning.
Natural Language Processing (NLP) is a field that combines computer science, linguistics, and
machine learning to study how computers and humans communicate in natural language. The
goal of NLP is for computers to be able to interpret and generate human language. This not only
improves the efficiency of work done by humans but also helps in interacting with the machine.
NLP bridges the gap of interaction between humans and electronic devices.
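To make the idea of "breaking down" language concrete, the sketch below shows a common first step in text processing: lowercasing, tokenizing, and removing stop words. The stop-word list is a small illustrative assumption; practical systems use much larger lists or a library such as NLTK.

```python
import re

# Small illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "with", "for", "is"}


def preprocess(text):
    """Lowercase, tokenize on letters/digits, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("Experienced in Python and the design of NLP systems.")
print(tokens)
```

The remaining tokens carry most of the content of the sentence and can be passed on to later stages such as feature extraction or classification.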
pros and cons but none of them stands as a perfect solution. Static analysis is one of the
approaches and it can be defined as the analysis of a software without its execution. It is clear
that a good analysis tool can help spot and eradicate vulnerabilities, furthermore, it is becoming a
part of the development process. But there is still room for improvement and all the research
work done in this area can be of utmost relevance for the industry (Mohammed and Behrouz,
2018).
There are different types and classifications of machine learning models, provided by different
i. Decision trees:
Decision trees are a simple, but powerful form of multiple variable analysis. They are produced
by algorithms that identify various ways of splitting data into branch-like segments. Decision
trees partition data into subsets based on categories of input variables, helping you to understand
Regression is one of the most popular methods in statistics. Regression analysis estimates
relationships among variables, finding key patterns in large and diverse data sets, and how they
Patterned after the operation of neurons in the human brain, neural networks (also called
artificial neural networks) are a variety of deep learning technologies. They’re typically used to
solve complex pattern recognition problems – and are incredibly useful for analyzing large data
sets. They are great at handling nonlinear relationships in data – and work well when certain
Time Series Algorithms, Clustering Algorithms, Ensemble Models, Factor Analysis, Naïve
Bayes and Support Vector Machines. Each classifier approaches data in a different way; therefore,
to get the results they need, organizations must choose the right classifiers and
models. Data scientists and IT experts are tasked with choosing the right
Juneja et al. (2016) used Natural Language Processing (NLP) and Machine Learning (ML) to
rank resumes according to given constraints. Their intelligent system ranks resumes of
any format according to the constraints or requirements provided by the client
company. The system takes a bulk of input resumes from the client company, and the
client company also provides the requirements and constraints according to which the
resumes should be ranked. Besides the information provided by the resume, the system
reads the candidates' social profiles (like LinkedIn, GitHub, etc.), which will give the
Amin et al. (2019): this research focuses mainly on the design of a web application
used to screen resumes (curricula vitae) for a particular job posting. In the proposed
system, the web application encourages both job applicants and recruiters to
use it for job applications and the screening of resumes. Recruiters from various companies can
post the details of the job openings available in their respective companies. The interactive web
application allows job applicants to submit their resumes and apply for the job postings
they are interested in. The resumes submitted by the candidates are then compared with
the job profile requirement posted by the company recruiter using techniques such as machine
learning and Natural Language Processing (NLP). Scores can then be given to the resumes, and
they can be ranked from highest match to lowest match. This ranking is made visible only to the
company recruiter who is interested in selecting the best candidates from a large pool of candidates.
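The score-and-rank step can be sketched as follows. This is a minimal illustration, assuming the score is the fraction of job-profile keywords found in each resume; the scoring rule, function names and sample data are hypothetical and are not taken from Amin et al. (2019).

```python
import re


def keywords(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def match_score(resume_text, job_text):
    """Fraction of job-profile keywords that also appear in the resume."""
    job = keywords(job_text)
    if not job:
        return 0.0
    return len(job & keywords(resume_text)) / len(job)


def rank_resumes(resumes, job_text):
    """Return (candidate, score) pairs sorted from highest to lowest match."""
    scored = [(name, match_score(text, job_text)) for name, text in resumes.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical job profile and resumes, for illustration only.
job_profile = "python sql machine learning"
resumes = {
    "candidate_a": "Java developer with SQL experience",
    "candidate_b": "Python and SQL, machine learning projects",
    "candidate_c": "Graphic designer",
}
ranking = rank_resumes(resumes, job_profile)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In a deployed system, only the recruiter-facing view would expose this ranking, in line with the visibility restriction described above.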
Rabih et al. (2021) presented a paper on curriculum vitae evaluation using a machine learning
approach. Its main role is to detect the eligibility of people who are applying to job vacancies or
higher education programs. The research work aims at elaborating a system that automates
the preselection of eligibility and assessment of candidates in the higher education students'
recruitment process. This system will replace the tedious tasks of manual processing of CVs and
will provide accurate and effective evaluation results. To achieve this requirement, the system
will be implemented using a machine learning approach with different classification algorithms.
The limitation of this work is that the analysis scope is applied only to
candidates who are applying to pursue a Master's degree.
Lokesh et al. (2022) presented a report that
discussed the detailed design and related algorithms for a resume screener, which decides whether a
particular candidate is suitable for the applied role or not. Candidates apply in large numbers for
jobs on web portals by uploading their resumes. As a result, filtering applicants for the
appropriate position in an organization becomes a difficult task for recruiters. Natural Language
Processing (NLP) techniques are used to extract the relevant information from the resume to save
time and effort. Also, a Machine Learning (ML) model is trained to check whether a candidate’s
skills, experiences, and other aspects are suitable for that particular role. In addition, the
system recommends other available job roles based on the candidate’s skillset. On
analyzing the performance of the system, the authors found that Logistic Regression performs best
for this problem statement. They also found that a larger dataset is required to make the system
work more efficiently, and that more attributes can be added to obtain better performance.
Overall, the system performs well with the current resources. As future work,
the authors intend to improve the accuracy of the system by collecting more resumes from
organizations and training the system for all the available roles. They could also
analyze the candidate’s information from social networking sites like Facebook, Twitter and
LinkedIn, to decide more accurately and authentically whether to offer the job or not.
Additionally, algorithms such as Naive Bayes, K-Nearest Neighbor, and C4.5 Analysis can be
Naga (2022): selecting applicants for the appropriate job within a company is a difficult task for
recruiters. The key information is extracted from the CV using NLTK and Natural Language
Processing (NLP) techniques to save time and effort. The paper examines a variety of machine
learning models, such as KNN, SVM, logistic regression and MLP, to detect, identify, and
categorize diverse resumes. The work reports better accuracy and implements a web
interface to screen the resumes and analyze the type of job related to each resume; MLP outperforms
other approaches such as KNN, SVM and logistic regression. Furthermore, the system attempts to
find the accuracy and performance of the proposed methodology and incorporate it in IT
firms and other regulations for the prevention of manual screening and establish a safe allocation
Amin, S., Jayakar, N., Sunny, S., Babu, P., Kiruthika, M., & Gurjar, A. (2019, January 1). Web Application for
Screening Resume. 2019 International Conference on Nascent Technologies in Engineering, ICNTE 2019 -
Proceedings. https://doi.org/10.1109/ICNTE44896.2019.8945869
Anggakusuma, J., Mawardi, V. C., & Lauro, M. D. (2020). Resume extraction with conditional
random field method. IOP Conference Series: Materials Science and Engineering, 1007(1).
https://doi.org/10.1088/1757-899X/1007/1/012154
Barrak, A., Adams, B., & Zouaq, A. (2022). Toward a traceable, explainable, and fair JD/Resume
recommendation system. http://arxiv.org/abs/2202.08960
Rabih, H.E. & Mercier, L., (2021). Curriculum Vitae Evaluation using Machine Learning
Approach. Artificial Intelligence for
Knowledge Management IFIP AICT 614, 2021.
Bhushan Kinge, Shrinivas Mandhare, Pranali Chavan, & S. M. Chaware. (2022). Resume Screening
using Machine Learning and NLP: A proposed system. International Journal of Scientific
Research in Computer Science, Engineering and Information Technology, 253–258.
https://doi.org/10.32628/cseit228240
Lokesh. S, Balaje. S, M., Prathish. E, & B. Bharathi. (2022). Resume Screening and Recommendation
System using Machine Learning Approaches. Computer Science & Engineering: An
International Journal, 12(1), 1–7. https://doi.org/10.5121/cseij.2022.12101
Naik, R. S., Dhotre, S. R., & Professor, A. (2022). RESUME RECOMMENDATION USING
MACHINE LEARNING (Vol. 10, Issue 7). www.ijcrt.org
Almada, R. V., Elias, O. M., Gómez, C. E., Mendoza, M. D., López, S. G., Natural Language
Processing and Text Mining to Identify Knowledge Profiles for Software Engineering
Positions, 5th International Conference in Software Engineering Research and
Innovation (CONISOFT), 2017.
Gao, L., Eldin, N., Employer’s expectations: A probabilistic text mining model, Creative
Construction Conference 2018, CC2014.
Karakatsanis, I., AlKhader, W., MacCrory, F., Alibasic, A., Omar, M. A., Aung,Z., Woon, W.
L., Data Mining Approach to Monitoring The Requirements of the Job Market: A Case
Study. Electrical Engineering and Computer Science, Masdar Institute of Science and
Technology, Abu Dhabi, United Arab Emirates, 2018.
Kinoa, Y., Kurokia, H., Machidab, T., Furuyab, N., Takanob, K., “Text Analysis for Job
Matching Quality Improvement,” International Conference on Knowledge Based and
Intelligent Information and Engineering Systems, 2017.
Melo-Acosta, German E., et al. “Fraud Detection in Big Data Using Supervised and Semi-
Supervised Learning Techniques.” 2017 IEEE Colombian Conference on
Communications and Computing (COLCOM), 2017,
doi:10.1109/colcomcon.2017.8088206.
Mohammed, Emad, and Behrouz Far. “Supervised Machine Learning Algorithms for Credit Card
Fraudulent Transaction Detection: A Comparative Study.” IEEE Annals of the History of
Computing, IEEE, 1 July 2018, doi.ieeecomputersociety.org/10.1109/IRI.2018.00025.
Juneja, A., Momin, A. (2016) Resume Ranking using NLP and Machine Learning.