
JRC: A Job Post and Resume Classification System For Online Recruitment

The document presents JRC, a Job Post and Resume Classification system designed to enhance online recruitment by efficiently matching candidate resumes with job offers using an integrated knowledge base. It addresses the inefficiencies of traditional hiring methods by automating the extraction of structured information from unstructured resumes and classifying them into relevant occupational categories. The proposed system aims to improve the precision and reduce the runtime complexity of the matching process compared to existing online recruitment systems.


2017 International Conference on Tools with Artificial Intelligence

JRC: A Job Post and Resume Classification System for Online Recruitment
Abeer Zaroor, Computer Science Department, The Arab American University, Jenin, Palestine, [email protected]
Mohammed Maree* (corresponding author), Information Technology Department, The Arab American University, Jenin, Palestine, [email protected]
Muath Sabha, Information Technology Department, The Arab American University, Jenin, Palestine, [email protected]

Abstract: Due to the increasing growth in online recruitment, traditional hiring methods are becoming inefficient. This is because job portals receive enormous numbers of unstructured resumes, in diverse styles and formats, from applicants with different fields of expertise and specialization. Therefore, the extraction of structured information from applicant resumes is needed not only to support the automatic screening of candidates, but also to efficiently route them to their corresponding occupational categories. This minimizes the effort required by employers to manage and organize resumes, as well as to screen out irrelevant candidates. In this paper, we present JRC, a Job Post and Resume Classification system that exploits an integrated knowledge base to carry out the classification task. Unlike conventional systems that search globally in the entire space of resumes and job posts, JRC matches a job post only against resumes that fall under its relevant occupational categories. To demonstrate the effectiveness of the proposed system, we have conducted several experiments using a real-world recruitment dataset. Additionally, we have evaluated the efficiency and effectiveness of the proposed system against state-of-the-art online recruitment systems.

Keywords: Conceptual Matching; Resume Ranking; Online Recruitment; Knowledge base Assisted Classification

I. INTRODUCTION

In recent years, online job portals have started to receive an enormous number of resumes in diverse styles and formats from job seekers who have different academic backgrounds, work experiences and skills [1, 2]. Finding and hiring the right talent from a wide and heterogeneous range of candidates remains one of the most important and challenging tasks of the HR department in any organization [3, 4]. To address this challenge, many companies have shifted to e-recruiting platforms [5, 6]. These platforms reduce the cost, time and effort required for manually processing and screening applicant resumes. As stated in [7], there were more than 40,000 e-recruitment sites in 2012 helping job seekers and recruiters worldwide. According to the International Association of Employment Web Sites (IAEWS) [8], the number of e-recruitment systems exceeded 60,000 in 2017. These systems employ different methods and approaches to address the challenges associated with screening, matching, and classifying candidate resumes. For instance, one of the employed methods addresses the automatic matching between candidate resumes and their corresponding job offers [9, 10, 11]. Other approaches have attempted to automate the extraction of structured, segmented information from both job posts and resumes, to be used later in the matching and classification processes [12, 13]. Although these approaches produce high precision ratios in finding candidates to fill a vacancy [9], they give less attention to the run time complexity of the matching process, i.e. every job offer is matched with every resume in the corpus instead of only with resumes that are related to its occupational category. Other researchers try to overcome this problem by utilizing machine learning techniques to first classify job posts and resumes under their relevant occupational categories [14]. Although these techniques have proven to be more efficient (i.e. have low run time complexity), they suffer from high error rates and low classification accuracy [15].

To overcome the abovementioned limitations, we present a hybrid approach to classify resumes and their corresponding job posts by utilizing an integrated occupational categories knowledge base. The exploited knowledge base assists in i) classifying resumes and job offers under their corresponding occupational categories and ii) automatically ranking the applicants that best match the announced offers. We summarize the contributions of our work as follows:

• Automatic Integrated Knowledge based Occupational Category Classification of Resumes and Job Postings.
• Employing a Section-based Segmentation heuristic by exploiting Natural Language Processing (NLP), Concept-relatedness techniques and regular expressions.

The remainder of this paper is organized as follows. In Section 2, we introduce the related work. Section 3 gives an overview of the proposed system’s architecture. In Section 4, we provide the details of the proposed matching steps. Experimental validation of the effectiveness and efficiency of the proposed system is presented in Section 5. In Section 6, we discuss the conclusions and outline future work.
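The efficiency argument made in this section, namely that matching a job post only against resumes filed under its occupational category avoids comparing it with every resume in the corpus, can be illustrated with a minimal Python sketch. The function names, category labels and toy corpus below are assumptions made for illustration; they are not taken from the JRC implementation.

```python
from collections import defaultdict


def index_by_category(resumes):
    """Group resume ids under every occupational category assigned to them."""
    buckets = defaultdict(list)
    for resume_id, categories in resumes.items():
        for category in categories:
            buckets[category].append(resume_id)
    return buckets


def candidates_global(resumes):
    """Global search: every resume is a candidate for every job post."""
    return list(resumes)


def candidates_by_category(buckets, job_category):
    """Category-restricted search: only resumes filed under the job's category."""
    return list(buckets.get(job_category, []))


# Hypothetical toy corpus: resume id -> categories produced by some classifier.
resumes = {
    "CV1": ["Web Development"],
    "CV2": ["Mobile Development", "Web Development"],
    "CV3": ["Databases"],
    "CV4": ["Multimedia Design"],
}
buckets = index_by_category(resumes)

print(len(candidates_global(resumes)))                      # 4
print(candidates_by_category(buckets, "Web Development"))   # ['CV1', 'CV2']
```

In this sketch the expensive pairwise matching step would run on two resumes rather than four; the saving grows with the size of the corpus and the number of categories.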

2375-0197/17/31.00 ©2017 IEEE
DOI 10.1109/ICTAI.2017.00123

II. RELATED WORK

Many approaches and techniques have been proposed for addressing the e-recruitment challenges. In this context, some approaches attempt to overcome issues associated with the matching process between candidate resumes and their corresponding job offers, while others attempt to classify resumes and job posts prior to starting the matching process [16, 17, 18, 20, 13]. For instance, the authors of [16] have proposed an approach for the automatic matching and querying of information in the human resources domain. The proposed approach exploits the DISCO, ISCO and ISCED taxonomies to achieve better matching results than traditional techniques that simply look for overlapping keywords between the content of job posts and the applicant’s resume, ignoring the hidden semantic dimensions in the text of both documents [2]. The authors of [17] have proposed an ontology-based hybrid approach that matches job seekers and job advertisements by utilizing a similarity-based approach to rank applicants. The proposed system exploits semantic technologies in order to improve the matching process. However, the main drawback of this approach is the huge cost (run time complexity) of the matching process. On the other hand, the JobDiSC system [18] attempts to classify job advertisements automatically by employing a standard classification scheme called the Dictionary of Occupational Titles (DOT). The proposed system automatically generates classification rules from a set of pre-classified job openings and assigns one or more classes to each job post. The main drawback of this system is that DOT does not cover the occupational information that is more relevant to the modern workplace [19]. Other systems utilize machine learning algorithms in order to annotate segments of resumes with the appropriate category, taking advantage of the resume’s contextual structure, where related information units usually occur in the same textual segments [13, 20]. However, the main drawback of these approaches is that a large fraction of the produced results suffer from low precision, since the information extraction process passes through two loosely-coupled stages, in addition to the time needed to pre-process and post-process job posts in order to minimize the error and maximize the classification accuracy.

III. OVERVIEW

In this section, we present an overview of the proposed system’s architecture and discuss its main modules. As shown in Figure 1, the proposed system comprises several modules that are organized as follows. First, a Section-based Segmentation module is used to extract a list of candidate matching concepts, in addition to information such as personal, education, experience and employment history details of the applicant. Next, the Filtration module refines the concept lists by removing insignificant terms that do not contribute to the matching process. The third module of the proposed system takes a set of skills extracted from both resumes and job posts as input in order to classify them under their corresponding occupational categories. At this step, we exploit an integrated occupational categories knowledge base which combines two main classification schemes: DICE¹ and O*NET². More details on these resources are provided in Section IV. Then, the Category-based Matching module takes the lists of skills from both resumes and job posts to construct semantic networks by deriving the semantic relatedness between their concepts in the same fashion as presented in [2]. Finally, the matching algorithm takes the semantic networks as input, as long as they are in the same space, and produces the measures of semantic closeness between them as output.

¹ http://www.dice.com/skills
² http://www.onetcenter.org/taxonomy/2010/list.html

Fig. 1. Overall Architecture of the Proposed System

IV. DETAILS OF THE PROPOSED MATCHING STEPS

In this section, we detail the steps of the proposed system as follows:

A. Section-Based Segmentation Module

In this module, important segments such as Education and Experience, and other employment information such as Company name, Applicant’s Role in the company, Date of designation, Date of resignation and Loyalty, are extracted automatically. Accordingly, unstructured resumes are converted into segments (semi-structured documents) by employing Natural Language Processing (NLP) techniques and rule-based regular expressions. As detailed in [9], the NLP steps are: document splitting, n-gram tokenization, stop word removal, Part-of-Speech Tagging (POST) and Named Entity Recognition (NER). To do this, we first divide the text of a given resume into segments in order to process each paragraph separately. Then, each segment of the resume is split into tokens, and we remove tokens that appear to be of little value in the classification and matching process. After that, we utilize the StanfordCoreNLP POSTagger to assign the appropriate part of speech category to each token. Finally, we employ NER to map tokens into categories such as names of persons, countries and locations. Example 1 illustrates the process of resume segmentation.
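As a rough illustration of the pre-processing and rule-based steps described in Section IV.A, the following Python sketch splits a resume into segments, tokenizes them, removes stop words, builds n-grams and applies two regular expressions. It is a minimal sketch under stated assumptions: the stop-word list, the two patterns, the field names and the rule that loyalty equals the difference between the two years are hypothetical, and the POS tagging and NER steps (which the paper performs with StanfordCoreNLP) are omitted. The sample text mirrors the resume used in Example 1.

```python
import re

# Tiny illustrative stop-word list (assumption); real systems use a fuller one.
STOP_WORDS = {"i", "have", "of", "as", "a", "an", "and", "the", "in", "from", "to"}

# Hypothetical extraction rules, not the expressions used by the authors.
YEARS_RE = re.compile(
    r"(\d+)\s+years?\s+of\s+experience\s+as\s+an?\s+([A-Za-z -]+?)\.", re.I
)
PERIOD_RE = re.compile(r"from\s+(\d{4})\s+to\s+(\d{4})", re.I)


def split_segments(text):
    """Document splitting: one segment per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def tokenize(segment):
    """Word tokenization followed by stop-word removal."""
    tokens = re.findall(r"[a-z][a-z0-9+#-]*", segment.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Contiguous n-grams, so multi-word terms such as 'web developer' survive."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_fields(text):
    """Apply the rules segment by segment and collect structured fields."""
    record = {}
    for segment in split_segments(text):
        years = YEARS_RE.search(segment)
        if years:
            record["years"] = int(years.group(1))
            record["field"] = years.group(2).strip()
        period = PERIOD_RE.search(segment)
        if period:
            start, end = int(period.group(1)), int(period.group(2))
            # Assumed rule: loyalty is the number of years spent at the company.
            record.update(from_date=start, to_date=end, loyalty=end - start)
    return record


sample = (
    "I have 3 years of experience as a web developer.\n"
    "I worked as a Front-End developer in SaFa Company from 2007 to 2011."
)
print(extract_fields(sample))
print(ngrams(tokenize(split_segments(sample)[0]), 2))
```

Keeping the rules as a small set of named patterns makes it easy to add further fields (for example degree or company name) without touching the tokenization code.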

Example 1: Resume Segmentation - Sample of a job seeker’s resume (CV1):

I have 3 years of experience as a web developer. And I have the following skills: PHP, HTML, CSS, JQuery, Ajax, android, ios
Education: Bachelor of Science (BSc) in Computer Science.
Employment Details
I worked as a Front-End developer in SaFa Company from 2007 to 2011.

In this example, we convert CV1 from an unstructured document into a section-based resume as follows:

<Applicantdata>
  <Experience>
    <Years>3</Years>
    <Field> web developer </Field>
  </Experience>
  <Education>
    <Degree> Bachelor of Science (BSc) </Degree>
    <Field> Computer Science </Field>
  </Education>
  <EmploymentHistory>
    <role> Front-End developer </role>
    <companyName> SaFa Company </companyName>
    <FromDate>2007</FromDate>
    <ToDate>2011</ToDate>
    <loyalty> 4</loyalty>
  </EmploymentHistory>
  <skills> PHP, HTML, CSS, JQuery, Ajax, android, ios
  </skills>
</Applicantdata>

Once unstructured resumes are converted into semi-structured documents, the list of candidate concepts is identified, extracted, and filtered using the filtration module. Table 1 shows the results of this step.

TABLE I. RESULT OF THE FILTRATION MODULE

Candidate terms extracted from resume | Filtered concepts list from resume
PHP | android
HTML | ios
CSS | HTML
JQuery | CSS
Ajax | Jquery
Android | web
SaFa | php
Web | Ajax
php | php
skills |
experience |
company |
ios |

As shown in Table 1, concepts that belong to the list of pre-defined terms (e.g. contact info, address, birth date, country name) or have low tf-idf weights [2] are removed from the lists of candidate concepts.

B. Conceptual Classification Module

In our previous work [2, 23], we have utilized an integrated knowledge base which combines the Dice skills center (henceforth DICE) and the Occupational Information Network (henceforth O*NET) to classify resumes and job posts. In this context, we use DICE to classify skills that belong to the Information and Communication Technologies (ICT) and Economy fields, because we noticed that O*NET is not scalable enough for our classification needs. Furthermore, some skill acronyms are not classified correctly in O*NET. However, in contrast to DICE, O*NET is better able to classify skills that are related to the Medical and Artistic fields. For instance, “JPA”, which refers to “Java Persistence”, is classified under the “Accountants” category by O*NET, but is classified correctly under “Software Development” in DICE. Conversely, terms such as “Radiography” and “Medical analysis” are not classified in DICE, but are classified correctly under the “Radiologic Technicians” and “Medical and clinical Laboratory” categories in O*NET.

1) Skill-Based Resume Classification Module

In this module, each skill in the skills set is submitted to the exploited knowledge base sequentially in order to obtain a list of candidate occupational categories. As a result, a list of weighted occupational categories is obtained and sorted by the highest weight (as one skill may return zero, one, or more than one occupational category). For instance, as shown in Figure 2, when the skill “android” is submitted to the skills knowledge base, the “Software Development/ Mobile Development” occupational category is obtained first. Then, using this procedure, a list of additional weighted categories is obtained and sorted according to their highest weight.

Fig. 2. List of obtained occupational categories for CV1

Table 2 shows each occupational category assigned to its corresponding skills.

TABLE II. SKILLS TO OCCUPATIONAL CATEGORIES MAPPING

Job category | Skills
Software Development/ Mobile Development | Android, ios
Software Development/ Web Development | CSS, html, php, Ajax, jquery

2) Job Post Classification Module

In the Job Post Classification module, we use both the job title and the required skills from the structured job post for classification purposes. First, the job post is pre-processed and filtered by removing noisy information such as city names and state and country acronyms that appear in the job title or job details. After that, we use the skills knowledge base to classify job posts in the same manner as we do for classifying

resumes. Accordingly, we assign weights (Job Title = 70% and Required Skills = 30%), since we believe that the job title is more significant than the required skills and leads to better matching results. More examples of the results of this module are presented in Section V.B.

C. Matching Resumes and their Corresponding Job Postings

Inspired by the work developed in [2], we employ multiple semantic resources to derive the semantic aspects of resumes and job posts. These are the WordNet ontology [21] and the YAGO2 ontology [22]. In addition, we utilize statistical concept-relatedness measures to further enrich the lists of concepts extracted from the job posts and resumes that were not recognized by the semantic resources used. Moreover, in order to increase the transparency and the effectiveness of the matching process, we have added an additional weighting parameter, the loyalty parameter, to the matching formula. By loyalty we mean the degree of devotion to the company that the applicant is working or has worked in. The formula for calculating the scoring percentage is as follows:

(1)

Where:
• $S$: is the relevance score assigned between a job post and a resume.
• $S_r$: is the correspondences set of the applicant’s skills.
• $RS_j$: are the required skills in the job post.
• $E_r$: is the set of concepts that describe the applicant’s educational information.
• $RE_j$: are the concepts that represent the required educational information in the job post.
• $X_r$: is the set of concepts that describe the applicant’s experience information.
• $RX_j$: are the concepts that represent the required experience information in the job post.
• $Y_w$: is the total number of employment years.
• $C_w$: is the number of companies that the applicant worked in.

As shown in the formula, we have set the following weighting values: Skills weight = 50%, Educational level weight = 20%, Job experience weight = 20% and Loyalty level weight = 10%.

In order to quantify the education parameters, as well as the experience parameters, we give a weight to each field. For instance, we give a value to each educational degree (Diploma, Bachelor, Master, PhD).

$$EdQ = \frac{y_d}{x_d} \qquad (2)$$

Where $y_d$ is the weight for the degree $d$ in the applicant resume and $x_d$ is the weight for the degree $d$ required in the job post. For example, if a job post requires a BSc degree and an applicant with a BSc degree applies for this job post, she/he will be considered a qualified applicant, as she/he satisfies the educational requirement for the job post ($\frac{y_{BSc}}{x_{BSc}}$ = perfect match). However, if the applicant has a Diploma degree, she/he will be considered underqualified ($\frac{y_{Di}}{x_{BSc}}$ = under qualified). If the applicant has a Master or PhD degree, she/he will be considered overqualified for that job post ($\frac{y_{MSc}}{x_{BSc}}$ or $\frac{y_{PhD}}{x_{BSc}}$ = over qualified). In the same fashion, we quantify the experience parameters using the following formula:

$$ExQ = \frac{y_r}{x_j} \qquad (3)$$

Where $y_r$ is the years of experience the applicant has and $x_j$ is the years of experience required in the job post.
- If $y_r = x_j$ the applicant will be a qualified match.
- If $y_r < x_j$ the applicant will be underqualified.
- If $y_r > x_j$ the applicant will be overqualified.

Accordingly, let JP be a job post with a set of requirements $(Ed_{JP}, S_{JP}, Ex_{JP})$ where:
• $Ed_{JP}$: is the required educational degree.
• $S_{JP}$: is the list of skills, $\sum_{i=1}^{n} S_{JP_i}$.
• $Ex_{JP}$: is the required experience. It is important to mention that in some JPs the employer specifies a number of years, while in other JPs they specify the number of years in a specific field. For example: +2 years of experience in java development.

And let JS be an applicant who applies for JP and has a set of qualifications $(Ed_{JS}, Ex_{JS}, S_{JS})$, where $Ed_{JS}$ is the educational degree JS has, $Ex_{JS}$ is the amount of experience JS has, and $S_{JS}$ is the list of skills, $\sum_{i=1}^{n} S_{JS_i}$. A qualified match denotes that a job seeker satisfies all the requirements for JP, i.e. the score = 100%.

$$score = 20\% \ast E_{score} + 20\% \ast Ex_{score} + 50\% \ast Skill_{score} + 10\% \ast loyalty$$

where:
• $E_{score} = Ed_{JP} \cap Ed_{JS}$
• $Ex_{score} = Ex_{JP} \cap Ex_{JS}$
• $Skill_{score} = S_{JP} \cap S_{JS}$

V. EXPERIMENTAL EVALUATION

This section describes the experiments that we have carried out to evaluate the techniques of the proposed system. In order to evaluate the efficiency and the effectiveness of the proposed system, we collected a data set of 2000 resumes downloaded from Amrood1³ and indeed⁴, and we used 10,000 different job posts obtained from monster⁵, shine⁶ and careerbuilder⁷. The collected resumes are unstructured documents in different document formats such as (.pdf) and (.doc), and we considered job posts as structured documents having the following segments (job title, job description, required skills, years of experience, required education qualifications and additional desired requirements). The experiments with our system’s prototype show that the classification process for the resumes and job posts took 6 hours on average on a PC with a dual-core CPU (2.1GHz) and 4GB of RAM.

³ http://www.amrood.com/resumelisting/listallresume.htm
⁴ http://www.indeed.com/resumes
⁵ http://jobs.monster.com
⁶ http://www.shine.com/job-search
⁷ http://www.careerbuilder.com

A. Execution Time for Matching Resumes with Corresponding Job Post

In this section, we compare the results produced by our system to those produced by: the MatchingSem system [2], which is a semantics-based automatic recruitment system, the tf-idf scheme without classification (henceforth tf-idf/NC) and the tf-idf scheme with classification (henceforth tf-idf/WC). Figure 3 shows the run-time complexity of the matching process for each of them.

Fig. 3. Cost (Run-time Complexity in Hours) of the Matching Process

As shown in Figure 3, our system (JRC) was able to achieve higher precision results compared to the other approaches. This is due to the fact that, unlike MatchingSem and tf-idf/NC, we only match job posts with their corresponding resumes that fall under the same occupational category instead of searching globally in the entire space of resumes. For instance, the “java j2ee Developer” job post costs 6 h and 55 min of execution time for finding the best candidate using MatchingSem and 6 h and 35 min using tf-idf/NC, while it took only 1 hour with tf-idf/WC and 40 min with JRC, since only resumes that fall under the “software Development/Web architecture” category were considered in the matching process, i.e. we only match 148 resumes instead of 2000 resumes. Furthermore, our system provides better results than tf-idf/WC, since JRC attempts to reduce the cost by segmenting the content of both the resume and the job post and finding matches between the important segments in both, instead of matching the content of whole resumes and job posts. For instance, the “video editor” job post cost 5 min of execution time using JRC and 11 min using tf-idf/WC. It may be argued that it is not fair to compare MatchingSem with JRC, since MatchingSem does not adopt classification of job posts and resumes. Therefore, we have reduced the space of resumes and job posts to the same number of results produced by the JRC classification. We then perform the comparison again on the minimized dataset. Figure 4 shows the run-time complexity of the matching process for JRC and MatchingSem on the minimized dataset.

Fig. 4. Cost (Run-time Complexity in Hours) of the Matching Process between JRC and MatchingSem

As we can see from Figure 4, the run time is nearly the same, especially for “video editor” and “radiologic Technologists”. However, JRC produces more precise results, as we demonstrate in the next sections.

B. Experiments of Job Post Classification

In this section, we discuss job post classification. As mentioned in Section IV.B.2, we have used the job title and the required skills in the classification process. In Table 3, we compare the results of the classification process with weighted zone scoring (henceforth WZS) and without weighted zone scoring (henceforth NWZS).

TABLE III. JOB POST CLASSIFICATION RESULTS WITH/WITHOUT WEIGHTED ZONE SCORING

Job title | Required skills | Job classification | Weight using WZS | Weight using NWZS
Video Editor | Adobe Illustrator, After Effects, Premiere Pro, photoshop, Adobe Audition | Design/ Multimedia Design | 100% | 100%
SQL Server Developer | Sql, sql server, Redshift, Qlik view, database, ETL, BI | Data/ Databases | 96.25% | 88.9%
 | MS office suite | Industry-specific / Microsoft Office | 3.75% | 11.1%
Android Developer | Android, xcode | Software Development/ Mobile development | 76.67% | 30%
 | HTML5, CSS, javascript, ajax, jQuery | Software Development/ Web Development | 16.67% | 50%
 | Sql server, SQL Express | Data / Databases | 6.66% | 20%
Network Technician | CAT5E, CAT6, CATV cable router, optical fiber, CCTV, BICSI | IT Administration/ Technical Support | 100% | 100%
Web Developer | Wordpress, HTML, CSS, javascript, Ajax, Jquery, Angular | Software Development/ Web Development | 93.3% | 80%
 | WCM, Adobe CQ | Communication/ Marketing | 6.7% | 20%
Multimedia Developer | Adobe Creative suite, photoshop, Illustrator, After Effect, InDesign | Design/ Multimedia Design | 82.5% | 46.1%
 | Ios, Android | Software Development/ Mobile development | 5.0% | 15.4%
 | HTML, CSS, javascript, wordpress, Drupal | Software Development/ Web Development | 12.5% | 38.5%

As shown in Table 3, the “Web Developer” job post falls under the “Software Development/ Web Development” occupational category with a weight of 93.3%. This is because when we submit the job title to our skills knowledge base it returns the “Software Development/ Web Development” category with weight 70%; then we submit the required skills and find that the “Wordpress, HTML, CSS, javascript, Ajax, Jquery, Angular” skills fall under the same space as the job title with weight 23.3% (resulting in a total of 93.3% for the Software Development/ Web Development space), while the “WCM, Adobe CQ” skills fall under the “Communication/ Marketing” space with weight 6.7%. However, when we submit the same job post to our skills knowledge base without giving weights to the job title and the required skills, the weight of the “Software Development/ Web Development” occupational category decreases to 80% and the “Communication/ Marketing” weight increases to 20%. This is because, without weighted zone scoring, the job title is given the same weight as each required skill. The same holds for the “Android Developer” job post, which falls under three categories: “Software Development/ Mobile development” with weight 76.67%, “Software Development/ Web Development” with weight 16.67%, and “Data / Databases” with weight 6.66%. Without weighted zone scoring the weights become 30%, 50% and 20% respectively. However, we notice that the results for some job posts did not change, like “Front End Web Developer” and “IT Technician”; this is because these job posts fall under one job category with weight 100%. Table 4 shows a comparison between the classification results using two weighting schemes: weighted zone scoring and the tf-idf scheme.

TABLE IV. JOB POST CLASSIFICATION RESULTS USING WEIGHTED ZONE SCORING AND TF-IDF SCHEME

Job title | Required skills | Job classification | Weight using WZS | Weight using tf-idf scheme
Video Editor | Adobe Illustrator, After Effects, Premiere Pro, photoshop, Adobe Audition | Design/ Multimedia Design | 100% | 40.8%
SQL Server Developer | Sql, sql server, Redshift, Qlik view, database, ETL, BI | Data/ Databases | 96.25% | 35.72%
 | MS office suite | Industry-specific / Microsoft Office | 3.75% | 2.6%
Android Developer | Android, xcode | Software Development/ Mobile development | 76.67% | 30.5%
 | HTML5, CSS, javascript, ajax, jQuery | Software Development/ Web Development | 16.67% | 5.06%
 | Sql server, SQL Express | Data / Databases | 6.66% | 8.4%
Network Technician | CAT5E, CAT6, CATV cable router, optical fiber, CCTV, BICSI | IT Administration/ Technical Support | 100% | 41.9%
Web Developer | Wordpress, HTML, CSS, javascript, Ajax, Jquery, Angular | Software Development/ Web Development | 93.3% | 26.6%
 | WCM, Adobe CQ | Communication/ Marketing | 6.7% | 3.85%
Multimedia Developer | Adobe Creative suite, photoshop, Illustrator, After Effect, InDesign | Design/ Multimedia Design | 82.5% | 25.7%
 | Ios, Android | Software Development/ Mobile development | 5.0% | 1.5%
 | HTML, CSS, javascript, wordpress, Drupal | Software Development/ Web Development | 12.5% | 2.65%

As shown in Table 4, the “Video Editor” job post falls under the “Design/ Multimedia Design” occupational category with a weight of 100%. This is because when we submit the job title to our skills knowledge base it returns the “Design/ Multimedia Design” category with weight 70%, and when we submit the required skills we find that all of them fall under the same space with weight 30%. However, when we use tf-idf weighting the weight decreases to 40.8%, because the tf-idf weighting scheme treats a job post as a bag of words, ignoring the correlation between the different zones and the different words.

C. Precision Results of Matching Resumes with Corresponding Job Post

In this section we evaluate our system’s effectiveness using the precision indicator. For each job post, we compare the manually assigned scores with the corresponding scores that are automatically produced by the system. Table 5 shows

the precision results of matching resumes with their corresponding job posts.

TABLE V. PRECISION RESULTS OF MATCHING RESUMES WITH THEIR CORRESPONDING JOB POSTS

Occupational category | Job title | Resume index | Manual score | Auto score | Precision
Software Development / Interactive Multimedia | Multimedia Designer | CV4 | 0.42 | 0.51 | 0.82
 | | CV2 | 0.87 | 0.90 | 0.96
 | | CV1 | 0.09 | 0.10 | 0.90
Design Software / Graphics | Graphic Designer | CV1 | 0.67 | 0.70 | 0.95
 | | CV2 | 0.12 | 0.20 | 0.60
 | | CV3 | 0.81 | 0.81 | 1.00
Recruiting / Human resources | Associate HR Consultant | CV5 | 0.45 | 0.53 | 0.84
 | | CV6 | 0.33 | 0.44 | 0.75
 | | CV7 | 0.77 | 0.83 | 0.92

As shown in Table 5, we match job posts to their corresponding resumes that fall under the same occupational categories. For instance, the “Graphic Designer” job post is matched only with resumes that fall under the “Design Software / Graphics” category. CV1 and CV2 are matched with both the “Graphic Designer” and “Multimedia Designer” job posts, because these CVs exist in both the “Design Software / Graphics” and “Software Development / Interactive Multimedia” categories. However, the matching score differs from one job post to another. For instance, CV2 achieved a very low matching score when matched with the “Graphic Designer” job post (0.12 manual score, 0.20 automatic score), but CV1 achieved a better score for the same job post (0.67 manual score, 0.70 automatic score). On the other hand, CV2 achieved better results than CV1 when it was matched with the “Multimedia Designer” job post (0.87 manual score, 0.90 automatic score), because CV2 falls under “Software Development / Interactive Multimedia” with weight 86.6% and under “Design Software / Graphics” with weight 13.4%.

TABLE VI. COMPARATIVE EVALUATION - RELEVANCE JUDGMENTS

Job title | Resume index | Manual score | Automatic score | Difference (Manual-Automatic) | Judgement
Video Editor | CV1 | 0.28 | 0.28 | 0.00 | Perfect match
Video Editor | CV2 | 0.60 | 0.50 | 0.10 | Under qualified
Video Editor | CV3 | 0.44 | 0.44 | 0.00 | Perfect match
Video Editor | CV4 | 0.70 | 0.75 | -0.05 | Over qualified
 | CV5 | 0.20 | 0.12 | 0.08 | Under qualified

[…] requirements: 1 year of professional editorial experience in a video marketing environment, knowledge of Adobe Premiere, video compression, post-production, the full Adobe CC suite, and experience with motion graphics and Adobe After Effects. The second job post, “Database developer”, requires a Bachelor's degree in CS, knowledge of MariaDB, MySQL, Oracle DB, ASM, Oracle RAC and Oracle 11g, and 3 years of experience with SQL development. For instance, if we take CV1, CV3, CV6 and CV7, we can see that the difference between the manual score and the automatic score equals “0”, which means a perfect match between the score assigned by the expert and the score generated by our system. On the other hand, the differences between the manual scores and the automatic scores for CV2 and CV5 are 0.10 and 0.08 respectively. The reason is that for CV2 our system was unable to extract the Loyalty from the applicant resume, and for CV5 our system was unable to recognize the “ASM” skill in the applicant resume. However, after we manually enriched our knowledge base with the missing skills and repeated the experiments, the difference became “0”. Finally, for CV4 and CV8 the differences between the manual scores and the automatic scores are -0.05 and -0.07 respectively. CV4 identifies an applicant with 2 years of experience in video montaging and editing, which exceeds the required experience in the “video editor” job post. Furthermore, CV8 identifies an applicant with a Master degree in computer science.

TABLE VII. COMPARATIVE EVALUATION - JRC VS. OTHER APPROACHES

Job title | Resume index | Manual score | Tf-idf auto score | MatchingSem auto score | JRC auto score
Back-end web developer | CV1 | 0.38 | 0.16 | 0.30 | 0.45
 | CV2 | 0.26 | 0.19 | 0.19 | 0.19
 | CV3 | 1.0 | 0.56 | 0.70 | 1.0
Java developer | CV4 | 0.61 | 0.35 | 0.50 | 0.65
 | CV5 | 0.46 | 0.35 | 0.40 | 0.46
 | CV6 | 0.53 | 0.21 | 0.35 | 0.54
Animator Designer | CV7 | 0.20 | 0.20 | 0.35 | 0.35
 | CV8 | 0.70 | 0.61 | 0.70 | 0.75
 | CV9 | 0.20 | 0.20 | 0.25 | 0.25

As shown in Table 7, we have three job posts and for each job post we have three resumes. The first job post, namely “Back-end web developer”, has the following requirements: 2+ years of experience building JPA data access layers with Spring and Hibernate, a BSc degree in CS or a relevant field, and knowledge of Lucene, Solr, NoSQL, Riak, Cassandra SQL and Oracle. The second job post requires BS Degree in CS, SE or related field
CV6 0.63 0.63 0.00
Perfect combined with 3-5 years of experience developing web
match
Database applications and experience with Java in an IBM WebSphere
Developer Perfect
CV7 1.00 1.00 0.00
match (or similar environment). The third job post is looking for the
Over
candidate that has strong understanding of animation, timing
CV8 0.83 0.90 -0.07
qualified and editing as it relates to motion graphics and can use a
variety of software platforms like Photoshop, After Effects,
As shown in Table 6, we have two job posts and four resumes and Cinema 4D. As we can see, the automatically calculated
for each. The first job post is “Video Editor” has the following scores by our system (JRC) are very close to the manually

786
assigned scores by our expert. For example, if we take the [7] S Al-Otaibi and M Ykhlef, "Job Recommendation Systems for
Enhancing E-recruitment Process", in Proceedings of the International
second job post “java developer” and the first applicant “CV4” Conference on Information and Knowledge Engineering (IKE), Las
who has 2+ years of experience in java programming and has Vegas Nevada, USA, pp. 433-439, 2012.
BSc in computer science, we can see that the difference [8] The International Association of Employment Web Sites (IAEWS),
between the manual score and the automatic score (.04) is less avaliable from: http://www.icmaonline.org/international-association-of-
than the difference between the manual score and the employment-web-sites, Date Visited: June 20, 2017.
automatic generated by MatchingSem (0.1) and tf-idf scheme [9] A Kmail, M Maree, and M Belkhatir, "MatchingSem: Online
recruitment system based on multiple semantic resources," Proceedings
(0.26). This is because the tf-idf scheme ignores the semantic of the 12th International Conference on Fuzzy Systems and Knowledge
aspects of the concepts encoded in both resumes and job posts. Discovery (FSKD), IEEE, pp. 2654-2659, 2015.
On the other hand, – unlike MatchingSem system – we are [10] W Hong, S Zheng, H Wang, J Shi, “A Job Recommender System Based
integrating a section-based segmentation module to extract on User Clustering,” Journal of Computers, vol. 8(8), pp. 1960-1967,
features such as educational background, years of experience 2013.
and employment information from applicants’ resumes. When [11] V.S Kumaran, and A Sankar, “Towards an automated system for
intelligent screening of candidates for recruitment using ontology
we incorporate these features, the matching scores produced mapping EXPERT,” Int. J. Metadata Semantics. Ontologies, vol. 8(1),
by our system are better than when using only a list of pp. 56-64, 2013.
candidate concepts as proposed in MatchingSem. [12] R Kessler, N Béchet, JM Torres-Moreno, M Roche and M. El-Bèze,
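The score comparisons behind Tables 6 and 7 can be illustrated with a short sketch. The Python snippet below is not the authors' implementation; it only recomputes the Manual-Automatic differences and judgement labels for a few Table 6 rows, and the gap between the expert score and each automatic score for CV4 in Table 7. The function names `judge` and `deviation`, the rounding to two decimals, and the rule that a zero difference means a perfect match, a positive one under qualified and a negative one over qualified are assumptions read off the tables.

```python
from typing import Tuple

# Illustrative sketch only; names and the rounding rule are assumptions.


def judge(manual: float, automatic: float) -> Tuple[float, str]:
    """Return (manual - automatic, judgement label) as laid out in Table 6."""
    diff = round(manual - automatic, 2)
    if diff == 0:
        label = "Perfect match"
    elif diff > 0:
        label = "Under qualified"
    else:
        label = "Over qualified"
    return diff, label


def deviation(manual: float, automatic: float) -> float:
    """Absolute gap between the expert score and an automatic score."""
    return round(abs(manual - automatic), 2)


# A few (manual, automatic) pairs copied from Table 6.
table6 = {"CV1": (0.28, 0.28), "CV2": (0.60, 0.50), "CV4": (0.70, 0.75)}
for cv, (manual, automatic) in table6.items():
    print(cv, *judge(manual, automatic))

# CV4 for the "Java developer" job post in Table 7.
cv4_manual = 0.61
cv4_auto = {"tf-idf": 0.35, "MatchingSem": 0.50, "JRC": 0.65}
for approach, score in cv4_auto.items():
    print(approach, deviation(cv4_manual, score))
```

Under these assumptions the sketch gives differences of 0.0, 0.1 and -0.05 for CV1, CV2 and CV4, and gaps of 0.26, 0.11 and 0.04 for tf-idf, MatchingSem and JRC; the text reports the MatchingSem gap rounded to 0.1.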
VI. CONCLUSIONS AND FUTURE WORK
In this paper, we have proposed a job post and resume classification system (JRC) that couples an integrated skills knowledge base with an automatic matching procedure between candidate resumes and their corresponding job postings. The proposed system first uses a section-based segmentation module to segment the resumes and extract a set of skills that are used in the classification process. Next, the system exploits the integrated skills knowledge base to carry out the classification task. As indicated in Section V, the experiments conducted with this knowledge base demonstrate that the proposed classification module helps achieve higher precision in less execution time than conventional approaches. In future work, we plan to use the information extracted from applicants’ resumes to dynamically generate user profiles that can then be used to recommend jobs to job seekers.
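As a rough illustration of the pipeline summarized above, the sketch below segments a resume by section headings, collects the skills found in it, and weights occupational categories by their share of matched skills. It is not the JRC implementation: the heading list, the toy `SKILLS_KB` dictionary and its category assignments, the helper names, and the share-based weighting are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical section headings and a toy skills knowledge base (assumptions).
SECTION_HEADINGS = ("education", "experience", "skills")
SKILLS_KB = {
    "photoshop": "Design Software / Graphics",
    "after effects": "Software Development / Interactive Multimedia",
    "cinema 4d": "Software Development / Interactive Multimedia",
}


def segment(resume: str) -> dict:
    """Split a resume into sections keyed by a recognized heading line."""
    sections, current = {}, None
    for line in resume.splitlines():
        key = line.strip().rstrip(":").lower()
        if key in SECTION_HEADINGS:
            current = key
            sections[current] = []
        elif current is not None:
            sections[current].append(line.strip())
    return {name: " ".join(lines) for name, lines in sections.items()}


def extract_skills(text: str) -> list:
    """Return knowledge-base skills mentioned as whole phrases in the text."""
    lowered = text.lower()
    return [s for s in SKILLS_KB if re.search(rf"\b{re.escape(s)}\b", lowered)]


def classify(skills: list) -> dict:
    """Weight each category by its share of the matched skills."""
    counts = Counter(SKILLS_KB[s] for s in skills)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


resume = """Education:
BSc in Multimedia
Skills:
Photoshop, After Effects, Cinema 4D
"""
sections = segment(resume)
weights = classify(extract_skills(sections.get("skills", "")))
print(weights)
```

A resume can receive non-zero weights in several categories at once, which mirrors the way CV1 and CV2 appear under two occupational categories in Table 5. How JRC actually derives its category weights is not specified in this section.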
3rd International Workshop on Service Matchmaking and Resource
Retrieval in the Semantic Web at the 8th International Semantic Web
REFERENCES Conference, Washington D. C., USA, 2010.
[1] E Faliagka, L Iliadis, I Karydis, M Rigou, S Sioutas, A Tsakalidis, and G [18] S Clyde, J Zhang, and CC Yao, "An object-oriented implementation of
Tzimas, “ On-line consistent ranking on e-recruitment: seeking the truth an adaptive classification of job openings," Proceedings of the 11th
behind a well-formed CV,” The Artificial Intelligence Review, 42(3), Conference on Artificial Intelligence for Applications, IEEE, pp. 9-16,
515, 2014. 1995.
[2] A Kmail, M Maree, M Belkhatir, and S Alhashmi "An Automatic Online [19] About Occupational Information Network (O*NET). Available from:
Recruitment System based on Exploiting Multiple Semantic Resources https://onet.rti.org/about.cfm. Date Visited: February 5, 2016.
and Concept-relatedness Measures," Proceedings of the EEE 27th [20] R Kessler, J Torres-Moreno, and M El-Bèze, “E-Gen: automatic job
International Conference on Tools with Artificial Intelligence (ICTAI), offer processing system for human resources,” in Proceedings of the
pp. 620-627, 2015. artificial intelligence 6th Mexican international conference on Advances
[3] J Chen, Z Niu, H Fu, "A Novel Knowledge Extraction Framework for in artificial intelligence, Springer-Verlag: Aguascalientes, Mexico, pp.
Resumes Based on Text Classifier," Proceedings of the International 985-995, 2007.
Conference on Web-Age Information Management. Springer [21] G.A Miller, “WordNet: a lexical database for English,” Comm. ACM,
International Publishing, pp. 540-543, 2015. vol. 38(11), pp. 39-41, 1995.
[4] C Hauff, G Gousios, "Matching GitHub developer profiles to job [22] J Hoffart, FM Suchanek, K Berberich, E. Lewis-Kelham, G. De Melo,
advertisements." Proceedings of the 12th Working Conf. on Mining and G. Weikum, “YAGO2: exploring and querying world knowledge in
Software Repositories, pp. 362-366, 2015. time, space, context, and many languages”, in Proceedings of the 20th
[5] T Schmitt, P Caillou, M Sebag, "Matching Jobs and Resumes: a Deep international conference companion on World Wide Web, ACM:
Collaborative Filtering Task," Proc. of the 2nd Global Conf. on Hyderabad, India, pp. 229-232, 2011.
Artificial Intelligence, pp.1-14, 2016. [23] A. Zaroor, M. Maree, and M. Sabha, “A Hybrid Approach to Conceptual
[6] S Mehta, R Pimplikar, A Singh, LR Varshney and K. Visweswariah, Classification and Ranking of Resumes and Their Corresponding Job
"Efficient multifaceted screening of job applicants," Proceedings of the Posts,” In: Czarnowski I., Howlett R., Jain L. (eds) Intelligent Decision
16th International Conference on Extending Database Technology. Technologies 2017. IDT 2017. Smart Innovation, Systems and
ACM, pp. 661–671, 2013. Technologies, vol 72. Springer, Cham.

787

You might also like