
NLP and ML-based Model Architecture For Recruiting IT Professionals

This document discusses research on using natural language processing and machine learning techniques for automating the IT personnel recruitment process. It reviews 20 previous studies that used machine learning, natural language processing, or semantic correspondence methods. The objective of the current research is to design an architecture based on natural language processing and machine learning to address the problem of recruiting IT professionals.

Computer Science and Information Systems 00(0):0000–0000 https://doi.org/10.2298/CSIS123456789X

NLP and ML-based model architecture for recruiting IT professionals

Juan Rolando Eneque Pisfil1, Hugo Calderón-Vilca1

1 Universidad Nacional Mayor de San Marcos, Lima, Perú
[email protected], [email protected]

Abstract. Personnel recruitment is a key process for companies, since it is how they try to find the best-qualified person to perform a job or occupy a position. The process involves defining the job offer, which sets out the qualities (age, knowledge, experience, among others) that a candidate must have for the position. Several platforms act as intermediaries to connect people and companies, but they do not assess whether an applicant meets the requirements of a job offer. The objective of this research is to propose a model that helps automate the recruitment process for IT professionals.

Keywords: Recruitment, NLP, ML

1. Introduction

Personnel selection is the process of obtaining the quantity and quality of employees needed for the business and involves a large number of activities (planning, recruitment, selection and incorporation of new employees).

According to [16], one disadvantage of the recruitment process is the operating cost of applying appropriate selection techniques. Choosing the candidate who meets the requirements of the position offered is a complicated task because the Human Resources area must invest substantial resources in activities such as reviewing profiles, filtering and personal interviews.

Human resources management and its problems are being addressed by Artificial Intelligence (AI) and its branches. For example, the literature review in [6] shows that AI research offers a diverse set of suggestions on how specific AI techniques could be applied to specific Human Resources tasks.

One example is the proposal of [4], which addresses the problem of candidate classification with the help of Machine Learning. The authors evaluated supervised learning algorithms (linear regression, M5 model tree, REP decision tree and support vector machine) in combination with a semantic skill matching mechanism to achieve automated electronic recruitment.


Another ML-based proposal is [3], which presents a microservices-based framework to recommend the best job offers for a candidate.

On the other hand, [19] proposes a system with a hybrid approach (NLP and regular expressions) that seeks to solve the problem of resume categorization and resume-job offer matching.

Finally, [11] presents a bidirectional recommender system for candidates in job search. The authors' proposal implements a microservices-based, scalable and stateless architecture to drive automation through recommendation using Machine Learning and statistical methods.

An electronic recruitment (e-recruitment) strategy that also implements "intelligent" mechanisms or AI techniques offers great advantages when evaluating hundreds of profiles, since it produces faster (depending on the technique and processing resources) and more accurate results.

Based on the review of recruitment research, the objective of the present research is to design an architecture based on Natural Language Processing and Machine Learning to address the problem of recruiting IT professionals.

The rest of the paper is organized as follows. Section 2 reviews research on the recruitment problem. Section 3 details the architecture design and, finally, Section 4 presents the results and their discussion.

2. State of the art

We analyzed a total of 20 studies and divided them into 3 categories according to the techniques applied: Machine Learning, Natural Language Processing and Semantic Correspondence.

2.1. IT personnel recruitment using Machine Learning techniques

[4] proposes a system for candidate selection through the analysis of the candidate's LinkedIn and blogger profiles. For this purpose, the authors evaluated supervised learning algorithms (linear regression, M5 model tree, REP decision tree and support vector machine) and combined them with a semantic skill matching mechanism.

Building on the strengths of semantic knowledge (concept similarity) and of Machine Learning methods, [3] proposes a scalable and stateless architecture for an automated Human Capital Management system that seeks to recommend jobs to a candidate and, vice versa, candidates to a company.

[17] proposes a recommendation system that uses a Gradient Boosting Decision Tree (GBDT) and a hybrid convolutional neural network model to compute a correlation between a job seeker and a job offer, with the goal of improving the quality of human resource recommendation.

[20] proposes a convolutional neural network model to solve the person-job matching problem. The network learns the joint representation of person-job fit from historical job applications.

[11] proposes an architecture for automation through recommendation using machine learning and statistical methods. The authors' proposal extends the research of [3] and aims to achieve better system robustness and recommendation quality by implementing features such as candidate career interests, scoring functions for academic information and professional experience, string matching, etc.

[15] presents an automated Machine Learning-based model for CV recommendation, in which a CV goes through preprocessing for cleaning and feature extraction using the TF-IDF approach and is then assigned to a category by the classification model.

In the recruitment process, recruiters do not focus exclusively on a person's technical skills to determine their suitability for an offered position, but also take into account characteristics such as education, personality, experience, etc.


Table 1. Characteristics to consider for personnel selection

Author(s)                            Considered characteristics
[Error: Reference source not found]  Personality
[Error: Reference source not found]  Location, experience and education
[Error: Reference source not found]  Employment history
[Error: Reference source not found]  Work experience, company (Glassdoor parameters), location and education
As shown in Table 1, the authors of the proposals analyzed take the above into account and consider characteristics other than a candidate's technical skills to assess his or her suitability for an offered position.

Machine Learning algorithms were evaluated by [4] and [15]. [4] evaluated linear regression, M5 model tree, REP decision tree and Support Vector Regression (SVR) with two nonlinear kernels (polynomial kernel and universal PUK kernel) for the evaluation of total experience and relative experience, concluding that the tree models and the SVR model with PUK kernel produced better correlation results for their proposal. On the other hand, [15] evaluated Random Forest (RF), Multinomial Naive Bayes (NB), Logistic Regression (LR) and Linear Support Vector Machine (SVM) for CV classification, and found that the latter had the best accuracy for the classification task.

[17] and [20] go deeper into Machine Learning and propose solutions based on Convolutional Neural Networks. The first proposes a recommendation system with a GBDT model and a hybrid convolutional neural network model for regularization and recommendation. The second, on the other hand, relies exclusively on convolutional neural networks and applies cosine similarity to calculate the similarity between job offers and a candidate's CV.

2.2. IT personnel recruitment using semantic correspondence techniques

[8] proposes a job recommendation system based on the user profile, which also seeks to predict career advancement from the user's work history.

[1] proposes a content-based recommendation algorithm that extends and updates the Minkowski distance, with the objective of matching people and jobs. The authors' proposal quantifies the suitability of a seeker/candidate by analyzing a structured form of the job and of the candidate's profile, created from content analysis of their unstructured forms.

[7] proposes a resume matching system called ResuMatcher, which determines the suitability of a job by calculating the similarity between the models generated from the resume and the job description.

[13] proposes a career path recommendation system that relies on text mining and collaborative filtering techniques and also recommends skills based on related job offers generated from the user's profile skills.

[12] proposes a candidate recommendation system called Smart Applicant Ranker, which uses ontologies to compare CV models (consisting of education, work experience and skills) and job requirement models to find the best candidates based on the similarity of the generated ontological models.

[2] proposes a bidirectional semantic correspondence system to measure the degree of semantic similarity between the skills and qualifications of a job seeker and an offered vacancy. In addition, the authors apply machine learning techniques for bidirectional matching of job vacancies and occupational standards to improve the content of job vacancies and job seeker profiles based on social network analysis and occupational standards.

[18] proposes the use of weighted tree algorithms to calculate the similarity between job advertisements and the keywords or criteria used by job seekers.

[14] proposes an ontology-based job recommendation system (most relevant jobs) that is built from the basic information collected and the user's lists of favorite and viewed jobs.

The proposals of [8], [1], [12], [2], [18] and [14] require the information to be analyzed to have a certain structure. On the other hand, the proposals of [7] and [13] apply unstructured analysis, taking into account that the information contained in a CV does not follow a single style or format.
Table 2. Format of the information to be processed

Information to process
Author(s)  Structured  Not Structured
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[Error: Reference source not found]  X
[8], [7] and [2] approach the selection problem from the perspective of similarity between a candidate's CV/profile and the vacancy/position offered. In contrast, [18] addresses the problem through the similarity between the content of a job offer and the search keywords used by a user.

Although the proposals of [8], [7] and [2] share the same similarity approach, each has its own peculiarity. [8] applies recommendation based on the content of the candidate's work history. [7] relies on the qualifications, skills and work experience described in the candidate's CV and those required in the job offer, and generates recommendations based on the similarity between them. Finally, [2] takes into account the similarity of qualifications and skills as well as the candidate's connections, since their testimony enhances the process of evaluating whether or not a candidate is suitable for a vacancy.
Table 3. Data source

Author(s)                            Data Source                      Quantity
[Error: Reference source not found]  LinkedIn                         2400
[Error: Reference source not found]  Kaggle                           100
[Error: Reference source not found]  Indeed                           1000
[Error: Reference source not found]  Universidad Estatal de San José  1000
[Error: Reference source not found]  -                                -
[Error: Reference source not found]  Not specified                    175
[Error: Reference source not found]  Not specified                    100
[Error: Reference source not found]  -                                -

2.3. IT personnel recruitment using Natural Language Processing techniques

[10] proposes an online recruitment system that exploits multiple semantic resources and uses statistical measures of concept relatedness. Moreover, it relies on NLP to identify and extract candidate concept lists from job postings and candidate CVs.

[9] proposes a solution focused on job matching for older workers. In this solution, keywords are extracted from the description entered in the system's search engine after tokenizing sentences and filtering words based on morphological analysis. Then, the search for related job offer documents is performed using the top 10 keywords.

To solve the resume-job offer matching problem of job portals, [19] proposes a hybrid approach and incorporates resume categorization to reduce the dataset to be analyzed; that is, instead of evaluating all resumes, the analysis is applied only to resumes that fall within the category described in the job offer.

To address the problem of CV retrieval based on the description of a job offer, [5] proposes the use of the average word embedding (AWE) model and the Principal Component Analysis (PCA) algorithm to solve the dimensionality problem that AWE can present.
Table 4. Weighting techniques applied in proposals using NLP

Author(s)                            Weighting technique/approach
[Error: Reference source not found]  TF-IDF
[Error: Reference source not found]  BM25
[Error: Reference source not found]  TF-IDF
[Error: Reference source not found]  AWE
In the proposals of [10], [9], [19] and [5], different techniques are applied to information retrieval, as shown in Table 4. [10] applied the TF-IDF weighting scheme to eliminate concepts that do not present significant value. [9] made use of the Solr/Lucene scores of the BM25 algorithm, which performs scoring based on term frequency and document length normalization. [19] relied on the TF-IDF technique and subsequently filters/refines the concept list by removing concepts with low weights assigned by this technique. On the other hand, [5] indicates that classical information retrieval models such as Bag of Words (BOW) and BM25 have certain weaknesses and require complementary techniques such as latent semantic indexing (LSI). Therefore, the authors rely on average word embedding (AWE) models.
3. Architecture design

We propose an architecture based on NLP and ML whose objective is to generate a list of candidates that fully or largely meet the requirements of a job offer or, vice versa, a list of job offers that best match the IT skills of a person's profile. To do this, the information contained in a person's CV or in a job offer is entered into a structured form. This information goes through a cleaning stage that eliminates words that do not provide relevant meaning or that may obstruct the capture of IT skills.

The IT skills obtained during pre-processing go through a model that detects those with the greatest semantic similarity. With these we obtain a list of professional categories by consulting a dictionary that maps IT skills to related professions; this list becomes one more characteristic of the processed profile or job offer.

Finally, in the matching module, when a job offer is being evaluated, we obtain the CVs that have a category in common with the job offer, in order to reduce the volume of data to be evaluated, and through a clustering algorithm we generate a ranking of the best candidates for the job offer. When a profile is being evaluated, exactly the same thing happens, except that the profile is grouped with a set of job offers.

Figure 1. Model Architecture

Figure 1 shows the architecture of our model and its components:
• Data form
• Pre-processing module
• Categorization module
• Matching module

3.1. Data form

The data form is the core of the system and the component that receives the information the model needs to work. Through it, the actors (applicant and candidate) trigger the model, since they provide the data that pass through each of its components and ultimately generate a ranking of candidates for the job offer entered or a ranking of job offers for the CV entered.


3.2. Pre-processing module

In this component, the text entered in the skills section goes through a cleaning process that detects and eliminates punctuation marks or symbols that do not provide context-relevant meaning or that cause an IT skill not to be detected.

Figure 2. Skills corpus cleaning

Figure 2 presents the proposed flowchart for data cleaning. Since our skills detection process relies on an IT dictionary, it is necessary to ensure that an IT skill (contained in the skills section of each form) does not contain characters that would cause it to be missed during the process. Therefore, the first step is to convert the text of the skills section into a list of characters. After that, we parse each element of the generated list and remove the signs and symbols. Finally, we rejoin this list of characters and obtain a clean corpus to process.
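The three cleaning steps can be sketched in Python as follows. This is only an illustrative sketch: the set of characters treated as noise, the lowercasing, and the choice to replace a removed symbol with a space (so that neighbouring skills do not merge) are assumptions, since the text does not specify them. The function name `clean_skills_text` is hypothetical.

```python
# Hypothetical set of characters treated as noise; the paper does not list them.
# '#', '+' and '.' are deliberately kept so that skills such as 'c#' or 'vue.js' survive.
NOISE_CHARS = set(",;:()[]{}\"'!?|/\\*")


def clean_skills_text(text: str) -> str:
    """Split the text into characters, drop noise characters, and rejoin."""
    # Step 1: convert the (lowercased) text into a list of characters.
    chars = list(text.lower())
    # Step 2: inspect each character; a noise character becomes a space
    # (assumption: this keeps adjacent skills from being glued together).
    kept = [" " if c in NOISE_CHARS else c for c in chars]
    # Step 3: rejoin the characters and collapse repeated whitespace.
    return " ".join("".join(kept).split())


print(clean_skills_text("Java, Spring (Boot); C#/Vue.js!"))
```

Which symbols are safe to drop depends on the skills stored in the IT dictionary, so in practice the noise set would have to be checked against the dictionary entries.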
An important element in this module is Word2vec, a neural network composed of an input layer, a hidden layer and an output layer that allows us to calculate the semantic relationship between words in a given context. We take advantage of this tool and train it with IT skills.

This model helps us fulfill the objective of this module, which is to obtain a subset of skills with a strong semantic relationship and thus reduce the number of queries to be made later in the categorization module. The premise is that a set of strongly related skills will result in an equally related set of IT occupations.

3.3. Categorization module

With this module we obtain the IT occupations related to each of the skills detected in the previous module. These occupations help us categorize the document (job offer or CV) that is being processed and also serve to reduce the volume of data to be worked with in the next module.
Table 5. IT dictionary excerpt

IT Skill   IT Professions
Expressjs  backend, js developer
Extjs      frontend, js developer
Firebase   backend, mobile developer
Flask      python developer, backend, web developer

Table 5 shows a small excerpt of how the IT dictionary is composed.
An IT skill is not exclusive to one profession, which is why, when consulting our IT skill dictionary, one or more IT skills may have one or more IT professions/occupations in common.

Taking this into account, during each query to our dictionary we count how often each profession appears (its frequency). Then, at the end of the query process, we calculate the average frequency and categorize the document under evaluation (job offer or CV) with those professions whose frequency is greater than or equal to the average.
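The frequency-and-average rule can be illustrated with a short Python sketch. The dictionary below contains only the four entries from the Table 5 excerpt, and the function name `categorize` is hypothetical. The sketch assumes that "frequency" means the number of detected skills that map to a given profession.

```python
from collections import Counter

# Excerpt of the IT dictionary taken from Table 5 (skill -> related professions).
IT_DICTIONARY = {
    "expressjs": ["backend", "js developer"],
    "extjs": ["frontend", "js developer"],
    "firebase": ["backend", "mobile developer"],
    "flask": ["python developer", "backend", "web developer"],
}


def categorize(skills, dictionary=IT_DICTIONARY):
    """Return the professions whose frequency is >= the average frequency."""
    frequency = Counter()
    for skill in skills:
        # Skills that are missing from the dictionary contribute nothing.
        frequency.update(dictionary.get(skill, []))
    if not frequency:
        return []
    average = sum(frequency.values()) / len(frequency)
    return sorted(p for p, count in frequency.items() if count >= average)


# 'backend' appears 3 times, every other profession once (average 7/5 = 1.4).
print(categorize(["expressjs", "firebase", "flask"]))
```

With these three skills only "backend" reaches the average, which shows how the threshold discards professions that are supported by a single skill.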
3.4. Matching module

In this module, when a job offer is being processed, the list of professional categories obtained is taken and, for each category, the CVs of the same category are extracted from the database. When a profile or CV is being processed, the documents extracted from the database are job offers.

With the set of documents obtained, a data table is built. Its column headers are the IT skills detected in the filtered set and in the item being processed, and each row represents a profile or CV. The value at each row-column intersection depends on the following conditions:

• 0 is assigned if the CV does not possess the IT skill described in the column.
• 1 is assigned if the CV possesses the IT skill described in the column.
• 2 is assigned if the CV contains the IT skill described in the column and it matches one of the requirements of the job offer.

When a profile or CV is being processed, the criteria are the same, except that each row represents a job offer.
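A minimal Python sketch of how such a table could be built for the job-offer case is shown below. The function name `build_matching_table`, the identifiers `cv_a` and `cv_b`, and the use of sets and a sorted column order are illustrative assumptions, not details given in the text.

```python
def build_matching_table(offer_required, cvs):
    """Build the 0/1/2 table: one row per CV, one column per detected skill.

    offer_required: set of skills required by the job offer being processed.
    cvs: dict mapping a CV identifier to the set of skills detected in that CV.
    """
    # Columns: every skill seen in the offer or in any of the filtered CVs.
    columns = sorted(set(offer_required).union(*cvs.values()))
    table = {}
    for cv_id, skills in cvs.items():
        row = []
        for skill in columns:
            if skill not in skills:
                row.append(0)   # the CV does not have the skill
            elif skill in offer_required:
                row.append(2)   # the CV has the skill and the offer requires it
            else:
                row.append(1)   # the CV has the skill, the offer does not require it
        table[cv_id] = row
    return columns, table


columns, table = build_matching_table(
    {"java", "android"},
    {"cv_a": {"java", "spring"}, "cv_b": {"android", "java"}},
)
print(columns)
print(table)
```

For the profile/CV case the same function could be reused by passing the CV skills as the first argument and the job offers as rows.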
This data table is the input for clustering. The unsupervised Mean-shift algorithm analyzes this set and assigns a group or cluster number to each element. Unlike other algorithms, it does not require the number of clusters to be specified; it iterates over the elements of the set and establishes the number of clusters itself. Once the process is finished, we have the cluster number to which each element belongs. The elements that are in the same cluster as the document (job offer or CV) being processed are the output of the clustering component.
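To illustrate how Mean-shift finds the number of clusters on its own, the following is a small pure-Python sketch with a flat kernel. It is not the implementation used in this work, which is not described; in practice a library implementation such as scikit-learn's `MeanShift` would normally be preferred. The `bandwidth` value, the merge radius of `bandwidth / 2` and the sample rows are assumptions chosen for the example.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean shift; returns one cluster label per input point."""

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    modes = []
    for point in points:
        current = [float(x) for x in point]
        for _ in range(max_iter):
            # Average all points that fall inside the window around `current`.
            window = [q for q in points if dist(current, q) <= bandwidth]
            if not window:
                break
            shifted = [sum(col) / len(window) for col in zip(*window)]
            moved = dist(current, shifted)
            current = shifted
            if moved < tol:
                break
        modes.append(current)

    # Points whose modes (almost) coincide receive the same cluster label.
    centers, labels = [], []
    for mode in modes:
        for label, center in enumerate(centers):
            if dist(mode, center) <= bandwidth / 2:
                labels.append(label)
                break
        else:
            centers.append(mode)
            labels.append(len(centers) - 1)
    return labels


# Rows in the style of the 0/1/2 data table (three skill columns).
rows = [[2, 2, 0], [2, 1, 0], [0, 0, 2], [0, 0, 1]]
print(mean_shift(rows, bandwidth=1.5))
```

The number of clusters is never passed in; it emerges from how many distinct modes the points converge to, which depends on the chosen bandwidth.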
3.5. Model output

Our final objective is to obtain a ranking of candidates; therefore, we order the CVs (obtained during clustering) by the percentage of skills that each CV fulfills with respect to those specified in the job offer. Put differently, consider a CV_i, where i ∈ N, which contains a list of skills HCV, and a job offer, which contains the required skills (RS) and the desirable skills (DS). The percentage of RS (%RS) is calculated as the number of RS contained in HCV divided by the total number of HCV items; the percentage of DS (%DS) is calculated analogously.

As an example, given a CV and a job offer with RS and DS, the percentages of RS and DS are calculated as follows:
HCV = [Java, Spring, JSF, Oracle, Android, Flutter, Spring Boot]
• n(HCV) = 7
• RS = [Java, Android, React, Flutter] → %RS = 3/7 ≈ 42.9%
• DS = [Spring, Spring Boot] → %DS = 2/7 ≈ 28.6%
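The calculation in the example can be reproduced with a few lines of Python. The function names `skill_percentage` and `rank_cvs` are hypothetical, and ordering first by %RS and then by %DS is an assumption: the text only states that CVs are ordered by the percentage of skills fulfilled.

```python
def skill_percentage(cv_skills, offer_skills):
    """Percentage of offer skills found in the CV, relative to n(HCV)."""
    if not cv_skills:
        return 0.0
    matched = len(set(cv_skills) & set(offer_skills))
    return 100.0 * matched / len(cv_skills)


def rank_cvs(cvs, required, desirable):
    """Order CVs by %RS and then by %DS, both in descending order."""
    scored = [
        (cv_id, skill_percentage(skills, required), skill_percentage(skills, desirable))
        for cv_id, skills in cvs.items()
    ]
    return sorted(scored, key=lambda item: (item[1], item[2]), reverse=True)


hcv = ["Java", "Spring", "JSF", "Oracle", "Android", "Flutter", "Spring Boot"]
rs = ["Java", "Android", "React", "Flutter"]
ds = ["Spring", "Spring Boot"]
print(round(skill_percentage(hcv, rs), 1))  # 3/7 of the CV skills are required skills
print(round(skill_percentage(hcv, ds), 1))  # 2/7 of the CV skills are desirable skills
```

Note that the denominator is n(HCV), as defined above, so a CV that lists many unrelated skills obtains a lower percentage than a shorter CV with the same matches.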

4. Results and Discussion

For the evaluation and discussion of results, we used 200 job offers and 50 profiles or CVs. In addition, we rely on an IT dictionary that consists of 225 skills and the occupations associated with each of them.

As indicated in Section 3, our model consists of 3 components: pre-processing, categorization and clustering. In this section we show the results of processing a document (job offer or CV) through each of these components.

4.1. Pre-processing results

4.1.1 Case 1: CV

When a CV is registered through the web system form, the section containing IT knowledge or skills is processed to detect the skills with the highest semantic similarity:

Table 6. CV: Pre-processing results

CV         IT skills detected    Most similar IT skills
cv_0001    9 = ['html', 'css', 'javascript', 'java', 'php', 'laravel', 'vuejs', 'rxjava', 'spring']    8 = ['html5', 'css3', 'javascript', 'php', 'vue.js', 'java', 'spring', 'laravel']
cv_0002    13 = ['html', 'css', 'javascript', 'typescript', 'java', 'php', 'python', 'angular', 'nodejs', 'azure', 'react', 'js', 'nestjs']    9 = ['html5', 'css3', 'javascript', 'php', 'typescript', 'angular', 'python', 'react', 'nodejs']
cv_0049    16 = ['java', 'hibernate', 'jpa', 'mybatis', 'spring', 'spring', 'javascript', 'c', 'python', 'flask', 'html', 'css', 'datastage', 'sql', 'linux', 'android']    10 = ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']
cv_0050    11 = ['python', 'django', 'drf', 'flask', 'angular', 'android', 'java', 'css', 'html', 'js', 'net']    5 = ['android', 'java', 'css3', 'html5', 'javascript']
Table 6 shows the results obtained by pre-processing the skills section. At this point, Word2vec helps us reduce the skills detected in that section, and as a result we obtain the IT skills that are most related to each other or, in other words, those that have the greatest semantic relationship.

4.1.2 Case 2: job offer

In the case of a job offer, on the other hand, the sections that go through pre-processing are the required skills and the desirable skills, since these include IT skills.
Table 7. Job offer: pre-processing results

Offer         IT skills detected    Most similar IT skills
Oferta_1      9 = ['html', 'css', 'javascript', 'nodejs', 'angular', 'php', 'laravel', 'aws', 'azure']    8 = ['html5', 'css3', 'javascript', 'php', 'nodejs', 'laravel', 'aws', 'azure']
Oferta_2      13 = ['php', 'javascript', 'typescript', 'c#', 'xamarin', 'python', 'symfony', 'django', 'html', 'css', 'aws', 'dynamo', 'angular']    10 = ['php', 'python', 'symfony', 'css3', 'javascript', 'html5', 'typescript', 'angular', 'c#', 'xamarin']
Oferta_199    14 = ['android', 'html', 'css', 'javascript', 'typescript', 'java', 'php', 'python', 'angular', 'nodejs', 'azure', 'react', 'js', 'nestjs']    9 = ['html5', 'css3', 'javascript', 'php', 'typescript', 'angular', 'python', 'react', 'nodejs']
Oferta_200    10 = ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']    10 = ['java', 'spring', 'android', 'hibernate', 'mybatis', 'javascript', 'html5', 'css3', 'python', 'linux']
Table 7 shows the results obtained from pre-processing the skills sections (required and desirable). As in the case of the CVs, the output indicates which IT skills have the highest semantic relationship.

4.2. Categorization results

4.2.1 Case 1: CV

After obtaining the IT skills with the highest semantic similarity, each skill is queried in a dictionary to obtain the associated IT occupations.
Table 8. CV: categorization results

CV       Assigned categories
cv_0001  ['frontend', 'web developer']
cv_0002  ['frontend', 'web developer', 'js developer']
cv_0049  ['java developer', 'backend']
cv_0050  ['java developer', 'frontend']
As mentioned above, the skills obtained as output from pre-processing are looked up in the IT dictionary and, as a result, we obtain the data shown in Table 8.

4.2.2 Case 2: job offer

For a job offer, the same process is applied as in Case 1, but the skills looked up in the IT dictionary are those detected in the required skills section, since these are the ones that best describe the required profile.
Table 9. Job offer: categorization results

Offer       Assigned categories
Oferta_1    ['frontend', 'web developer', 'js developer', 'php developer']
Oferta_2    ['php developer', 'backend', 'web developer', 'frontend', '.net developer']
Oferta_199  ['web developer']
Oferta_200  ['java developer', 'web developer']
As a result, Table 9 shows the categories obtained for each job offer that goes through the categorization process.

4.3. Clustering results

4.3.1 Case 1: CV

In the clustering stage, if the document entered is a CV, the job offers that share a category with it are extracted from the database in order to reduce the volume of data to be processed. From this set of data (CV and job offers) the input dataset for the Mean-shift algorithm is created, and as a result we obtain a cluster containing the CV and a subset of job offers (the offers most suitable for the profile).


Table 10. CV: clustering results

CV       Best offers  % of required skills met  % of desired skills met
cv_0001  oferta_0057  60.0                      0
         oferta_0048  30.77                     0
         oferta_0021  30.77                     0
cv_0002  oferta_0084  66.67                     0
         oferta_0048  34.62                     0
         oferta_0021  34.62                     0
cv_0049  oferta_0105  41.18                     100
         oferta_0007  33.34                     0
         oferta_0048  30.77                     0
         oferta_0021  30.77                     0
cv_0050  oferta_0084  55.56                     0
         oferta_0003  45.45                     0
         oferta_0073  45.45                     0
         oferta_0110  27.78                     0
         oferta_0007  27.78                     0
         oferta_0108  25.0                      0
         oferta_0048  23.08                     0
         oferta_0021  23.08                     0
         oferta_0051  18.75                     0
3 With the subset obtained, as shown in Table 10, we created a ranking of the best
4job offers for each CV.
5 In the proposal made by [4], they employ semantic matching to calculate the
6distance between the candidate’s profile skills and experience with the job offer
7requirements. On the other hand, [2] use string matching to evaluate the
8correspondence between a vacancy (job offer) and a profile. In this type of methods, it
9does not consider that some IT skills can be represented in more than one form (Ex:
10Javascript can be found in some offers or profiles al JS). Therefore, in our proposal we
11create an IT dictionary to deal with this problem. Such a dictionary not only informs
12us about the occupations related to a skill, but also considers the various forms of
13writing with which this IT skill can be represented. The latter contributes to broaden
14the detection of skills and, thus, to obtain a better quality result.
In the study conducted by [19], the authors propose a method to automatically classify CVs to their respective job offers. They categorize/label the documents so that only elements of the same category are compared. To this end, they combined two knowledge bases (DICE and O*NET), from which they obtained the occupation associated with each skill. In our proposal, by contrast, we constructed an IT dictionary with 226 skills, in which each skill has a set of associated IT occupations according to the current market. This is what differentiates our work from the aforementioned proposal: ours focuses exclusively on the IT area and uses a knowledge base built manually for this purpose.
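A dictionary entry that links a skill to its IT occupations could look like the sketch below. The two entries, the field names `forms` and `occupations`, and the counting function `occupations_for` are hypothetical; the paper does not specify the storage format or how the occupations are aggregated, so this is only one possible reading.

```python
from collections import Counter

# Hypothetical excerpt of the IT dictionary: each skill lists its written
# forms and the IT occupations associated with it.
IT_DICTIONARY = {
    "javascript": {
        "forms": ["javascript", "js"],
        "occupations": ["front-end developer", "full-stack developer"],
    },
    "sql": {
        "forms": ["sql"],
        "occupations": [
            "database administrator",
            "back-end developer",
            "full-stack developer",
        ],
    },
}


def occupations_for(skills):
    """Count how often each occupation is associated with the given skills."""
    counts = Counter()
    for skill in skills:
        entry = IT_DICTIONARY.get(skill)
        if entry is not None:  # skills missing from the dictionary are ignored
            counts.update(entry["occupations"])
    return counts


counts = occupations_for(["javascript", "sql", "cobol"])
print(counts.most_common(1))
```

Counting occupations across the detected skills gives a rough indication of which occupation a CV or offer points to, which is one way the category information could be used to compare only like with like.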

5. Conclusions and future work

An architecture based on Natural Language Processing and Machine Learning is proposed to address the problem of recruiting IT personnel.

As shown in the cited references, in addition to skills or knowledge, other qualities are assessed to determine which person best meets the requirements of a job offer. Among these is the work history, from which we can obtain the years of experience and the positions held, among others. As future work, we want to build on this architecture to design a generalized architecture for recruitment.

6. References

1. Almalis, N. D., Tsihrintzis, G. A., Karagiannis, N., & Strati, A. D. (2016). FoDRA - A new content-based job recommendation algorithm for job seeking and recruiting. IISA 2015 - 6th International Conference on Information, Intelligence, Systems and Applications.
2. Chala, S. A., Ansari, F., Fathi, M., & Tijdens, K. (2018). Semantic matching of job seeker to vacancy: a bidirectional approach. International Journal of Manpower, 39(8), 1047–1063.
3. Chaudhary, A., Jobanputra, M., Shah, S., Gandhi, R., Chaudhary, S., & Goswami, R. (2018). Automated human capital management system. 12th Annual IEEE International Systems Conference, SysCon 2018 - Proceedings, 1–8.
4. Faliagka, E., Iliadis, L., Karydis, I., Rigou, M., Sioutas, S., Tsakalidis, A., & Tzimas, G. (2014). On-line consistent ranking on e-recruitment: Seeking the truth behind a well-formed CV. Artificial Intelligence Review, 42(3), 515–528.
5. Fernández-Reyes, F. C., & Shinde, S. (2019). CV Retrieval System based on job description matching using hybrid word embeddings. Computer Speech and Language, 56, 73–79.
6. Figueroa-García, J. C., Kalenatic, D., & López-Bello, C. A. (2015). Artificial Intelligent Techniques in Human Resource Management. Intelligent Systems Reference Library, 87, 623–643.
7. Guo, S., Alamudun, F., & Hammond, T. (2016). RésuMatcher: A personalized résumé-job matching system. Expert Systems with Applications, 60, 169–182.
8. Heap, B., Krzywicki, A., Wobcke, W., Bain, M., & Compton, P. (2014). Combining career progression and profile matching in a job recommender system. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8862, 396–408.
9. Kaoru, S., Kenichi, S., Masatomo, K., & Atsuhi, H. (2017). Towards extracting recruiters’ tacit knowledge based on interactions with a job matching system. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 10298, 557–568.


10. Kmail, A. B., Maree, M., Belkhatir, M., & Alhashmi, S. M. (2016). An automatic online recruitment system based on exploiting multiple semantic resources and concept-relatedness measures. Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI, 2016-Janua, 620–627.
11. Mehta, M., Derasari, R., Patel, S., Kakadiya, A., Gandhi, R., Chaudhary, S., & Goswami, R. (2019). A service-oriented human capital management recommendation platform. SysCon 2019 - 13th Annual IEEE International Systems Conference, Proceedings, 1–8.
12. Mohamed, A., Bagawathinathan, W., Iqbal, U., Shamrath, S., & Jayakody, A. (2018). Smart Talents Recruiter - Resume Ranking and Recommendation System. 2018 IEEE 9th International Conference on Information and Automation for Sustainability, ICIAfS 2018, 1–5.
13. Patel, B., Kakuste, V., & Eirinaki, M. (2017). CaPaR: A career path recommendation framework. Proceedings - 3rd IEEE International Conference on Big Data Computing Service and Applications, BigDataService 2017, 23–30.
14. Rimitha, S. R., Abburu, V., Kiranmai, A., Marimuthu, C., & Chandrasekaran, K. (2019). Improving Job Recommendation Using Ontological Modeling and User Profiles. 2019 15th International Conference on Information Processing: Internet of Things, ICINPRO 2019 - Proceedings.
15. Roy, P. K., Chowdhary, S. S., & Bhatia, R. (2020). A Machine Learning approach for automation of Resume Recommendation system. Procedia Computer Science, 167(2019), 2318–2327.
16. Vallejo Chávez, L. M. (2016). Gestión del talento humano ESPOCH 2016.
17. Wang, H., Liang, G., & Zhang, X. (2018). Feature Regularization and Deep Learning for Human Resource Recommendation. IEEE Access, 6, 39415–39421.
18. Wierfi, A. D., Utami, E., & Sunyoto, A. (2019). The application of extended weighted tree similarity algorithm for similarity searching. 2019 International Conference on Information and Communications Technology, ICOIACT 2019, 428–433.
19. Zaroor, A., Maree, M., & Sabha, M. (2018). A Hybrid Approach to Conceptual Classification and Ranking of Resumes and Their Corresponding Job Posts. International Conference on Intelligent Decision Technologies, 2, 13–21.
20. Zhu, C., Zhu, H., Xiong, H., Ma, C., Xie, F., Ding, P., & Li, P. (2018). Person-Job Fit. ACM Transactions on Management Information Systems, 9(3), 1–17.