

Job Recommendation: An Approach to Match Job-Seeker’s Interest with Enterprise’s Requirement

Ngoc-Trung-Kien Ho, Hung Ho-Dac, and Tuan-Anh Le

Abstract Globalization has increased the demand for changing job positions,
which has led to a rapid increase in job seeking. Finding a suitable job or an
appropriate employee is becoming more complicated because of the huge amount of
information. In this context, a job recommender system is an effective means of
bridging the gap between job-seekers and enterprises. Matching an enterprise’s
requirements with a job-seeker’s interest to find a potential job is one of the
core features of a job recommender system. In this paper, we propose an approach
to match a job-seeker’s interest with an enterprise’s requirement by applying
similarity metrics, namely the Cosine, Manhattan, Jaccard and Levenshtein metrics.
Our experiments are conducted on a real dataset from the Binh Duong Career Service
Center (BDCSC). The experimental results show that the Levenshtein distance
achieves the highest accuracy among these metrics.

Keywords Job recommendation · Similarity metrics · Matching

1 Introduction

In recent years, the demand for job seeking has been increasing rapidly, especially
in developing countries [1]. Taking Vietnam as an example, Navigos Group Vietnam JSC
recorded an increase of more than 20% in online labor supply in its system in
mid-2019 compared with the same period in 2018. Navigos Group Vietnam JSC also
recorded a compound annual growth rate of 14.3% in online labor supply between 2014
and 2019 [2]. The more information there is, the more complex the search becomes for
both job-seekers and enterprises. To match job-seekers and enterprises
automatically, job recommender systems are integrated into online recruitment
platforms. This integration helps job-seekers reduce search time and makes the
recruitment of enterprises more effective.
In this work, we propose an approach to match a job-seeker’s interest with an
enterprise’s requirement by applying similarity metrics. The rest of the paper is
organized as follows. Section 2 surveys the related work. We introduce the
methodology of our approach in Sect. 3. Section 4 evaluates the results of the
approach. We conclude our work in Sect. 5.

N.-T.-K. Ho · H. Ho-Dac · T.-A. Le
Thu Dau Mot University, Binh Duong Province, Vietnam

© Springer Nature Singapore Pte Ltd. 2021
R. Kumar et al. (eds.), Research in Intelligent and Computing in Engineering,
Advances in Intelligent Systems and Computing 1254,
https://doi.org/10.1007/978-981-15-7527-3_35

2 Related Work

Many studies address the job recommendation domain. In [3], Al-Otaibi et al.
conducted a survey of job recommender systems, covering collaborative filtering,
content-based filtering, hybrid approaches, machine learning, e-recruiting and
similarity measures. Their work also presented the e-recruitment process in the
modern context. Hong et al. [4] proposed an approach based on job-seeker
clustering and applied different methods to each job-seeker cluster. Dai-Dong Nguyen
analyzed the real situation of labor supply and demand and its solutions [5];
this analysis highlights the situation of the labor market in Vietnam and provides
future solutions for connecting employees and employers in Vietnam. In another
work, Gupta et al. [6] applied data mining techniques to job recommendation.

3 Methodology

In this section, we describe our approach to building the job recommender core,
which has five main steps, as shown in Fig. 1. We use the jsed dataset in our
experiments. The jsed dataset is a real dataset retrieved from BDCSC, which we
reorganized into CSV format. The dataset originally contains 231 tables, but for
the purposes of this work we extract data from two tables, as described below:
(i) Table Job-Seekers: contains 78,669 records with 13 columns.
(ii) Table Enterprises: contains 33,807 records with 13 columns.

[Figure: Data Retrieval → Pre-processing → Vectorization → Similarity Computation → Evaluation]

Fig. 1 Proposed approach procedure



3.1 Preprocessing and Vectorization

Detailed requirements of job positions are given in job descriptions (JDs), so we
extract the JDs from table Enterprises. We also extract the profiles of job-seekers
from table Job-Seekers. JDs and profiles are then cleaned by eliminating stop words
using tf-idf, defined by (1), (2) and (3):

$$ \mathrm{tf}(t, d) = 0.5 + 0.5 \cdot \frac{f_{t,d}}{\max\{ f_{t',d} : t' \in d \}}, \tag{1} $$

$$ \mathrm{idf}(t, D) = \log \frac{N}{|\{ d \in D : t \in d \}|}, \tag{2} $$

$$ \mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t, D) \tag{3} $$

Here f_{t,d} is the number of occurrences of term t in document d, and N is the
number of documents in the corpus D. The texts are also segmented using
VnTokenizer [7]. To prepare for similarity computation, we vectorize the JDs and
profiles.
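The weighting in (1) to (3) can be illustrated with a minimal Python sketch. It reads (3) as the product of tf and idf, assumes a natural logarithm, and uses a made-up toy corpus of already tokenized documents; the function names are illustrative and are not taken from the jsed pipeline.

```python
import math
from collections import Counter


def tf(term, doc_tokens):
    """Augmented term frequency, Eq. (1): 0.5 + 0.5 * f / max f."""
    counts = Counter(doc_tokens)
    return 0.5 + 0.5 * counts[term] / max(counts.values())


def idf(term, corpus):
    """Inverse document frequency, Eq. (2). Assumes a natural logarithm
    and that the term occurs in at least one document of the corpus."""
    containing = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / containing)


def tfidf(term, doc_tokens, corpus):
    """Eq. (3): product of tf and idf."""
    return tf(term, doc_tokens) * idf(term, corpus)


# Toy corpus (hypothetical data): each document is a list of tokens.
corpus = [
    ["java", "developer", "java"],
    ["sales", "developer"],
    ["accountant", "developer"],
]

weight_java = tfidf("java", corpus[0], corpus)            # rare term, high weight
weight_developer = tfidf("developer", corpus[0], corpus)  # occurs everywhere, weight 0
```

A term that appears in every document, such as "developer" here, receives an idf of zero and therefore a tf-idf weight of zero, which is one plausible criterion for treating it as a stop word.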

3.2 Similarity Computation

In this work, we use four similarity metrics: Cosine, Manhattan, Jaccard and
Levenshtein.
Cosine: Computing document similarity with the Cosine metric is a simple method
that gives results with high accuracy. Each document is represented as a bag of
words, regardless of grammar and word order, and is segmented into words (n-grams).
The number of occurrences of each distinct word becomes one element of an
n-dimensional vector, where n is the number of distinct words in the document.
After converting the two documents into vectors a and b, we can use the Cosine
metric to compute the similarity of the texts [8]. The Cosine similarity is
computed as in (4):

$$ \mathrm{Sim}_C(\vec{a}, \vec{b}) = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\| \, \|\vec{b}\|} \tag{4} $$
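A minimal Python sketch of the Cosine similarity in (4) on bag-of-words count vectors is given here. It assumes the documents are already tokenized; the function name and the convention of returning 0.0 for an empty document are choices made for this illustration only.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity, Eq. (4), between two bag-of-words count vectors."""
    count_a, count_b = Counter(tokens_a), Counter(tokens_b)
    # Only words shared by both documents contribute to the dot product.
    dot = sum(count_a[w] * count_b[w] for w in count_a.keys() & count_b.keys())
    norm_a = math.sqrt(sum(c * c for c in count_a.values()))
    norm_b = math.sqrt(sum(c * c for c in count_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty documents
    return dot / (norm_a * norm_b)


profile = ["java", "sql"]       # hypothetical job-seeker profile
job_desc = ["java", "python"]   # hypothetical job description
score = cosine_similarity(profile, job_desc)
```

Using `Counter` avoids building the full vocabulary explicitly: words absent from one document have an implicit count of zero.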
Manhattan: This metric is a distance between two points in a Euclidean space with
a Cartesian coordinate system. It is the total length of the projections, onto the
coordinate axes, of the line segment connecting the two points. When two documents
are represented as two vectors a and b, the Manhattan distance [9, 10] is computed
as in (5):

$$ L_M(\vec{a}, \vec{b}) = \sum_{i=1}^{n} |a_i - b_i| \tag{5} $$

When the vector components are normalized to [0, 1], the mean absolute difference
L_M(a, b)/n ranges from 0 to 1. Hence, the similarity of the two vectors a and b is
computed as in (6):

$$ \mathrm{Sim}_M(\vec{a}, \vec{b}) = 1 - \frac{\sum_{i=1}^{n} |a_i - b_i|}{n} \tag{6} $$
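The Manhattan-based similarity in (6) can be sketched in Python as follows. The sketch assumes both vectors have the same length n and that their components lie in [0, 1] (for example, binary term-presence indicators), so that the result stays between 0 and 1; this normalization is an assumption of the example.

```python
def manhattan_similarity(a, b):
    """Similarity from Eq. (6): 1 minus the mean absolute difference.

    Assumes len(a) == len(b) and components in [0, 1]."""
    if len(a) != len(b) or not a:
        raise ValueError("vectors must be non-empty and of equal length")
    distance = sum(abs(x - y) for x, y in zip(a, b))  # Eq. (5)
    return 1.0 - distance / len(a)


# Hypothetical binary term-presence vectors over a three-word vocabulary.
profile_vec = [1, 0, 1]
job_vec = [1, 1, 0]
score = manhattan_similarity(profile_vec, job_vec)  # 1 - 2/3
```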

Jaccard: The input string of each document is converted into a set of n-grams.
Let A and B be the n-gram sets of the two documents to be compared, represented
by a and b, respectively. The Jaccard coefficient [8, 11] is calculated as in (7):

$$ \mathrm{Sim}_J(\vec{a}, \vec{b}) = \frac{|\vec{a} \cap \vec{b}|}{|\vec{a} \cup \vec{b}|} \tag{7} $$
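A minimal Python sketch of the Jaccard coefficient in (7) follows. The text does not state whether the n-grams are taken over characters or words; this example assumes character bigrams, and the helper names are illustrative.

```python
def char_ngrams(text, n=2):
    """Set of character n-grams of a string (character bigrams by default)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard_similarity(text_a, text_b, n=2):
    """Jaccard coefficient, Eq. (7): |A ∩ B| / |A ∪ B| over n-gram sets."""
    set_a, set_b = char_ngrams(text_a, n), char_ngrams(text_b, n)
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(set_a & set_b) / len(set_a | set_b)


# "night" -> {ni, ig, gh, ht}; "nacht" -> {na, ac, ch, ht}; one shared bigram of seven.
score = jaccard_similarity("night", "nacht")
```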


Levenshtein: This distance measures the difference between two character strings.
The Levenshtein distance between string A and string B is the smallest number of
steps needed to convert A into B using three operations: deleting one character,
inserting one character and substituting one character for another. To compute the
Levenshtein distance [12], we use a dynamic programming algorithm on a
2-dimensional (m + 1) × (n + 1) array, where m and n are the lengths of strings A
and B, respectively. The similarity based on the Levenshtein distance is given
by (8):

$$ \mathrm{Sim}_L(A, B) = 1 - \frac{d[m, n]}{s} \tag{8} $$

where d[m, n] is computed by the algorithm in Listing 1.

3.3 Evaluation

BDCSC regularly hosts employment support sessions in which job-seekers can take
part and receive a list of recommended job positions. We retrieve the jobs with
high similarity according to the Cosine, Manhattan, Jaccard and Levenshtein metrics.
Job-seekers then help us determine whether the recommendations are appropriate.
Based on these verifications, we summarize and compute the performance of the
proposed metrics. The detailed results are presented in Sect. 4.

func LevenshteinDistance(char A[1..m], char B[1..n]):
    // input: two strings A and B
    // output: Levenshtein distance between string A and string B

    // declaration
    declare int d[0..m, 0..n]

    // initialization: distance from or to the empty prefix
    for i from 0 to m
        d[i,0] := i
    for j from 0 to n
        d[0,j] := j

    // computation
    for i from 1 to m
        for j from 1 to n
            if A[i] == B[j] then cost := 0
            else cost := 1
            d[i,j] := minimum(d[i-1,j]+1, d[i,j-1]+1, d[i-1,j-1]+cost)

    return d[m,n]

Listing 1 Pseudo-code of Levenshtein distance algorithm
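A runnable Python counterpart of the dynamic-programming algorithm in Listing 1 is sketched below, with a cost of 1 for substituting differing characters, together with the similarity of (8). The normalizer s in (8) is not defined in the text; this example assumes s = max(m, n), the length of the longer string, which is a common choice but an assumption here.

```python
def levenshtein_distance(a, b):
    """Edit distance between strings a and b (deletion, insertion, substitution)."""
    m, n = len(a), len(b)
    # d[i][j] holds the distance between a[:i] and b[:j].
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # delete all i characters
    for j in range(n + 1):
        d[0][j] = j  # insert all j characters
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[m][n]


def levenshtein_similarity(a, b):
    """Eq. (8) with the ASSUMED normalizer s = max(len(a), len(b))."""
    s = max(len(a), len(b))
    if s == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein_distance(a, b) / s


distance = levenshtein_distance("kitten", "sitting")
similarity = levenshtein_similarity("kitten", "sitting")
```

With this normalizer the similarity lies in [0, 1], because the edit distance can never exceed the length of the longer string.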

4 Results

We build our recommendation core and extract recommended job positions based on
the proposed metrics for 500 job-seekers who interacted with BDCSC. The specific
process for job-seekers at BDCSC is described in Fig. 2.
Feedback from job-seekers is recorded and summarized in Table 1. The summary shows
that the number of positive responses varies strongly between metrics. The
Levenshtein distance receives the most positive feedback, while the Jaccard
coefficient receives the least.

[Figure: Request Job Description → Receive Job Description → Evaluate Job Description → Verify suitability]

Fig. 2 Proposed approach procedure



Table 1 Job-seekers’ feedback summary

                                      Cosine   Manhattan   Jaccard   Levenshtein
Number of proposed job descriptions   12,690   12,690      12,690    12,690
Number of positive feedbacks          3,870    4,020       1,370     9,350

Table 2 Accuracy of recommendation based on the proposed metrics

Metric        Accuracy (%)
Cosine        30.50
Manhattan     31.68
Jaccard       10.80
Levenshtein   73.68

The accuracy of the proposed metrics is presented in Table 2. The results show
that Levenshtein achieves the highest accuracy.
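The accuracy figures in Table 2 are consistent with dividing the number of positive feedbacks by the number of proposed job descriptions in Table 1. The short Python check below reproduces them under that reading; the variable names are illustrative.

```python
# Counts taken from Table 1.
proposed = 12690
positive = {
    "Cosine": 3870,
    "Manhattan": 4020,
    "Jaccard": 1370,
    "Levenshtein": 9350,
}

# Accuracy (%) = positive feedbacks / proposed job descriptions * 100.
accuracy = {metric: round(100 * count / proposed, 2)
            for metric, count in positive.items()}

best_metric = max(accuracy, key=accuracy.get)
```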

5 Conclusion

In this work, we propose an approach to match a job-seeker’s interest with an
enterprise’s requirement based on similarity. Popular similarity metrics, namely
Cosine, Manhattan, Jaccard and Levenshtein, are examined on a real dataset (jsed)
provided by BDCSC. Recommendation and verification are carried out directly at the
Binh Duong Career Service Center. We find that Levenshtein achieves the highest
accuracy, receiving the most positive feedback from job-seekers in Binh Duong
province. In future work, we will consider more personalized factors in order to
improve the performance of our proposed approach.

References

1. Erixon F (2018) The economic benefits of globalization for business and consumers.
   European Centre for International Political Economy
2. Report on the online recruitment market in Vietnam in the first half of 2019 (2019).
   Navigos Group Vietnam JSC
3. Al-Otaibi ST, Ykhlef M (2012) A survey of job recommender systems. Int J Phys Sci
   7(29):5127–5142
4. Hong W et al (2013) A job recommender system based on user clustering. JCP 8(8):1960–1967
5. Nguyen D-D (2011) An analysis of the real situation of labor supply, demand and solutions.
   Ministry of Labor, War Invalids and Social Affairs
6. Gupta A, Garg D (2014) Applying data mining techniques in job recommender system
   for considering candidate job preferences. In: 2014 international conference on
   advances in computing, communications and informatics (ICACCI). IEEE
7. Phuong LH et al (2008) A hybrid approach to word segmentation of Vietnamese texts.
   Lang Automata Theory Appl 240–249
8. Huang A (2008) Similarity measures for text document clustering. In: Proceedings of
   the sixth New Zealand computer science research student conference (NZCSRSC2008),
   vol 4. Christchurch, New Zealand
9. Khatibsyarbini M, Isa MA, Abang Jawawi DN (2017) A hybrid weight-based and string
   distances using particle swarm optimization for prioritizing test cases. J Theor
   Appl Inf Tech 95:12
10. Ledru Y et al (2012) Prioritizing test cases with string distances. Automated
    Software Engineering 19(1):65–95
11. Niwattanakul S et al (2013) Using of Jaccard coefficient for keywords similarity.
    In: Proceedings of the international multiconference of engineers and computer
    scientists, vol 1, 6
12. Su Z et al (2008) Plagiarism detection using the Levenshtein distance and
    Smith-Waterman algorithm. In: 2008 3rd international conference on innovative
    computing information and control. IEEE
