Collaborative Job Prediction Based on Naive Bayes Classifier Using Python Platform
Abstract - This paper implements a recommendation system for job portals based on the collaborative filtering technique. The system suggests jobs to a user according to the user's profile by calculating a similarity index using the Euclidean distance between two skill sets and then ranking the results with the naive Bayes algorithm. The recommendation system has been implemented in Python.

The Netflix competition for recommendation systems [8] highlights the combination of content-based filtering and collaborative filtering. The analysis of different algorithms by M. Papagelis and D. Plexousakis [9] shows how recommendations can be obtained depending upon the data set. On similar grounds, depending upon the job data and its type, different methods were combined to obtain a suitable algorithm for job prediction.
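The similarity index mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each skill set is encoded as a binary vector over a shared vocabulary and that distance is mapped to a similarity with 1 / (1 + d); the encoding, the mapping, and the names `skill_vector` and `euclidean_similarity` are illustrative assumptions, not details given in the paper.

```python
import math


def skill_vector(skills, vocabulary):
    """Encode a skill set as a 0/1 vector over a fixed vocabulary (assumed encoding)."""
    return [1.0 if skill in skills else 0.0 for skill in vocabulary]


def euclidean_similarity(a, b):
    """Map Euclidean distance to a similarity in (0, 1] (assumed mapping: 1 / (1 + d))."""
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return 1.0 / (1.0 + distance)


# Hypothetical vocabulary and profiles, for illustration only.
vocabulary = ["python", "sql", "html", "c++"]
user = skill_vector({"python", "sql"}, vocabulary)
job_a = skill_vector({"python", "sql", "html"}, vocabulary)
job_b = skill_vector({"c++"}, vocabulary)

sim_a = euclidean_similarity(user, job_a)
sim_b = euclidean_similarity(user, job_b)
```

Under these assumptions, the job whose required skills overlap more with the user's profile (`job_a`) receives the higher similarity.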
III. PROPOSED MODEL
Step 2: The data sanitization phase is used to enhance the data for our computational processing.
Authorized licensed use limited to: BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE. Downloaded on February 15,2023 at 13:52:23 UTC from IEEE Xplore. Restrictions apply.
2016 International Conference on Computational Systems and Information Systems for Sustainable Solutions
Step 5: The filtration unit removes certain skills from the recommendation set based on defined constraints. This operation keeps the recommendations healthy and free of noise. In our current model, skills with low occurrence are filtered and added to a global set so that they may not get recommended along with
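The low-occurrence filter of Step 5 could be sketched as follows. The threshold `min_count`, the function name `filter_low_occurrence`, and the representation of profiles as lists of skill strings are assumptions made for illustration; the paper does not state the constraint values it uses.

```python
from collections import Counter


def filter_low_occurrence(profiles, min_count=2):
    """Drop skills held by fewer than min_count users (hypothetical threshold).

    Returns the filtered profiles and the global set of excluded skills.
    """
    # Count each skill once per profile so duplicates in a profile do not inflate it.
    counts = Counter(skill for profile in profiles for skill in set(profile))
    excluded = {skill for skill, count in counts.items() if count < min_count}
    kept = [[skill for skill in profile if skill not in excluded] for profile in profiles]
    return kept, excluded


# Hypothetical profiles, for illustration only.
profiles = [["python", "sql"], ["python", "cobol"], ["sql", "python"]]
kept, excluded = filter_low_occurrence(profiles)
```

Keeping the excluded skills in one set means later recommendation steps can check membership cheaply instead of recounting occurrences.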
P(j|i): Represents the probability of skill j being in a user's profile given that skill i is already present

Freq(j|i): Represents the number of j-i pairs in the cluster of skill i

Freq(i): Represents the number of users who possess skill i

As P(j|i) would not be equal to P(i|j), this model uses an asymmetric similarity function [5][6][7]. One of the limitations of using an asymmetric similarity function is that each item i will tend to have high conditional probabilities with items that are being purchased frequently. This solution is inspired by the inverse-document scaling performed in information retrieval systems.

V. RESULTS

In order to get optimum results, an optimum error analysis has to be chosen whose results stabilize after a certain normalization factor.

Normalization factor    Euclidean Error    Pearson Error
1                       0.3566             0.2994
2                       0.0711             0.3055
3                       0.0152             0.3073
4                       0.0045             0.3121
5                       0.0021             0.3143
6                       0.0009             0.3127

Table 1 represents the corresponding values of MSE at different normalization factors.

In Fig 3 and Fig 4, the X-axis represents the normalization factor and the Y-axis represents the mean square error rate, which is calculated using the formula

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (W' - W)^2$$

MSE = Mean Squared Error
n = Sample space
(W' - W) = Deviation.
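The MSE formula above can be computed directly. The sketch below assumes that W' and W are given as two equal-length sequences of numbers; the function name `mean_squared_error` and the sample values are illustrative and are not taken from the paper's data.

```python
def mean_squared_error(predicted, actual):
    """Average of squared deviations (W' - W) over n samples."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    return sum((w_pred - w) ** 2 for w_pred, w in zip(predicted, actual)) / n


# Hypothetical values, for illustration only.
mse = mean_squared_error([0.5, 1.0], [0.0, 1.0])
```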
Compared with the Pearson coefficient, whose deviations keep increasing as the normalization factor increases, the Euclidean distance algorithm provides better results for our application.

TABLE 3: Weightage table of Python

                = 0.2             = 0.25
Skill       Pass    Fail      Pass    Fail
Html        0.91    0.09      0.9     0.1
MySQL       0.9     0.1       0.93    0.07
C           0.98    0.02      0.97    0.03
C++         0.94    0.06      0.93    0.07
SQL         0.85    0.15      0.82    0.18
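The naive Bayes ranking step named in the abstract can be illustrated with a minimal sketch. It assumes that each candidate job has a prior and a per-skill likelihood P(skill | job), and that a job is scored by the log of their product over the user's skills. The names `naive_bayes_log_score` and `rank_jobs`, the `floor` value for unseen skills, and all probabilities below are hypothetical; they are not taken from Table 3.

```python
import math


def naive_bayes_log_score(user_skills, likelihood, prior, floor=1e-6):
    """log P(job) + sum of log P(skill | job), assuming skills are independent.

    Skills missing from the likelihood table get a small floor probability
    (an assumed smoothing choice) so that the logarithm stays defined.
    """
    score = math.log(prior)
    for skill in user_skills:
        score += math.log(likelihood.get(skill, floor))
    return score


def rank_jobs(user_skills, models):
    """Return (job, score) pairs sorted from most to least likely."""
    scored = [
        (job, naive_bayes_log_score(user_skills, model["likelihood"], model["prior"]))
        for job, model in models.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical job models, for illustration only.
models = {
    "web": {"prior": 0.5, "likelihood": {"html": 0.9, "sql": 0.6}},
    "systems": {"prior": 0.5, "likelihood": {"c": 0.9, "html": 0.1}},
}
ranking = rank_jobs({"html", "sql"}, models)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied.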