


I.J.Modern Education and Computer Science, 2013, 11, 49-56
Published Online November 2013 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijmecs.2013.11.07

Classification Model of Prediction for Placement of Students
Ajay Kumar Pal
Research Scholar, Sai Nath University, Ranchi, Jharkhand
Email: [email protected]

Saurabh Pal
Head, Department of MCA, VBS Purvanchal University, Jaunpur, India
Email: [email protected]

Abstract— Data mining methodology can analyze relevant information and produce different perspectives to help understand more about students' activities. When designing an educational environment, applying data mining techniques uncovers useful information that can be used in formative evaluation to help educators establish a pedagogical basis for important decisions. Mining in an education environment is called Educational Data Mining. Educational Data Mining is concerned with developing new methods to discover knowledge from educational databases and can be used for decision making in educational systems.
In this study, we collected students' data containing information about their previous and current academic records, and then applied different classification algorithms using the data mining tool WEKA to analyze the students' academic performance for training and placement.
This study presents a proposed model, based on a classification approach, to find an enhanced evaluation method for predicting the placement of students. This model can determine the relations between the academic achievement of students and their placement in campus selection.

Index Terms— Knowledge Discovery in Databases, Data Mining, Classification Model, Classification, WEKA.


I. INTRODUCTION

The majority of students in higher education join a course to secure a good job. Taking a wise career decision regarding placement after completing a particular course is therefore crucial in a student's life. An educational institution contains a large number of student records, so finding patterns and characteristics in this large amount of data is not a difficult task. Higher education is categorized into professional and non-professional education. Professional education provides professional knowledge to students so that they can make their stand in the corporate sector. Professional education may be technology oriented, or it may concentrate entirely on improving the managerial skills of the candidate. The Master in Computer Applications (MCA) course provides professional computer technology education to students. This course provides state-of-the-art theoretical as well as practical knowledge related to information technology and makes students eligible to stand in the progressing information industry.
Predicting where MCA students can be placed after completing the course will help students direct their efforts toward proper progress. It will also help teachers pay proper attention to the progress of students during the course, and it will help build the reputation of the institute among similar institutes in the field of IT education.
The present study concentrates on the prediction of placements of MCA students. We apply data mining techniques using Decision tree and Naïve Bayes classifiers to interpret potential and useful knowledge [7].
The rest of this paper is organized as follows: Section II presents different types of data mining techniques for machine learning. Section III describes the background and history of educational data mining. Section IV describes the methodology used in our experiments applying data mining techniques to educational data for the placement of students, along with the results obtained. Finally, we conclude this paper with a summary and an outlook for future work in Section V.


II. DATA MINING

Data mining, also popularly known as Knowledge Discovery in Databases, refers to extracting or 'mining' knowledge from large amounts of data. Data mining techniques are used to operate on large volumes of data to discover hidden patterns and relationships helpful in decision making. While data mining and knowledge discovery in databases are frequently treated as synonyms, data mining is actually part of the knowledge discovery process.
Data mining means discovering methods and patterns in large databases to guide decisions about future activities. Data mining tools are expected to produce a model with minimal input from the user. The resulting model can be used to understand the unexpected and to provide an analysis of the data that other decision-support tools can then examine, and it

Copyright © 2013 MECS I.J. Modern Education and Computer Science, 2013, 11, 49-56

ultimately leads to strategic decisions and business intelligence. A simpler description of this extraction and exploration of knowledge from very large volumes of data, and a more appropriate name for the term, is "exploring the knowledge of a database". Knowledge discovery in databases is a process that includes the preparation of data and the interpretation of results.
Classification is the most commonly applied data mining technique; it employs a set of pre-classified examples to develop a model that can classify the population of records at large. This approach frequently employs decision tree or neural network-based classification algorithms. The data classification process involves learning and classification. In learning, the training data are analyzed by a classification algorithm. In classification, test data are used to estimate the accuracy of the classification rules. If the accuracy is acceptable, the rules can be applied to new data sets. The classifier-training algorithm uses these pre-classified examples to determine the set of parameters required for proper discrimination. The algorithm then encodes these parameters into a model called a classifier. The widely used classification algorithms are:

A. Naïve Bayesian Classification

The Naïve Bayes classifier technique is particularly suited when the dimensionality of the inputs is high. Despite its simplicity, Naïve Bayes can often outperform more sophisticated classification methods. A Naïve Bayes model can, for example, identify the characteristics of dropout students: it shows the probability of each input attribute for the predictable state.
A Naïve Bayesian classifier is a simple probabilistic classifier based on applying Bayes' theorem (from Bayesian statistics) with strong (naive) independence assumptions. By the use of Bayes' theorem we can write

P(C|X) = P(X|C) P(C) / P(X)

We preferred the Naive Bayes implementation because:

• It is simple and trained on the whole (weighted) training data.
• It protects against over-fitting (small subsets of training data).
• The claim that boosting "never over-fits" could not be maintained.
• A complex resulting classifier can be determined reliably from a limited amount of data.

B. Multilayer Perceptron

The Multilayer Perceptron (MLP) algorithm is one of the most widely used and popular neural networks. The network consists of a set of sensory elements that make up the input layer, one or more hidden layers of processing elements, and the output layer of processing elements (Witten and Frank, [1]). MLP is especially suitable for approximating a classification function (when we are not very familiar with the relationship between input and output attributes) which maps an example, determined by its vector of attribute values, into one or more classes.

C. C4.5 Tree

The most common, and nowadays probably the most widely used, decision tree algorithm is C4.5. Professor Ross Quinlan [2] developed the decision tree algorithm known as C4.5 in 1993; it represents the result of research that traces back to the ID3 algorithm (also proposed by Ross Quinlan, in 1986). C4.5 has additional features such as handling missing values, categorization of continuous attributes, pruning of decision trees, rule derivation, and others. The basic construction of the C4.5 algorithm uses a method known as divide and conquer to construct a suitable tree from a training set S of cases (Wu and Kumar, [3]):

• If all the cases in S belong to the same class or S is small, the tree is a leaf labelled with the most frequent class in S.
• Otherwise, choose a test based on a single attribute with two or more outcomes. Make this test the root of the tree with one branch for each outcome of the test, partition S into corresponding subsets S1, S2, ……… according to the outcome for each case, and apply the same procedure recursively to each subset.

There are usually many tests that could be chosen in this last step. C4.5 uses two heuristic criteria to rank possible tests: information gain, which minimizes the total entropy of the subsets, and the default gain ratio, which divides information gain by the information provided by the test outcomes.
The J48 algorithm is an implementation of the C4.5 decision tree algorithm in the Weka software tool. A decision tree is presented as a flowchart-like tree structure: in every internal node the condition of some attribute is examined, and every branch of the tree represents an outcome of the test. The branching of the tree ends with leaves that define the class to which examples belong. The decision tree algorithm is a popular procedure today because of its ease of implementation and, in particular, because the results can be displayed graphically.
To evaluate the robustness of a classifier, the usual methodology is to perform cross validation. In this study, a 3-fold cross validation was used: we split the data set randomly into 3 subsets of equal size. Two subsets were used for training and one subset for measuring the predictive accuracy of the final constructed model. This procedure was performed 3 times so that each subset was tested once. Test results were averaged over the 3 cross validation runs. Data splitting was done without sampling stratification. The Weka software toolkit can calculate all these performance metrics after running a specified k-fold cross-validation. The prediction accuracy of the models was compared.
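To make the Naïve Bayes scheme above concrete, the following is a minimal from-scratch sketch in Python, not the authors' Weka run: a categorical Naïve Bayes with Laplace smoothing, trained on a few invented student records that reuse attribute names from Table I (SEM, LW, CS, Placement). The function names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict

def train_nb(rows, target):
    """Estimate P(class) and per-attribute value counts per class."""
    class_counts = Counter(r[target] for r in rows)
    attrs = [a for a in rows[0] if a != target]
    value_counts = defaultdict(Counter)          # (attr, class) -> value -> count
    domains = {a: {r[a] for r in rows} for a in attrs}
    for r in rows:
        for a in attrs:
            value_counts[(a, r[target])][r[a]] += 1
    return class_counts, value_counts, domains, attrs

def predict_nb(model, example):
    """Pick the class maximizing P(C) * prod_i P(x_i | C), Laplace-smoothed."""
    class_counts, value_counts, domains, attrs = model
    n = sum(class_counts.values())
    best_cls, best_score = None, -1.0
    for cls, cnt in class_counts.items():
        score = cnt / n                                          # P(C)
        for a in attrs:                                          # x P(x_i | C)
            score *= (value_counts[(a, cls)][example[a]] + 1) / (cnt + len(domains[a]))
        if score > best_score:
            best_cls, best_score = cls, score
    return best_cls

# Invented toy records (NOT the paper's 65-student data set).
rows = [
    {"SEM": "Good",    "LW": "Yes", "CS": "Good",    "Placement": "Yes"},
    {"SEM": "Good",    "LW": "No",  "CS": "Average", "Placement": "Yes"},
    {"SEM": "Average", "LW": "Yes", "CS": "Good",    "Placement": "Yes"},
    {"SEM": "Poor",    "LW": "No",  "CS": "Poor",    "Placement": "No"},
    {"SEM": "Poor",    "LW": "Yes", "CS": "Average", "Placement": "No"},
    {"SEM": "Average", "LW": "No",  "CS": "Poor",    "Placement": "No"},
]
model = train_nb(rows, "Placement")
print(predict_nb(model, {"SEM": "Good", "LW": "Yes", "CS": "Good"}))  # Yes
```

Weka's NaiveBayes implementation differs in detail (e.g. how it handles numeric attributes), but the posterior computation is the same product of class prior and smoothed conditional probabilities.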


III. BACKGROUND AND RELATED WORK

Data mining research has evolved very well in the field of education, in a massive amount. This tremendous growth is mainly because it contributes much to educational systems by analyzing and improving both the performance of students and the pattern of education. Various works have been done by a large number of researchers to explore the best mining techniques for performance monitoring and placement. A few of the related works are listed below to give a better understanding of what has been carried out in the past.
Han and Kamber [4] describe data mining software that allows users to analyze data from different dimensions, categorize it, and summarize the relationships identified during the mining process.
Bhardwaj and Pal [5] conducted a study on student performance by selecting 300 students from 5 different degree colleges conducting the BCA (Bachelor of Computer Applications) course of Dr. R. M. L. Awadh University, Faizabad, India. By means of Bayesian classification on 17 attributes, it was found that factors like students' grade in the senior secondary exam, living location, medium of teaching, mother's qualification, students' other habits, family annual income, and student's family status were highly correlated with student academic performance.
Tongshan Chang [6] introduced a real project to assist higher education institutions in achieving enrollment goals using data mining techniques. The results also provide evidence that data mining is an effective technology for college recruitment and can help higher education institutions manage enrollment more effectively.
Pandey and Pal [7] conducted a study on student performance by selecting 600 students from different colleges of Dr. R. M. L. Awadh University, Faizabad, India. By means of Bayes classification on category, language, and background qualification, it was determined whether newcomer students will perform well or not.
Hijazi and Naqvi [8] conducted a study on student performance by selecting a sample of 300 students (225 males, 75 females) from a group of colleges affiliated to Punjab University of Pakistan. The hypothesis framed was "Student's attitude towards attendance in class, hours spent in study on a daily basis after college, students' family income, students' mother's age and mother's education are significantly related with student performance". By means of simple linear regression analysis, it was found that factors like mother's education and student's family income were highly correlated with student academic performance.
Khan [9] conducted a performance study on 400 students, comprising 200 boys and 200 girls, selected from the senior secondary schools of Aligarh Muslim University, Aligarh, India, with the main objective of establishing the prognostic value of different measures of cognition, personality, and demographic variables for success at the higher secondary level in the science stream. The selection was based on a cluster sampling technique in which the entire population of interest was divided into groups, or clusters, and a random sample of these clusters was selected for further analysis. It was found that girls with high socio-economic status had relatively higher academic achievement in the science stream, and boys with low socio-economic status had relatively higher academic achievement in general.
Z. J. Kovacic [10] presented a case study on educational data mining to identify to what extent enrolment data can be used to predict students' success. The algorithms CHAID and CART were applied to student enrolment data of information systems students of the Open Polytechnic of New Zealand to obtain two decision trees classifying successful and unsuccessful students. The accuracy obtained with CHAID and CART was 59.4% and 60.5%, respectively.
Galit [11] gave a case study that used students' data to analyze their learning behavior, to predict results, and to warn students at risk before their final exams.
Yadav, Bhardwaj and Pal [12] conducted a study on student retention by selecting 398 students from the MCA course of VBS Purvanchal University, Jaunpur, India. By means of classification, they showed that a student's graduation stream and grade in graduation play an important role in retention.
Al-Radaideh et al. [13] applied a decision tree model to predict the final grade of students who studied the C++ course at Yarmouk University, Jordan, in the year 2005. Three different classification methods, namely ID3, C4.5, and Naïve Bayes, were used. The outcome of their results indicated that the decision tree model had better prediction than the other models.
Sudheep Elayidom, Sumam Mary Idikkula and Joseph Alexander [14] showed that data mining technology can be very effectively applied to the domain of employment prediction, which helps students to choose a good branch that may fetch them placement. A generalized framework for similar problems was proposed.
Baradwaj and Pal [15] obtained university students' data like attendance, class test, seminar, and assignment marks from the students' previous database to predict performance at the end of the semester.
Ayesha, Mustafa, Sattar and Khan [16] describe the use of the k-means clustering algorithm to predict students' learning activities. The information generated after the implementation of the data mining technique may be helpful for instructors as well as for students.
Pal and Pal [17] conducted a study on student performance by selecting 200 students from a BCA course. By means of ID3, C4.5, and Bagging, they found that SSG, HSG, Focc, Fqual, and FAIn were highly correlated with student academic performance.
Bray [18], in his study on private tutoring and its implications, observed that the percentage of students receiving private tutoring in India was relatively higher


than in Malaysia, Singapore, Japan, China, and Sri Lanka. It was also observed that academic performance was enhanced with the intensity of private tutoring, and that this variation in the intensity of private tutoring depends on a collective factor, namely socio-economic conditions.
Yadav, Bhardwaj and Pal [19] obtained university students' data like attendance, class test, seminar, and assignment marks from the students' database to predict performance at the end of the semester using the three algorithms ID3, C4.5, and CART, and showed that CART is the best algorithm for classification of the data.


IV. DATA MINING PROCESS

Knowing the factors for the placement of students can help teachers and administrators take the necessary actions so that the success percentage of placement can be improved. Predicting the placement of a student requires a lot of parameters to be considered. Prediction models that include all personal, social, psychological, and other environmental variables are necessary for the effective prediction of student placement.

A. Data Preparation

The data set used in this study was obtained from VBS Purvanchal University, Jaunpur (Uttar Pradesh), by sampling the Institute of Engineering and Technology for the 2008-2012 session. The initial size of the data set is 65 records.

B. Data Selection and Transformation

In this step, only those fields required for data mining were selected. A few derived variables were selected, while some of the information for the variables was extracted from the database. All the predictor and response variables derived from the database are given in Table I for reference.

TABLE I: STUDENT RELATED VARIABLES

Variable    Description             Possible Values
Sex         Student's Sex           {Male, Female}
MR          MCA Result              {First ≥ 60%, Second ≥ 45% & <60%, Third ≥ 36% & <45%}
SEM         Seminar Performance     {Poor, Average, Good}
LW          Lab Work                {Yes, No}
CS          Communication Skill     {Poor, Average, Good}
GB          Graduation Background   {Art, Computer, Science}
Placement   Placement of Student    {Yes, No}

The domain values for some of the variables were defined for the present investigation as follows:

• MR – Marks obtained in MCA, split into three class values: First – ≥ 60%, Second – ≥ 45% and < 60%, Third – ≥ 36% and < 45%.
• SEM – Seminar performance obtained. In each semester, seminars are organized to check the performance of students. Seminar performance is evaluated into three classes: Poor – presentation and communication skills are low; Average – either presentation or communication skill is fine; Good – both presentation and communication skills are fine.
• LW – Lab work, divided into two classes: Yes – student completed lab work; No – student did not complete lab work.
• CS – Communication skill, divided into three classes: Poor – communication skill is low; Average – communication skill is up to the mark; Good – communication skill is fine.
• GB – Graduation background. This defines whether the student's graduation was in Arts, Science, or Computers.
• Placement – Whether the student was placed or not after completing his/her MCA. Possible values are Yes if the student was placed and No if not.

C. Implementation of the Mining Model

Weka is open source software that implements a large collection of machine learning algorithms and is widely used in data mining applications. From the above data, the placement.arff file was created. This file was loaded into the WEKA explorer. The classify panel enables the user to apply classification and regression algorithms to the resulting dataset, to estimate the accuracy of the resulting predictive model, and to visualize erroneous predictions, or the model itself. The algorithms used for classification are Naive Bayes, Multilayer Perceptron (MLP), and J48. Under the "Test options", 10-fold cross-validation is selected as our evaluation approach. Since there is no separate evaluation data set, this is necessary to get a reasonable idea of the accuracy of the generated model. This predictive model provides a way to predict whether a new student will be placed or not in an organization.

D. Results

To better understand the importance of the input variables, it is customary to analyse the impact of the input variables on students' placement success, in which the impact of each input variable of the model on the output variable is analysed. Tests were conducted using three tests for the assessment of the input variables: the Chi-square test, the Info Gain test, and the Gain Ratio test. Different algorithms provide very different results, i.e.


each of them accounts for the relevance of the variables in a different way. The average value over all the algorithms is taken as the final variable ranking, instead of selecting one algorithm and trusting it alone. The results obtained with these values are shown in Table II.

TABLE II: RESULT OF TESTS AND AVERAGE RANK

Variable   Chi-squared   Info Gain   Gain Ratio   Average Rank
Sex        2.0107        0.0225      0.0231       0.6854
MR         16.3112       0.2053      0.1338       5.5501
SEM        20.1697       0.261       0.1799       6.8702
LW         9.6973        0.1106      0.1121       3.3067
CS         15.1661       0.1828      0.1211       5.1567
GB         3.4595        0.0389      0.0248       1.1744

The aim of this analysis is to determine the importance of each variable individually. Table II shows that the attribute SEM (Seminar Performance) impacts the output the most, showing the best performance in all three tests. It is followed by the attributes MR (MCA Result), CS (Communication Skill), and LW (Lab Work).
Now, we have carried out experiments in order to evaluate the performance and usefulness of different classification algorithms for predicting students' placement. The results of the experiments are shown in Table III.

TABLE III: PERFORMANCE OF THE CLASSIFIERS

Evaluation Criteria                 NB       MLP      J48
Time to build model (in sec)        0        0.27     0
Correctly classified instances      56       52       49
Incorrectly classified instances    9        13       16
Accuracy (%)                        86.15%   80.00%   75.38%

The percentage of correctly classified instances is often called the accuracy or sample accuracy of a model. So the Naïve Bayes classifier has higher accuracy than the other two classifiers.
The Kappa statistic, mean absolute error, and root mean squared error are given as numeric values only. We also show the relative absolute error and root relative squared error as percentages for reference and evaluation. The results of the simulation are shown in Table IV.

TABLE IV: TRAINING AND SIMULATION ERROR

Evaluation Criteria                  NB         MLP        J48
Kappa statistic                      0.7234     0.6001     0.5076
Mean absolute error (MAE)            0.2338     0.2212     0.3156
Root mean squared error (RMSE)       0.3427     0.4234     0.453
Relative absolute error (RAE)        46.7085%   44.2036%   63.0499%
Root relative squared error (RRSE)   68.4637%   84.568%    90.4895%

Once the predictive model is created, it is necessary to check how accurate it is. The accuracy of the predictive model is calculated based on the precision and recall values of the classification matrix.
PRECISION is the fraction of retrieved instances that are relevant. It is calculated as the total number of true positives divided by the total number of true positives plus the total number of false positives:

Precision = True positives / (True positives + False positives)   (1)

RECALL is the fraction of relevant instances that are retrieved. It is usually expressed as a percentage. It is calculated as the total number of true positives divided by the total number of true positives plus the total number of false negatives:

Recall = True positives / (True positives + False negatives)   (2)

A comparison of evaluation measures by class is shown in Table V.

TABLE V: COMPARISON OF EVALUATION MEASURES

Classifier   TP      FP      Precision   Recall   Class
NB           0.818   0.094   0.9         0.818    Yes
             0.906   0.182   0.829       0.906    No
MLP          0.788   0.188   0.813       0.788    Yes
             0.813   0.212   0.788       0.813    No
J48          0.758   0.25    0.758       0.758    Yes
             0.75    0.242   0.75        0.75     No

The performance of the learning techniques is highly dependent on the nature of the training data. Confusion matrices are very useful for evaluating classifiers: the columns represent the predictions, and the rows represent the actual classes. To evaluate the robustness of a classifier, the usual methodology is to perform cross validation on the classifier.
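The reported figures can be cross-checked directly from the paper's own counts. Using the Naïve Bayes confusion matrix given in Table VI (27/6 for actual Yes, 3/29 for actual No), the sketch below recomputes precision and recall per Eq. (1) and (2), the accuracy from Table III, and the Kappa statistic from Table IV via the standard Cohen's kappa formula (observed agreement minus chance agreement, over one minus chance agreement):

```python
# Naive Bayes confusion matrix from Table VI:
# rows = actual class, columns = predicted class (Yes, No)
tp, fn = 27, 6    # actual Yes: 27 predicted Yes, 6 predicted No
fp, tn = 3, 29    # actual No:  3 predicted Yes, 29 predicted No
n = tp + fn + fp + tn                       # 65 students in total

precision_yes = tp / (tp + fp)              # Eq. (1)
recall_yes    = tp / (tp + fn)              # Eq. (2)
accuracy      = (tp + tn) / n

# Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)
p_obs = accuracy
p_exp = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
kappa = (p_obs - p_exp) / (1 - p_exp)

print(round(precision_yes, 3), round(recall_yes, 3))   # 0.9 0.818 (Table V, NB/Yes row)
print(round(100 * accuracy, 2))                        # 86.15    (Table III)
print(round(kappa, 4))                                 # 0.7234   (Table IV)
```

The recomputed values match the NB columns of Tables III, IV, and V, which confirms that those tables were all derived from the same confusion matrix.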


TABLE VI: CONFUSION MATRIX

Classifier   Yes   No   Class
NB           27    6    Yes
             3     29   No
MLP          26    7    Yes
             6     26   No
J48          25    8    Yes
             8     24   No

Fig. 1 and 2 are graphical representations of the simulation results.

Figure 1: Results

Figure 2: Comparison between Parameters

E. Discussion

Based on Fig. 1 and 2 and Table III, we can clearly see that the highest accuracy is 86.15% and the lowest is 75.38%; the remaining algorithm yields an accuracy of 80%. The highest accuracy belongs to the Naïve Bayes classifier, followed by the Multilayer Perceptron function with 80.00%, and subsequently the J48 tree. An average of 52 instances out of a total of 65 instances is found to be correctly classified, with the highest score of 56 instances compared to 49 instances, which is the lowest score. The total time required to build the model is also a crucial parameter in comparing the classification algorithms. In this simple experiment, from Table III, the Naïve Bayes classifier and J48 require the shortest model building time, around 0 seconds, while MLP requires the longest, around 0.27 seconds.
The Kappa statistic is used to assess the accuracy of any particular measuring case; it is usual to distinguish between the reliability of the data collected and their validity [20]. The average Kappa score of the selected algorithms is around 0.5-0.7. Based on the Kappa statistic criteria, the accuracy of this classification is substantial [20].
From Fig. 2 we can observe the differences in errors resulting from the training of the three selected algorithms. This experiment uses very commonly applied indicators, the mean absolute error and the root mean squared error; alternatively, the relative errors are also used. Since we have two readings on the errors, taking the average value is wise. It is discovered that the highest error is found in J48, with an average score of around 0.38, while the rest of the algorithms range on average around 0.28-0.32. An algorithm with a lower error rate is preferred, as it has more powerful classification capability.
Decision trees are considered easily understood models because a reasoning process can be given for each conclusion. Knowledge models under this paradigm can be directly transformed into a set of IF-THEN rules, which are one of the most popular forms of knowledge representation due to their simplicity and comprehensibility, and which a professor can easily understand and interpret (Fig. 3).

Figure 3: Decision Tree

After examining the classification tree, we can summarize the following rules:

SEM = Good: Yes (17.0/1.0)
SEM = Average
| LW = Yes: Yes (14.0/4.0)
| LW = No
| | CS = Good
| | | MCAresult = First: Yes (3.0)
| | | MCAresult = Second: Yes (1.0)
| | | MCAresult = Third: No (2.0)
| | CS = Average: No (14.0/1.0)
| | CS = Poor: No (1.0)
SEM = Poor: No (13.0/2.0)
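As noted above, rules of this form translate mechanically into IF-THEN code. The following Python sketch encodes exactly the tree listed above (ignoring the instance counts in parentheses); the function name predict_placement is a hypothetical choice for illustration:

```python
def predict_placement(sem, lw=None, cs=None, mr=None):
    """Apply the J48 rules read off the classification tree above.

    sem: Seminar Performance ('Good' / 'Average' / 'Poor')
    lw:  Lab Work ('Yes' / 'No'), cs: Communication Skill, mr: MCA Result.
    """
    if sem == "Good":
        return "Yes"                # SEM = Good: Yes
    if sem == "Poor":
        return "No"                 # SEM = Poor: No
    # SEM = Average
    if lw == "Yes":
        return "Yes"                # LW = Yes: Yes
    if cs == "Good":                # LW = No, CS = Good: decided by MCA result
        return "Yes" if mr in ("First", "Second") else "No"
    return "No"                     # LW = No, CS = Average or Poor: No

print(predict_placement("Good"))                                      # Yes
print(predict_placement("Average", lw="No", cs="Good", mr="Third"))   # No
```

This makes the tree's behaviour easy to inspect: seminar performance dominates, lab work decides the middle band, and communication skill together with the MCA result settles the remaining cases.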


[13] Q. A. AI-Radaideh, E. W. AI-Shawakfa, and M. I.


V. CONCLUSIONS AI-Najjar, “Mining student data using decision
trees”, International Arab Conference on
As a conclusion, we have met our objective which is
Information Technology (ACIT'2006), Yarmouk
to evaluate and investigate placement of student after
University, Jordan, 2006.
doing MCA by the three selected classification
[14] Sudheep Elayidom, Sumam Mary Idikkula &
algorithms based on Weka. The best algorithm based on Joseph Alexander “A Generalized Data mining
the placement data is Naïve Bayes Classification with an
Framework for Placement Chance Prediction
accuracy of 86.15% and the total time taken to build the
Problems” International Journal of Computer
model is at 0 seconds. Naïve Bayes classifier has the
Applications (0975– 8887) Volume 31– No.3,
lowest average error at 0.28 compared to others. These
October 2011.
results suggest that among the machine learning
[15] B.K. Bharadwaj and S. Pal. “Mining Educational
algorithm tested, Naïve Bayes classifier has the potential
Data to Analyze Students’ Performance”,
to significantly improve the conventional classification
International Journal of Advance Computer Science
methods for use in placement.
Ajay Kumar Pal received his MCA (Master of Computer Applications) from VBS Purvanchal University, Jaunpur, UP, India. At present, he is doing research in Data Mining and Knowledge Discovery. He is an active member of CSI and the National Science Congress. He has published two papers in international journals.

Saurabh Pal received his M.Sc. (Computer Science) from Allahabad University, UP, India (1996) and obtained his Ph.D. degree from Dr. R. M. L. Awadh University, Faizabad (2002). He then joined the Dept. of Computer Applications, VBS Purvanchal University, Jaunpur, as Lecturer. At present, he is working as Head and Sr. Lecturer at the Department of Computer Applications. Saurabh Pal has authored more than 35 research papers in international/national conferences and journals and also guides research scholars in Computer Science/Applications. He is an active member of CSI and the Society of Statistics and Computer Applications, and works as a reviewer and editorial board member for more than 15 international journals. His research interests include Image Processing, Data Mining, Grid Computing and Artificial Intelligence.
