2018 4th IEEE International Conference on Information Management

Prediction Model for Students’ Future Development by Deep Learning and TensorFlow Artificial Intelligence Engine

Wilton W.T. Fok, Y.S. He, H.H. Au Yeung, K.Y. Law, K.H. Cheung, Y.Y. Ai, P. Ho
The University of Hong Kong
e-mail: [email protected]

Abstract—Classification and prediction of students’ examination performance are typical challenges for educators. Traditional data mining methods such as decision trees and association rules have been used to perform such classification. In recent years, the rapid development of artificial intelligence and deep learning algorithms has provided another approach to intelligent classification and result prediction. This paper studies how to use the TensorFlow artificial intelligence engine to classify students’ performance and forecast their future university degree programs. An appropriate and accurate forecast is important for providing prompt advice to students on program and university selection. To consider a more comprehensive, all-round set of factors, the deep learning model analysed not only traditional academic performance, including Mathematics, Chinese, English, Physics, Chemistry, Biology and History, but also non-academic performance such as Service, Conduct, Sport and Art. Several parameters of the TensorFlow engine, including the number of intermediate nodes and the number of deep learning layers, are adjusted and compared. With a data set of two thousand students, of which 75% is used as training data and 25% as testing data, the accuracy ranged from 80% to 91%. The optimal configuration of the TensorFlow deep learning model, i.e. the one that achieves the highest prediction accuracy, is determined. This study also identifies the factors affecting the accuracy of the prediction model.

Keywords: e-Learning assessment, Artificial Intelligence, Deep Learning, prediction modelling

I. INTRODUCTION

How to predict students’ performance is a question that always concerns students’ teachers and parents. Based on past examination results and in-class assessments, it is possible to forecast the future development of the students. This is a challenging and important matter, as it involves a large volume of data in educational databases and the result could affect the future development of a young person. A good and accurate prediction could benefit students, educators and academic institutions. Various data mining techniques have been used for performance prediction for decades, e.g. decision tree, Naive Bayes, K-Nearest Neighbor and Support Vector Machine [1]. However, with the rise of artificial intelligence and deep learning applications, using an AI engine such as Google TensorFlow for pattern recognition has been growing in importance. In this paper, we investigate how to use artificial intelligence and deep learning algorithms for pattern recognition and correlation of assessment results. Several traditional data mining techniques have been used to predict students’ performance, and some educational data mining research has been done to identify the important attributes in student data.

II. TRADITIONAL METHODS FOR CLASSIFICATION AND PREDICTION

Several traditional tasks are used to build a predictive model, for example classification, regression and categorization. Algorithms such as association rules and decision trees are commonly used.

An association rule reflects the interdependence and correlation between different things. It is commonly used in physical stores and e-commerce recommendation systems. Likewise, it can be used for recommending schools and subjects to students. Support and confidence are its two key concepts. In a given database, each transaction contains a set of items, and every rule is composed of two different itemsets X and Y. Support is the percentage of transactions that contain both X and Y; confidence is the percentage of transactions containing X that also contain Y. In other words, support is a probability, P(X and Y), while confidence is a conditional probability, P(Y | X). If both measures meet minimum thresholds, which the analyst must set, an association is considered to exist between X and Y. Association rule mining is better suited to records whose indicators take discrete values. If the indicator values in the original database are continuous, appropriate discretization should be performed beforehand.

The classification decision tree model is a tree structure that describes the classification of instances. The decision tree consists of nodes and directed edges. There are two types of nodes: internal nodes, which represent a feature or attribute, and leaf nodes, which represent a class.

When categorizing, a certain feature of the instance is tested starting from the root node, and the instance is assigned to a child node according to the test result; each child node corresponds to one value of the feature. The process moves down recursively until it reaches a leaf node, and the instance is finally assigned to the class of that leaf node. The decision tree can be seen as a set of if-then rules: one rule is constructed for each path from the root node to a leaf node; the features of the internal nodes on the path correspond to the conditions of the rule, and the leaf node corresponds to its conclusion, i.e. the assigned class.
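As a rough illustration of the two traditional techniques described in this section, the sketch below computes support and confidence for one rule and writes a tiny decision tree as nested if-then tests. The transactions, the feature names and the tree itself are hypothetical examples made up for this sketch; they are not taken from the study's data, and the tree is hand-written rather than learned.

```python
# Toy transactions (hypothetical): each set lists a student's strong subjects
# together with the program the student eventually entered.
transactions = [
    {"Physics", "Maths", "Engineering"},
    {"Physics", "Maths", "Engineering"},
    {"Physics", "Maths", "Medicine"},
    {"Chinese", "Service", "Education"},
    {"Physics", "Chemistry", "Medicine"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """Estimate P(Y | X) as support(X and Y) / support(X)."""
    return support(x | y, transactions) / support(x, transactions)


def classify(student):
    """A hand-written (not learned) decision tree expressed as if-then rules."""
    if student["physics"] == "good":        # root node: test one feature
        if student["chinese"] == "poor":    # internal node: test another feature
            return "Engineering"            # leaf node: the assigned class
        return "Medicine"                   # leaf node
    return "Education"                      # leaf node


x = {"Physics", "Maths"}
y = {"Engineering"}
rule_support = support(x | y, transactions)        # 2 of the 5 transactions
rule_confidence = confidence(x, y, transactions)   # 2 of the 3 that contain X
predicted = classify({"physics": "good", "chinese": "poor"})
```

A rule such as X → Y would be kept only if both `rule_support` and `rule_confidence` reach thresholds chosen by the analyst.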

978-1-5386-6147-5/18/$31.00 ©2018 IEEE


III. NEW METHODS USING DEEP LEARNING AND ARTIFICIAL INTELLIGENCE

The neural network is another emerging technique used in educational data mining. Its advantage is the ability to detect all possible interactions between predictor variables [2]. Now that more computing power is available, more neural network layers can be implemented and deep learning analysis has become practical. Deep learning can perform detection even when the relationship between dependent and independent variables is complex and nonlinear [3]. It is considered one of the best prediction methods.

In this study, the Google TensorFlow deep learning analytic engine is used to predict students’ future development. The attributes analyzed are academic performance, i.e. the examination scores of traditional academic subjects including Mathematics, Chinese, English, Physics, Chemistry, Biology and History, and non-academic performance such as conduct, sport, arts and participation in services. Two thousand records are generated according to the following rules:

1. Good at Physics and Mathematics but poor in Chinese → Engineering in University A
2. Good at Physics, Chemistry and English but poor in Sport → Medicine in University A
3. Good at Physics, Chemistry, English and Conduct but poor in Chinese → Medicine in University B
4. Good at Chinese and Service but poor in Biology, Maths and Conduct → Education in University C
5. Good at Chinese but poor in Biology, Maths and Conduct → Chinese in University C
6. Good at Chinese and English but poor in Maths and Physics → Translation in University D
7. Good at Sport and Chinese but poor in History → Education in University D
8. Good at Sport and Chinese but poor in Conduct → Sport Science in University D
9. Good at Maths and Art but poor in Chinese and English → Design in University E
10. Good at Physics, Maths and English but poor in Chinese → Engineering in University B

At the beginning, these 2000 records were divided into two parts. The first part, 75% of the data set (1500 records), was used as the training data, while the remaining 25% (500 records) was used as the test data.

The assessment results of the various subjects were used as the inputs x1, x2, ..., xn. The corresponding program and university selection outcomes were used as the outputs y1, y2, ..., ym. To build the prediction model, the Convolution Neural Network (CNN) model is used. CNN is commonly used in pattern recognition problems. Using a one-hot output system, a fully connected network was built in which the relationship between nodes is Wx + b = y. With softmax cross entropy applied as the cost function, the main structure was completed as shown in Figure 1.

Figure 1. Structure of the Convolution Neural Network and softmax cross entropy

In this research, Python and the deep learning engine TensorFlow are used for the development. TensorFlow is an open-source software library for dataflow programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. It made it easy to start this research with deep learning in the cloud. The framework has broad support in the industry and has become a popular choice for deep learning research and application development, particularly in areas such as computer vision, natural language understanding and speech translation. The scores and future development of students follow certain patterns, and therefore this pattern recognition engine is used for the analysis. The following loop is run to train the model:

    # X, Y, cost, optimizer and sess are defined earlier in the program.
    for itt in range(epoch):
        avg_cost = 0
        # Feed the training inputs and their one-hot target outputs.
        feed_dict = {X: input, Y: output}
        # Run one optimization step and fetch the current cost.
        c, _ = sess.run([cost, optimizer], feed_dict=feed_dict)

After the model is trained, the following code in TensorFlow is run to test the performance of the trained model:

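    # Test model and check accuracy
    # A prediction is correct when the most probable class matches the one-hot label.
    correct_prediction = tf.equal(tf.argmax(hypothesis, 1), tf.argmax(Y, 1))
    # The mean of the 0/1 correctness flags is the accuracy.
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    result = sess.run(accuracy, feed_dict={X: input_t, Y: output_test})
    print('Accuracy:', result)
    # Append the accuracy to a comma-separated record of results.
    record = record + str(result) + ','

Figure 3. Number of hidden nodes (j) and number of hidden layers (k) vs. the prediction accuracy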
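To make the quantities in the training and test code more concrete, the sketch below reproduces them in plain Python for a single fully connected layer: the linear relation Wx + b = y, the softmax cross-entropy cost against one-hot targets, and the argmax-based accuracy. It is only an illustration under assumed values: the weights, inputs and labels are hypothetical, and the helper names (`linear`, `softmax`, `cross_entropy`, `accuracy`) are not from the paper or from TensorFlow.

```python
import math


def linear(W, b, x):
    """Compute y = Wx + b for one input vector x (W has one row per class)."""
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]


def softmax(logits):
    """Turn raw scores into probabilities; subtracting the max avoids overflow."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot):
    """Cost for one sample: minus the log-probability of the true class."""
    return -sum(y * math.log(p) for p, y in zip(probs, one_hot) if y > 0)


def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(predictions, labels):
    """Fraction of samples where argmax(prediction) equals argmax(label)."""
    correct = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
    return sum(correct) / len(correct)


# Hypothetical numbers: 2 input scores per student, 3 output classes.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
inputs = [[2.0, 0.5], [0.1, 3.0], [1.0, 0.9]]
labels = [[1, 0, 0], [0, 1, 0], [0, 1, 0]]   # one-hot targets

probs = [softmax(linear(W, b, x)) for x in inputs]
costs = [cross_entropy(p, y) for p, y in zip(probs, labels)]
acc = accuracy(probs, labels)   # the third sample is misclassified
```

Training lowers the average of `costs` by adjusting `W` and `b`; the accuracy is then measured on held-out test records in the same way as `acc`.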
The number of hidden layers and the number of nodes are adjusted to find the best-performing configuration (Figure 2). By varying factors such as the number of hidden nodes, the number of hidden layers and the learning rate, an accuracy of up to 90% was achieved.

Figure 2. Hidden layers in Tensorflow

IV. RESULTS

A. Effect of Number of Hidden Nodes and Layers on the Prediction Accuracy

The number of hidden nodes (j) and the number of hidden layers of deep learning (k) are adjusted and the prediction accuracy is compared. The prediction accuracy for different j and k is summarised as follows:

k \ j    10     20     30     40     50     60     70     80
1        0.84   0.86   0.91   0.89   0.88   0.88   0.88   0.84
2        0.80   0.85   0.85   0.83   0.84   0.83   0.85   0.86
3        0.67   0.80   0.84   0.83   0.84   0.85   0.83   0.83
4        0.15   0.83   0.83   0.80   0.86   0.85   0.83   0.85
5        0.15   0.87   0.77   0.82   0.84   0.80   0.85   0.83

The number of hidden nodes (j), the number of hidden layers (k) and the prediction accuracy are plotted in the 3D chart in Figure 3. In this result, the number of hidden layers does not have a significant impact on the accuracy. However, as the number of hidden nodes increases, the accuracy also increases until the number of hidden nodes exceeds 20.

B. Effect of Learning Rate and Number of Iterations on the Prediction Accuracy

Next, the effects of the number of iterations (p) and the learning rate (q) are compared. The number of hidden layers (k) is set to 3 while the number of hidden nodes (j) is set to 20. The prediction accuracy for different p and q is summarized in the table below and in Fig. 4.

q \ p    5000    10000   15000   20000   25000
0.001    0.834   0.842   0.836   0.828   0.824
0.002    0.83    0.828   0.836   0.836   0.802
0.003    0.78    0.834   0.844   0.818   0.856
0.004    0.862   0.846   0.838   0.86    0.844
0.005    0.834   0.838   0.788   0.856   0.846
0.006    0.818   0.858   0.842   0.816   0.848
0.007    0.492   0.858   0.852   0.818   0.856
0.008    0.862   0.844   0.854   0.828   0.856
0.009    0.806   0.836   0.838   0.696   0.832
0.010    0.808   0.858   0.842   0.828   0.822
0.011    0.802   0.856   0.86    0.83    0.866
0.012    0.846   0.826   0.834   0.646   0.852
0.013    0.84    0.82    0.854   0.854   0.862
0.014    0.862   0.846   0.832   0.652   0.83
0.015    0.824   0.826   0.878   0.528   0.85
0.016    0.76    0.822   0.84    0.838   0.848
0.017    0.844   0.832   0.154   0.83    0.858
0.018    0.838   0.852   0.808   0.844   0.82
0.019    0.864   0.828   0.236   0.838   0.782

Figure 4. No. of iterations (p) and the learning rate (q) vs. the prediction accuracy
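The comparison in subsection A amounts to a search over a small grid of configurations. As a minimal sketch, the snippet below stores the accuracies reported in that table and picks out the best (k, j) pair, along with the mean accuracy for each number of hidden layers. The variable names are illustrative only; the values are those reported in the table for hidden nodes (j) and hidden layers (k).

```python
# Accuracy values from the hidden-node (j) / hidden-layer (k) table.
hidden_nodes = [10, 20, 30, 40, 50, 60, 70, 80]           # j
accuracy_by_layers = {                                     # k -> accuracy per j
    1: [0.84, 0.86, 0.91, 0.89, 0.88, 0.88, 0.88, 0.84],
    2: [0.80, 0.85, 0.85, 0.83, 0.84, 0.83, 0.85, 0.86],
    3: [0.67, 0.80, 0.84, 0.83, 0.84, 0.85, 0.83, 0.83],
    4: [0.15, 0.83, 0.83, 0.80, 0.86, 0.85, 0.83, 0.85],
    5: [0.15, 0.87, 0.77, 0.82, 0.84, 0.80, 0.85, 0.83],
}

# Best single configuration: compare (accuracy, k, j) tuples.
best_acc, best_k, best_j = max(
    (acc, k, j)
    for k, row in accuracy_by_layers.items()
    for acc, j in zip(row, hidden_nodes)
)

# Mean accuracy per number of hidden layers, averaged over all j.
mean_by_layers = {k: sum(row) / len(row) for k, row in accuracy_by_layers.items()}
best_depth_on_average = max(mean_by_layers, key=mean_by_layers.get)
```

On these figures the highest accuracy, 0.91, occurs with one hidden layer and 30 hidden nodes, and the mean accuracy is also highest for one hidden layer.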

V. CONCLUSIONS

Classification and prediction is a general problem. Traditional data mining techniques such as association rules, decision trees and clustering have been used for decades to solve it. The rising popularity of TensorFlow for deep learning and artificial intelligence has opened a new approach and direction for solving classification problems and predicting non-linear results.

In this research, the number of hidden layers, the number of hidden nodes, the number of iterations and the learning rate are adjusted and compared. It is found that a deeper deep learning model, i.e. one with more hidden layers, does not always give a more accurate result. There is an optimal point, which needs to be tested and identified.

A higher learning rate can help to speed up the convergence of the trained model. However, if the learning rate is too high, the result might overshoot the optimal point. Therefore, the prediction performance could be improved by starting with a low momentum and a high learning rate, then gradually increasing the momentum and decreasing the learning rate to ensure convergence.

This study demonstrated that deep learning can be an effective tool for predicting students’ performance. The accuracy ranged from 80% to 91%. The prediction result is good enough to provide appropriate recommendations for students, their teachers and parents when deciding their development pathway. It is believed that more applications of deep learning could be used for education and corporate staff training in the future.

REFERENCES

[1] Amirah Mohamed, Shahiri Wahidah, “A Review on Predicting Student's Performance Using Data Mining Techniques”, Procedia Computer Science, Volume 72, 2015, Pages 414-422, https://www.sciencedirect.com/science/article/pii/S1877050915036182
[2] G. Gray, C. McGuinness, P. Owende, An application of classification models to predict learner progression in tertiary education, in: Advance Computing Conference (IACC), 2014 IEEE International, IEEE, 2014, pp. 549–554.
[3] P. M. Arsad, N. Buniyamin, J.-l. A. Manan, A neural network students’ performance prediction model (nnsppm), in: Smart Instrumentation, Measurement and Applications (ICSIMA), 2013 IEEE International Conference on, IEEE, 2013, pp. 1–5.
[4] Joonhyuck Lee, Dongsik Jang and Sangsung Park, “Deep Learning-Based Corporate Performance Prediction Model Considering Technical Capability”, MDPI, http://www.mdpi.com/2071-1050/9/6/899/pdf
[5] Rianne Conijn, Chris Snijders, Ad Kleingeld, Uwe Matzat, “Predicting Student Performance from LMS Data: A Comparison of 17 Blended Courses Using Moodle LMS”, IEEE Transactions on Learning Technologies, PP(99):1-1, October 2016.
