Literature Review On Educational Data Mining
Welcome to our guide to writing a comprehensive literature review on Educational Data Mining
(EDM). Writing a literature review on this intricate subject is demanding. It requires meticulous
research, critical analysis, and careful synthesis of existing literature into a coherent narrative that
contributes meaningfully to the field.
Educational Data Mining, a burgeoning interdisciplinary field, amalgamates techniques from data
mining, machine learning, and statistics to analyze educational data and improve learning outcomes.
Hence, crafting a literature review demands a nuanced understanding of both educational
methodologies and data analysis techniques.
The difficulty lies not only in identifying relevant literature but also in critically evaluating its quality
and synthesizing diverse perspectives into a cohesive narrative. As you delve into the realm of EDM
literature, you'll encounter a vast array of studies, ranging from theoretical frameworks to empirical
research, each offering unique insights and challenges.
Navigating through this sea of information can be overwhelming, especially for those new to the
field. It requires the ability to discern seminal works from peripheral ones, identify gaps in existing
research, and articulate how your study contributes to filling those gaps.
Moreover, EDM literature is dynamic, constantly evolving with emerging technologies and
methodologies. Staying abreast of the latest research trends and incorporating them into your
literature review adds another layer of complexity to the task.
Results and Discussion
3.1. Predicting the Performance of Students at Risk Using ML
Predicting students’ performance helps institutions increase student retention rates, manage
enrollment and alumni relations effectively, target marketing more precisely, and improve overall
institutional effectiveness. As information accrued in these repositories, the challenge became how
to extract meaningful knowledge from it. Meanwhile, more journal articles than conference papers
addressed the topic. The acquired knowledge allows the computer to generalize correctly to new
settings. Long-term log data from e-learning platforms such as MOOCs, LMSs, and the
Digital Environment to Enable Data-driven (DEED) can be used for student and course assessment.
Despite this large number of features and applications, several challenging issues remain; they are
not mutually exclusive and are listed in no particular order. ANN and SVM had identical results in terms of
RMSE and performance parameters.
Over the last three decades or so, advances in technology have moved an enormous amount of
information into digital form, producing enormous data repositories. Data mining is regarded as a
stepping stone in the process of knowledge discovery in databases: it extracts hidden information
from very large databases to uncover meaningful patterns and rules. The PISA 2015 dataset from the nine
countries was used, with 19 attributes at the school level and 18 at the student level. Seventy
percent of the instances were used as the training set and the remaining 30% as the test set.
Although high utility itemset (HUI) mining is associated with business intelligence, its applications
extend to web server logs, biological gene databases, network traffic measurements, and many other
fields. Data mining is the process of finding patterns among dozens of fields in large relational
databases. As a result, the
instructors can gain more insight for designing appropriate interventions for learners and achieving
precision education targets. In general, because most high-support rules are obvious or already known
to users, low-support rules that offer interesting new knowledge may be more novel than high-support
ones. The intervention programs in schools help those students who are at risk of
failing to graduate. However, only a few studies proposed remedial solutions that give timely
feedback to students, instructors, and educators to address the problems. Future research will focus
more on developing an efficient ensemble method for practically deploying ML-based performance
prediction, and on finding dynamic ways to predict students’ performance and automatically provide
the remedial actions needed to help students as early as possible. The
algorithms are evaluated using precision and recall at top positions. These classifiers include Naive
Bayes, Bayesian Network, ID3, J48, and Neural Networks. ML models can automatically and
quickly analyze bigger and more complex data with accurate results and avoid unexpected risks. The
dataset they used comprised records of students registered in two e-learning courses between 2007
and 2008. This paper discusses the impact of data mining in today's fast-growing world. The data
contained the students’ behavior logs over 40 days. The proposed method addresses this issue by
building a pruning-based utility co-occurrence structure (PEUCS) to eliminate low-profit itemsets;
because it therefore processes only an optimal number of high-utility itemsets, it is called optimal
FHM (OFHM). A wide variety of research has uncovered and realized new possibilities and
opportunities for technology-enhanced learning systems based on students’ needs. Applying these
rules helps identify required courses that have a significant impact on a student’s final GPA. Their
investigation concluded that both methods could discover valuable insights in the dataset. Many
popular data mining tools and techniques are available today to support a large number of
applications. The EDM research community uses session logs and student databases to predict
student performance with machine learning algorithms.
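The 70/30 training and testing split mentioned above can be illustrated with a short sketch. It is not taken from any of the reviewed studies: the function name `split_train_test`, the fixed seed, and the placeholder records are all assumptions made for illustration.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: 100 integer IDs standing in for student records.
students = list(range(100))
train, test = split_train_test(students)
print(len(train), len(test))  # 70 30
```

Shuffling before cutting avoids an accidental ordering effect (for example, records sorted by enrollment date) leaking into the split.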
This literature review discusses the basic concepts, applications, and challenges of data mining.
The input dataset supplied to the naive Bayes method is discretised.
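As a rough illustration of discretisation, the sketch below uses equal-width binning, which is only one of several possible schemes. The text does not say which scheme was used, so the choice of method, the function name `equal_width_bins`, and the sample grades are assumptions.

```python
def equal_width_bins(values, n_bins=3):
    """Map each numeric value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value in the last bin instead of overflowing.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical numeric feature: exam grades for seven students.
grades = [35, 48, 52, 61, 70, 88, 95]
bins = equal_width_bins(grades)
print(bins)  # [0, 0, 0, 1, 1, 2, 2]
```

The resulting bin indices can be treated as categorical values by a naive Bayes classifier that expects discrete inputs.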
Twelve top features were identified as important features for predicting student academic
performance. Secondly,
affected by data quality and selection of threshold, the generator may produce useless rules or even
lose some useful rules. Afterward, they applied many supervised learning methods to identify the
students who had similar patterns and their predicted final grades. Five machine learning algorithms used
for experimentation included Support Vector Machine (SVM), Random Forest, Logistic
Regression, AdaBoost, and Decision Tree. However, some issues remain to be resolved; these
are discussed in this paper. The results suggested that a two-layer feed-forward ANN achieved a high
accuracy of 89%. The highest classification accuracy achieved in this study was 95.34%, produced by
deep learning techniques. In this research, to maximize profit, itemset utilities are determined by
the quantity of items sold and the unit profit on those items. The proposed system uses an algorithm
named Utility Model-Growth to mine high utility itemsets from transaction databases. To the
best of our knowledge, this is the first time the proposed method has been applied to diatom
classification in any ecosystem. Gradient Boosting Tree (GBT), SVM, and RF classifiers were used,
with GBT achieving a high average AUC score. The key finding from the investigation was that
ethnicity, course program, and course block are the top three features affecting students’ success.
The concept of data mining and its various methodologies are summarized. Data mining tools predict
future trends and behaviors, allowing users to make proactive, knowledge-driven decisions. It was also
observed that most of the studies used traditional machine learning algorithms such as SVM, DT,
NB, KNN, etc., and only a few have investigated the potential of deep learning algorithms. Last but
not least, the current literature does not consider the dynamic nature of student performance.
Analyzing performance, providing high-quality education, devising strategies for evaluating students’
performance, and planning future actions are among the prevailing challenges universities face. These
bibliographies contain those studies that entirely fit the inclusion criteria.
Growing volumes of data, cheaper storage, and robust computational systems are the reasons
behind the rebirth of machine learning, from mere pattern recognition algorithms to Deep Learning
(DL) methods. The feature set contains demographic features, pre-college entry information, and
transcript information. The performance of the classifiers was evaluated using the RMSE, the Receiver
Operating Characteristic (ROC) curve, and Cohen’s Kappa Coefficient.
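Two of the evaluation measures named above, RMSE and Cohen's kappa, are simple enough to sketch directly. The code below is a minimal illustration using their standard definitions; the function names and the sample labels are assumptions, and the ROC curve is left out to keep the example short.

```python
import math
from collections import Counter


def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cohens_kappa(actual, predicted):
    """Agreement between two label sequences, corrected for chance agreement."""
    n = len(actual)
    observed = sum(a == p for a, p in zip(actual, predicted)) / n
    actual_counts = Counter(actual)
    predicted_counts = Counter(predicted)
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    expected = sum(actual_counts[c] * predicted_counts[c] for c in actual_counts) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when both sequences hold one identical class;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels and scores, for illustration only.
y_true = ["pass", "pass", "fail", "fail", "pass", "fail"]
y_pred = ["pass", "fail", "fail", "fail", "pass", "pass"]
kappa = cohens_kappa(y_true, y_pred)
error = rmse([3.0, 2.5, 4.0], [2.5, 3.0, 4.0])
print(round(kappa, 3), round(error, 3))
```

Kappa is often reported alongside accuracy because it discounts the agreement a classifier would reach by chance, which matters when one class dominates.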
It can be observed that the research communities in Germany and the UK focused more on the field
than those in other countries. Several business applications have been found to benefit from the
discovery of high utility itemsets and association rules from transaction databases. The data for
analysis were obtained from the college database with 225 instances, each comprising ten attributes.
Furthermore, to reduce the size of the feature space, they applied feature selection methods to
data from a first-year engineering course at a Midwestern US university for the years 2013 and
2014. Each instance in the dataset contains 13 attributes, and nine class labels represent remedial
actions. The dataset included 72,598 instances, each comprising 17 attribute values. The five ML
algorithms evaluated were ANN, LR, SVM, NBC, and DT. The dataset collected for predicting dropout is
imbalanced in nature, as most of the instances belong to one class. Moreover, the data
pre-processing technique can contribute significantly to more accurate results. In this paper, the task of
automatic music genre classification is explored. Predicting student performance at entry and in
subsequent periods helps universities develop and refine intervention plans effectively, benefiting
both management and educators. The best way to resolve this dilemma is therefore to first set a low
or dynamic support threshold for a series of mining runs, and then use the new association rule
measure framework to screen the mining results and extract the most valuable and interesting
association rules. Data mining is a logical process used to analyze large amounts of information,
which may be in the form of documents, in order to find important data. The
number of features was then reduced by eliminating those with low information gain. Educators
and academic administrators can learn from their counterparts in business and service industries,
where a complex system of methods and techniques, usually referred to as data analytics or data
mining, is used to analyze a large influx of real-time data for decision-making. Balanced data
means that each of the prediction classes has an equal number of instances. This measure is inspired
by the method of Obtaining the Comprehensive Indicator through the Geometric Methods and is
discussed in this paper. The performance of the proposed methods was then compared with the
baseline classification model using a dataset collected from a MOOC platform. The authors argued the
importance of the SVM and ANN algorithms and proposed a modified DEEDS system in which ANN and
SVM are used for student performance prediction. A comprehensive survey of existing methods for
high utility itemset mining and for association rule mining with utility considerations is presented
in this paper. Specialised tools for automating data mining for hospital management. Consequently, a
later problem, high utility itemset (HUI) mining, was developed to focus on the itemsets that
generate large profits for the business. Data mining has now become an indispensable component in
almost every field of human life. They aimed to overcome this limitation by identifying explainable
human characteristics that may indicate that a student will have poor tutorial performance. However,
it is not easy to obtain relevant information that helps in making sound decisions. For this
purpose, they used students’ transcript data and applied a decision tree algorithm to extract
classification rules.
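Feature reduction by information gain, mentioned above, can be sketched as follows. This is a minimal illustration for categorical features using the standard entropy-based definition; the feature names, the sample values, and the 0.1 threshold are hypothetical and not taken from any reviewed study.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one informative feature and one uninformative feature.
labels = ["pass", "pass", "fail", "fail"]
features = {
    "attendance": ["high", "high", "low", "low"],
    "group": ["a", "b", "a", "b"],
}
threshold = 0.1  # assumed cut-off for this sketch
selected = [name for name, column in features.items()
            if information_gain(column, labels) >= threshold]
print(selected)  # ['attendance']
```

The same measure is commonly used by decision tree algorithms to choose the attribute to split on at each node.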
The log data need pre-processing before they can be used to train ML algorithms. One hundred twelve
features were extracted and grouped into three categories: user features, course features, and
enrollment features.
The authors acknowledged the overgeneralization limitation of SMOTE and discussed some methods for
reducing the data imbalance problem without overgeneralization. The main objective of Utility
Mining is to identify the itemsets with the highest
utilities by considering user preferences such as profit, quantity, and cost. The authors
independently collected the research papers and agreed on which papers to include. For the
experiments, the 2007 data were used for training and the 2008 data for testing. The proposed
method showed promising results in terms of the Mean Square Error (MSE). The success of such
programs depends on accurate and timely identification and prioritization of the students requiring
assistance. The technologies can also assist in planning administrative
strategies to provide quality services to all stakeholders of an educational institution. Thus, the
mined rules need to be further analyzed and evaluated in order to find the most valuable
associations. This paper focuses on data mining concepts and on the tools and technologies that
help, from a market perspective, in making sound decisions and obtaining sound results. Dataset 1
consisted of 500 student records with 16
features. The classification algorithms C4.5, NB, and K-NN were used for the wrapper methods. The
dataset was first converted from nominal to numeric form before statistical analysis. As a result,
ML has the potential to speed up progress in education, and the efficiency of education can be seen
to grow significantly. The dataset comprised students enrolled in
languages and Math courses. Because the many features made the sample data large, they used
Hadoop, an open-source platform for distributed storage and processing of large datasets. The
author analyzed PISA 2015 data from several countries, including Germany, the USA, the UK, Spain,
Italy, France, Australia, Japan, and Canada.
Conflicts of Interest
The authors declare no conflict of interest.
Research Method
A systematic literature review is performed with a research method that must be unbiased and
complete, so that it evaluates all available research related to the respective
field. First of
all, the generation of association rules is based entirely on the factual data, without considering
the relationships between the rules. Because of the growing interdisciplinary nature of EDM, the
paper also tries to define the scope and boundaries of EDM and to provide definitions for it. The
present article provides an analysis of the available literature on data mining. Another important
reason is the lack of investigation into the factors that affect the academic performance and
achievement of students in a particular course.
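The notion of itemset utility described earlier, where utility is determined by the quantity sold and the unit profit, can be made concrete with a small sketch. It enumerates itemsets by brute force and is not FHM, OFHM, or any other algorithm from the surveyed papers; the items, profits, transactions, and the minimum utility of 25 are all hypothetical.

```python
from itertools import combinations

# Hypothetical unit profits and transactions (item -> quantity sold).
unit_profit = {"pen": 1.0, "notebook": 3.0, "bag": 20.0}
transactions = [
    {"pen": 5, "notebook": 2},
    {"pen": 2, "bag": 1},
    {"pen": 1, "notebook": 1, "bag": 1},
]


def itemset_utility(itemset, transactions, unit_profit):
    """Sum of quantity * unit profit over transactions containing every item in the set."""
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force search: keep every itemset whose utility reaches min_utility."""
    items = sorted(unit_profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


hui = high_utility_itemsets(transactions, unit_profit, min_utility=25)
print(hui)  # {('bag',): 40.0, ('bag', 'pen'): 43.0}
```

Exhaustive enumeration grows exponentially with the number of items, which is why practical algorithms rely on pruning structures to discard low-utility candidates early.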