To Development Manufacturing and Education Using Data Mining A Review

Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd27910.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/27910/to-development-manufacturing-and-education-using-data-mining-a-review/aye-pwint-phyu

International Journal of Trend in Scientific Research and Development (IJTSRD)

Volume 3 Issue 5, August 2019 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470

To Development Manufacturing and Education using Data Mining: A Review
Aye Pwint Phyu, Khaing Khaing Wai
Department of Information Technology Support and Maintenance,
University of Computer Studies, Mandalay, Myanmar

ABSTRACT
In modern manufacturing environments, vast amounts of data are collected in database management systems and data warehouses from all involved areas. Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. It extracts information from a huge volume of data through the use of various data mining techniques. Data mining techniques such as clustering and classification help in finding hidden and previously unknown information in a database. In addition, data mining also plays an important role in the educational sector. Educational Data Mining (EDM) is a field of analysis and research where various data mining tools and techniques are used to optimize applications in the education sector. The paper aims to analyze the enormous data from the education sector and provide solutions and reports for specific aspects of the education sector such as students' performance and placements. Moreover, this paper reviews the literature dealing with knowledge discovery and data mining applications in the broad domain of manufacturing, with a special emphasis on the type of functions to be performed on the data. The major data mining functions to be performed include characterization and description, association, classification, prediction, clustering and evolution analysis.

KEYWORDS: Data Mining, EDM, Manufacturing, Literature review

How to cite this paper: Aye Pwint Phyu | Khaing Khaing Wai "To Development Manufacturing and Education using Data Mining: A Review" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, pp.2168-2173, https://doi.org/10.31142/ijtsrd27910

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)

1. INTRODUCTION
In most sectors, manufacturing is extremely competitive and the financial margins that differentiate between success and failure are very tight, with most established industries needing to compete, produce and sell at a global level. To master these trans-continental challenges, a company must achieve low cost production yet still maintain highly skilled, flexible and efficient workforces who are able to consistently design and produce high quality and low cost products. In higher-wage economies, this can generally only be done through very efficient exploitation of knowledge (Harding and Popplewell 2006; Choudhary et al. 2006). In modern digital manufacturing environments, the volume of data grows at an unprecedented rate, captured through barcodes, sensors, vision systems etc.

The huge amounts of data in manufacturing databases, which contain large numbers of records with many attributes that need to be simultaneously explored to discover useful information and knowledge, make manual analysis impractical. All these factors indicate the need for intelligent and automated data analysis methodologies, which might discover useful knowledge from data. Knowledge discovery in databases (KDD) and data mining (DM) have therefore become extremely important tools in realizing the objective of intelligent and automated data analysis. Data mining is a particular step in the process of KDD, involving the application of specific algorithms for extracting patterns (models) from data.

1.1 Data Mining for Manufacturing
Knowledge discovery in databases (KDD) and data mining (DM) have therefore become extremely important tools in realizing the objective of intelligent and automated data analysis. The additional steps in the KDD process, such as data preparation, data cleaning, data selection, incorporation of appropriate prior knowledge and proper interpretation of the results of mining, ensure that useful knowledge is derived from the data (Mitra et al. 2002). These fields provide specific data mining tools that can be used in various steps of a KDD process. Recently, with the growth of data mining technology, researchers and practitioners in various aspects of manufacturing and logistics have started applying this technology to search for hidden relationships or patterns which might be used to equip their systems with new knowledge. Early data mining work mostly addressed financial applications; for example, Zhang and Zhou (2004) described data mining in the context of financial applications from both technical and application perspectives. In this area, the competitive advantage gained through data mining included increased revenue, reduced cost, and much improved marketplace responsiveness and awareness. A recent survey carried out by Harding et al. (2006) and a special issue published on “data mining and applications in engineering design, manufacturing and logistics” (Feng and Kusiak 2006) clearly indicated the potential scope of data mining in these areas to achieve competitive advantages. A major advantage of data mining over other experimental techniques is that the required data

@ IJTSRD | Unique Paper ID – IJTSRD27910 | Volume – 3 | Issue – 5 | July - August 2019 Page 2168
for analysis can be collected during the normal operation of the manufacturing process being studied. Therefore, it is generally not necessary to specially dedicate machines or processes for data collection.

1.2 Data Mining for Manufacturing Literature Review
Han and Kamber (2001) classified data mining systems based on various criteria such as the kind of database mined, the kind of knowledge mined, the kind of technique utilized and the application areas adopted. Pham and Afify (2005) reviewed machine learning techniques in the manufacturing domain. Harding et al. (2006) surveyed data mining systems in different application areas of manufacturing, including some less considered areas such as manufacturing planning and shop floor control. However, in the last few years, data mining research in manufacturing has increased at an exponential rate. Han and Kamber (2001) mentioned that the kind of knowledge to be mined determines the data mining functions to be performed. Possible kinds of knowledge include concept description (characterization and discrimination), association, classification, clustering, and prediction. The aim of this paper is therefore to consolidate the existing state-of-the-art research efforts concerning the current practices in data mining applications in manufacturing, based on the kind of knowledge mined and the kind of technique utilized, thereby identifying promising areas for study. The remainder of the paper is organized as follows. Section “KDD, data mining and knowledge types” briefly discusses KDD, data mining, and the kinds of knowledge that particularly occur in manufacturing contexts. Section “Concept description (characterization and discrimination) in manufacturing” will discuss concept descriptions, which include characterization and discrimination in manufacturing. Classification in manufacturing is discussed in section “Classification in manufacturing,” followed by clustering in manufacturing in section “Clustering in manufacturing”. Section “Prediction in manufacturing” discusses prediction in manufacturing, and association in manufacturing is discussed in section “Association in manufacturing”. Details of our novel text mining approach are given in section “Detailed analysis and discussion: a text mining perspective on reviewed literature” and this is followed by conclusions in section “Conclusion”.

1.2.1 KDD, data mining and knowledge types
KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad et al. 1996a). The KDD process is interactive and iterative, involving more or less the following steps (Fayyad et al. 1996b; Mitra et al. 2002).
• Understanding the manufacturing domain
• Collecting the targeted data
• Data cleaning, pre-processing and transformation
• Data integration
• Choosing the functions of data mining
• Choosing the appropriate data mining algorithm
• Data mining
• Interpretation and visualization
• Implementation of discovered knowledge
• Knowledge storage, reuse and integration into the manufacturing system

Data mining is an interdisciplinary field with the general goal of predicting outcomes and uncovering relationships in data. It makes use of automated tools and techniques, employing sophisticated algorithms to discover hidden patterns, associations, anomalies and/or structure from large amounts of data stored in a data warehouse or other information repositories. In the context of manufacturing, two high level primary goals of data mining are prediction and description. Descriptive data mining focuses on discovering interesting patterns to describe the data. Predictive data mining focuses on predicting the behaviour of a model and determining future values of key variables based on existing information from available databases. The boundaries between descriptive and predictive data mining are not sharp, e.g. some aspects of the predictive model can be descriptive, to the degree that they are understandable, and vice versa. The goals of prediction and description can be achieved by using a variety of data mining tools and techniques. The next section therefore describes a range of functions and reviews their applicability in manufacturing domains.

1.2.2 Concept Description (Characterization and Discrimination) in Manufacturing
Characterization can be used to identify the features that significantly impact quality. Characterization provides a concise and succinct summarization of the given collection of data, while concept or class discrimination (comparison) provides descriptions that compare two or more collections of data. In manufacturing contexts, these functions are basically used to understand the process. Huyet (2006) proposed an evolutionary optimization and data mining based approach to produce knowledge of system behaviour in a simulated job shop based production process. Assigning proper dispatching rules is an important issue in enhancing the performance measures for a flexible manufacturing system (FMS). Lee and Ng (2006) presented a hybrid case based reasoning (HyCase) system for online technical support of PC fault diagnosis. Romanowski and Nagi (1999) applied a decision tree based data mining approach on a scheduled maintenance dataset and a vibration signal dataset. Subsystems which are most responsible for low equipment availability are recognized in the scheduled maintenance data and a recommendation for the preventive maintenance interval is made.

1.2.3 Classification in Manufacturing
Classification is a useful functionality in many areas of manufacturing. For example, in the semiconductor industry, defects are classified to find patterns and derive rules for yield improvement. Online control chart pattern recognition (CCPR) is another example of classification, for statistical process control (SPC), because unnatural patterns displayed by a control chart can be associated with specific causes that adversely impact the manufacturing process. Classification is a learning function that maps (classifies) a data item into one of several predefined categorical classes. Generally, classification is performed in two steps. In the first step, a model is built to describe a predetermined set of data classes or concepts, and this is done by analyzing the database tuples described by attributes, which collectively form the training dataset. In the second step, the model is used to classify new data. Rokach and Maimon (2006) applied a feature set decomposition methodology for quality improvement. They developed the Breadth Oblivious Wrapper (BOW) algorithm and showed its superiority over existing tools on datasets from IC fabrication and food processing. The idea is to find the classifier that is capable of predicting the quality measure of a product or batch based on its manufacturing parameters. Braha and Shmilovici (2002) presented three classification based data mining methods (decision tree induction, neural network and composite classifier) for a
new laser based wafer cleaning process called advanced wafer cleaning. The purpose of the data mining based classifier is to enhance understanding of the cleaning process by categorizing the given data into a predefined number of categorical classes and determining to which class new data belongs. A fractal dimension based classifier was proposed by Purintrapiban and Kachitvichyanukul (2003) for detection of unnatural patterns in process data. Kusiak (2002a, 2002b) applied data mining to support decision making processes by using different data mining algorithms to generate rules for a manufacturing system. A subset of these rules was then selected to produce a control signature for the manufacturing process, where the control signature is a set of feature values or ranges that lead towards an expected output.

From this review, the major application areas where data mining tools and techniques are used for classification include fault diagnosis, quality control and condition monitoring. In order to perform the classification task, decision trees, rough set theory, hybrid neural networks and other hybrid approaches have been successfully used. In hybrid approaches, fuzzy logic is often used in combination with other techniques to deal with noise and uncertainty in the data. The next section will deal with clustering and its performance on manufacturing databases.

1.2.4 Clustering in Manufacturing
Clustering is an important data mining function that can be performed on specified manufacturing data, such as order picking in logistics and supply chain. For example, order picking is routine in distribution centers, and before picking a large set of orders, orders are clustered into batches to accelerate the product movement within the storage zone. Clustering is also useful in the formation of cells in cellular manufacturing, where it is used for the simultaneous design of the part families and machine cells.

Clustering is also known as unsupervised learning. Unlike classification (supervised learning), in clustering the class label of each data object is not known. Clustering maps a data item into one of several clusters, where clusters are natural groupings of data items based on similarity metrics or probability density models (Mitra et al. 2002; Xu and Wunsch 2005). Sebzalli and Wang (2001) applied principal component analysis and fuzzy c-means clustering to a refinery catalytic process to identify operational spaces and develop operational strategies for the manufacture of desired products and to minimize the loss of product during system changeover. Kim and Ding (2005) proposed a data mining aided optimal design method for fixture layout in a four station SUV side panel assembly process. Clustering and classification are carried out to generate a design library and design selection rules, respectively. Torkul et al. (2006) showed that fuzzy c-means clustering outperformed crisp methods on a selected data set. Romanowski and Nagi (2001) proposed a design system which supports the feedback of data mined knowledge from life cycle data to the initial stages of the design process. Romanowski and Nagi (2005) and Romanowski and Nagi (2004) also applied a data mining approach for forming generic bills of materials (GBOMS) entities that represent the different variants in a product family and facilitate the search for similar designs and the configurations of new variants. Lee et al. (2001) proposed an intelligent inline measurement sampling method for process excursion monitoring and control in semiconductor manufacturing. The average diagnostic accuracy of 80% showed that this hybrid model is promising for an EMI diagnostic support system. Hui and Jha (2000) investigated the application of data mining techniques to extract knowledge from the customer service database for decision support and fault diagnosis.

Predictability of manufacturing processes, quality, maintenance, defects, or even within manufacturing systems is of vital importance. For example, in the context of maintenance, predictions can be made about what condition maintenance will be required or how equipment will deteriorate based on the analysis of past data. Feng and Kusiak (2006) and Feng et al. (2006) showed that there is no significant statistical advantage of using fivefold CV over threefold CV, or of using a two hidden layer neural network over a one hidden layer neural network, for turning surface roughness data. Pasek (2006) used a rough set theory based classifier for the prediction of cutting tool wear. For tool condition monitoring, Sun et al. (2005) applied a neural network for recognition of tool condition in a monitoring system. Sylvain et al. (1999) used different data mining techniques, including decision trees, rough sets, regression and neural networks, to predict component failure based on the data collected from the sensors of an aircraft. Their results also led to the design of preventive maintenance policies before the failure of any component. Lin and Tseng (2005) introduced a cerebellar model articulation controller (CMAC) neural network based machine performance estimation model. Tsai et al. (2006) presented a case based reasoning (CBR) system using intelligent indexing and reasoning approaches for PCB defect prediction. Knowledge elicitation is a technique that is generally used for producing rules based on human expertise. A method was developed by Browne et al. (2006) to fuse knowledge elicitation and data mining using an expert system.

1.2.5 Association in Manufacturing
Association rule mining was first introduced in 1993, and is used to identify relationships between a set of items in a database (Agrawal et al. 1993). These relationships are not based on inherent properties of the data themselves (as with functional dependencies), but rather are based on co-occurrence of the data items. In design contexts, the associations between requirements may provide additional information useful for the design. For example, technical specifications might state that a car that has two doors and a diesel engine requires a specific speed transmission. In such cases, knowing the number of cars with two doors and the number of cars with a diesel engine is not relevant, whilst the number of cars with two doors and a diesel engine is useful, for example to determine the capacity of a manufacturing process. The nature of this association can be extracted by applying data mining algorithms on the database.

Agard and Kusiak (2004b) applied data mining to customer response data for its utilization in the design of product families. Jiao and Zhang (2005) developed explicit decision support to improve the product portfolio identification issue by using association rule mining from past sales and product
records. This review shows that the major areas where association as a data mining function has been applied include product design, process control, mass customization, cellular design etc. Association rule mining has been applied as a dominant tool to identify the associations among variables.

2. Data Mining For EDM
Similarly, the education sector is one of the sectors where data mining is relatively new as compared to other sectors and hence it is under-utilized. The International Educational Data Mining Society defines EDM as follows: “EDM is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in” (Baker, 2015). “The EDM process converts raw data coming from educational systems into useful information that could potentially have a greater impact on educational research and practice” (Romero and Ventura, 2010). EDM by and large comprises four phases (Baker, 2010; Romero and Ventura, 2010):

1. Data Collection:
The first phase of EDM is to explore the interrelations between the data of the educational sector using data mining techniques, i.e., classification, clustering, regression etc. This phase focuses on grouping the data and also preprocesses them for mining. The data size is enormous and hence needs a lot of preprocessing in order to obtain a desired outcome.

2. Validating Relations:
The second phase of EDM is validation of the interrelations found between data, with the goal that uncertainty can be avoided. The relations are then validated based on the training dataset.

3. Predicting the Future Progress:
The third phase is to make predictions for the future on the basis of validated relationships in the learning environment.

4. Decision Making:
The fourth phase is utilizing the gathered information and making calculated decisions using techniques like prediction and classification.

Educational institutes use data mining techniques for purposes like analyzing and visualizing data and predicting students' performance. Data mining techniques like clustering can be used to group students based on the parameters decided by the analyst. Data mining helps in identifying unwanted behaviors and provides feedback to instructors on students' performance, with information to support the evaluation.

Data mining utilizes many techniques and algorithms, and they can be classified into the following categories:
• Prediction:
It aims at generating a single target attribute of the data by analyzing all the other attributes and generating patterns from them (Romero and Ventura, 2013). Types of prediction techniques are classification, clustering, etc.

• Classification:
Groups information/data into a few predefined classes. The techniques utilized for classification are:
  • Decision tree
  • Naive Bayes classification
  • Generalized Linear Models (GLM)
  • Support vector machine etc.

• Clustering:
In the clustering technique, the dataset is divided into various groups, known as clusters. In clustering, the data points of one cluster should be more similar to the other data points of the same cluster and more dissimilar to the data points of another cluster. There are two ways of initiating a clustering algorithm: firstly, the clustering algorithm is started with no prior assumption; and secondly, the clustering algorithm is started with a prior postulate.

• Relationship Mining:
It helps in finding relations between values in a data corpus and organizing them as rules. There are various relationship mining procedures such as association rule mining, sequential pattern mining, correlation and causal data mining. In EDM, relationship mining is used to identify relationships between students' online activities and their final results, and to model sequences of students' problem-solving activities.

• Discovery with Models:
It uses a previously validated model of a phenomenon (built using prediction, clustering, or knowledge engineering) as a component in a further analysis, such as prediction or relationship mining. It is used, for example, to identify relationships between a student's history and characteristics.

• Outlier Detection:
The goal of outlier detection is to discover data points that are significantly different from the rest of the data. An outlier is an observation that is usually larger or smaller than the other values in the data corpus. In EDM, outlier detection can be used to detect deviations in students' or instructors' actions or behaviours, irregular learning processes, and students with learning difficulties (Dominguez et al., 2010; Baker, 2015).

2.1 Data Mining for EDM Literature Review
Pruthi and Bhatia (2015) utilized data mining techniques to predict the performance of computer science students in placement activity and also to predict the company they are going to be placed in (name and type of company). They used a classification process based on parameters like the students' overall results and specific marks. The main issue with their process was that they used only the limited amount of data available within the University for training and testing purposes. They identified the parameters as marks in many core IT subjects.

Dominguez et al. (2010) developed a process and feedback generation engine that generated feedback based on the current performance or the performance of a similar class of students. They used student information, current performance and the performance history of other users as parameters to predict the performance of the student and thus provide real-time feedback. This is a real-time generation of the evaluation of a student and hence a prediction of student performance. Educational data mining methods are based on statistics, machine learning and database theory. The main activities of this area are: data mining usage for Intelligent Tutoring Systems support, analysis of
education processes, visual data mining and visual education process pattern. The analysis of the scientific literature in the field of using the methods of data mining showed that this problem is interesting to many modern researchers. For example, in (Ceylan 2015) the authors propose a searching model system related to student success in the form of classifiers, each of which is trained on a different dataset with hundreds of thousands of lines in relation to sections. The resulting classifiers would serve as an advisory system for students who want to choose courses prior to registration in the semester. In (Herlina 2017), the role of the K-Means algorithm in classifying students' learning activities in e-learning was shown. This algorithm helped to form clusters of student activity and of improvement in student abilities. An approach based on a minimal spanning tree for clustering e-learning resources is proposed in (Wu 2016). The developed clustering method can classify students into groups so that a homogeneous classification can increase the learning effectiveness. Rawat (2019) justified the use of cluster analysis for classifying a new student into the corresponding class and recommending relevant courses using various evaluation metrics. In addition, global trends, dynamic environment, difficulty of the problems requiring greater efficiency, adaptability, integration and coordination of all of relevant design process and implementation of the e-learning systems.

2.2 Factors Affecting EDM
The main issues that EDM focuses on are placement, admissions, branch or career selection, and student performance. Many factors affect these areas of education, but almost all of them can be classified into the following:
• Interest of Student:
The career of an individual depends on the choices he makes. These choices are strongly influenced by the interest of the student in a given area. The area in which a student has an interest can help him perform better in terms of academics as well as in his corporate life. If a student chooses to pursue an occupation or academics in a topic which he is not interested in, it can lead to a difficult life as he would not be performing well. Interest also includes his habits and hobbies. For example, if a person has a hobby of traveling around, then he can choose his future in that field. Hence interest, hobbies and habits can strongly affect all the factors of education.

• College Facilities:
College facilities are the things that a student pays fees for: better infrastructure, better faculties, library, residential facilities, food availability and other things. All these combine to make up the basic needs of a student for education. The college is responsible for providing all these facilities along with academic knowledge, which is its primary work.

• Schooling:
Children with good schooling present good academic results in higher education. They have experienced an educational environment that takes more interest in the practical view of education. Students who actively participate in classroom activities tend to have a strong base of technical and nontechnical skills, which helps them in placement activities. Parameters such as the medium of schooling and the student's skillset help in predicting the performance of students in academics.

• College Reputation:
Google’s CEO is from IIT and even Microsoft’s CEO is from Manipal University. Why not from a normal institute? That is due to the reputation which these institutions have made. Hence, even if a student is an above average student from a local, non-reputed college, no MNC is going to offer a job directly via the college campus. That is just because the college is not reputed.

3. Discussion
The reviewed literature shows that there is a rapid growth in the application of data mining in manufacturing, particularly in the semiconductor industry. In this research, we have briefly discussed the data mining concept and its techniques for development of knowledge management in organizations. The next section discusses the text mining experiments undertaken using the abstract and keywords of the 150 published works reviewed in this paper.

• Knowledge discovery in text (KDT) and text mining applications on the literature review.
Following the definition of KDD by Fayyad et al. (1996a), Karanikas and Theodoulidis (2002) defined KDT as “the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in unstructured data”. Text Mining (TM) is also a step in the KDT process, consisting of particular data mining and natural language processing algorithms that, under certain computational efficiency and limitations, produce a particular enumeration of patterns over a set of unstructured textual data. KDT in the reviewed literature mainly consists of three steps as follows:

A. Abstract and keyword collection:
In our experiments, the abstracts and key words of the literature reviewed in this paper have been collected. Where necessary, additional key words have also been identified from the papers and added to the abstract for text mining. This is important as the published abstracts often did not include full details of the type of data mining function(s) and areas of application discussed in the paper.

B. Retrieving and pre-processing documents:
Abstracts have only been taken from papers which deploy data mining methodology to solve problems of manufacturing. The additional key words have been identified based on knowledge area, function performed and technique used. The major knowledge areas examined include manufacturing system, quality control, fault diagnosis, maintenance, job shop, yield improvement, manufacturing process, product design, production control, and supply chain management. Similarly, the functions considered include concept description, classification, clustering, prediction and association. Major techniques used include rough set theory, decision tree,
studies. They tend to be more mature and regular in their statistics, neural network, association rule, fuzzy c means
assignments. Most of the students adapt to the learning clustering, and regression analysis and hybrid algorithms. In
material and methodology quickly with ease. Similarly, this context, the term “hybrid algorithm” indicates that
schooling, medium and tutoring are imperative as students either a group of algorithms have been used in combination
with English medium foundation generally make more to solve a particular problem, or a group of algorithms have
inquiries amid showing learning process. These students been used at different stages of data mining.
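Steps A and B amount to tagging each abstract with the knowledge areas, functions and techniques it mentions. As an illustration only, the sketch below shows one way such tagging could be done in Python with plain substring matching. The taxonomy terms are drawn from the lists given in step B, but the function names (`tag_abstract`, `cooccurrence`), the sample abstracts and the matching rule are assumptions made for this example, not the procedure used in the reviewed study.

```python
from collections import Counter
from itertools import product

# Hypothetical taxonomy: a subset of the terms listed in step B.
TAXONOMY = {
    "knowledge_area": ["quality control", "fault diagnosis", "maintenance",
                       "job shop", "yield improvement", "product design"],
    "function": ["concept description", "classification", "clustering",
                 "prediction", "association"],
    "technique": ["rough set", "decision tree", "neural network",
                  "association rule", "regression"],
}


def tag_abstract(text, extra_keywords=()):
    """Return {category: sorted matched terms} for one abstract.

    Matching is naive, case-insensitive substring search, so the function
    term "association" also matches text that mentions "association rule".
    """
    haystack = " ".join([text, *extra_keywords]).lower()
    return {
        category: sorted(term for term in terms if term in haystack)
        for category, terms in TAXONOMY.items()
    }


def cooccurrence(tagged, cat_a, cat_b):
    """Count how often a term of cat_a appears together with a term of cat_b."""
    counts = Counter()
    for tags in tagged:
        counts.update(product(tags[cat_a], tags[cat_b]))
    return counts


# Invented sample abstracts, each with manually added keywords (step A).
abstracts = [
    ("A decision tree is used for fault diagnosis in a production line.",
     ["classification"]),
    ("Neural network models for yield improvement.", ["prediction"]),
    ("Decision tree and neural network hybrids for fault diagnosis.",
     ["classification"]),
]

tagged = [tag_abstract(text, keywords) for text, keywords in abstracts]
pairs = cooccurrence(tagged, "technique", "knowledge_area")

for (technique, area), n in pairs.most_common():
    print(f"{technique} + {area}: {n}")
```

Counting technique and knowledge-area pairs in this way gives a rough view of which tools are popular for which problems. A real study would need a more careful matcher (stemming, synonyms, word boundaries) than the substring test assumed here.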
C. Text mining:
For the current purpose, text analysis and link analysis were used to extract patterns, trends and useful knowledge and to meet the listed benefits. The text mining was performed as an automatic process with manual interventions during the pre-processing stage. PolyAnalyst, which is one of the leading data/text mining software packages on the market, was used for this purpose. All the results shown and interpretations made were automatically generated using this software. The following subsections describe how the abovementioned objectives were achieved.

Equally, job description mining can reveal actionable insight for students, employers and the institution. The institution can provide students with a better understanding of co-op opportunities in various disciplines and therefore help them select the right academic program and career. Additionally, the institution may use frequently appearing words and the clustering of jobs in various disciplines to produce more effective promotional material for its co-op programs and to help attract strong students. Furthermore, students can find out what types of jobs are available to them and what soft and technical skills are required. In particular, clustering can be used to segment the job descriptions so that students can more easily find jobs they are interested in and institutions can align their curricula with job market needs.

4. CONCLUSION
Knowledge discovery and data mining have created new intelligent tools for extracting useful information and knowledge automatically from manufacturing databases. The present article provides a survey of the available literature on data mining applications in manufacturing with a special emphasis on the kind of knowledge mined. The types of knowledge identified indicate that the major data mining functions to be performed include characterization and description, association, classification, prediction, and clustering of data. A novel text mining approach has been applied to the reviewed literature to identify the popular and successful research tools and existing research gaps, examine the underexplored and overlooked areas, and identify good practices in data mining in manufacturing and some key features unknown to data mining practitioners.

This paper also reviewed EDM and manufacturing as areas of research that use data mining. The paper discussed various techniques, factors and applications of EDM and manufacturing. Many factors affect the aspects of EDM. The paper highlighted some of them and compared many of them based on their impact on placement outcomes, academic performance, and college and branch selection.

REFERENCES
[1] Agard, B., & Kusiak, A. (2004a). Data mining for subassembly selection. Journal of Manufacturing Science and Engineering, 126, 627–631. doi:10.1115/1.1763182.
[2] Choudhary, A. K., Harding, J. A., & Tiwari, M. K. (2009). Data mining in manufacturing: a review based on the kind of knowledge. Journal of Intelligent Manufacturing, 20, 501–521. doi:10.1007/s10845-008-0145-x.
[3] Agard, B., & Kusiak, A. (2004b). Data-mining based methodology for the design of product families. International Journal of Production Research, 42(15), 2955–2969.
[4] Agrawal, R., Imielinski, T., & Swami, A. N. (1993). Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (pp. 207–216). Washington, D.C., May 1993.
[5] Backus, P., Janakiram, M., Mowzoon, S., Runger, G. C., & Bhargava, A. (2006). Factory cycle time prediction with a data-mining approach. IEEE Transactions on Semiconductor Manufacturing, 19(2). doi:10.1109/TSM.2006.873400.
[6] Batanov, D., Nagarur, N., & Nitikhumkasem, P. (1993). Expert-MM: A knowledge based system for maintenance management. Artificial Intelligence in Engineering, 8, 283–291. doi:10.1016/0954-1810(93)90012-5.
[7] Belz, R., & Mertens, P. (1996). Combining knowledge based systems and simulation to solve rescheduling problems. Decision Support Systems, 17, 141–157. doi:10.1016/0167-9236(95)00029-1.
[8] Baker R (2010), “Data Mining for Education”, in B McGaw, P Peterson and E Baker (Eds.), International Encyclopedia of Education, 3rd Edition, Vol. 7, pp. 112-118, Elsevier, Oxford, UK.
[9] Baker R (2015), Educational Data Mining and Learning Analytics, Springer International Publishing.
[10] Blagojević M and Micić Ž (2013), “A Web-Based Intelligent Report e-Learning System Using Data Mining Techniques”, Computers & Electrical Engineering, Vol. 39, No. 2, pp. 465-474, Elsevier BV.
[11] Gafarova, L. M., Zavyalova, I. G., Mustafin, N. N. (2015) ‘On the features of the application of the Pearson consensus criterion X2’, ESGI, no. 4(8), pp. 63-67.
[12] Gorlushkina, N. N., Kotsyuba, I. Yu., Khlopotov, M. V. (2015) ‘Tasks and methods of intellectual analysis of educational data for decision support’, GTR, no. 1, pp. 472-482.
[13] Hanna, M. (2004) ‘Data Mining in the e-learning domain’, Campus-Wide Information Systems, no. 21(1), pp. 29-34.
[14] He, W. (2013) ‘A survey of security risks of mobile social media through blog mining and an extensive literature search’, Information Management and Computer Security, no. 21(5), pp. 381-400.
[15] Herlina, Latipa Sari, Dewi, Suranti Mrs. and Leni, Natalia Zulita (2017) “Implementation of k-means clustering method for electronic learning model” // International Conference on Information and Communication Technology (IconICT), IOP Publishing, IOP Conf. Series: Journal of Physics: Conference Series, Volume 930.
[16] Hussain, M. et al. (2018) “Student Engagement Predictions in an e-Learning System and Their Impact on Student Course Assessment Scores” // Computational Intelligence and Neuroscience, vol 2018.
[17] Ilyina, T. S., Zakharov, N. Yu. (2016) ‘Management of educational risks’, Vestnik VGUIT, no. 4(70), pp. 290-295.
[18] Kamisli, Ozturk, Z., Erzurum, Cicek, Z. I. and Ergul, Z (2017), “Sentiment Analysis an Application to Anadolu University,” Acta Physica Polonica A, vol. 132, no. 3, pp. 753–755.
[19] Smeet M Thakrar, Navjyotsinh Jadeja and Nikunj Vadher, Educational Data Mining: A Review, The IUP Journal of Information Technology, Vol. XIV, No. 1, 2018.