Seminar On Data Mining Concepts and Its
on
DATA MINING: CONCEPTS AND ITS TECHNIQUES
We live in what is often called the information age. The information industry
holds a huge amount of data, but this data is of no use until it is converted
into useful information. It is therefore necessary to analyze this data and
extract useful information from it.
Extraction of information is not the only process involved; data mining also
includes other processes such as data cleaning, data integration, data
transformation, pattern evaluation, and data presentation.
Once all these processes are complete, we can use this information in many
applications such as fraud detection, market analysis, science exploration,
etc.
Concept
Data mining is the process of discovering patterns in large data sets involving
methods at the intersection of machine learning, statistics, and database
systems. Data mining is an interdisciplinary subfield of computer science with an
overall goal to extract information (with intelligent methods) from a data set and
transform the information into a comprehensible structure for further use. Data
mining is the analysis step of the "knowledge discovery in databases" process, or
KDD. The term "data mining" is in fact a misnomer, because the goal is the
extraction of patterns and knowledge from large amounts of data, not the extraction
(mining) of data itself.
Definitions
3.Data selection (where data relevant to the analysis task are retrieved from the
database)
4.Data transformation (where data are transformed and consolidated into forms
appropriate for mining by performing summary or aggregation operations)
5.Data mining (an essential process where intelligent methods are applied to
extract data patterns)
6.Pattern evaluation (to identify the truly interesting patterns representing
knowledge, based on interestingness measures)
TECHNIQUES
Several core techniques are used in data mining; each describes a type of mining
and data recovery operation. These are the following:
1.Association
2.Classification
4.Prediction
Prediction, as its name implies, is a data mining technique that discovers the
relationships among independent variables and between dependent and independent
variables. For instance, prediction analysis can be used in sales to predict
future profit: if we treat sales as the independent variable, profit is the
dependent variable. Then, based on historical sales and profit data, we can fit
a regression curve that is used for profit prediction.
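The sales-to-profit example above can be sketched as a simple least-squares line fit. This is only an illustration under stated assumptions: the relationship is taken to be linear, the figures are invented, and the function names fit_line and predict are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and co-variation of x with y, around their means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at a new value of the independent variable."""
    return slope * x + intercept


# Hypothetical historical data: sales (independent) and profit (dependent).
sales = [100.0, 150.0, 200.0, 250.0, 300.0]
profit = [12.0, 19.0, 24.0, 31.0, 36.0]

slope, intercept = fit_line(sales, profit)
forecast = predict(slope, intercept, 350.0)  # estimated profit at sales of 350
print(f"profit = {slope:.3f} * sales + {intercept:.3f}")
print(f"forecast at sales=350: {forecast:.1f}")
```

A straight line is the simplest choice of regression curve; real data may call for a different model.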
5.Regression
6.Sequential patterns
Sequential pattern analysis is a data mining technique that seeks to discover
or identify similar patterns, regular events, or trends in transaction data over
a business period.
In sales, with historical transaction data, businesses can identify sets of items
that customers buy together at different times of the year. Businesses can then
use this information to recommend those items to customers with better deals,
based on their past purchasing frequency.
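As a rough illustration of the sales example, the sketch below counts which item pairs are bought together in each period of the year. It is a simplified co-occurrence count, not a full sequential-pattern algorithm such as GSP or PrefixSpan; the transactions, the period labels, and the function name frequent_pairs_by_period are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs_by_period(transactions, min_count=2):
    """For each period, keep item pairs bought together at least min_count times."""
    counts = {}
    for period, items in transactions:
        pair_counter = counts.setdefault(period, Counter())
        # Sort so that each pair is always recorded in the same order.
        for pair in combinations(sorted(set(items)), 2):
            pair_counter[pair] += 1
    return {
        period: {pair: c for pair, c in counter.items() if c >= min_count}
        for period, counter in counts.items()
    }


# Hypothetical transaction history: (period of the year, items in one basket).
transactions = [
    ("winter", ["gloves", "scarf", "tea"]),
    ("winter", ["gloves", "scarf"]),
    ("winter", ["tea", "scarf"]),
    ("summer", ["sunscreen", "hat"]),
    ("summer", ["sunscreen", "hat", "tea"]),
]

patterns = frequent_pairs_by_period(transactions)
for period, pairs in patterns.items():
    print(period, pairs)
```

Pairs that recur within a period are candidates for seasonal bundle offers.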
7.Decision trees
A decision tree is one of the most commonly used data mining techniques because
its model is easy for users to understand. In the decision tree technique, the
root of the tree is a simple question or condition that has multiple answers.
Each answer then leads to a further set of questions or conditions that help us
narrow down the data, so that we can make the final decision based on it.
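The structure described above can be sketched as nested dictionaries, where each node holds a question and each answer leads either to another question or to a final decision. The tree here is written by hand rather than learned from data, and its questions, answers, and the function name classify are hypothetical.

```python
def classify(node, record):
    """Follow the answers in record from the root until a leaf decision is reached."""
    while isinstance(node, dict):
        answer = record[node["question"]]
        node = node["answers"][answer]
    return node  # a leaf is a plain string holding the decision


# Hypothetical hand-built tree: the root question has several possible answers.
tree = {
    "question": "outlook",
    "answers": {
        "sunny": {
            "question": "humidity",
            "answers": {"high": "stay in", "normal": "go out"},
        },
        "overcast": "go out",
        "rainy": {
            "question": "windy",
            "answers": {True: "stay in", False: "go out"},
        },
    },
}

decision = classify(tree, {"outlook": "rainy", "windy": False})
print(decision)
```

In practice the tree would be induced from training data by a learning algorithm; the traversal step stays the same.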
9. Tracking patterns.
One of the most basic techniques in data mining is learning to recognize patterns in
your data sets. This is usually a recognition of some aberration in your data
happening at regular intervals, or an ebb and flow of a certain variable over time.
For example, you might see that your sales of a certain product seem to spike just
before the holidays, or notice that warmer weather drives more people to your
website.
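One simple way to spot a recurring spike like the pre-holiday example is to average sales per calendar month across several years and flag the months that stand well above the overall level. The sketch below assumes monthly sales records; the figures, the 1.5 threshold factor, and the function name recurring_peaks are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def recurring_peaks(records, factor=1.5):
    """Return calendar months whose average sales exceed factor times the baseline."""
    by_month = defaultdict(list)
    for _year, month, units in records:
        by_month[month].append(units)
    # Average each calendar month across all years in the data.
    month_avg = {m: mean(values) for m, values in by_month.items()}
    baseline = mean(list(month_avg.values()))
    return sorted(m for m, avg in month_avg.items() if avg > factor * baseline)


# Hypothetical records: (year, month, units sold), with a December spike each year.
records = (
    [(2022, m, 100) for m in range(1, 11)]
    + [(2022, 11, 130), (2022, 12, 220)]
    + [(2023, m, 110) for m in range(1, 11)]
    + [(2023, 11, 140), (2023, 12, 240)]
)

peaks = recurring_peaks(records)
print("Months with recurring spikes:", peaks)
```

Averaging across years is what separates a recurring pattern from a one-off spike.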
10.Outlier detection.
In many cases, simply recognizing the overarching pattern can’t give you a clear
understanding of your data set. You also need to be able to identify anomalies, or
outliers in your data. For example, if your purchasers are almost exclusively male,
but during one strange week in July, there’s a huge spike in female purchasers,
you’ll want to investigate the spike and see what drove it, so you can either
replicate it or better understand your audience in the process.
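The purchaser example can be sketched with a basic z-score test, which flags values that lie far from the mean relative to the standard deviation. The weekly figures, the threshold of 2.0, and the function name find_outliers are assumptions made for illustration; other outlier tests may suit real data better.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return indices of values more than threshold standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical share of female purchasers per week; week index 4 is the odd one.
weekly_female_share = [0.05, 0.04, 0.06, 0.05, 0.45, 0.05, 0.04, 0.06]

outlier_weeks = find_outliers(weekly_female_share)
print("Weeks to investigate:", outlier_weeks)
```

The flagged weeks are only a starting point; the investigation into what drove them still has to be done by hand.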
Bio Information
Data mining can be used by an institution to make accurate decisions and also
to predict students' results.
The learning patterns of students can be captured and used to develop
techniques for teaching them.
CONCLUSION
Data mining is more than running some complex queries on the data stored in
your database. You must work with your data, reformat it, or restructure it,
regardless of whether you are using SQL, document-based databases, big-data
frameworks such as Hadoop, or simple flat files. The format of the information
that you need depends on the technique and the analysis that you want to
perform. After you have the information in the format you need, you can apply
the different techniques (individually or together) regardless of the
underlying data structure or data set.