LECTURE 3-BDM 411 Data Analytics and BIG Data
INTELLIGENCE AND ANALYTICS
LECTURE 3: BIG DATA ANALYTICS AND DATA MINING
INTRODUCTION TO BIG DATA ANALYTICS
Definition:
Classification – generalizes known structure so that it can be
applied to new data. For example, an e-mail program might
attempt to classify an e-mail as "legitimate" or as "spam".
Regression – attempts to find a function that models the
data with the least error.
Summarization – provides a more compact representation of
the data set, including visualization and report generation.
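The three tasks above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy e-mail features (link count, exclamation-mark count), the data values, and the choice of a 1-nearest-neighbour rule and a straight-line least-squares fit are not taken from the lecture; they are simply small instances of each task.

```python
import math
import statistics

def nearest_neighbor_label(train, point):
    """Classification: give `point` the label of its closest labeled example."""
    closest = min(train, key=lambda example: math.dist(example[0], point))
    return closest[1]

def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def summarize(values):
    """Summarization: a compact description of a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
    }

# Hypothetical training e-mails: (links, exclamation marks) -> label
train = [
    ((0, 1), "legitimate"),
    ((1, 0), "legitimate"),
    ((8, 6), "spam"),
    ((7, 9), "spam"),
]
label = nearest_neighbor_label(train, (6, 7))

# Hypothetical data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)

summary = summarize(ys)
```

Here the classifier reuses structure learned from labeled e-mails, the regression picks the line with the smallest squared error, and the summary replaces the raw list with a handful of statistics.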
Data Mining Process
1. State the problem and formulate the hypothesis
For example, a feature with the range [0, 1] and another
with the range [−100, 1000] will not carry the same weight
in the applied technique.
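One common way to put such features on a comparable footing is min-max scaling, which maps each feature to [0, 1]. The sketch below is only an illustration under assumptions: the function name `min_max_scale` and the sample values are hypothetical, and the lecture does not say which scaling method it has in mind.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical feature spanning the wide range [-100, 1000]
wide = [-100.0, 450.0, 1000.0]
scaled = min_max_scale(wide)
```

After scaling, both features span the same interval, so neither dominates a distance-based or weight-based technique purely because of its units.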
Data Mining Process
4. Estimate the model
Selecting and implementing the appropriate data-mining
technique is the main task in this phase.
Uncertain economics
Rapidly changing environments
Global competition
Demanding customers
Taking advantage of the information that companies acquire
is a critical success factor.
The Information Gap