Introduction To Big Data Analytics: Welcome Intro To BDA!
11/24/2021
1. Introduction
3. Data Pre-Processing
5. Data Visualization
• “Data mining is the analysis of (often large) observational data sets to find
unsuspected relationships and to summarize the data in novel ways that are
both understandable and useful to the data owner”. (David Hand, Heikki
Mannila, and Padhraic Smyth, Principles of Data Mining, MIT Press, Cambridge,
MA, 2001.)
• Process Mining is the task of converting event data into process models.
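As a minimal sketch of what "converting event data into process models" can look like, the Python below derives a directly-follows graph (one of the simplest process models) from a toy event log. The event log, the activity names, and the function name `directly_follows` are hypothetical illustrations and are not taken from the slides.

```python
from collections import Counter, defaultdict


def directly_follows(event_log):
    """Count how often activity a is directly followed by activity b in a case.

    event_log: iterable of (case_id, timestamp, activity) tuples.
    Returns a Counter keyed by (a, b) pairs.
    """
    # Group the events of each case into one trace.
    traces = defaultdict(list)
    for case_id, timestamp, activity in event_log:
        traces[case_id].append((timestamp, activity))

    counts = Counter()
    for events in traces.values():
        events.sort()  # order each trace by timestamp
        activities = [activity for _, activity in events]
        # Every pair of consecutive activities adds one edge occurrence.
        for a, b in zip(activities, activities[1:]):
            counts[(a, b)] += 1
    return counts


# Hypothetical event log: three cases of a small ordering process.
log = [
    ("c1", 1, "register"), ("c1", 2, "check"), ("c1", 3, "pay"),
    ("c2", 1, "register"), ("c2", 2, "pay"),
    ("c3", 1, "register"), ("c3", 2, "check"), ("c3", 3, "pay"),
]

dfg = directly_follows(log)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

The resulting edge counts can be read as a simple process map: frequent edges show the main path, rare edges show deviations.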
• Must address:
• Enormity of data
• High dimensionality of data
• Heterogeneous, distributed nature of data
How much data do we generate?
https://www.allaccess.com/merge/archive/31294/infographic-what-happens-in-an-internet-minute/
Why BDA?
Managing Organizations
Informed decision making as a prerequisite for success
• Strategic Givens: Vision, Mission, Values, Purpose, Structure, Politics, Environment, etc.
• Direction: Policies, Goals, and Objectives
• Decision Making (Analytics, Decision Making): What should be done?
• Implementation (Project Management): When and how?
• Action
[Diagram: INTELLIGENCE, DESIGN, and CHOICE phases, related to DATA and MODELS]
• Structuring Relationships
• Problem Representation
• Generation of Alternatives
• Variables (Measures and Estimates)
• Probabilities and Estimates
• Spreadsheet Models for managing complex relationships and detail
• Decision Analysis and Influence Diagrams for Visualizing Models and Choices
Components of a DSS
Creating Information Under Conditions of Uncertainty and Complexity
• DATA BASE: Enterprise Data, DBMS
• MODEL BASE: Application Models, MBMS
• Business Reporting
• Marketing (Pricing, Promotion, Loyalty): Demand, Consumers
• Production (Capacity, Labor, Materials): Quantity, Suppliers
• Finance (Cash flow, Debt/Equity, Investments): Revenues, Investors
Why DM?
• Data explosion
• "We are drowning in data, but starving for knowledge!"
• Knowledge Discovery
• Machine Learning
• Data Mining
• Descriptive data mining: clustering, pattern mining, etc.
• Predictive data mining: classification, prediction, etc.
• Data → Information → Knowledge
• Interpretation
• Understanding
• Learning
• Acting
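The contrast between descriptive and predictive data mining can be illustrated with a small sketch, using only the Python standard library. The first function clusters unlabeled values (descriptive); the other two learn from labeled values and classify a new one (predictive). The data, the labels "low" and "high", and all function names are hypothetical; k-means and nearest-centroid classification are chosen here only as simple stand-ins for "clustering" and "classification".

```python
from statistics import mean


def kmeans_1d(values, k=2, iterations=10):
    """Descriptive: group unlabeled numbers into k clusters (requires k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters


def train_nearest_centroid(samples):
    """Predictive: learn one centroid per class from (value, label) pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in by_label.items()}


def predict(model, value):
    """Assign the label whose centroid is closest to the value."""
    return min(model, key=lambda label: abs(value - model[label]))


# Hypothetical monthly spend figures.
centroids, clusters = kmeans_1d([20, 22, 25, 80, 85, 90], k=2)
print("clusters:", clusters)

model = train_nearest_centroid([(20, "low"), (25, "low"), (85, "high"), (90, "high")])
print("prediction for 30:", predict(model, 30))
print("prediction for 70:", predict(model, 70))
```

The descriptive step only summarizes structure already present in the data, while the predictive step uses known labels to make a statement about an unseen value.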
Data Warehouse
• Data selection, where data relevant to the analysis task are retrieved from the
database.
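A minimal sketch of the data selection step, assuming the data sits in a relational database: only the rows and columns relevant to the analysis task are retrieved. The in-memory SQLite database, the `sales` table, its columns, and the function name `select_relevant` are all hypothetical.

```python
import sqlite3

# Hypothetical in-memory database standing in for the operational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (customer TEXT, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("ann", "north", 2020, 120.0),
        ("bob", "south", 2021, 80.0),
        ("cara", "north", 2021, 200.0),
        ("dan", "north", 2021, 50.0),
    ],
)


def select_relevant(connection, region, year):
    """Retrieve only the records and attributes needed for the analysis task."""
    cursor = connection.execute(
        "SELECT customer, amount FROM sales "
        "WHERE region = ? AND year = ? ORDER BY customer",
        (region, year),  # parameters are bound, not formatted into the SQL string
    )
    return cursor.fetchall()


rows = select_relevant(conn, "north", 2021)
print(rows)
```

Selecting early keeps later steps smaller and cheaper, because irrelevant rows and columns never enter the pipeline.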
KDD
• KDD: Knowledge Discovery in Databases.
• Data archeology.
• Information harvesting
• Knowledge extraction
• Machine learning
• Data science?
• Business intelligence?
CRISP-DM
An industry- and tool-neutral data mining process model.
Business understanding phase
Modeling phase
Evaluation phase
Deployment phase
DM in Businesses
• Process management
• Market basket analysis
• Marketing
• Customer loyalty
• Fraud detection
• Trend analysis
DM in practice
1. Learn about the problem domain
2. Data selection
3. Data cleaning, preprocessing, and reduction
4. Data mining
5. Interpretation of information
6. Apply knowledge in domain
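To make the data mining step concrete for one of the listed business uses, market basket analysis, here is a minimal sketch that counts item pairs and reports simple association rules with their support and confidence. The baskets, the 0.5 support threshold, and the function name `pair_rules` are hypothetical; real analyses typically use a full algorithm such as Apriori rather than this pair-only count.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support >= min_support:
            # confidence(a -> b) = count(a and b) / count(a)
            rules.append((a, b, support, count / item_counts[a]))
            rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


# Hypothetical transactions.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
]

rules = pair_rules(baskets)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Interpreting such rules (for example, whether "butter -> bread" is actionable) corresponds to the later steps of interpretation and applying knowledge in the domain.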
Quizzes: 10%
Final Exam: 30-50%
• Each class should join Google Classroom
• Zero tolerance for plagiarism