DMBAR Chapter 1

Uploaded by

ANAM AFTAB 22GSOB2010404

DATA MINING FOR BUSINESS ANALYTICS IN R

Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, Kenneth C. Lichtendahl, Jr.

Indian Adaptation by
O.P. Wali, Professor, Indian Institute of Foreign Trade

PRELIMINARIES

CHAPTER 1 Introduction

CHAPTER 2 Overview of the Data Mining Process

CHAPTER 1

Introduction
1.1 WHAT IS BUSINESS ANALYTICS?

▪ Business Analytics (BA) is the practice and art of bringing quantitative data to bear on decision-making.

▪ Business Analytics, or more generically, analytics, includes a range of data analysis methods. Many powerful
applications involve little more than counting, rule-checking, and basic arithmetic.
▪ The next level of business analytics, now termed Business Intelligence (BI), refers to data visualization and
reporting for understanding.
▪ Business Analytics now typically includes BI as well as sophisticated data analysis methods, such as statistical
models and data mining algorithms used for exploring data, quantifying and explaining relationships between
measurements, and predicting new records.

The Business Analytics toolkit also includes statistical experiments, the most common of which is known to
marketers as A-B testing. These are often used for pricing decisions:

▪ Orbitz, the travel site, found that it could price hotel options higher for Mac users than Windows users.

▪ Staples online store found it could charge more for staplers if a customer lived far from a Staples store.
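
As a rough illustration of how an A-B test on price might be evaluated, the sketch below compares the purchase rates of two groups with a two-proportion z-test. The function name and the visitor and purchase counts are hypothetical, chosen only for illustration; they do not come from the Orbitz or Staples examples.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert at the same rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that the two variants do not differ.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF, written with the error function.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: 1,000 visitors saw each price; 120 bought at price A, 90 at price B.
z, p = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=90, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference in purchase rates is unlikely to be due to chance alone. Whether the higher price is worth it remains a business question about revenue, not only conversion.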

Successful use of analytics and data mining requires both an understanding of the business context where value is to
be captured, and an understanding of exactly what the data mining methods do.
1.2 WHAT IS DATA MINING?

▪ Data mining refers to business analytics methods that go beyond counts, descriptive techniques, reporting, and
methods based on business rules.
▪ Data mining includes statistical and machine-learning methods that inform decision-making, often in an automated
fashion.
▪ The era of Big Data has accelerated the use of data mining.
▪ Data mining methods, with their power and automaticity, have the ability to cope with huge amounts of data and
extract value.
1.3 DATA MINING AND RELATED TERMS

▪ The term data mining itself means different things to different people.
▪ Data mining stands at the confluence of the fields of statistics and machine learning (also known as artificial
intelligence).
▪ The emphasis that classical statistics places on inference is absent from data mining.
▪ In comparison to statistics, data mining deals with large datasets in an open-ended fashion, making it impossible to
put the strict limits around the question being addressed that inference would require.
▪ Data mining is vulnerable to the danger of overfitting, where a model is fit so closely to the available sample of
data that it describes not merely structural characteristics of the data, but random peculiarities as well. In
engineering terms, the model is fitting the noise, not just the signal.
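
Overfitting can be seen in a small simulation. The sketch below is an illustration under invented assumptions, not a method from the book: the data are synthetic, the signal is simply x > 0.5, and roughly 20% of labels are flipped at random to act as noise. A very flexible model (copy the label of the nearest training record) is compared with a rigid one (a single cutoff).

```python
import random

def make_data(n, rng, noise=0.2):
    """x is uniform on [0, 1]; the signal is x > 0.5; each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y  # a random peculiarity, unrelated to x
        data.append((x, y))
    return data

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

def nearest_neighbor(train):
    """Flexible model: copy the label of the closest training record."""
    return lambda x: min(train, key=lambda rec: abs(rec[0] - x))[1]

def best_threshold(train):
    """Rigid model: a single cutoff chosen from a coarse grid by training accuracy."""
    grid = [i / 10 for i in range(11)]
    cutoff = max(grid, key=lambda c: accuracy(lambda x: int(x > c), train))
    return lambda x: int(x > cutoff)

rng = random.Random(1)
train, holdout = make_data(200, rng), make_data(200, rng)
for name, model in [("1-nearest neighbor", nearest_neighbor(train)),
                    ("single cutoff", best_threshold(train))]:
    print(f"{name}: train {accuracy(model, train):.2f}, holdout {accuracy(model, holdout):.2f}")
```

The nearest-neighbor model reproduces every training label, flipped ones included, so its training accuracy is perfect. Its accuracy on the fresh holdout sample is typically much lower, because the noise it memorized does not repeat.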
1.4 BIG DATA

▪ Data mining and Big Data go hand in hand. Big Data is a relative term: data today are big by reference to the past,
and to the methods and devices available to deal with them.
▪ The challenge Big Data presents is often characterized by the four V's: volume, velocity, variety, and veracity.
a. Volume refers to the amount of data.
b. Velocity refers to the flow rate, that is, the speed at which data are being generated and changed.
c. Variety refers to the different types of data being generated (currency, dates, numbers, text, etc.).
d. Veracity refers to the fact that data are being generated by organic distributed processes (e.g., millions of people signing up
for services or free downloads) and are not subject to the controls or quality checks that apply to data collected for a study.

▪ Most large organizations face both the challenge and the opportunity of Big Data because most
routine data processes now generate data that can be stored and, possibly, analyzed.
1.5 DATA SCIENCE

▪ Data science is a mix of skills in the areas of statistics, machine learning, math, programming, business, and IT.
▪ Data science itself is thus broader than the other concepts we discussed above, and it is a rare individual who
combines deep skills in all the constituent areas.
▪ Although Big Data is the motivating power behind the growth of data science, most data scientists do not actually
spend most of their time working with terabyte-size or larger data.
▪ Data of the terabyte or larger size would be involved at the deployment stage of a model. There are manifold
challenges at that stage, most of them IT and programming issues related to data-handling and tying together
different components of a system.
1.6 WHY ARE THERE SO MANY DIFFERENT METHODS?

▪ The answer is that each method has advantages and disadvantages.

▪ The usefulness of a method can depend on factors such as the size of the dataset, the types of patterns that exist
in the data, whether the data meet some underlying assumptions of the method, how noisy the data are, and the
particular goal of the analysis.
▪ Different methods can lead to different results, and their performance can vary. It is therefore customary in data
mining to apply several different methods and select the one that appears most useful for the goal at hand.
1.6 WHY ARE THERE SO MANY DIFFERENT METHODS? (CONTINUATION)

▪ The goal is to find a combination of household income level and household lot size that separates buyers (solid
circles) from nonbuyers (hollow circles) of riding mowers. The first method (left panel) looks only for horizontal
and vertical lines to separate buyers from nonbuyers, whereas the second method (right panel) looks for a single
diagonal line.

Two methods for separating owners from nonowners
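
The contrast between the two kinds of boundary can be sketched in a few lines. The eight records below are made up for illustration and are not the book's riding-mower data. For brevity, the first method is restricted to a single vertical or horizontal cutoff, and the second searches a small grid of diagonal lines; both searches are simplifications rather than the algorithms the book uses.

```python
# Hypothetical records: (income in $000s, lot size in 000s sq ft, owner flag).
records = [
    (90, 17, 1), (70, 21, 1), (110, 16, 1), (60, 23, 1),
    (50, 17, 0), (70, 15, 0), (40, 21, 0), (85, 14, 0),
]

def accuracy(rule):
    return sum(rule(inc, lot) == owner for inc, lot, owner in records) / len(records)

def midpoints(values):
    """Candidate cutoffs halfway between consecutive distinct values."""
    v = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(v, v[1:])]

# Method 1: one vertical or horizontal line (a cutoff on a single variable).
axis_rules = [(f"income > {c}", lambda inc, lot, c=c: int(inc > c))
              for c in midpoints(r[0] for r in records)]
axis_rules += [(f"lot size > {c}", lambda inc, lot, c=c: int(lot > c))
               for c in midpoints(r[1] for r in records)]
best_axis = max(axis_rules, key=lambda named_rule: accuracy(named_rule[1]))

# Method 2: one diagonal line, income + w * lot size > c, searched over a small grid.
diag_rules = []
for w in range(1, 11):
    for c in midpoints(inc + w * lot for inc, lot, _ in records):
        diag_rules.append((f"income + {w} * lot size > {c}",
                           lambda inc, lot, w=w, c=c: int(inc + w * lot > c)))
best_diag = max(diag_rules, key=lambda named_rule: accuracy(named_rule[1]))

print(best_axis[0], accuracy(best_axis[1]))
print(best_diag[0], accuracy(best_diag[1]))
```

On these made-up points, no single cutoff on one variable classifies more than 6 of the 8 records correctly, whereas a diagonal line separates them all. With a different pattern in the data the ranking could reverse, which is the reason for trying several methods.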

1.7 TERMINOLOGY AND NOTATION
Because of the hybrid parentage of data mining, its practitioners often use multiple terms to refer to the
same thing. For example, in machine learning the variable being predicted is the output variable or target variable; to a statistician, it is the dependent variable or the response. Here is a summary of terms used:

• Algorithm
• Attribute see Predictor.
• Case see Observation.
• Confidence
• Dependent Variable see Response.
• Estimation see Prediction.
• Feature see Predictor.
• Holdout Data (or holdout set)
• Input Variable see Predictor.
• Model
• Observation
• Outcome Variable see Response.
• Output Variable see Response.
• P(A|B)
• Prediction
• Predictor
• Profile
• Record see Observation.
• Response
• Sample
• Score
• Success Class
• Supervised Learning
• Target see Response.
• Test Data (or test set)
• Training Data (or training set)
• Unsupervised Learning
• Validation Data (or validation set)
• Variable
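
Several of these terms can be tied together in a short sketch. The records, field names, and 50/30/20 split below are assumptions made for illustration: each record has two predictors (income, lot_size) and a response (owner), and the records are partitioned at random into training, validation, and test sets.

```python
import random

# Hypothetical records: two predictors (income, lot_size) and a response (owner).
records = [{"income": 40 + 5 * i, "lot_size": 14 + i % 10, "owner": i % 2}
           for i in range(20)]

def partition(rows, train_pct=50, valid_pct=30, seed=1):
    """Shuffle a copy of the rows and split it into training, validation and test sets."""
    rows = rows[:]  # leave the caller's list untouched
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * train_pct // 100
    n_valid = len(rows) * valid_pct // 100
    return rows[:n_train], rows[n_train:n_train + n_valid], rows[n_train + n_valid:]

training, validation, test = partition(records)
print(len(training), len(validation), len(test))
```

A model would be fit on the training set, compared with alternatives on the validation set, and given a final assessment on the test set, so that the last estimate of performance comes from records the model has never seen.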
1.8 ROAD MAPS TO THIS BOOK

Data mining from a process perspective. Numbers in parentheses indicate chapter numbers
1.8 ROAD MAPS TO THIS BOOK (CONTINUATION)

Organization of data mining methods in this book, according to the nature of the data*
