
Lecture 2.1.1 2.1.2

The document outlines the course outcomes and content for a Business Analytics course, focusing on data warehousing and data mining concepts. It details the differences between data warehousing and data mining, as well as OLTP and OLAP systems, and introduces data marts and data lakes. Additionally, it emphasizes the importance of architectural aspects in data mining and provides references for further reading.

Uploaded by

Siddharth Singh


INSTITUTE-USB

DEPARTMENT-BBA
Bachelor of Business Administration
Business Analytics (22BAT-264)
Instructor: Ms. Nikita Bhardwaj

DISCOVER . LEARN . EMPOWER
Course Outcome
• CO1 (Level: Understand): To demonstrate the concepts and methods of business analytics and their role in business and society.
• CO2 (Level: Application): To apply data processing tools for exploratory analysis and to demonstrate their effectiveness to a diverse audience.
• CO3 (Level: Analyze): To enhance the analytical skills of students by providing knowledge of various analytical software and tools.
• CO4 (Level: Evaluate): To evaluate analytical solutions for assessing their effectiveness in the contemporary business world.
• CO5 (Level: Application): To build expertise in delivering practical solutions for complex business problems.
https://www.or.tum.de/en/home/
Contents to be Covered
• Data Warehousing
• Data Mining
• Architectural aspects of Data Mining
• Differences Between Data Warehouse and Data Mining
• OLTP (Online Transaction Processing)
• OLAP (Online Analytical Processing)
• Data Mart
• Data Lake
Data Warehousing:

• Data warehousing involves the process of collecting, storing,
and managing large volumes of data from various sources to
support decision-making processes within an organization.
• It integrates data from different operational systems into a
central repository, known as a data warehouse, which is
optimized for querying and analysis.
• The data in a warehouse is structured in a way that facilitates
reporting, analytics, and data mining activities.
• Data warehousing helps organizations to consolidate and
organize their data for efficient analysis and reporting,
providing a single source of truth for decision-making.
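The integration idea above can be illustrated with a minimal sketch. It assumes two hypothetical operational sources (a sales system and a CRM export) and an illustrative table name, fact_sales; none of these names come from the lecture. Both sources are transformed into one common layout and loaded into a single SQLite table, which is then queried once.

```python
import sqlite3

# Hypothetical records from two operational systems (illustrative data only).
sales_system = [("C001", "2025-01-10", 250.0), ("C002", "2025-01-11", 90.0)]
crm_export = [{"customer": "c001", "date": "2025-01-12", "amount": "40.50"}]


def build_warehouse(conn):
    """Create one central table and load both sources in a common format."""
    conn.execute(
        "CREATE TABLE fact_sales "
        "(customer_id TEXT, sale_date TEXT, amount REAL, source TEXT)"
    )
    # Source 1 already matches the target layout.
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, 'sales_system')", sales_system
    )
    # Source 2 needs transformation: upper-case IDs and numeric amounts.
    cleaned = [
        (r["customer"].upper(), r["date"], float(r["amount"])) for r in crm_export
    ]
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, 'crm_export')", cleaned
    )
    conn.commit()


def total_by_customer(conn):
    """Run one query over the consolidated data, whatever its original source."""
    rows = conn.execute(
        "SELECT customer_id, SUM(amount) FROM fact_sales "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")
build_warehouse(conn)
totals = total_by_customer(conn)
```

Here customer C001 appears in both sources under different spellings, and the warehouse reports a single combined total for it.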
Data Mining:
• Data mining is the process of discovering patterns, trends, and
insights from large datasets using various statistical,
mathematical, and machine learning techniques.
• It involves extracting useful information and knowledge from
the data stored in a data warehouse or other repositories.
• Data mining techniques can be used to uncover hidden patterns,
relationships, and correlations within the data, which can then be
utilized for predictive analysis, anomaly detection, and other
decision-making tasks.
• Data mining algorithms can identify meaningful patterns in data
that may not be immediately apparent, helping organizations
gain valuable insights and make informed decisions.
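As one small illustration of anomaly detection, the sketch below flags values that lie far from the mean in units of standard deviation (a z-score rule). The function name, the threshold of 2.0, and the sample sales figures are assumptions chosen for illustration; practical data mining tools use richer methods.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical daily sales with one unusual day.
daily_sales = [100, 102, 98, 101, 99, 103, 97, 100, 250]
anomalies = find_anomalies(daily_sales)
```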
January 23, 2025 Data Mining: Concepts and Techniques
Architectural aspects of Data Mining
• The architectural aspects of data mining involve the design and implementation of systems and
components that support the data mining process. Here are the key architectural aspects:
1. Data Sources:
• Data mining systems typically begin with various data sources, including databases, data
warehouses, data lakes, flat files, APIs, and external data sources.
• Architectural considerations involve identifying and accessing relevant data sources, ensuring data
quality, and integrating data from disparate sources into a unified data environment.
2. Data Preprocessing:
• Before data mining algorithms can be applied, data preprocessing is necessary to clean, transform,
and prepare the data for analysis.
• Architectural aspects include defining preprocessing tasks such as missing value imputation,
outlier detection, data normalization, feature selection, and dimensionality reduction.
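Two of the preprocessing tasks named above, missing value imputation and normalization, can be sketched in a few lines. This is a simplified illustration under stated assumptions: missing values are represented as None, imputation uses the mean of the observed values, and normalization is min-max scaling to the range 0 to 1. The function names are hypothetical.

```python
from statistics import mean


def impute_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # avoid division by zero on constant data
    return [(v - lo) / (hi - lo) for v in values]


raw_ages = [20, None, 40, 60]
imputed = impute_missing(raw_ages)
normalized = min_max_normalize(imputed)
```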

3. Data Mining Algorithms:
• Data mining algorithms are the core components of a data mining system, responsible for discovering
patterns, trends, and insights from the data.
• Architectural considerations involve selecting appropriate algorithms based on the characteristics of the
data and the objectives of the analysis, implementing algorithms efficiently, and integrating them into the
system.
4. Model Evaluation and Validation:
• Data mining models need to be evaluated and validated to assess their performance and reliability.
• Architectural aspects include defining evaluation metrics, designing validation procedures such as
cross-validation or holdout validation, and implementing mechanisms for comparing and selecting the
best-performing models.
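Cross-validation can be sketched as follows. The example splits the data into k contiguous folds (no shuffling, which is a simplification), holds out each fold in turn, and averages the error. The "model" is only a baseline that predicts the mean of the training targets, and the metric is mean absolute error; both are assumptions chosen to keep the sketch short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def cross_validate_mean_model(y, k=3):
    """Average mean absolute error of a 'predict the training mean' baseline."""
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        prediction = sum(y[i] for i in train_idx) / len(train_idx)
        mae = sum(abs(y[i] - prediction) for i in test_idx) / len(test_idx)
        fold_errors.append(mae)
    return sum(fold_errors) / len(fold_errors)


targets = [10, 12, 11, 13, 12, 14]
cv_error = cross_validate_mean_model(targets, k=3)
```

Comparing cv_error across candidate models is one simple way to select the best-performing one.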
5. Model Deployment:
• Once data mining models are developed and validated, they need to be deployed into production systems
for real-world use.
• Architectural considerations involve integrating models into operational systems, designing interfaces for
model input and output, monitoring model performance, and updating models as new data becomes
available.

6. Scalability and Performance:
• Architectural aspects of data mining systems include considerations for scalability and
performance to handle large volumes of data and complex analytical tasks efficiently.
• This may involve distributed computing architectures, parallel processing, optimization
techniques, and resource management strategies to ensure scalability and performance.
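The partition-and-aggregate pattern behind parallel processing can be sketched with Python's standard library. The data is split into chunks, each chunk is summarized by a worker, and the partial results are combined. The function names and the worker count are illustrative assumptions. A thread pool is used here only to keep the sketch simple to run; for CPU-bound work, a process pool or a distributed framework is the more usual choice.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, n_parts):
    """Split data into at most n_parts contiguous chunks of similar size."""
    if not data:
        return []
    size = -(-len(data) // n_parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def partial_sum(chunk):
    """Work done independently on one chunk."""
    return sum(chunk)


def parallel_total(data, n_workers=4):
    """Summarize each chunk in a worker, then combine the partial results."""
    chunks = partition(data, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(partial_sum, chunks))


transactions = list(range(1, 1001))
total = parallel_total(transactions)
```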
7. Integration with Business Processes:
• Data mining systems need to be integrated with existing business processes and
decision-making workflows to derive actionable insights and value.
• Architectural considerations involve aligning data mining activities with business objectives,
providing tools and interfaces for decision-makers to access and interpret results, and
incorporating feedback loops for continuous improvement.

Differences Between Data Warehouse and Data Mining
Purpose:
• Data warehousing focuses on the process of storing and managing data to support
reporting and analysis.
• Data mining focuses on extracting insights and knowledge from data through advanced
analytical techniques.
Activities:
• Data warehousing involves data integration, transformation, and storage in a central
repository.
• Data mining involves pattern recognition, predictive modeling, and knowledge discovery
from the stored data.
Goal:
• The goal of data warehousing is to provide a centralized and structured repository of data
for analysis and reporting purposes.
• The goal of data mining is to uncover hidden patterns and insights within the data that can
be used for decision-making and predictive analysis.

Output:
• Data warehousing produces structured data repositories optimized for querying and
reporting.
• Data mining produces insights, patterns, and models derived from the data analysis
process.
Techniques:
• Data warehousing primarily involves data integration, schema design, and data
storage techniques.
• Data mining involves various statistical, mathematical, and machine learning
techniques such as clustering, classification, regression, and association rule
mining.
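Association rule mining, one of the techniques listed above, rests on two standard measures: the support of an itemset (the fraction of transactions containing it) and the confidence of a rule A → B (the support of A and B together divided by the support of A). The sketch below computes both on a small, made-up set of shopping baskets; the function names and data are illustrative assumptions.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate how often the consequent appears when the antecedent does."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never fires, so report zero confidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
bread_milk_support = support(baskets, {"bread", "milk"})
bread_to_milk_confidence = confidence(baskets, {"bread"}, {"milk"})
```

In this data, bread and milk appear together in two of four baskets, and two of the three baskets containing bread also contain milk.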
OLTP (Online Transaction Processing):

• OLTP is a type of database processing that focuses on managing and executing high-volume
transactional workloads in real-time.
• It is optimized for handling a large number of short, atomic transactions such as inserting,
updating, and deleting records in a database.
• OLTP systems are designed to ensure data integrity, concurrency control, and high availability to
support day-to-day operational activities of an organization.
• These systems typically have normalized database schemas to minimize redundancy and
maintain consistency in transaction processing.
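A short, atomic transaction with an integrity constraint can be sketched using Python's built-in sqlite3 module. The inventory table, its CHECK constraint, and the transfer_stock function are hypothetical names chosen for illustration. The two updates either both take effect or neither does: if the second update violates the constraint, the first is rolled back.

```python
import sqlite3


def transfer_stock(conn, from_item, to_item, qty):
    """Move qty units between two rows as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE inventory SET qty = qty + ? WHERE item = ?", (qty, to_item)
            )
            conn.execute(
                "UPDATE inventory SET qty = qty - ? WHERE item = ?", (qty, from_item)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # constraint violated; no partial update remains


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?)", [("warehouse_a", 10), ("warehouse_b", 0)]
)
conn.commit()

ok = transfer_stock(conn, "warehouse_a", "warehouse_b", 4)
rejected = transfer_stock(conn, "warehouse_a", "warehouse_b", 50)
stock = dict(conn.execute("SELECT item, qty FROM inventory"))
```

The second call asks for more stock than exists, so the whole transaction is undone and the table keeps the state left by the first call.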
OLAP (Online Analytical Processing):
• OLAP is a technology used for querying, analyzing, and aggregating large
volumes of data to facilitate decision-making and business intelligence
activities.
• It enables users to perform complex multidimensional analysis, drill-down, and
slice-and-dice operations on data stored in a data warehouse or a
multidimensional database.
• OLAP systems provide fast query performance and support advanced analytical
functions such as data mining, forecasting, and trend analysis.
• These systems often utilize denormalized or star/snowflake schema designs to
optimize query performance and enable efficient analytical processing.
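A tiny star schema makes the OLAP operations concrete. In the sketch below, one fact table (fact_sales) references two dimension tables (dim_product and dim_date); all table names, columns, and figures are illustrative assumptions. The first query aggregates revenue along the product dimension, and the second takes a slice by fixing the quarter to a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE fact_sales (product_id INTEGER, date_id INTEGER, revenue REAL);
INSERT INTO dim_product VALUES (1, 'Laptops'), (2, 'Phones');
INSERT INTO dim_date VALUES (1, 2024, 'Q1'), (2, 2024, 'Q2');
INSERT INTO fact_sales VALUES (1, 1, 1000), (1, 2, 1500), (2, 1, 700), (2, 2, 300);
""")

# The fact table joined to its dimensions: the core of a star-schema query.
STAR_JOIN = """
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_date d ON d.date_id = f.date_id
"""


def revenue_by_category(conn):
    """Aggregate revenue over all dates, grouped by product category."""
    rows = conn.execute(
        "SELECT p.category, SUM(f.revenue)" + STAR_JOIN + "GROUP BY p.category"
    )
    return dict(rows.fetchall())


def revenue_by_category_for_quarter(conn, quarter):
    """Slice: fix the date dimension to one quarter, then aggregate."""
    rows = conn.execute(
        "SELECT p.category, SUM(f.revenue)" + STAR_JOIN
        + "WHERE d.quarter = ? GROUP BY p.category",
        (quarter,),
    )
    return dict(rows.fetchall())


all_quarters = revenue_by_category(conn)
q1_only = revenue_by_category_for_quarter(conn, "Q1")
```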

Differences Between OLTP and OLAP:
1. Purpose:
• OLTP systems are designed for transaction processing, focusing on capturing and managing day-to-day operational data.

• OLAP systems are designed for analytical processing, supporting complex querying and analysis of historical and aggregated data.

2. Workload:
• OLTP systems handle a large number of short, atomic transactions involving insertions, updates, and deletions of data.

• OLAP systems handle analytical queries that involve aggregating, summarizing, and analyzing large volumes of historical data.

3. Data Structure:
• OLTP systems typically have normalized database schemas to minimize redundancy and ensure data integrity.

• OLAP systems often use denormalized or star/snowflake schema designs to optimize query performance and facilitate complex analysis.

4. Query Complexity:
• OLTP queries are typically simple and focused on retrieving or modifying individual records or small subsets of data.

• OLAP queries are more complex and involve aggregating, grouping, and analyzing data across multiple dimensions to derive insights.

5. Usage:
• OLTP systems are used for day-to-day operational activities such as order processing, inventory management, and customer transactions.

• OLAP systems are used for business intelligence, reporting, and decision support activities such as sales analysis, financial reporting, and
market trend analysis.
Data Mart:
• A data mart is a subset of a data warehouse that is focused on a
specific area or department within an organization.
• It contains a smaller, more focused set of data that is tailored to
meet the needs of a particular group of users, such as sales,
marketing, or finance.
• Data marts are typically designed to support specific business
functions or analytical requirements, providing users with easy
access to relevant data for analysis and reporting.
• They are often created using a top-down approach, where data is
extracted from the central data warehouse and transformed to meet
the specific needs of the target business unit or department.
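The top-down approach can be sketched as a small extract-and-transform step from a central warehouse table into a department-specific table. The table names (warehouse_sales, mart_marketing), the columns, and the build_data_mart function are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE warehouse_sales (region TEXT, department TEXT, amount REAL);
INSERT INTO warehouse_sales VALUES
    ('North', 'marketing', 120.0),
    ('South', 'marketing', 80.0),
    ('North', 'finance', 300.0);
""")


def build_data_mart(conn, department):
    """Extract one department's rows and store them pre-aggregated by region."""
    if not department.isidentifier():
        raise ValueError("department must be a simple name")
    table = f"mart_{department}"
    conn.execute(f"CREATE TABLE {table} (region TEXT, total REAL)")
    conn.execute(
        f"INSERT INTO {table} "
        "SELECT region, SUM(amount) FROM warehouse_sales "
        "WHERE department = ? GROUP BY region",
        (department,),
    )
    conn.commit()
    return table


mart = build_data_mart(conn, "marketing")
mart_rows = dict(conn.execute(f"SELECT region, total FROM {mart}"))
```

The resulting table holds only marketing data, already summarized in the form that department needs.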
Data Lake:
• A data lake is a centralized repository that allows organizations to store all their
structured and unstructured data at any scale.
• It enables organizations to store data in its raw format, without the need for
extensive pre-processing or schema design, making it suitable for storing
diverse types of data, such as text, images, videos, and sensor data.
• Data lakes are designed to support a wide range of use cases, including data
exploration, advanced analytics, machine learning, and data discovery.
• They are often built using scalable distributed storage systems such as Hadoop
Distributed File System (HDFS) or cloud-based storage solutions like Amazon
S3 or Azure Data Lake Storage.
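Storing data in its raw format and applying structure only at read time can be sketched with a local temporary directory standing in for the lake. This is a simplified stand-in, not how HDFS or cloud object storage is accessed; the file names, the "temp" field, and both function names are illustrative assumptions.

```python
import json
import tempfile
from pathlib import Path


def land_raw(lake_dir, name, payload):
    """Store bytes exactly as received; no schema is enforced on write."""
    path = Path(lake_dir) / name
    path.write_bytes(payload)
    return path


def read_sensor_readings(lake_dir):
    """Apply a schema only when reading: keep records with a numeric 'temp'."""
    readings = []
    for path in sorted(Path(lake_dir).glob("*.jsonl")):
        for line in path.read_text().splitlines():
            record = json.loads(line)
            if isinstance(record.get("temp"), (int, float)):
                readings.append(float(record["temp"]))
    return readings


with tempfile.TemporaryDirectory() as lake:
    # Diverse raw inputs land side by side, including malformed records.
    land_raw(lake, "sensor.jsonl", b'{"temp": 21.5}\n{"temp": "n/a"}\n{"humidity": 40}\n')
    land_raw(lake, "notes.txt", b"free-form text stays untouched")
    readings = read_sensor_readings(lake)
    stored_files = sorted(p.name for p in Path(lake).iterdir())
```

Both files are kept as they arrived; only the reader decides which records are usable for a given analysis.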

Differences Between Data Mart and Data Lake:
Structure:

• Data marts are structured repositories that contain curated and pre-processed
data, typically organized around specific business functions or departments.
• Data lakes store structured, semi-structured, and unstructured data in its raw,
native format, without the need for extensive schema design or
data modeling.
Scope:

• Data marts have a narrow scope and are focused on specific business areas
or departments within an organization.
• Data lakes have a broader scope and can store data from multiple sources
and domains, catering to a wide range of analytical and business needs.
Data Processing:
• Data marts involve the extraction, transformation, and loading (ETL) of data from the central
data warehouse or operational systems to create curated datasets.
• Data lakes store raw data in its native format, allowing for on-demand processing and analysis
using tools and technologies such as Apache Spark, Hadoop, or cloud-based analytics services.
Use Cases:
• Data marts are used for structured reporting, ad-hoc querying, and analysis within specific
business units or departments.
• Data lakes are used for exploratory data analysis, advanced analytics, machine learning, and data
science initiatives that require access to raw and diverse datasets.
Agility and Scalability:
• Data marts are relatively rigid and may require significant effort to modify or extend to
accommodate new data sources or analytical requirements.
• Data lakes are highly flexible and scalable, allowing organizations to store and analyze vast
amounts of data from diverse sources with minimal constraints on schema design or data
processing.
References

• TEXT BOOKS
Introduction to Data Mining, Tan, Steinbach and Vipin Kumar, Pearson Education, 2016
• REFERENCE BOOKS
Data Mining: Concepts and Techniques, Pei, Han and Kamber, Elsevier (2nd edition)
• Journals
https://www.sciencedirect.com/topics/computer-science/data-generalization
Thank you

Please Send Your Queries on:

e-Mail: [email protected]
