DWM Important Answer
1. Data Sources:
a. Integration involves pulling data from multiple sources such as
transactional databases, flat files, spreadsheets, cloud storage, and
external data providers.
2. ETL Processes:
a. Extract, Transform, Load (ETL): This is the core process used for data
integration in a data warehouse. ETL involves:
i. Extract: Retrieving data from various source systems.
ii. Transform: Cleaning, validating, and transforming data into a
consistent format. This may involve data cleansing,
deduplication, normalization, and applying business rules.
iii. Load: Storing the transformed data into the data warehouse (a minimal ETL sketch follows this list).
3. Data Quality:
a. Ensuring data quality is a critical part of integration. This involves
removing errors, inconsistencies, and duplicates, and ensuring data
conforms to defined standards and formats.
4. Data Consistency:
a. Integration processes ensure that data from different sources is
consistent, meaning that it uses the same definitions,
measurements, and formats across the entire data warehouse.
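A minimal ETL sketch in Python, intended only to illustrate the extract-transform-load flow described above; the file name, column names, and cleaning rules are assumptions invented for the example, not part of any particular warehouse:

import csv
import sqlite3

# Extract: read raw records from an assumed source file
def extract(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

# Transform: cleanse, deduplicate, and normalise into a consistent format
def transform(rows):
    seen, cleaned = set(), []
    for row in rows:
        if not row.get("customer_id"):      # drop records with a missing key
            continue
        key = (row["customer_id"], row["order_date"])
        if key in seen:                     # deduplicate
            continue
        seen.add(key)
        cleaned.append({
            "customer_id": row["customer_id"].strip(),
            "order_date": row["order_date"],
            "amount": round(float(row["amount"]), 2),   # normalise numeric format
        })
    return cleaned

# Load: store the transformed rows into a warehouse table
def load(rows, db_path="warehouse.db"):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS sales_fact "
                 "(customer_id TEXT, order_date TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales_fact VALUES (:customer_id, :order_date, :amount)", rows)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract("sales_source.csv")))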
1. Data Preparation:
a. Data stored in the data warehouse is often pre-processed and
cleaned, which is essential for effective data mining. This
preparation includes handling missing values, treating outliers,
and ensuring data consistency.
2. Pattern Discovery:
a. Data mining techniques are used to discover patterns and
relationships within the data. Common techniques include
clustering, classification, regression, association rule learning, and
anomaly detection.
3. Predictive Modeling:
a. Data mining involves building models that can predict future trends
or behaviors based on historical data. For example, predicting
customer churn, sales forecasting, or identifying fraudulent
transactions.
4. Data Analysis Techniques:
a. Clustering: Grouping similar data points together based on specific
characteristics.
b. Classification: Assigning data points to predefined categories or
classes (see the short sketch after this list).
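A brief sketch of the two techniques named above, assuming scikit-learn is available; the toy points and labels are invented purely for illustration:

from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

# Toy 2-D points (invented for illustration)
points = [[1, 2], [1, 4], [8, 8], [9, 7], [2, 1], [8, 9]]

# Clustering: group similar points into 2 clusters without using labels
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)
print("Cluster assignments:", clusters)

# Classification: learn to assign points to the predefined classes "A" / "B"
labels = ["A", "A", "B", "B", "A", "B"]
clf = DecisionTreeClassifier().fit(points, labels)
print("Predicted class for (2, 2):", clf.predict([[2, 2]])[0])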
2). Write short note on multilevel and multidimensional association rule.
Ans: Association rules are a fundamental concept in data mining that identify
relationships between items in large datasets; they help uncover patterns such as
the co-occurrence of items within transactions. Multilevel association rules are
mined at multiple levels of a concept hierarchy (for example, "milk → bread" at a
high level and "skimmed milk → wheat bread" at a lower level), usually with a
reduced minimum support at the lower levels. Multidimensional association rules
involve two or more dimensions or predicates, such as age(X, "20-29") AND
income(X, "low") → buys(X, "laptop"), rather than repeating a single predicate.
Components:
• Antecedent (LHS): The item or set of items found on the left-hand side of
the rule.
• Consequent (RHS): The item or set of items found on the right-hand side
of the rule.
• Support: The proportion of transactions in the dataset that contain the itemset.
• Confidence: The likelihood that the consequent appears in transactions
containing the antecedent.
• Lift: The ratio of the observed support to that expected if the items were
independent.
Example:
• Rule: If a customer buys bread (antecedent), they are likely to buy butter
(consequent).
• Support: 10% of all transactions include both bread and butter.
• Confidence: 70% of transactions that include bread also include butter.
• Lift: A lift of 2 means customers who buy bread are twice as likely to buy
butter as would be expected by chance.
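A small Python sketch of how support, confidence, and lift can be computed from a transaction list; the five transactions are invented for illustration:

# Invented toy transactions for illustration
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset):
    # Fraction of transactions that contain every item in the itemset
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    # Of the transactions containing the antecedent, the fraction that also contain the consequent
    return support(antecedent | consequent) / support(antecedent)

def lift(antecedent, consequent):
    # Observed co-occurrence relative to what independence would predict
    return confidence(antecedent, consequent) / support(consequent)

lhs, rhs = {"bread"}, {"butter"}
print("support    =", support(lhs | rhs))      # 0.6  (3 of 5 transactions)
print("confidence =", confidence(lhs, rhs))    # 0.75 (3 of the 4 bread transactions)
print("lift       =", lift(lhs, rhs))          # 0.9375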
Correlation Rules
Support and confidence alone can be misleading, so correlation analysis adds a
measure of statistical correlation (such as lift or the chi-square test) to check
whether the antecedent and consequent really occur together more often than
chance would suggest.
Example: Consider a dataset of students' hours studied and their exam scores; if
scores tend to rise as hours studied rise, the two attributes are positively correlated.
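A minimal sketch of measuring such a correlation in Python (statistics.correlation requires Python 3.10+); the hours and scores below are invented toy values:

from statistics import correlation   # Pearson correlation, Python 3.10+

# Invented toy data: hours studied vs exam score
hours  = [1, 2, 3, 4, 5, 6]
scores = [35, 45, 50, 62, 70, 78]

# Coefficient near +1 means strong positive correlation,
# near 0 means no linear correlation, near -1 means negative correlation
r = correlation(hours, scores)
print(round(r, 3))   # close to 1, i.e. strongly positively correlated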
Example Dataset
Assume we have the following dataset with two features (X1 and X2) and a class
label, and a new data point (4, 4) that is to be classified using KNN with k = 3:
ID   X1   X2   Class
1    1    2    A
2    2    3    A
3    3    3    B
4    5    5    B
5    8    8    B
6    6    7    A
1. Calculate the distance between the new data point and all points in the
dataset.
2. Sort the distances and determine the k-nearest neighbors.
3. Count the class labels of the k-nearest neighbors.
4. Assign the class label that appears most frequently among the k-nearest
neighbors to the new data point.
We'll use the Euclidean distance formula, d = √((x1 − x2)² + (y1 − y2)²), to calculate
the distance between the new data point (4, 4) and each point in the dataset. For
example, the distance to ID 3 at (3, 3) is √((4 − 3)² + (4 − 3)²) = √2 ≈ 1.41.
Sort the distances in ascending order and identify the k-nearest neighbors (k = 3):
ID   Distance   Class
3    1.41       B
4    1.41       B
2    2.24       A
1    3.61       A
6    3.61       A
5    5.66       B
The 3 nearest neighbors are:
• ID 3: Class B
• ID 4: Class B
• ID 2: Class A
Counting the class labels among these neighbors:
• Class B: 2 neighbors
• Class A: 1 neighbor
The class label that appears most frequently among the 3 nearest neighbors is B.
Conclusion
The new data point (4,4) is classified as Class B using the KNN algorithm with k=3.
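A minimal Python sketch that reproduces steps 1-4 above for the point (4, 4) with k = 3, written from scratch rather than with any particular library:

import math
from collections import Counter

# Dataset from the example: (ID, X1, X2, Class)
data = [
    (1, 1, 2, "A"),
    (2, 2, 3, "A"),
    (3, 3, 3, "B"),
    (4, 5, 5, "B"),
    (5, 8, 8, "B"),
    (6, 6, 7, "A"),
]

def knn_classify(new_point, k=3):
    # Step 1: Euclidean distance from the new point to every record
    distances = [(math.dist(new_point, (x1, x2)), label)
                 for _, x1, x2, label in data]
    # Step 2: sort the distances and keep the k nearest neighbors
    nearest = sorted(distances)[:k]
    # Steps 3-4: majority vote among the neighbors' class labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify((4, 4), k=3))   # prints "B", matching the worked answer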
6). Difference between classification & prediction.
Ans: Classification and prediction are both key tasks in machine learning and data
analysis, but they serve different purposes and have distinct characteristics. Here's
a breakdown of the differences between classification and prediction:
Classification:
5. For example, grouping patients into known categories based on their medical
records can be considered a classification task.
Prediction:
2. In prediction, accuracy depends on how well a given predictor can estimate
the value of the predicted attribute for new data.
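A short sketch contrasting the two tasks, assuming scikit-learn is available; the feature (weekly usage hours) and the targets are invented for illustration:

from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression

# Invented toy data: one feature, hours of product usage per week
hours = [[1], [2], [3], [8], [9], [10]]

# Classification: the target is a discrete category (will the customer churn?)
churn_labels = ["yes", "yes", "yes", "no", "no", "no"]
classifier = DecisionTreeClassifier().fit(hours, churn_labels)
print(classifier.predict([[2.5]]))   # outputs a class label such as "yes"

# Prediction: the target is a continuous value (next month's spend)
spend = [10.0, 12.0, 15.0, 40.0, 44.0, 50.0]
regressor = LinearRegression().fit(hours, spend)
print(regressor.predict([[2.5]]))    # outputs a numeric estimate, not a category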
Ans: Data warehouse architecture refers to the design and structure of a data
warehouse, which is a central repository for integrated data from various sources,
used for analysis and reporting. The architecture of a data warehouse typically
follows one of several established models, but the most common and widely
accepted is the three-tier architecture. Here's a detailed look at the components of
this architecture:
1. Bottom Tier (Data Sources and ETL): This tier involves the extraction,
transformation, and loading (ETL) processes, which are crucial for preparing the
data for analysis. It includes:
• Data Sources: These can be operational databases, external data sources, flat
files, or any other form of data storage that feeds into the data warehouse.
Sources can be both structured (like relational databases) and unstructured
(like log files or social media data).
• ETL Processes: Extract data from the sources, transform it into a consistent
format (cleansing, deduplication, applying business rules), and load it into the
warehouse.
2. Middle Tier (Data Warehouse and Data Marts): This tier is where the core of the
data warehouse resides. It typically involves:
• Data Marts: These are subsets of the data warehouse, tailored to specific
business lines or departments. Data marts can be dependent (sourced
directly from the data warehouse) or independent (sourced directly from
operational systems).
3. Top Tier (Front-End Tools): The top tier involves the tools and interfaces that
end-users interact with to perform analysis, reporting, and data mining. It includes:
• Query and Reporting Tools: Tools like SQL-based query tools, business
intelligence (BI) platforms (e.g., Tableau, Power BI), and reporting software
(e.g., Crystal Reports) that allow users to generate and view reports (a small
query sketch follows this list).
• OLAP Tools: Tools that enable users to interact with OLAP cubes, perform
multidimensional analysis, and generate detailed reports.
• Data Mining Tools: Advanced analytical tools that use statistical and machine
learning techniques to discover patterns and insights from the data.
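As an illustration of the kind of query a reporting or BI tool would run against the warehouse, a small Python sketch using sqlite3; the sales_fact table and its columns are assumptions carried over from the ETL sketch earlier, not a prescribed schema:

import sqlite3

conn = sqlite3.connect("warehouse.db")   # assumed warehouse file from the ETL sketch

# A typical reporting query: total sales amount per customer, highest first
report = conn.execute(
    "SELECT customer_id, SUM(amount) AS total_amount "
    "FROM sales_fact "
    "GROUP BY customer_id "
    "ORDER BY total_amount DESC"
).fetchall()

for customer_id, total_amount in report:
    print(customer_id, total_amount)

conn.close()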