DataWareHouse Notes

Uploaded by

krish2021

Data warehouse (DWH):- System that aggregates data from multiple sources into a central repository of structured data to support analytics (OLAP, Online Analytical Processing). Supports ML, AI, data mining, OLAP and reporting.

Another def:- Subject/business-oriented (customer/supplier/product/sales etc.), integrated (data collected from
multiple data sources), time-variant (data collected over a period of time) and non-volatile (existing data is not changed,
new data is only appended) collection of data to support the mgmt. decision-making process.

DWH is offered as appliances, on-cloud, on-premises and mixed (hybrid) solutions by IBM, Oracle, Microsoft, Amazon, Google etc.

Data marts:- domain/user/business-function-specific repository (Types: independent, dependent, hybrid). Data repository with a
specific schema, for ease of retrieval and for analytics.

Data lake:- Repository of raw data in its native form without any preprocessing. For structured, semi-structured and
unstructured data. Cons: data duplication leads to excess storage use and lower data quality.

Data lakehouse:- Combines the pros of both DWH and data lake: schema-governed data with better data quality at lower
storage cost.

FACT and Dimension tables:-

FACT- quantitative (often aggregated) measures of business processes; contains foreign keys to dimension tables

DIMENSION- categorical variables used to filter and group fact data; describes business entities
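
A minimal sketch of how a fact table and its dimension tables fit together, using Python's built-in sqlite3 module with an in-memory database. The table names (fact_sales, dim_product, dim_store), columns and sample rows are all made up for illustration and do not come from these notes.

```python
import sqlite3

# Hypothetical star schema: one fact table plus two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),  -- foreign key
    store_id   INTEGER REFERENCES dim_store(store_id),      -- foreign key
    amount     REAL                                         -- the measure
);
INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Toys');
INSERT INTO dim_store   VALUES (10, 'Pune'), (20, 'Delhi');
INSERT INTO fact_sales  VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 30.0);
""")

# The dimension supplies the grouping attribute; the fact supplies the measure.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

The fact table holds only keys and numbers, so the query has to join to a dimension to get a human-readable attribute to group by.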

Data is modeled into a FLAT, STAR or SNOWFLAKE schema depending upon the storage/query processing
requirements.

Why do we use these schemas, and how do they differ?

Star schemas are optimized for reads and are widely used for designing data marts (query boost), whereas snowflake
schemas, which normalize the dimension tables, are optimized for writes and are widely used for transactional data
warehousing (write/size boost).
Normalization reduces redundancy and data size (normal forms 1NF to 5NF), at the cost of extra joins at query time.

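Data Cube Representation:-

Slicing- one dimension is fixed to a single value, cutting one layer out of the cube

Dicing- the large cube is filtered on several dimensions into a smaller sub-cube

Drill up and down- move up to more summarized levels or down to more detailed levels of a dimension hierarchy

Pivoting- rotate the cube to rearrange the view

Rolling up- summarize data along a dimension using aggregate functions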
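
The cube operations above can be sketched in plain Python. This is only an illustration under assumed data: the cube is a dictionary keyed by hypothetical (year, region, product) coordinates, and the helper names slice_cube, dice_cube and roll_up are invented here.

```python
from collections import defaultdict

# Hypothetical cube: (year, region, product) -> sales
DIMS = ("year", "region", "product")
cube = {
    (2023, "East", "Books"): 10,
    (2023, "West", "Books"): 20,
    (2023, "East", "Toys"): 5,
    (2024, "East", "Books"): 7,
    (2024, "West", "Toys"): 8,
}

def slice_cube(cube, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == value}

def dice_cube(cube, **allowed):
    """Dice: keep cells whose coordinates fall in the allowed sets."""
    return {
        k: v for k, v in cube.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }

def roll_up(cube, keep):
    """Roll up: sum away every dimension that is not listed in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for k, v in cube.items():
        out[tuple(k[i] for i in idx)] += v
    return dict(out)

slice_2023 = slice_cube(cube, "year", 2023)
diced = dice_cube(cube, year={2023}, region={"East"})
by_year = roll_up(cube, ["year"])
print(by_year)
```

Drilling down is the reverse of roll_up: call it again with more dimensions in keep, for example ["year", "region"].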

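1. GROUPING SETS- subtotals for each explicitly requested combination of columns

2. CUBE- subtotals for every combination of the listed columns, plus the grand total
3. ROLLUP- subtotals along the hierarchy of the listed columns (left to right), plus the grand total
4. Materialized Views:- stored snapshot of the results of a SQL query; used to replicate data (e.g. in a staging database) or to
precompute expensive queries for the DWH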
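
To make the difference between GROUPING SETS, ROLLUP and CUBE concrete, here is a small pure-Python sketch that mimics what the SQL clauses compute. It is not how any database implements them; the sample rows and the function names (grouping_sets, rollup_sets, cube_sets) are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales rows.
rows = [
    {"region": "East", "product": "Books", "sales": 10},
    {"region": "East", "product": "Toys", "sales": 5},
    {"region": "West", "product": "Books", "sales": 20},
]

def grouping_sets(rows, sets, measure="sales"):
    """Sum the measure once per requested column combination."""
    result = {}
    for cols in sets:
        totals = defaultdict(int)
        for r in rows:
            totals[tuple(r[c] for c in cols)] += r[measure]
        result[tuple(cols)] = dict(totals)
    return result

def rollup_sets(cols):
    """ROLLUP(a, b) expands to the grouping sets (a, b), (a), ()."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube_sets(cols):
    """CUBE(a, b) expands to the grouping sets (a, b), (a), (b), ()."""
    return [c for n in range(len(cols), -1, -1) for c in combinations(cols, n)]

rollup_result = grouping_sets(rows, rollup_sets(["region", "product"]))
cube_result = grouping_sets(rows, cube_sets(["region", "product"]))
print(rollup_result[("region",)])   # subtotals per region
print(cube_result[("product",)])    # subtotals per product (CUBE only)
print(cube_result[()])              # grand total
```

ROLLUP and CUBE are shorthand for particular lists of grouping sets: ROLLUP follows the column order as a hierarchy, while CUBE takes every subset of the columns.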

DWH architecture:-

Data sources (DB, data lakes, ERP, OLTPs) → ETL processing w/o staging area → DWH → Data mart → Reporting/analytical tools

Data Quality concerns:-

• Accuracy (match b/w src and target systems)

• Completeness (missing, null, invalid values)
• Consistency (datatypes, data fields, names etc.)
• Currency (up-to-date information)

Managing DQ:- Detect → Capture → Report → Investigate → Diagnose → Correct, and then automate the workflows
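
As a rough illustration of the "Detect" step, the sketch below runs one simple check per data quality concern against a handful of rows. The row layout, the check_quality function, the 7-day freshness threshold and the sample data are all assumptions for this example; real checks depend on the source and target systems involved.

```python
from datetime import date

# Hypothetical source and target rows for the same table.
source = [
    {"id": 1, "amount": 100.0, "loaded": date(2024, 1, 2)},
    {"id": 2, "amount": 50.0, "loaded": date(2024, 1, 2)},
]
target = [
    {"id": 1, "amount": 100.0, "loaded": date(2024, 1, 2)},
    {"id": 2, "amount": None, "loaded": date(2023, 12, 1)},
]

def check_quality(source, target, as_of, max_age_days=7):
    """Return (row id, concern) pairs for every problem detected in target."""
    src = {r["id"]: r for r in source}
    issues = []
    for row in target:
        if row["amount"] is None:
            issues.append((row["id"], "completeness"))   # missing value
        elif not isinstance(row["amount"], float):
            issues.append((row["id"], "consistency"))    # unexpected datatype
        elif row["id"] in src and row["amount"] != src[row["id"]]["amount"]:
            issues.append((row["id"], "accuracy"))       # src/target mismatch
        if (as_of - row["loaded"]).days > max_age_days:
            issues.append((row["id"], "currency"))       # stale record
    return issues

issues = check_quality(source, target, as_of=date(2024, 1, 5))
print(issues)
```

The returned list is what would feed the later Capture and Report steps.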

1.
Question 1
What do we call a normalized version of the star schema?
1 / 1 point
Product schema
Normalized schema
Parent dimension
Snowflake schema
Correct
Correct, the normalized version of the star schema is called a snowflake schema, due to its multiple layers of
branching which resembles a snowflake pattern.
2.
Question 2
Considering a general architectural model for an Enterprise Data Warehouse, which of these components is holding
data and developing workflows?
1 / 1 point
Enterprise data warehouse repository
Staging and sandbox areas
Data sources
Data marts
Correct
Correct, these components are holding data and developing workflows.
3.
Question 3
Materialized Views can be set up to have different refresh options, such as: (Select 1 answer).
1 / 1 point
Populated
Never, upon request, and immediately
Automatically
Manually refresh
Correct
Materialized Views can be set up to have different refresh options, such as “never” (they are only populated when
created, which is useful if the data seldom changes), “upon request” (manually refresh, for example, after changes
to the data have been made, or scheduled refresh, for example, after daily data loads), and “immediately”
(automatically refresh after every statement).
4.
Question 4
Accumulating snapshot fact tables are used to __________.
0 / 1 point
extract data
process events
load data
record events
Incorrect
Incorrect, please review the Facts and Dimensional Modeling video.
5.
Question 5
In what location is data from source systems extracted to?
1 / 1 point
Target systems
Operating system
Staging area
Business intelligence platform
Correct
Correct, a staging area is a separate location where data from source systems is extracted to.
6.
Question 6
Materialized views can be used to __________.
1 / 1 point
safely work without affecting source database
automatically save query results
replicate data
synchronize updates
Correct
Correct, they can be used to replicate data, for example to be used in a staging database.
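
The "upon request" refresh option from Question 3 can be imitated with Python's sqlite3 module. SQLite has no native materialized views, so this sketch stores the query result in an ordinary table and rebuilds it by hand; the sales table, the mv_sales_by_region name and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
INSERT INTO sales VALUES ('East', 10.0), ('West', 20.0);
""")

def refresh_mv(conn):
    """Rebuild the snapshot table from its defining query (manual refresh)."""
    conn.executescript("""
    DROP TABLE IF EXISTS mv_sales_by_region;
    CREATE TABLE mv_sales_by_region AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
    """)

def read_mv(conn):
    """Read the precomputed result without touching the base table."""
    return conn.execute(
        "SELECT region, total FROM mv_sales_by_region ORDER BY region"
    ).fetchall()

refresh_mv(conn)
before = read_mv(conn)

conn.execute("INSERT INTO sales VALUES ('East', 5.0)")
stale = read_mv(conn)      # snapshot does not see the new row yet

refresh_mv(conn)
after = read_mv(conn)      # snapshot now reflects the new row
print(before, stale, after)
```

Between the insert and the second refresh the snapshot is stale, which is the trade-off of "never" and "upon request" refresh compared with "immediately".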

• 2 design approaches for a DWH:- Top-down (SRC → DWH → DM) and bottom-up (SRC → DM → DWH)
