Data Warehousing Interview Questions and Answers
1. What is Data Warehousing?
A data warehouse is an Integrated, Subject-Oriented, Non-Volatile, Time-Variant database that provides support for decision making.
Integrated: The data warehouse is a centralized, consolidated database that integrates data derived from the entire organization — multiple sources, diverse sources, diverse formats.
Subject Oriented: Data is arranged and optimized to answer questions from diverse functional areas. Data is organized and summarized by topic, e.g. Sales, Marketing, Finance, Distribution.
Time Variant: The data warehouse represents the flow of data through time and can contain projected data from statistical models. Data is periodically loaded, and then time-dependent data is recomputed.
Non Volatile: Once data is entered it is never removed. The warehouse represents the entire company's history; near-term history is continuously added, so it is always growing and must support terabyte-scale databases and multiprocessor hardware.
2. What is a Data Mart?
A subset of the data warehouse, or a small data store. According to business requirements, the data warehouse is divided into data marts. Ex. Territory, Purchase, Sales.
3. Types of Data Marts: a) Dependent Data Mart b) Independent Data Mart.
Dependent Data Mart: Follows the top-down approach; data marts are derived from the data warehouse, so data is captured from an existing enterprise data warehouse.
Independent Data Mart: Data marts are created without a data warehouse; they do not depend on one, and data is captured directly from transaction processing systems.
4. Types of Dimensional Models: a) Star Schema b) Snowflake Schema c) Multi-star Model.
5. What is a Star Schema? A fact table surrounded by dimension tables; in other words, one fact table joined to several dimension tables.
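The star schema described above can be sketched with a few SQL statements. This is a minimal illustration using SQLite; the table and column names (dim_product, dim_store, fact_sales) are invented for the example, not taken from any particular warehouse.

```python
# Minimal star schema: one fact table (fact_sales) surrounded by
# dimension tables (dim_product, dim_store).
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Dimension tables hold descriptive attributes.
cur.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT)")
# The fact table holds numeric measures plus a foreign key to each dimension.
cur.execute("""CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    store_id   INTEGER REFERENCES dim_store(store_id),
    quantity   INTEGER,
    amount     REAL)""")

cur.executemany("INSERT INTO dim_product VALUES (?, ?)",
                [(1, "Widget"), (2, "Gadget")])
cur.executemany("INSERT INTO dim_store VALUES (?, ?)",
                [(10, "Pune"), (20, "Mumbai")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                [(1, 10, 5, 50.0), (1, 20, 3, 30.0), (2, 10, 2, 40.0)])

# A typical star-schema query: join the fact table to a dimension and aggregate.
rows = cur.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.name ORDER BY p.name""").fetchall()
print(rows)  # [('Gadget', 40.0), ('Widget', 80.0)]
```

Note that the fact table itself contains no descriptive text, only keys and measures — that separation is what makes the shape a "star".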
6. Snowflake Schema: Further normalization of the dimension tables in a star schema results in a snowflake schema.
7. Top-Down Approach: Data marts are derived from the data warehouse.
8. Bottom-Up Approach: The data warehouse is created from data marts; that is, the warehouse is built by merging data marts. Note: The top-down approach is usually preferred, because in the bottom-up approach the number of ETL processes increases, so there is more chance of errors.
9. Star Schema Grouping: To get a conformed dimension we group two or more star schemas.
10. Conformed Dimension: A dimension table used in more than one star schema is called a conformed dimension.
11. Types of Tables: Dimension Table and Fact Table.
Dimension Table: Contains attributes and levels. It holds descriptive information for the numerical values in the fact table and carries the key attributes of the facts. Ex. Time dimension: Year > Quarter > Month > Week > Day.
Fact Table: Contains numerical values (measures) and dimension IDs, i.e. the keys that associate it with the dimension tables. Ex. Sales: Order ID, Product ID, Customer ID, Store ID, Quantity, Amount.
12. Types of Dimensions: Conformed Dimension, Degenerate Dimension, Junk Dimension.
13. What is the use of star schema grouping? Star schema groups can facilitate multiple-fact reporting by indicating how to use regular dimensions that are common to many measure dimensions. Multiple-fact reporting is also known as cross-functional reporting.
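A conformed dimension can be shown concretely: here one date dimension is shared by fact tables from two different star schemas, so results line up on a common axis (the cross-functional reporting mentioned above). Table names (dim_date, fact_sales, fact_shipments) and the data are illustrative.

```python
# One conformed dimension (dim_date) shared by two fact tables.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER)")
cur.execute("CREATE TABLE fact_sales (date_id INTEGER, amount REAL)")
cur.execute("CREATE TABLE fact_shipments (date_id INTEGER, units INTEGER)")

cur.executemany("INSERT INTO dim_date VALUES (?, ?)", [(1, 2023), (2, 2024)])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 100.0), (2, 250.0)])
cur.executemany("INSERT INTO fact_shipments VALUES (?, ?)", [(1, 10), (2, 20)])

# Both fact tables aggregate over the same conformed date dimension,
# so sales and shipments can be compared year by year in one report.
report = cur.execute("""
    SELECT d.year,
           (SELECT SUM(amount) FROM fact_sales s WHERE s.date_id = d.date_id),
           (SELECT SUM(units)  FROM fact_shipments h WHERE h.date_id = d.date_id)
    FROM dim_date d ORDER BY d.year""").fetchall()
print(report)  # [(2023, 100.0, 10), (2024, 250.0, 20)]
```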
Mail id: [email protected] [email protected] Skype: [email protected]
14. Junk Dimension / Garbage Dimension: A junk (garbage) dimension consists of low-cardinality columns such as flags, indicators, codes and statuses. Attributes in a junk dimension are not related to any hierarchy.
15. Conformed Dimension: A conformed dimension means the same thing to each fact table to which it can join. A more precise definition: two dimensions are conformed if they share one, more than one, or all attributes drawn from the same domain.
16. Degenerate Dimension: A dimension that does not have a primary key of its own; the primary keys of other dimensions act as its keys. For example, a Sales record has Product ID, Customer ID and Territory ID; each of these is a primary key in its respective dimension table, but Sales itself has no primary key — those IDs serve as its keys.
17. Types of Measures or Facts:
Additive Facts: Facts that can be added across every dimension in the fact table; the most common type. These facts are summed across several dimensions.
Semi-Additive Facts: Facts that can be added across some dimensions but not all, e.g. head count and quantity on hand.
Non-Additive Facts: Facts that cannot be added across any dimension; they cannot be logically added between records.
18. Surrogate Keys: Surrogate keys are keys maintained within the data warehouse instead of the natural keys taken from the source systems. The same natural key may be used by different instances of the same entity, and the major problem arises when consolidating information from various source systems. We cannot rely on the natural primary keys of a source system as dimension primary keys because there is no guarantee that they will be unique for each instance.
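The surrogate-key problem described in item 18 can be sketched as follows: two source systems reuse the same natural key for different customers, so the warehouse assigns its own generated key and keeps the natural key only as an attribute. Source-system names and data are invented for the example.

```python
# Surrogate-key assignment: the warehouse generates its own key
# (customer_sk) instead of trusting natural keys from source systems.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE dim_customer (
    customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    source      TEXT,   -- which source system the row came from
    natural_key TEXT,   -- key as known in that source system
    name        TEXT)""")

# Both sources use natural key 'C1' for *different* customers -- the very
# collision that makes natural keys unusable as dimension primary keys.
incoming = [("crm", "C1", "Asha"), ("erp", "C1", "Ravi"), ("crm", "C2", "Meera")]
cur.executemany(
    "INSERT INTO dim_customer (source, natural_key, name) VALUES (?, ?, ?)",
    incoming)

rows = cur.execute(
    "SELECT customer_sk, source, natural_key FROM dim_customer ORDER BY customer_sk"
).fetchall()
print(rows)  # [(1, 'crm', 'C1'), (2, 'erp', 'C1'), (3, 'crm', 'C2')]
```

Each row gets a unique warehouse key even though the natural keys collide, so fact tables can safely reference `customer_sk`.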
19. Granularity: The level of summarization of a data element. Granularity may be defined as the level of detail made available in the dimensional model; it refers to the level of detail or summarization of the unit of data in the data warehouse. More detail means a lower level of granularity; less detail means a higher level. It is therefore better to maintain data at the lower level, so that users can drill down from higher levels to lower levels.
20. Slowly Changing Dimension: A slowly changing dimension is a dimension whose attribute or attributes for a record (row) change slowly over time.
Type 1 (current data only): The Type 1 approach overwrites the existing dimension row with the new data, so no history is preserved; only current data is maintained. This may be the best approach when the attribute change is simple, such as a spelling correction: if the old value was wrong, it may not be critical that history is lost.
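The Type 1 behaviour above is just an in-place UPDATE. A minimal sketch (table and names invented for the example):

```python
# SCD Type 1: a spelling correction overwrites the dimension row
# in place -- no history row is kept.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE dim_customer (customer_sk INTEGER PRIMARY KEY, name TEXT)")
cur.execute("INSERT INTO dim_customer VALUES (1, 'Jhon Smith')")  # misspelled

# Type 1: overwrite in place; the old (wrong) value is gone forever.
cur.execute("UPDATE dim_customer SET name = 'John Smith' WHERE customer_sk = 1")

rows = cur.execute("SELECT * FROM dim_customer").fetchall()
print(rows)  # [(1, 'John Smith')] -- still one row, no history preserved
```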
Type 2 (current data + full history): The Type 2 approach adds a new dimension row for the changed attribute, and therefore preserves history. Because it adds a new row for every attribute change, it can significantly increase database size. The current row is identified by a version number, a new/old flag, or an effective date range.
Type 3 (current data + one previous value): The Type 3 approach keeps the prior value in an extra column and is used only when there is a limited need to preserve and accurately describe history.
21. Types of OLAP:
ROLAP (Relational OLAP): Multi-dimensional analysis using a multidimensional view of relational data; a relational database is the underlying data structure. Multi-dimensional analysis means analyzing data along several dimensions, e.g. analyzing revenue by product, store and date.
MOLAP (Multi-dimensional OLAP): OLAP that uses a multidimensional database as its data structure.
HOLAP (Hybrid OLAP): Combines the two approaches.
22. Factless Fact Table: A fact table with only foreign keys and no facts (measures) is called a factless fact table. The foreign keys can be used for counting purposes.
23. Hierarchy: A hierarchy defines relationships between the attributes of a dimension, identifying the different levels that exist within it. Data is maintained at levels, and moving from one level to another is what the hierarchy describes. Members of a dimension can be arranged into one or more hierarchies, and each hierarchy can have multiple levels. The relationship from one level to the next is 1 to n; for example, a year has n quarters.
24. OLTP (ONLINE TRANSACTION PROCESSING): OLTP supports ER modeling. OLTP applications perform INSERT, UPDATE and DELETE operations against the database and typically keep about one year of data. Data is normalized. OLTP is the traditional term for the transactions that carry out day-to-day business functions, as in ERP and CRM systems. OLTP solves critical business problems but is not designed for analysis and queries.
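The Type 2 approach above can be sketched as "expire the current row, then insert a new current row". This uses an is_current flag plus effective dates; all names and dates are illustrative.

```python
# SCD Type 2: each attribute change adds a new dimension row,
# preserving the full history.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE dim_customer (
    customer_sk INTEGER PRIMARY KEY AUTOINCREMENT,
    natural_key TEXT, city TEXT,
    eff_from TEXT, eff_to TEXT, is_current INTEGER)""")
cur.execute("INSERT INTO dim_customer (natural_key, city, eff_from, eff_to, is_current) "
            "VALUES ('C1', 'Pune', '2023-01-01', '9999-12-31', 1)")

def change_city(natural_key, new_city, change_date):
    # Expire the current row by closing its effective date range...
    cur.execute("""UPDATE dim_customer SET eff_to = ?, is_current = 0
                   WHERE natural_key = ? AND is_current = 1""",
                (change_date, natural_key))
    # ...and insert a new current row carrying the changed attribute.
    cur.execute("""INSERT INTO dim_customer
                   (natural_key, city, eff_from, eff_to, is_current)
                   VALUES (?, ?, ?, '9999-12-31', 1)""",
                (natural_key, new_city, change_date))

change_city("C1", "Mumbai", "2024-06-01")

history = cur.execute(
    "SELECT city, eff_from, eff_to, is_current FROM dim_customer ORDER BY customer_sk"
).fetchall()
print(history)
# [('Pune', '2023-01-01', '2024-06-01', 0),
#  ('Mumbai', '2024-06-01', '9999-12-31', 1)]
```

Facts recorded while the customer lived in Pune keep pointing at the old surrogate key, which is how history stays accurate.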
In an OLTP database there is detailed, current data.
25. OLAP (ONLINE ANALYTICAL PROCESSING): OLAP supports dimensional modeling. OLAP applications typically perform only INSERT operations (data is loaded, not updated in place) and can keep any number of years of data. Data is denormalized. For analyzing data, OLAP provides four types of operation: roll-up (aggregating to a higher level), drill-down (moving to a lower level of detail), slice (cutting the cube along one dimension) and dice (selecting a sub-cube). We can analyze the data from more than one dimensional point of view. OLAP gives fast response times for ad hoc queries; it was designed to serve analysis and queries efficiently, in contrast to real-time OLTP. OLAP uses a multi-dimensional model, with the primary purpose of running
complex analytical and ad hoc queries. In an OLAP database there is aggregated, historical data, stored in a multidimensional schema (usually a star schema).
26. Difference between OLTP and OLAP:
- OLTP systems are the original source of data; OLAP data comes from the various OLTP databases.
- OLTP runs fundamental business tasks; OLAP provides support for decision making.
- OLTP sees short, fast inserts and updates initiated by end users; OLAP data is refreshed by long-running batch jobs.
- OLTP processing speed is typically very fast; OLAP speed depends on the amount of data.
- OLTP is highly normalized with many tables; OLAP is denormalized with fewer tables.
- OLTP reports run on low volumes of data and return fewer records; OLAP reports run on huge volumes of data.
- OLTP reveals a snapshot of ongoing business processes; OLAP gives multi-dimensional views of various kinds of business activity.
- OLTP handles simple queries returning relatively few records; OLAP handles aggregation queries.
- An OLTP database contains detailed, current data; an OLAP database contains historical data and aggregation structures.
27. ODS (Operational Data Store): Many business processes require very fast response times but do not have access to the data warehouse. When sub-second response time is required and integrated data must be accessed, the structure to go to is the ODS (Operational Data Store), which supports high-performance processing on integrated data.
28. Alternative Hierarchy: A dimension can have more than one hierarchy over its attributes, e.g. a calendar-year hierarchy and a fiscal-year hierarchy on the date dimension.
29. Normalization: Organizing data to minimize redundancy, typically by splitting it into many related tables (characteristic of OLTP).
30. Denormalization: Deliberately adding redundancy by combining tables to reduce joins and speed up reads (characteristic of OLAP and dimensional models).
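The roll-up and drill-down operations described under OLAP (item 25) correspond to aggregating the same facts at different levels of granularity (item 19). A minimal sketch over an invented year/quarter sales table:

```python
# Roll-up vs drill-down as GROUP BY at different granularity levels.
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE sales (year INTEGER, quarter TEXT, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    (2024, "Q1", 10.0), (2024, "Q2", 20.0),
    (2025, "Q1", 30.0), (2025, "Q2", 40.0)])

# Drill-down: the detailed view at the lower (quarter) level of granularity.
by_quarter = cur.execute(
    "SELECT year, quarter, SUM(amount) FROM sales GROUP BY year, quarter "
    "ORDER BY year, quarter").fetchall()

# Roll-up: the same facts aggregated to the higher (year) level.
by_year = cur.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year").fetchall()

print(by_year)  # [(2024, 30.0), (2025, 70.0)]
```

Because the facts are stored at the lowest level (quarter), both views can be computed; had the data been stored only per year, drilling down to quarters would be impossible — which is why item 19 recommends keeping the lower level of granularity.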