BA Unit2 Own

A Data Warehouse is a centralized storage system designed for analyzing historical data to aid in decision-making, integrating data from various sources while maintaining consistency. It features subject-orientation, integration, time-variance, and non-volatility, and supports processes like data collection, cleaning, storage, and analysis. Data Marts are smaller, focused subsets of Data Warehouses tailored for specific business functions, and OLAP tools enhance data analysis by providing multidimensional views and fast processing capabilities.

Uploaded by

SUJITHA M

1) Understanding Data Warehouse in a Simple Way

A Data Warehouse (DW) is a big storage system where companies store their data for
analysis and decision-making, rather than for everyday transactions.

What is a Data Warehouse?


A Data Warehouse is typically a relational database designed specifically for:
✅ Storing and analyzing historical data (past transactions, customer trends, etc.).
✅ Helping business leaders make decisions based on past data.
✅ Collecting and combining data from multiple sources (e.g., sales records, customer
databases, and finance reports).

Key Features of a Data Warehouse


A Data Warehouse has four important characteristics:

1. Subject-Oriented (Focused on Business Topics)

 A Data Warehouse is designed to analyze specific business subjects like:


o Customers (Who buys the most? What are their preferences?)
o Sales (What products sell best? During which season?)
o Products (Which products perform well? Which need improvement?)
 It leaves out operational detail that is not needed for analysis (like employee login
records).

2. Integrated (Combines Data from Different Sources)

 A Data Warehouse pulls data from multiple systems (such as sales databases,
marketing tools, and finance systems) and integrates it into one central place.
 This ensures consistency by following the same naming conventions, formats, and
data types.
 Example: If one system stores a date as "DD/MM/YYYY" and another stores it as
"YYYY-MM-DD", the Data Warehouse converts them into a standard format
before storing.
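
The date example above can be sketched in a few lines of Python. This is only an illustration: the source names `crm` and `billing` and the `SOURCE_FORMATS` mapping are assumptions made for the example, not part of any particular product.

```python
from datetime import datetime

# Hypothetical source systems and the date format each one uses.
SOURCE_FORMATS = {
    "crm": "%d/%m/%Y",      # DD/MM/YYYY
    "billing": "%Y-%m-%d",  # YYYY-MM-DD
}

def to_standard_date(raw: str, source: str) -> str:
    """Parse a date using the source system's format and return YYYY-MM-DD."""
    parsed = datetime.strptime(raw, SOURCE_FORMATS[source])
    return parsed.date().isoformat()

standardized = [
    to_standard_date("25/12/2023", "crm"),
    to_standard_date("2023-12-25", "billing"),
]
print(standardized)  # ['2023-12-25', '2023-12-25']
```

Both inputs end up in the same format, so later queries can compare and group dates without caring which system they came from.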

3. Time-Variant (Stores Historical Data)

 A Data Warehouse keeps a record of past data so businesses can analyze trends
over time.
 Example: A company can retrieve sales reports from the last 3 months, 6 months,
or even 5 years ago to understand patterns and make future predictions.
 Unlike operational databases that keep only current data, a Data Warehouse
maintains historical records for decision-making.

4. Non-Volatile (Data Doesn’t Change Frequently)

 Once data is loaded into a Data Warehouse, it is normally not updated or deleted;
new data is only added through periodic loads.
 This keeps historical data consistent and allows for faster data retrieval.

How Does a Data Warehouse Work?


A Data Warehouse follows a process to collect, clean, store, and analyze data:

Step 1: Data Collection (Extracting Data from Sources)

 Data is gathered from multiple sources like:


o Transaction databases (sales records, customer purchases).
o Flat files (CSV, Excel, log files).
o Online transaction records (e-commerce, banking transactions).

Step 2: Data Cleaning and Integration (Ensuring Accuracy)

 The data is cleaned to remove errors, duplicates, and inconsistencies.


 Example: If a customer is listed twice in different databases, the Data Warehouse
merges the information into a single, accurate record.
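
A minimal sketch of the merge step, assuming that two records belong to the same customer when their email addresses match after trimming spaces and lowercasing. The field names and the matching rule are assumptions; real cleaning tools use more careful matching.

```python
def merge_customer_records(records):
    """Merge records that share an email, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        key = record["email"].strip().lower()  # assumed matching key
        target = merged.setdefault(key, {})
        for field, value in record.items():
            if field == "email":
                value = key  # store the normalized email
            if value and not target.get(field):
                target[field] = value
    return list(merged.values())

# The same customer as seen by two hypothetical source systems.
crm_rows = [{"email": "Ana@Example.com", "name": "Ana Silva", "phone": ""}]
billing_rows = [{"email": "ana@example.com ", "name": "", "phone": "555-0100"}]

customers = merge_customer_records(crm_rows + billing_rows)
print(customers)  # one combined record instead of two partial ones
```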

Step 3: Storing Data in the Data Warehouse

 The cleaned data is stored in structured tables.


 It is organized into different subjects (e.g., sales, customers, products) for easy
access.

Step 4: Analyzing and Querying the Data


 Business users and analysts use Business Intelligence (BI) tools to:
o Run queries to find trends and patterns.
o Generate reports, dashboards, charts, and graphs.
o Make data-driven decisions for business growth.
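
Steps 3 and 4 can be sketched with Python's built-in `sqlite3` module. An in-memory SQLite database stands in for the warehouse here, and the `sales` table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse

# Step 3: store cleaned data in a structured table.
conn.execute("CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-01-15", "Laptop", 1200.0),
        ("2024-01-20", "Phone", 800.0),
        ("2024-02-03", "Laptop", 1000.0),
    ],
)

# Step 4: run an analytical query (total sales per month).
monthly_totals = conn.execute(
    "SELECT substr(sale_date, 1, 7) AS month, SUM(amount) "
    "FROM sales GROUP BY month ORDER BY month"
).fetchall()
print(monthly_totals)  # [('2024-01', 2000.0), ('2024-02', 1000.0)]
```

A BI tool does essentially the same thing behind its charts and dashboards: it sends grouping and summing queries to the warehouse and displays the result.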

Goals of Data Warehousing


• Support reporting as well as analysis.
• Maintain the organization's historical information.
• Serve as the foundation for decision-making.

What is a Data Mart?


A Data Mart is a smaller part of a Data Warehouse that is designed for a specific business
function or department (e.g., sales, finance, marketing).

 Think of a Data Warehouse as a giant library containing all company data, while a
Data Mart is a smaller section that focuses on a particular subject.
 Data Marts help teams access the exact data they need quickly, without having to
search through the entire Data Warehouse.

Types of Data Marts


There are three types of Data Marts based on how they are created:

1. Dependent Data Mart (Top-Down Approach)

 Built from an existing Data Warehouse.


 The Data Mart pulls the required data from the Data Warehouse.
 Since the Data Warehouse manages all data, there is no need for extra integration.
 Example: A retail company has a main Data Warehouse containing all data. From
this, it creates separate Sales and Customer Data Marts.
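
One simple way to picture a dependent Data Mart is as a filtered view over warehouse tables. The sketch below uses an in-memory SQLite database; the table `warehouse_facts` and the view `sales_mart` are hypothetical names, and real systems often copy the data into separate tables instead of using a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The "warehouse": one table holding data for every department.
conn.execute("CREATE TABLE warehouse_facts (department TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 5000.0),
        ("sales", "orders", 42.0),
        ("hr", "headcount", 17.0),
    ],
)

# The dependent Data Mart: only the sales rows, pulled from the warehouse.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT metric, value FROM warehouse_facts WHERE department = 'sales'"
)

sales_mart_rows = conn.execute(
    "SELECT metric, value FROM sales_mart ORDER BY metric"
).fetchall()
print(sales_mart_rows)  # [('orders', 42.0), ('revenue', 5000.0)]
```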

2. Independent Data Mart (Bottom-Up Approach)

 Created separately without a Data Warehouse.


 Each Data Mart is built independently for different business functions.
 Later, multiple Independent Data Marts can be combined to form a Data Warehouse.
 Example: A company starts with a Sales Data Mart and later builds an HR Data
Mart. Eventually, they integrate all Data Marts into a Data Warehouse.

3. Hybrid Data Mart


 Combines data from a Data Warehouse and other sources.
 Useful when quickly integrating new data sources, such as after a company merger
or launching a new product.
 Example: A bank already has a Data Warehouse but also collects real-time
customer transaction data from external sources to analyze fraud risks.

Steps to Build a Data Mart


To create a Data Mart, businesses follow these five key steps:

1. Designing (Planning Phase)

 Decide what data is needed for the Data Mart.


 Identify business goals and technical requirements.
 Choose which data sources to use.
 Create a logical design (how data is structured) and a physical design (how it is
stored).

2. Constructing (Building the Database)

 Set up the database where the data will be stored.


 Create tables, indexes, and relationships to organize the data properly.

3. Populating (Loading Data)

 Extract data from source systems (e.g., databases, files).


 Clean and transform the data into the correct format.
 Load the data into the Data Mart.
 Store metadata (data about the data) to track where it came from.
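
The populating step can be sketched as three small functions: extract, transform, and load. Everything named here (`orders_export.csv`, the `orders` and `load_metadata` tables) is an assumption for illustration, and an in-memory SQLite database stands in for the Data Mart.

```python
import csv
import io
import sqlite3
from datetime import datetime, timezone

SOURCE_NAME = "orders_export.csv"  # hypothetical source file name
RAW_CSV = "order_id,amount\n1, 19.90\n2,5.00 \n"  # stand-in for the file's contents

def extract(text):
    """Read rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Clean values and convert them to the right types."""
    return [(int(r["order_id"]), float(r["amount"].strip())) for r in rows]

def load(conn, rows, source_name):
    """Insert the rows and record metadata about where they came from."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_metadata "
        "(source TEXT, row_count INTEGER, loaded_at TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    conn.execute(
        "INSERT INTO load_metadata VALUES (?, ?, ?)",
        (source_name, len(rows), datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()

mart = sqlite3.connect(":memory:")
load(mart, transform(extract(RAW_CSV)), SOURCE_NAME)

loaded_orders = mart.execute("SELECT order_id, amount FROM orders ORDER BY order_id").fetchall()
metadata = mart.execute("SELECT source, row_count FROM load_metadata").fetchall()
print(loaded_orders, metadata)
```

The `load_metadata` table is the "data about the data": it records which source each load came from, how many rows it had, and when it was loaded.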

4. Accessing (Using the Data)

 Set up a user-friendly interface so that business users can easily query and analyze
the data.
 Create dashboards, reports, graphs, and charts.
 Use BI tools to help decision-makers analyze trends and patterns.

5. Managing (Maintaining the Data Mart)

 Secure the data by controlling access.


 Monitor and optimize performance to ensure fast query response times.
 Ensure reliability so the Data Mart is always available, even during system failures.

Difference Between Data Warehouse and Data Mart


| Feature | Data Warehouse | Data Mart |
|---|---|---|
| Definition | A large database that stores all company data for analysis | A smaller, focused database for specific departments or business functions |
| Size | Very large (stores entire organization’s data) | Smaller (stores department-specific data) |
| Complexity | More complex, requires advanced tools and skilled professionals | Simpler, easier to set up and maintain |
| Cost | Expensive to build and maintain | Cheaper and quicker to implement |
| Performance | Handles huge datasets, may take longer to process queries | Optimized for fast access to specific data |
| Data Sources | Collects data from multiple sources (databases, files, online systems) | Extracts data from a Data Warehouse or specific sources |
| Usage | Used by executives, analysts, and decision-makers for long-term planning | Used by specific teams (e.g., sales, marketing, finance) for quick insights |
| Implementation Approach | Top-Down Approach (Data Warehouse first, then Data Marts) | Bottom-Up Approach (Data Marts first, then combined into a Data Warehouse) |

How is a Data Warehouse Different from a Normal Database?

| Feature | Operational Database | Data Warehouse |
|---|---|---|
| Purpose | Handles daily business transactions (e.g., sales, banking, orders) | Used for analysis and decision-making |
| Data Type | Current, real-time data | Historical and current data |
| Structure | Optimized for fast transactions (adding, updating, deleting records) | Optimized for fast data retrieval and analysis |
| Users | Employees handling transactions (cashiers, customer service) | Business analysts, executives, decision-makers |
| Data Updates | Frequently updated | Data is not updated or deleted, only new data is added |
| Size | Smaller, stores only necessary current data | Very large, stores years of historical data |
| Processing Type | OLTP (Online Transaction Processing) – Fast transactions | OLAP (Online Analytical Processing) – Complex queries and reports |
| Example | A bank's customer transaction system that records deposits and withdrawals | A bank's data warehouse that stores customer transactions for fraud detection and trend analysis |

Understanding OLAP (Online Analytical Processing) in a Simple Way

What is OLAP?

OLAP stands for Online Analytical Processing. It is a powerful tool that helps businesses
analyze data from different angles to identify trends, patterns, and insights.

Imagine OLAP as a super-fast calculator that can process huge amounts of business data
quickly, helping companies make better decisions.

Key Benefits of OLAP

OLAP provides five main benefits for businesses:

1. Multidimensional Data Analysis – Allows users to view data in different ways,
such as by product, time, or location.
2. Advanced Business Calculations – Performs complex calculations like profit
margins, growth rates, and trend predictions.
3. Reliable and Accurate Data – Ensures that the data used for decision-making is
trustworthy and consistent.
4. Fast Analysis (Speed-of-Thought Processing) – Helps businesses get insights
within seconds, making it easier to act quickly.
5. Flexible Reporting – Enables users to create custom reports and dashboards
without needing technical expertise.
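
Two of the calculations named in benefit 2 can be written out directly. The sketch below assumes the common definitions: profit margin is profit divided by revenue, and growth rate is the change divided by the earlier value, both as percentages. The figures are invented for the example.

```python
def profit_margin(revenue: float, cost: float) -> float:
    """Profit as a percentage of revenue."""
    return (revenue - cost) / revenue * 100

def growth_rate(previous: float, current: float) -> float:
    """Percentage change from the previous period to the current one."""
    return (current - previous) / previous * 100

margin = profit_margin(revenue=200_000, cost=150_000)
growth = growth_rate(previous=160_000, current=200_000)
print(margin, growth)  # 25.0 25.0
```

An OLAP tool applies formulas like these across every product, region, and time period at once, rather than to one pair of numbers.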

Key Features of OLAP


1. Fast Response Time

 OLAP systems are designed to provide quick answers to business queries.


 Most simple reports should generate results within one second, and complex reports
should take no longer than 20 seconds.

2. Powerful Analysis Capabilities

 OLAP should be able to handle the business logic and statistical analysis that the application needs.
 Even though some technical setup is needed, the system should remain easy to use
for business users.

3. Data Sharing & Security

 OLAP systems support multiple users, allowing different teams to analyze data at
the same time.
 If users need to update data, the system ensures security and accuracy, preventing
unauthorized changes.

4. Multidimensional Data View

 The biggest strength of OLAP is that it allows users to view data from multiple
perspectives (e.g., sales by region, sales by product, or sales by time).
 This is useful because businesses need different viewpoints to make the best
decisions.

5. Efficient Data Storage

 OLAP systems can store large amounts of data efficiently while handling data
sparsity (i.e., many combinations of dimension values that have no data) without
wasting storage space.
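
A rough sketch of the idea, assuming a tiny cube with three dimensions (product, region, month). Only the cells that actually hold a value are kept in a dictionary, so empty combinations take no space. Real OLAP engines use more advanced compression; this only shows the principle.

```python
# Store only the cells that contain data, keyed by (product, region, month).
sparse_cube = {
    ("Laptop", "North", "2024-01"): 1200.0,
    ("Phone", "South", "2024-02"): 800.0,
}

products = ["Laptop", "Phone", "Tablet"]
regions = ["North", "South", "East", "West"]
months = ["2024-01", "2024-02", "2024-03"]

def cell(product, region, month):
    """Return the stored value, or 0.0 for a combination with no data."""
    return sparse_cube.get((product, region, month), 0.0)

possible_cells = len(products) * len(regions) * len(months)
stored_cells = len(sparse_cube)
print(possible_cells, stored_cells)  # 36 2
```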

Understanding OLAP Operations and OLTP vs. OLAP in a Simple Way

OLAP Operations: How Data Analysis Works

OLAP provides several operations that help businesses analyze data from different angles.
Let’s break them down in an easy-to-understand way.

1. Roll-Up (Zooming Out)

 What it does: It summarizes the data to show a higher-level view.


 Example: If you are looking at daily sales, rolling up might combine the data to
show monthly or yearly sales instead.
 How it works: It removes details and groups data together for easier analysis.
 Think of it as: Zooming out on a map to see the whole country instead of just a single
city.
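
A roll-up from daily to monthly sales can be sketched in plain Python. The sales figures are invented, and the dates are assumed to be strings in YYYY-MM-DD form so that the first seven characters identify the month.

```python
from collections import defaultdict

daily_sales = [
    ("2024-01-05", 100.0),
    ("2024-01-20", 150.0),
    ("2024-02-11", 200.0),
]

def roll_up_to_month(rows):
    """Group daily rows by month and add up the amounts."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day[:7]] += amount  # "YYYY-MM-DD" -> "YYYY-MM"
    return dict(totals)

monthly_sales = roll_up_to_month(daily_sales)
print(monthly_sales)  # {'2024-01': 250.0, '2024-02': 200.0}
```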

2. Drill-Down (Zooming In)

 What it does: It breaks down summarized data to show more details.


 Example: If you are viewing yearly sales, drilling down might split it into months
or even daily sales.
 How it works: It adds more details by looking deeper into the data.
 Think of it as: Zooming in on a map to see streets and neighborhoods instead of just
the whole city.
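
Drill-down is the reverse of roll-up: start from a summary and ask for the next level of detail. In this sketch the data and function names are made up, and dates are again assumed to be YYYY-MM-DD strings (4 characters give the year, 7 give the month).

```python
from collections import defaultdict

sales = [
    ("2023-03-10", 300.0),
    ("2023-03-25", 100.0),
    ("2023-11-02", 250.0),
    ("2024-01-15", 500.0),
]

def totals_by(rows, key_length):
    """Sum amounts by a date prefix: 4 = year, 7 = month, 10 = day."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day[:key_length]] += amount
    return dict(totals)

def drill_down(rows, parent_key, key_length):
    """Break one summarized value into the next level of detail."""
    children = [row for row in rows if row[0].startswith(parent_key)]
    return totals_by(children, key_length)

yearly = totals_by(sales, 4)
months_2023 = drill_down(sales, "2023", 7)
print(yearly)       # {'2023': 650.0, '2024': 500.0}
print(months_2023)  # {'2023-03': 400.0, '2023-11': 250.0}
```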

3. Slice (Filtering Data by One Dimension)


 What it does: It focuses on one particular section of data by filtering based on a
single factor.
 Example: If you have sales data for all regions, slicing might show only the sales
for New York.
 How it works: It removes everything except the selected category.
 Think of it as: Looking at just one layer of a cake while ignoring the others.
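
A slice can be sketched as a filter on a single dimension. The rows and the helper name `slice_cube` are assumptions for the example.

```python
sales = [
    {"region": "New York", "product": "Laptop", "amount": 1200.0},
    {"region": "California", "product": "Laptop", "amount": 900.0},
    {"region": "New York", "product": "Phone", "amount": 700.0},
]

def slice_cube(rows, dimension, value):
    """Keep only the rows where one dimension has the chosen value."""
    return [row for row in rows if row[dimension] == value]

new_york_sales = slice_cube(sales, "region", "New York")
print(sum(row["amount"] for row in new_york_sales))  # 1900.0
```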

4. Dice (Filtering Data by Multiple Dimensions)

 What it does: It filters data based on two or more factors at the same time.
 Example: If you have sales data for different products and regions, dicing might
show only laptop sales in California.
 How it works: It creates a smaller, focused dataset from a larger one.
 Think of it as: Cutting out a specific section from a Rubik’s cube.
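
A dice works like a slice but filters on several dimensions at once. In this sketch each keyword argument names a dimension and gives the set of values to keep; the data and the helper name `dice_cube` are invented for illustration.

```python
sales = [
    {"region": "New York", "product": "Laptop", "amount": 1200.0},
    {"region": "California", "product": "Laptop", "amount": 900.0},
    {"region": "California", "product": "Phone", "amount": 650.0},
]

def dice_cube(rows, **allowed):
    """Keep rows whose value on every named dimension is in the allowed set."""
    return [
        row for row in rows
        if all(row[dim] in values for dim, values in allowed.items())
    ]

laptops_in_california = dice_cube(sales, product={"Laptop"}, region={"California"})
print(laptops_in_california)
```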

5. Pivot (Changing the View of Data)

 What it does: It rotates or re-arranges the way data is displayed for better analysis.
 Example: If sales data is shown by regions in rows and months in columns,
pivoting might swap them to show months in rows and regions in columns.
 How it works: It helps users see data in different ways without changing the actual
numbers.
 Think of it as: Transposing a spreadsheet so that rows become columns and columns
become rows.
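
The region-and-month example can be sketched with nested dictionaries. The numbers are made up; the point is that pivoting swaps the row and column keys while every cell value stays the same.

```python
# Rows are regions, columns are months.
by_region = {
    "North": {"Jan": 100, "Feb": 120},
    "South": {"Jan": 80, "Feb": 95},
}

def pivot(table):
    """Swap row and column keys without changing any cell value."""
    pivoted = {}
    for row_key, columns in table.items():
        for column_key, value in columns.items():
            pivoted.setdefault(column_key, {})[row_key] = value
    return pivoted

by_month = pivot(by_region)
print(by_month)  # {'Jan': {'North': 100, 'South': 80}, 'Feb': {'North': 120, 'South': 95}}
```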

Difference Between OLTP and OLAP


OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) are two
different types of data processing systems used for different purposes.

1. What is OLTP? (Used for Daily Transactions)

✅ Used for daily business operations like sales, banking, and order processing.
✅ Handles a large number of small transactions (insert, update, delete).
✅ Example: An e-commerce website processing orders in real time.
✅ Data reflects the current state of the business because it is updated with every transaction.

Think of it as: A store's cash register, where every purchase updates the system immediately.

2. What is OLAP? (Used for Analysis and Reporting)


✅ Used for analyzing and summarizing historical data.
✅ Handles complex queries involving large amounts of data.
✅ Example: A business checking last year’s sales trends to plan future marketing.
✅ Data is stored in an organized, summarized way for quick analysis.

Think of it as: A business report that reviews sales performance over the past year.

Key Differences Between OLTP and OLAP

| Feature | OLTP | OLAP |
|---|---|---|
| Purpose | Handles daily transactions | Analyzes historical data |
| Speed | Very fast for small operations | Fast for complex queries |
| Data Type | Current, real-time data | Summarized, historical data |
| Example | ATM withdrawal, online shopping order | Sales reports, business analytics |
| Usage | Used by frontline staff (cashiers, bank clerks) | Used by managers and analysts |
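
The two workloads can be contrasted in a short sketch. For simplicity both run against one in-memory SQLite database with made-up `accounts` and `transactions` tables; in practice the analytical query would normally run on a separate warehouse, not on the live transaction system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("CREATE TABLE transactions (account_id INTEGER, tx_date TEXT, amount REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 500.0)")

def withdraw(conn, account_id, amount, tx_date):
    """OLTP-style work: a small write that updates current data right away."""
    with conn:  # commits on success, rolls back if an error occurs
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
        conn.execute(
            "INSERT INTO transactions VALUES (?, ?, ?)",
            (account_id, tx_date, -amount),
        )

withdraw(conn, 1, 50.0, "2024-01-10")
withdraw(conn, 1, 30.0, "2024-02-05")

# OLAP-style work: a read-only query that summarizes history.
monthly_outflow = conn.execute(
    "SELECT substr(tx_date, 1, 7) AS month, SUM(amount) "
    "FROM transactions GROUP BY month ORDER BY month"
).fetchall()
balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)          # 420.0
print(monthly_outflow)  # [('2024-01', -50.0), ('2024-02', -30.0)]
```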

Why is this important?

✅ OLTP keeps businesses running smoothly by processing real-time transactions.


✅ OLAP helps businesses make better decisions by analyzing past trends.
