
What Is a Data Warehouse

The document provides an overview of data warehousing, including its architecture, ETL processes, and the role of metadata. It discusses the characteristics of data warehouses, the differences between databases and data warehouses, and the types of data warehouses such as Enterprise Data Warehouse, Operational Data Store, and Data Mart. Additionally, it highlights the features of autonomous databases and their benefits in automating database management tasks.

Uploaded by

HoD CSE

CCS341 DATA WAREHOUSING

UNIT I INTRODUCTION TO DATA WAREHOUSE 5

Data warehouse Introduction - Data warehouse components - Operational database Vs data
warehouse - Data warehouse Architecture - Three-tier Data Warehouse Architecture -
Autonomous Data Warehouse - Autonomous Data Warehouse Vs Snowflake - Modern Data
Warehouse

UNIT II ETL AND OLAP TECHNOLOGY 6

What is ETL - ETL Vs ELT - Types of Data warehouses - Data warehouse Design and
Modeling - Delivery Process - Online Analytical Processing (OLAP) - Characteristics of
OLAP - Online Transaction Processing (OLTP) Vs OLAP - OLAP operations - Types of OLAP -
ROLAP Vs MOLAP Vs HOLAP.

UNIT III META DATA, DATA MART AND PARTITION STRATEGY 7

Meta Data - Categories of Metadata - Role of Metadata - Metadata Repository - Challenges
for Meta Management - Data Mart - Need of Data Mart - Cost Effective Data Mart -
Designing Data Marts - Cost of Data Marts - Partitioning Strategy - Vertical partition -
Normalization - Row Splitting - Horizontal Partition

UNIT IV DIMENSIONAL MODELING AND SCHEMA 6

Dimensional Modeling - Multi-Dimensional Data Modeling - Data Cube - Star Schema -
Snowflake schema - Star Vs Snowflake schema - Fact constellation Schema - Schema
Definition - Process Architecture - Types of Data Base Parallelism - Datawarehouse Tools

UNIT V SYSTEM & PROCESS MANAGERS 6

Data Warehousing System Managers: System Configuration Manager - System Scheduling
Manager - System Event Manager - System Database Manager - System Backup
Recovery Manager - Data Warehousing Process Managers: Load Manager - Warehouse
Manager - Query Manager - Tuning - Testing
PRACTICAL EXERCISES
1. Data exploration and integration with WEKA
2. Apply the WEKA tool for data validation
3. Plan the architecture for a real-time application
4. Write the query for schema definition
5. Design a data warehouse for real-time applications
6. Analyse the dimensional modeling
7. Case study using OLAP
8. Case study using OLTP
9. Implementation of warehouse testing.
What Is a Data Warehouse: Overview, Concepts and How It Works
What Is a Data Warehouse
A data warehouse is a centralized repository for storing and analyzing information
so that an organization can make better-informed decisions. It receives data from a
variety of sources, typically on a regular schedule, including transactional systems,
relational databases, and other sources.

A data warehouse is a type of data management system that supports business
intelligence (BI) activities, specifically analysis. Data warehouses are designed
primarily for queries and analysis and usually contain large amounts of historical
data.
A data warehouse can also be defined as a collection of organizational data
extracted from operational and external data sources. The data is pulled
periodically from internal applications such as sales, marketing, and finance,
from customer-facing applications, and from external partner systems, and is then
made available for decision-makers to access and analyze. In short, a data
warehouse is a comprehensive repository of current and historical information
designed to enhance an organization’s performance.
Key Characteristics of Data Warehouse

The main characteristics of a data warehouse are as follows:

• Subject-Oriented
A data warehouse is subject-oriented: it is organized around specific subjects,
such as sales, promotion, or inventory, rather than around the overall operations
of a business. For example, to analyze your company’s sales data, you would build a
data warehouse that concentrates on sales. Such a warehouse could answer questions
like ‘Who was your best customer last year?’ or ‘Who is likely to be your best
customer in the coming year?’

• Integrated
A data warehouse is developed by integrating data from varied sources into a
consistent format. The data must be stored in the warehouse in a consistent and
universally accepted manner in terms of naming, format, and coding. This
facilitates effective data analysis.

• Non-Volatile
Data entered into a data warehouse is not changed afterwards; it is read-only for
analysis. Previous data is not erased when current data is loaded, which lets you
analyze what happened and when.

• Time-Variant
Data stored in a data warehouse is associated with an element of time, either
explicitly or implicitly. For example, the primary key of a warehouse table
generally contains a time element such as the day, week, or month.
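
The non-volatile and time-variant properties can be illustrated with a small sketch.
It uses Python's built-in sqlite3 module purely as a stand-in for a warehouse; the
table name sales_fact, its columns, and the sample figures are hypothetical and not
taken from these notes. Each row carries a month in its primary key, and new months
are appended rather than overwriting old ones.

```python
import sqlite3

def build_sales_history():
    """Create an in-memory table whose primary key includes a time element."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE sales_fact (
            product    TEXT NOT NULL,
            sale_month TEXT NOT NULL,   -- time element, e.g. '2024-01'
            revenue    REAL NOT NULL,
            PRIMARY KEY (product, sale_month)
        )
        """
    )
    return conn

def load_month(conn, month, rows):
    """Append one month of data; earlier months are never updated or deleted."""
    conn.executemany(
        "INSERT INTO sales_fact (product, sale_month, revenue) VALUES (?, ?, ?)",
        [(product, month, revenue) for product, revenue in rows],
    )
    conn.commit()

def revenue_history(conn, product):
    """Return (month, revenue) pairs for one product, oldest first."""
    cur = conn.execute(
        "SELECT sale_month, revenue FROM sales_fact "
        "WHERE product = ? ORDER BY sale_month",
        (product,),
    )
    return cur.fetchall()

conn = build_sales_history()
load_month(conn, "2024-01", [("widget", 100.0), ("gadget", 50.0)])
load_month(conn, "2024-02", [("widget", 120.0), ("gadget", 65.0)])
history = revenue_history(conn, "widget")
```

Because the month is part of the key, loading February leaves the January rows
intact, and an attempt to insert a second January row for the same product is
rejected by the database instead of silently replacing history.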

Database vs. Data Warehouse


Although a data warehouse and a traditional database share some similarities, they
are not the same thing. The main difference is that a database collects data to
support day-to-day transactions, whereas a data warehouse collects data on a large
scale for analytics. Databases provide real-time data, while warehouses store data
to be read by large analytical queries.
A data warehouse is an example of an OLAP (online analytical processing) system,
that is, an online system for answering queries. An OLTP (online transaction
processing) system is an online system for modifying data; an ATM is a typical
example.
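
The contrast between the two workloads can be sketched in a few lines. This is only
an illustration using Python's built-in sqlite3 module; the account and
withdrawal_history tables and their values are invented for the example. The first
function is an OLTP-style operation that modifies a single row in a short
transaction, as an ATM would; the second is an OLAP-style read-only aggregate over
historical rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL NOT NULL)")
conn.execute(
    "CREATE TABLE withdrawal_history (account_id INTEGER, month TEXT, amount REAL)"
)
conn.execute("INSERT INTO account VALUES (1, 500.0)")
conn.executemany(
    "INSERT INTO withdrawal_history VALUES (?, ?, ?)",
    [(1, "2024-01", 40.0), (1, "2024-01", 60.0), (1, "2024-02", 25.0)],
)

def withdraw(conn, account_id, amount):
    """OLTP-style operation: a short transaction that modifies one row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, account_id),
        )
    return conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()[0]

def monthly_withdrawals(conn):
    """OLAP-style operation: a read-only aggregate over historical rows."""
    return conn.execute(
        "SELECT month, SUM(amount) FROM withdrawal_history "
        "GROUP BY month ORDER BY month"
    ).fetchall()

new_balance = withdraw(conn, 1, 30.0)
totals = monthly_withdrawals(conn)
```

In practice the two workloads usually run on separate systems, so that long
analytical scans do not slow down the short transactions.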
Data Warehouse Architecture
Usually, data warehouse architecture comprises a three-tier structure.

Bottom Tier
The bottom tier or data warehouse server usually represents a relational database
system. Back-end tools are used to cleanse, transform and feed data into this layer.

Middle Tier
The middle tier is an OLAP server, which can be implemented in two ways.
The ROLAP (relational OLAP) model is an extended relational database management
system that maps operations on multidimensional data to standard relational
operations.

The MOLAP (multidimensional OLAP) model directly implements multidimensional data
and operations.

Top Tier
This is the front-end client interface that gets data out from the data warehouse. It
holds various tools like query tools, analysis tools, reporting tools, and data mining
tools.
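
The three tiers can be sketched as three small functions. This is a toy model under
stated assumptions, not a real OLAP server: sqlite3 stands in for the bottom-tier
relational store, and the sales table, the DIMENSIONS set, and the rollup and report
helpers are hypothetical names introduced for illustration. The middle function shows
the ROLAP idea of translating a multidimensional request into a relational GROUP BY.

```python
import sqlite3

# Bottom tier: a relational store holding cleansed data.
def bottom_tier():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("north", "widget", 10.0),
            ("north", "gadget", 5.0),
            ("south", "widget", 7.0),
            ("south", "widget", 3.0),
        ],
    )
    return conn

DIMENSIONS = {"region", "product"}  # columns a client is allowed to group by

# Middle tier: ROLAP-style mapping from a multidimensional request
# ("total amount by these dimensions") to a relational GROUP BY query.
def rollup(conn, dimensions):
    unknown = set(dimensions) - DIMENSIONS
    if not dimensions or unknown:
        raise ValueError(f"invalid dimensions: {list(dimensions)}")
    cols = ", ".join(dimensions)  # safe: every name was checked against DIMENSIONS
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()

# Top tier: a front-end tool that formats results for the user.
def report(rows):
    return [" / ".join(map(str, row[:-1])) + f": {row[-1]:.2f}" for row in rows]

conn = bottom_tier()
by_region = rollup(conn, ["region"])
lines = report(by_region)
```

Calling rollup(conn, ["region", "product"]) instead returns one total per
region and product pair, which corresponds to a finer-grained view of the same data.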

How Data Warehouse Works

Data warehousing integrates data collected from various sources into one
comprehensive database. For example, a data warehouse might combine customer
information from an organization’s point-of-sale systems, mailing lists, website,
and comment cards. It might also incorporate confidential information about
employees, such as salary data. Businesses use this combined data to analyze their
customers.
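
As a rough sketch of this integration step, the following code merges customer
records from two sources that name and format their fields differently. The source
layouts, field names, and sample record are invented for the example; a real
warehouse load would be driven by the actual source schemas.

```python
from datetime import datetime

# Hypothetical extracts: each source names and formats its fields differently.
POS_ROWS = [
    {"cust_name": "Ada Lovelace", "email": "ADA@EXAMPLE.COM", "date": "03/01/2024"},
]
MAILING_LIST = [
    {"name": "ada lovelace", "mail": "ada@example.com", "joined": "2023-11-20"},
]

def from_pos(row):
    """Convert a point-of-sale row (US-style dates) to the common format."""
    return {
        "name": row["cust_name"].strip().title(),
        "email": row["email"].strip().lower(),
        "seen_on": datetime.strptime(row["date"], "%m/%d/%Y").date().isoformat(),
        "source": "pos",
    }

def from_mailing_list(row):
    """Convert a mailing-list row (ISO dates) to the common format."""
    return {
        "name": row["name"].strip().title(),
        "email": row["mail"].strip().lower(),
        "seen_on": row["joined"],
        "source": "mailing_list",
    }

def integrate(pos_rows, mailing_rows):
    """Merge records per customer, keyed by normalized e-mail address."""
    records = [from_pos(r) for r in pos_rows]
    records += [from_mailing_list(r) for r in mailing_rows]
    customers = {}
    for record in records:
        entry = customers.setdefault(
            record["email"],
            {"name": record["name"], "sources": [], "first_seen": record["seen_on"]},
        )
        entry["sources"].append(record["source"])
        # ISO dates compare correctly as strings, so min() picks the earliest.
        entry["first_seen"] = min(entry["first_seen"], record["seen_on"])
    return customers

warehouse = integrate(POS_ROWS, MAILING_LIST)
```

Normalizing names, e-mail case, and date formats before merging is what allows the
two records to be recognized as the same customer.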

Data mining, one common use of a data warehouse, involves looking for meaningful
patterns in vast volumes of data and using them to devise strategies for increased
sales and profits.

Types of Data Warehouse


There are three main types of data warehouse.
Enterprise Data Warehouse (EDW)

This type of warehouse serves as a key or central database that facilitates decision-
support services throughout the enterprise. The advantage to this type of
warehouse is that it provides access to cross-organizational information, offers a
unified approach to data representation, and allows running complex queries.
Operational Data Store (ODS)

This type of data warehouse refreshes in real time. It is often preferred for routine
activities like storing employee records. It is needed when data warehouse
systems do not support the reporting needs of the business.

Data Mart

A data mart is a subset of a data warehouse built to serve a particular
department, region, or business unit. Each department of a business has its own
data mart as a central repository for its data. Data from the data mart is stored in
the ODS periodically; the ODS then sends the data to the EDW, where it is stored and
used.
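
The idea of a data mart as a department-specific subset can be sketched with a
database view. This illustrates only the "subset" relationship, not the periodic
data flow between mart, ODS, and EDW described above; sqlite3 is used as a stand-in,
and the edw_sales table, the marketing_mart view, and the figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Enterprise-wide table covering every department.
conn.execute("CREATE TABLE edw_sales (department TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO edw_sales VALUES (?, ?, ?)",
    [
        ("marketing", "north", 20.0),
        ("marketing", "south", 35.0),
        ("marketing", "north", 5.0),
        ("finance", "north", 90.0),
    ],
)

# Data mart: only the rows and columns the marketing department needs.
conn.execute(
    "CREATE VIEW marketing_mart AS "
    "SELECT region, amount FROM edw_sales WHERE department = 'marketing'"
)

# A departmental query runs against the mart, not the whole warehouse.
mart_totals = conn.execute(
    "SELECT region, SUM(amount) FROM marketing_mart "
    "GROUP BY region ORDER BY region"
).fetchall()
```

Finance rows never appear in marketing_mart, so marketing users see a smaller,
simpler data set tailored to their subject area.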

Data Warehousing Tools

Data warehouse tools are software components used to perform operations on large
data sets. They help to collect, read, write, and transfer data from various
sources, and they support operations such as sorting, filtering, and merging data.
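
The three operations named above can be shown on a tiny in-memory data set. This is
plain Python for illustration only, with made-up orders and customers; dedicated
warehouse tools perform the same kinds of steps at much larger scale.

```python
orders = [
    {"order_id": 1, "customer_id": 10, "total": 250.0},
    {"order_id": 2, "customer_id": 11, "total": 40.0},
    {"order_id": 3, "customer_id": 10, "total": 90.0},
]
customers = {10: "Acme", 11: "Globex"}

def large_orders_by_total(orders, customers, minimum):
    """Filter orders, merge in customer names, and sort by order total."""
    # Filtering: keep only orders at or above the minimum total.
    filtered = [o for o in orders if o["total"] >= minimum]
    # Merging: attach the customer name looked up by customer_id.
    merged = [{**o, "customer": customers[o["customer_id"]]} for o in filtered]
    # Sorting: largest order first.
    return sorted(merged, key=lambda o: o["total"], reverse=True)

result = large_orders_by_total(orders, customers, 50.0)
```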

Data warehouse applications can be categorized as:

• Query and reporting tools
• Application development tools
• Data mining tools
• OLAP tools

Oracle Cloud provides a set of data management services built on self-driving
Oracle Autonomous Database technology to deliver automated patching, upgrades,
and tuning, including performing all routine database maintenance tasks while the
system is running, without human intervention.

What are two characteristics of autonomous data warehouse?

Autonomous Data Warehouse continuously monitors all aspects of system
performance. It adjusts autonomously to ensure consistent high performance even
as workloads, query types, and the number of users vary over time.
What are the types of autonomous database?

Autonomous Database supports different workload types, including: Data
Warehouse, Transaction Processing, JSON Database, and APEX Service. Each of
these workload types provides performance improvements and additional features
that support operations for the specified workload.
Which two are key characteristics of Oracle Autonomous database?

These two characteristics, self-driving and self-securing, are key differentiators of
Oracle Autonomous Database, enabling organizations to benefit from reduced
administrative overhead, improved system performance, and enhanced data
security.
Why is a database autonomous?
An autonomous database also allows an organization to refocus database
management staff on higher-level work that creates greater business value, such as
data modeling, assisting programmers with data architecture, and planning for
future capacity.
Which two tasks does the autonomous database perform?
Key Features of Autonomous Database
• Provisioning new databases.
• Growing or shrinking storage and compute resources.
• Patching and upgrades.
• Backup and recovery.
What is autonomous transaction processing?
Oracle Autonomous Transaction Processing is a fully automated database
service optimized to run transactional, analytical, and batch workloads
concurrently.
What is autonomous method?
Autonomous Control Systems (ACS) are software tools designed using
model-based engineering, artificial intelligence (AI), machine learning
(ML), and data acquisition to enable self-governance of vehicle control
functions with little to no human intervention for extended periods in an
uncertain or contested environment.
Autonomous Data Warehouse

Autonomous Data Warehouse continuously monitors all aspects of system
performance. It adjusts autonomously to ensure consistent high performance even
as workloads, query types, and the number of users vary over time.

Autonomous Database

An autonomous database is a cloud database that uses machine learning to
automate database tuning, security, backups, updates, and other routine
management tasks traditionally performed by DBAs. Unlike a conventional
database, an autonomous database performs all these tasks and more without
human intervention.

An autonomous data warehouse (ADW) is designed to reduce the complexity of managing
a data warehouse while improving efficiency and reliability. Two key
characteristics of an ADW are that it is optimized for complex SQL queries and
that statistics are gathered as part of DML (Data Manipulation Language)
operations.

Our Autonomous Data Platform self-manages and self-optimizes by sending
Alerts, Insights and Recommendations (AIR) based on Cloud Agents connected to
your data team's specific data policies and preferences.

Autonomous Database is a service offering based on Oracle Database (version 18c
and later), which runs in the Oracle Cloud. The Self-Securing, Self-Driving, and
Self-Repairing attributes make up the three key categories of autonomous
capabilities within the Oracle Autonomous Database.

An autonomous system is one that can achieve a given set of goals in a changing
environment—gathering information about the environment and working for an
extended period of time without human control or intervention. Driverless cars and
autonomous mobile robots (AMRs) used in warehouses are two common
examples.
With an autonomous database, developers can quickly build scalable and secure
enterprise applications from data housed in a preconfigured, fully managed, and
secure environment.
