
29.1 Types of Data (Metadata, Raw Data & Summary Data)

A data warehouse stores historical and cumulative data from different data streams for use in forecasting, reporting, and analysis. Building one involves cleaning and transforming data and loading it into tables. A data warehouse records all changes without erasing previous data, which allows changes to be examined over time. The data can be held in relational databases, analytics databases, data warehouse applications, or cloud-based databases. The warehouse contains metadata, summary data, and raw data. Metadata defines the data. Summary data aggregates historical data into time-based categories for analysis over time. Raw data is loaded without processing so that it remains available for further analysis.

Types of Data in a Data Warehouse

What is a Data Warehouse?


A data warehouse is a system that stores historical and cumulative data used for
forecasting, reporting, and data analysis. Building one involves collecting, cleansing, and
transforming data from different data streams and loading it into fact and dimension tables.
Because a data warehouse is non-volatile, it records every data change as a new entry
without erasing the previous state. This is closely related to its being time-variant: it keeps
a record of historical data, allowing you to examine changes over time.
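
The non-volatile and time-variant properties can be illustrated with a small sketch. It is not how any particular warehouse product works; the `CustomerVersion` and `CustomerHistory` names, the customer schema, and the sample values are all hypothetical and chosen only for illustration. Each change is appended as a new entry, and a point-in-time lookup reads the version that was in effect on a given day.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    """One immutable snapshot of a customer record (hypothetical schema)."""
    customer_id: int
    city: str
    valid_from: date


class CustomerHistory:
    """Append-only store: a change adds a new version, nothing is overwritten."""

    def __init__(self) -> None:
        self._versions: List[CustomerVersion] = []

    def record_change(self, customer_id: int, city: str, valid_from: date) -> None:
        # Non-volatile: the change is stored as a new entry.
        self._versions.append(CustomerVersion(customer_id, city, valid_from))

    def versions(self, customer_id: int) -> List[CustomerVersion]:
        # Every past state of the customer is still available.
        return [v for v in self._versions if v.customer_id == customer_id]

    def as_of(self, customer_id: int, day: date) -> Optional[CustomerVersion]:
        # Time-variant: return the latest version in effect on the given day.
        candidates = [v for v in self.versions(customer_id) if v.valid_from <= day]
        return max(candidates, key=lambda v: v.valid_from, default=None)


history = CustomerHistory()
history.record_change(1, "Lisbon", date(2023, 1, 1))
history.record_change(1, "Porto", date(2024, 6, 1))

print(history.as_of(1, date(2023, 12, 31)).city)  # Lisbon
print(history.as_of(1, date(2024, 12, 31)).city)  # Porto
```

Because the earlier entry is never erased, the same store answers both "where is the customer now?" and "where was the customer at the end of 2023?".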

The Database
The database is the most crucial component and the heart of every data warehouse
architecture. It is where the data is stored and accessed.

When creating the data warehouse system, you first need to decide what kind of
database you want to use.
There are four types of databases you can choose from:

1. Relational databases (row-oriented databases).
2. Analytics databases (developed to sustain and manage analytics).
3. Data warehouse applications (software for data management and hardware for
storing data, offered by third-party vendors).
4. Cloud-based databases (hosted on the cloud).

Data
Once the system has cleaned and organized the data, it stores it in the data warehouse.
The data warehouse is the central repository that stores metadata, summary data,
and raw data coming from each source.

• Metadata. Metadata is data that describes other data; in other words, it is
the information that defines the data. For example, the index of a book
serves as metadata for the contents of the book. Using this analogy,
metadata is the concise data that leads us to the detailed data. It allows
data analysts to classify, locate, and direct queries to the required data.

• Summary data. The data in the data warehouse is a historical record of
activity and conditions inside an enterprise. Summarizing is the process of
aggregating that historical data into time-based categories such as hourly,
daily, and weekly. The warehouse manager generates the summary data to
support historical analysis of the data over time. Querying summarized data
rather than the detailed records can improve query performance
considerably.

• Raw data. Raw data is the actual data loaded into the repository without
having been processed. Keeping the data in its raw form makes it available
for further processing and analysis.
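
The relationship between the three kinds of data can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a warehouse implementation: the `raw_sales` rows, the `metadata` dictionary, and the `summarize_daily` helper are hypothetical names, and the sales figures are made up.

```python
from collections import defaultdict
from datetime import datetime

# Raw data: rows kept exactly as they arrived from the source, unprocessed.
# (Hypothetical sales events; the values are invented for illustration.)
raw_sales = [
    {"sold_at": "2024-03-01T09:15:00", "store": "north", "amount": 20.0},
    {"sold_at": "2024-03-01T17:40:00", "store": "north", "amount": 35.5},
    {"sold_at": "2024-03-02T11:05:00", "store": "north", "amount": 12.0},
    {"sold_at": "2024-03-02T13:30:00", "store": "south", "amount": 50.0},
]

# Metadata: data that describes the rows above, so an analyst can locate
# the table and know what each column means before querying it.
metadata = {
    "table": "raw_sales",
    "source": "point-of-sale export",
    "columns": {
        "sold_at": "ISO 8601 timestamp of the sale",
        "store": "store identifier",
        "amount": "sale value in local currency",
    },
}


def summarize_daily(rows):
    """Aggregate raw rows into per-day, per-store totals (summary data)."""
    totals = defaultdict(float)
    for row in rows:
        # Reduce the timestamp to its calendar day, the time-based category.
        day = datetime.fromisoformat(row["sold_at"]).date().isoformat()
        totals[(day, row["store"])] += row["amount"]
    return dict(totals)


# Summary data: computed once, then reused by queries instead of rescanning
# every raw row.
daily_summary = summarize_daily(raw_sales)
print(daily_summary[("2024-03-01", "north")])  # 55.5
```

Since the raw rows are kept, the same data could later be summarized at a different grain, such as hourly or weekly, without going back to the source system.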
