Big Data Analytics - 7th Sem VTU 2018 Scheme - Class 3

The document discusses Data Warehousing (DW) as an organized collection of integrated databases that support decision-making functions, emphasizing its importance for business reporting, data mining, and improving efficiency. It outlines key design considerations and requirements for an effective DW, such as being subject-oriented, integrated, time-variant, and nonvolatile. Additionally, it describes two approaches for DW development: top-down and bottom-up, highlighting their respective advantages and challenges.

Uploaded by

venurao

Big Data Analytics

(15CS82)
Venugopala Rao A S
Dept. of CSE, SMVITM, Bantakal
Module 3
• Data Warehousing
• A data warehouse (DW) is an organized collection of
integrated, subject-oriented databases designed to support
decision-making functions.
• A DW is organized in such a way as to provide clean,
enterprise-wide data in a standardized format for reports,
queries, and analysis.
• A DW is physically and functionally separate from operational
and transactional databases.
• Creating a DW for analysis and queries demands a
significant investment of time and effort.
• It has to be constantly kept up-to-date for it to be useful.
• DW offers many business and technical benefits.
BDA-15CS82
• DW supports business reporting and data mining activities.
• It can facilitate distributed access to up-to-date business
knowledge for departments and functions, thus improving
business efficiency and customer service.
• DW can present a competitive advantage by facilitating
decision making and helping reform business processes.
• DW enables a consolidated view of corporate data, all cleaned
and organized.
• DW thus provides better and timely information.
• It simplifies data access and allows end users to perform
extensive analysis.
• It enhances overall IT performance by not burdening the
operational databases used by Enterprise Resource Planning
(ERP) and other systems.
• Case study:
• Indian University of Health

• Design Considerations for DW


• The objective of DW is to provide business knowledge to
support decision making.
• For DW to serve its objective, it should be aligned around
those decisions.
• It should be comprehensive, easy to access, and up-to-date.

• Some requirements for a good DW:
• Subject-oriented: To be effective, a DW should be designed
around a subject domain, i.e., to help solve a certain
category of problems.
• Integrated: The DW should include data from many functions
that can shed light on a particular subject area.
• Thus the organization can benefit from a comprehensive view
of the subject area.
• Time-variant (time series): The data in a DW should grow at
daily or other chosen intervals.
• That allows comparisons over time that include the latest data.
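As an illustration of the time-variant requirement, here is a minimal Python sketch in which every load appends a dated snapshot instead of overwriting earlier data. The names warehouse, load_snapshot, and change_over_time, along with the regions and figures, are hypothetical and chosen only for this example.

```python
from datetime import date

# Hypothetical warehouse table: each load appends rows stamped with a snapshot date.
warehouse = []

def load_snapshot(snapshot_date, sales_by_region):
    """Append one dated snapshot; earlier snapshots are never overwritten."""
    for region, sales in sales_by_region.items():
        warehouse.append({"date": snapshot_date, "region": region, "sales": sales})

def change_over_time(region, start, end):
    """Compare a region's sales between two snapshot dates."""
    by_date = {row["date"]: row["sales"] for row in warehouse if row["region"] == region}
    return by_date[end] - by_date[start]

# Two daily loads (made-up numbers).
load_snapshot(date(2024, 1, 1), {"north": 100, "south": 80})
load_snapshot(date(2024, 1, 2), {"north": 120, "south": 75})

north_change = change_over_time("north", date(2024, 1, 1), date(2024, 1, 2))
south_change = change_over_time("south", date(2024, 1, 1), date(2024, 1, 2))
print(north_change, south_change)
```

Because old snapshots stay in place, any two dates can be compared later, which is the point of growing the data at fixed intervals.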

• Nonvolatile: A DW should be persistent, that is, it should not be
created on the fly from the operational databases.
• Thus, DW is consistently available for analysis, across the
organization and over time.
• Summarized: DW contains rolled-up data at the right level for
queries and analysis.
• The process of rolling up the data helps create consistent
granularity for effective comparisons.
• It also helps reduce the number of variables or dimensions of
the data to make them more meaningful for the decision
makers.
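Rolling up can be sketched in a few lines of Python. The example below is only an illustration under assumed data: the transactions rows and the roll_up helper are hypothetical. It sums transaction-level amounts to a coarser, consistent granularity and, at the second level, drops the store dimension entirely.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: (day, store, product, amount).
transactions = [
    ("2024-01-01", "S1", "pen", 10.0),
    ("2024-01-01", "S1", "book", 25.0),
    ("2024-01-01", "S2", "pen", 5.0),
    ("2024-01-02", "S1", "pen", 12.0),
]

def roll_up(rows, key_fields):
    """Sum amounts over the chosen fields (given as indexes into each row)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in key_fields)
        totals[key] += row[3]
    return dict(totals)

# Roll up to (day, store), then further to day only (fewer dimensions).
daily_by_store = roll_up(transactions, key_fields=(0, 1))
daily = roll_up(transactions, key_fields=(0,))
print(daily_by_store)
print(daily)
```

Each roll-up level trades detail for fewer rows and fewer dimensions, so the right level depends on the queries the decision makers actually run.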

• Not normalized: A DW often uses a star schema, in which a
rectangular central table is surrounded by look-up tables.
• The single-table view significantly enhances the speed of queries.
• Metadata: Many of the variables in the database are computed from
other variables in the operational database.
• E.g.: total daily sales may be a computed field.
• The method of calculation for each such variable should be clearly
documented.
• Every element in the DW should be sufficiently well-defined.
• Near Real-time and/or right-time (active): DWs should be updated
in near real-time in many high transaction volume industries, such as
airlines.
• The cost of implementing and updating DW in real time could be
discouraging though.
• Another downside of real-time DW is the possibilities of
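The star schema and the computed-field (metadata) requirements can be sketched together using Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the table names (fact_sales, dim_product, dim_store), the column names, and the sample rows are all assumptions made for the example.

```python
import sqlite3

# In-memory database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Look-up tables surrounding the central table.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, city TEXT);
    -- Central table: one row per sale, pointing at the look-up tables.
    CREATE TABLE fact_sales (
        sale_date TEXT,
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id INTEGER REFERENCES dim_store(store_id),
        amount REAL
    );
    INSERT INTO dim_product VALUES (1, 'pen'), (2, 'book');
    INSERT INTO dim_store VALUES (1, 'City A'), (2, 'City B');
    INSERT INTO fact_sales VALUES
        ('2024-01-01', 1, 1, 10.0),
        ('2024-01-01', 2, 1, 25.0),
        ('2024-01-01', 1, 2, 5.0),
        ('2024-01-02', 1, 1, 12.0);
""")

# Metadata: document how each computed field is derived.
metadata = {
    "total_daily_sales": "SUM(fact_sales.amount) grouped by sale_date and store city",
}

# Total daily sales is a computed field: one join from the central table to a look-up table.
rows = conn.execute("""
    SELECT f.sale_date, s.city, SUM(f.amount) AS total_daily_sales
    FROM fact_sales AS f
    JOIN dim_store AS s ON s.store_id = f.store_id
    GROUP BY f.sale_date, s.city
    ORDER BY f.sale_date, s.city
""").fetchall()
print(rows)
```

Queries against this layout need at most one join per look-up table, and keeping the metadata entry next to the query records how the computed field was obtained.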
• DW Development Approaches
• There are two approaches to developing a DW: top-down and
bottom-up.
• The top-down approach is to make a comprehensive DW that
covers all the reporting needs of the enterprise.
• The bottom-up approach is to produce small data marts, for the
reporting needs of different departments or functions, as
needed.
• The smaller data marts will eventually align to deliver
comprehensive enterprise data warehouse (EDW) capabilities.
• The top-down approach provides consistency but takes more
time and resources.
• The bottom-up approach leads to healthy local ownership and
maintainability of data.
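One way to picture how bottom-up data marts might later align is the Python sketch below. It assumes, purely for illustration, that each departmental mart is keyed by the same customer identifier. The marts, their fields, and the align_marts helper are hypothetical.

```python
# Hypothetical departmental data marts, each built independently (bottom-up).
sales_mart = {"C1": {"revenue": 500.0}, "C2": {"revenue": 120.0}}
support_mart = {"C1": {"open_tickets": 2}, "C3": {"open_tickets": 1}}

def align_marts(*marts):
    """Merge marts on their shared customer key into one enterprise-wide view."""
    combined = {}
    for mart in marts:
        for customer_id, fields in mart.items():
            combined.setdefault(customer_id, {}).update(fields)
    return combined

enterprise_view = align_marts(sales_mart, support_mart)
print(enterprise_view)
```

The merge only works because both marts share a key. Without that agreement, aligning the marts becomes the hard part of the bottom-up approach, which is where the consistency of the top-down approach pays off.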
• Difference between Data Mart and Data Warehouse
