Unit 4
Staging Area:
Since the data extracted from external sources does not follow a
particular format, it must be validated before it is loaded into the
data warehouse. For this purpose, it is recommended to use an ETL tool.
• E (Extract): Data is extracted from the external data source.
• T (Transform): Data is transformed into the standard format.
• L (Load): Data is loaded into the data warehouse after it has been transformed into the
standard format.
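The extract, transform, and load steps can be sketched in a few lines of Python. This is only a minimal illustration, not how a real ETL tool works internally: the sample CSV text, the `sales` table, and the function names `extract`, `transform`, and `load` are all assumptions made up for this example, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an external source: stray whitespace,
# inconsistent casing, and one row with a missing amount.
RAW_CSV = """customer,amount,date
 Alice ,100.50,2024-01-05
BOB,,2024-01-06
carol,75,2024-01-07
"""

def extract(text):
    """E: read rows from the external source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """T: validate rows and convert them to one standard format."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # reject rows that fail validation
        clean.append(
            (row["customer"].strip().title(), float(amount), row["date"].strip())
        )
    return clean

def load(conn, rows):
    """L: write the standardized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

# In-memory SQLite database standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

The row with the missing amount is dropped during the transform step, so only validated, uniformly formatted rows reach the warehouse table.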
Data Warehouse:
• Data Mining:
• Data mining is the practice of analysing the big data present in the data warehouse.
It is used to find hidden patterns in the database or data warehouse with the help of
data mining algorithms.
• The top-down approach is defined by Inmon as follows: the data warehouse is a central
repository for the complete organisation, and data marts are created from it after the
complete data warehouse has been created.
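The idea of data marts being derived from one central warehouse can be sketched as follows. This is a simplified illustration under stated assumptions: the table `warehouse_sales`, the views `mart_retail` and `mart_online`, and the helper `mart_total` are hypothetical names, and SQL views are used here only as a stand-in for data marts (in practice, data marts are often separate physical stores that are refreshed from the warehouse).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Central warehouse table (hypothetical schema for illustration).
conn.execute(
    "CREATE TABLE warehouse_sales (region TEXT, department TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("north", "retail", 120.0),
        ("south", "retail", 80.0),
        ("north", "online", 200.0),
    ],
)

# Each data mart is modelled as a view over the warehouse, so every mart
# draws on the same single copy of the data.
conn.execute(
    "CREATE VIEW mart_retail AS "
    "SELECT region, amount FROM warehouse_sales WHERE department = 'retail'"
)
conn.execute(
    "CREATE VIEW mart_online AS "
    "SELECT region, amount FROM warehouse_sales WHERE department = 'online'"
)

def mart_total(view):
    """Sum the amounts visible in one data mart (view name is trusted here)."""
    return conn.execute(f"SELECT SUM(amount) FROM {view}").fetchone()[0]

before = mart_total("mart_retail")

# A change made once in the warehouse is visible in the dependent mart.
conn.execute("INSERT INTO warehouse_sales VALUES ('east', 'retail', 50.0)")
after = mart_total("mart_retail")

print(before, after)
```

Because both marts read from the same warehouse table, a correction or new row in the warehouse shows up in every mart that depends on it, without the data being stored twice.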
Advantages of Top-Down Approach
1. Since the data marts are created from the data warehouse, this approach provides a
consistent dimensional view of the data marts.
2. Improved data consistency: The top-down approach promotes data consistency by ensuring
that all data marts are sourced from a common data warehouse. This ensures that all data is
standardized, reducing the risk of errors and inconsistencies in reporting.
3. Easier maintenance: Since all data marts are sourced from a central data warehouse, it is
easier to maintain and update the data in a top-down approach. Changes can be made to
the data warehouse, and those changes will automatically propagate to all the data marts
that rely on it.
4. Better scalability: The top-down approach is highly scalable, allowing organizations to add
new data marts as needed without disrupting the existing infrastructure. This is particularly
important for organizations that are experiencing rapid growth or have evolving business
needs.
5. Improved governance: The top-down approach facilitates better governance by enabling
centralized control of data access, security, and quality. This ensures that all data is managed
consistently and that it meets the organization’s standards for quality and compliance.
6. Reduced duplication: The top-down approach reduces data
duplication by ensuring that data is stored only once in the data
warehouse. This saves storage space and reduces the risk of data
inconsistencies.
2. Then, the data goes through the staging area (as explained above) and
is loaded into data marts instead of the data warehouse. The data marts are
created first and provide reporting capability. Each data mart addresses a
single business area.
Advantages of Bottom-Up Approach
2. A larger number of data marts can be accommodated here, and in this way the data
warehouse can be extended.
3. Also, the cost and time taken to design this model are comparatively low.
4. Incremental development: The bottom-up approach supports incremental development,
allowing for the creation of data marts one at a time. This allows for quick wins and
incremental improvements in data reporting and analysis.
5. User involvement: The bottom-up approach encourages user involvement in the design and
implementation process. Business users can provide feedback on the data marts and
reports, helping to ensure that the data marts meet their specific needs.
6. Flexibility: The bottom-up approach is more flexible than the top-down approach, as it
allows for the creation of data marts based on specific business needs. This approach can be
particularly useful for organizations that require a high degree of flexibility in their reporting
and analysis.
7. Faster time to value: The bottom-up approach can deliver faster
time to value, as the data marts can be created more quickly than a
centralized data warehouse. This can be particularly useful for smaller
organizations with limited resources.