0% found this document useful (0 votes)
7 views

Building Databases

This lecture slides discuss about how to build data bases for second year college students.
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
7 views

Building Databases

This lecture slides discuss about how to build data bases for second year college students.
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 40

BUILDING THE DATABASE

METADATA AND BUSINESS INTELLIGENCE

www.ue.edu.ph
BUILDING THE DATABASE
Is one of the most difficult and complex parts of
implementing a data warehouse environment.
ETL( Extract, Transform and Load )

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

The ETL (extract, transform, and load) work is the


most frequently underestimated work that is done
on a data warehouse project. It is also the least
understood set of tasks by everyone other than the
developers themselves.

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

It would seem simple:


• Copy the data,
• move it around a little bit,
• and load it for use.
In reality, the ETL process is rarely
straightforward and simple.

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

It might sound simple but it is very complex and frequently


underestimated work during data warehousing.

What Work Is Being


Done?

ETL Data
Source Data Warehouse

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Several major steps are involved in the development of a production


ETL system

ETL system requirements

ETL system design

ETL system construction

ETL system testing

www.ue.edu.ph
ETL SYSTEM REQUIREMENTS


Based on Data Profiling

Processing Rules

Guidelines for Compliance with Legal
Requirements

Processing window

Audit Trail Specifications
www.ue.edu.ph
ETL SYSTEM DESIGN

The dimensional model is the target that the ETL


system will build.
The ETL system design provides the details about how to get
from where the data is now to this target dimensional model.
Some organizations require that every little detail be defined,
including all of the specific rules for building the dimension,
and that fact tables be documented prior to starting to build
the ETL system.

www.ue.edu.ph
ETL SYSTEM DESIGN


Specific Rules for Building the Dimension

Fact Tables

Data Element Definitions

Data Mapping Audit Trail

Backup/Recovery Capabilities.

www.ue.edu.ph
ETL SYSTEM CONSTRUCTION


Includes writing programs or using
technology to perform the work

Work can be divided up among different
people or even different teams

Using the cohesive design, each team can
work on its part
www.ue.edu.ph
ETL SYSTEM TESTING

Important to conduct a thorough testing

Identify any bottlenecks and improve the system's
performance. In order to prepare for testing

Series of test cases need to be developed to provide
realistic conditions to determine whether the system is
working properly

Test cases must represent actual business situations and
need to be defined by representatives from the business
community
www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Characteristics of a Production System

Handles Provide Ensure Restart Provide Audit


exceptions warnings and Fallback Trail

Include Backup
and Recovery

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )
System Functionality: What Exactly does an ETL
System do? The system will provide several common functions. It
describe the functionality and flow of an ETL system to
directly populate dimensional data structures, called
presentation servers or data marts. Additional steps
would be needed to populate a normalized data
warehouse, but most of the functionality would be
similar

ETL Data
Source Data Warehouse

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Last Name, First Name,


Extraction :
ETL
Middle Name…

Extracting the Address, Location, City,

data can be a Street…

lot harder Customer Status, Active,


Depositor…
than it sounds
Transactional information,
products, salary, rate…

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Transformation : Arranged by Last name,

represents a wide
First name, Middle Name

TL
E
variety of Select all “ PH”

functions that are Select all salary ranging

performed to
10000 to 50000

take the data Combine 4 last number of


contact number to first 4
letters of name for Account

Update all accounts with


past due notice

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Marketing
Sales Customer Accounting
Customer
Table Customer Table
Master

List of Do we
Customers with know the
Transactions customer
ID?
Check for
Yes Duplication
Build Customer
Validate
Attributes
Transformation : Has this
customer
Reference Data

changed?
represents a wide Yes High‐level Customer dimension data flow diagram
variety of functions Build Changed

that are performed Customer Row

to take the data Changed


Customer Customer Errors Customer Errors
Dimension
Rows

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

Sales Date Customer Product


Dimension Dimension Dimension

Detailed Validate and Validate and


Validate and
Sales Lookup Customer Lookup Product Validate Fact data
Lookup date key
Transactions Key Key

Convert Cases to

Transformation : represents a Sales Fact


Rows
Consumer Units

wide variety of functions that Sales Calculate

are performed to take the Transaction


Errors
Contributions

data
High- level sales fact table data flow diagram[1]

www.ue.edu.ph
OTHER TASKS DURING TRANSFORMATION

Isolate transactions of interest


Store only those that are needed
Example:
Returned Transactions
Sale Transactions
Check the existence of dimension reference data for each sales
transaction
Example:
Transaction ID
Customer ID

www.ue.edu.ph
OTHER TASKS DURING TRANSFORMATION

Validate fact fields


Check the correctness of the values
Example:
Check Sales value

Translate fact values for ease of use


Store consistent values
Trade-off between deriving or storing the values

www.ue.edu.ph
EXTRACT , TRANSFORM AND LOAD (ETL )

L
ET
Data Warehouse

User Cubes

Load: Last step in


ETL Process is to
load the data

www.ue.edu.ph
Fundamentals of Data Warehouse

Business Role in ETL

www.ue.edu.ph
ROLE OF BUSINESS IN ETL
Provide data, information and requirements necessary
for building the data warehouse

www.ue.edu.ph
ROLE OF BUSINESS IN ETL
Provide guidance and critical decisions needed by the
business

Do that…

No
Are we
going to
push the
sales?

Yes
Do this…

www.ue.edu.ph
DEFINING THE BUSINESS RULES
As the roles are clearly stated, the Business must also
define and provide the rules on how to process the
data.
Help to pick the
data source.

Define Define the


integration process
rules. rules.

www.ue.edu.ph
DEFINING THE EXPECTED RESULTS:
TEST PLAN FOR ETL
Knowing the expected result is needed to
ensure data is correct. Is this Tested?
Is it completed?
Does it have the right
configurations?
Do we do a stress test?
What happens when this fails?

ETL

www.ue.edu.ph
Fundamentals of Data Warehouse

Applying New Business Rules

www.ue.edu.ph
APPLYING NEW BUSINESS RULES
Process and results must be validated to ensure that the new
rules have been implemented.

Old Process
and Systems

New Data Warehouse


Old Process Old Process
and Systems and Systems

Old Process
and Systems

www.ue.edu.ph
27
Working Toward Long‐Term Solutions

Problems that arise should be solved not only during its


occurrence but also for the long term fix or future solution.

Quick fix
ETL 12345
Error Code Redeploy ETL
12345

Record and Provide


Future Fix so Error
Code 12345 will not
happen again

www.ue.edu.ph
Fundamentals of Data Warehouse

Tracking Progress—Are We There Yet?

www.ue.edu.ph
Tracking Progress—Are We There Yet?

It is difficult to track real progress, but at a working level, detail


is needed to forecast what needs to be done and what has
already been done.

Task No 1. Status : Completed

Task No 2. Status : In Progress

Task No 3. Status : Not Started

www.ue.edu.ph
What Else Can You Do to Help?

• Encourage,
• Acknowledge and
• Support your team!!!

www.ue.edu.ph
Proactive Communication
The project manager is the spokesperson to share progress,
concerns, and changes in timeline, deliverables, or costs

www.ue.edu.ph
Ensuring Continued Business Participation

After the work has been done, developers are much needed for the support of the
deployment if errors or bugs happens during production.

Why??
Because they are the only one that can fix and provide enhancements
technically.

Finished Data Deployment Published


Warehouse Cubes

Developer

www.ue.edu.ph
Fundamentals of Data Warehouse

Development Support

www.ue.edu.ph
DEVELOPMENT SUPPORT
Once the design works, development begins.

www.ue.edu.ph
BAFWARE: Fundamentals of Data Warehouse

Testing the ETL

www.ue.edu.ph
TESTING THE ETL
Testing ensures the data provided by the system is
accurate and right.

ETL
1 2

1 ETL 2

Peer to Peer Reviews and Tests


www.ue.edu.ph
Why Does It Take So Long and Cost So
Much?
Long decision cycle
New perspectives on the data
Lack of business input
A lot of hard work
Insufficient data profiling
Indirect communication
Experience level of the team
Lack of access to the right IT people

www.ue.edu.ph
ETL TOOLS

Cognos Accelerator
Copy Manager, Data Migrator for SAP,
PeopleSoft (Information Builders)
DataPropagator (IBM)
ETI Extract (Evolutionary Technologies)
Sagent Solution (Sagent Technology)
PowerMart (Informatica)…

www.ue.edu.ph
REPORT AND QUERY TOOLS

Actuate e.Reporting Suite (Actuate)


Brio One (Brio Technologies)
Business Objects
Crystal Reports (Crystal Decisions)
Impromptu (Cognos)
Oracle Discoverer, Oracle Reports
QMF (IBM)
SAS Enterprise Reporter…

www.ue.edu.ph

You might also like