Basics of Data Integration
• Data from several heterogeneous data sources (MS Excel spreadsheets, MS Access, CSV files, etc.) can
be extracted and brought together in a data warehouse.
• Even if DIIT expands into several branches in multiple cities, it can still have a single warehouse to
support the information needs of the institution.
• Data can be conveniently retrieved for analysis and generating reports (like the report on spending
requested above).
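A minimal sketch of the idea in the bullets above, assuming two made-up sources with different field names: a CSV export and a list of rows read from another system. The table name `spending`, the column names, and the sample values are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a CSV export (for example, saved from a spreadsheet).
CSV_SOURCE = """dept,item,amount
Library,Books,1200.50
Lab,Equipment,3400.00
"""

# Hypothetical source 2: rows from another system that uses
# different field names and stores amounts as strings.
OTHER_SOURCE = [
    {"department": "Library", "description": "Journals", "cost": "300"},
    {"department": "Sports", "description": "Kits", "cost": "150.25"},
]


def extract_csv(text):
    """Yield (department, item, amount) tuples from CSV text."""
    for row in csv.DictReader(io.StringIO(text)):
        yield row["dept"], row["item"], float(row["amount"])


def extract_other(rows):
    """Yield (department, item, amount) tuples from the second source."""
    for row in rows:
        yield row["department"], row["description"], float(row["cost"])


def load_warehouse(conn, records):
    """Load the unified records into a single warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending "
        "(department TEXT, item TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO spending VALUES (?, ?, ?)", records)
    conn.commit()


def spending_report(conn):
    """Return total spending per department, sorted by department name."""
    cur = conn.execute(
        "SELECT department, SUM(amount) FROM spending "
        "GROUP BY department ORDER BY department"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for the warehouse
records = list(extract_csv(CSV_SOURCE)) + list(extract_other(OTHER_SOURCE))
load_warehouse(conn, records)
report = spending_report(conn)
```

Each source gets its own small extract function that maps its fields onto one shared layout, so the report query only ever has to know about the warehouse table.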
Identify relation between entities → Identify key attribute → Identify other relevant attributes → Draw ER diagram → Review with business users
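The steps above can be sketched in code as a small, text-only description of an ER model that could be handed to business users for review. This is only an illustration: the `Entity` and `Relationship` classes, the `describe` helper, and the Student/Course example are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    key: str                                        # key attribute
    attributes: list = field(default_factory=list)  # other relevant attributes


@dataclass
class Relationship:
    name: str
    left: str          # name of the first entity
    right: str         # name of the second entity
    cardinality: str   # for example "1:N" or "M:N"


def describe(entities, relationships):
    """Return a plain-text outline of the model, one line per element."""
    lines = []
    for e in entities:
        attrs = ", ".join([e.key + " (key)"] + e.attributes)
        lines.append(f"{e.name}: {attrs}")
    for r in relationships:
        lines.append(f"{r.left} --{r.name} [{r.cardinality}]--> {r.right}")
    return lines


# Hypothetical example entities and relationship.
student = Entity("Student", "student_id", ["name", "branch"])
course = Entity("Course", "course_id", ["title"])
enrolls = Relationship("enrolls_in", "Student", "Course", "M:N")

outline = describe([student, course], [enrolls])
```

Printing `outline` line by line gives a compact summary that can be reviewed before the diagram itself is drawn.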
Problems posed by ER Modeling
• End users find it difficult to comprehend and traverse
Eliminates redundant data | Does not eliminate redundant data where appropriate
It is a complex maze with hundreds of entities | It has logically grouped sets in schemas
Data quality dimensions: accuracy, completeness, validity, consistency, timeliness
During ETL package design: helps in identifying what data to extract and what filters to apply
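One way to picture how looking at the data first can help decide what to extract and what filters to apply is a small per-column summary. The sketch below is an assumption-laden illustration, not a prescribed method: the `profile` function, the sample `ROWS`, and the rule of flagging any column with missing values are all hypothetical.

```python
def profile(rows):
    """Return per-column completeness, distinct count, and min/max.

    Values that are None or an empty string count as missing.
    """
    columns = list(rows[0].keys()) if rows else []
    result = {}
    for col in columns:
        values = [r[col] for r in rows]
        present = [v for v in values if v not in (None, "")]
        result[col] = {
            "completeness": len(present) / len(values),
            "distinct": len(set(present)),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
        }
    return result


# Hypothetical sample of source rows.
ROWS = [
    {"order_id": 1, "channel": "web", "amount": 12.0},
    {"order_id": 2, "channel": "", "amount": 8.5},
    {"order_id": 3, "channel": "app", "amount": None},
    {"order_id": 4, "channel": "web", "amount": 20.0},
]

stats = profile(ROWS)

# Columns with missing values are candidates for a filter or a default
# value in the extract step (an illustrative rule, not a standard).
needs_filter = [col for col, s in stats.items() if s["completeness"] < 1.0]
```

Here `needs_filter` lists the columns whose gaps would have to be handled before loading, which is the kind of decision the summary is meant to support.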
• With almost 14,000 locations, Domino’s was already the largest pizza
company in the world by 2015. But when the company launched
its AnyWare ordering system, it was suddenly faced with an avalanche of
data. Users could now place orders through virtually any type of device or
app, including smart watches, TVs, car entertainment systems, and social
media platforms.
• That meant Domino’s had data coming at it from all sides. By putting
reliable data profiling to work, Domino’s now collects and analyzes
data from all of the company’s point-of-sale systems in order to streamline
analysis and improve data quality. As a result, Domino’s has gained deeper
insights into its customer base, enhanced its fraud detection processes,
boosted operational efficiency, and increased sales.