Analytics Engineering Case Study

The document outlines a case study for blending reports from seven independent source systems into a master report that displays the latest customer data side by side. It details the structure of the reports, the hierarchy of customer relationships, and assumptions regarding data handling and uniqueness across systems. The goal is to create a unified view of customer metrics while addressing potential naming inconsistencies and hierarchical relationships among different systems.

Uploaded by

santosh.kumarsantosh801110


We have a collection of 7 source systems (each responsible for different aspects of our
broader product offering) operating somewhat independently. As part of regular processes
they each generate a report showing the status of customers (for the activity covered by that
system). These reports are a “point in time” view, so the metrics will change over time and
are generated at source independently.

The reports are generated at varying cadences (some daily, others weekly) and arrive in
multiple formats (flat files, database connections and CDC interfaces). They all have the
same columns.

The requirement is to blend these reports in one master report, which allows data
consumers to see all the latest rows for each company (as represented in each of the
sources) side by side.

Report Samples
Below are samples from 2 of the 7 source systems (here called A and B). Each report has
the columns:
● Company name : Text
● Internal Company Reference : An ID from the respective source system. It can be a
number, or it can be text
● Status Summary : Text description
● Metrics 1 - 5 (can we think of some good examples) : They are numeric (some
integers, some fixed-point numbers).

System A
Name                 Ref    Status  M1  M2    M3  M4  M5
Bobs Widgets         BOBW1  OPEN    2   44.5  4   4   4
Bobs Widgets Events  BOBW2  OPEN    3   39.6  6   1   11.5
Jukebox Studios      JBS01  NEW     7   38.7  2   6   8

System B
Name              Ref   Status    M1  M2     M3  M4  M5
Bobs Widgets Ltd  2001  ACTIVE    1   0      2   2   1
Gig Management    1001  INACTIVE  4   0      0   0   0
Jukebox Services  2021  ACTIVE    2   44.21  1   0   0

Output
This is the output as suggested by the stakeholder:
Name              System  Ref    Status    M1  M2     M3  M4  M5
Bobs Widgets LTD  A       BOBW1  ACTIVE    2   44.5   4   4   4
Bobs Widgets LTD  A       BOBW2  ACTIVE    3   39.6   6   1   11.5
Bobs Widgets LTD  B       2001   ACTIVE    1   0      2   2   1
Jukebox Studios   A       JBS01  NEW       7   38.7   2   6   8
Gig Management    B       1001   INACTIVE  4   0      0   0   0
Jukebox Services  B       2021   ACTIVE    2   44.21  1   0   0
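
As a rough illustration of the blending step only, the sketch below stacks the two sample reports and tags each row with its source system. The `ReportRow` type and the `blend` function are hypothetical names introduced here, not part of the brief. The sketch deliberately leaves names and statuses exactly as each source reports them; harmonising names across systems (as in the stakeholder's suggested output) would need the mapping data described later.

```python
from decimal import Decimal
from typing import NamedTuple


class ReportRow(NamedTuple):
    """One row of a source report (hypothetical structure for this sketch)."""
    name: str
    ref: str          # internal company reference, kept as text
    status: str
    metrics: tuple    # M1..M5, integers or fixed-point Decimals


def blend(reports: dict[str, list[ReportRow]]) -> list[tuple]:
    """Stack every row of every report, adding the source system as a column."""
    master = []
    for system in sorted(reports):
        for row in reports[system]:
            master.append((row.name, system, row.ref, row.status, *row.metrics))
    return master


# Sample rows copied from the System A and System B reports above.
reports = {
    "A": [
        ReportRow("Bobs Widgets", "BOBW1", "OPEN", (2, Decimal("44.5"), 4, 4, 4)),
        ReportRow("Bobs Widgets Events", "BOBW2", "OPEN", (3, Decimal("39.6"), 6, 1, Decimal("11.5"))),
        ReportRow("Jukebox Studios", "JBS01", "NEW", (7, Decimal("38.7"), 2, 6, 8)),
    ],
    "B": [
        ReportRow("Bobs Widgets Ltd", "2001", "ACTIVE", (1, 0, 2, 2, 1)),
        ReportRow("Gig Management", "1001", "INACTIVE", (4, 0, 0, 0, 0)),
        ReportRow("Jukebox Services", "2021", "ACTIVE", (2, Decimal("44.21"), 1, 0, 0)),
    ],
}

master = blend(reports)
for line in master:
    print(line)
```

Every input row appears exactly once in `master`, so no aggregation takes place at this stage.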

Customer Relationships & Hierarchy

Looking at the sample data, we can immediately see a question about combining data for a
common customer: in this case, “Bobs Widgets” and “Bobs Widgets Ltd” are the same
overall customer, whereas, despite their similar names, “Jukebox Studios” and “Jukebox Services”
are entirely independent and unrelated.

Hierarchies are used to represent different parts of a business, or aspects of the business
activity within a system. In some cases this is because each brand is represented
independently but rolled up to a parent “group”.
● Systems A and D have a 2-tier hierarchy (one parent entity can have multiple
children).
● Systems B, F and G have a 1-tier hierarchy (each entity is independent and has no
children).
● Systems C and E have a 3-tier hierarchy (each parent will have 1 or more child
entities, each of the children will have 1 or more children).

We have access to the individual system hierarchy information as a list of key/value pairs
(child references to parent references).
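
One way to use such child-to-parent pairs is to walk upwards until a reference has no parent. The sketch below assumes the pairs for one system have been loaded into a plain dictionary; the function name `top_level_parent` and all hierarchy contents shown (including BOBW2 being a child of BOBW1, and the `C-…` references) are hypothetical and not stated in the brief.

```python
def top_level_parent(ref: str, parent_of: dict[str, str]) -> str:
    """Follow child -> parent links until reaching a reference with no parent."""
    seen = {ref}
    while ref in parent_of:
        ref = parent_of[ref]
        if ref in seen:
            # Guard against bad data: a cycle would otherwise loop forever.
            raise ValueError(f"cycle detected in hierarchy at {ref!r}")
        seen.add(ref)
    return ref


# Hypothetical 2-tier hierarchy for System A.
system_a_parents = {"BOBW2": "BOBW1"}

# Hypothetical 3-tier hierarchy for System C (grandchild -> child -> parent).
system_c_parents = {"C-300": "C-200", "C-200": "C-100"}

print(top_level_parent("BOBW2", system_a_parents))   # BOBW1
print(top_level_parent("BOBW1", system_a_parents))   # BOBW1
print(top_level_parent("C-300", system_c_parents))   # C-100
```

The same function covers 1-tier systems, where the dictionary is empty and every reference is its own top-level parent.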

We have some independent mapping data (from multiple sources) for relationships between
systems. These are stored as key/value pairs, which can be used to map between customers
as represented in different systems.
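
Because the mapping pairs come from multiple sources, a customer may be linked through a chain of pairs rather than a single one. A minimal sketch of grouping such pairs into customer clusters with a union-find structure follows. It assumes each side of a pair is a `(system, ref)` tuple; that format, the function name `group_customers`, and the System D pair are assumptions made for illustration.

```python
def group_customers(pairs):
    """Group (system, ref) keys linked by mapping pairs.

    Returns a dict from each key seen in the pairs to a representative key
    for its cluster (the smallest key in the cluster, by tuple ordering).
    """
    parent = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Always attach the larger root to the smaller one, so the
            # representative is deterministic regardless of pair order.
            parent[max(root_left, root_right)] = min(root_left, root_right)

    return {key: find(key) for key in list(parent)}


mapping_pairs = [
    (("A", "BOBW1"), ("B", "2001")),   # Bobs Widgets / Bobs Widgets Ltd
    (("B", "2001"), ("D", "1001")),    # hypothetical link into System D
]

clusters = group_customers(mapping_pairs)
print(clusters[("D", "1001")])   # ('A', 'BOBW1')
```

Keys that never appear in a pair, such as Jukebox Studios and Jukebox Services, are absent from the result and can be treated as their own customers.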

Assumptions
At this point, it can be assumed:
1. The data from any individual system can be generated at any level in the hierarchy,
but the level will be consistent for a given customer within that system, so there is
no overlap and no requirement to dedupe an individual report.
2. Each new report from each system completely overwrites the previous one; only
the data points in the latest report are required.
3. An individual company can exist in multiple source systems, and its name may be
inconsistent between them.
4. All metrics are additive.
5. All input rows are to be included in the output (the reporting interface will do any
required filtering).
6. No further aggregations or calculations will be required on the data for the output.
7. The “Internal Company Reference” is unique within each source system but
not across systems (company 1001 in System B may not be the same as
company 1001 in System D).
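
Assumptions 2 and 7 suggest two simple rules: keep only the newest report per system, and identify a row by the pair (system, reference) rather than by the reference alone. The sketch below shows one possible reading of both. The delivery tuples, report dates, row dictionaries and function names are all hypothetical.

```python
from datetime import date


def latest_per_system(deliveries):
    """Keep only the rows of the most recent report for each system.

    deliveries: iterable of (system, report_date, rows) tuples.
    """
    latest = {}
    for system, report_date, rows in deliveries:
        if system not in latest or report_date > latest[system][0]:
            latest[system] = (report_date, rows)
    return {system: rows for system, (_, rows) in latest.items()}


def index_rows(reports):
    """Index rows by the composite key (system, ref)."""
    index = {}
    for system, rows in reports.items():
        for row in rows:
            key = (system, row["ref"])
            if key in index:
                # Would contradict the "unique within each system" assumption.
                raise ValueError(f"duplicate reference {key!r}")
            index[key] = row
    return index


# Hypothetical deliveries: System B arrives twice, System D once.
deliveries = [
    ("B", date(2024, 1, 1), [{"ref": "1001", "status": "ACTIVE"}]),
    ("B", date(2024, 1, 2), [{"ref": "1001", "status": "INACTIVE"}]),
    ("D", date(2024, 1, 1), [{"ref": "1001", "status": "OPEN"}]),
]

index = index_rows(latest_per_system(deliveries))
print(index[("B", "1001")]["status"])   # INACTIVE
print(index[("D", "1001")]["status"])   # OPEN
```

With the composite key, reference 1001 in System B and reference 1001 in System D remain distinct rows.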
