Business Intelligence Concepts: Rajashmita Kar


Business Intelligence

Concepts
RAJASHMITA KAR

AGENDA

What is BI?
BI Lifecycle
Challenges of Building BI
Data Warehouse Concepts
Key Performance Indicator
Normalization and its types
Data Cleansing
Slowly Changing Dimension

Dec16,2013
April 13, 2011

What is BI?


The term Business Intelligence (BI) represents the tools and systems
that play a key role in the strategic planning process of the corporation.

These systems allow a company to gather, store, access and analyze
corporate data to aid in decision-making.

It is an environment in which business users receive information that is
reliable, secure, consistent, understandable, easily manipulated and
timely, facilitating more informed decision making.


BI LIFECYCLE


[Figure: BI lifecycle, read left to right]
Data Sources: Transaction Data (Prod, Mkt, HR, Fin, Acctg), Other Internal Data (ERP), External Data (Demographic), Web Data (Clickstream)
ETL Software: Extract, Clean/Scrub, Transform, Load
Staging Area and Operational Data Store
Data Stores: Data Warehouse, Data Marts (Finance, Marketing, Sales), Meta Data
Data Analysis Tools and Applications: SQL, Queries, Reporting, DSS/EIS, Data Mining, Web Browser
Users: Analysts, Managers, Executives, Operational Personnel, Customers/Suppliers
Products named in the figure: IBM, IMS, VSAM, Oracle, Sybase, SAP, Informix, SAS, Harte-Hanks, Ascential, Informatica, Sagent, First Logic, Teradata, Essbase, Cognos, MicroStrategy, Microsoft, Siebel, Business Objects
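The extract, clean/scrub, transform and load steps of the lifecycle can be sketched in a few lines of Python. This is only a minimal illustration and not the workflow of any product named on the slide; the source rows, the field names and the in-memory SQLite "warehouse" are assumptions made for the example.

```python
import sqlite3

# Hypothetical rows from a source system (illustrative data only).
SALES_ROWS = [
    {"cust": " alice ", "amount": "120.50", "region": "east"},
    {"cust": "BOB", "amount": "80", "region": "West"},
    {"cust": "", "amount": "15", "region": "east"},  # missing customer
]


def extract():
    """Extract: pull raw rows from the source system."""
    return list(SALES_ROWS)


def clean(rows):
    """Clean/scrub: drop rows without a customer and trim whitespace."""
    return [
        {**r, "cust": r["cust"].strip()}
        for r in rows
        if r["cust"].strip()
    ]


def transform(rows):
    """Transform: standardize case and convert amounts to numbers."""
    return [
        (r["cust"].title(), r["region"].upper(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the conformed rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (cust TEXT, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the four steps in order."""
    load(conn, transform(clean(extract())))


conn = sqlite3.connect(":memory:")
run_etl(conn)
# A simple analysis query on the loaded data.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
```

After the run, `totals` holds the summed amount per region, which is the kind of query the analysis tools at the right of the figure would issue.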

Challenges of Building a BI Solution


Data exists in multiple places

Data is not formatted to support complex analysis

Different kinds of workers have different data needs

What data should be examined, and in what detail?

How will users interact with that data?


Data Warehouse


Components of Data Warehouse


Cubes
Measures
Key Performance Indicator
Dimensions
---Attributes
---Hierarchy
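
How these components relate can be shown with a small sketch: a measure is a numeric value, dimensions give it context, attributes describe the dimension members, and a hierarchy lets the measure roll up from a finer level to a coarser one. The data and the names (`FACTS`, `roll_up`) are invented for illustration.

```python
from collections import defaultdict

# Each fact carries dimension attributes (year, quarter, product)
# and one measure (sales_amount). Illustrative data only.
FACTS = [
    {"year": 2013, "quarter": "Q1", "product": "Laptop", "sales_amount": 100},
    {"year": 2013, "quarter": "Q1", "product": "Phone", "sales_amount": 40},
    {"year": 2013, "quarter": "Q2", "product": "Laptop", "sales_amount": 70},
    {"year": 2012, "quarter": "Q4", "product": "Phone", "sales_amount": 30},
]


def roll_up(facts, levels, measure="sales_amount"):
    """Sum a measure grouped by the given dimension attributes."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[level] for level in levels)
        totals[key] += fact[measure]
    return dict(totals)


# Time hierarchy: year -> quarter. Rolling up drops the finer level.
by_quarter = roll_up(FACTS, ["year", "quarter"])
by_year = roll_up(FACTS, ["year"])
# A cube cell is addressed by one member from each chosen dimension.
by_year_product = roll_up(FACTS, ["year", "product"])
```

A real cube precomputes or caches many of these groupings; the sketch only shows what each cell contains.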


DATA MODELING
Data modeling is the process that produces abstract data models for one or
more database components of the data warehouse.
Types of Data Modeling
Conceptual Data Model
Logical Data Model
Physical Data Model


DIMENSIONAL MODELING
Fact Table
Dimension Table


STAR SCHEMA AND SNOWFLAKE SCHEMA
STAR SCHEMA
SNOWFLAKE SCHEMA
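
The difference between the two schemas can be sketched with SQLite. In a star schema the fact table joins directly to denormalized dimension tables; in a snowflake schema a dimension is normalized into further tables, so queries need an extra join. All table and column names below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star schema: the product dimension is one denormalized table.
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category_name TEXT
    );
    -- Snowflake variant: category is split out into its own table.
    CREATE TABLE dim_category (
        category_key INTEGER PRIMARY KEY,
        category_name TEXT
    );
    CREATE TABLE dim_product_sf (
        product_key INTEGER PRIMARY KEY,
        product_name TEXT,
        category_key INTEGER REFERENCES dim_category(category_key)
    );
    -- Fact table: a key into the dimension plus numeric measures.
    CREATE TABLE fact_sales (
        product_key INTEGER,
        quantity INTEGER,
        amount REAL
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Computers'), (2, 'Phone', 'Mobile');
    INSERT INTO dim_category VALUES (10, 'Computers'), (20, 'Mobile');
    INSERT INTO dim_product_sf VALUES (1, 'Laptop', 10), (2, 'Phone', 20);
    INSERT INTO fact_sales VALUES
        (1, 2, 2000.0), (1, 1, 1000.0), (2, 3, 900.0);
    """
)

# Star: one join from the fact table to the dimension.
STAR_QUERY = """
    SELECT p.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY p.category_name
    ORDER BY p.category_name
"""
# Snowflake: the same question needs a second join.
SNOWFLAKE_QUERY = """
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product_sf p ON p.product_key = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
"""
star_result = conn.execute(STAR_QUERY).fetchall()
snowflake_result = conn.execute(SNOWFLAKE_QUERY).fetchall()
```

Both queries return the same totals; the trade-off is fewer joins in the star form against less repeated dimension data in the snowflake form.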


KEY PERFORMANCE INDICATOR (KPI)

A set of quantifiable measures that a company or industry
uses to gauge or compare performance in terms of
meeting their strategic and operational goals.
KPIs vary between companies and industries, depending
on their priorities or performance criteria.
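
In practice a KPI pairs a measured value with a goal. The sketch below is one possible way to express that; the `KPI` class, the 90% warning threshold and the sample figures are assumptions for illustration, not a standard.

```python
from dataclasses import dataclass


@dataclass
class KPI:
    name: str
    actual: float
    target: float
    higher_is_better: bool = True

    def attainment(self) -> float:
        """Ratio of actual to target (inverted when lower is better)."""
        if self.higher_is_better:
            return self.actual / self.target
        return self.target / self.actual

    def status(self, warn_at: float = 0.9) -> str:
        """Classify against the goal; warn_at is an assumed threshold."""
        ratio = self.attainment()
        if ratio >= 1.0:
            return "on target"
        if ratio >= warn_at:
            return "at risk"
        return "off target"


# Made-up figures: one KPI where higher is better, one where lower is.
revenue = KPI("Quarterly revenue", actual=950_000, target=1_000_000)
churn = KPI("Customer churn %", actual=4.0, target=5.0, higher_is_better=False)
```

Here `revenue.status()` evaluates to "at risk" and `churn.status()` to "on target"; which measures and thresholds matter depends on the company's own priorities.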


NORMALIZATION
Database normalization is the process of removing
redundant data from your tables in order to improve storage
efficiency, data integrity, and scalability.
Normalization Forms:

First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)


3NF

Third normal form (3NF) requires that there are no
functional dependencies of non-key attributes on something
other than a candidate key.
A table is in 3NF if all of the non-primary-key attributes are
mutually independent.
There should be no transitive dependencies.


If I know # of Pages, can I find out Author's Name? No. Can I find out
Author's pseudonym? No.
If I know Author's Name, can I find out # of Pages? No. Can I find
out Author's pseudonym? Yes.
Therefore, Author's pseudonym is functionally dependent on
Author's Name, not on the primary key (PK). It has to move out of this table.
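
The same reasoning can be checked mechanically. The sketch below assumes a book table keyed by title with the three columns discussed above; the rows and the helper names (`determines`, `decompose`) are invented for illustration.

```python
# Assumed shape of the table discussed above: one row per book,
# keyed by title. The rows are invented for illustration.
BOOKS = [
    {"title": "Book A", "pages": 300, "author": "Author One", "pseudonym": "Pen One"},
    {"title": "Book B", "pages": 450, "author": "Author One", "pseudonym": "Pen One"},
    {"title": "Book C", "pages": 300, "author": "Author Two", "pseudonym": "Pen Two"},
]


def determines(rows, lhs, rhs):
    """True if each lhs value maps to exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


def decompose(rows):
    """Move the dependency author -> pseudonym into its own table."""
    books = [{k: r[k] for k in ("title", "pages", "author")} for r in rows]
    authors = {r["author"]: r["pseudonym"] for r in rows}
    return books, authors


books_3nf, authors_3nf = decompose(BOOKS)
```

Note that sample data can only refute a functional dependency, never prove it; whether author determines pseudonym is a statement about the business rules, which the data merely fails to contradict.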


BCNF
ClientInterview

- FD1: clientNo, interviewDate -> interviewTime, staffNo, roomNo (primary key)
- FD2: staffNo, interviewDate, interviewTime -> clientNo (candidate key)
- FD3: roomNo, interviewDate, interviewTime -> clientNo, staffNo (candidate key)
- FD4: staffNo, interviewDate -> roomNo (not a candidate key)

- Because the determinant of FD4 is not a candidate key, ClientInterview is not in BCNF and may suffer from update anomalies.
- For example, two tuples have to be updated if the roomNo needs to be changed for staffNo
SG5 on 13-May-02.


To transform the ClientInterview relation to BCNF, we must remove the violating
functional dependency by creating two new relations called
Interview and StaffRoom:
Interview(clientNo, interviewDate, interviewTime, staffNo)
StaffRoom(staffNo, interviewDate, roomNo)
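
The decomposition can be illustrated with a few sample tuples. The rows below are illustrative values for ClientInterview, and the helper names are hypothetical; the sketch shows that projecting into Interview and StaffRoom loses no information and that a room change for one staff member on one date then touches a single tuple.

```python
# Illustrative ClientInterview rows:
# (clientNo, interviewDate, interviewTime, staffNo, roomNo)
CLIENT_INTERVIEW = [
    ("CR76", "13-May-02", "10.30", "SG5", "G101"),
    ("CR56", "13-May-02", "12.00", "SG5", "G101"),
    ("CR74", "13-May-02", "12.00", "SG37", "G102"),
    ("CR56", "1-Jul-02", "10.30", "SG5", "G102"),
]


def decompose(rows):
    """Split into Interview and StaffRoom by projecting the columns."""
    interview = {(c, d, t, s) for c, d, t, s, _ in rows}
    staff_room = {(s, d, r) for _, d, _, s, r in rows}
    return interview, staff_room


def rejoin(interview, staff_room):
    """Natural join on (staffNo, interviewDate) to rebuild the original."""
    room_of = {(s, d): r for s, d, r in staff_room}
    return {(c, d, t, s, room_of[(s, d)]) for c, d, t, s in interview}


interview, staff_room = decompose(CLIENT_INTERVIEW)

# Tuples to touch when SG5's room on 13-May-02 changes,
# before and after the decomposition.
rows_to_update_before = sum(
    1 for _, d, _, s, _ in CLIENT_INTERVIEW if (s, d) == ("SG5", "13-May-02")
)
rows_to_update_after = sum(
    1 for s, d, _ in staff_room if (s, d) == ("SG5", "13-May-02")
)
```

The join is lossless here because staffNo and interviewDate together determine roomNo (FD4), so each Interview tuple finds exactly one matching StaffRoom tuple.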


DATA CLEANSING
WHEN
WHY

WHAT
HOW


SLOWLY CHANGING DIMENSION(SCD)


Dimensions whose attributes change slowly over time are called slowly changing dimensions (SCD).
Examples: product price, customer address, etc.
SCD Types:
Type I SCD
Type II SCD
Type III SCD
Type IV SCD
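
The first three types are commonly described as follows: Type I overwrites the old value, Type II adds a new row for each version, and Type III keeps the previous value in an extra column. The sketch below illustrates these on a customer address; the column names, the `HIGH_DATE` end marker and the sample values are assumptions for the example.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed "still current" end marker


def scd_type1(rows, key, new_address):
    """Type I: overwrite in place; no history is kept."""
    for row in rows:
        if row["customer_id"] == key:
            row["address"] = new_address


def scd_type2(rows, key, new_address, change_date):
    """Type II: expire the current row and append a new version."""
    for row in rows:
        if row["customer_id"] == key and row["end_date"] == HIGH_DATE:
            row["end_date"] = change_date
    rows.append(
        {
            "customer_id": key,
            "address": new_address,
            "start_date": change_date,
            "end_date": HIGH_DATE,
        }
    )


def scd_type3(rows, key, new_address):
    """Type III: keep only the previous value in an extra column."""
    for row in rows:
        if row["customer_id"] == key:
            row["previous_address"] = row["address"]
            row["address"] = new_address


def address_on(rows, key, day):
    """Look up the Type II version that was valid on a given day."""
    for row in rows:
        if row["customer_id"] == key and row["start_date"] <= day < row["end_date"]:
            return row["address"]
    return None


dim1 = [{"customer_id": 1, "address": "Old St"}]
scd_type1(dim1, 1, "New Ave")

dim2 = [
    {
        "customer_id": 1,
        "address": "Old St",
        "start_date": date(2011, 4, 13),
        "end_date": HIGH_DATE,
    }
]
scd_type2(dim2, 1, "New Ave", date(2013, 12, 16))

dim3 = [{"customer_id": 1, "address": "Old St", "previous_address": None}]
scd_type3(dim3, 1, "New Ave")
```

With Type II, `address_on` can answer "what was the address on a given day", which Type I cannot and Type III can only do for one step back.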


SCD (IMPLEMENTATION)
SCD is implemented on dimension tables only and not on fact tables.

DIMENSIONS
CATEGORY (SCD I)
BRAND (SCD I)
MODEL (SCD II by date)
STORE (SCD II by flag)
VENDOR (SCD I)

FACTS
INVENTORY_TRANSACTION_FILE
GOODS_RECEIVING
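
A Type II dimension tracked by flag, as used for STORE above, can be sketched as follows. The surrogate key `store_key`, the `is_current` flag and the sample values are assumptions for illustration. Fact rows are never rewritten; they keep the surrogate key of the dimension version that was current when they were loaded.

```python
def apply_scd2_by_flag(store_dim, store_id, new_city):
    """Flag the current version as old and append a new current row.

    Returns the surrogate key of the new version.
    """
    for row in store_dim:
        if row["store_id"] == store_id and row["is_current"]:
            row["is_current"] = False
    new_key = max((r["store_key"] for r in store_dim), default=0) + 1
    store_dim.append(
        {
            "store_key": new_key,
            "store_id": store_id,
            "city": new_city,
            "is_current": True,
        }
    )
    return new_key


def current_key(store_dim, store_id):
    """Surrogate key that newly loaded fact rows should reference."""
    for row in store_dim:
        if row["store_id"] == store_id and row["is_current"]:
            return row["store_key"]
    return None


# Hypothetical store dimension and fact rows.
store_dim = [
    {"store_key": 1, "store_id": "S01", "city": "City A", "is_current": True}
]
facts = [{"store_key": current_key(store_dim, "S01"), "qty": 5}]

# The store changes city: the dimension gets a new version,
# existing fact rows are left untouched.
apply_scd2_by_flag(store_dim, "S01", "City B")
facts.append({"store_key": current_key(store_dim, "S01"), "qty": 7})
```

A flag makes "give me the current row" a simple filter, while the by-date variant used for MODEL also records when each version was valid.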


THANK YOU
