10 BI Projects
Lifecycle and Road Map
• Begins with project planning
– assess the organization’s readiness
for a data warehouse initiative, and
establish resources and justification
• Ongoing project management
– keep the lifecycle on track
• Business requirements definition
– Understand needs of the business
and translate them into design
considerations
– two-way arrow between project
planning and business
requirements definition (interplay)
Lifecycle and Road Map
• Design and implementation: three parallel
tracks
– Technology:
• Technical architecture design
– overall framework to support the integration of
multiple technologies
• Evaluate and select specific products (only after the
architecture is designed, not before!)
– Data:
• Translate requirements into dimensional models
• Transform dimensional models into physical
structures (including performance tuning strategies)
• Design the data staging extract-transform-load (ETL)
processes
– Analytic applications
• Reports, OLAP queries, KPIs and dashboards
Lifecycle and Road Map
• Deployment
– Bring together the technology,
data, and analytic application
tracks
– Education and support
• Ongoing maintenance
– Support the data warehouse and
its user community
– Future data warehouse growth
by initiating subsequent projects,
returning to the beginning of the
lifecycle
Project Planning
Assessing Readiness
• Is there a strong business sponsor?
– leader who can convince his or her peers to support the warehouse
– vision for the potential impact of a data warehouse on the
organization
• Is there a strong business motivation?
– solve critical business problems, not just nice-to-have
• Is it feasible?
– Especially, data availability from operational systems
• Good relationship between business and IT?
– Does the IT organization understand and respect the business and
vice versa?
• Any existing analytical culture?
– Do business analysts make decisions based on quantitative facts or
based on intuition, anecdotal evidence, and gut reactions?
Project Planning
Staffing
Roles from the business side
Project Planning
Staffing
Roles from the business and/or IT side
(technical resources that understand the business)
Business Requirements
Definition
• Gather requirements by meeting with
business user representatives
• We can’t ask business managers about the
granularity or dimensionality of their critical
data
• We need to talk to them about
– what they do,
– why they do it,
– how they make decisions,
– and how they hope to make decisions in the
future
Business Requirements
Definition: Prioritization
• Upper right corner: projects needing immediate
attention because they’re high-impact and feasible
• Lower left cell: projects to be avoided; they are infeasible and
do little for the business.
• Lower right cell: don’t justify short-term attention,
although project teams sometimes focus here
because these projects are doable and no one will
notice if the project doesn’t go well!
• Upper left cell: meaningful opportunities
– While the data warehouse project team is focused on
projects in the shaded upper right cell, other IT teams
should address the current feasibility limitations of those
in the upper left cell
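The four cells above can be sketched as a small classification over two scores. This is only an illustration: the 1-10 scales, the threshold of 5, and the names `Candidate` and `quadrant` are assumptions, not part of the original prioritization grid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    impact: int       # business impact on a hypothetical 1-10 scale
    feasibility: int  # feasibility on a hypothetical 1-10 scale


def quadrant(c: Candidate, threshold: int = 5) -> str:
    """Map a candidate project to one cell of the impact/feasibility grid."""
    high_impact = c.impact > threshold
    feasible = c.feasibility > threshold
    if high_impact and feasible:
        return "upper right: tackle first"
    if high_impact:
        return "upper left: other IT teams work on feasibility"
    if feasible:
        return "lower right: no short-term attention"
    return "lower left: avoid"


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("order analysis", impact=9, feasibility=8),
    Candidate("customer profitability", impact=9, feasibility=3),
    Candidate("office supplies report", impact=2, feasibility=9),
    Candidate("legacy archive mining", impact=2, feasibility=2),
]

for c in candidates:
    print(f"{c.name}: {quadrant(c)}")
```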
Technical Architecture
Design
• The architecture:
– its key input is the business requirements definition
– lets the team catch problems on paper
– supports the coordination of parallel efforts
– speeds development through the reuse of modular
components
– identifies the immediately required components
versus those which will be incorporated at a later
date
– serves as a communication tool:
• a consistent set of technical requirements within the team,
upward to management, and outward to vendors
Technology Track
Technical Architecture
Design
• Architecture Task Force
– two to three people
– the technical architect, the data staging
designer and analytic application developer
– conduct additional interviews within the IT
organization
– ensure both backroom and front room
representation
Product Selection and
Installation
• The architecture plan is similar to a
“shopping list”
– lets the team select products that fit into the plan’s
framework and deliver the necessary functionality
• Develop a spreadsheet-based evaluation
matrix that identifies
– the evaluation criteria,
– weighting factors to indicate importance
• Opt for a trial period
– opportunity to put the product to real use
– preserves negotiating power
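A minimal sketch of the evaluation matrix described above, written in Python rather than a spreadsheet. The criteria, weights, product names, and 1-5 scores are all hypothetical; only the idea of weighting each criterion by importance comes from the slide.

```python
# Hypothetical evaluation criteria with weighting factors (higher = more important).
weights = {"functionality": 5, "performance": 3, "vendor_support": 2}

# Hypothetical scores per product on a 1-5 scale for each criterion.
scores = {
    "Product A": {"functionality": 4, "performance": 3, "vendor_support": 5},
    "Product B": {"functionality": 5, "performance": 2, "vendor_support": 3},
}


def weighted_score(product_scores: dict, weights: dict) -> float:
    """Weighted average of a product's scores across all criteria."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[c] * product_scores[c] for c in weights)
    return weighted_sum / total_weight


# Rank products from best to worst weighted score.
ranking = sorted(scores, key=lambda p: weighted_score(scores[p], weights), reverse=True)

for product in ranking:
    print(f"{product}: {weighted_score(scores[product], weights):.2f}")
```

A matrix like this only narrows the field; the trial period is still where the product is put to real use.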
Data Track
Dimensional Modeling
• Immediately following the business
requirements definition, draft the data
warehouse bus matrix
• Final prioritization step of the business
requirements activities identifies the specific
business process that will be tackled first
• Conduct design workshops to create the
dimensional schema
– A small team: business system analyst,
business subject matter expert, business power
analyst, and data modeler
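The bus matrix mentioned above is a grid of business processes (rows) against dimensions (columns). A rough sketch of that structure follows; the processes and dimensions are hypothetical examples, and `shared_dimensions` is an illustrative helper name, not an established API.

```python
from collections import Counter

# Hypothetical bus matrix: each business process maps to the dimensions it uses.
bus_matrix = {
    "Retail sales": {"Date", "Product", "Store", "Promotion"},
    "Inventory": {"Date", "Product", "Store"},
    "Purchase orders": {"Date", "Product", "Vendor"},
}


def shared_dimensions(matrix: dict) -> list:
    """Dimensions used by more than one business process, sorted by name."""
    counts = Counter(dim for dims in matrix.values() for dim in dims)
    return sorted(dim for dim, n in counts.items() if n > 1)


def render(matrix: dict) -> str:
    """Render the matrix as text, marking each used dimension with an X."""
    all_dims = sorted(set().union(*matrix.values()))
    lines = ["process | " + " | ".join(all_dims)]
    for process, dims in matrix.items():
        marks = ["X".center(len(d)) if d in dims else " " * len(d) for d in all_dims]
        lines.append(process + " | " + " | ".join(marks))
    return "\n".join(lines)


print(render(bus_matrix))
print("Shared dimensions:", shared_dimensions(bus_matrix))
```

Dimensions that appear in several rows are the ones the modeling team has to design once and reuse across business processes.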
Dimensional Modeling
• Once the modeling team is ready, it
communicates and validates the design
with a broader audience:
– first within the IT and data warehouse team
– and then with others in the business
community
• simplify the schema (e.g., hide the join keys)
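One possible way to hide the join keys when presenting the schema is a database view that exposes only descriptive columns. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table, column, and view names are hypothetical, and a view is only one option among several.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical star schema: one fact table joined to two dimensions by keys.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_date VALUES (20240101, '2024-01-01');
INSERT INTO fact_sales VALUES (20240101, 1, 10.0), (20240101, 2, 25.5);

-- Simplified presentation for business users: no join keys are exposed.
CREATE VIEW sales_for_business AS
SELECT d.calendar_date, p.product_name, f.sales_amount
FROM fact_sales AS f
JOIN dim_date AS d ON d.date_key = f.date_key
JOIN dim_product AS p ON p.product_key = f.product_key;
""")

cursor = conn.execute("SELECT * FROM sales_for_business ORDER BY product_name")
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```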
Analytic Applications Track
Analytic Application
Specification
• Following the business requirements definition,
identify a starter set of approximately 10 to 15
analytic applications
– the number of specific analyses that can be created
from a single template merely by changing variables
will be large
• Establish standards for the applications
– consistent output and look-and-feel (GUI)
• Leverage the Web and customizable information
portals as the dominant strategies for
disseminating application access
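To illustrate why a single template yields many specific analyses, the sketch below expands one parameterized template over a few variable values. The template wording and the variable values are hypothetical.

```python
from itertools import product

# Hypothetical analytic application template with three variables.
template = "{measure} by {dimension} for {period}"

# Hypothetical values each variable may take.
parameters = {
    "measure": ["sales", "margin", "units"],
    "dimension": ["product", "store", "customer segment"],
    "period": ["last month", "year to date"],
}


def expand(template: str, parameters: dict) -> list:
    """Generate every specific analysis obtainable by changing the variables."""
    names = list(parameters)
    combos = product(*(parameters[name] for name in names))
    return [template.format(**dict(zip(names, combo))) for combo in combos]


analyses = expand(template, parameters)
print(len(analyses), "analyses from one template")
print(analyses[0])
```

Even with only two or three values per variable, the count grows multiplicatively, which is why a starter set of 10 to 15 templates can cover a large number of analyses.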
Deployment
• The technology, data, and analytic application
tracks converge at deployment
• Common problem: incomplete deployment
– teams are tempted to serve incomplete data rather than miss a deadline
– doing so endangers acceptance by business users
• Two phases:
– “Alpha test” phase by the core project team
performing an end-to-end system test
– “Beta test” phase by a limited set of business users
performing user-acceptance test
• Package deployment with education and support
– critical to acceptance
Maintenance and Growth
• User support is crucial immediately following the
deployment
– for the first several weeks following user education,
the support team should be working with the users
• Demand for growth, either for:
– new users,
– new data,
– new applications, or
– major enhancements
• Go back to the beginning of the process
– leverage much of the earlier work, especially
regarding the technical and data architectures
Outsourcing?
• Cloud-based application service providers
(ASPs) can take over much of the load of
developing and supporting data
warehouses
– This has both advantages and risks
• Basic question: are we willing to trust a
strategic responsibility to an outsider?
Outsourcing?
Advantages
• ASP already has better skills than our IT
• ASP has configured a complete set of hardware and software
components that are known to work
• ASP has spare hardware capacity to respond to explosive
demands (“pay as you go”)
• ASP has centralized economies of scale for backup and recovery
• Costs of the ASP can be smaller, isolated and better managed
than by internal IT
• ASP takes care of its own personnel management
Risks
• ASP can go out of business
• ASP may upgrade its software on its own schedule
– May not want to make custom modifications
• ASP may support your competitors
– You don’t have direct visibility of the security procedures
of an ASP
• Level-of-service agreement should come from your organization
– not from the lawyers working for the ASP
Security issues
• Destruction of the facility: terrorist
attack, fire or flooding => loss of
hardware and even key personnel
• Deliberate sabotage by an insider: destroying
the system, logically and physically
• Cyberwarfare: gaining unauthorized access to
information, altering information, and disabling
systems (e.g., denial-of-service attacks)
Reaction to security issues
• Distributed systems: multiple computers and locations reduce
vulnerability to sabotage and single-point failures
• Parallel communication paths: the Internet is a robust
communications network that is highly parallelized and adapts itself
continuously to its own changing topology
– But the Internet is locally vulnerable if key switching centers (where high
performance Web servers attach directly to the Internet backbone) are attacked
• Extended storage-area networks: disk drives, archive systems,
and backup devices located in separate buildings on a fairly big
campus
– backup and copying can be performed at extraordinary speeds
– Daily backups to removable media taken to secure storage
• Authentication and access:
– users should be authenticated in a uniform way (regardless of whether they are
inside the building or coming in over the Internet)
– authentication associates the user with a named role that determines the
information the user is entitled to see
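The role-based access idea above can be sketched as two lookups: user to role, and role to the information it may see. The user names, role names, and subject areas below are hypothetical, and the sketch assumes the user has already been authenticated by some other mechanism.

```python
# Hypothetical named roles and the subject areas each role is entitled to see.
ROLE_ENTITLEMENTS = {
    "sales_analyst": {"sales"},
    "finance_manager": {"sales", "finance"},
}

# Hypothetical assignment of authenticated users to named roles.
USER_ROLES = {
    "alice": "sales_analyst",
    "bob": "finance_manager",
}


def can_see(user: str, subject_area: str) -> bool:
    """Return True if the user's role entitles them to the subject area.

    The check takes no network location as input, so it behaves the same
    for users inside the building and users coming in over the Internet.
    """
    role = USER_ROLES.get(user)
    return subject_area in ROLE_ENTITLEMENTS.get(role, set())


print(can_see("alice", "sales"))
print(can_see("alice", "finance"))
print(can_see("mallory", "sales"))
```

Unknown users and unknown roles fall through to an empty entitlement set, so the default is to deny access.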