
Tutorial
Using the AMBER Data Repository to Analyze, Share and Cross-exploit Dependability Data

Marco Vieira
mvieira@dei.uc.pt
University of Coimbra, Portugal

The Second International Conference on Dependability (DEPEND 2009)
Athens/Glyfada, Greece, June 18, 2009

The AMBER Project
• Assessing, Measuring and Benchmarking Resilience in computer systems and components (AMBER)
• Coordination Action supported by the European Commission in the 7th FP
• Coordinating and advancing research in resilience measurement and benchmarking in computer systems and infrastructures

Current challenges
• Quality of measurements
• Integration of the human and technical components of the analysis
• Dynamic and adaptive systems and networks
• Integration with the development processes

AMBER objectives
• State-of-the-art survey
• Research agenda
• Data repository
• Others:
  – Dissemination events (workshops, panels, etc.)
  – Benchmarking tools
  – Training material

This Tutorial…
Learn how to use the AMBER Data Repository to analyze and share data from dependability evaluation experiments.

Problems
• How to analyze the usually large amount of raw data produced in dependability evaluation experiments?
• How to compare results from different experiments, or results of similar experiments across different systems?
  – Different and incompatible tools, data formats, and setup details…
• How to share raw experimental results among research teams?


Current situation
• The situation today is not good!
• Spreadsheets and other specific tools are used to analyze results
  – Not standard and difficult to build
• Difficult to compare data and generalize conclusions
• Researchers share final results and conclusions
  – Papers, mainly
  – Raw data is not shared

ADR vision and objectives
• Vision
  – Become a worldwide repository for dependability-related data
• Key objectives:
  – Provide state-of-the-art data analysis
  – Allow data comparison and cross-exploitation
  – Facilitate worldwide data sharing and dissemination
• Potential tool to increase the impact of research

Data analysis approach
• Repository to analyze, compare, and share results
• Use a business intelligence approach:
  – Data warehouse to store data
  – On-Line Analytical Processing (OLAP) to analyze data
  – Data mining algorithms to identify (unknown) phenomena in the data
  – Information retrieval for data in textual formats
• Adopt the same life cycle as BI data

Outline
1. Business Intelligence
2. Data Warehousing & OLAP
3. Using DW to analyze dependability-related data
4. The AMBER Data Repository

1. Business Intelligence

What is Business Intelligence?
• Business Intelligence (BI):
  – Getting the right information, to the right decision makers, at the right time
• BI is an enterprise-wide platform that supports data gathering, reporting, analysis and decision making
• BI is meant to enable:
  – Fact-based decision making
  – A “single version of the truth”
• BI includes reporting and analytics

Five classic BI questions
• Past: What happened?
• Present: What is happening? Why did it happen?
• Future: What will happen? What do I want to happen?

Typical BI technologies
• ETL tools (Extract, Transform, and Load)
• Repositories
  – Data Warehouse
• Analytical tools
  – Reporting and querying
  – OLAP
  – Data mining
• Information retrieval

Many proprietary products
ACE*COMM, Ab Initio, Actuate, ComArch, CyberQuery, Dimensional Insight, IBM (Applix, Cognos), InetSoft, Informatica, Information Builders, LogiXML, LucidEra, MicroStrategy, Microsoft (Analysis Services, PerformancePoint Server 2007, ProClarity), Oracle Corporation (Hyperion Solutions), Panorama Software, Pervasive, Pilot Software, Inc., PRELYTIS, Prospero Business Suite, QlikTech, SAP (Business Information Warehouse, Business Objects, OutlookSoft), SAS Institute, Siebel Systems, Spotfire (now Tibco), SPSS, StatSoft, Telerik Reporting, Teradata, Thomson Data Analyzer

Some open source/free products
• Eclipse BIRT Project
• Freereporting.com
• JasperSoft
• OpenI
• Palo (OLAP database)
• Pentaho
• RapidMiner
• SpagoBI
• Weka
• Some products from big companies can be used freely

2. Data Warehousing & OLAP

What is a Data Warehouse?
• A big database that stores data for decision support
• Built from the operational data collected from transactional DBs and other operational systems
[Diagram: Operational DB & other systems → Data Warehouse → Users]


Basic DW components
[Diagram: data sources (operational DBs, legacy systems, spreadsheets/files, external sources) feed a data staging area, which loads the data warehouse (presentation servers); users access it through ad hoc queries, reports, specific apps, and models and other tools]

Data volume
• Less than 20 GBytes
  – Small dimension; runs on a PC
• From 20 to 100 GBytes
  – Medium dimension; needs a powerful workstation
• From 100 GBytes to 1 TByte
  – Large dimension; needs a powerful server, normally with parallel processing
• More than 1 TByte
  – Very large dimension; massive parallel processing

Some characteristics
• Temporal dependency
• Non volatile
• Target oriented
• Data integration and consistency
• Designed for queries

Temporal dependency
• The data is collected over time
  – It does not represent a specific moment
  – It represents the history
• A temporal reference must be associated with all data in the database

Non volatile
• The data in the DW is never updated
• The DW stores historic data (historic memory) collected from the operational databases
• After being loaded (from the operational databases) there is only one operation:
  – Queries

Target oriented
• The data warehouse must only store data relevant for decision support
• Much of the operational data (needed for everyday management) is not relevant for the DW


Data integration and consistency
• In an operational environment the information may be stored in different locations using different representations
• That data must be integrated and made consistent before being loaded into the DW

Designed for queries
• After being loaded the data never changes:
  – Only queries are allowed
• The DW stores a large amount of data
• The data must be stored in such a way that it improves performance
  – Multidimensional view
  – Partial denormalization

Dimensional model
• The typical model in operational databases is E/R
• The dimensional model follows a different approach
  – Stores the same data
  – Data organization is user oriented
• Easy to understand
• Very good performance for queries
• Data warehouses built over complex E/R models never succeed

The multidimensional model
• Facts are stored in a multidimensional array
• The dimensions are used to index the array
• Usually built using data from operational databases
[Diagram: a sales cube indexed by Store (Lisbon, Coimbra), Product (Milk, Oil, Sugar, Coffee) and Date (Jan–Apr)]
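The cube above can be sketched as a plain dictionary keyed by dimension coordinates. This is only a toy illustration of "facts indexed by dimensions"; the store, product and month values echo the slide's figure and are not real data.

```python
# Toy multidimensional "cube": fact values indexed by dimension coordinates.
# Coordinates mirror the slide's example (Store x Product x Date).
sales = {
    ("Lisbon",  "Milk",  "Jan"): 2,
    ("Coimbra", "Oil",   "Feb"): 5,
    ("Lisbon",  "Sugar", "Mar"): 3,
}

def slice_by_store(cube, store):
    """Fix the Store dimension and return the remaining sub-cube."""
    return {(p, m): v for (s, p, m), v in cube.items() if s == store}

lisbon = slice_by_store(sales, "Lisbon")
total_lisbon = sum(lisbon.values())  # 2 + 3 = 5
```

A real OLAP engine does the same indexing, only over persistent storage and with aggregation pushed into the query engine.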

Star model
• The typical dimensional model is a star structure with:
  – A central table with facts
  – Several dimension tables describing the facts
[Diagram: a facts table (ID_dim 1..4, Fact 1..n) surrounded by Dimension 1–4 tables, each with its key and attributes]

Facts
• Represent the business measures
• The most useful facts are:
  – Numbers
  – Additive


Facts table
• Comprises several numeric attributes (facts) and foreign keys to the dimensions
• Normalized table
• M:1 relationships with the business dimensions
• Normally contains a large number of records
• Typically represents 95% of the space used by the DW

Dimensions
• Each dimension represents a business parameter
  – Time, clients, products, etc.
• Represent an entry point for the analysis of the facts
• Represent different points of view for the analysis of the facts

Dimension tables
• Strongly denormalized
  – For performance
• Dimensions have hierarchies
  – Day → Month → Year → …
• Contain a large set of attributes
• Typically comprise a small number of records (when compared to the facts table)

Star schema example
• Time (ID_time, Day, Day_of_week, Week_of_year, Month, Trimester, Year)
• Store (ID_store, Name, Local, District, Area, Num_tellers)
• Product (ID_product, Name, Type, Brand, Category, Pack, Description)
• Sale facts (ID_time, ID_product, ID_store, Units_sold, Purchase_cost, Sale_value, Num_Clients)
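A minimal sketch of the star schema above, using SQLite through Python. Column lists are abridged and the sample rows are made up; the point is the shape — one fact table holding measures and foreign keys, surrounded by small dimension tables.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE time_dim  (id_time    INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE store_dim (id_store   INTEGER PRIMARY KEY, name TEXT, district TEXT);
CREATE TABLE product   (id_product INTEGER PRIMARY KEY, name TEXT, brand TEXT);
CREATE TABLE sale (                       -- fact table: keys + measures only
    id_time    INTEGER REFERENCES time_dim,
    id_store   INTEGER REFERENCES store_dim,
    id_product INTEGER REFERENCES product,
    units_sold INTEGER, sale_value REAL, num_clients INTEGER
);
""")
con.execute("INSERT INTO time_dim  VALUES (1, '2009-06-18', 'Jun', 2009)")
con.execute("INSERT INTO store_dim VALUES (1, 'Coimbra', 'Centro')")
con.execute("INSERT INTO product   VALUES (1, 'Milk', 'BrandX')")
con.execute("INSERT INTO sale VALUES (1, 1, 1, 10, 25.0, 7)")

# A typical star query: join the fact table out to its dimensions.
row = con.execute("""
    SELECT s.name, p.name, f.units_sold
    FROM sale f
    JOIN store_dim s ON f.id_store   = s.id_store
    JOIN product   p ON f.id_product = p.id_product
""").fetchone()
# row == ('Coimbra', 'Milk', 10)
```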

Low level queries
• Example over the star schema above: average revenue per sale, by brand and month

  select brand, month, avg(sale_value * units_sold)
  from sale, time, product
  where JOIN_TABLES
  group by brand, month

User interfaces
• Explore data in data warehouses
  – Typical OLAP tools
    • Access the relational engine using SQL
    • Data presentation using tables, graphics, reports, etc.
    • Targeted for ad-hoc queries
  – Other tools
    • Data mining
    • Modeling
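The slide's aggregation can be run end to end on a tiny ad-hoc dataset; table and column names follow the slide, but the rows below are invented for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product  (id_product INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE time_dim (id_time    INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE sale     (id_time INTEGER, id_product INTEGER,
                       units_sold INTEGER, sale_value REAL);
""")
con.executemany("INSERT INTO product VALUES (?, ?)",  [(1, "BrandA"), (2, "BrandB")])
con.executemany("INSERT INTO time_dim VALUES (?, ?)", [(1, "Jan"), (2, "Feb")])
con.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)",
                [(1, 1, 2, 10.0),   # BrandA, Jan: revenue 20
                 (1, 1, 4, 10.0),   # BrandA, Jan: revenue 40
                 (2, 2, 1, 5.0)])   # BrandB, Feb: revenue 5

rows = con.execute("""
    SELECT p.brand, t.month, AVG(f.sale_value * f.units_sold)
    FROM sale f
    JOIN time_dim t ON f.id_time    = t.id_time
    JOIN product  p ON f.id_product = p.id_product
    GROUP BY p.brand, t.month
""").fetchall()
# BrandA/Jan averages (20, 40) -> 30.0; BrandB/Feb -> 5.0
```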


Queries – Slice and Dice
[Diagram: slicing the cube gives sales by time and product, or sales by store and brand]

Drill-Down & Roll-Up
• Drill-down: move from the most generic category, through intermediate categories, down to full detail
• Roll-up: move from the most detailed category back up to the most generic one
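Roll-up along a hierarchy is just aggregation that discards the finer level. A small sketch for the Day → Month step (dates and values are illustrative):

```python
from collections import defaultdict

# Daily sales at the finest grain of the time hierarchy.
daily_sales = {
    ("2009-06-17", "Milk"): 4,
    ("2009-06-18", "Milk"): 6,
    ("2009-06-18", "Oil"):  2,
}

def roll_up_to_month(facts):
    """Aggregate the Day level away, keeping (month, product) totals."""
    monthly = defaultdict(int)
    for (day, product), units in facts.items():
        monthly[(day[:7], product)] += units   # '2009-06-17' -> '2009-06'
    return dict(monthly)

monthly_sales = roll_up_to_month(daily_sales)
# {('2009-06', 'Milk'): 10, ('2009-06', 'Oil'): 2}
```

Drill-down is the inverse navigation: it requires the finer-grained facts to still be stored, which is why granularity decisions (below) matter.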

Time: Drill-Down & Roll-Up
[Diagram: drill-down/roll-up along the time hierarchy: ALL → Year → Trimester → Month → Week/Day]

Steps for the design of the star model
1. Identify the business process/activity
2. Identify the facts
3. Identify the dimensions
   • Product, Category, …
   • Store, City, …
4. Define the data granularity
   • Day, Week, Month, …
• Do not forget that the model depends on the available data (operational databases, files, etc.)

Example – Retail sales
• Set of stores belonging to the same enterprise
• Goal: analysis of sales
• Each store has several departments (food, hygiene and cleaning, etc.)
• Sells thousands of products
• Products are identified using a unique number

Retail sales – Business data
• Where to collect the data?
  – POS (point of sale)
  – Operational database
• What to measure?
  – Sales
• Goals?
  – Maximize the profit
  – Maximum sales price possible
  – Lower costs
  – More clients

Retail sales – Facts
• Examples of relevant decision support facts:
  – Number of units sold
  – Acquisition costs
  – Sale value
  – Number of clients that bought the product
• Question: is it possible to obtain base data (from the operational system) for these facts?

Retail sales – Dimensions
• Main dimensions:
  – Product × Store × Time
• Are there other relevant dimensions?
  – Supplier? Promotions? Client?
  – Employee responsible for the store on that day?
• It is normally possible to add extra dimensions
• All the dimensions have a 1:M relationship with the facts

Retail sales
• Star schema:
  – Sales facts (ID_product, ID_time, ID_store, ID_promotion, units_sold, purchase_cost, sale_value, num_clients)
  – Product dimension (description, full_description, SKU_number, package_size, brand, subcategory, category, department, package_type, diet_type, weight, units_per_retail_case, units_per_shipping_case, cases_per_pallet, shelf dimensions, …)
  – Time dimension (date, day_of_week, day/week numbers, month, quarter, fiscal_period, year, holiday_flag, …)
  – Store dimension (name, store_number, street address, city, county, state, zip, sales district/region, manager, phone, fax, floor_plan_type, opening/remodel dates, areas in sqft, …)
  – Promotion dimension (number, name, type of price reduction, type of advertisement, poster, coupons, promotion_cost, start_date, end_date, …)

Granularity
• Example: record the daily sales of all products
  – Analyze in detail (price, quantity, etc.) the products sold every day, in each store, …
• Retail sales granularity:
  – Product × Store × Promotion × Day
• The granularity defines the detail of the DW and has a strong impact on its size
• The granularity must be adjusted to the analysis requirements

Retail sales – Details
• The Product dimension:
  – Must characterize the products as seen by the business managers
  – Must contain the attributes that are relevant for posterior queries
  – Is typically generated from the operational databases (as is any other dimension)
  – Is strongly denormalized
• The Time dimension:
  – Mandatory dimension that represents the temporal dependency of the DW
  – Must describe time as seen by the business management
  – Must contain the time attributes that are relevant for queries
  – Is typically generated in a synthetic manner (rather than from the operational databases)
  – Includes all the records representing the time period considered in the DW

Retail sales – Details
• The Store dimension:
  – Must characterize the stores as seen by the business management
  – Must contain the attributes that are relevant for posterior queries
  – Includes geographical attributes (localization)
  – Includes time attributes (opening date, …)
• The Promotion dimension:
  – Characterizes the existing promotions
  – Represents a very important dimension: managers want to know the impact of promotions on sales, in order to target new promotions to specific products, stores and times
  – In this example there is only one dimension related to promotions

More than one star
• Two or more stars can be connected using one or more dimensions
• Shared dimensions must be conformed
  – They must contain consistent data when considering each star
• Drill across: a query that crosses more than one star
[Diagram: Sales facts (ID_time, ID_product, ID_store, units_sold, purchase_cost, sale_value, num_clients) and Stock facts (ID_time, ID_product, ID_warehouse, quant_available, quant_out, purchase_cost, last_sell_price) sharing the Time and Product dimensions]

Several stars
• Sales: dimensions Time, Component, Client, Contract
• Orders: dimensions Time, Component, Supplier, Contract
• Stocks: dimensions Time, Component, Warehouse
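A drill-across sketch: two fact tables (sales and stock) sharing a conformed time dimension, answered by one query. The data is illustrative, with one fact row per month so the join stays simple.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE time_dim (id_time INTEGER PRIMARY KEY, month TEXT);  -- conformed
CREATE TABLE sales (id_time INTEGER, units_sold INTEGER);         -- star 1
CREATE TABLE stock (id_time INTEGER, quant_available INTEGER);    -- star 2
INSERT INTO time_dim VALUES (1, 'Jan'), (2, 'Feb');
INSERT INTO sales VALUES (1, 10), (2, 20);
INSERT INTO stock VALUES (1, 100), (2, 80);
""")
rows = sorted(con.execute("""
    SELECT t.month, SUM(sa.units_sold), SUM(st.quant_available)
    FROM time_dim t
    JOIN sales sa ON sa.id_time = t.id_time
    JOIN stock st ON st.id_time = t.id_time
    GROUP BY t.month
""").fetchall())
# [('Feb', 20, 80), ('Jan', 10, 100)]
```

Note this naive form only works because each star has exactly one row per time key; with multiple rows per key the join fans out and inflates the sums, which is why drill-across in practice aggregates each star separately before combining them.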

Questions?

3. Using DW to analyze dependability data


Basic elements of a DW
[Diagram: operational DBs, legacy systems, spreadsheets/files and external sources feed a data warehouse on a multidimensional server; an OLAP application supports ad hoc queries, statistical analysis and reporting]

A DW for experimental data
[Diagram: experiments — fault injection tools, robustness testing tools, dependability benchmarking, statistical experiments, any other experimental environment, field data — running on experimental systems A…N feed, over the LAN/Internet, a data warehouse on a multidimensional server; an OLAP application supports result analysis via ad hoc queries, statistical tools and reporting]

Key points of the proposed approach
[Diagram: experimental setups A…N feed, over the network, a multidimensional database (data warehouse); an OLAP tool supports ad hoc queries, statistical analysis and reporting]
• General approach to store results from dependability evaluation experiments
• Data from different experiments can be compared/cross-exploited (only if it makes sense to compare them)
• Raw data is available (not only the final results)
• Results can be analyzed and shared worldwide by using web-enabled versions of OLAP tools

Two types of data in experimental dependability evaluation
[Diagram: an experiment management system defines faults for the target system and collects readouts (impact of faults) over the network]
• Measures collected from the target system (FACTS)
  – For example, raw data representing error detection efficiency, recovery time, failure modes, etc.
• Features of the target system and experimental setup that have impact on the measures (DIMENSIONS)
  – For example, attributes describing the target systems, the different configurations, the workload, the faultload, etc.
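The facts/dimensions split can be made concrete with two small record types; the field names here are illustrative placeholders, not the actual AMBER schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExperimentDimensions:
    """Setup features that influence the measures (DIMENSIONS)."""
    target_system: str
    workload: str
    faultload: str

@dataclass
class SlotFacts:
    """Measures collected from the target system (FACTS)."""
    recovery_time_s: float
    lost_transactions: int
    failure_mode: str

# One injection slot = one fact record tagged with its dimension coordinates.
slot = (ExperimentDimensions("DBMS-X", "TPC-C-like", "operator-faults"),
        SlotFacts(recovery_time_s=42.5, lost_transactions=3,
                  failure_mode="detected"))
```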

The multidimensional model
• Facts are stored in a multidimensional array
• Dimensions are used to access the array according to any possible criteria
[Diagram: a cube indexed by Target system (System A, System B), Faultload and Workload]

The star schema
[Diagram: star schema for experimental data — a facts table surrounded by dimension tables]


Basic elements of the proposed approach
[Diagram: experimental setups A…N feed, via loading applications, a multidimensional database (data warehouse); analysis via ad hoc queries, statistical tools and reporting]
• Experiments: the experimental setups are used as they are. You can use your favorite dependability evaluation tool and do the experiments in the usual way. It is only necessary…
  – To know the format of the raw results
  – To have access to the results
• Loading applications:
  – General-purpose loading applications
  – Some transformations of the data are normally necessary for consistency
• Data warehouse:
  – Raw data is available in a standard star schema (facts + dimensions)
  – If results from different experiments are compatible and can be compared/analyzed together, then they are stored in the same star schema (or in schemas that share at least one dimension)
  – If results are from different, unrelated experiments, then they are stored in a separate schema
• Analysis:
  – Commercial OLAP tools are used to analyze the raw data and compute the measures. These tools are designed to be used by managers: very easy to use :-)
  – Just need an internet browser to analyze the data

Steps needed to put our approach into practice
1. Define an adequate star schema to store the data; create the tables in the data warehouse
2. Use a general-purpose loading application to define the loading plans for each table in the star schema
3. Run the loading plans to load the star tables with the raw data collected from the experiments
4. Every time a new experiment is done, the corresponding loading plans are run again to add the new data to the data warehouse
5. Analyze the data: calculate measures, find unexpected results, analyze trends, etc.

Example: Recovery and Performance Evaluation in DBMS
• Tuning a large DBMS is very complex
• Administrators tend to focus on performance tuning and disregard the recovery features
• Administrators seldom have feedback on how good a given configuration is
• A technique to characterize the performance and the recoverability of DBMS is needed
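Steps 2–3 above (define and run a loading plan) can be sketched as parsing raw CSV rows from an injection-slot log and inserting them into a fact table. The column layout below is hypothetical, not the actual DBench-OLTP format.

```python
import csv
import io
import sqlite3

# Stand-in for one raw result file produced by an experiment run.
raw = io.StringIO(
    "slot_id,fault_type,recovery_time_s,lost_txn\n"
    "1,operator,30.0,0\n"
    "2,operator,55.0,4\n"
)

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE slot_facts
               (slot_id INTEGER, fault_type TEXT,
                recovery_time_s REAL, lost_txn INTEGER)""")

# The "loading plan": one typed insert per raw row.
for row in csv.DictReader(raw):
    con.execute("INSERT INTO slot_facts VALUES (?, ?, ?, ?)",
                (int(row["slot_id"]), row["fault_type"],
                 float(row["recovery_time_s"]), int(row["lost_txn"])))

n_slots = con.execute("SELECT COUNT(*) FROM slot_facts").fetchone()[0]  # 2
```

Rerunning step 4 for a new experiment is then just pointing the same plan at the new files.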

The Approach
• Extend existing performance benchmarks to evaluate recoverability features in DBMS
• Include a faultload and new measures

Operator fault injection and recovery
• Faultload based on operator faults
• Measures related to recovery:
  – Recovery time
  – Data integrity violations
  – Lost transactions
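Once the raw per-slot readouts are in the warehouse, the recovery measures above are simple aggregations. A sketch with illustrative field names and numbers:

```python
from statistics import mean

# Per-injection-slot readouts (hypothetical values).
slots = [
    {"recovery_time_s": 30.0, "lost_txn": 0, "integrity_violations": 0},
    {"recovery_time_s": 55.0, "lost_txn": 4, "integrity_violations": 1},
]

avg_recovery_time = mean(s["recovery_time_s"] for s in slots)        # 42.5
total_lost_txn = sum(s["lost_txn"] for s in slots)                   # 4
slots_with_violations = sum(
    1 for s in slots if s["integrity_violations"] > 0)               # 1
```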

Experimental setup
[Diagram: test setup for the DBMS experiments]

The data storage model
[Diagram: the data storage model used in the experiments]

Steps towards data analysis
1. Definition of the adequate star schema
   a. Identify the process/activity
   b. Identify the facts
   c. Identify the dimensions
   d. Define the data granularity
2. Load the data
3. Analyze the data

Definition of the adequate star schema: identify the process/activity
• Experiments to characterize the performance and the recoverability of DBMS
• Includes a faultload and new measures
• Faultload based on operator faults
• Measures related to recovery


Definition of the adequate star schema: identify the facts
[Diagram: facts identified for the DBMS experiments]

Definition of the adequate star schema: identify the dimensions
[Diagram: dimensions identified for the DBMS experiments]

Definition of the adequate star schema: define the data granularity
• Performance and recovery results
  – Per experiment
  – Per SUT
  – Per workload
  – Per fault type

The star schema
[Diagram: resulting star schema for the DBMS experiments]

Load the data
[Diagram: ETL process loading the raw results]

Analyze the data: example of query construction
[Diagram: building a query in the OLAP tool]


Analyze the data: example of query answer
[Diagram: the query answer shown in the OLAP tool]

Questions?

4. The AMBER Data Repository

AMBER Repository vision and objectives
• Vision
  – Become a worldwide repository for dependability-related data
• Key objectives:
  – Provide state-of-the-art data analysis
  – Allow data comparison and cross-exploitation
  – Facilitate worldwide data sharing and dissemination
• Potential tool to increase the impact of research

Potential use
• Research team level
  – Perform the analysis of data in an efficient way
  – Efficient dissemination of the team’s results
• Project level
  – Sharing and cross-exploitation of results from different project teams
• Worldwide
  – Common repository to store and share data
  – Many teams are performing dependability evaluation, but there are no results available on the web

Data analysis approach
• Repository to analyze, compare, and share results
• Use a business intelligence approach:
  – Data warehouse to store data
  – On-Line Analytical Processing (OLAP) to analyze data
  – Data mining algorithms to identify (unknown) phenomena in the data
  – Information retrieval to access data in textual formats
• Adopt the same life cycle as BI data
• Use technology already available for DW, DM & IR

Steps
1. User registration
2. Multidimensional analysis
3. Definition of the loading plans
4. Load the data
5. Definition of data ownership policies
6. Analysis of the data
• Running example: analyze DBench-OLTP results using OLAP

User registration
• ADR users must undergo a registration procedure
• They provide identification information that is verified by the ADR support team
  – To filter out malicious users
• Contact information is used to get in touch with the potential repository user
• To access the repository, users must authenticate

Multidimensional analysis
• Design an adequate multidimensional data model
• If the user has the required expertise to design the data model:
  – Send the ADR support team the SQL scripts needed to create the database tables
• Otherwise, the ADR team helps the user define the model
  – The user only needs to explain to us the experimental setup and the format of the data collected

The DBench-OLTP benchmark
[Diagram: overview of the DBench-OLTP benchmark setup]

Format of the raw data
• Raw data collected by DBench-OLTP is composed of tens of CSV files (one from each run)
• Each row contains data from an injection slot
  – Identification, duration, number of transactions executed, data integrity errors discovered, type of fault injected, moment of fault injection, workload used, etc.
• A text file describes the experiment and the characteristics of the SUB

Data model (1)
• Key steps:
  – Identification of the facts that characterize the problem under analysis
  – Identification of the dimensions that may influence the facts
  – Definition of the granularity of the data stored in the star schema


Data model (2)

Definition of the loading plans

• Data extraction
   − SQL scripts to extract data from the CSV files into a temporary database schema (data staging area)
• Data transformation
   − SQL scripts to transform the data into an adequate format
• Data load
   − SQL scripts to load the transformed data into the data warehouse
• Loading plans are documented and stored in the ADR
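The three steps above can be sketched as follows. The actual plans are SQL scripts stored in the ADR; here the CSV layout (slot id, duration, transactions executed) and table names are assumptions made for illustration.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (slot_id TEXT, duration TEXT, tx TEXT)")
conn.execute("CREATE TABLE warehouse (slot_id INTEGER, duration_s REAL, tx INTEGER)")

# 1. Extract: copy the raw CSV rows into the staging area unchanged.
raw = io.StringIO("1,30.5,1200\n2,30.1,1185\n")
for row in csv.reader(raw):
    conn.execute("INSERT INTO staging VALUES (?, ?, ?)", row)

# 2 + 3. Transform and load: cast the text fields to proper types
#        while moving them from the staging area into the warehouse.
conn.execute("""
    INSERT INTO warehouse
    SELECT CAST(slot_id AS INTEGER),
           CAST(duration AS REAL),
           CAST(tx AS INTEGER)
    FROM staging
""")
print(conn.execute("SELECT COUNT(*) FROM warehouse").fetchone()[0])  # → 2
```

Keeping extraction and transformation separate means a malformed run can be inspected in the staging area without touching the warehouse.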

Load the data

• Execute the loading plans created before
• If new data becomes available, we just need to rerun the plans
   − e.g., if the benchmark is executed on other systems
• The documentation of DBench-OLTP includes papers and technical reports
   − This is considered part of the DBench-OLTP data
   − It is loaded into the repository and made available to potential readers of the data

Data ownership policy

• Data ownership policies of the ADR are divided into three main groups
   − Private data
   − Proprietary data
   − Collaborative data
• For the DBench-OLTP data we decided to use a collaborative approach
   − Allows other potential users of the benchmark to compare their results with the ones available in the ADR
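One way to make a loading plan safe to rerun when new data arrives is to key the load on the run or slot identifier, so re-executing the plan over an extended data set updates rather than duplicates rows. This is a sketch of that idea, not the ADR's actual mechanism; the schema is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse (slot_id INTEGER PRIMARY KEY, tx INTEGER)")

def run_loading_plan(rows):
    # Keyed on slot_id: rerunning the plan leaves existing
    # rows in place and only adds (or refreshes) new ones.
    conn.executemany("INSERT OR REPLACE INTO warehouse VALUES (?, ?)", rows)

run_loading_plan([(1, 1200), (2, 1185)])             # first execution
run_loading_plan([(1, 1200), (2, 1185), (3, 1190)])  # rerun with new data
print(conn.execute("SELECT COUNT(*) FROM warehouse").fetchone()[0])  # → 3
```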

Analysis of the data

• On-Line Analytical Processing (OLAP) tools
   − Support the analysis in a very flexible way
   − Provide high query performance and easy, intuitive data navigation
• Oracle Business Intelligence Discoverer Plus (ODP)
   − Commercial tool included in the Oracle Business Intelligence package
   − Widely used in industry
   − Free for research purposes under an Oracle Academy Agreement

OLAP Wizard

• Selection of query type (crosstab or table) and characteristics (title, graph, text area, etc.)
• Selection of measures and dimensional attributes
• Setting the query layout
• Selection of the fields used to sort the results
• Creation of parameters used to filter data
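A crosstab of the kind the wizard produces boils down to a GROUP BY over a measure and two dimensional attributes. The sketch below builds one by hand (table, column names, and the sample values are illustrative, not ADR data): fault type on the rows, workload on the columns, summed integrity errors as the measure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE slots (fault_type TEXT, workload TEXT, errors INTEGER)")
conn.executemany("INSERT INTO slots VALUES (?, ?, ?)", [
    ("operator", "wl-A", 2), ("operator", "wl-B", 0),
    ("software", "wl-A", 1), ("software", "wl-A", 3),
])

# Crosstab: rows = fault type, columns = workload, cells = SUM(errors).
crosstab = {}
for fault, wl, total in conn.execute(
        "SELECT fault_type, workload, SUM(errors) "
        "FROM slots GROUP BY fault_type, workload"):
    crosstab.setdefault(fault, {})[wl] = total

# crosstab["software"]["wl-A"] == 4 (the two software-fault slots combined)
```

An OLAP tool adds interactive drill-down and pivoting on top of exactly this kind of aggregation.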


Some results

Quick demo…

• Murphy's law…

http://www.amber-project.eu

Questions?

Do you have data?
Share them!

Generic bibliography

• Ralph Kimball, Margy Ross, "The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling" (Second Edition), John Wiley & Sons, Inc., 2002.
• Ralph Kimball, "The Data Warehouse Lifecycle Toolkit", John Wiley & Sons, Inc., 2001.

ADR bibliography

• Madeira, H., Costa, J., Vieira, M., "The OLAP and Data Warehousing Approaches for Analysis and Sharing of Results from Dependability Evaluation Experiments", International Conference on Dependable Systems and Networks (DSN-DCC 2003), San Francisco, CA, USA, June 2003.
• Pintér, G., Madeira, H., Vieira, M., Pataricza, A., Majzik, I., "A Data Mining Approach to Identify Key Factors in Dependability Experiments", Fifth European Dependable Computing Conference (EDCC-5), Budapest, Hungary, April 2005.


ADR bibliography

• Pintér, G., Madeira, H., Vieira, M., Majzik, I., Pataricza, A., "Integration of OLAP and Data Mining for Analysis of Results from Dependability Evaluation Experiments", International Journal of Knowledge Management Studies (IJKMS), Volume 2, Issue 4, Inderscience Publishers, July 2008.
• Vieira, M., Mendes, N., Durães, J., Madeira, H., "The AMBER Data Repository", DSN 2008 Workshop on Resilience Assessment and Dependability Benchmarking (DSN-RADB08), Anchorage, Alaska, June 2008.
• Vieira, M., Mendes, N., Durães, J., "A Case Study on Using the AMBER Data Repository for Experimental Data Analysis", SRDS 2008 Workshop on Sharing Field Data and Experiment Measurements on Resilience of Distributed Computing Systems, Naples, Italy, October 2008.

