Data Warehouse Introduction

The document discusses the key differences between OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) systems. OLTP systems are operational systems that handle high volumes of simple transactions like inserts and updates, while OLAP systems are used for analysis and involve more complex queries and aggregations on historical data stored in a dimensional model. The document also provides an overview of common Business Intelligence (BI) components like data warehouses, ETL processes, BI tools, and ETL tools. It describes the extract, transform, load (ETL) process for moving data from source systems into a data warehouse for analysis.

Uploaded by

Venkatesan Raj
Copyright
© Attribution Non-Commercial (BY-NC)

Data Warehouse

OLTP vs. OLAP


We can divide IT systems into transactional (OLTP) and analytical (OLAP). In general we can assume that OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.

- OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The main emphasis in OLTP systems is on very fast query processing, on maintaining data integrity in multi-access environments, and on effectiveness measured in transactions per second. An OLTP database holds detailed, current data, and the schema used to store it is the entity model (usually 3NF).
- OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems, response time is the measure of effectiveness. OLAP applications are widely used in data mining. An OLAP database holds aggregated, historical data stored in multi-dimensional schemas (usually a star schema).
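The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration under stated assumptions: the `orders` table, its columns, and the helper names `place_order` and `monthly_revenue` are made up for the example, and both workloads run against the same in-memory database purely for brevity, whereas in practice the analytical query would normally run on a separate warehouse.

```python
import sqlite3

# Hypothetical operational table; the schema is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount REAL NOT NULL,"
    " order_date TEXT NOT NULL)"
)


def place_order(customer, amount, order_date):
    """OLTP-style work: one short transaction that touches a single row."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
            (customer, amount, order_date),
        )
    return cur.lastrowid


def monthly_revenue():
    """OLAP-style work: one query that scans and aggregates many rows."""
    return conn.execute(
        "SELECT substr(order_date, 1, 7) AS month, SUM(amount)"
        " FROM orders GROUP BY month ORDER BY month"
    ).fetchall()


place_order("alice", 120.0, "2024-01-15")
place_order("bob", 80.0, "2024-01-20")
place_order("alice", 50.0, "2024-02-03")
report = monthly_revenue()
print(report)
```

Each call to `place_order` is small and independent, which is why transactions per second is the natural OLTP metric; `monthly_revenue` reads every row, so its cost grows with the amount of history kept.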

The following table summarizes the major differences between OLTP and OLAP system design.

| | OLTP System: Online Transaction Processing (Operational System) | OLAP System: Online Analytical Processing (Data Warehouse) |
|---|---|---|
| Source of data | Operational data; OLTPs are the original source of the data | Consolidation data; OLAP data comes from the various OLTP databases |
| Purpose of data | To control and run fundamental business tasks | To help with planning, problem solving, and decision support |
| What the data reveals | A snapshot of ongoing business processes | Multi-dimensional views of various kinds of business activities |
| Inserts and updates | Short and fast inserts and updates initiated by end users | Periodic long-running batch jobs refresh the data |
| Queries | Relatively standardized and simple queries returning relatively few records | Often complex queries involving aggregations |
| Processing speed | Typically very fast | Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes |
| Space requirements | Can be relatively small if historical data is archived | Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP |
| Database design | Highly normalized with many tables | Typically de-normalized with fewer tables; use of star and/or snowflake schemas |
| Backup and recovery | Backup religiously; operational data is critical to run the business, and data loss is likely to entail significant monetary loss and legal liability | Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method |
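To make the database design difference more concrete, here is a small sketch of a star schema: one fact table surrounded by dimension tables, queried with an aggregation. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample figures are all hypothetical, and SQLite stands in for a real warehouse engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One central fact table with numeric measures, plus two dimension tables
# holding descriptive attributes (hypothetical names and columns).
conn.executescript(
    """
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        quantity INTEGER,
        revenue REAL);
    """
)
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Laptop", "Hardware"), (2, "Mouse", "Hardware"), (3, "Antivirus", "Software")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(20230101, 2023, 1), (20240101, 2024, 1)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [
        (1, 20230101, 2, 2000.0),
        (2, 20230101, 10, 250.0),
        (3, 20230101, 5, 150.0),
        (1, 20240101, 1, 1000.0),
        (3, 20240101, 4, 120.0),
    ],
)

# Typical analytical query: join the fact table to its dimensions and
# aggregate a measure along dimension attributes.
revenue_by_year_and_category = conn.execute(
    """
    SELECT d.year, p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date AS d ON d.date_key = f.date_key
    GROUP BY d.year, p.category
    ORDER BY d.year, p.category
    """
).fetchall()
print(revenue_by_year_and_category)
```

Note that `category` is stored directly in `dim_product` rather than in its own table. This is the de-normalization the table refers to; splitting it out into a separate category table would move the design toward a snowflake schema.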

What is Business Intelligence?


Business Intelligence (BI) is a technology infrastructure for gaining maximum information from available data in order to improve business processes. A typical BI infrastructure consists of software solutions for gathering, cleansing, integrating, analyzing and sharing data. Business Intelligence produces analyses and provides credible information that helps in making effective, high-quality business decisions. The most common kinds of Business Intelligence systems are:

- EIS - Executive Information Systems
- DSS - Decision Support Systems
- MIS - Management Information Systems
- GIS - Geographic Information Systems
- OLAP - Online Analytical Processing and multidimensional analysis
- CRM - Customer Relationship Management

Business Intelligence systems are based on Data Warehouse technology: a Data Warehouse (DW) gathers information from a wide range of a company's operational systems, and BI systems draw on it. Data loaded into a DW is usually well integrated and cleaned, which makes it possible to produce credible information that reflects the so-called 'one version of the truth'.

Business Intelligence tools


The most popular BI tools on the market are:

- Oracle - Siebel Business Analytics Applications
- SAS - Business Intelligence
- SAP - BusinessObjects XI
- IBM - Cognos 8 BI
- Oracle - Hyperion System 9 BI+
- Microsoft - Analysis Services
- MicroStrategy - Dynamic Enterprise Dashboards
- Pentaho - Open BI Suite
- Information Builders - WebFOCUS Business Intelligence
- QlikTech - QlikView
- TIBCO Spotfire - Enterprise Analytics
- Sybase - InfoMaker
- KXEN - IOLAP
- SPSS ShowCase

ETL tools

List of the most popular ETL tools:

- Informatica - Power Center
- IBM - Websphere DataStage (formerly known as Ascential DataStage)
- SAP - BusinessObjects Data Integrator
- IBM - Cognos Data Manager (formerly known as Cognos DecisionStream)
- Microsoft - SQL Server Integration Services
- Oracle - Data Integrator (formerly known as Sunopsis Data Conductor)
- SAS - Data Integration Studio
- Oracle - Warehouse Builder
- AB Initio
- Information Builders - Data Migrator
- Pentaho - Pentaho Data Integration
- Embarcadero Technologies - DT/Studio
- IKAN - ETL4ALL
- IBM - DB2 Warehouse Edition
- Pervasive - Data Integrator
- ETL Solutions Ltd. - Transformation Manager
- Group 1 Software (Sagent) - DataFlow
- Sybase - Data Integrated Suite ETL
- Talend - Talend Open Studio
- Expressor Software - Expressor Semantic Data Integration System
- Elixir - Elixir Repertoire
- OpenSys - CloverETL

ETL process

ETL (Extract, Transform and Load) is a process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse. ETL involves the following tasks:

- extracting the data from source systems (SAP, ERP, other operational systems); data from the different source systems is converted into one consolidated data warehouse format that is ready for transformation processing.
- transforming the data, which may involve the following tasks: applying business rules (so-called derivations, e.g., calculating new measures and dimensions); cleaning (e.g., mapping NULL to 0, or "Male" to "M" and "Female" to "F"); filtering (e.g., selecting only certain columns to load); splitting a column into multiple columns and vice versa; joining together data from multiple sources (e.g., lookup, merge); transposing rows and columns; applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row are empty, then reject the row from processing).
- loading the data into a data warehouse, a data repository, or other reporting applications.
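The three ETL stages can be sketched in a few lines of Python. This is a toy pipeline under stated assumptions, not how any of the tools listed above work internally: the source rows, the column names, the target table `dw_customer_sales`, and the function names `extract`, `transform` and `load` are all hypothetical. It applies several of the transformations mentioned above: mapping NULL to 0, recoding "Male"/"Female", dropping an unneeded column, splitting one column into two, and rejecting rows whose first three columns are empty.

```python
import sqlite3

# Hypothetical rows as they might arrive from an operational source system.
SOURCE_ROWS = [
    {"id": 1, "full_name": "Ann Lee", "gender": "Female", "sales": 120.5, "notes": "x"},
    {"id": 2, "full_name": "Bob Roy", "gender": "Male", "sales": None, "notes": "y"},
    {"id": None, "full_name": None, "gender": None, "sales": 10.0, "notes": "z"},
]

GENDER_CODES = {"Male": "M", "Female": "F"}


def extract():
    """Extract: pull rows out of the (here, in-memory) source system."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Transform: validate, clean, split and filter the extracted rows."""
    clean = []
    for row in rows:
        # Validation: reject the row if its first three columns are all empty.
        if all(row[col] in (None, "") for col in ("id", "full_name", "gender")):
            continue
        # Splitting: one full_name column becomes first_name and last_name.
        first, _, last = (row["full_name"] or "").partition(" ")
        clean.append((
            row["id"],
            first,
            last,
            GENDER_CODES.get(row["gender"], row["gender"]),  # cleaning: recode
            row["sales"] if row["sales"] is not None else 0,  # cleaning: NULL -> 0
        ))  # filtering: the "notes" column is not carried over
    return clean


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_customer_sales ("
        " id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,"
        " gender TEXT, sales REAL)"
    )
    with conn:
        conn.executemany("INSERT INTO dw_customer_sales VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


warehouse = sqlite3.connect(":memory:")
loaded_count = load(warehouse, transform(extract()))
loaded_rows = warehouse.execute(
    "SELECT id, first_name, last_name, gender, sales"
    " FROM dw_customer_sales ORDER BY id"
).fetchall()
print(loaded_count, loaded_rows)
```

Keeping the stages as separate functions mirrors the way ETL tools model a job as a chain of steps, and it lets each stage be tested in isolation.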

Business Intelligence Platforms - Sybase

source: http://www.sybase.com

Products included in Sybase Business Intelligence platform:

- PowerDesigner
- WorkSpace
- Industry Warehouse Studio
- ASE Database
- Sybase IQ
- Data Integration Suite (Replication, Data Federation, Real-time Events, Sybase ETL)
- InfoMaker
