DWHM 1

This document discusses database software features that are useful for data warehousing. It describes how database management systems (DBMSs) have been enhanced to support large databases and data warehousing functions such as data loading, transformation, and replication. It also covers bit-mapped indexing and the parallel processing options for DBMSs, including horizontal, vertical, and hybrid parallelism. Key considerations for selecting a DBMS for a data warehouse, such as load balancing, query optimization, and high-performance data loading, are also discussed.


DATABASE SOFTWARE

Prepared by:
CATHERINE M. MUHI
BSAIS 3
Database software
• Database software that started out in operational OLTP systems has
been enhanced to cater to decision support systems. DBMSs have also
been scaled up to support very large databases.
• Some RDBMS products now include support for the data acquisition
area of the data warehouse. Mass loading and retrieval of data from
other database systems have become easier. Some vendors have paid
special attention to the data transformation function. Replication
features have been reinforced to assist in bulk refreshes and
incremental loading of the data warehouse.
• Bit-mapped indexes can be very effective in a data warehouse
environment for indexing fields that have a small number of distinct
values.
For example, in a database table containing geographic regions, there are
only a few distinct region codes, yet queries frequently select by region.
In this case, retrieval by a bit-mapped index on the region code values can
be very fast. Vendors have strengthened this type of indexing.
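The region-code example can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DBMS stores its indexes: the `sales` rows, the region codes, and the helper names `build_bitmap_index` and `rows_matching` are all assumptions made for the example. Each distinct value gets one bit vector (held here as a Python integer), and bit i is set when row i has that value.

```python
from collections import defaultdict

def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int used as a bit vector.
    Bit i is set when row i holds that value."""
    index = defaultdict(int)
    for position, row in enumerate(rows):
        index[row[column]] |= 1 << position
    return dict(index)

def rows_matching(bitmap):
    """Return the row positions whose bit is set in `bitmap`."""
    positions = []
    position = 0
    while bitmap:
        if bitmap & 1:
            positions.append(position)
        bitmap >>= 1
        position += 1
    return positions

# Hypothetical sales rows; only the region code matters here.
sales = [
    {"region": "NE", "amount": 120},
    {"region": "SW", "amount": 75},
    {"region": "NE", "amount": 310},
    {"region": "MW", "amount": 50},
    {"region": "SW", "amount": 95},
]

region_index = build_bitmap_index(sales, "region")

# Selection by one region is a single lookup; "NE or SW" is a bitwise OR.
ne_rows = rows_matching(region_index["NE"])
ne_or_sw_rows = rows_matching(region_index["NE"] | region_index["SW"])
```

With few distinct values there are few bit vectors to keep, and combining selection conditions reduces to bitwise operations, which is why this kind of index suits low-cardinality fields.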
Parallel Processing Options
• Parallel processing options are intended only for machines with multiple processors.
• Most of the current database software can parallelize a large number of
operations. These operations include the following: mass loading of data,
full table scans, queries with exclusion conditions, queries with grouping,
selection with distinct values, aggregation, sorting, creation of tables using
subqueries, creating and rebuilding indexes, and so on. Notice that this is an
impressive list of operations that the RDBMS can process in parallel.
Interquery Parallelization
• In this method, several server processes handle multiple requests simultaneously.
• Multiple queries may be serviced based on your server configuration and the
number of available processors. You may successfully take advantage of this
feature of the DBMS on SMP systems, thereby increasing the throughput and
supporting more concurrent users.
• However, interquery parallelism is limited. Multiple queries are processed
concurrently, but each query is still processed serially by a single server
process. Suppose a query consists of index read, data read, join, and sort
operations; these operations are carried out in this order. Each operation must
finish before the next one can begin. Parts of the same query do not execute in
parallel. To overcome this limitation, many DBMS vendors have come up with
versions of their products that provide intraquery parallelization.
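The limitation can be illustrated with a small Python sketch. It is only a model under stated assumptions: worker threads stand in for server processes, the query names Q1 to Q3 are made up, and `run_query_serially` is a hypothetical helper that merely records the order of the four steps rather than touching any data.

```python
from concurrent.futures import ThreadPoolExecutor

def run_query_serially(query_name):
    """Run the four steps of one query in order.
    Each step starts only after the previous one has finished."""
    trace = []
    for step in ("index read", "data read", "join", "sort"):
        trace.append(f"{query_name}: {step}")
    return trace

queries = ["Q1", "Q2", "Q3"]

# One worker per query, mirroring one server process per request.
# Several queries are in flight at once, but no single query is split up.
with ThreadPoolExecutor(max_workers=len(queries)) as pool:
    traces = list(pool.map(run_query_serially, queries))
```

Adding workers lets more queries run at the same time, which raises throughput, but the time taken by any one query does not improve because its steps still run one after another.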
Intraquery Parallelization
Let us say a query from one of your users consists of an index
read, a data read, a data join, and a data sort from the data
warehouse database. A serial processing DBMS will process
this query in the sequence of these base operations and produce
the result set. However, while this query is executing on one
processor in the SMP system, other queries can execute in
parallel. This method is the interquery parallelization discussed
above. The first group of operations in Figure 8-15 illustrates
this method of execution.
Using the intraquery parallelization technique, the DBMS splits
the query into the lower level operations of index read, data
read, data join, and data sort. Then these basic operations are
executed in parallel, each one on its own processor. The
final result set is the consolidation of the intermediary results.
Let us review three ways a DBMS can provide intraquery
parallelization, that is, parallelization of parts of the operations
within the same query itself.
Horizontal Parallelism
• The data is partitioned across multiple disks. Parallel processing occurs
within each single task in the query; for example, a data read is performed
on multiple processors concurrently, on different sets of data read from
multiple disks. After the first task is completed on all of the relevant parts
of the partitioned data, the next task of that query is carried out, and then
the next one after that task, and so on.
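A rough Python sketch of this pattern follows. The three in-memory lists stand in for partitions on separate disks, threads stand in for processors, and the names `read_partition` and `sort_rows` are hypothetical; a real DBMS would do this inside its execution engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table split into three partitions, as if stored on three disks.
partitions = [
    [{"region": "NE", "amount": 120}, {"region": "SW", "amount": 75}],
    [{"region": "NE", "amount": 310}, {"region": "MW", "amount": 50}],
    [{"region": "SW", "amount": 95}, {"region": "NE", "amount": 40}],
]

def read_partition(partition):
    """Task 1 (data read): select the NE rows from one partition."""
    return [row for row in partition if row["region"] == "NE"]

def sort_rows(rows):
    """Task 2 (sort): order the selected rows by amount."""
    return sorted(rows, key=lambda row: row["amount"])

# The same read task runs on every partition concurrently.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    partial_results = list(pool.map(read_partition, partitions))

# Only after every partition has been read does the next task begin.
selected = [row for part in partial_results for row in part]
result = sort_rows(selected)
amounts = [row["amount"] for row in result]
```

The parallelism is inside the read task only; the sort still waits for the read to finish on all partitions, which is the step-by-step behavior described above.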
Vertical Parallelism
• This kind of parallelism occurs among different tasks, not just a single task
in a query as in the case of horizontal parallelism.
• All component query operations are executed in parallel, but in a pipelined
manner.
• This assumes that the RDBMS has the capability to decompose the query
into subtasks; each subtask has all the operations of index read, data read,
join, and sort. Then each subtask executes on the data in serial fashion. In
this approach, the database records are ideally processed by one step and
immediately given to the next step for processing, thus avoiding wait
times. Of course, in this method, the DBMS must possess a very high level
of sophistication in decomposing tasks.
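The pipelined hand-off can be sketched in Python with one thread per step and queues between them. This is an illustrative model, not a DBMS implementation: the three stages (read, filter, total) are simplified stand-ins for the query operations, and the names `read_stage`, `filter_stage`, `total_stage`, and `DONE` are assumptions made for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the record stream

def read_stage(records, out_q):
    """Stage 1: emit records one at a time."""
    for record in records:
        out_q.put(record)
    out_q.put(DONE)

def filter_stage(in_q, out_q):
    """Stage 2: pass along NE records as soon as they arrive."""
    while True:
        record = in_q.get()
        if record is DONE:
            out_q.put(DONE)
            return
        if record["region"] == "NE":
            out_q.put(record)

def total_stage(in_q, totals):
    """Stage 3: add up amounts as records arrive."""
    while True:
        record = in_q.get()
        if record is DONE:
            return
        totals["NE"] = totals.get("NE", 0) + record["amount"]

# Hypothetical input records.
records = [
    {"region": "NE", "amount": 120},
    {"region": "SW", "amount": 75},
    {"region": "NE", "amount": 310},
    {"region": "MW", "amount": 50},
]

q1, q2 = queue.Queue(), queue.Queue()
totals = {}
stages = [
    threading.Thread(target=read_stage, args=(records, q1)),
    threading.Thread(target=filter_stage, args=(q1, q2)),
    threading.Thread(target=total_stage, args=(q2, totals)),
]
for stage in stages:
    stage.start()
for stage in stages:
    stage.join()
```

Each stage begins working on a record as soon as the previous stage releases it, so later stages do not wait for earlier ones to finish the whole data set.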
Hybrid Method
• In this method, the query decomposer partitions the query both
horizontally and vertically. Naturally, this approach produces the best
results. You will realize the greatest utilization of resources, optimal
performance, and high scalability.
Selection of the DBMS
• Our discussions of the server hardware and the DBMS parallel
processing options must have convinced you that selection of the
DBMS is most crucial. You must choose the server hardware with the
appropriate parallel architecture. Your choice of the DBMS must
match the selected server hardware. These are critical decisions
for your data warehouse.
Apart from the requirement that the selected DBMS have load balancing and parallel processing
options, the other key features listed below must be considered when selecting the DBMS for your
data warehouse.

• Query governor - to anticipate and abort runaway queries
• Query optimizer - to parse and optimize user queries
• Query management - to balance the execution of different types of queries
• Load utility - for high-performance data loading, recovery, and restart
• Metadata management - with an active data catalog or dictionary
• Scalability - in terms of both number of users and data volumes
• Extensibility - having hybrid extensions to OLAP databases
• Portability - across platforms
• Query tool application program interfaces (APIs) - for tools from leading vendors
• Administration - providing support for all DBA functions
END.
