
BI Mod 3

The document outlines various multidimensional database design techniques, including Star Schema, Snowflake Schema, Galaxy Schema, Normalized OLAP, and OLAP Cubes, each serving specific analytical needs. It also discusses the importance of physical database design, detailing aspects like partitioning, clustering, indexing, backup and recovery, and parallel query execution. Additionally, it highlights the roles involved in database design, the significance of incremental rollout in BI, and security measures for BI applications and internet access.

Uploaded by

vikaspotdar143

Q 2] List and define the popular multidimensional database design techniques?
Ans : Multidimensional database : Multidimensional database design refers to the organization and
structuring of data in a way that allows for efficient and intuitive analysis of information from
multiple dimensions. In simpler terms, it is a method of arranging data that enables users to view it
from various perspectives simultaneously. Multidimensional database design is crucial for effective
data warehousing and OLAP (Online Analytical Processing) systems.

1. Star Schema:

o Fact Tables: Central tables containing quantitative data (measures) like sales amounts, quantities,
etc.

o Dimension Tables: Surrounding tables that contain descriptive attributes related to the fact
table, like time, product, or location.

o Design: Fact tables are connected to dimension tables through foreign keys, creating a star-like
structure.
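The star layout above can be sketched with SQLite; the table and column names here are hypothetical, chosen only for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables hold descriptive attributes (hypothetical names).
cur.execute("""CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT)""")
cur.execute("""CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY, full_date TEXT, year INTEGER)""")

# The fact table holds the measures plus a foreign key to each
# dimension, forming the "star".
cur.execute("""CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER, sales_amount REAL)""")

cur.execute("INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware')")
cur.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', 2024)")
cur.execute("INSERT INTO fact_sales VALUES (1, 20240101, 5, 99.50)")

# A typical star-join: measures from the fact table, labels from dimensions.
row = cur.execute("""SELECT p.product_name, d.year, SUM(f.sales_amount)
                     FROM fact_sales f
                     JOIN dim_product p ON f.product_key = p.product_key
                     JOIN dim_date d ON f.date_key = d.date_key
                     GROUP BY p.product_name, d.year""").fetchone()
print(row)  # ('Widget', 2024, 99.5)
```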

2. Snowflake Schema:

o Normalization: Dimension tables are normalized into multiple related tables, reducing
redundancy and improving data integrity.

o Design: A variation of the star schema where dimension tables are split into hierarchical layers,
creating a snowflake-like structure.
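A minimal sketch of snowflaking, again with hypothetical names: the category attribute moves out of the product dimension into its own table, so reassembling the denormalized dimension takes an extra join:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# In a snowflake schema the product dimension is normalized: category
# becomes its own table, referenced through a foreign key.
cur.execute("""CREATE TABLE dim_category (
    category_key INTEGER PRIMARY KEY, category_name TEXT)""")
cur.execute("""CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY, product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key))""")

cur.execute("INSERT INTO dim_category VALUES (10, 'Hardware')")
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget', 10)")

# Reassembling the flat dimension now requires a join per hierarchy level.
row = cur.execute("""SELECT p.product_name, c.category_name
                     FROM dim_product p
                     JOIN dim_category c
                       ON p.category_key = c.category_key""").fetchone()
print(row)  # ('Widget', 'Hardware')
```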

3. Galaxy Schema (Fact Constellation Schema):

o Multiple Fact Tables: Contains multiple fact tables that share dimension tables.

o Design: Useful for complex databases where multiple fact tables relate to the same dimensions
but are used for different analytical purposes.

4. Normalized OLAP (NOLAP):

o Normalization: Data is stored in a normalized form, reducing redundancy and improving
consistency.

o Design: Useful in scenarios where OLAP operations are not as critical, and transactional data
processing is more important.

5. OLAP Cubes:

o Cube Design: Defines measures, dimensions, and hierarchies to allow for multidimensional
analysis.

o Design: Cubes store aggregated data and allow for quick querying and analysis across different
dimensions.
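A cube's precomputed aggregates can be simulated in plain Python; the data below is a toy example, with None standing for the "all" member of a dimension:

```python
from itertools import product as cross

# Toy fact rows: (product, region, sales) — hypothetical data.
facts = [("Widget", "East", 100), ("Widget", "West", 50),
         ("Gadget", "East", 70)]

# Precompute every aggregate cell, using None as the "all" member on
# each dimension — this is what a cube stores so that analysis across
# dimensions becomes a fast lookup instead of a scan.
cube = {}
for prod, region, sales in facts:
    for p, r in cross((prod, None), (region, None)):
        cube[(p, r)] = cube.get((p, r), 0) + sales

print(cube[("Widget", None)])  # 150  (Widget across all regions)
print(cube[(None, None)])      # 220  (grand total)
```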

These techniques help structure and organize data for efficient querying and analysis,
supporting various business intelligence and reporting needs.
Q 4] Write a note on physical database design?
Ans : Physical database design is a critical phase in the database development process that
translates the logical data model into a physical structure that is optimized for performance, storage
efficiency, and data integrity. It involves making decisions about how data will be stored, accessed,
and managed on physical storage media.

1. Partitioning :-

* Partitioning is particularly important for very large databases (VLDBs).

* Partitioning allows the data of one "logical" table to be spread across multiple physical datasets.

* The physical data distribution is based on a partitioning column, which is most commonly date.

* The partitioning column cannot be a derived column, and it cannot contain NULL values.
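A rough sketch of date-based partitioning — the routing a DBMS performs internally — using hypothetical rows:

```python
from collections import defaultdict
from datetime import date

# One "logical" sales table, physically spread across monthly
# partitions keyed by a date partitioning column.
partitions = defaultdict(list)

def insert(row):
    # The partitioning column must be present (no NULLs) and not derived.
    key = row["sale_date"].strftime("%Y-%m")
    partitions[key].append(row)

insert({"sale_date": date(2024, 1, 15), "amount": 100.0})
insert({"sale_date": date(2024, 1, 20), "amount": 50.0})
insert({"sale_date": date(2024, 2, 3), "amount": 75.0})

# A query restricted to January touches only one physical partition.
jan_total = sum(r["amount"] for r in partitions["2024-01"])
print(jan_total)  # 150.0
```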

2. Clustering :-

* Clustering is a very useful technique for sequential access of large amounts of data.

* In physical database design, clustering stores rows that are frequently read together in adjacent
physical locations (for example, ordered by a clustering key). This is distinct from database server
clustering, which connects more than one database instance or server to a system.

3. Indexing :-

* Indexing is a data structure technique used to locate and quickly access data in a database.

* There are two extreme indexing strategies, one strategy is to index everything, and the other is to
index nothing.

* The objective of indexing is to organize and categorize information in a way that makes it easier to
retrieve and access.
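The middle ground between the two extreme strategies — indexing only the columns that frequent queries filter on — can be sketched with SQLite (the exact plan text varies by SQLite version):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [("East", 100.0), ("West", 50.0), ("East", 25.0)])

# Index only the column used in WHERE clauses, not everything.
cur.execute("CREATE INDEX idx_sales_region ON sales(region)")

# The query plan confirms the DBMS now searches via the index
# instead of scanning the whole table.
plan = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT SUM(amount) FROM sales WHERE region = 'East'").fetchone()
print(plan[3])  # e.g. "SEARCH sales USING INDEX idx_sales_region (region=?)"
```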

4. Backup and Recovery :-

* Backup refers to creating copies of important documents and data stored on your computer. This
process includes backing up your database, videos, and other media.

* Recovery is the process of restoring deleted or damaged data from backups. If data is accidentally
deleted or becomes corrupted, it can be restored from the backup.
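A small sketch using the backup API of Python's sqlite3 module to back up a database and then recover from an accidental delete:

```python
import sqlite3

# A source database standing in for "important data".
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE docs (id INTEGER, body TEXT)")
src.execute("INSERT INTO docs VALUES (1, 'quarterly report')")
src.commit()

# Backup: copy the live database into a separate destination.
backup_db = sqlite3.connect(":memory:")
src.backup(backup_db)

# Simulate an accidental deletion in the primary...
src.execute("DELETE FROM docs")
src.commit()

# ...then recover by restoring from the backup copy.
backup_db.backup(src)
restored = src.execute("SELECT body FROM docs").fetchone()[0]
print(restored)  # quarterly report
```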

5.Parallel query execution :-

* To improve the performance of a query, break down a single query into components to be run
concurrently.

* Performance is greatly increased when multiple portions of one query run in parallel on multiple
processors.

* Parallel processing is a very important concept for BI applications and should be considered
whenever possible.
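The decomposition described above can be illustrated in Python, with threads standing in for the parallel components a DBMS would spread across processors (a sketch of the idea, not a query optimizer):

```python
from concurrent.futures import ThreadPoolExecutor

# A large "table" of sales amounts (toy data).
amounts = list(range(1, 1001))

def partial_sum(chunk):
    # One component of the single query, run concurrently with the others.
    return sum(chunk)

# Break the one aggregate query into four components...
chunks = [amounts[i::4] for i in range(4)]

# ...run them in parallel, then combine the partial results.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, chunks))

print(total)  # 500500
```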
Q 7] List and define the deliverables resulting from the database design activities?
ANS : i] Physical data model :- The physical data model, also known as the logical database design, is a
diagram of the physical database structures that will contain the BI data. Depending on the selected
database design schema, this diagram can be an entity-relationship diagram, a star schema diagram,
or a snowflake diagram. It shows tables, columns, primary keys, foreign keys, cardinality, referential
integrity rules, and indices.

ii] Physical design of the BI target databases :- The physical database design components include
dataset placement, index placement, partitioning, clustering, and indexing. These physical database
components must be defined to the DBMS when the BI target databases are created.

iii] Data definition language :- The DDL is a set of SQL instructions that tells the DBMS what types of
physical database structures to create, such as databases, tablespaces, tables, columns, and indices.

iv] Data control language :- The DCL is a set of SQL instructions that tells the DBMS what types of
CRUD access to grant to people, groups, programs, and tools.

v] Physical BI target databases :- Running (executing) the DDL and DCL statements builds the actual BI
target databases.

vi] Database maintenance procedures :- These procedures describe the time and frequency
allocated for performing ongoing database maintenance activities, such as database backups,
recovery (including disaster recovery), and database reorganizations. The procedures should also
specify the process for and the frequency of performance-monitoring activities.

Q 8] List and describe the roles involved in database design activities?


ANS : * Application lead developer :- The application lead developer and the database administrator
should review all lessons learned during the prototyping activities. The application lead developer
should help the database administrator determine which queries and reports can be executed in
parallel and what type of security is needed.

* Data administrator :- The data administrator should provide the logical data model and the meta
data to the database administrator. The logical data model and the meta data will be helpful to the
database administrator when he or she designs the BI target databases.

* Database administrator :- The database administrator has the primary responsibility for database
design. He or she needs to know the access paths, weigh the projected data volumes and growth
factors, and understand the platform limitations. He or she must create and run the DDL and DCL to
build the physical databases. In addition, he or she is responsible for choosing the most appropriate
implementation options.

* ETL lead developer :- The ETL process is dependent on the database design. The ETL lead
developer should be involved in the database design activities in order to stay informed about any
database design changes that will affect the ETL process or the ETL programming specifications.
Q 9] What is the importance of incremental rollout in BI?
ANS : When planning the implementation, use the same iterative approach used when developing
the BI application and the meta data repository. The iterative approach, or incremental rollout, works
well because it reduces the risk of exposing potential defects in the BI application to the entire
organization. In addition, it gives you the opportunity to informally demonstrate the BI concepts and
the BI tool features to the business people who were not directly involved in the BI project. Here are
some suggestions.

* Start with a small group of business people :- This small group should consist of not only "power
users" but also some less technology-savvy knowledge workers and business analysts, as well as the
primary business representative who was involved in the development work as a member of the core
team.

* Treat the business people as customers, keeping customer care in mind. Trouble-free
implementation, interactive training, and ongoing support will help you get their buy-in. Always ask
yourself, "What is in it for the customers?"

* Take the opportunity to test your implementation approach. You may consider adjusting your
implementation approach or modifying the BI application prior to the full rollout (e.g., change
cumbersome logon procedures).

* It may be necessary to duplicate implementation activities at multiple sites. Adding these sites
slowly over time is easier than launching them all at the same time.

Q 10] Briefly explain security measures for BI applications?


ANS : * Security measures for Business Intelligence (BI) applications are essential to protect sensitive
data and ensure that information is accessible only to authorized users.

* BI applications often handle a significant amount of valuable and confidential data.

* Organizations that have strong security umbrellas on their mainframes are more likely to pay
attention to security measures for their BI applications on multi-tier platforms.

* Organizations may unintentionally expose themselves to security breaches, especially if they plan
to deliver information from the BI target databases over the Web.

* No off-the-shelf umbrella security solution can impose this kind of granular, row-level security. This
security requirement would have to be implemented through the various security features of the
database management system (DBMS) and of the access and analysis tools used by the BI application.

* The solution of imposing security at a table level may not be granular enough.

* One possible way to achieve this type of security is to partition the tables either physically or
logically (through VIEWs).

* Partitioning will restrict access solely to the appropriate distributor as long as both the fact tables
and the dimension tables are partitioned.

* An alternative may be to enhance the meta data with definitions of data parameters, which could
control access to the data. This form of security would be implemented with appropriate program
logic to tell the meta data repository the distributor's identity, allowing the application to return the
appropriate data for that distributor only. This type of security measure will be only as good as the
program controlling it.
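The logical partitioning through VIEWs mentioned above can be sketched with SQLite; the distributor IDs and amounts are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE fact_sales (distributor_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                [(1, 100.0), (1, 40.0), (2, 75.0)])

# Logical partitioning through a VIEW: each distributor is granted
# access only to a view that filters the shared fact table to that
# distributor's own rows.
cur.execute("""CREATE VIEW sales_distributor_1 AS
               SELECT * FROM fact_sales WHERE distributor_id = 1""")

rows = cur.execute("SELECT amount FROM sales_distributor_1").fetchall()
print(rows)  # [(100.0,), (40.0,)] — distributor 2's row is invisible
```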

Q 12] Briefly describe security for internet access?


ANS : * The Internet enables distribution of information worldwide, and the BI decision-support
environment provides easy access to organizational data.

* Internet security refers to measures designed to protect systems and the activities of employees and
other users while they are connected to the internet, spanning web browsers, web apps, websites, and networks.

* Internet security solutions protect users and corporate assets from cybersecurity attacks and
threats.

FIG : Security Considerations for Internet Access

* Authentication is the process of identifying a person, usually based on a logon ID and password.
This process is meant to ensure that the person is who he or she claims to be.

* Authorization is the process of granting or denying a person access to a resource, such as an
application or a Web page. In security software, authentication is distinct from authorization, and
most security packages implement a two-step authentication and authorization process.

* Encryption is the "translation" of data into a secret code. It is the most effective way to achieve
data security. To read an encrypted file, you must have access to a secret key or password that
enables you to decrypt it. Unencrypted data is usually referred to as plain text, while encrypted data
is usually referred to as cipher text.
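The two-step authentication-then-authorization process can be sketched as follows. The user store, resource names, and password are invented for illustration, and the salted SHA-256 hash is a sketch only; production systems should use a dedicated password-hashing scheme such as bcrypt or scrypt:

```python
import hashlib
import os

# Hypothetical user store: salted password hash plus granted resources.
salt = os.urandom(16)
users = {
    "alice": {
        "hash": hashlib.sha256(salt + b"s3cret").hexdigest(),
        "salt": salt,
        "resources": {"sales_report"},
    }
}

def authenticate(user, password):
    # Step 1: verify the person is who they claim to be (logon ID + password).
    rec = users.get(user)
    if rec is None:
        return False
    digest = hashlib.sha256(rec["salt"] + password.encode()).hexdigest()
    return digest == rec["hash"]

def authorize(user, resource):
    # Step 2: verify the authenticated person may access this resource.
    return resource in users[user]["resources"]

print(authenticate("alice", "s3cret"))       # True
print(authorize("alice", "sales_report"))    # True
print(authorize("alice", "payroll_report"))  # False
```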

Q 13] Explain data backup and recovery?


ANS : DATA BACKUP : Data backup is the practice of copying data from a primary to a secondary
location, to protect it in case of a disaster, accident or malicious action.
* Data backup refers to the infrastructure, technologies, and processes that copy organizational data
for restoration in case of failures. A database backup is an exact copy of your database kept in a
separate location.

TYPES OF DATA BACKUP :

> Incremental backup :- An incremental backup is a backup type that only copies data that has been
changed or created since the previous backup activity was conducted.

> High-speed mainframe backup :- Another possibility is to use the mainframe transfer utilities to
pass BI data back to the mainframe for a high-speed backup, which is supported only on the
mainframe.

* Mainframe backup serves as a safety net, ensuring that in the event of data loss or corruption,
critical information can be restored to its original state.

> Partial backup :- A partial backup resembles a full database backup, but it does not contain all the
filegroups.

* While one partition is being backed up, the other partitions can remain available.
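The "copy only what changed" rule of an incremental backup can be sketched by comparing file modification times against the state recorded by the previous backup (the directories and file names here are hypothetical):

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical source directory and backup destination.
src = Path(tempfile.mkdtemp())
dst = Path(tempfile.mkdtemp())

(src / "old.txt").write_text("unchanged data")
(src / "new.txt").write_text("created since the last backup")

# State recorded by the previous backup: file name -> mtime.
# Here we pretend only old.txt existed at that point.
last_backup = {"old.txt": (src / "old.txt").stat().st_mtime}

# Copy only files that are new or whose modification time has changed.
copied = []
for f in sorted(src.iterdir()):
    if last_backup.get(f.name) != f.stat().st_mtime:
        shutil.copy2(f, dst / f.name)
        copied.append(f.name)

print(copied)  # ['new.txt']
```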

RECOVERY : Data recovery is the process of restoring data that has been lost, unintentionally
deleted, corrupted, or made inaccessible.

TYPES OF DATA RECOVERY :

> Physical data recovery :- Physical data recovery restores data lost through physical damage, such as
broken hardware or failed storage media.

> Instant data recovery :- In this methodology, when data is lost the user is automatically redirected
to a backup server, where they can continue working with their workload while IT handles the
restoration process in the background.

> Formatted drive recovery :- Data retrieval from hard disks that have been formatted or initialized is
achieved through the formatted-drive recovery process.

> Logical data recovery :- Logical data recovery is a process of recovering data that has been lost due
to logical failures.
