Data Warehouse Solved Short Questions-II
analysis
Or
A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of
sources and made available to end users in a way they can understand and use in a business context.
A Decision Support System (DSS) is a computer-based tool that helps decision-makers gather, analyze, and
interpret information to make better choices. E.g., a bank uses a DSS to assess loan applications and identify
potential risks.
An operational database stores and manages the data generated by an organization's day-to-day
operations. This includes information like customer transactions, financial records, inventory levels, and
employee data.
An operational database is different from a data warehouse. It presents current, detailed data for transaction
processing rather than historical, aggregated data for analysis.
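The contrast between detailed and aggregated data can be illustrated with a minimal sketch. The records, field names, and the `monthly_totals` helper below are hypothetical and only show the idea: the operational side keeps one row per transaction, while the warehouse side keeps a summary per period.

```python
from collections import defaultdict

# Hypothetical operational records: one row per individual transaction.
transactions = [
    {"date": "2024-01-05", "customer": "C1", "amount": 120.0},
    {"date": "2024-01-19", "customer": "C2", "amount": 80.0},
    {"date": "2024-02-02", "customer": "C1", "amount": 50.0},
]


def monthly_totals(rows):
    """Aggregate detailed rows into one total per month (a warehouse-style summary)."""
    totals = defaultdict(float)
    for row in rows:
        month = row["date"][:7]  # keep only the "YYYY-MM" part of the date
        totals[month] += row["amount"]
    return dict(totals)


summary = monthly_totals(transactions)
```

Here `transactions` plays the role of the operational database and `summary` the role of the aggregated data kept for analysis.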
OLAP (Online Analytical Processing) is a software technology that enables flexible analysis of
multidimensional data, so that large amounts of data can be examined quickly from different perspectives.
OLAP is a set of tools and approaches for representing data along multiple dimensions.
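Looking at the same data "from different perspectives" can be sketched as summing one measure over different choices of dimensions. The sales facts and the `roll_up` helper below are assumptions made for illustration, not part of any OLAP product.

```python
from collections import defaultdict

# Hypothetical sales facts: three dimensions (product, region, year) and one measure (units).
sales = [
    {"product": "pen", "region": "north", "year": 2023, "units": 10},
    {"product": "pen", "region": "south", "year": 2023, "units": 4},
    {"product": "ink", "region": "north", "year": 2024, "units": 7},
    {"product": "pen", "region": "north", "year": 2024, "units": 3},
]


def roll_up(facts, dimensions, measure="units"):
    """Sum the measure for every combination of values of the chosen dimensions."""
    result = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        result[key] += fact[measure]
    return dict(result)


# The same facts viewed from two different perspectives.
by_product = roll_up(sales, ["product"])
by_region_year = roll_up(sales, ["region", "year"])
```

Changing the list of dimensions passed to `roll_up` changes the perspective without changing the underlying facts.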
There are two main types of OLAP: i. MOLAP ii. ROLAP
MOLAP is a form of OLAP that stores and processes the data directly in a multidimensional database.
ROLAP is a form of OLAP that performs dynamic multidimensional analysis of data stored in a relational
database rather than in a multidimensional database.
HOLAP (hybrid OLAP) combines the advantages of MOLAP and ROLAP to overcome the limitations of each
approach.
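The storage difference between the approaches can be sketched as follows. This is a toy illustration under stated assumptions: the table, the `cube` dictionary, and both helper functions are hypothetical, with an in-memory SQLite table standing in for the relational database and a dictionary keyed by coordinates standing in for a multidimensional store.

```python
import sqlite3

rows = [("pen", "north", 10), ("pen", "south", 4), ("ink", "north", 7)]

# ROLAP-style: facts stay in a relational table and totals are computed by SQL at query time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, region TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)


def rolap_units(product):
    """Compute the total for a product on demand with a relational query."""
    cur = conn.execute("SELECT SUM(units) FROM sales WHERE product = ?", (product,))
    return cur.fetchone()[0]


# MOLAP-style: cells, including pre-aggregated ones, are stored by their coordinates.
cube = {}
for product, region, units in rows:
    cube[(product, region)] = cube.get((product, region), 0) + units
    cube[(product, "ALL")] = cube.get((product, "ALL"), 0) + units


def molap_units(product):
    """Read the pre-aggregated total for a product straight from the cube."""
    return cube[(product, "ALL")]
```

Both functions return the same totals; the difference is whether the aggregate is computed at query time or looked up from a precomputed cell. A hybrid design would keep some cells precomputed and fall back to the relational query for the rest.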
Metadata refers to the information that describes and explains the data stored within the warehouse. It's
essentially the data that describes the data.
The purpose of pre-interview research is to raise awareness among interviewers. Before an interview, it is
crucial to research the background of the organization and any existing data warehousing initiatives. This
research helps interviewers understand the context and the challenges that interviewees may face.
This occurs when the same data element is represented differently or has different meanings in different
schemas. E.g., one platform saves gender as male/female and another as m/f or 0/1.
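One way to handle such differing encodings is to map every source value to a single target code before loading. The sketch below is only an illustration: the target codes "M"/"F" and especially the meaning assigned to 0 and 1 are assumptions that would have to be confirmed for each source system.

```python
# Assumed mapping. Which of 0/1 means male is a guess and must be checked per source.
GENDER_MAP = {
    "male": "M", "female": "F",
    "m": "M", "f": "F",
    "0": "M", "1": "F",
}


def normalize_gender(value):
    """Map a source-specific gender encoding to one standard code."""
    key = str(value).strip().lower()
    if key not in GENDER_MAP:
        # Fail loudly instead of silently loading an unknown encoding.
        raise ValueError(f"unknown gender encoding: {value!r}")
    return GENDER_MAP[key]
```

For example, `normalize_gender("Male")`, `normalize_gender("m")`, and `normalize_gender(0)` all yield the same code under this assumed mapping.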
Provide effective support to conceptual design.
Create an environment in which user queries may be formed.
Make communication possible between designers and users.
Build a stable platform for logical design.
Provide clear and expressive design documentation.
Open-ended: Designed to encourage interviewees to provide detailed and unrestricted responses. E.g., i)
What are the key objectives you have to face? or ii) What do you think of your data source quality?
Closed-ended: Designed to gather specific information, typically a yes/no or short response. E.g., i) Are you
interested in sorting your purchases by hour? or ii) Do you want to receive a sales report weekly?
Primary events: Events that directly impact business processes, e.g., production line stoppages, machine
breakdowns, or order completions.
Secondary events: Events that may not have the same level of impact on the business, e.g., maintenance
requests, worker attendance, or quality control checks.
Optional arcs are relationships between two dimensions that may or may not exist. E.g., a sales fact table
might have an optional arc to the customer dimension. This would allow you to analyze sales by product
category even if there is no customer.
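A relational reading of this example is a nullable customer key on the sales fact. The sketch below assumes hypothetical `sales` and `customer` tables in an in-memory SQLite database; it is one possible way to picture the idea, not a prescribed design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales (
    sale_id     INTEGER PRIMARY KEY,
    category    TEXT NOT NULL,
    customer_id INTEGER REFERENCES customer(customer_id),  -- nullable: the optional link
    amount      REAL NOT NULL
);
INSERT INTO customer VALUES (1, 'Ann');
INSERT INTO sales VALUES
    (1, 'books', 1,    20.0),
    (2, 'books', NULL,  5.0),
    (3, 'toys',  NULL,  8.0);
""")

# Sales by product category still count the rows that have no customer.
by_category = dict(conn.execute(
    "SELECT category, SUM(amount) FROM sales GROUP BY category"
).fetchall())

# A LEFT JOIN keeps customer-less sales and groups them under a placeholder label.
by_customer = conn.execute("""
    SELECT COALESCE(c.name, 'unknown') AS who, SUM(s.amount)
    FROM sales AS s
    LEFT JOIN customer AS c ON c.customer_id = s.customer_id
    GROUP BY who
    ORDER BY who
""").fetchall()
```

An inner join would drop the two sales without a customer, which is why the optional link calls for a `LEFT JOIN` in this sketch.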
It is a situation where different data marts use different data models or schema designs to represent
similar data. E.g., one data mart using a star schema while another uses a snowflake schema for the same
data can lead to integration problems.
Minimality is the principle of storing and using only the essential data required for effective analysis and
decision making. It avoids redundancy, clutter, and irrelevant information.
Convergence refers to the merging of different technologies, tools, and approaches to create a more
unified and efficient data management system. E.g. Migrate to a cloud platform for increased scalability
and flexibility.
Dimensional attributes are the individual pieces of information that make up a dimension. E.g. Customer
dimension: (name, gender, age, income, status, phone).
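The customer example can be written down as a record type with one field per dimensional attribute. This is a minimal sketch: the class name, the field types, and the `customer_key` surrogate key are assumptions added for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CustomerDimension:
    customer_key: int  # surrogate key, an assumption added for illustration
    name: str
    gender: str
    age: int
    income: float
    status: str
    phone: str


# The dimensional attributes are every field except the surrogate key.
attribute_names = [f.name for f in fields(CustomerDimension) if f.name != "customer_key"]
```

Each attribute is something a fact such as a sale could later be grouped or filtered by, for example sales by gender or by status.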
Analysis: Involves evaluating the quality, consistency and completeness of data coming from different
operational databases.
Reconciliation: Addressing identified discrepancies by correcting errors, standardizing formats and
resolving conflicts to ensure accuracy and consistency.
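The two steps can be sketched on a small scale: a completeness check stands in for analysis, and a format standardizer stands in for reconciliation. The records and both helper functions are hypothetical, and real reconciliation would cover many more rules than this.

```python
def completeness(rows, field):
    """Analysis step: share of rows where the field is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def reconcile_phone(raw):
    """Reconciliation step: standardize a phone number to digits only.

    Returns None when no digits remain, so missing values stay visibly missing.
    """
    digits = "".join(ch for ch in (raw or "") if ch.isdigit())
    return digits or None


# Hypothetical records coming from different operational sources.
records = [
    {"name": "Ann", "phone": "555-0100"},
    {"name": "Bob", "phone": ""},
    {"name": "Cy", "phone": "(555) 0199"},
    {"name": "Di", "phone": None},
]

phone_completeness = completeness(records, "phone")
cleaned = [reconcile_phone(r["phone"]) for r in records]
```

The analysis result shows how much of the field is usable, and the reconciled values share one format regardless of how each source wrote them.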