
Assignment Data Warehousing (Ajay - 58)

A data warehouse is a large collection of business data from multiple sources that is stored and processed to enable analysis and decision making. It periodically pulls data from sources like sales, marketing and finance systems and loads it after formatting. This processed data is then ready for analysts to access. Some benefits of data warehouses include better data quality, consistency and integrity as well as faster decisions based on complete datasets. However, data warehouses also have some disadvantages such as extra reporting work, high costs, data ownership concerns and less flexibility compared to operational databases.

Uploaded by

Rushil Nagwan

ASSIGNMENT

ON
DATA WAREHOUSING

Submitted To:
Ms. Sakshi

Submitted By:
Ajay
MBA (Final) Sec-B
Roll No. 58
A data warehouse is a large collection of
business data used to help an organization
make decisions. The concept of the data
warehouse has existed since the 1980s, when
it was developed to help transition data from
merely powering operations to fueling
decision support systems that reveal business
intelligence. The large amount of data in a
data warehouse comes from many places:
internal applications such as marketing,
sales, and finance; customer-facing apps; and
external partner systems, among others.
On a technical level, a data
warehouse periodically pulls data from those
apps and systems. The data then goes
through formatting and import processes so
that it matches the data already in the
warehouse. The data warehouse stores this
processed data so it is ready for decision
makers to access. How often data is pulled
and how it is formatted vary with the needs
of the organization.
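
The pull, format, and import steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the source rows, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3
from datetime import datetime

# Hypothetical rows as they might arrive from a source sales application.
SOURCE_ROWS = [
    {"order_id": "A-1", "sold_on": "03/01/2024", "amount": "1,200.50"},
    {"order_id": "A-2", "sold_on": "03/02/2024", "amount": "89.00"},
]

def transform(row):
    """Format one source row to match the (assumed) warehouse conventions."""
    sold_on = datetime.strptime(row["sold_on"], "%m/%d/%Y").date().isoformat()
    amount = float(row["amount"].replace(",", ""))
    return (row["order_id"], sold_on, amount)

def load(conn, rows):
    """Import formatted rows; repeating a pull does not duplicate orders."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "order_id TEXT PRIMARY KEY, sold_on TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

def run_pull(conn, source_rows):
    """One periodic pull: format every row, then import the batch."""
    load(conn, [transform(r) for r in source_rows])

warehouse = sqlite3.connect(":memory:")  # stand-in for the real warehouse
run_pull(warehouse, SOURCE_ROWS)
run_pull(warehouse, SOURCE_ROWS)  # a second scheduled pull of the same data
stored = warehouse.execute("SELECT * FROM sales ORDER BY order_id").fetchall()
print(stored)
```

In practice a scheduler would call something like `run_pull` at whatever interval suits the organization.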
Some benefits of a data warehouse
Organizations that use a data warehouse to
assist their analytics and business intelligence
see a number of substantial benefits:
• Better data: Adding data sources to a
data warehouse enables organizations to
ensure that they are collecting consistent
and relevant data from that source. They
don’t need to wonder whether the data will
be inaccessible or inconsistent as it comes
in to the system. This ensures higher data
quality and data integrity for sound
decision making.
• Faster decisions: Data in a warehouse
is in such consistent formats that it is
ready to be analyzed. It also provides the
analytical power and a more complete
dataset to base decisions on hard facts.
Therefore, decision makers no longer
need to rely on hunches, incomplete
data, or poor quality data and risk
delivering slow and inaccurate results.
What a data warehouse is not
1. It is not a database
It’s easy to confuse a data warehouse with
a database, since both concepts share some
similarities. The primary difference, however,
comes into effect when a business needs to
perform analytics on a large data collection.
Data warehouses are made to handle this
type of task, while databases are not. Here’s a
comparison chart that tells the difference
between the two:

Database vs. Data Warehouse

What it is
Database: Data collected for multiple
transactional purposes. Optimized for
read/write access.
Data Warehouse: Aggregated transactional
data, transformed and stored for analytical
purposes. Optimized for aggregation and
retrieval of large data sets.

How it’s used
Database: Databases are made to quickly
record and retrieve information.
Data Warehouse: Data warehouses store data
from multiple databases, which makes it
easier to analyze.

Types
Database: Databases are used in data
warehousing. However, the term usually
refers to an online, transactional
processing database. There are other types
as well, including csv, html, and Excel
spreadsheets used for database purposes.
Data Warehouse: A data warehouse is an
analytical database that layers on top of
transactional databases to allow for
analytics.
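
The two kinds of work compared above can be illustrated with a small sketch. It assumes a hypothetical `orders` table and uses a single in-memory SQLite database to stand in for both systems, so it shows only the shape of the queries, not their performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# Transactional-style work: record one order, then read it back by key.
with conn:
    conn.execute(
        "INSERT INTO orders (region, amount) VALUES (?, ?)", ("north", 40.0)
    )
one_order = conn.execute(
    "SELECT region, amount FROM orders WHERE id = 1"
).fetchone()

# More orders arrive over time.
with conn:
    conn.executemany(
        "INSERT INTO orders (region, amount) VALUES (?, ?)",
        [("north", 60.0), ("south", 25.0)],
    )

# Analytical-style work: scan every row and aggregate by region.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(one_order, totals)
```

The first query touches a single row by its key, while the second must read the whole table, which is the kind of load a warehouse is meant to take off the operational database.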
2. It is not a data lake
Although they both are built for business
analytics purposes, the major difference
between a data lake and a data warehouse is
that a data lake stores all types of raw,
structured, and unstructured data from all data
sources in its native format until it is needed.
By contrast, a data warehouse stores
processed data in a more organized,
structured fashion that is readily available
for reporting and data analysis.
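
A rough sketch of this difference, with hypothetical event records and a hypothetical `purchases` table: the lake side keeps every record in its native form, while the warehouse side loads only what fits a schema defined in advance. A Python list and an in-memory SQLite database are stand-ins for real storage.

```python
import json
import sqlite3

# Hypothetical raw events, each a JSON string in its native format.
raw_events = [
    '{"user_id": "u1", "action": "purchase", "amount": 30}',
    '{"user_id": "u2", "action": "click", "page": "/home"}',
]

# Lake-style storage: keep every record untouched until it is needed.
lake = list(raw_events)

# Warehouse-style storage: a schema is fixed first, and only matching,
# parsed records are loaded.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE purchases (user_id TEXT NOT NULL, amount REAL NOT NULL)"
)
for line in raw_events:
    event = json.loads(line)
    if event.get("action") == "purchase":
        warehouse.execute(
            "INSERT INTO purchases VALUES (?, ?)",
            (event["user_id"], event["amount"]),
        )
warehouse.commit()

purchases = warehouse.execute(
    "SELECT user_id, amount FROM purchases"
).fetchall()
print(len(lake), purchases)
```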
3. It is not a data mart
Data warehouses are also sometimes
confused with data marts. But data
warehouses are generally much bigger and
contain a greater variety of data, while data
marts are limited in their application.
Data marts are often subsets of a warehouse,
designed to easily deliver specific data to a
specific user, for a specific application. In the
simplest terms, data marts can be thought of
as single-subject, while data warehouses
cover multiple subjects.
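
One way to picture a data mart as a single-subject subset is a view over a warehouse table. The sketch below assumes a hypothetical `warehouse_facts` table and `sales_mart` view in an in-memory SQLite database; real data marts are often separate stores rather than views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_facts (subject TEXT, metric TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("sales", "revenue", 500.0),
        ("finance", "expenses", 320.0),
        ("marketing", "ad_spend", 75.0),
    ],
)

# The mart exposes only the sales subject to its users.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT metric, value FROM warehouse_facts WHERE subject = 'sales'"
)

mart_rows = conn.execute("SELECT * FROM sales_mart").fetchall()
subjects = conn.execute(
    "SELECT COUNT(DISTINCT subject) FROM warehouse_facts"
).fetchone()[0]
print(mart_rows, subjects)
```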
What are data warehouses used for?
Many types of business data are
analyzed via data warehouses. The need
for a data warehouse often becomes
evident when analytic requirements run
afoul of the ongoing performance of
operational databases. Running a
complex query on a database requires the
database to enter a temporary fixed state.
This is often untenable for transactional
databases. A data warehouse is
employed to do the analytic work, leaving
the transactional database free to focus
on transactions.
Other benefits of a data warehouse
are the ability to analyze data from
multiple sources and to reconcile
differences in storage schema using
the ETL (extract, transform, load) process.
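
Reconciling schema differences during the transform step might look like the following sketch. The source names (`crm`, `billing`), their column names, and the common schema are all assumptions made for illustration.

```python
# Hypothetical mappings from each source's column names to a common schema.
COLUMN_MAPS = {
    "crm": {"cust_name": "customer", "total": "amount"},
    "billing": {"client": "customer", "invoice_amt": "amount"},
}

def to_common_schema(source, row):
    """Rename mapped columns and drop columns the warehouse does not keep."""
    mapping = COLUMN_MAPS[source]
    return {mapping[col]: val for col, val in row.items() if col in mapping}

crm_rows = [{"cust_name": "Acme", "total": 100.0}]
billing_rows = [{"client": "Acme", "invoice_amt": 40.0, "internal_note": "x"}]

unified = [to_common_schema("crm", r) for r in crm_rows]
unified += [to_common_schema("billing", r) for r in billing_rows]

# Once the rows share a schema, they can be analyzed together.
total_for_acme = sum(r["amount"] for r in unified if r["customer"] == "Acme")
print(unified, total_for_acme)
```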
DISADVANTAGES OF DATA
WAREHOUSING:
Data warehouses are relational databases that act
as data analysis tools, aggregating data from
multiple departments of a business into one data
store. Data warehouses are typically updated as
an end-of-day batch job, rather than being
churned by real-time transactional data. Their
primary benefit is giving managers better and
timelier data to make strategic decisions for the
company. However, they have some drawbacks
as well.
Extra Reporting Work
Depending on the size of the organization, a data
warehouse risks creating extra work for
departments. Each type of data that's needed in
the warehouse typically has to be generated by
the IT teams in each division of the business. This
can be as simple as duplicating data from an
existing database, but at other times, it involves
gathering data from customers or employees that
wasn't gathered before.
Cost/Benefit Ratio
A commonly cited disadvantage of data
warehousing is its cost/benefit ratio. A data
warehouse is a big IT project, and like many big IT
projects, it can consume many IT staff hours and
much of the budget to produce a tool that doesn't
get used often enough to justify the
implementation expense. That is before counting
the expense of maintaining the data warehouse
and updating it as the business grows and adapts
to the market.
Data Ownership Concerns
Data warehouses are often, but not always,
Software as a Service implementations, or cloud
services applications. Your data security in this
environment is only as good as your cloud vendor.
Even if implemented locally, there are concerns
about data access throughout the company. Make
sure that the people doing the analysis are
individuals that your organization trusts, especially
with customers' personal data. A data warehouse
that leaks customer data is a privacy and public
relations nightmare.
Data Flexibility
Data warehouses tend to have static data sets
with minimal ability to "drill down" to specific
solutions. The data is imported and filtered
through a schema, and it is often days or weeks
old by the time it's actually used. In addition, data
warehouses are usually subject to ad hoc queries
and are thus notoriously difficult to tune for
processing speed and query speed. While the
queries are often ad hoc, the queries are limited
by what data relations were set when the
aggregation was assembled.
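
The limit on drilling down can be shown with a small sketch, assuming hypothetical transaction records and a hypothetical `aggregate` helper. Once data has been aggregated by region alone, a later question about products cannot be answered from the stored totals.

```python
from collections import defaultdict

# Hypothetical source transactions, before aggregation.
transactions = [
    {"region": "north", "product": "widget", "amount": 10.0},
    {"region": "north", "product": "gadget", "amount": 15.0},
    {"region": "south", "product": "widget", "amount": 7.0},
]

def aggregate(rows, keys):
    """Sum amounts grouped by the chosen key columns."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row["amount"]
    return dict(totals)

# What the warehouse keeps if only region was chosen at load time.
by_region = aggregate(transactions, ["region"])

# Product-level detail exists only if it was kept in the aggregation.
by_region_product = aggregate(transactions, ["region", "product"])

# The region-only keys carry no product information to drill into.
can_drill_by_product = any(len(key) > 1 for key in by_region)
print(by_region, can_drill_by_product)
```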
