SQL Cheat Sheet - 1557131235
Types,
Definition & Example
What is Data Warehousing?
Data warehousing is a technique for collecting and managing data from varied
sources to provide meaningful business insights. It is a blend of technologies and
components that allows the strategic use of data.
You may know that a 3NF-designed database for an inventory system may have
tables related to each other. For example, a report on current inventory
information can include more than 12 join conditions. This can quickly slow
down the response time of the query and report. A data warehouse provides a new
design that helps reduce response time and improves the
performance of queries for reports and analytics.
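As a rough illustration of the kind of design a warehouse uses, the sketch below builds a tiny star schema (one fact table surrounded by dimension tables) in an in-memory SQLite database, so that a report needs only one join. The table names, column names, and sample rows are hypothetical and chosen only for this example; the source does not prescribe a particular schema.

```python
import sqlite3


def build_star_schema(conn):
    """Create a small, hypothetical star schema: one fact table, two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE dim_warehouse (
            warehouse_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE fact_inventory (
            product_id INTEGER REFERENCES dim_product(product_id),
            warehouse_id INTEGER REFERENCES dim_warehouse(warehouse_id),
            quantity INTEGER);
        INSERT INTO dim_product VALUES
            (1, 'Bolt', 'Hardware'), (2, 'Nut', 'Hardware'), (3, 'Glue', 'Adhesive');
        INSERT INTO dim_warehouse VALUES (1, 'Austin'), (2, 'Boston');
        INSERT INTO fact_inventory VALUES
            (1, 1, 100), (2, 1, 50), (3, 2, 20), (1, 2, 30);
    """)


def inventory_by_category(conn):
    """Report total stock per category with a single fact-to-dimension join."""
    return conn.execute("""
        SELECT p.category, SUM(f.quantity)
        FROM fact_inventory AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()


star_conn = sqlite3.connect(":memory:")
build_star_schema(star_conn)
print(inventory_by_category(star_conn))  # [('Adhesive', 20), ('Hardware', 180)]
```

The report touches the fact table and one dimension, rather than a long chain of normalized tables, which is the property the paragraph above attributes to a warehouse design.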
• 1960: Dartmouth and General Mills, in a joint research project, develop the
terms dimensions and facts.
• 1970: ACNielsen and IRI introduce dimensional data marts for retail sales.
• 1983: Teradata Corporation introduces a database management system
specifically designed for decision support.
• Data warehousing started in the late 1980s when IBM workers Paul Murphy
and Barry Devlin developed the Business Data Warehouse.
• However, the real concept was given by Bill Inmon, who is considered the
father of the data warehouse. He has written about a variety of topics on the
building, usage, and maintenance of the warehouse and the Corporate
Information Factory.
1. Structured
2. Semi-structured
3. Unstructured data
The data is processed, transformed, and ingested so that users can access the
processed data in the Data Warehouse through Business Intelligence tools, SQL
By merging all of this information in one place, an organization can analyze its
customers more holistically. This helps to ensure that it has considered all the
information available. Data warehousing makes data mining possible. Data mining
is looking for patterns in the data that may lead to higher sales and profits.
An Operational Data Store, also called an ODS, is a data store
required when neither the data warehouse nor OLTP systems support an organization's
reporting needs. In an ODS, the data warehouse is refreshed in real time. Hence, it is
widely preferred for routine activities like storing records of employees.
3. Data Mart:
A data mart is a subset of the data warehouse. It is specially designed for a particular
line of business, such as sales or finance. In an independent data
mart, data can be collected directly from sources.
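One simple way to picture a dependent data mart is as a filtered view over the warehouse. The sketch below assumes a hypothetical `warehouse_facts` table with a `business_line` column; real data marts are often separate physical stores, so treat the view as an illustration of the "subset" idea only.

```python
import sqlite3


def create_sales_mart(conn):
    """Expose only the sales rows of the (hypothetical) warehouse table."""
    conn.execute("""
        CREATE VIEW sales_mart AS
        SELECT record_id, region, amount
        FROM warehouse_facts
        WHERE business_line = 'sales'
    """)


mart_conn = sqlite3.connect(":memory:")
mart_conn.execute(
    "CREATE TABLE warehouse_facts ("
    "record_id INTEGER, business_line TEXT, region TEXT, amount INTEGER)"
)
mart_conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?, ?)",
    [
        (1, "sales", "East", 120),
        (2, "finance", "East", 900),
        (3, "sales", "West", 80),
    ],
)
create_sales_mart(mart_conn)

# The sales team queries the mart and never sees the finance rows.
print(mart_conn.execute("SELECT * FROM sales_mart ORDER BY record_id").fetchall())
# [(1, 'East', 120), (3, 'West', 80)]
```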
In this stage, data is just copied from an operational system to another server. In
this way, loading, processing, and reporting of the copied data do not impact the
operational system's performance.
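The copy step described above can be sketched with SQLite's online backup, which Python exposes as `Connection.backup`. Both databases here are in-memory stand-ins and the `bookings` table is hypothetical; in practice the copy would go to another server, as the text says.

```python
import sqlite3


def snapshot_to_reporting(operational, reporting):
    """Copy the whole operational database into a separate reporting database."""
    operational.commit()           # make sure pending writes are included in the copy
    operational.backup(reporting)  # sqlite3 online backup (Python 3.7+)


operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, seats INTEGER)"
)
operational.executemany(
    "INSERT INTO bookings (seats) VALUES (?)", [(2,), (1,), (4,)]
)

reporting = sqlite3.connect(":memory:")
snapshot_to_reporting(operational, reporting)

# New transactions hit only the operational system; reports run on the copy.
operational.execute("INSERT INTO bookings (seats) VALUES (3)")
operational.commit()

print(reporting.execute("SELECT COUNT(*), SUM(seats) FROM bookings").fetchone())
# (3, 7)
```

Because reports read the copy, heavy queries do not compete with the operational workload, at the cost of the copy being only as fresh as the last snapshot.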
In this stage, data warehouses are updated whenever any transaction takes place
in the operational database, for example in an airline or railway booking system.
In this stage, data warehouses are updated continuously when the operational
system performs a transaction. The data warehouse then generates transactions
which are passed back to the operational system.
Load manager: The load manager is also called the front component. It performs
all the operations associated with the extraction and loading of data into the
warehouse. These operations include transformations to prepare the data for
entering into the data warehouse.
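A minimal sketch of what a load manager does, assuming the extracted data arrives as a list of dictionaries: each row is transformed (whitespace trimmed, names normalized, amounts converted to numbers) and rows that cannot be cleaned are rejected before loading. The field names, the `sales` table, and the rejection rule are assumptions made for this example.

```python
import sqlite3

# Hypothetical raw rows, as they might come out of an extraction step.
RAW_ROWS = [
    {"customer": "  alice ", "amount": "10.50", "date": "2019-05-01"},
    {"customer": "BOB", "amount": "n/a", "date": "2019-05-02"},
    {"customer": "carol", "amount": "7", "date": "2019-05-03"},
]


def transform(row):
    """Clean one raw row; return None when the amount is not numeric."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return (row["customer"].strip().title(), amount, row["date"])


def load(conn, raw_rows):
    """Transform the raw rows and insert the clean ones; return how many loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer TEXT, amount REAL, sale_date TEXT)"
    )
    clean = [t for t in map(transform, raw_rows) if t is not None]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)
    conn.commit()
    return len(clean)


etl_conn = sqlite3.connect(":memory:")
print(load(etl_conn, RAW_ROWS))  # 2
print(etl_conn.execute("SELECT * FROM sales ORDER BY sale_date").fetchall())
# [('Alice', 10.5, '2019-05-01'), ('Carol', 7.0, '2019-05-03')]
```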
This is categorized into five different groups: 1. Data reporting 2. Query tools 3.
Application development tools 4. EIS tools 5. OLAP tools and data mining tools.
Airline:
In the airline system, it is used for operational purposes like crew assignment,
analysis of route profitability, frequent flyer program promotions, etc.
Banking:
It is widely used in the banking sector to manage the resources available on desk
effectively. Some banks also use it for market research and performance analysis of
products and operations.
The healthcare sector also uses data warehouses to strategize and predict outcomes,
generate patients' treatment reports, and share data with tie-in insurance companies,
medical aid services, etc.
Public sector:
In the public sector, the data warehouse is used for intelligence gathering. It helps
government agencies maintain and analyze tax records and health policy records
for every individual.
In this sector, the warehouses are primarily used to analyze data patterns,
customer trends, and to track market movements.
Retail chain:
In retail chains, the data warehouse is widely used for distribution and marketing. It
also helps to track items, customer buying patterns, and promotions, and is used for
determining pricing policy.
Telecommunication:
A data warehouse is used in this sector for product promotions, sales decisions, and
distribution decisions.
Hospitality Industry:
Here are the key steps in data warehouse implementation, along with their deliverables.
Step 7: Map Operational Data Store to Data Warehouse (deliverable: D/W Data Integration Map)
Step 9: Extract Data from Operational Data Store (deliverable: Integrated D/W Data Extracts)
• Decide a plan to test the consistency, accuracy, and integrity of the data.
• The data warehouse must be well integrated, well defined, and time
stamped.
• While designing a data warehouse, make sure you use the right tool, stick to the life
cycle, take care of data conflicts, and be ready to learn from your mistakes.
• Never replace operational systems and reports
• Don't spend too much time on extracting, cleaning and loading data.
• Involve all stakeholders, including business personnel, in the
data warehouse implementation process. Establish that data warehousing is
a joint team project. You don't want to create a data warehouse that is not
useful to the end users.
• Prepare a training plan for the end users.
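The first practice in the list above, testing the consistency, accuracy, and integrity of the data, can be sketched as a set of SQL checks that each count offending rows. The `fact_orders` and `dim_customer` tables, and the three checks chosen, are hypothetical; a real test plan would be driven by the warehouse's own rules.

```python
import sqlite3

# Each check is a query that returns the number of rows violating a rule.
CHECKS = {
    "null_customer_ids":
        "SELECT COUNT(*) FROM fact_orders WHERE customer_id IS NULL",
    "orphan_orders":
        "SELECT COUNT(*) FROM fact_orders AS f "
        "LEFT JOIN dim_customer AS c ON c.customer_id = f.customer_id "
        "WHERE f.customer_id IS NOT NULL AND c.customer_id IS NULL",
    "duplicate_order_ids":
        "SELECT COUNT(*) FROM (SELECT order_id FROM fact_orders "
        "GROUP BY order_id HAVING COUNT(*) > 1)",
}


def run_checks(conn):
    """Run every check and return a {check_name: violation_count} mapping."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


qa_conn = sqlite3.connect(":memory:")
qa_conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE fact_orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO dim_customer VALUES (1), (2);
    INSERT INTO fact_orders VALUES
        (10, 1, 5.0), (11, 2, 7.5), (11, 2, 7.5), (12, 99, 3.0), (13, NULL, 1.0);
""")
print(run_checks(qa_conn))
# {'null_customer_ids': 1, 'orphan_orders': 1, 'duplicate_order_ids': 1}
```

A load could be rejected, or flagged for review, whenever any count is non-zero.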
• A data warehouse allows business users to quickly access critical data from
several sources all in one place.
• A data warehouse provides consistent information on various cross-functional
activities. It also supports ad-hoc reporting and querying.
• Data Warehouse helps to integrate many sources of data to reduce stress on
the production system.
• Data warehouse helps to reduce total turnaround time for analysis and
reporting.
• Restructuring and Integration make it easier for the user to use for reporting
and analysis.
• A data warehouse allows users to access critical data from a number of
sources in a single place. Therefore, it saves users the time of retrieving data
from multiple sources.
• Data warehouse stores a large amount of historical data. This helps users to
analyze different time periods and trends to make future predictions.
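To make the last point concrete, the sketch below aggregates a hypothetical `sales_history` table by year and then computes the year-over-year change, the sort of trend analysis that stored history allows. The table, its columns, and the figures are invented for illustration.

```python
import sqlite3


def yearly_totals(conn):
    """Sum sales per calendar year from the (hypothetical) history table."""
    return conn.execute("""
        SELECT strftime('%Y', sale_date) AS year, SUM(amount)
        FROM sales_history
        GROUP BY year
        ORDER BY year
    """).fetchall()


def year_over_year(totals):
    """Return (year, change versus previous year) for every year after the first."""
    return [
        (year, total - prev_total)
        for (_, prev_total), (year, total) in zip(totals, totals[1:])
    ]


hist_conn = sqlite3.connect(":memory:")
hist_conn.execute("CREATE TABLE sales_history (sale_date TEXT, amount INTEGER)")
hist_conn.executemany(
    "INSERT INTO sales_history VALUES (?, ?)",
    [
        ("2017-03-01", 100), ("2017-09-15", 50),
        ("2018-06-30", 200),
        ("2019-01-10", 120), ("2019-11-05", 130),
    ],
)

totals = yearly_totals(hist_conn)
print(totals)                  # [('2017', 150), ('2018', 200), ('2019', 250)]
print(year_over_year(totals))  # [('2018', 50), ('2019', 50)]
```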
1. MarkLogic:
MarkLogic is a useful data warehousing solution that makes data integration easier
and faster using an array of enterprise features. This tool helps to perform very
complex search operations. It can query different types of data like documents,
relationships, and metadata.
http://developer.marklogic.com/products
2. Oracle:
https://www.oracle.com/index.html
3. Amazon RedShift:
https://aws.amazon.com/redshift/?nc2=h_m1