
Database Warehouse Data Mining

Data warehousing is a process for collecting and managing data from various sources to provide business insights, serving as the core of business intelligence systems. It includes types like Enterprise Data Warehouse, Operational Data Store, and Data Mart, and involves stages from offline operational databases to integrated data warehouses. Data analytics, which encompasses descriptive, diagnostic, predictive, and prescriptive analytics, utilizes data mining to extract valuable information for informed decision-making across various industries.

Uploaded by

Ashmit Singh

U-3 DATA WAREHOUSING
INTRODUCTION :-

• Data Warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights.
• A data warehouse is typically used to connect and analyze business data from heterogeneous sources.
• The data warehouse is the core of the BI (business intelligence) system, which is built for data analysis and reporting.
A data warehouse system is also known by the following names:
• Decision Support System (DSS)
• Executive Information System
• Management Information System
• Business Intelligence Solution
• Analytic Application
• Data Warehouse
WHY IS A DATA WAREHOUSE NEEDED :-
• A data warehouse allows users to access critical data from a number of sources in a single place.
• It saves users the time of retrieving data from multiple sources.
• A data warehouse stores a large amount of historical data. This helps users analyze different time periods and trends to make future predictions.
TYPES OF DATA WAREHOUSE :-
• There are three main types of Data Warehouses (DWH) :-
1) Enterprise Data Warehouse (EDW):
An Enterprise Data Warehouse (EDW) is a centralized warehouse. It provides decision support services across the enterprise and offers a unified approach for organizing and representing data.
2) Operational Data Store (ODS):
An Operational Data Store (ODS) is a data store used when neither the data warehouse nor the OLTP (Online Transaction Processing) systems support an organization's reporting needs. In an ODS, the data is refreshed in real time.
3) Data Mart:
A data mart is a subset of the data warehouse. It is specially designed for a particular line of business, such as sales or finance. In an independent data mart, data can be collected directly from sources.
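The idea of a data mart as a subset of the warehouse can be illustrated with a small sketch. It assumes a single hypothetical `warehouse` table in an in-memory SQLite database and models the mart as a view restricted to one line of business; the table, column names, and rows are made up for illustration and are not taken from any particular product.

```python
import sqlite3

# Hypothetical warehouse table; the column names are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse (dept TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse VALUES (?, ?, ?)",
    [
        ("sales", "laptop", 1200.0),
        ("finance", "audit fee", 300.0),
        ("sales", "phone", 800.0),
    ],
)

# A data mart modeled as a view: only the sales rows are exposed.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT item, amount FROM warehouse WHERE dept = 'sales'"
)

sales_rows = conn.execute(
    "SELECT item, amount FROM sales_mart ORDER BY item"
).fetchall()
print(sales_rows)
```

A view is only one possible way to model this; a real data mart is often a separate physical store.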
GENERAL STAGES OF DATA WAREHOUSE :-
• Offline Operational Database: In this stage, data is simply copied from an operational system to another server.

• Offline Data Warehouse: In this stage, data in the data warehouse is regularly updated from the operational database.

• Real-time Data Warehouse: In this stage, the data warehouse is updated whenever a transaction takes place in the operational database.

• Integrated Data Warehouse: In this stage, the data warehouse is updated continuously as the operational system performs transactions.
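The difference between a regularly refreshed (offline) warehouse and a real-time one can be sketched as follows. This is a toy model under stated assumptions: plain Python lists stand in for the operational database and the two warehouses, and the function names `record_transaction` and `scheduled_refresh` are hypothetical.

```python
# Hypothetical in-memory stand-ins for an operational database and two warehouses.
operational_db = []
offline_warehouse = []
realtime_warehouse = []


def record_transaction(txn):
    """Store a transaction; the real-time warehouse is updated immediately."""
    operational_db.append(txn)
    realtime_warehouse.append(txn)


def scheduled_refresh():
    """Offline stage: copy over whatever the warehouse has not seen yet."""
    new_rows = operational_db[len(offline_warehouse):]
    offline_warehouse.extend(new_rows)


record_transaction({"id": 1, "amount": 50})
record_transaction({"id": 2, "amount": 75})

# Before the scheduled refresh, only the real-time warehouse is current.
lag_before_refresh = len(realtime_warehouse) - len(offline_warehouse)

scheduled_refresh()
lag_after_refresh = len(realtime_warehouse) - len(offline_warehouse)
print(lag_before_refresh, lag_after_refresh)
```

The offline warehouse lags behind until its next refresh, while the real-time warehouse sees every transaction as it happens.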
COMPONENTS OF DATA WAREHOUSE:-
• Load Manager: The load manager is also called the front component. It performs all the operations associated with the extraction and loading of data into the warehouse.

• Warehouse Manager: The warehouse manager performs operations associated with the management of the data in the warehouse.

• Query Manager: The query manager is also known as the backend component. It performs all the operations related to the management of user queries.

• End-user access tools: These are categorized into five groups:
1. Data reporting
2. Query tools
3. Application development tools
4. EIS (Executive Information System) tools
5. OLAP tools and data mining tools
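The roles of the load manager (extraction and loading) and the query manager (answering user queries) can be sketched roughly as below. This is a minimal sketch, not how any specific warehouse product works: the CSV text, the `orders` table, and the `extract`, `transform`, and `load` function names are all assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; the data and column names are made up.
SOURCE_CSV = "order_id,region,amount\n1, north ,100.50\n2,South,200\n3,NORTH,50\n"


def extract(text):
    """Extract: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize region names and convert amounts to floats."""
    return [
        (int(r["order_id"]), r["region"].strip().lower(), float(r["amount"]))
        for r in rows
    ]


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))

# Query-manager side: answer a user query against the loaded data.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

Cleaning the region names before loading is what lets the later query group " north " and "NORTH" together.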
WHAT IS A DATA WAREHOUSE USED FOR?
Here are the most common sectors where a data warehouse is used:
• Airline: In airline systems, it is used for operational purposes such as crew assignment, analysis of route profitability, and frequent flyer program promotions.
• Banking: It is widely used in the banking sector to manage the resources available on the desk effectively. Some banks also use it for market research and for performance analysis of products and operations.
• Telecommunication: A data warehouse is used in this sector for product promotions, sales decisions, and distribution decisions.
• Investment and Insurance sector: In this sector, warehouses are primarily used to analyze data patterns and customer trends, and to track market movements.
• And many more…
STEPS TO IMPLEMENT DATA WAREHOUSE :-
The best way to address the business risk associated with a data warehouse implementation is to employ a three-pronged strategy, as below.
• Enterprise strategy: Here we identify the technical aspects, including the current architecture and tools. We also identify facts, dimensions, and attributes. Data mapping and transformation are also addressed at this stage.

• Phased delivery: Data warehouse implementation should be phased based on subject areas. Related business entities like booking and billing should first be implemented and then integrated with each other.

• Iterative prototyping: Rather than a big bang approach to implementation, the data warehouse should be developed and tested iteratively.
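The facts, dimensions, and attributes identified under the enterprise strategy can be pictured with a tiny fact-and-dimension layout. The sketch below is an assumption for illustration only: the `fact_billing` and `dim_customer` tables, their columns, and the sample rows are hypothetical, and the slides do not prescribe any particular schema design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Dimension table: descriptive attributes (hypothetical columns).
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, city TEXT)")
# Fact table: numeric measures keyed by the dimension.
conn.execute("CREATE TABLE fact_billing (customer_id INTEGER, amount REAL)")

conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "Pune"), (2, "Delhi")])
conn.executemany(
    "INSERT INTO fact_billing VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 40.0)],
)

# Join the fact to its dimension and aggregate the measure by an attribute.
billing_by_city = conn.execute(
    "SELECT d.city, SUM(f.amount) "
    "FROM fact_billing AS f JOIN dim_customer AS d USING (customer_id) "
    "GROUP BY d.city ORDER BY d.city"
).fetchall()
print(billing_by_city)
```

Here the billed amount is the fact, the customer is the dimension, and the city is an attribute of that dimension.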
ADVANTAGES & DISADVANTAGES:-
• Advantages of Data Warehousing:-
1. A data warehouse provides consistent information on various cross-functional activities. It also supports reporting and querying.
2. A data warehouse helps reduce the total turnaround time for analysis and reporting.
3. Restructuring and integration make the data easier to use for reporting and analysis.
4. A data warehouse allows users to access critical data from a number of sources in a single place. Therefore, it saves users the time of retrieving data from multiple sources.
5. A data warehouse stores a large amount of historical data. This helps users analyze different time periods and trends to make future predictions.
• Disadvantages of Data Warehousing :-
1. It is not an ideal option for unstructured data.
2. Creating and implementing a data warehouse is a time-consuming affair.
3. A data warehouse may seem easy, but it is actually too complex for average users.
4. Organizations need to spend a lot of their resources on training and implementation.
5. A data warehouse can become outdated relatively quickly.
THE FUTURE OF DATA WAREHOUSING :-
• Changes in regulatory constraints may limit the ability to combine disparate sources of data. These disparate sources may include unstructured data, which is difficult to store.
• As databases grow, the estimates of what constitutes a very large database continue to grow. It is complex to build and run data warehouse systems that are always increasing in size. The hardware and software resources available today do not allow a large amount of data to be kept online.
• Multimedia data cannot be manipulated as easily as text data, whereas textual information can be retrieved by the relational software available today. This could be a research subject.
U-3 DATA WAREHOUSE TOOLS :-

1. MarkLogic. https://www.marklogic.com/product/getting-started/
2. Oracle. https://www.oracle.com/index.html
3. Amazon Redshift. https://aws.amazon.com/redshift/?nc2=h_m1
U-4 DATA ANALYTICS
INTRODUCTION:-
Data analytics (DA) is the process of examining data sets in order to find trends and draw conclusions about the information they contain. Increasingly, data analytics is done with the aid of specialized systems and software. Data analytics technologies and techniques are widely used in commercial industries to enable organizations to make more-informed business decisions. Scientists and researchers also use analytics tools to verify or disprove scientific models, theories and hypotheses.
TYPES OF DATA ANALYTICS:-
1) Descriptive Analytics:- Descriptive analytics answers the question of what happened. It transforms raw information from numerous data sources into meaningful knowledge about the past.
2) Diagnostic Analytics:- At this stage, historical information is compared against other data to answer the question of why something happened. Diagnostic analytics provides in-depth insight into a specific issue.
3) Predictive Analytics:- Predictive analytics, as the name suggests, is concerned with predicting the future. It uses the findings of descriptive and diagnostic analytics to identify clusters and exceptions and to predict future trends, which makes it a valuable tool for forecasting.
4) Prescriptive Analytics:- The purpose of prescriptive analytics is to prescribe what move to make to eliminate a future issue or take full advantage of a promising trend. Prescriptive analytics utilizes advanced tools and technologies, such as machine learning, business rules, and algorithms, which makes it sophisticated to implement and manage.

Pratibha Patil – 45
Data analytics is the process of examining datasets to draw conclusions about the information they contain.
There are several types of data analytics, each serving different purposes:

1. Descriptive Analytics
- Purpose: To describe or summarize historical data, showing what has happened in the past.
- Example: A retail store analyzes last year's sales data to determine the average monthly sales and the best-selling products. The result might show that sales peak in December and that Product A is the top seller during that time.

2. Diagnostic Analytics
- Purpose: To understand why something happened by identifying patterns or correlations in historical data.
- Example: An e-commerce company sees a drop in sales in April. They use diagnostic analytics to find that a competitor launched a major sale event during that time, causing the decline in their own sales.

3. Predictive Analytics
- Purpose: To forecast future events based on historical data and statistical models.
- Example: A bank uses predictive analytics to forecast which customers are most likely to default on their loans by analyzing past customer data, such as payment history and income.

4. Prescriptive Analytics
- Purpose: To recommend actions to achieve desired outcomes based on data analysis.
- Example: A logistics company uses prescriptive analytics to suggest the optimal delivery routes for their trucks, considering factors such as traffic patterns, weather, and delivery urgency, to minimize fuel consumption and delivery time.

5. Exploratory Analytics
- Purpose: To explore and discover patterns or relationships in data without having a specific hypothesis.
- Example: A healthcare organization may explore patient data to identify unexpected correlations between lifestyle factors and health outcomes, which might help in forming new research hypotheses.

Each type of data analytics has its unique use case depending on the problem being
addressed or the decision-making process.
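To make the contrast between descriptive and predictive analytics concrete, here is a minimal sketch on made-up monthly sales figures. The data, the variable names, and the choice of a straight-line least-squares trend are all assumptions for illustration; real predictive work would normally use richer models and validation.

```python
from statistics import mean

# Hypothetical monthly sales figures (units sold); not from the source.
monthly_sales = [100, 110, 120, 130, 140, 150]

# Descriptive analytics: summarize what already happened.
average_sales = mean(monthly_sales)
best_month = monthly_sales.index(max(monthly_sales)) + 1


def fit_line(ys):
    """Least-squares line through the points (1, ys[0]), (2, ys[1]), ..."""
    xs = range(1, len(ys) + 1)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


# Predictive analytics: extrapolate the fitted trend to the next month.
slope, intercept = fit_line(monthly_sales)
forecast_month_7 = slope * 7 + intercept
print(average_sales, best_month, forecast_month_7)
```

The descriptive part only looks backward (average and best month), while the predictive part fits a trend to the same history and projects it one month ahead.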
DATA MINING:-
Data mining is a process used by companies to turn raw data into useful information. By using software to look for patterns in large batches of data, businesses can learn more about their customers to develop more effective marketing strategies, increase sales and decrease costs. In short, it is the process of discovering actionable information from large sets of data.
Data mining uses mathematical analysis to derive patterns and trends that exist in data.
• The main purpose of data mining :-
Data mining is the process of uncovering patterns and finding anomalies and relationships in large datasets that can be used to make predictions about future trends. The main purpose of data mining is to extract valuable information from available data.

• Use of Data Mining :-
Data mining is used in various fields such as research, business, marketing, sales, product development, education, and healthcare. When used appropriately, data mining provides a significant advantage over competitors by providing more information about customers, and it helps develop better and more effective marketing strategies that raise revenue and lower costs.
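Finding anomalies, one of the tasks named above, can be sketched with a simple z-score rule. This is just one of many possible techniques and is chosen here only for brevity; the transaction amounts, the `find_anomalies` helper, and the threshold of 2.0 are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical daily transaction amounts; one value is deliberately unusual.
amounts = [52, 48, 50, 51, 49, 50, 120, 47, 53, 50]


def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


anomalies = find_anomalies(amounts)
print(anomalies)
```

A value is flagged when it lies more than `threshold` standard deviations from the mean, which is a crude but common first check before applying more robust methods.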
ADVANTAGES AND DISADVANTAGES OF DATA MINING :-
Advantages:-
• It helps companies gather reliable information.
• It is an efficient, cost-effective solution compared to other data applications.
• It helps businesses make profitable production and operational adjustments.
• Data mining works with both new and legacy systems.
• It helps businesses make informed decisions.
• It helps detect credit risks and fraud.
Disadvantages:-
• Many data analytics tools are complex and challenging to use. Data scientists need the right training to use the tools effectively.
• Companies can potentially sell the customer data they have gleaned to other businesses and organizations, raising privacy concerns.
• Data mining requires large databases, making the process hard to manage.
HOW DATA MINING WORKS:-
• a) First, data is collected and loaded into the data warehouse.
• b) Then the data is managed and stored either in the cloud or on in-house servers.
• c) The data is assessed by management teams, business analysts, and information technology professionals, who determine how to organize it.
• d) Then, based on the users' results, the data is sorted by the application software.
• e) Finally, the data is presented by the end user in an easy-to-share format, such as a graph or a table.
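The steps above can be loosely sketched as a small pipeline. The sketch makes several simplifying assumptions: the storage step is reduced to an in-memory list, "organizing" the data is taken to mean counting item pairs bought together, and the basket contents are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Steps a-b: hypothetical purchase records collected and held in one place.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

# Step c: organize the data as counts of item pairs bought together.
pair_counts = Counter()
for basket in baskets:
    pair_counts.update(combinations(sorted(set(basket)), 2))

# Step d: sort the pairs by how often they occur (ties broken alphabetically).
ranked = sorted(pair_counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Step e: present the top results as a small text table.
table = [f"{a} & {b}: {count}" for (a, b), count in ranked[:3]]
print("\n".join(table))
```

The ranked pairs are the kind of pattern a business could act on, for example by promoting items that are frequently bought together.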
DATA MINING APPLICATIONS:-
• Banks
• Healthcare
• Marketing
CLOUD COMPUTING
• Cloud computing is the delivery of computing services, including servers, storage, databases, networking, software, analytics, and intelligence, over the Internet (“the cloud”) to offer faster innovation, flexible resources, and economies of scale.
• In the simplest terms, cloud computing means storing and accessing data and programs over the internet instead of on your computer's hard drive.
• Cloud computing is named as such because the information being accessed is found remotely in the cloud or a virtual space. Companies that provide cloud services enable users to store files and applications on remote servers and then access all the data via the Internet.
• Purpose of Cloud Computing:-
The goal of cloud computing is to allow users to benefit from all of these technologies without needing deep knowledge of or expertise with each one of them. The cloud aims to cut costs and help users focus on their core business instead of being impeded by IT obstacles.

• Some of the most common reasons to use the cloud:-
1. File storage: You can store all types of information in the cloud, including files and email.
2. File sharing: The cloud makes it easy to share files with several people at the same time.
3. Backing up data: You can also use the cloud to protect your files.
ADVANTAGES OF CLOUD COMPUTING:-
1. Cost Savings.
2. Security.
3. Flexibility.
4. Mobility.
5. Insight.
6. Disaster Recovery.
