1 What Is Data Mining
1960- Dartmouth and General Mills, in a joint research project, develop the
terms dimensions and facts.
1970- ACNielsen and IRI introduce dimensional data marts for retail
sales.
1983- Teradata Corporation introduces a database management system
specifically designed for decision support.
# The idea of data warehousing emerged in the late 1980s, when IBM
researchers Barry Devlin and Paul Murphy developed the "Business
Data Warehouse".
However, the concept itself was defined by Bill Inmon, who is regarded as
the father of the data warehouse. He has written on a variety of topics
covering the building, usage, and maintenance of the warehouse and the
Corporate Information Factory.
# The term "Data Warehouse" was first coined by Bill Inmon in 1990.
According to Inmon, a data warehouse is a subject-oriented, integrated,
time-variant, and non-volatile collection of data. This data helps analysts make
informed decisions in an organization.
In short, the data warehousing idea was conceived to support an architectural model
for the flow of information from operational systems to decision support
environments. The concept attempts to address the various problems associated
with this flow, mainly its high costs.
Data warehouses consolidate data from several sources and ensure data accuracy,
quality, and consistency. In a data warehouse, data is organized into a structured
format by type and as needed. The data is then examined with query tools.
Data warehouses store historical data and handle analytical requests faster, which
supports online analytical processing (OLAP), whereas an operational database stores
the current transactions of a business process, which is called online transaction
processing (OLTP).
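The contrast between transactional and analytical workloads can be sketched with
Python's built-in sqlite3 module. This is only an illustration under stated
assumptions: the "sales" table, its columns, and the sample rows are hypothetical,
and a real warehouse would run on a dedicated analytical system rather than SQLite.

```python
import sqlite3

# In-memory database; the "sales" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sale_year INTEGER, "
    "product TEXT, amount REAL)"
)

# Transaction-style work: small writes, each recording one business event.
rows = [
    (2022, "tea", 10.0),
    (2022, "coffee", 15.0),
    (2023, "tea", 12.0),
    (2023, "coffee", 18.0),
    (2023, "tea", 8.0),
]
with conn:
    conn.executemany(
        "INSERT INTO sales (sale_year, product, amount) VALUES (?, ?, ?)", rows
    )

# Analytical-style work: one read-only query that aggregates the history.
yearly_totals = conn.execute(
    "SELECT sale_year, product, SUM(amount) FROM sales "
    "GROUP BY sale_year, product ORDER BY sale_year, product"
).fetchall()

for year, product, total in yearly_totals:
    print(year, product, total)
```

The inserts touch one row at a time, while the final query scans the whole history
to produce a summary, which is the kind of request a warehouse is organized to
answer quickly.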
Integrated:
A data warehouse is built by combining different sources, such as flat files
or relational databases. Data in a data warehouse comes from several operational
systems. Before the data is integrated, some steps are followed:
2.) Transformation
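A transformation step can be sketched as follows. This is a minimal illustration,
assuming two hypothetical source systems (a CRM and a billing system) that use
different field names, gender codes, and date formats; all names and values here
are invented for the example.

```python
from datetime import datetime

# Hypothetical records from two operational systems with different conventions.
crm_rows = [{"cust": "A1", "gender": "M", "joined": "2023-01-15"}]
billing_rows = [{"customer_id": "B7", "sex": "female", "join_date": "15/02/2023"}]

# One agreed encoding for the warehouse (an assumption of this sketch).
GENDER_MAP = {"m": "M", "male": "M", "f": "F", "female": "F"}


def transform(row, id_key, gender_key, date_key, date_format):
    """Map one source row onto the shared warehouse layout."""
    return {
        "customer_id": row[id_key],
        "gender": GENDER_MAP[row[gender_key].strip().lower()],
        "joined": datetime.strptime(row[date_key], date_format).date().isoformat(),
    }


integrated = [
    transform(r, "cust", "gender", "joined", "%Y-%m-%d") for r in crm_rows
] + [
    transform(r, "customer_id", "sex", "join_date", "%d/%m/%Y") for r in billing_rows
]

for row in integrated:
    print(row)
```

After this step both sources share the same field names, gender codes, and ISO date
format, so they can be stored and queried together.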
Non-volatile:
This means that earlier data is not deleted when new data is added to the data
warehouse. The operational database and the data warehouse are kept separate,
so continuous changes in the operational database are not reflected in
the data warehouse.
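The non-volatile property can be sketched with an append-only table. This is a
simplified illustration, not a description of any particular product: the
WarehouseTable class, the "operational" dictionary, and the product data are
hypothetical.

```python
from datetime import date


class WarehouseTable:
    """Append-only store: loads add rows, nothing is updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot_date, records):
        # Each record is copied and stamped with its load date, then appended.
        for record in records:
            self._rows.append({"snapshot_date": snapshot_date, **record})

    def history(self, key, value):
        return [r for r in self._rows if r[key] == value]


# Hypothetical operational system that overwrites values in place.
operational = {"P1": {"product": "P1", "price": 100}}

warehouse = WarehouseTable()
warehouse.load(date(2024, 1, 1), operational.values())

# The operational price changes; the warehouse does not see it until the next load.
operational["P1"]["price"] = 120
prices_before_reload = [r["price"] for r in warehouse.history("product", "P1")]

# The next load adds a new row and keeps the earlier one.
warehouse.load(date(2024, 2, 1), operational.values())
prices = [r["price"] for r in warehouse.history("product", "P1")]

print(prices_before_reload)
print(prices)
```

The operational system only knows the current price, while the warehouse retains
both the old and the new value, each tied to the date it was loaded.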
Subject Oriented:
A data warehouse provides important data about a specific subject, such as suppliers,
products, promotions, customers, etc. Data warehousing usually handles the analysis
and modelling of data that helps an organization make data-driven decisions.
Applications of Data Warehouses:
Banking Services
Consumer Goods
Manufacturing
Financial Services
Retail Sectors
Benefits of Data Warehousing
Data mining uses statistics, artificial intelligence, machine learning, and
database systems to find hidden patterns in the data. It helps answer business-related
questions that would otherwise be time-consuming to resolve.
Research
Education Sector
Transportation
Market Basket Analysis
Business Transactions
Intrusion Detection
Scientific Analysis
Finance and Banking Sector
Insurance and Healthcare
Common Tools and Software Used in Data Warehousing and Data Mining
Let’s look at the common tools and software used in data warehousing and data
mining.
Some of the popular data warehouse tools are:
Amazon Redshift
Microsoft Azure
Google BigQuery
Snowflake
Micro Focus Vertica
Amazon DynamoDB
Some of the popular data mining tools are:
RapidMiner
MonkeyLearn
IBM SPSS Modeler
Oracle Data Mining
Knime
Weka
Orange
H2O
Apache Mahout
SAS Enterprise Miner
Let’s look at the common techniques used in data mining and data warehousing.
The most common techniques of data mining are:
Association
Clustering
Data Visualisation
Data Cleaning
Machine Learning
Classification
Neural Networks
Prediction
Data Warehousing
Outlier Detection
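As an illustration of the association technique listed above, the sketch below
computes support and confidence for item pairs in a handful of shopping baskets.
The baskets and the helper names (support, confidence) are hypothetical, and
production tools typically use dedicated algorithms such as Apriori instead of
this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in transactions:
    item_counts.update(basket)
    # Sorting gives every pair one canonical order, e.g. ("bread", "milk").
    pair_counts.update(combinations(sorted(basket), 2))


def support(pair):
    """Fraction of all transactions that contain both items of the pair."""
    return pair_counts[tuple(sorted(pair))] / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the share with the consequent."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


print(support(("bread", "milk")))       # 0.5
print(confidence("butter", "bread"))    # 1.0
```

Here every basket containing butter also contains bread, so the rule
"butter implies bread" has a confidence of 1.0, while bread and milk appear
together in half of all baskets.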
Common Data Warehousing Techniques
Database Compression
Columnar Data Storage
In-Memory Processing
Massive Parallel Processing (MPP)
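Columnar storage and compression from the list above can be sketched together.
This is a toy illustration with hypothetical rows and a simple run-length
encoding. Real warehouse engines use more elaborate storage formats and
compression schemes.

```python
# Hypothetical row-oriented records, as an operational system might hold them.
rows = [
    {"region": "north", "units": 5},
    {"region": "north", "units": 7},
    {"region": "north", "units": 2},
    {"region": "south", "units": 4},
    {"region": "south", "units": 6},
]

# Columnar layout: one list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Compress consecutive repeats into (value, run_length) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


# Repetitive columns compress well once their values sit next to each other.
region_rle = run_length_encode(columns["region"])

# An aggregate only needs to read the one column it uses.
total_units = sum(columns["units"])

print(region_rle)
print(total_units)
```

Storing each column contiguously is what makes both steps cheap: the repeated
region values collapse into two runs, and the sum never touches the region data.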
Scope of Data Mining & Data Warehouse
Data mining and data warehousing differ in scope. Data mining involves sifting
through enormous data sets to identify relationships and patterns that can help
solve business problems through data analysis. The scope and
techniques of data mining enable enterprises to predict future trends and make
informed business decisions.
On the other hand, the scope of data warehousing extends to any domain that
involves analytics. Now, let us discuss the challenges faced in data
mining and data warehousing.