Data Warehousing and Data Mining

about DATA WAREHOUSING AND DATA MINING

Uploaded by

hod.mba
Copyright © All Rights Reserved

Management Information Systems-UNIT-2(PART-2)

DATA WAREHOUSING AND DATA MINING

DATA WAREHOUSE

A data warehouse is a logical collection of information gathered from many different databases. A
data warehouse may thus be described as a large database containing historical transactions and other
data. For example, consider a department store that buys and sells grocery items. Its data warehouse
would hold granular data, that is, information in its rawest form: each individual transaction may be
recorded in the data warehouse.

The PURPOSE OF A DATA WAREHOUSE is the permanent storage of detailed information. Data entered into
a data warehouse needs to be processed to ensure that it is clean, complete and in a proper format.
Often, a data warehouse is subdivided into smaller repositories called ‘Data Marts’. A data mart is a
subset of a data warehouse, in which only the required portion of the data warehouse information is
kept.
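
The relationship between a warehouse and a data mart can be illustrated with a minimal Python sketch.
The records, field layout and the function name build_data_mart are hypothetical, chosen only to
mirror the department store example above; a real warehouse would live in a database, not a list.

```python
# Hypothetical granular transactions held in a department store's data warehouse.
# Each tuple: (transaction_id, date, department, item, quantity, unit_price)
warehouse = [
    (1, "2024-01-05", "grocery", "rice", 2, 55.0),
    (2, "2024-01-05", "grocery", "sugar", 1, 42.0),
    (3, "2024-01-06", "clothing", "shirt", 1, 499.0),
    (4, "2024-01-07", "grocery", "rice", 3, 55.0),
]


def build_data_mart(rows, department):
    """Return only the rows for one department (the 'required portion')."""
    return [row for row in rows if row[2] == department]


# The grocery data mart is simply a subset of the warehouse.
grocery_mart = build_data_mart(warehouse, "grocery")

# Because the data is granular, totals can be derived from single transactions.
grocery_revenue = sum(qty * price for (_, _, _, _, qty, price) in grocery_mart)
print(len(grocery_mart), grocery_revenue)
```

The warehouse keeps every transaction, while the mart exposes only the slice one group of users needs.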

Features of Data warehousing

1. SUBJECT-ORIENTED

It focuses on the modeling and analysis of data relating to a specific subject area. The data
warehouse is organized around subjects such as product, customer and sales.

2. INTEGRATED

It integrates data from various applications, such as ERP and CRM systems.

3. HISTORICAL PERSPECTIVE

A data warehouse is time-variant: it takes a historical perspective in its approach, for example
holding data for the past 5-10 years.

4. NON-VOLATILE

It means that data is stored permanently, i.e. data once stored is not updated or deleted; new data
is only added.

Data warehouses are capable of storing vast quantities of data, but implementing data warehousing
applications is a challenge. For successful implementation, organizations need to be very careful
about data quality. Missing and miscoded data have to be cleaned up, and variables often come in a
variety of types, such as nominal data with no numeric content, dates, counts and averages.
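
A minimal Python sketch of such a clean-up is given below. The records, the GENDER_CODES table and the
clean function are hypothetical and only illustrate the idea: miscoded nominal values are mapped to
one standard code, and records that are missing a value or cannot be decoded are set aside for review.

```python
# Hypothetical raw records arriving from different source systems.
raw_records = [
    {"customer": "C1", "gender": "F", "amount": "250"},
    {"customer": "C2", "gender": "female", "amount": "100"},
    {"customer": "C3", "gender": "M", "amount": None},   # missing value
    {"customer": "C4", "gender": "X9", "amount": "75"},  # miscoded value
]

# Assumed mapping from the codes seen in the sources to one standard code.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def clean(records):
    """Split records into cleaned rows and rejected rows."""
    cleaned, rejected = [], []
    for rec in records:
        gender = GENDER_CODES.get(str(rec["gender"]).strip().lower())
        if gender is None or rec["amount"] is None:
            rejected.append(rec)  # keep the original for manual review
            continue
        cleaned.append(
            {
                "customer": rec["customer"],
                "gender": gender,                # nominal data, standardised
                "amount": float(rec["amount"]),  # numeric data, converted
            }
        )
    return cleaned, rejected


cleaned, rejected = clean(raw_records)
print([r["customer"] for r in cleaned], [r["customer"] for r in rejected])
```

Rejected rows are kept rather than discarded, so that they can be corrected at the source.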

Thus, organizations must ensure data quality in a data warehouse. To make data warehouses useful,
organizations must use BI (business intelligence) tools to process data into meaningful information.
Data warehouses are used for data mining and online analytical processing (OLAP).

The organizations that develop business intelligence (BI) tools create interfaces that help managers
grasp business situations quickly. Such an interface is simple to understand, which makes
interpretation by managers easy. One example of such an interface is the dashboard, so called because
it looks similar to a car dashboard: it uses visual images such as speedometer-like indicators for
periodic revenues, profits and other financial information, along with bar charts, line graphs and
other graphical representations.

DATA MINING

Definition

It is defined as a process used to extract usable data from a larger set of raw data.

It is the process of discovering or mining knowledge from a large amount of data.

It attempts to extract hidden patterns and trends from large databases.

It also supports automatic exploration of data.

Data mining queries are more advanced and sophisticated than traditional queries.

For example, a typical traditional query may be “What is the relationship between the amount of
product A and the amount of product B that an organization sold over the past week?”

Whereas in data mining, the manager would be interested to know the products that would be in
demand on the coming weekend, and thus the data mining query may be “Find out the products most
likely to have the maximum demand on the coming weekend.”
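
The contrast between the two kinds of query can be sketched in Python. The sales figures and both
function names are hypothetical, and the “prediction” is a deliberately naive assumption (the average
of past weekend sales), not a real data mining algorithm.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales: (day, product, units sold)
sales = [
    (date(2024, 3, 2), "A", 10), (date(2024, 3, 2), "B", 4),    # Saturday
    (date(2024, 3, 3), "A", 12), (date(2024, 3, 3), "B", 5),    # Sunday
    (date(2024, 3, 6), "A", 3), (date(2024, 3, 6), "B", 20),    # Wednesday
    (date(2024, 3, 9), "A", 14), (date(2024, 3, 9), "B", 6),    # Saturday
    (date(2024, 3, 10), "A", 16), (date(2024, 3, 10), "B", 5),  # Sunday
]


def weekly_totals(rows, start, end):
    """Traditional query: how much of each product was sold in a period?"""
    totals = defaultdict(int)
    for day, product, units in rows:
        if start <= day <= end:
            totals[product] += units
    return dict(totals)


def likely_weekend_best_seller(rows):
    """Mining-style query: which product is likely to sell most next weekend?

    Naive assumption: next weekend resembles the average past weekend day.
    """
    weekend_units = defaultdict(list)
    for day, product, units in rows:
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            weekend_units[product].append(units)
    averages = {p: sum(u) / len(u) for p, u in weekend_units.items()}
    return max(averages, key=averages.get), averages


print(weekly_totals(sales, date(2024, 3, 4), date(2024, 3, 10)))  # {'A': 33, 'B': 31}
print(likely_weekend_best_seller(sales))  # ('A', {'A': 13.0, 'B': 5.0})
```

The weekly totals make the two products look almost equal, while the weekend pattern hidden in the
history clearly favours product A.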

The combination of data-warehousing techniques and data mining software makes it easier to predict
future outcomes based on patterns discovered within historical data.

Objectives of Data Mining

1. SEQUENCE / PATH ANALYSIS – finding patterns where one event leads to another.

2. CLASSIFICATION – finding whether certain facts fall into predefined groups.

3. CLUSTERING – finding groups of related facts not previously known.

4. FORECASTING – discovering patterns in data that can lead to reasonable predictions.
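
As an illustration of the first objective, the sketch below counts how often one purchase is directly
followed by another in each customer's history. The purchase histories and the function name
transition_counts are hypothetical; this is one simple way to look at sequences, not a standard
algorithm from any particular tool.

```python
from collections import Counter

# Hypothetical purchase histories, in time order, for three customers.
purchases = {
    "c1": ["printer", "ink", "paper", "ink"],
    "c2": ["printer", "paper", "ink"],
    "c3": ["printer", "ink", "paper"],
}


def transition_counts(histories):
    """Count how often one purchase is directly followed by another."""
    counts = Counter()
    for events in histories.values():
        # zip pairs each event with the event that follows it
        counts.update(zip(events, events[1:]))
    return counts


counts = transition_counts(purchases)
for (first, then), n in counts.most_common(3):
    print(f"{first} -> {then}: {n}")
```

Frequent transitions such as “printer, then ink” are the kind of pattern where one event leads to
another.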

Sequence of Steps of Data Mining

1. DATA CLEANING – to remove noise and inconsistent data.

2. DATA INTEGRATION – where multiple data sources may be combined.

3. DATA SELECTION – data relevant to the analysis task are retrieved from the database.

4. DATA TRANSFORMATION – data are transformed into forms appropriate for mining by performing
summary or aggregation operations.

5. DATA MINING – process where intelligent methods are applied in order to extract data patterns.

6. PATTERN EVALUATION – to identify the truly interesting patterns representing knowledge, based on
some interestingness measure.

7. KNOWLEDGE PRESENTATION – visualization and knowledge presentation techniques are used to
present the mined knowledge to the user.
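
The seven steps above can be sketched end to end in Python on a toy example. Everything here is an
assumption made for illustration: the basket data, the function names, the choice of “items bought
together” as the pattern, and the use of support (the share of baskets containing a pair) as the
interestingness measure.

```python
from collections import Counter
from itertools import combinations

# Two hypothetical sources of basket data; None and "" stand for noisy entries.
store_a = [["bread", "butter", None], ["bread", "milk"], ["bread", "butter", "milk"]]
store_b = [["butter", "bread"], ["milk", ""], ["bread", "butter", "jam"]]


def clean(baskets):
    # 1. Data cleaning: drop missing or empty items.
    return [[item for item in basket if item] for basket in baskets]


def integrate(*sources):
    # 2. Data integration: combine the sources into one collection.
    return [basket for source in sources for basket in source]


def select(baskets, min_items=2):
    # 3. Data selection: keep only baskets that can contain a pair.
    return [basket for basket in baskets if len(basket) >= min_items]


def transform(baskets):
    # 4. Data transformation: represent each basket as sorted unique items.
    return [sorted(set(basket)) for basket in baskets]


def mine(baskets):
    # 5. Data mining: count how often each pair of items is bought together.
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts


def evaluate(pair_counts, n_baskets, min_support=0.5):
    # 6. Pattern evaluation: keep pairs whose support meets the threshold.
    supports = {pair: n / n_baskets for pair, n in pair_counts.items()}
    return {pair: s for pair, s in supports.items() if s >= min_support}


def present(patterns):
    # 7. Knowledge presentation: turn the patterns into readable statements.
    return [
        f"{a} + {b}: bought together in {support:.0%} of baskets"
        for (a, b), support in sorted(patterns.items())
    ]


baskets = transform(select(integrate(clean(store_a), clean(store_b))))
patterns = evaluate(mine(baskets), len(baskets))
report = present(patterns)
print(report)  # ['bread + butter: bought together in 80% of baskets']
```

Each function corresponds to one step, so a step can be replaced by a more capable method without
changing the rest of the sequence.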

Techniques/ Applications of Data Mining

1. Retail or marketing

2. Banking

3. Insurance and health care

4. Transportation

5. Medicine
