Data Mining and Data Warehousing
Data Mining refers to the process of extracting useful data from the
databases.
Data Mining
Definition: Data mining refers to the process of discovering patterns, trends, and useful
information from large datasets using statistical, mathematical, and computational techniques. It
focuses on extracting valuable insights from data for decision-making, predictive analysis, and
knowledge discovery.
Key Characteristics:
Data Warehousing
Definition: Data warehousing is the process of collecting, storing, and managing large amounts
of structured data from various sources in a central repository, specifically designed for query
and analysis rather than transaction processing. It provides an integrated and consistent view of
data over time.
Key Characteristics:
1. Structure: Data is organized in a way that supports analytical queries, often using
schemas like star or snowflake.
2. Purpose: Acts as a centralized system for reporting and business intelligence.
3. Components:
o ETL (Extract, Transform, Load): Processes data from source systems to the
warehouse.
o Data Marts: Subsets of a data warehouse focused on specific business areas.
o OLAP (Online Analytical Processing): Used for multi-dimensional analysis.
4. Applications:
o Historical data analysis
o Performance reporting (e.g., KPIs)
o Business trend analysis
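The ETL and star-schema ideas listed above can be sketched in a few lines. This is only an illustrative sketch, not a production pipeline: the source rows, the table names (dim_customer, dim_product, fact_sales), and the cleaning rules are all assumptions made for the example, and SQLite stands in for a real warehouse engine.

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system (made up for this sketch).
source_rows = [
    {"order_id": 1, "customer": " alice ", "product": "Laptop", "amount": "1200.50", "date": "2024-01-15"},
    {"order_id": 2, "customer": "Bob", "product": "Phone", "amount": "650.00", "date": "2024-01-15"},
    {"order_id": 3, "customer": "Alice", "product": "Phone", "amount": "700.00", "date": "2024-02-03"},
]

def transform(row):
    """Transform step: trim and normalize names, cast the amount to a number."""
    return {
        "order_id": row["order_id"],
        "customer": row["customer"].strip().title(),
        "product": row["product"].strip(),
        "amount": float(row["amount"]),
        "date": row["date"],
    }

def load(conn, rows):
    """Load step: write cleaned rows into one fact table and two dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            product_id  INTEGER REFERENCES dim_product(product_id),
            sale_date   TEXT,
            amount      REAL
        );
    """)
    for r in rows:
        # Each distinct customer/product is stored once in its dimension table.
        conn.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)", (r["customer"],))
        conn.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)", (r["product"],))
        customer_id = conn.execute(
            "SELECT customer_id FROM dim_customer WHERE name = ?", (r["customer"],)
        ).fetchone()[0]
        product_id = conn.execute(
            "SELECT product_id FROM dim_product WHERE name = ?", (r["product"],)
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
            (r["order_id"], customer_id, product_id, r["date"], r["amount"]),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, [transform(r) for r in source_rows])

# Analytical query: total sales per customer, joining the fact table to a dimension.
totals = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales f JOIN dim_customer c ON c.customer_id = f.customer_id
    GROUP BY c.name ORDER BY c.name
""").fetchall()
print(totals)
```

Note how the two spellings of the same customer in the source rows collapse into a single dimension row after the transform step, which is the kind of consistency a warehouse is meant to provide.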
Data Warehouse:
A data warehouse is a place where data is stored so that it can be mined usefully. It
behaves like a fast computer system with an exceptionally large data storage capacity.
Data from the organization's various systems is copied to the warehouse, where it can be
fetched and cleaned to remove errors. Complex queries can then be run against the data
stored in the warehouse.
A data warehouse combines data from numerous sources, which helps ensure data quality,
accuracy, and consistency. It improves system performance by separating analytics
processing from transactional databases. Data flows into a data warehouse from
different databases. A data warehouse works by organizing data into a schema that
describes the format and types of the data. Query tools examine the data tables using
this schema.
Data warehouses and databases are both relational data systems, but they are built to
serve different purposes. A data warehouse is built to store a huge amount of historical
data and enables fast queries over all of that data, typically using Online Analytical
Processing (OLAP). A database is built to store current transactions and allow quick
access to specific transactions for ongoing business processes, commonly known
as Online Transaction Processing (OLTP).
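The difference between the two access patterns can be illustrated with a small sketch. The sales table and its values are hypothetical, and a single SQLite table stands in for both kinds of system purely to contrast the queries; real OLTP and OLAP systems would normally be separate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "North", 2022, 100.0),
        (2, "North", 2023, 150.0),
        (3, "South", 2022, 80.0),
        (4, "South", 2023, 120.0),
        (5, "North", 2023, 50.0),
    ],
)

# OLTP-style access: look up one specific transaction by its key.
one_order = conn.execute(
    "SELECT region, year, amount FROM sales WHERE order_id = ?", (4,)
).fetchone()

# OLAP-style access: aggregate over all historical rows along two dimensions.
summary = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales GROUP BY region, year ORDER BY region, year"
).fetchall()

print(one_order)
print(summary)
```

The first query touches a single row, while the second scans the whole table and groups it by region and year, which is the workload a warehouse is designed to handle.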
1. Subject-Oriented
A data warehouse is subject-oriented. It provides useful data about a subject rather than
about the company's ongoing operations. These subjects can be customers, suppliers,
marketing, products, promotions, etc. A data warehouse usually focuses on modeling and
analysis of data that helps the business organization make data-driven decisions.
2. Time-Variant
The data in the data warehouse provides information for a specific time period.
3. Integrated
A data warehouse is built by joining data from heterogeneous sources, such as relational
databases, flat files, etc.
4. Non-Volatile
Data Mining:
Data mining refers to the analysis of data. It is the computer-supported process of analyzing
huge sets of data that have either been compiled by computer systems or have been
downloaded into the computer. In the data mining process, the computer analyzes the data
and extracts useful information from it. It looks for hidden patterns within the data set and
tries to predict future behavior. Data mining is primarily used to discover and indicate
relationships among the data sets.
Data mining aims to enable business organizations to view business behaviors, trends, and
relationships that allow the business to make data-driven decisions. It is also known as
Knowledge Discovery in Databases (KDD). Data mining tools utilize AI, statistics, databases,
and machine learning systems to discover the relationships in the data. Data mining
tools can answer business-related questions that were traditionally too time-consuming to
resolve.
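As a minimal sketch of what "discovering relationships" can mean in practice, the following counts how often pairs of items appear together in a set of transactions. The baskets are invented for illustration, and the helper names pair_support and confidence are hypothetical; real data mining tools use more scalable algorithms than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Made-up shopping baskets; each set is one transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_support(transactions):
    """Return, for each item pair, the fraction of transactions containing both items."""
    counts = Counter()
    for basket in transactions:
        # Sorting gives each pair a single canonical ordering.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}

def confidence(transactions, antecedent, consequent):
    """Estimate how often `consequent` appears in baskets that contain `antecedent`."""
    with_antecedent = [b for b in transactions if antecedent in b]
    if not with_antecedent:
        return 0.0
    return sum(consequent in b for b in with_antecedent) / len(with_antecedent)

support = pair_support(baskets)
print(support[("bread", "milk")])               # share of baskets with both bread and milk
print(confidence(baskets, "butter", "bread"))   # share of butter baskets that also have bread
```

Pairs with high support and confidence are candidates for the kind of hidden relationship the notes describe, for example products that tend to be bought together.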
Data mining can predict market behavior, which helps the business make decisions. For
example, it can predict who is likely to purchase what type of product.
Data mining methods can help to find which cellular phone calls, insurance claims, and
credit or debit card purchases are likely to be fraudulent.
Data mining techniques are widely used to help model financial markets.
Analyzing current trends in the marketplace is a strategic benefit because it helps
reduce costs and align the manufacturing process with market demand.
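The fraud-detection application mentioned above can be illustrated with a very simple stand-in technique: flagging purchases whose amount lies far from the average. The notes do not say which method is used in practice, so the z-score rule, the threshold of 2.0, the function name flag_outliers, and the purchase amounts below are all assumptions chosen for this sketch.

```python
from statistics import mean, stdev

def flag_outliers(amounts, threshold=2.0):
    """Return indexes of amounts more than `threshold` sample standard deviations from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    if sigma == 0:
        # All amounts are identical, so nothing stands out.
        return []
    return [i for i, a in enumerate(amounts) if abs(a - mu) / sigma > threshold]

# Made-up card purchases; the last one is deliberately unusual.
purchases = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 950.0]
suspicious = flag_outliers(purchases)
print(suspicious)
```

A rule this simple would produce many false alarms on real data; it is meant only to show the idea of scoring each record against the overall pattern and reviewing the ones that deviate most.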
Data Mining: Business entrepreneurs carry out data mining with the help of engineers.
Data Warehousing: Data warehousing is carried out entirely by the engineers.