Data Mining

Uploaded by

aakanshasandeep078

DATA MINING

What Is Data Mining?

Data mining is the process of searching and analyzing a large batch of raw data in
order to identify patterns and extract useful information.

Companies use data mining software to learn more about their customers. It can help
them to develop more effective marketing strategies, increase sales, and decrease
costs. Data mining relies on effective data collection, warehousing, and computer
processing.

KEY TAKEAWAYS

- Data mining is the process of analyzing a large batch of information to discern
  trends and patterns.
- Data mining can be used by corporations for everything from learning about what
  customers are interested in or want to buy to fraud detection and spam filtering.
- Data mining programs break down patterns and connections in data based on what
  information users request or provide.
- Social media companies use data mining techniques to commodify their users in order
  to generate profit.
- This use of data mining has come under criticism lately as users are often unaware
  of the data mining happening with their personal information, especially when it is
  used to influence preferences.

How Data Mining Works

Data mining involves exploring and analyzing large blocks of information to glean
meaningful patterns and trends. It is used in credit risk management, fraud
detection, and spam filtering. It is also a market research tool that helps reveal
the sentiment or opinions of a given group of people. The data mining process
breaks down into four steps:

1. Data is collected and loaded into data warehouses on-site or on a cloud
   service.
2. Business analysts, management teams, and information technology professionals
   access the data and determine how they want to organize it.
3. Custom application software sorts and organizes the data.
4. The end user presents the data in an easy-to-share format, such as a graph or
   table.

Data Warehousing and Mining Software

Data mining programs analyze relationships and patterns in data based on user
requests. They organize information into classes.

For example, a restaurant may want to use data mining to determine which specials
it should offer and on what days. The data can be organized into classes based on
when customers visit and what they order.
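
As a rough illustration of the restaurant example, the sketch below groups a handful of made-up orders into classes by weekday and reports the most-ordered dish for each day. The order records and the `best_seller_by_day` helper are hypothetical; a real system would read from the data warehouse rather than a hard-coded list.

```python
from collections import Counter, defaultdict

# Hypothetical point-of-sale records: (weekday, dish). Invented for illustration.
ORDERS = [
    ("Mon", "soup"), ("Mon", "soup"), ("Mon", "pasta"),
    ("Fri", "fish"), ("Fri", "fish"), ("Fri", "pasta"), ("Fri", "fish"),
    ("Sat", "steak"), ("Sat", "pasta"), ("Sat", "steak"),
]


def best_seller_by_day(orders):
    """Return {weekday: (dish, count)} for the most-ordered dish on each day."""
    counts = defaultdict(Counter)
    for day, dish in orders:
        counts[day][dish] += 1
    return {day: c.most_common(1)[0] for day, c in counts.items()}


for day, (dish, n) in best_seller_by_day(ORDERS).items():
    print(f"{day}: {dish} ordered {n} times")
```

With counts like these in hand, the restaurant could pick its daily special from the dish that already sells best on that day, or promote a weaker one.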

In other cases, data miners find clusters of information based on logical
relationships or look at associations and sequential patterns to draw conclusions
about trends in consumer behavior.
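
One common way to find such clusters is k-means, shown here as a minimal one-dimensional sketch. The text does not name a specific algorithm, so k-means, the visit times, and the starting centers are all assumptions chosen for illustration.

```python
# Hypothetical customer visit times as hours of the day (12.5 means 12:30).
VISIT_HOURS = [11.5, 12.0, 12.5, 13.0, 18.0, 19.0, 19.5, 20.5]


def kmeans_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data.

    `centers` are the starting guesses; returns (final_centers, clusters).
    """
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assignment step: each value joins its nearest center.
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable, so stop early
        centers = new_centers
    return centers, clusters


centers, clusters = kmeans_1d(VISIT_HOURS, centers=[10.0, 22.0])
print(centers)   # one center for the lunch crowd, one for the dinner crowd
print(clusters)
```

Unlike the class-based grouping above, nothing here tells the program in advance that "lunch" and "dinner" exist; the two groups emerge from the data.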

Warehousing is an important aspect of data mining. Warehousing is the
centralization of an organization's data into one database or program. It allows
the organization to spin off segments of data for specific users to analyze and use
depending on their needs.

Cloud data warehouse solutions use the space and power of a cloud provider to store
data. This allows smaller companies to leverage digital solutions for storage,
security, and analytics.

Data Mining Techniques

Data mining uses algorithms and various other techniques to convert large
collections of data into useful output. The most popular types of data mining
techniques include:

- Association rules, also referred to as market basket analysis, search for
  relationships between variables. This relationship in itself creates additional
  value within the data set as it strives to link pieces of data. For example,
  association rules would search a company's sales history to see which products are
  most commonly purchased together; with this information, stores can plan, promote,
  and forecast.
- Classification assigns objects to predefined classes. These classes describe
  the characteristics of items or represent what the data points have in common with
  each other. This data mining technique allows the underlying data to be more neatly
  categorized and summarized across similar features or product lines.
- Clustering is similar to classification. However, clustering identifies
  similarities between objects, then groups those items based on what makes them
  different from other items. While classification may result in groups such as
  "shampoo," "conditioner," "soap," and "toothpaste," clustering may identify groups
  such as "hair care" and "dental health."
- Decision trees are used to classify or predict an outcome based on a set list of
  criteria or decisions. A decision tree asks a series of cascading questions and
  sorts the dataset based on the responses given. Sometimes depicted as a tree-like
  visual, a decision tree allows for specific direction and user input when drilling
  deeper into the data.
- K-nearest neighbor (KNN) is an algorithm that classifies data based on its
  proximity to other data. KNN rests on the assumption that data points that are
  close to each other are more similar to each other than to other bits of data.
  This non-parametric, supervised technique predicts the class or value of a data
  point from the labeled points nearest to it.
- Neural networks process data through the use of nodes. Each node is composed of
  inputs, weights, and an output. Data is mapped through supervised learning, in a
  way loosely modeled on how the human brain is interconnected. The model can be
  configured with threshold values that help determine its accuracy.
- Predictive analysis strives to leverage historical information to build graphical
  or mathematical models to forecast future outcomes. Overlapping with regression
  analysis, this technique aims to estimate an unknown future figure based on the
  data currently on hand.
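
To make the first technique in the list concrete, here is a minimal sketch of market basket analysis using the two standard measures, support and confidence. The baskets, the function names, and the 0.4 support threshold are hypothetical, and the brute-force pair search is only sensible for tiny data sets.

```python
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration.
BASKETS = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated share of baskets with `antecedent` that also hold `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def frequent_pairs(baskets, min_support):
    """Return {pair: support} for item pairs that meet `min_support`."""
    items = sorted(set().union(*baskets))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, baskets)
        if s >= min_support:
            result[pair] = s
    return result


print(frequent_pairs(BASKETS, min_support=0.4))
print(confidence({"bread"}, {"butter"}, BASKETS))
```

A pair with high support is bought together often; a rule with high confidence says that shoppers who pick up the first item usually pick up the second as well.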

The Data Mining Process

To be most effective, data analysts generally follow a certain flow of tasks along
the data mining process. Without this structure, an analyst may encounter an issue
in the middle of their analysis that could have easily been prevented had they
prepared for it earlier. The data mining process is usually broken into the
following steps.

Step 1: Understand the Business

Before any data is touched, extracted, cleaned, or analyzed, it is important to
understand the underlying entity and the project at hand. What are the goals the
company is trying to achieve by mining data? What is its current business
situation? What are the findings of a SWOT analysis? Before looking at any data,
the mining process starts by understanding what will define success at the end of
the process.

Step 2: Understand the Data

Once the business problem has been clearly defined, it's time to start thinking
about data. This includes what sources are available, how they will be secured and
stored, how the information will be gathered, and what the final outcome or
analysis may look like. This step also includes determining the limits of the data,
storage, security, and collection, and assessing how these constraints will affect
the data mining process.

Step 3: Prepare the Data

Data is gathered, uploaded, extracted, or calculated. It is then cleaned,
standardized, scrubbed for outliers, assessed for mistakes, and checked for
reasonableness. During this stage of data mining, the data may also be checked for
size, as an oversized collection of information may unnecessarily slow computations
and analysis.
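
The cleaning steps in this stage might look something like the sketch below, which standardizes names, drops malformed rows and duplicates, and removes outliers. The records, the `prepare` helper, and the outlier rule (more than 5 median absolute deviations from the median) are assumptions for illustration; the right rule depends on the data.

```python
from statistics import median

# Hypothetical raw (customer, amount) rows with typical problems baked in.
RAW = [
    (" Alice ", "12.50"), ("alice", 12.5), ("Bob", None), ("Carol", "11.00"),
    ("Dave", "13.25"), ("Eve", "950"), ("Frank", "n/a"), ("Grace", "12.00"),
]


def prepare(records, max_deviations=5):
    """Standardize, de-duplicate, and drop outliers from (name, amount) rows."""
    cleaned = []
    seen = set()
    for name, amount in records:
        try:
            value = float(amount)
        except (TypeError, ValueError):
            continue  # missing or malformed amount
        key = (name.strip().lower(), value)  # standardized form of the row
        if key not in seen:
            seen.add(key)
            cleaned.append(key)
    if not cleaned:
        return []

    amounts = [value for _, value in cleaned]
    med = median(amounts)
    mad = median(abs(value - med) for value in amounts)
    if mad == 0:
        return cleaned  # no spread to measure outliers against
    return [(n, v) for n, v in cleaned if abs(v - med) / mad <= max_deviations]


print(prepare(RAW))
```

Here the duplicate Alice row, the two rows without a usable amount, and the 950 outlier are all removed before any model sees the data.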

Step 4: Build the Model

With our clean data set in hand, it's time to crunch the numbers. Data scientists
use the data mining techniques described above to search for relationships, trends,
associations, or sequential patterns. The data may also be fed into predictive
models to assess how previous bits of information may translate into future
outcomes.
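
As one small example of a predictive model, the sketch below fits a straight line to past monthly sales by ordinary least squares and extrapolates one month ahead. The sales figures and function names are made up, and a linear trend is only one of many possible model choices.

```python
# Hypothetical history: month number and units sold.
MONTHS = [1, 2, 3, 4, 5]
SALES = [102, 118, 131, 139, 155]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at x."""
    slope, intercept = model
    return slope * x + intercept


model = fit_line(MONTHS, SALES)
print(f"forecast for month 6: {predict(model, 6):.1f}")
```

The forecast is only as good as the assumption that the past trend continues, which is why the evaluation step that follows matters.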

Step 5: Evaluate the Results

The data-centered aspect of data mining concludes by assessing the findings of the
data model or models. The outcomes from the analysis may be aggregated,
interpreted, and presented to decision-makers who have largely been excluded from
the data mining process to this point. In this step, organizations can choose to
make decisions based on the findings.

Step 6: Implement Change and Monitor

The data mining process concludes with management taking steps in response to the
findings of the analysis. The company may decide the information was not strong
enough or the findings were not relevant, or the company may strategically pivot
based on findings. In either case, management reviews the ultimate impact on the
business and starts new data mining loops by identifying new business
problems or opportunities.

Different data mining process models have different steps, though the
general process is usually similar. For example, the Knowledge Discovery in
Databases (KDD) model has nine steps, the CRISP-DM model has six steps, and the
SEMMA process model has five steps. [1]

Applications of Data Mining

In today's age of information, almost any department, industry, sector, or company
can make use of data mining.

Sales
Data mining encourages smarter, more efficient use of capital to drive revenue
growth. Consider the point-of-sale register at your favorite local coffee shop. For
every sale, that coffeehouse collects the time a purchase was made and what
products were sold. Using this information, the shop can strategically craft its
product line.

Marketing
Once the coffeehouse above knows its ideal line-up, it's time to implement the
changes. However, to make its marketing efforts more effective, the store can use
data mining to understand where its clients see ads, what demographics to target,
where to place digital ads, and what marketing strategies most resonate with
customers. This includes aligning marketing campaigns, promotional offers,
cross-sell offers, and programs to the findings of data mining.

Manufacturing
For companies that produce their own goods, data mining plays an integral part in
analyzing how much each raw material costs, what materials are being used most
efficiently, how time is spent along the manufacturing process, and what
bottlenecks negatively impact the process. Data mining helps ensure the flow of
goods is uninterrupted.

Fraud Detection
The heart of data mining is finding patterns, trends, and correlations that link
data points together. Therefore, a company can use data mining to identify outliers
or correlations that should not exist. For example, a company may analyze its cash
flow and find a recurring transaction to an unknown account. If this is
unexpected, the company may wish to investigate whether funds are being mismanaged.
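
A check of this kind could be sketched as follows: flag any account that is not on a list of known payees and has received the same amount several times. The transactions, account names, and the threshold of three repeats are hypothetical, and real fraud screening would use far richer signals.

```python
from collections import defaultdict

# Hypothetical outgoing payments: (payee account, amount).
TRANSACTIONS = [
    ("ACME-SUPPLY", 1200.0), ("X-9931", 499.0), ("PAYROLL", 8000.0),
    ("X-9931", 499.0), ("Z-1000", 75.0), ("X-9931", 499.0),
    ("ACME-SUPPLY", 1150.0),
]
KNOWN_ACCOUNTS = {"ACME-SUPPLY", "PAYROLL"}


def recurring_unknown_payees(transactions, known_accounts, min_repeats=3):
    """Return {account: [amounts]} for unknown accounts that were paid the
    same amount at least `min_repeats` times."""
    by_payee = defaultdict(list)
    for account, amount in transactions:
        if account not in known_accounts:
            by_payee[account].append(amount)

    flagged = {}
    for account, amounts in by_payee.items():
        for amount in sorted(set(amounts)):
            if amounts.count(amount) >= min_repeats:
                flagged.setdefault(account, []).append(amount)
    return flagged


print(recurring_unknown_payees(TRANSACTIONS, KNOWN_ACCOUNTS))
```

A flagged account is a prompt for a human to investigate, not proof of wrongdoing; a one-off payment such as the one to `Z-1000` is left alone.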

Human Resources
Human resources departments often have a wide range of data available for
processing including data on retention, promotions, salary ranges, company
benefits, use of those benefits, and employee satisfaction surveys. Data mining can
correlate this data to get a better understanding of why employees leave and what
entices new hires.

Customer Service
Customer satisfaction may be created (or destroyed) for a variety of reasons.
Imagine a company that ships goods. A customer may be dissatisfied with shipping
times, shipping quality, or communications. The same customer may be frustrated
with long telephone wait times or slow e-mail responses. Data mining gathers
operational information about customer interactions and summarizes the findings to
pinpoint weak points and highlight what the company is doing right.

Benefits of Data Mining

Data mining ensures a company is collecting and analyzing reliable data. It is
often a more rigid, structured process that formally identifies a problem, gathers
data related to the problem, and strives to formulate a solution. Therefore, data
mining helps a business become more profitable, more efficient, or operationally
stronger.

Data mining can look very different across applications, but the overall process
can be used with almost any new or legacy application. Essentially any type of data
can be gathered and analyzed, and almost every business problem that relies on
quantifiable evidence can be tackled using data mining.

The end goal of data mining is to take raw bits of information and determine if
there is cohesion or correlation among the data. This benefit of data mining allows
a company to create value from information it already has on hand that would
otherwise not be apparent. Though data models can be complex, they can also
yield fascinating results, unearth hidden trends, and suggest unique strategies.

Limitations of Data Mining

This complexity of data mining is one of its greatest disadvantages. Data analytics
often requires technical skill sets and certain software tools. Smaller companies
may find this to be a barrier to entry too difficult to overcome.

Data mining doesn't always guarantee results. A company may perform statistical
analysis, draw conclusions based on strong data, implement changes, and not reap
any benefits. Because of inaccurate findings, market changes, model errors, or
inappropriate data populations, data mining can only guide decisions, not ensure
outcomes.

There is also a cost component to data mining. Data tools may require costly
subscriptions, and some bits of data may be expensive to obtain. Security and
privacy concerns can be addressed, though the additional IT infrastructure may be
costly as well. Data mining may also be most effective when using huge data sets;
however, these data sets must be stored and require heavy computational power to
analyze.

Even large companies or government agencies have challenges with data mining.
Consider the FDA's white paper on data mining that outlines the challenges of bad
information, duplicate data, underreporting, or overreporting. [2]

Data Mining and Social Media

One of the most lucrative applications of data mining has been undertaken by social
media companies. Platforms like Facebook, TikTok, Instagram, and Twitter gather
reams of data about their users, based on their online activities.

That data can be used to make inferences about their preferences. Advertisers can
target their messages to the people who appear to be most likely to respond
positively.

Data mining on social media has become a big point of contention, with several
investigative reports and exposés showing just how intrusive mining users' data can
be. At the heart of the issue, users may agree to the terms and conditions of the
sites not realizing how their personal information is being collected or to whom
their information is being sold.

Examples of Data Mining

Data mining can be used for good, or it can be used illicitly. Here is an example
of each.

eBay and e-Commerce

eBay collects countless bits of information every day from sellers and buyers. The
company uses data mining to identify relationships between products, assess
desired price ranges, analyze prior purchase patterns, and form product
categories. [3]

eBay outlines the recommendation process as:

1. Raw item metadata and user historical data are aggregated.
2. Scripts are run on a trained model to generate and predict the item and user.
3. A KNN search is performed.
4. The results are written to a database.
5. The real-time recommendation takes the user ID, calls the database results, and
   displays them to the user. [3]
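
The last three steps of that outline can be imitated with a toy sketch: a nearest-neighbor search over item vectors, results stored ahead of time, and a lookup by user ID. This is not eBay's implementation. The vectors, the use of cosine similarity, the function names, and the plain dictionary standing in for a database are all assumptions.

```python
import math

# Hypothetical vectors; in practice these would come from a trained model.
ITEM_VECTORS = {
    "camera": [0.9, 0.1, 0.0],
    "tripod": [0.8, 0.2, 0.1],
    "novel": [0.0, 0.1, 0.9],
    "lens": [0.95, 0.05, 0.0],
}
USER_VECTORS = {"user-42": [1.0, 0.0, 0.0]}


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def knn_items(user_vector, item_vectors, k=2):
    """Return the ids of the k items most similar to the user vector."""
    ranked = sorted(
        item_vectors,
        key=lambda item_id: cosine_similarity(user_vector, item_vectors[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Offline step: run the search and store the results (dict stands in for a database).
RECOMMENDATION_STORE = {
    user_id: knn_items(vector, ITEM_VECTORS) for user_id, vector in USER_VECTORS.items()
}


def recommend(user_id):
    """Real-time step: look up the precomputed results for a user ID."""
    return RECOMMENDATION_STORE.get(user_id, [])


print(recommend("user-42"))
```

Doing the expensive search offline and serving results by lookup keeps the real-time step cheap, which matches the split between the batch steps and the final step in the outline.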

Facebook-Cambridge Analytica Scandal

A cautionary example of data mining is the Facebook-Cambridge Analytica data
scandal. During the 2010s, the British consulting firm Cambridge Analytica Ltd.
collected personal data from millions of Facebook users. This information was later
analyzed for use in the 2016 presidential campaigns of Ted Cruz and Donald Trump.
It is suspected that Cambridge Analytica interfered with other notable events such
as the Brexit referendum. [4]

In light of this inappropriate data mining and misuse of user data, Facebook agreed
to pay $100 million for misleading investors about its uses of consumer data. The
Securities and Exchange Commission claimed Facebook discovered the misuse in 2015
but did not correct its disclosures for more than two years. [5]

Frequently Asked Questions

What Are the Types of Data Mining?

There are two main types of data mining: predictive data mining and descriptive
data mining. Predictive data mining extracts data that may be helpful in
determining an outcome. Descriptive data mining informs users of a given outcome.

How Is Data Mining Done?

Data mining relies on big data and advanced computing processes including machine
learning and other forms of artificial intelligence (AI). The goal is to find
patterns that can lead to inferences or predictions from large and unstructured
data sets.

What Is Another Term for Data Mining?

Data mining also goes by the less-used term "knowledge discovery in data," or KDD.

Where Is Data Mining Used?

Data mining applications have been designed to take on just about any endeavor that
relies on big data. Companies in the financial sector look for patterns in the
markets. Governments try to identify potential security threats. Corporations,
especially online and social media companies, use data mining to create profitable
advertising and marketing campaigns that target specific sets of users.

The Bottom Line

Modern businesses have the ability to gather information on their customers,
products, manufacturing lines, employees, and storefronts. These random pieces of
information may not tell a story, but the use of data mining techniques,
applications, and tools helps piece together information.

The ultimate goal of the data mining process is to compile data, analyze the
results, and execute operational strategies based on data mining results.
