
Notes DATA MINING MBA III

Data mining is a process used to extract useful information and patterns from large datasets. It involves cleaning, transforming, and modeling data to uncover hidden patterns, correlations, and other insights. The knowledge gained can then be used for tasks like market segmentation, fraud detection, risk modeling, and customer relationship management. Some common applications of data mining include analyzing retail transaction data to determine customer purchasing behaviors, using medical records to identify effective treatments, and applying educational data to understand student performance.

Uploaded by

Mani Bhagat

DATA MINING

Data mining is one of the most useful techniques for helping entrepreneurs,
researchers, and individuals extract valuable information from huge sets of
data. Data mining is also called Knowledge Discovery in Databases (KDD). The
knowledge discovery process includes Data cleaning, Data integration, Data
selection, Data transformation, Data mining, Pattern evaluation, and Knowledge
presentation.
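
The seven steps of the knowledge discovery process can be sketched as a small
pipeline. The following is a minimal illustration in Python, not a definitive
implementation. The two data sources (store_sales and web_sales), the field
names, and the choice of "total spending per customer" as the mined pattern are
all assumptions made up for this example.

```python
from collections import Counter

# Hypothetical raw records from two sources (assumed for illustration only).
store_sales = [
    {"customer": "A", "item": "milk", "amount": "2.50"},
    {"customer": "B", "item": "bread", "amount": None},  # missing value
    {"customer": "A", "item": "bread", "amount": "1.20"},
]
web_sales = [
    {"customer": "C", "item": "milk", "amount": "2.40"},
    {"customer": "A", "item": "eggs", "amount": "3.10"},
]

def clean(records):
    """Data cleaning: drop records whose amount is missing."""
    return [r for r in records if r["amount"] is not None]

def integrate(*sources):
    """Data integration: combine records from several sources."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the task."""
    return [{f: r[f] for f in fields} for r in records]

def transform(records):
    """Data transformation: convert amounts from text to numbers."""
    return [{**r, "amount": float(r["amount"])} for r in records]

def mine(records):
    """Data mining: total spending per customer (a very simple pattern)."""
    totals = Counter()
    for r in records:
        totals[r["customer"]] += r["amount"]
    return totals

def evaluate(totals, threshold):
    """Pattern evaluation: keep customers who spent more than the threshold."""
    return {c: t for c, t in totals.items() if t > threshold}

data = integrate(clean(store_sales), clean(web_sales))
data = transform(select(data, ["customer", "amount"]))
patterns = evaluate(mine(data), threshold=3.0)

# Knowledge presentation: report the result in a readable form.
for customer, total in sorted(patterns.items()):
    print(f"Customer {customer} spent {total:.2f}")
```

Each function corresponds to one KDD step, so a real project could replace any
single step (for example, a more careful cleaning rule) without touching the
others.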

What is Data Mining?


Data Mining is the process of extracting information from huge sets of data to
identify patterns, trends, and useful data that allow a business to make
data-driven decisions.

In other words, Data Mining is the process of examining data from various
perspectives to uncover hidden patterns and categorize them into useful
information. That information is collected and assembled in particular areas
such as data warehouses, where it supports efficient analysis, data mining
algorithms, and decision making, and ultimately helps to cut costs and generate
revenue.

Data mining is the act of automatically searching large stores of information to
find trends and patterns that go beyond simple analysis procedures. Data mining
uses complex mathematical algorithms to segment data and evaluate the
probability of future events. KDD is sometimes also expanded as Knowledge
Discovery in Data.

Data Mining is a process used by organizations to extract specific data from huge
databases to solve business problems. It primarily turns raw data into useful
information.

Data Mining is similar to Data Science carried out by a person, in a specific
situation, on a particular data set, with an objective. This process includes
various types of services such as text mining, web mining, audio and video
mining, pictorial data mining, and social media mining. It is done through
software that may be simple or highly specialized. By outsourcing data mining,
the work can be done faster and with lower operating costs. Specialized firms
can also use new technologies to collect data that is impossible to locate
manually. There are tonnes of information available on various platforms, but
very little knowledge is accessible. The biggest challenge is to analyze the
data to extract important information that can be used to solve a problem or to
develop the company. There are many powerful tools and techniques available to
mine data and gain better insight from it.

Types of Data Mining


Data mining can be performed on the following types of data:

Relational Database:

A relational database is a collection of multiple data sets formally organized
by tables, records, and columns from which data can be accessed in various ways
without having to reorganize the database tables. Tables convey and share
information, which facilitates data searchability, reporting, and organization.
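
The point about accessing data in various ways without reorganizing the tables
can be shown with Python's built-in sqlite3 module. This is only a sketch: the
customers and orders tables, their columns, and the sample rows are invented
for illustration.

```python
import sqlite3

# An in-memory database with two hypothetical tables (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 20.0), (2, 1, 15.0), (3, 2, 40.0)],
)

# View 1: total spending per customer, obtained by joining the two tables.
per_customer = conn.execute(
    "SELECT c.name, SUM(o.total) FROM customers c "
    "JOIN orders o ON o.customer_id = c.id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()

# View 2: individual orders above a threshold, from the same unchanged tables.
large_orders = conn.execute(
    "SELECT id, total FROM orders WHERE total > ? ORDER BY id", (18.0,)
).fetchall()

print(per_customer)
print(large_orders)
conn.close()
```

Both queries read the same stored tables; only the question asked of the data
changes, not the way the data is organized.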

Data warehouses:

A Data Warehouse is the technology that collects data from various sources
within the organization to provide meaningful business insights. The huge amount
of data comes from multiple places such as Marketing and Finance. The extracted
data is used for analytical purposes and helps in decision-making for a business
organization. The data warehouse is designed for the analysis of data rather than
transaction processing.

Data Repositories:

The Data Repository generally refers to a destination for data storage. However,
many IT professionals use the term more specifically to refer to a particular
kind of setup within an IT structure, for example, a group of databases in which
an organization has kept various kinds of information.

Object-Relational Database:

A combination of an object-oriented database model and a relational database
model is called an object-relational model. It supports Classes, Objects,
Inheritance, etc.

One of the primary objectives of the Object-relational data model is to close the
gap between the Relational database and the object-oriented model practices
frequently used in many programming languages, for example, C++, Java, C#,
and so on.

Advantages of Data Mining


o The Data Mining technique enables organizations to obtain knowledge-based
data.
o Data mining enables organizations to make lucrative modifications in
operation and production.
o Compared with other statistical data applications, data mining is
cost-efficient.
o Data Mining helps the decision-making process of an organization.
o It facilitates the automated discovery of hidden patterns as well as the
prediction of trends and behaviors.
o It can be introduced into new systems as well as existing platforms.
o It is a quick process that makes it easy for new users to analyze enormous
amounts of data in a short time.

Disadvantages of Data Mining


o There is a risk that organizations may sell useful customer data to other
organizations for money. According to one report, American Express has sold
credit card purchases of their customers to other organizations.
o Much data mining analytics software is difficult to operate and needs
advanced training to work on.
o Different data mining tools operate in distinct ways due to the different
algorithms used in their design. Therefore, selecting the right data mining
tool is a very challenging task.
o Data mining techniques are not perfectly precise, which may lead to severe
consequences in certain conditions.

Data Mining Applications

Data Mining is primarily used by organizations with intense consumer demands,
such as retail, communication, financial, and marketing companies, to determine
prices, consumer preferences, product positioning, and the impact on sales,
customer satisfaction, and corporate profits. Data mining enables a retailer to
use point-of-sale records of customer purchases to develop products and
promotions that help the organization attract customers.

Data mining is widely used in the following areas:

1. Data Mining in Healthcare:


Data mining in healthcare has excellent potential to improve the health system. It
uses data and analytics to gain better insights and to identify best practices
that will enhance health care services and reduce costs. Analysts use data mining
approaches such as Machine learning, Multi-dimensional databases, Data
visualization, Soft computing, and statistics. Data Mining can be used to forecast
the number of patients in each category. These procedures help ensure that
patients get intensive care at the right place and at the right time. Data mining
also enables healthcare insurers to recognize fraud and abuse.

2. Data Mining in Market Basket Analysis:

Market basket analysis is a modeling method based on a hypothesis: if you buy a
specific group of products, then you are more likely to buy another group of
products. This technique may enable the retailer to understand the purchase
behavior of a buyer. This data may assist the retailer in understanding the
requirements of the buyer and altering the store's layout accordingly. Using
differential analysis, results can also be compared between various stores and
between customers in different demographic groups.
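
The hypothesis behind market basket analysis is usually measured with two
standard quantities, support (how often a group of products is bought together)
and confidence (how often the second group is bought when the first one is).
The sketch below computes both in Python. The four sample baskets and the
function names are assumptions for illustration.

```python
# Hypothetical point-of-sale baskets (assumed sample data).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(items):
    """Fraction of all baskets that contain every item in `items`."""
    items = frozenset(items)
    hits = sum(1 for basket in transactions if items <= basket)
    return hits / len(transactions)

def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the fraction that also
    contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both) / support(antecedent)

print(f"support(bread, milk)       = {support({'bread', 'milk'}):.2f}")
print(f"confidence(bread -> milk)  = {confidence({'bread'}, {'milk'}):.2f}")
print(f"confidence(butter -> bread) = {confidence({'butter'}, {'bread'}):.2f}")
```

In this sample every basket with butter also contains bread, so a retailer
might place the two products near each other. With real data the same
calculation would be run over many thousands of baskets.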

3. Data mining in Education:

Education data mining is a newly emerging field, concerned with developing
techniques that extract knowledge from the data generated in educational
environments. An institution can use data mining to make precise decisions
and also to predict the results of its students. With these results, the
institution can concentrate on what to teach and how to teach.

4. Data Mining in Manufacturing Engineering:

Data mining tools can be useful for finding patterns in a complex manufacturing
process. Data mining can be used in system-level designing to obtain the
relationships between product architecture, product portfolio, and data needs
of the customers. It can also be used to forecast the product development period,
cost, and expectations, among other tasks.

5. Data Mining in CRM (Customer Relationship Management):

Customer Relationship Management (CRM) is all about obtaining and retaining
customers, enhancing customer loyalty, and implementing customer-oriented
strategies. To build a good relationship with the customer, a business
organization needs to collect data and analyze it. With data mining
technologies, the collected data can be used for analytics.

6. Data Mining in Fraud detection:

Billions of dollars are lost to fraud. Traditional methods of fraud detection
are time consuming and complex. Data mining helps by finding meaningful
patterns and turning data into information. An ideal fraud detection system
should protect the data of all the users. Supervised methods start from a
collection of sample records, each classified as fraudulent or non-fraudulent.
A model is constructed using this data, and the model is then used to identify
whether a new record is fraudulent or not.
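
The supervised approach described above can be sketched in a few lines of
Python. The notes do not name a specific algorithm, so this example swaps in a
nearest-centroid classifier as a simple stand-in. The two features (transaction
amount and number of transactions in the last hour) and all sample records are
hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical labelled sample records:
# (amount, transactions in the last hour), label
training = [
    ((20.0, 1), "non-fraudulent"),
    ((35.0, 2), "non-fraudulent"),
    ((15.0, 1), "non-fraudulent"),
    ((900.0, 8), "fraudulent"),
    ((1200.0, 10), "fraudulent"),
]

def fit(records):
    """Build the model: one average feature vector (centroid) per label."""
    by_label = {}
    for features, label in records:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }

def predict(model, features):
    """Assign the label whose centroid is closest to the new record.
    Features are left unscaled here to keep the sketch short."""
    return min(model, key=lambda label: dist(model[label], features))

model = fit(training)
print(predict(model, (1000.0, 9)))
print(predict(model, (25.0, 1)))
```

A production system would normally use many more features, scale them, and
evaluate the model on records it has not seen during training.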

7. Data Mining in Lie Detection:

Apprehending a criminal may be comparatively easy, but bringing out the truth
from them is a very challenging task. Law enforcement may use data mining
techniques to investigate offenses, monitor suspected terrorist communications,
etc. This includes text mining, which seeks meaningful patterns in data that is
usually unstructured text. The information collected from previous
investigations is compared, and a model for lie detection is constructed.

8. Data Mining in Financial Banking:

The digitalization of the banking system generates an enormous amount of data
with every new transaction. Data mining techniques can help bankers solve
business-related problems in banking and finance by identifying trends,
causalities, and correlations in business information and market costs that are
not instantly evident to managers or executives because the data volume is too
large or is produced too rapidly to be screened by experts. Managers may use
this data for better targeting, acquiring, retaining, segmenting, and
maintaining profitable customers.

Challenges of Implementation in Data mining


Although data mining is very powerful, it faces many challenges during its
execution. These challenges may relate to performance, data, methods,
techniques, etc. The process of data mining becomes effective when the
challenges or problems are correctly recognized and adequately resolved.

1. Incomplete and noisy data:

Data mining is the process of extracting useful data from large volumes of
data. Real-world data is heterogeneous, incomplete, and noisy. Data in huge
quantities will usually contain inaccurate or unreliable values. These problems
may occur because of faulty measuring instruments or because of human errors.
The data could also get changed due to human or system error. All of these
problems (noisy and incomplete data) make data mining challenging.
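
A common first response to incomplete and noisy data is to fill in missing
values and to replace obvious outliers. The sketch below shows one simple way
to do this in Python. The sensor readings, the median-based outlier rule, and
the cut-off of 5 deviations are assumptions chosen for illustration, not a
recommended standard.

```python
from statistics import median

# Hypothetical instrument readings: None marks a missing value and
# 250.0 is an implausible (noisy) measurement.
readings = [21.5, None, 22.0, 21.8, 250.0, None, 22.3]

known = [r for r in readings if r is not None]
center = median(known)

# Median absolute deviation: a spread measure that outliers barely affect.
mad = median([abs(r - center) for r in known])

def is_noisy(value, cutoff=5):
    """Flag values lying more than `cutoff` deviations from the median."""
    return abs(value - center) > cutoff * mad

good = [r for r in known if not is_noisy(r)]
fill = median(good)

# Replace missing and noisy values with the median of the good readings.
cleaned = [fill if r is None or is_noisy(r) else r for r in readings]
print(cleaned)
```

Whether to replace, drop, or keep a suspicious value depends on the
application, so the rule used here should be treated as a placeholder.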

2. Data Distribution:

Real-world data is usually stored on various platforms in a distributed
computing environment. It might be in a database, on individual systems, or
even on the internet. In practice, it is quite difficult to bring all the data
into a centralized data repository, mainly due to organizational and technical
concerns.

3. Complex Data:

Real-world data is heterogeneous, and it could be multimedia data, including
audio and video, images, complex data, spatial data, time series, and so on.
Managing these various types of data and extracting useful information is a
tough task. Most of the time, new technologies, tools, and methodologies have
to be refined to obtain specific information.

4. Performance:

The data mining system's performance relies primarily on the efficiency of the
algorithms and techniques used. If the designed algorithms and techniques are
not up to the mark, then the efficiency of the data mining process will be
affected adversely.

5. Data Privacy and Security:

Data mining can lead to serious issues in terms of data security, governance,
and privacy. For example, if a retailer analyzes the details of the purchased
items, then it reveals data about the buying habits and preferences of the
customers without their permission.

6. Data Visualization:

In data mining, data visualization is a very important process because it is the
primary method that shows the output to the user in a presentable way. However,
representing the information to the end-user in a precise and easy way is often
difficult.

There are many more challenges in data mining in addition to the problems mentioned above.
More problems are disclosed as the actual data mining process begins, and the success of data
mining relies on overcoming all these difficulties.
