IT Unit 4
Intelligence
What is Data Mining?
Data mining is the process of extracting information from huge sets of data to identify patterns, trends, and useful data that allow a business to make data-driven
decisions.
o Organizations may sell useful customer data to other organizations for money. American Express has reportedly sold its customers' credit card purchase data to
other organizations.
o Many data mining analytics tools are difficult to operate and require advanced training to use.
o Different data mining tools operate in distinct ways because of the different algorithms used in their design. Selecting the right data mining tool is therefore a
very challenging task.
o Data mining techniques are not fully precise, so they may lead to severe consequences in certain conditions.
Data mining is widely used in the following areas:
Data Mining in Healthcare:
Data mining in healthcare has excellent potential to improve the health system. It uses data and analytics to gain better insights and to identify best practices that enhance
health care services and reduce costs. Analysts use data mining approaches such as machine learning, multi-dimensional databases, data visualization, soft computing, and
statistics. Data mining can be used to forecast the number of patients in each category, which helps ensure that patients get intensive care at the right place and at the right time. Data
mining also enables healthcare insurers to recognize fraud and abuse.
Data Mining in Market Basket Analysis:
Market basket analysis is a modeling method based on one hypothesis: if you buy a specific group of products, then you are more likely to buy another group of products. This
technique may enable the retailer to understand the purchase behavior of a buyer. This data may assist the retailer in understanding the requirements of the buyer and altering
the store's layout accordingly. Analytical comparisons of results can also be made between various stores and between customers in different demographic groups.
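The hypothesis above is usually quantified with support, confidence, and lift. The following is a minimal sketch in Python, not a full mining algorithm; the baskets and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical sample data: each set is the basket of one customer visit.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also hold the consequent."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent), baskets) / base


def lift(antecedent, consequent, baskets):
    """Confidence relative to how common the consequent is overall."""
    base = support(consequent, baskets)
    if base == 0:
        return 0.0
    return confidence(antecedent, consequent, baskets) / base


# Rule under test: customers who buy bread also buy butter.
print("support   :", round(support({"bread", "butter"}, transactions), 3))
print("confidence:", round(confidence({"bread"}, {"butter"}, transactions), 3))
print("lift      :", round(lift({"bread"}, {"butter"}, transactions), 3))
```

A lift above 1 suggests the two product groups are bought together more often than chance would predict; a lift below 1 suggests the opposite.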
Data Mining in Education:
Educational data mining (EDM) is a newly emerging field concerned with developing techniques that discover knowledge from the data generated in educational environments. EDM
objectives are recognized as predicting students' future learning behavior, studying the impact of educational support, and advancing learning science. An institution can use
data mining to make precise decisions and to predict students' results. With those results, the institution can concentrate on what to teach and how to teach it.
Data Mining in Manufacturing Engineering:
Knowledge is the best asset a manufacturing company possesses. Data mining tools can help find patterns in a complex manufacturing process. Data mining can
be used in system-level design to obtain the relationships between product architecture, product portfolio, and the data needs of customers. It can also be used to forecast
the product development period, cost, and expectations, among other tasks.
Data Mining in CRM (Customer Relationship Management):
Customer Relationship Management (CRM) is about acquiring and retaining customers, enhancing customer loyalty, and implementing customer-oriented strategies. To
build a good relationship with its customers, a business organization needs to collect data and analyze it. With data mining technologies, the collected data can be used
for analytics.
Data Mining in Fraud Detection:
Billions of dollars are lost to fraud. Traditional methods of fraud detection are time-consuming and complicated. Data mining provides meaningful
patterns and turns data into information. An ideal fraud detection system should protect the data of all users. Supervised methods start from a collection of sample
records, each classified as fraudulent or non-fraudulent. A model is constructed from this data and is then used to identify whether a new record is
fraudulent or not.
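The supervised approach described above can be sketched in a few lines of Python. This is an illustrative nearest-centroid classifier, chosen here only because it is simple; the feature choice (amount, transactions in the last hour), the sample records, and the function names are all hypothetical, and a real system would use far more data and scaled features.

```python
from math import dist
from statistics import mean

# Hypothetical labelled sample records:
# ((transaction amount, transactions in the last hour), label)
labelled_records = [
    ((20.0, 1), "non-fraudulent"),
    ((35.0, 2), "non-fraudulent"),
    ((15.0, 1), "non-fraudulent"),
    ((900.0, 8), "fraudulent"),
    ((1200.0, 10), "fraudulent"),
]


def train(records):
    """Build the model: one average feature vector (centroid) per class."""
    rows_by_label = {}
    for features, label in records:
        rows_by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in rows_by_label.items()
    }


def classify(model, features):
    """Assign the label whose centroid is closest to the new record."""
    return min(model, key=lambda label: dist(model[label], features))


model = train(labelled_records)
print(classify(model, (1000.0, 9)))  # a large, rapid burst of spending
print(classify(model, (25.0, 1)))    # a small, isolated purchase
```

The two steps mirror the text: `train` constructs the model from classified samples, and `classify` applies it to a record the model has not seen.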
Data Mining in Lie Detection:
Apprehending a criminal is comparatively easy, but bringing out the truth from him is a very challenging task. Law enforcement may use data mining techniques to investigate
offenses, monitor suspected terrorist communications, and so on. This includes text mining, which seeks meaningful patterns in data that is usually unstructured text.
The information collected from previous investigations is compared, and a model for lie detection is constructed.
Data Mining in Financial Banking:
The digitalization of the banking system generates an enormous amount of data with every new transaction. Data mining can help bankers
solve business-related problems in banking and finance by identifying trends, causalities, and correlations in business information and market costs that are not instantly
evident to managers or executives, because the data volume is too large or is produced too rapidly for experts to screen. Managers may use these findings for better
targeting, acquiring, retaining, segmenting, and maintaining profitable customers.
Challenges of Implementation in Data Mining
Although data mining is very powerful, it faces many challenges during its execution. These challenges may relate to performance, data, methods, and techniques.
The data mining process becomes effective when the challenges or problems are correctly recognized and adequately resolved.
Data Distribution:
Real-world data is usually stored on various platforms in a distributed computing environment. It might be in a database, on individual systems, or even on the internet. In practice,
it is quite difficult to bring all the data into a centralized data repository, mainly because of organizational and technical concerns. For example, various regional offices may have
their own servers to store their data. It is not feasible to store all the data from all the offices on a central server. Therefore, data mining requires the development of tools and
algorithms that allow the mining of distributed data.
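One common way to mine distributed data is to let each site summarize its own records and send only the summaries to a coordinator. The sketch below illustrates that idea in Python for simple item counts; the office data and function names are hypothetical, and real distributed mining algorithms are considerably more involved.

```python
from collections import Counter


def local_counts(transactions):
    """Run at each regional office: count in how many baskets each item appears."""
    counts = Counter()
    for basket in transactions:
        counts.update(set(basket))  # set() so an item counts once per basket
    return counts


def merge_counts(partial_counts):
    """Run at the coordinator: combine the summaries, never the raw records."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


# Hypothetical data held on two separate regional servers.
office_a = [["pen", "paper"], ["pen"]]
office_b = [["paper", "ink"], ["pen", "ink"]]

global_counts = merge_counts([local_counts(office_a), local_counts(office_b)])
print(global_counts.most_common())
```

Only the small per-office counters cross the network, so the raw data can stay on the regional servers.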
Complex Data:
Real-world data is heterogeneous. It may include multimedia data such as audio, video, and images, as well as complex data, spatial data, time series, and so on. Managing these
various types of data and extracting useful information from them is a tough task. Most of the time, new technologies, tools, and methodologies have to be refined to obtain
specific information.
Performance:
The data mining system's performance relies primarily on the efficiency of algorithms and techniques used. If the designed algorithm and techniques are not up to the mark,
then the efficiency of the data mining process will be affected adversely.
Data Privacy and Security:
Data mining can lead to serious issues in terms of data security, governance, and privacy. For example, if a retailer analyzes the details of purchased items, the analysis reveals
data about the buying habits and preferences of customers without their permission.
1. Rapid Miner
• Rapid Miner Studio: This module is used for workflow design, prototyping, validation, etc.
• Rapid Miner Server: This module is used for operating predictive data models.
• Rapid Miner Radoop: This module executes processes in Hadoop to simplify predictive analysis.
2. Orange
It is open-source software written in Python. Orange is well suited to data analysis and machine learning. Its components are called widgets. These widgets
are used for reading data, analysing components, letting users select features, and displaying the data. With Orange, formatting data and moving it with the help of
widgets becomes fast and easy.
3. Weka
Weka is developed by the University of Waikato. It is open-source software used for predictive modelling and data analysis. Weka has a GUI that provides easy and
interactive access to users. It supports SQL, so a user can connect to a database and perform operations by running queries. It stores data in a flat-file format.
4. KNIME
It is open-source software developed by KNIME.com AG and used for data analytics. It is built by combining data mining and machine learning components. It has been used for
pharmaceutical research, business intelligence, and financial analysis.
5. Sisense
It is not open-source software; it is licensed software, and a license must be purchased to use it. Small and large organizations use Sisense to handle their data. Because it
supports widgets as Orange does, it is easy to move data and create reports by dragging and dropping. Even non-technical people can work with Sisense, as it is GUI-based. With the
help of widgets, Sisense generates reports in the form of bar charts, pie charts, line charts, etc.
6. Apache Mahout
It is developed by the Apache Software Foundation. Apache Mahout aims to create algorithms for machine learning and focuses on regression, clustering, and classification of data. Because it is written in
Java, a well-known language, and contains Java libraries that support mathematical operations, it is also used for statistical analysis.
7. SSDT
SSDT is short for SQL Server Data Tools. It is used to extend the database development phases within Visual Studio. It is widely used for data analysis and provides solutions to
business intelligence problems. SSDT provides a table designer for table operations such as creating a table and adding, deleting, or modifying table
data. Because it supports SQL, it allows a user to connect to a database.
8. Rattle
Rattle is an open-source tool developed using the R language. It provides a GUI. Its built-in log code tab enables Rattle to generate duplicate code for every activity performed in the GUI.
9. DataMelt
It is also known as DMelt. It is used to analyze and visualize data. It is designed for students, engineers, and scientists. It is platform-independent, which means it can run on any
operating system that has a JVM (Java Virtual Machine). It can be used to create 2D or 3D plots, generate random numbers, perform mathematical operations, and work with algebraic equations.
10. SAS
It is developed for managing large amounts of data. It allows a user to modify data and to store data from different locations in one place. Because it provides a GUI, a
non-technical person can also use it quickly and handle their data efficiently.
• Not yet ready for business: in fact, data mining is well suited to implementation in a business environment.
• Needs a separate database: in fact, data mining can use an existing database.
• Only those with advanced technology can use it.
Business performance management (BPM) measures an organisation's overall progress towards
its objectives. When a company uses performance management, it collects and analyses data to evaluate its
business operations. This is a valuable technique that helps the organisation collect quantitative data, such as
the number of sales made in a month or the company's current cash flow. Management teams evaluate the
performance of individual employees and entire departments to make beneficial decisions. Along with
analysing the financial aspects of a business, BPM also considers employee and customer satisfaction.
Performance management is a beneficial way to evaluate employees and company progress. An organisation
that uses BPM considers crucial data and progress records to analyse its performance. The following are
important aspects of BPM:
Performance management considers how well a company aligns with predetermined objectives. Business
goals serve as motivation and provide a clear objective for all employees to achieve. With performance
management, you can evaluate the rate at which the organisation achieves its milestones and make any
additional changes to help you advance. Performance management allows the management team to create
company-wide business objectives and monitor their progress throughout the year.
Evaluates alternatives
Considering alternatives is important when the business's initial approach produces unexpected results.
BPM's key benefit is that it invites new ideas and encourages innovative thinking among employees. Diverse
viewpoints can lead to a better approach because they consider additional data and help the management
team learn from previous experience.
Improves accountability
When a management team implements performance management, the company holds its employees
accountable. Since managers and supervisors evaluate employees' performance, they consider company goals
more frequently. Employees understand their responsibilities better when held to account by their employers.
When a company uses performance management, its assessment of the company is more transparent.
A proper structure for managing business performance allows an organisation to set clear expectations for
employees and supervisors. It allows management to create a list of employee expectations based on current
performance. Clear and achievable expectations are likely to produce consistent results.
Improves communication
Communication quality can influence your performance management system's success. BPM promotes a
culture of clear communication, greater team engagement and coherence between personal and company
goals. It encourages companies to engage in one-on-one conversations to provide consistent feedback, foster
skill development, integrate team building and promote collaboration.
An effective performance management system is important for identifying employees' skill gaps and
providing a training system to close them. A training plan boosts employee morale because it demonstrates
the organisation values them. Besides acting as a potent talent magnet, professional development
opportunities also increase employee retention rates. To achieve these objectives, an organisation may create
a training budget and determine how specific skills help to maximise their return on investment.
Goal selection
Goal selection is when the business decides on short- and long-term goals. Typically, several
members of the management team think of these goals, which are often realistic and reflect the
business's trajectory. The company can focus on specific goals while postponing others. This
prioritisation allows employees to dedicate time, energy and resources to select objectives.
Information consolidation
Information consolidation involves gathering data about the company. The management team
does this to evaluate and direct decision-making. When a business uses information consolidation,
it aims to provide accurate and reliable information for the team's reference.
Management intervention
Management interventions are steps managers take to enhance the business's operations. They
determine these interventions using the data they collect during information consolidation. Their
actions consider the company's mission and established goals. For example, a supervisor might
check in with an employee weekly rather than bi-weekly. This approach provides an additional
opportunity to ask questions.
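The three steps above (goal selection, information consolidation, management intervention) can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the metric names, the department figures, the 90% threshold, and the function names are all hypothetical.

```python
# Goal selection: hypothetical targets chosen by the management team.
goals = {"monthly_sales": 120, "new_customers": 30}

# Figures reported by two hypothetical departments.
department_reports = [
    {"monthly_sales": 40, "new_customers": 20},
    {"monthly_sales": 50, "new_customers": 15},
]


def consolidate(reports):
    """Information consolidation: add up each metric across all reports."""
    totals = {}
    for report in reports:
        for metric, value in report.items():
            totals[metric] = totals.get(metric, 0) + value
    return totals


def progress(goals, actuals):
    """Share of each goal achieved so far (1.0 means the target is met)."""
    return {metric: actuals.get(metric, 0) / target for metric, target in goals.items()}


def needs_intervention(goals, actuals, threshold=0.9):
    """Management intervention: list goals that are below the chosen threshold."""
    ratios = progress(goals, actuals)
    return sorted(metric for metric, ratio in ratios.items() if ratio < threshold)


actuals = consolidate(department_reports)
print(actuals)
print(needs_intervention(goals, actuals))
```

Here the consolidated sales figure falls short of its target, so it is the goal a manager would follow up on, for example with more frequent check-ins.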
IT tool
Goal Seek in Excel (Examples) | How to Use Goal Seek in Excel? (educba.com)
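Goal Seek finds the input value that makes a formula produce a desired result. The sketch below imitates that idea in Python with the bisection method, which is used here only as a simple stand-in and is not claimed to be the method Excel uses. The `goal_seek` function, the `profit` formula, and all of its numbers are hypothetical.

```python
def goal_seek(formula, target, low, high, tolerance=1e-9, max_iterations=200):
    """Find x between low and high such that formula(x) is close to target.

    Assumes formula is continuous and that target lies between
    formula(low) and formula(high).
    """
    error_low = formula(low) - target
    error_high = formula(high) - target
    if error_low == 0:
        return low
    if error_high == 0:
        return high
    if error_low * error_high > 0:
        raise ValueError("target is not bracketed by low and high")
    for _ in range(max_iterations):
        mid = (low + high) / 2
        error_mid = formula(mid) - target
        if abs(error_mid) <= tolerance or (high - low) / 2 < tolerance:
            return mid
        if error_low * error_mid < 0:
            high = mid  # the solution lies in the lower half
        else:
            low, error_low = mid, error_mid  # the solution lies in the upper half
    return (low + high) / 2


def profit(units):
    """Hypothetical formula: 25 selling price, 15 unit cost, 2000 fixed costs."""
    return units * (25.0 - 15.0) - 2000.0


# How many units must be sold for the profit to reach zero (break-even)?
break_even_units = goal_seek(profit, target=0.0, low=0.0, high=1000.0)
print(round(break_even_units, 2))
```

In spreadsheet terms, `profit` plays the role of the formula cell, `target` is the "To value", and the returned number is the value Goal Seek would place in the changing cell.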