
ML@Chapter 1

Chapter 1 provides an introduction to machine learning (ML), defining it as a subfield of artificial intelligence that enables machines to learn from data without explicit programming. It covers the history, applications, and various types of machine learning techniques, including supervised, unsupervised, semi-supervised, and reinforcement learning. The chapter emphasizes the importance of data quality and the machine learning process, which includes data collection, preparation, modeling, and deployment.


Chapter 1

Introduction to Machine Learning

Outline
What is machine learning
Foundations of machine learning
History and relationship to other fields
Applications of machine learning
Types of machine learning techniques
Overview of data mining and the KDD process
Predictive vs. descriptive modeling
Machine Learning

Machine learning (ML) is a subfield of artificial intelligence that enables machines to learn from data without being explicitly programmed.

Machine learning uses experience to improve performance or to make accurate predictions.

Here, experience refers to the past information available to the learner, which typically takes the form of electronic data collected and made available for analysis.

This data could be digitized human-labeled training sets or other types of information obtained through interaction with the environment.

In all cases, the quality and size of the data are crucial to the success of the learner's predictions.

Algorithm development is the core work in machine learning.

These algorithms are trained on data to learn hidden patterns and make predictions based on what they have learned.

The whole process of training the algorithms is termed model building.
How does Machine Learning Work?

Broadly, the machine learning process includes project setup, data preparation, modeling, and deployment.

The following figure shows the common machine learning workflow as a sequence of steps:
Need for Machine Learning

Human beings are, at this moment, the most intelligent and advanced species on earth because they can think, evaluate, and solve complex problems.

AI, on the other hand, is still at an early stage and has not surpassed human intelligence in many respects.

So why make machines learn? The most suitable reason is "to make decisions, based on data, with efficiency and scale".

Lately, organizations have been investing heavily in newer technologies such as artificial intelligence, machine learning, and deep learning to extract key information from data, perform real-world tasks, and solve problems. These can be called data-driven decisions taken by machines, particularly to automate a process.

Such data-driven decisions can be used in place of programming logic for problems that inherently cannot be programmed by hand.

We cannot do without human intelligence, but we also need to solve real-world problems efficiently and at a huge scale. That is why the need for machine learning arises.
History and Foundations of Machine Learning

The history of machine learning dates back to the 1950s, when Arthur Samuel developed a program that calculated each side's chance of winning in checkers. Samuel coined the term "machine learning" in 1959.

The evolution of machine learning over the decades started with the question, "Can machines think?". Then came the rise of neural networks between 1960 and 1970.

Machine learning continued to advance through statistical methods such as Bayesian networks and decision tree learning.

The deep learning revolution started in the 2010s, driven by advances in areas such as natural language processing, speech recognition, and convolutional neural networks.

Today, machine learning has become a transformative technology used in fields ranging from healthcare to finance and transportation.

Statistics and mathematical optimization (mathematical programming) methods comprise the foundations of machine learning.
Relationship of Machine Learning to Other Fields

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions.

Advances in the field of deep learning have allowed neural networks to surpass many previous approaches in performance.

ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine.


Applications of Machine Learning
Nowadays, machine learning is used almost everywhere. Some of its most common application areas are:

Speech recognition: Also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text, this capability uses natural language processing (NLP) to translate human speech into a written format. Many mobile devices incorporate speech recognition into their systems to perform voice search (for example, Siri) or to improve text accessibility.

Customer service: Chatbots are replacing human operators on websites and social media, changing how clients engage with businesses. Chatbots answer shipping FAQs, offer personalized advice, cross-sell products, and recommend sizes. Common examples are virtual agents on e-commerce sites, Slack and Facebook Messenger bots, and virtual and voice assistants.

Computer vision: This artificial intelligence technology allows computers to derive meaningful information from digital images, videos, and other visual inputs, which can then be used to take appropriate action. Computer vision is powered by convolutional neural networks.

Recommendation engines: Using patterns in past data, AI algorithms can help detect trends that are useful for developing more efficient marketing strategies. Online retailers use recommendation engines to offer customers relevant product recommendations during the purchasing process.

Robotic process automation (RPA): Also known as software robotics, RPA uses intelligent automation technologies to perform repetitive manual tasks.

Automated stock trading: AI-driven high-frequency trading platforms are designed to optimize stock portfolios and make thousands or even millions of trades each day without human intervention.

Fraud detection: Machine learning can detect suspicious transactions for banks and other organizations in the financial sector. A model can be trained by supervised learning on known recent fraudulent transactions. Anomaly detection can identify transactions that appear unusual and need to be followed up.

Social media: Social media platforms are particularly popular among young people for their user-friendly features and the ability to connect easily with one's contacts.

• This is all made possible by machine learning algorithms.

• For example, Facebook uses machine learning to observe and record users' activities, including their chats, likes, and comments and the time they spend on various posts. Based on what it learns from the collected data, it suggests friends and pages to follow.
Machine Learning Methods (Types of Models)
Machine learning models can be categorized mainly into the following four types:

Supervised Machine Learning

Unsupervised Machine Learning

Semi-supervised Machine Learning

Reinforcement Machine Learning

Supervised Machine Learning

Supervised machine learning uses labeled datasets to train algorithms to classify data or predict outcomes.

As input data is fed into the model, the model adjusts its weights until it fits the data appropriately. This fitting is checked with cross-validation, which helps ensure that the model is neither overfitted nor underfitted.

The algorithm is trained on labeled data, meaning that the correct answer or output is provided for each input.

The algorithm then uses this labeled data to make predictions about new, unseen data.
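
To make the idea of learning from labeled data concrete, here is a minimal sketch in Python. It uses a k-nearest-neighbors classifier, an algorithm picked only for illustration and not named on this slide. The tiny dataset, its feature meanings, and the function name `knn_predict` are assumptions made up for the example.

```python
from collections import Counter
import math

# Labeled training data (hypothetical): (hours studied, hours slept) -> outcome
train_X = [(1.0, 4.0), (2.0, 5.0), (1.5, 3.5), (6.0, 7.0), (7.0, 8.0), (6.5, 6.5)]
train_y = ["fail", "fail", "fail", "pass", "pass", "pass"]


def knn_predict(x, X, y, k=3):
    """Predict the label of x by majority vote among its k nearest labeled points."""
    # Sort the labeled points by Euclidean distance to x and keep the k closest
    nearest = sorted(zip(X, y), key=lambda pair: math.dist(pair[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Predictions for new, unseen inputs
print(knn_predict((1.2, 4.2), train_X, train_y))
print(knn_predict((6.8, 7.5), train_X, train_y))
```

The correct answer is supplied for every training input, and the prediction for a new input is derived from those labeled examples.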



Supervised learning helps organizations solve real-world problems at scale, such as classifying spam into a folder separate from your inbox.

Algorithms for supervised learning include neural networks, naïve Bayes, linear regression, logistic regression, and random forest.

Techniques: logistic regression, linear regression, decision tree, and neural network.
Unsupervised Machine Learning

Unsupervised machine learning analyzes and clusters unlabeled datasets using machine learning methods.

The algorithms find hidden patterns or data groupings without human intervention.

This method is useful for exploratory data analysis, cross-selling, customer segmentation, and image and pattern recognition.

The algorithm is trained on unlabeled data, meaning that the correct output or answer is not provided for each input.

Instead, the algorithm must identify patterns and structures in the data on its own.

Unsupervised machine learning (UML) is also used to reduce the number of features in a model through dimensionality reduction. Two prominent methods are principal component analysis (PCA) and singular value decomposition (SVD).

K-means clustering, probabilistic clustering, association rule learning, and dimensionality reduction are some popular methods (algorithms) of unsupervised learning.
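
As an illustration of one of the methods listed above, the following is a minimal sketch of k-means clustering in plain Python. The sample points, the choice of the first k points as starting centroids, and the fixed number of iterations are simplifying assumptions for the example, not part of the slide.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = list(points[:k])  # simplistic initialization: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Unlabeled data (hypothetical): no correct answer is given for any point
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 1.5), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
print(clusters)
```

No labels are used anywhere: the two groups emerge only from the distances between the points.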
Summary of differences: supervised vs. unsupervised learning

What is it?
Supervised learning: You train the model with a set of input data and a corresponding set of paired labeled output data.
Unsupervised learning: You train the model to discover hidden patterns in unlabeled data.

Techniques
Supervised learning: Logistic regression, linear regression, decision tree, and neural network.
Unsupervised learning: Clustering, association rule learning, probability density, and dimensionality reduction.

Goal
Supervised learning: Predict an output based on known inputs.
Unsupervised learning: Identify valuable relationship information between input data points. This can then be applied to new input to draw similar insights.

Approach
Supervised learning: Minimize the error between predicted outputs and true labels.
Unsupervised learning: Find patterns, similarities, or anomalies within the data.
Semi-supervised Machine Learning
As its name implies, semi-supervised learning is an integration of supervised and unsupervised learning. This method uses both labeled and unlabeled data to train ML models for classification and regression tasks.

Semi-supervised learning is a good choice when there is not enough labeled data for a supervised learning algorithm.

It trains on a large portion of unlabeled data together with a small portion of labeled data.


• Hence, it is an appropriate method for problems where the data is only partially labeled.

• Self-training, co-training, and graph-based labeling are some of the popular semi-supervised learning methods.
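
The following is a minimal sketch of self-training, one of the methods named above. It assumes a very simple nearest-centroid classifier on one-dimensional data; the helper names (`fit_centroids`, `predict`, `self_train`), the margin threshold, and the sample values are all hypothetical choices made for the illustration.

```python
def fit_centroids(labeled):
    """Mean feature value per class: a tiny nearest-centroid classifier."""
    sums = {}
    for x, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}


def predict(model, x):
    """Return (label, margin); the margin serves as a crude confidence score."""
    ranked = sorted(model, key=lambda label: abs(x - model[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(x - model[runner_up]) - abs(x - model[best])
    return best, margin


def self_train(labeled, unlabeled, min_margin=2.0, max_rounds=10):
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = fit_centroids(labeled)
        # Pseudo-label only the points the current model is confident about
        confident = []
        for x in unlabeled:
            label, margin = predict(model, x)
            if margin >= min_margin:
                confident.append((x, label))
        if not confident:
            break
        labeled.extend(confident)
        accepted = {x for x, _ in confident}
        unlabeled = [x for x in unlabeled if x not in accepted]
    return fit_centroids(labeled), unlabeled


# A small labeled set and a larger unlabeled set (hypothetical values)
labeled = [(1.0, "low"), (9.0, "high")]
unlabeled = [1.5, 2.0, 2.5, 8.0, 8.5, 7.5, 5.2]
model, remaining = self_train(labeled, unlabeled)
print(model, remaining)
```

Points the model cannot label with enough confidence stay unlabeled, which keeps doubtful pseudo-labels out of the training set.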
Reinforcement Machine Learning

In reinforcement machine learning, the algorithm learns by receiving feedback in the form of rewards or punishments based on its actions.

The algorithm then uses this feedback to adjust its behavior and improve its performance.

It is a type of machine learning that is similar to supervised learning, except that the algorithm is not trained on sample data.

This model learns by trial and error.


Example

• Works by interacting with the environment

• No predefined data

• Learns a series of actions

• Applications: self-driving cars, gaming, healthcare
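
Learning a series of actions by trial and error can be sketched with tabular Q-learning, a standard reinforcement learning algorithm that the slide does not name. The corridor environment, the reward of 1 at the goal, and the parameter values below are assumptions invented for this illustration.

```python
import random

N_STATES = 5          # states 0..4 in a corridor; state 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Environment: move along the corridor; reward 1 only on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, max_steps=200, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # trial and error: explore at random
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q = train()
# Greedy policy: the best-valued action in each non-goal state
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

There is no predefined dataset here: the only training signal is the reward returned by the environment after each action.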

When to use: supervised vs. unsupervised learning or both?

You can use supervised learning techniques to solve problems that have known outcomes and labeled data available.

Examples include risk evaluation, sales forecasting, email spam classification, image recognition, and stock price prediction based on known historical data.

You can use unsupervised learning for scenarios where the data is unlabeled and the objective is to discover patterns, group similar instances, or detect anomalies.

You can also use it for exploratory tasks where labeled data is absent.

Examples include organizing large data archives, building recommendation systems, and grouping customers based on their purchasing behaviors.

Semi-supervised learning is when you apply both supervised and unsupervised learning techniques to a common problem. It is a category of machine learning in its own right.

You can apply semi-supervised learning when it is difficult to obtain labels for a dataset.

You might have a small volume of labeled data but a significant amount of unlabeled data.
Stages of Machine Learning


Data collection

Data collection is the initial step in the machine learning process.

Data is a fundamental part of machine learning; the quality and quantity of your data have direct consequences for model performance.

Data can be collected from different sources, such as databases, text files, pictures, sound files, or web scraping.

Once the data has been collected, it needs to be prepared for machine learning.

This means organizing the data in an appropriate format and making sure that it is useful for solving your problem.

Data pre-processing

Pre-processing of data is a key step in the machine learning process.

It involves deleting duplicate data, fixing errors, managing missing data by either eliminating it or filling it in, and adjusting and formatting the data.

Pre-processing improves the quality of your data and ensures that your machine learning model can read it correctly.

This step can significantly improve the accuracy of your model.
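
The pre-processing tasks described above can be sketched in a few lines of plain Python. The record layout (customer id, age, city), the sample values, and the decision to fill missing ages with the mean while dropping records with a missing city are assumptions chosen for the illustration.

```python
# Hypothetical raw records: (customer id, age, city); None marks a missing value
raw = [
    (1, 34, "london"),
    (2, None, "Paris "),
    (1, 34, "london"),   # exact duplicate of the first record
    (3, 28, None),
    (4, 40, " Berlin"),
]


def preprocess(records):
    # 1. Delete duplicate records while keeping the original order
    unique = list(dict.fromkeys(records))
    # 2. Compute the mean of the known ages, used to fill in missing ages
    known_ages = [age for _, age, _ in unique if age is not None]
    mean_age = sum(known_ages) / len(known_ages)
    cleaned = []
    for cid, age, city in unique:
        if city is None:
            continue  # 3. Eliminate records whose city is missing
        age = mean_age if age is None else age
        # 4. Adjust formatting: trim whitespace, use consistent capitalization
        cleaned.append((cid, float(age), city.strip().title()))
    return cleaned


cleaned = preprocess(raw)
for row in cleaned:
    print(row)
```

Whether to fill in or eliminate a missing value depends on the problem; both options are shown here only to mirror the two choices mentioned in the text.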

Choosing the right model

The next step is to select a machine learning model. Once the data is prepared, it is applied to a model such as linear regression, a decision tree, or a neural network.

The choice of model generally depends on the kind of data you are dealing with and on your problem.

The size and type of data, complexity, and computational resources should be taken into account when choosing a model.
Training the model

After you have chosen a model, the next step is to train it with the prepared data.

Training means feeding the data to the model and letting it adjust its parameters so that it predicts the output more accurately. Overfitting and underfitting must be avoided during training.
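
A minimal sketch of what adjusting parameters during training can look like is gradient descent on a linear model. The model form y ≈ w * x + b, the learning rate, the number of epochs, and the data are assumptions for this example and are not specified on the slide.

```python
def train_linear(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors with the current parameters
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Adjust the parameters in the direction that reduces the error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

Each pass makes a small correction to `w` and `b`, so the predictions move closer to the known outputs.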

Evaluating the model

It is important to assess the model's performance after it has been trained and before it is deployed. This means testing the model on new data that it did not see during training.

Common metrics for evaluating a model are accuracy for classification problems, precision and recall for binary classification problems, and mean squared error for regression problems.
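
The metrics named above can be computed directly from their definitions. The sketch below assumes binary labels encoded as 0 and 1, and the label and prediction lists are hypothetical test results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical results on held-out test data
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(accuracy(labels, predicted))
print(precision_recall(labels, predicted))
print(mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 3.0]))
```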

Hyperparameter tuning and optimization

After you have evaluated the model, you may need to adjust its hyperparameters to make it perform better.

Techniques for hyperparameter tuning include grid search, where you try different combinations of hyperparameter values, and cross-validation, where you divide your data into subsets and repeatedly train the model on some subsets and validate it on the remaining one, to check that it performs well on different data.
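
Grid search and cross-validation are often combined: every candidate value is scored by cross-validation and the best-scoring one is kept. The sketch below assumes a one-feature ridge regression without an intercept, whose regularization strength `lam` is the hyperparameter; the model, the data, the grid values, and the way folds are formed are all hypothetical choices for the illustration.

```python
def fit_ridge(xs, ys, lam):
    """One-feature ridge regression without intercept: closed-form weight."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def k_fold_mse(xs, ys, lam, k=3):
    """Average validation mean squared error over k folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        val_idx = set(range(fold, n, k))  # every k-th point is held out
        data = list(enumerate(zip(xs, ys)))
        train = [(x, y) for i, (x, y) in data if i not in val_idx]
        val = [(x, y) for i, (x, y) in data if i in val_idx]
        # Train on the remaining folds, validate on the held-out fold
        w = fit_ridge([x for x, _ in train], [y for _, y in train], lam)
        scores.append(sum((w * x - y) ** 2 for x, y in val) / len(val))
    return sum(scores) / k


def grid_search(xs, ys, grid):
    """Score every candidate value and return the one with the lowest error."""
    results = {lam: k_fold_mse(xs, ys, lam) for lam in grid}
    best = min(results, key=results.get)
    return best, results


# Hypothetical data that roughly follows y = 2x
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
best, results = grid_search(xs, ys, [0.0, 0.1, 1.0, 10.0])
print(best, results)
```

Because every candidate is validated on data it was not trained on, the selected value is less likely to reflect a lucky fit to one particular split.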

Predictions and deployment

Once the model has been trained and optimized, it is ready to make predictions on new data.

This is done by feeding new data to the model and using its output for decision-making or further analysis.
Overview of DM and KDD

Data mining is defined as the procedure of extracting hidden information from huge sets of data. In other words, data mining is mining knowledge from data.

Data mining (DM) is the part of the KDD process concerned with methods for extracting patterns from data [Fayyad].

Data mining is a problem-solving methodology that finds a logical or mathematical description, of a complex nature, of patterns and regularities in a set of data [Decker and Focardi].

KDD (Knowledge Discovery in Databases) is a process that involves the extraction of useful, previously unknown, and potentially valuable information from large datasets. The KDD process is iterative: extracting accurate knowledge from the data usually requires multiple passes through its steps.

Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data [Fayyad].

DM: The non-trivial extraction of implicit, previously unknown, and potentially useful knowledge from data.


Data Mining Process

Understand the Business
• Identify the company's and the project's objectives first, along with the problems that need to be addressed

Understand the Data
• Identify what type of data is needed to solve the issue and collect it from authentic sources; obtain access rights; and prepare a data description report

Prepare the Data
• Clean the data: handle missing data, data errors, default values, and data corrections
• Put the data into a suitable format

Model the Data
• Employ algorithms to find patterns in the data; create, test, and validate the model

Evaluation
• Validate the models against the business goals; change the model, adjust the business goal, or revisit the data if needed

Deployment
• Generate business intelligence, and continually monitor and maintain the data mining application
Why Data Mining?

Data mining is important to learn for several reasons:

Extracting insights: Data mining techniques allow users to extract useful information and patterns from vast amounts of data.

Decision making: Data mining contributes to the decision-making process. By analyzing historical data, businesses can predict future trends and outcomes with a high degree of confidence.

Customer understanding: By analyzing the behavior, preferences, and purchasing patterns of customers, data mining enables enterprises to gain a more accurate understanding of their clients.

Risk management: By using data mining techniques to analyze patterns and anomalies in the data, businesses can identify possible risks or fraud.

Improved efficiency: Data mining automatically discovers patterns and insights in data, which can greatly enhance the efficiency of operations.

Innovation: Analyzing data can reveal hidden patterns and relationships that lead to new product ideas, innovations, or business opportunities.

Personal development: Knowledge of data mining enhances analytical and problem-solving skills.
Key Areas of Machine Learning
