ML@Chapter 1

Chapter 1 provides an introduction to machine learning (ML), defining it as a subfield of artificial intelligence that enables machines to learn from data without explicit programming. It covers the history, applications, and various types of machine learning techniques, including supervised, unsupervised, semi-supervised, and reinforcement learning. The chapter emphasizes the importance of data quality and the machine learning process, which includes data collection, preparation, modeling, and deployment.

Uploaded by

ah4710519

Chapter 1

Introduction to Machine Learning

Outline
What is machine learning
Foundation of Machine learning
History and relationship to other fields
Applications of machine learning
Types of machine learning techniques
Overview of data mining and KDD process
Prediction vs Description modeling
Machine Learning

Machine learning (ML) is a subfield of artificial intelligence that enables machines to learn from data without being explicitly programmed.

Machine learning uses experience to improve performance or to make accurate predictions.

Here, experience refers to the past information available to the learner, which typically takes the form of electronic data collected and made available for analysis.

This data could be in the form of digitized human-labeled training sets, or other types of information obtained via interaction with the environment.

In all cases, the quality and size of the data are crucial to the success of the predictions made by the learner.

In machine learning, algorithm development is the core work.

These algorithms are trained on data to learn hidden patterns and make predictions based on what they have learned.

The whole process of training the algorithms is termed model building.
How does Machine Learning Work?

Broadly, the machine learning process includes project setup, data preparation, modeling, and deployment.

The following figure shows the common working process of machine learning, which follows a sequence of steps:
Need for Machine Learning

Human beings are, at this moment, the most intelligent and advanced species on earth because they can think, evaluate, and solve complex problems.

AI, on the other hand, is still at an early stage and has not surpassed human intelligence in many respects.

Why, then, make machines learn? The most compelling reason is “to make decisions, based on data, with efficiency and scale”.

Organizations are investing heavily in technologies such as artificial intelligence, machine learning, and deep learning to extract key information from data, perform real-world tasks, and solve problems. These are data-driven decisions taken by machines, particularly to automate processes.

Such data-driven decisions can replace hand-written program logic in problems that are inherently hard to program explicitly.

We cannot do without human intelligence, but we also need to solve real-world problems efficiently and at a huge scale. That is why the need for machine learning arises.
History and Foundations of Machine Learning

The history of machine learning dates back to the 1950s. Arthur Samuel, who coined the term "machine learning" in 1959, wrote a checkers program that estimated each side's chances of winning.

The evolution of machine learning through the decades started with the question "Can machines think?", posed by Alan Turing in 1950. Early neural network research followed in the late 1950s and 1960s.

Machine learning continued to advance through statistical methods such as Bayesian networks and decision tree learning.

The deep learning revolution took off in the 2010s, driven by convolutional neural networks and by advances in tasks such as natural language processing and speech recognition.

Today, machine learning is a transformative technology used in fields ranging from healthcare to finance and transportation.

Statistics and mathematical optimization (mathematical programming) methods comprise the foundations of machine learning.
Machine Learning's Relation to Other Fields

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions.

Advances in the field of deep learning have allowed neural networks to surpass many previous approaches in performance.

ML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine.

Applications of Machine Learning
Machine learning is now used almost everywhere. Some of the most common application areas are:

Speech recognition: Also known as automatic speech recognition (ASR), computer speech recognition, or speech-to-text, this capability uses natural language processing (NLP) to translate human speech into a written format. Many mobile devices incorporate speech recognition into their systems to perform voice search (for example, Siri) or to improve text accessibility.

Customer service: Chatbots are replacing human operators on websites and social media, changing how clients engage with companies. Chatbots answer shipping FAQs, offer personalized advice, cross-sell products, and recommend sizes. Common examples are virtual agents on e-commerce sites, Slack and Facebook Messenger bots, and virtual and voice assistants.

Computer vision: This artificial intelligence technology allows computers to derive meaningful information from digital images, videos, and other visual inputs that can then be used for appropriate action. Computer vision, powered by convolutional neural networks, is
…

Recommendation engines: Using patterns in past data, AI algorithms can detect trends that help develop more effective marketing strategies. Online retailers use recommendation engines to give their customers relevant product recommendations during the purchasing process.

Robotic process automation (RPA): Also known as software robotics, RPA uses intelligent automation technologies to perform repetitive manual tasks.

Automated stock trading: AI-driven high-frequency trading platforms are designed to optimize stock portfolios and make thousands or even millions of trades each day without human intervention.

Fraud detection: Machine learning can detect suspicious transactions for banks and other organizations in the financial sector. A model can be trained by supervised learning on known recent fraudulent transactions. Anomaly detection can identify transactions that appear unusual and need to be followed up.

Social media: Social media platforms are particularly popular among young people for their user-friendly features and the ability to connect easily with one's contacts.

• Much of this is made possible by machine learning algorithms.

• For example, Facebook uses machine learning to observe and record users' activities, including their chats, likes, and comments, and the time they spend on various posts. Based on these observations and on learning from the collected data, it suggests friends and pages you should follow.
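
The anomaly detection mentioned under fraud detection can be sketched in a few lines. The example below is only an illustration, not a production fraud model: it flags amounts that lie far from the median, using the median absolute deviation (MAD). The function name `flag_anomalies`, the sample amounts, and the cut-off of 3.5 (a common convention for this score) are assumptions made for the example.

```python
from statistics import median

def flag_anomalies(amounts, threshold=3.5):
    """Return amounts whose modified z-score (median/MAD based) exceeds `threshold`."""
    center = median(amounts)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(a - center) for a in amounts)
    if mad == 0:
        return []  # no spread at all, so nothing can be called unusual
    return [a for a in amounts if 0.6745 * abs(a - center) / mad > threshold]

# Hypothetical transaction amounts; the last one is unusually large.
transactions = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 500.0]
suspicious = flag_anomalies(transactions)
print(suspicious)
```

A flagged transaction is only a candidate for follow-up; deciding whether it is actually fraudulent still needs labeled examples or human review.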
Machine Learning Methods (Types/Models)
Machine learning models can be categorized mainly into the following four types:

Supervised Machine Learning

Unsupervised Machine Learning

Semi-supervised Machine Learning

Reinforcement Machine Learning

Supervised Machine Learning

Supervised machine learning uses labeled datasets to train algorithms to classify data or predict outcomes.

As input data is fed into the model, the model adjusts its weights until it fits the data appropriately. Cross-validation is used during this process to check that the model is neither overfitted nor underfitted.

The algorithm is trained on labeled data, meaning that the correct answer or output is provided for each input.

The algorithm then uses this labeled data to make predictions about new, unseen data.
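
As a minimal sketch of this idea, the example below "trains" on a handful of labeled inputs and then predicts labels for inputs it has not seen. It uses a 1-nearest-neighbour rule, chosen here only because it is short; the data (hours studied, hours slept) and the function name `nearest_neighbor_predict` are hypothetical.

```python
import math

def nearest_neighbor_predict(train_inputs, train_labels, query):
    """Return the label of the training input closest to `query` (1-NN)."""
    best_index = min(
        range(len(train_inputs)),
        key=lambda i: math.dist(train_inputs[i], query),
    )
    return train_labels[best_index]

# Hypothetical labeled data: (hours studied, hours slept) -> exam outcome.
train_inputs = [(1.0, 4.0), (2.0, 5.0), (8.0, 7.0), (9.0, 8.0)]
train_labels = ["fail", "fail", "pass", "pass"]

# Predict labels for new inputs that are not in the training set.
prediction_low = nearest_neighbor_predict(train_inputs, train_labels, (1.5, 4.5))
prediction_high = nearest_neighbor_predict(train_inputs, train_labels, (8.5, 7.5))
print(prediction_low, prediction_high)
```

The correct answer is supplied for every training input, which is what makes this supervised; the model only has to generalize to nearby unseen points.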

Example

Supervised learning helps organizations solve real-world problems at scale, such as classifying spam into a separate folder from your inbox.

Algorithms for supervised learning include neural networks, naïve Bayes, linear regression, logistic regression, and random forest.
Unsupervised Machine Learning

Unsupervised machine learning analyzes and clusters unlabeled datasets using machine learning methods.

The algorithms find hidden patterns or data groupings without human intervention.

This method is useful for exploratory data analysis, cross-selling, consumer segmentation, and image and pattern recognition.

The algorithm is trained on unlabeled data, meaning that the correct output or answer is not provided for each input.

Instead, the algorithm must identify patterns and structures in the data on its own.
Example


Unsupervised machine learning (UML) can also reduce the number of model features through dimensionality reduction, using prominent methods such as principal component analysis (PCA) and singular value decomposition (SVD).

K-means clustering, probabilistic clustering, association rule learning, and dimensionality reduction are some popular methods (algorithms) of unsupervised learning.
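
To make the clustering idea concrete, here is a small sketch of k-means in plain Python. It is an illustration under simplifying assumptions: the points are made up, the starting centroids are picked by hand rather than at random, and the function name `k_means` is chosen for this example.

```python
import math

def k_means(points, initial_centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments

# Hypothetical unlabeled 2-D points forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = k_means(points, initial_centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

No labels are given anywhere in the input; the two groups emerge only from the distances between points.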
Summary of differences: supervised vs. unsupervised learning

What is it?
Supervised learning: You train the model with a set of input data and a corresponding set of paired labeled output data.
Unsupervised learning: You train the model to discover hidden patterns in unlabeled data.

Techniques
Supervised learning: Logistic regression, linear regression, decision tree, and neural network.
Unsupervised learning: Clustering, association rule learning, probability density, and dimensionality reduction.

Goal
Supervised learning: Predict an output based on known inputs.
Unsupervised learning: Identify valuable relationship information between input data points. This can then be applied to new input to draw similar insights.

Approach
Supervised learning: Minimize the error between predicted outputs and true labels.
Unsupervised learning: Find patterns, similarities, or anomalies within the data.
Semi-supervised Machine Learning
As its name implies, semi-supervised learning combines supervised and unsupervised learning. It uses both labeled and unlabeled data to train ML models for classification and regression tasks.

Semi-supervised learning is well suited to problems where there is not enough labeled data for a supervised learning algorithm.

Training typically uses a large portion of unlabeled data and a small portion of labeled data.
Example

• Hence, it is an appropriate method for problems where most of the data is unlabeled and only part of it is labeled.
• Self-training, co-training, and graph-based labeling are some of the popular semi-supervised learning methods.
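
Self-training, the first method named above, can be sketched as a loop that repeatedly labels the unlabeled example the current model is most confident about. The version below is a simplified illustration: it assumes one-dimensional features and a nearest-centroid classifier, and the data and the function name `self_train` are hypothetical.

```python
def self_train(labeled, unlabeled):
    """Self-training with a nearest-centroid classifier on 1-D features.

    Each round pseudo-labels the unlabeled value closest to a class centroid
    (the most confident prediction) and adds it to the labeled set.
    """
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # "Train": compute one centroid (mean feature value) per class.
        centroids = {}
        for label in {lab for _, lab in labeled}:
            values = [x for x, lab in labeled if lab == label]
            centroids[label] = sum(values) / len(values)
        # Pick the unlabeled value with the smallest distance to any centroid.
        _, value, label = min(
            (abs(x - c), x, lab) for x in remaining for lab, c in centroids.items()
        )
        labeled.append((value, label))
        remaining.remove(value)
    return labeled

# Two labeled examples and six unlabeled ones (hypothetical data).
labeled = [(1.0, "small"), (9.0, "large")]
unlabeled = [2.0, 3.0, 4.0, 6.0, 7.0, 8.0]
result = dict(self_train(labeled, unlabeled))
print(result)
```

Only two of the eight values start with a label; the rest receive pseudo-labels as the centroids drift toward the unlabeled data.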
Reinforcement Machine Learning

In reinforcement machine learning, the algorithm learns by receiving feedback in the form of rewards or punishments based on its actions.

The algorithm then uses this feedback to adjust its behavior and improve performance.

It is similar to supervised learning, but the algorithm is not trained on sample data.

Instead, the model learns by trial and error.

Example

• Works by interacting with the environment

• No predefined data

• Learns a series of actions

• Self-driving cars, gaming, healthcare
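
Learning a series of actions from rewards alone can be illustrated with tabular Q-learning on a toy problem. The sketch below is an assumption-laden example, not a general implementation: the environment is a made-up corridor of five cells with a reward only at the right end, the agent explores with purely random actions, and the values of `ALPHA` and `GAMMA` are arbitrary choices.

```python
import random

N_STATES = 5               # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9    # learning rate and discount factor (assumed values)

def step(state, action):
    """Move one cell; reward 1 for reaching the goal cell, 0 otherwise."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(300):                      # episodes of trial and error
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)      # explore with random actions
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# Greedy policy: the action with the highest learned value in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

There is no dataset here: the only training signal is the reward the agent receives after acting, which is the defining feature of reinforcement learning.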

When to use: supervised vs. unsupervised learning or both?

You can use supervised learning techniques to solve problems that have known outcomes and labeled data available.

Examples include risk evaluation, sales forecasting, email spam classification, image recognition, and stock price prediction based on known historical data.

You can use unsupervised learning for scenarios where the data is unlabeled and the objective is to discover patterns, group similar instances, or detect anomalies.

You can also use it for exploratory tasks where labeled data is absent.

Examples include organizing large data archives, building recommendation systems, and grouping customers based on their purchasing behaviors.

Semi-supervised learning is when you apply both supervised and unsupervised learning techniques to a common problem. It is a category of machine learning in its own right.

You can apply semi-supervised learning when it is difficult to obtain labels for a dataset.

You might have a small volume of labeled data but a significant amount of unlabeled data.
Stages of Machine Learning

Data collection

Data collection is the initial step in the machine learning process.

Data is a fundamental part of machine learning: the quality and quantity of your data have direct consequences for model performance.

Data may be collected from different sources such as databases, text files, pictures, sound files, or web scraping.

Once collected, the data needs to be prepared for machine learning.

This means organizing the data in an appropriate format and making sure it is useful for solving your problem.

Data pre-processing

Pre-processing of data is a key step in the machine learning process.

It involves deleting duplicate data, fixing errors, handling missing data (either by removing it or by filling it in), and adjusting and formatting the data.

Pre-processing improves the quality of your data and ensures that your machine learning model can read it correctly.

This step may significantly improve the accuracy of your model.
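
Two of the pre-processing tasks described above, deleting duplicates and filling in missing values, can be sketched as follows. This is one simple strategy among many: missing entries (represented here as `None`) are replaced by the column mean, and the rows of (age, income) are hypothetical.

```python
def preprocess(rows):
    """Drop exact duplicate rows, then fill missing values (None) with the column mean."""
    # Remove duplicates while keeping the original row order.
    unique_rows = list(dict.fromkeys(rows))
    n_columns = len(unique_rows[0])
    column_means = []
    for col in range(n_columns):
        present = [row[col] for row in unique_rows if row[col] is not None]
        column_means.append(sum(present) / len(present))
    # Replace each missing entry by the mean of the values present in its column.
    return [
        tuple(column_means[col] if value is None else value
              for col, value in enumerate(row))
        for row in unique_rows
    ]

# Hypothetical raw records: (age, income); None marks a missing value.
raw = [(25, 50000.0), (32, None), (25, 50000.0), (None, 62000.0), (41, 58000.0)]
clean = preprocess(raw)
print(clean)
```

Filling with the mean keeps every row but can blur real differences; dropping incomplete rows is the usual alternative when data is plentiful.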

Choosing the right model

The next step is to select a machine learning model. Once the data is prepared, it is applied to a model such as linear regression, a decision tree, or a neural network.

The choice of model generally depends on the kind of data you are dealing with and on your problem.
The size and type of data, complexity, and computational resources should be taken into account when choosing a model to
..
Training the model

After you have chosen a model, the next step is to train it with the prepared data.

Training means feeding the data to the model and letting it adjust its parameters so that it predicts the output more accurately. Overfitting and underfitting must be avoided during training.
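
What "adjusting parameters" means can be shown with a small sketch: fitting a straight line by gradient descent on the mean squared error. The data, the learning rate, and the number of epochs below are assumptions chosen so that the example converges; real training loops usually rely on a library.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error of the current parameters on every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(xs, ys)
print(round(w, 3), round(b, 3))
```

Each pass over the data moves `w` and `b` slightly toward values that reduce the error, which is the adjustment the text describes.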

Evaluating the model

Once a model has been trained, it is important to assess its performance before deployment. This means testing the model on new data that it has not seen during training.

Common metrics include accuracy for classification problems, precision and recall for binary classification problems, and mean squared error for regression problems.
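
The metrics named above follow directly from their definitions, as the sketch below shows. The label and prediction lists are made-up test data, and the function names are chosen for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical held-out labels and model predictions (1 = positive class).
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
acc = accuracy(labels, predictions)
precision, recall = precision_recall(labels, predictions)
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(acc, precision, recall, mse)
```

Accuracy alone can mislead when one class is rare, which is why precision and recall are reported alongside it for binary problems.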

Hyperparameter tuning and optimization

After evaluating the model, you may need to adjust its hyperparameters to improve its performance.

Common techniques are grid search, where you try different combinations of hyperparameter values, and cross-validation, where you divide your data into subsets (folds), train the model on all but one fold, and validate it on the held-out fold, rotating through the folds to check that the model performs well on different data.
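
Grid search and cross-validation are often combined: every candidate value is scored by cross-validation and the best one is kept. The sketch below assumes a k-nearest-neighbour classifier on one-dimensional data, with the number of neighbours `k` as the only hyperparameter; the dataset, the grid, and all function names are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def k_fold_indices(n_samples, n_folds):
    """Split sample indices 0..n-1 into n_folds interleaved validation folds."""
    return [list(range(fold, n_samples, n_folds)) for fold in range(n_folds)]

def cross_val_accuracy(data, k, n_folds=4):
    """Average validation accuracy: train on all folds but one, test on the held-out fold."""
    scores = []
    for fold in k_fold_indices(len(data), n_folds):
        held_out = set(fold)
        train = [item for i, item in enumerate(data) if i not in held_out]
        correct = sum(knn_predict(train, data[i][0], k) == data[i][1] for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Hypothetical 1-D dataset: (feature value, class label).
data = [(1.0, "low"), (1.2, "low"), (1.9, "low"), (2.3, "low"),
        (2.8, "low"), (3.1, "low"), (6.9, "high"), (7.4, "high"),
        (7.7, "high"), (8.2, "high"), (8.8, "high"), (9.1, "high")]

grid = [1, 3, 5]                       # candidate hyperparameter values
results = {k: cross_val_accuracy(data, k) for k in grid}
best_k = max(results, key=results.get)
print(results, best_k)
```

Every sample is used for validation exactly once, so the score for each `k` does not depend on a single lucky or unlucky split.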

Predictions and deployment

Once the model has been trained and optimized, it is ready to make predictions on new data.
This is done by feeding new data to the model and using its output for decision-making or further analysis.
Overview of DM and KDD

Data mining is defined as the procedure of extracting hidden information from huge sets of data. In other words, data mining is mining knowledge from data.

Data Mining (DM) is a part of the KDD process relating to methods for extracting patterns from data [Fayyad].

Data Mining is a problem-solving methodology that finds a logical or mathematical description, of a complex nature, of patterns and regularities in a set of data [Decker and Focardi].

KDD (Knowledge Discovery in Databases) is a process that involves the extraction of useful, previously unknown, and potentially valuable information from large datasets. The KDD process is iterative and usually requires multiple passes through its steps to extract accurate knowledge from the data.

Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns in data [Fayyad].

DM: The non-trivial extraction of implicit, previously unknown and potentially useful knowledge from data.

Data Mining Process

Understand the Business
• Identify the company's and the project's objectives first, along with the problems that need to be addressed

Understand the Data
• Identify what type of data is needed to solve the issue and collect it from authentic sources; obtain access rights and prepare a data description report

Prepare the Data
• Clean the data: handle missing data, data errors, default values, and data corrections.
• Prepare the data in a format

Model the Data
• Employ algorithms to ascertain data patterns; create the model, test it, and validate it

Evaluation
• Validate models against business goals; change the model, adjust the business goal, or revisit the data, if needed

Deployment
• Generate business intelligence; continually monitor and maintain the data mining application
Why Data Mining?

Data mining is important to learn for several reasons:

Extracting Insights: Data mining techniques allow users to extract useful information and patterns from vast amounts of data.

Decision Making: Data mining contributes to the decision-making process. By analyzing historical data, businesses can predict future trends and outcomes with greater confidence.

Customer Understanding: By analyzing the behavior, preferences, and purchasing patterns of customers, data mining enables enterprises to gain a more accurate understanding of their clients.

Risk Management: By using data mining techniques to analyze patterns and anomalies in the data, businesses can identify possible risks or fraud.

Improved Efficiency: Data mining automatically discovers patterns and insights in data, which can greatly enhance the efficiency of operations.

Innovation: Analyzing data may reveal hidden patterns and relationships that lead to new product ideas, innovations, or business opportunities.

Personal Development: Knowledge of data mining enhances analytical and problem-solving skills.
Key Areas of Machine Learning
