MachineLearning in short

The document provides an overview of machine learning concepts, including definitions of key terms such as machine learning models, feature space, and training, validation, and testing sets. It explains the processes of classification and regression, emphasizing the importance of data in improving model accuracy. Additionally, it outlines steps for selecting a machine learning model based on problem definition and data analysis.

Uploaded by

mantineoalessio2

(Machine) Learning

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. (Tom Mitchell)

Machine Learning Model

A machine learning model can be regarded as a mathematical (or sometimes architectural) representation of a machine learning system. This includes, but is not restricted to, the representation of the input and output data, the algorithm involved in learning, the parameters that define the learning process, and the architecture of the model.

Population

The set of all possible examples relating to the experiment under consideration. Machine learning models try to approximate the distribution of this target population.

Feature Space (Input)

Feature space is the input space of the model, where the variables (other than the target variable that we want to predict) live. Features can be numeric or categorical. For example, the weight and speed of a car are numeric features. Whether the car is a Chevy or a Tesla is a categorical feature.

If you are describing a set of cars using their color, speed, make and model, then all possible values of these attributes form the feature space.

Feature Vector X

Each point in the feature space is referred to as an n-dimensional feature vector, where n is the number of features that define the particular data point.

If you were to define a car using some of its features, the list of features would form the feature vector. For example, [Orange colour, 280mph, 2800lbs] is a 3-dimensional feature vector with 3 features: colour, speed and weight.
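As a minimal sketch of how such a feature vector could be represented in code, the snippet below turns the car example into a list of numbers. The `Car` class, the `COLOURS` list and the one-hot encoding of the categorical colour feature are illustrative assumptions, not something the notes prescribe.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of colours; the categorical feature is one-hot encoded.
COLOURS = ["Orange", "Red", "Blue"]

@dataclass
class Car:
    colour: str        # categorical feature
    speed_mph: float   # numeric feature
    weight_lbs: float  # numeric feature

def to_feature_vector(car: Car) -> List[float]:
    """Encode a car as numbers: one-hot colour, then speed, then weight."""
    one_hot = [1.0 if car.colour == c else 0.0 for c in COLOURS]
    return one_hot + [car.speed_mph, car.weight_lbs]

x = to_feature_vector(Car(colour="Orange", speed_mph=280.0, weight_lbs=2800.0))
print(x)       # [1.0, 0.0, 0.0, 280.0, 2800.0]
print(len(x))  # 5
```

Note that the encoded vector has 5 entries even though the car has 3 features, because the single categorical feature expands into one entry per possible colour.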

Label Space (Output)

The set of labels or target variables associated with the feature vectors makes up the label space.

There can be various cars, such as Mustang GT, Roadster, Camaro, etc. All of these are labels for the sets of features that define them.

True label y

This is the actual label associated with one particular data point.

Example

An example is a data point including the features and the label. The examples available in the dataset at hand may not cover the whole data distribution.

Roadster[Orange colour, 280mph, 2800lbs] is one example from a data set. If we have a dataset of 100 cars that all belong to one company, it does not follow that we can make predictions about cars from other companies, as they might have a totally different data distribution. It would be really hard to make predictions about Ford cars based on data about Tesla cars.

Predicted label ŷ

It is the label predicted for a given feature vector by the machine learning model. It may or may not be correct.

The vector [Orange colour, 280mph, 2800lbs] can be predicted as a Roadster or an orange Beach Buggy.

Training set

The set of examples that are used to train a machine learning model.

Validation set

This is a subset of the training set (or sometimes a separate set) that is used to check the current state of the model during the training process. It does not directly contribute to training the model's parameters. The validation set can be used to tune the model's hyperparameters or to provide an evaluation metric for the model.

Testing set

The set of data points that are not made accessible to the model until it has been trained. It is used to test the trained state of the model.
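One possible way to produce the three sets is sketched below using only the Python standard library. The function name `train_val_test_split`, the 60/20/20 proportions and the fixed seed are assumptions chosen for illustration.

```python
import random

def train_val_test_split(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the examples and cut them into three disjoint lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Integers stand in for 100 (feature vector, label) examples.
examples = list(range(100))
train, val, test = train_val_test_split(examples)
print(len(train), len(val), len(test))  # 60 20 20
```

The test list is set aside first and is never touched while the model is being fitted or tuned, which mirrors the definition of the testing set above.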

[Figure: Different data splits]

Classification

Classification models categorize data into two (binary classifier) or more (multi-class classifier) classes.

We will try to understand how a model learns using a small dataset:

Let's plot this data set, such that each axis represents the values of one of the features:

Now we'll mark the given labels on the points:

If we try to separate the two classes using straight lines, there are theoretically infinitely many possible lines:

If we add another point, the set of possible lines shrinks:

Adding more points...

and more...

Now if we are given a new point from the same data distribution, we know how to classify that point based on our blue line. This blue line is the hypothesis obtained from the trained model.

We still cannot be sure that our blue line is the actual line dividing the original population. Consider the following line, which also separates the two classes of points.
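The idea that several different lines can separate the same training points can be sketched in code. The example below uses the perceptron rule as one possible way to learn a separating line; the notes do not name a specific algorithm, and the six points and their labels are made up for illustration.

```python
def predict(w, b, x):
    """Classify x by which side of the line w[0]*x1 + w[1]*x2 + b = 0 it falls on."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else -1

def train_perceptron(points, labels, epochs=1000):
    """Perceptron rule: nudge the line whenever a training point is misclassified."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if predict(w, b, x) != y:
                w[0] += y * x[0]
                w[1] += y * x[1]
                b += y
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w, b

# Hypothetical, linearly separable toy data with two features per point.
points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (4.0, 4.5), (5.0, 4.0), (4.5, 5.0)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_perceptron(points, labels)
learned_ok = all(predict(w, b, x) == y for x, y in zip(points, labels))

# A hand-picked line, x1 + x2 - 6 = 0, separates the same points as well.
other_ok = all(predict([1.0, 1.0], -6.0, x) == y for x, y in zip(points, labels))
print(learned_ok, other_ok)  # True True
```

Both lines agree on every training point, so the training data alone cannot tell us which of them is closer to the true boundary of the population.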

If we get access to even more points, the line may change its position again. That is why it is often said that more data in machine learning usually yields better results.

Regression

Linear regression is a linear approach to modeling the relationship between a scalar label and one or more explanatory features; with a single explanatory feature it is called simple linear regression. Usually, regression models are used to predict continuous values, like temperature, weight, interest rates, etc.

Consider another toy data set plotted on a graph:

The blue line describes approximately where the points from the distribution lie.

Now if we add some more points from the data distribution to the training set, the line changes altogether:

Any new point that we add from the same data distribution is expected to lie on or near this blue curve. Given the value of one of the two features, we can predict the value of the other feature based on its location on the curve.

For example, if we are given x1 = 3, then according to the curve, x2 = 1.
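A minimal sketch of fitting such a line is shown below, using ordinary least squares for a single feature. The five data points are invented for illustration and are not the toy data set from the notes, so the fitted values differ from the x1 = 3, x2 = 1 example above.

```python
def fit_line(x1s, x2s):
    """Ordinary least squares for x2 = slope * x1 + intercept."""
    n = len(x1s)
    mean_x1 = sum(x1s) / n
    mean_x2 = sum(x2s) / n
    sxy = sum((a - mean_x1) * (b - mean_x2) for a, b in zip(x1s, x2s))
    sxx = sum((a - mean_x1) ** 2 for a in x1s)
    slope = sxy / sxx
    intercept = mean_x2 - slope * mean_x1
    return slope, intercept

# Hypothetical noisy points that lie roughly along a straight line.
x1s = [1.0, 2.0, 3.0, 4.0, 5.0]
x2s = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(x1s, x2s)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")  # slope=1.97, intercept=0.09

# Predict x2 for a given x1 by reading it off the fitted line.
x2_at_3 = slope * 3.0 + intercept
print(f"{x2_at_3:.2f}")  # 6.00
```

The training points do not sit exactly on the fitted line, which is why a prediction is better read as an estimate than as an exact value.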

Training

As the model kept seeing new points, the position of the line kept shifting. This is the process of learning (in the case of supervised learning). In general, the more points we get, the better the learning process and the better the model's accuracy.

Hypothesis

It is a function (or model) that we believe is as close as possible to the true function (or model) that describes the data. In our classification example, the blue line we obtained is one such hypothesis: it describes the data such that all points on one side of the line belong to the same class.

Hypothesis space

Hypothesis space is the set of all possible models or functions that can be represented by n features, not necessarily describing the data. The function that approximates the target function has to be selected from this hypothesis space. Given 2 variables, there are infinitely many possible curves in 2 dimensions.

Heuristic

It can be considered as a simple hypothesis space or a decision that intuitively helps us in ultimately selecting the right model or function. For example, in our classification example, we intuitively decided to select different forms of straight lines, and not circles or squares, because it was evident that a line would be sufficient to separate the points. That was our heuristic. We could have selected circles to engulf the 2 classes, but that would have restricted the test space to just those circles obtained from the training data. Had we selected other shapes, it could have taken more tries to arrive at the final line. The right choice of heuristic helps in arriving at the target function quicker.

Target Function

Target function is the function that actually represents the original data distribution. If we had access to all the possible data points, we could train a model to learn the target function.

Parameters

Model parameters are internal variables whose values can be determined from the data. During training, model parameter values get updated.
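To make the idea of parameters being updated during training concrete, here is a small sketch that fits a line y = w * x + b with gradient descent on the mean squared error. Gradient descent is only one possible training procedure and is not named in the notes; the data, the learning rate and the step count are assumptions.

```python
def train(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # the model parameters, starting from an arbitrary guess
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Each step moves the parameters a little in the direction that lowers the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical data generated exactly from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = train(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")  # w=2.000, b=1.000
```

Here `w` and `b` are the parameters learned from the data, while `learning_rate` and `steps` are settings chosen by hand before training starts.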

Selecting a Machine Learning model

Now that we know some of the basic terminology involved, let us see how to select a machine learning model:

1. Define a problem statement. Is it a classification problem or a regression problem?

2. Obtain the data required.

3. Choose a heuristic based on initial data analysis.

4. Choose a machine learning model based on the heuristic and the task at hand.
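The steps above can be sketched for a small regression problem. In this illustration the final choice is made by comparing two candidate models on a validation set, which is one common approach rather than the procedure the notes prescribe; the data and the two candidates (a constant predictor and a straight line) are assumptions.

```python
def fit_constant(xs, ys):
    """Candidate 1: always predict the mean label seen in training."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Candidate 2: least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Steps 1 and 2: a regression problem with hypothetical training and validation data.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]
val_x, val_y = [1.5, 3.5], [4.1, 7.9]

# Steps 3 and 4: the heuristic narrows the choice to two candidates; keep the one
# with the lower error on the validation set.
candidates = {"constant": fit_constant, "line": fit_line}
scores = {name: mean_squared_error(fit(train_x, train_y), val_x, val_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # line
```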
