14-004-1 Machine Learning

The document introduces key concepts in Machine Learning, covering its history, definitions, and the distinction between machine learning and artificial intelligence. It explains supervised and unsupervised learning methodologies, including regression and classification tasks, along with common algorithms used in each. Additionally, it outlines a compulsory task with questions related to real-world applications of machine learning.


TASK

Machine Learning
Introduction
WELCOME TO THE MACHINE LEARNING TASK!

This task introduces you to the most important concepts in Machine Learning,
giving you a general overview of the landscape and preparing you to learn about
the intention and theory behind specific supervised and unsupervised learning
algorithms.

The history of machine learning

For decades, visions of machines that can learn the way humans can have captured the
imagination of science-fiction authors and researchers alike. But only in recent years have
machine learning programs been developed that can be applied on a wide scale,
influencing our daily lives.

Machine learning programs are working behind the scenes to produce and curate our
playlists, news feeds, weather reports, and email inboxes. They help us find restaurants,
translate documents, and even meet potential dates. From a business perspective,
machine learning-based software is becoming central to many industries, generating
demand for experts.

In 1959, Arthur Samuel, a pioneer in artificial intelligence and gaming, defined machine
learning as the "field of study that gives computers the ability to learn without being
explicitly programmed." He is known for developing a program capable of playing checkers.
Samuel never programmed exactly which strategies the system could use. Instead, he
devised a way in which the program could learn such strategies through the experience of
playing thousands of games.

In the 1950s, computers were expensive and not very powerful, so machine learning
algorithms were mostly an object of theoretical research. Now that computers are vastly
more powerful and more affordable, machine learning has become a very active field of
study with a variety of real-world applications.
DEFINING MACHINE LEARNING

At its essence, machine learning (ML) can be defined as a computational
methodology focused on deriving insights from data. It enables computers to
acquire knowledge from past observations and independently make predictions or
decisions without relying on explicit programming instructions. By leveraging
data-driven patterns and algorithms, machine learning enables automated
systems to adapt and improve their performance over time.

Unveiling the line between machine learning and artificial intelligence

The term “machine learning” is often used interchangeably with the term “artificial
intelligence” (AI). While the two are very much related, they are not the same thing.
There is much debate about the difference between the two, but a simple way to
look at it for our purposes is to see Machine Learning as a type of artificial
intelligence. Any program that completes a task in a way that can be considered
human-like can be considered an example of artificial intelligence, but only
programs that solve the task by learning without pre-programming are machine
learning programs.

Did you know that a London-based company called Google DeepMind has developed an
artificial-intelligence-based gamer that can play 49 Atari 2600 video games and beat a
professional human player’s top score in 23 of them? Yes, you read that right!

According to an article (link provided below), “The software isn’t told the rules of the game.
Instead, it uses an algorithm called a deep neural network to examine the state of the
game and figure out which actions produce the highest total score.”

One of the most impressive, and probably the eeriest, examples is that in the boxing game,
the software learned how to pin its opponent on the ropes (something only seasoned
players of the game knew how to do) and release a barrage of punches until its opponent
was knocked out! Extremely ruthless, right? Give it a read here.
INPUT AND OUTPUT

Whatever it is that we want a machine learning algorithm to learn, we first need to
express it numerically. The machine-readable version of a task consists of an input
and an output. The input is whatever we want the algorithm to learn from, and the
output is the outcome we want the algorithm to be able to produce. An example
of an input would be the budget or number of awards a movie receives. An
example of an output would be the box office sales of that movie.

Since machine learning is a young field that overlaps with several other disciplines,
including statistics, the input and output may be referred to by several other
names.

For input, these include:

● features (named after the fact that inputs typically ‘describe’ something),
● independent variables, and
● explanatory variables (because the output is usually assumed to depend on
or be explained by the input).

For output, alternate terms are:

● labels,
● predictions,
● dependent variables, and
● response variables.
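To make this concrete, here is a minimal sketch of how the movie example above might be expressed numerically; every figure is invented purely for illustration. Each input row holds the features for one movie, and each output entry holds its label:

```python
# Hypothetical movie data: all figures are invented for illustration.
# Each inner list is one movie's features: [budget in dollars, awards won].
features = [
    [50_000_000, 3],
    [10_000_000, 0],
    [120_000_000, 7],
]

# One label (output) per movie: its box office sales in dollars.
labels = [150_000_000, 8_000_000, 400_000_000]

# Every input row must line up with exactly one output value.
assert len(features) == len(labels)
```
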

Once we clearly understand the input-output specifics of machine learning
models, we can explore different learning algorithms. While there are various types
of learning methodologies, our main focus in this context will be on supervised and
unsupervised learning. These two approaches are fundamental in the field of
machine learning, as they involve training models using labelled or unlabelled
data, respectively.

SUPERVISED LEARNING

In supervised learning problems, a program predicts an output given an input by
learning from pairs of inputs and outputs (labels); that is, the program learns from
examples that have had the right answers assigned to them beforehand. These
assignments are often called annotations. Because they are considered the
correct answers, they are also called gold labels, gold data, or the gold standard.
The collection of data examples used in supervised learning is called a training set.
A collection of examples used to assess a program's performance is called a test
set. Like a student in a language course that teaches only through exposure, a
supervised learning program sees a collection of correct answers to various
questions and must then learn to provide the correct answers to new but similar
questions.

Continuing our exploration, we will delve into two common types of supervised
learning: regression and classification, which offer valuable tools for predicting
continuous values and categorising data into distinct classes.

Regression

Regression is a prediction task where a program learns to estimate and predict a
continuous output value. It does this by analysing pairs of input features and their
corresponding outputs in a training set. By analysing the training examples, the
program tries to identify patterns and associations that allow it to make precise
estimations.

The main objective of regression is to understand the relationship between the
input variables and a continuous target variable, enabling the program to make
accurate predictions for new inputs that are similar to the training data.
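As a small illustration of what a regression learner does, the sketch below fits a straight line y = a·x + b to toy data by ordinary least squares; the data and variable names are made up for this example:

```python
# Toy data: ys is roughly 2 * xs, with a little noise added by hand.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.0, 6.1, 7.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares for a single feature:
# slope = covariance(x, y) / variance(x), intercept from the means.
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

print(a, b)  # slope close to 2, intercept close to 0
```

Once fitted, predicting for a new input is just `a * x_new + b`, which is exactly the "continuous output value" the text describes.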

Commonly used metrics to assess the accuracy of a regression model, which allow
for comparing different models or evaluating the performance of a single model,
are:

● R-squared (R2),
● mean squared error (MSE),
● root mean squared error (RMSE),
● mean absolute error (MAE),
● mean absolute percentage error (MAPE).

R2 is known as the coefficient of determination, quantifying the proportion of
variance in the target variable that can be explained by the features in the model. It
ranges from 0 to 1, with higher values indicating a better fit.

MSE measures the average squared difference between predicted and actual
values, providing an overall measure of prediction accuracy.

RMSE is the square root of MSE and represents the average magnitude of
prediction errors.

Another metric that provides a measure of the average magnitude of errors is
MAE, which calculates the average absolute difference between predicted and
actual values.

Finally, MAPE measures the average percentage difference between predicted and
actual values, which is particularly useful when the magnitude of errors needs to
be assessed relative to the actual values. By default, lower values of MSE, RMSE,
MAE, and MAPE indicate better model performance.
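These metrics can all be computed directly from their definitions. The following sketch does so in plain Python on made-up predictions (libraries such as scikit-learn provide equivalent ready-made functions):

```python
# Made-up actual and predicted values for illustration.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 10.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]

mse = sum(e ** 2 for e in errors) / n   # mean squared error
rmse = mse ** 0.5                       # root mean squared error
mae = sum(abs(e) for e in errors) / n   # mean absolute error
mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n * 100  # in percent

# R-squared: 1 minus (residual sum of squares / total sum of squares).
mean_y = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((t - mean_y) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print(mse, rmse, mae, r2)  # 0.375, ~0.612, 0.5, 0.925
```

Note that MSE, RMSE, MAE, and MAPE all shrink as predictions improve, while R2 grows towards 1, matching the rule of thumb in the text.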

Classification

Unlike regression tasks, a classification task requires a program to be trained to
categorise input data into predefined classes or categories.

By analysing labelled examples, where each example is already assigned to a
specific class, the program learns patterns and relationships between input
features and classes. This knowledge allows the program to accurately classify new,
unseen data, ensuring they are correctly assigned to their respective classes. The
ultimate goal of classification is to develop a model that can make reliable
predictions for unknown instances, effectively categorising them into the
appropriate classes.

To evaluate the effectiveness and performance of classification models, several
commonly used evaluation metrics are available. One such metric is the Gini index,
which measures the impurity or disorder in a set of categorical data. The confusion
matrix is another valuable tool that summarises the model's performance by
displaying counts of true positives, true negatives, false positives, and false
negatives. Additionally, precision and the F1 score can be important metrics too.
Precision calculates the proportion of true positive predictions among all positive
predictions, indicating the model's ability to minimise false positives. The F1 score,
as the harmonic mean of precision and recall, assesses the model's performance
by considering both of these metrics. Together, these metrics provide a
comprehensive evaluation of classification models.
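As a minimal sketch of how the confusion-matrix counts, precision, recall, and F1 score relate, the plain-Python example below uses invented predictions for a binary task (1 = positive class):

```python
# Invented binary labels and predictions for illustration.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)  # how many predicted positives were correct
recall = tp / (tp + fn)     # how many actual positives were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(tp, tn, fp, fn, precision, recall, f1)
```

The four counts are exactly the cells of the confusion matrix; precision and F1 then fall out of them with one line each.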

Supervised learning algorithms

Finally, we offer a list of common supervised learning algorithms and their typical
usage:
Supervised learning algorithm    | Regression | Classification
---------------------------------|------------|---------------
Linear regression                |     ✔      |
Logistic regression              |            |       ✔
Decision tree                    |     ✔      |       ✔
Random forest                    |     ✔      |       ✔
Support vector machines (SVM)    |     ✔      |       ✔
Naïve Bayes                      |            |       ✔
K-nearest-neighbour (KNN)        |     ✔      |       ✔
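To give a feel for how one of the algorithms above works, here is a from-scratch sketch of a nearest-neighbour classifier (KNN with k = 1) on made-up two-dimensional points; real projects would typically use a library implementation instead:

```python
# Made-up labelled training points in 2-D, belonging to classes "A" and "B".
train_X = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.0), (4.2, 3.9)]
train_y = ["A", "A", "B", "B"]

def predict(point):
    """Return the label of the closest training point (1-nearest-neighbour)."""
    def sq_dist(other):
        return (point[0] - other[0]) ** 2 + (point[1] - other[1]) ** 2
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i]))
    return train_y[best]

print(predict((1.1, 0.9)), predict((4.1, 4.0)))  # → A B
```

There is no separate "training" step here: KNN simply memorises the training set and defers all work to prediction time, which is why it can serve both classification (majority label) and regression (average of neighbours' values).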

UNSUPERVISED LEARNING

In unsupervised learning, a program does not learn from labelled data. Instead, it
attempts to discover patterns in the data on its own.

For example, suppose you have two classes scattered in a 2-dimensional space (as
in the first of the images below) and you want to separate the two data sets (as in
the second image on the right-hand side). Unsupervised learning finds underlying
patterns in the data, allowing the classes to be separated.

To highlight the difference between supervised and unsupervised learning,
consider the following example. Assume that you have collected data describing
the heights and weights of people. An unsupervised clustering algorithm might
produce groups that correspond to men and women, or children and adults. An
example of a supervised learning problem is if we label some of the data with the
person’s sex and then try to induce a rule to predict whether a person is male or
female based on their height and weight.

Unsupervised learning algorithms

The following algorithms have proven to be highly valuable in practical
applications, making them some of the most commonly used methods in
unsupervised learning:

● K-means clustering
● Hierarchical clustering
● t-Distributed Stochastic Neighbour Embedding (t-SNE)
● Gaussian Mixture Models (GMM)
● Autoencoders
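As an illustration of the first algorithm in the list, the sketch below runs a bare-bones k-means (k = 2) on made-up one-dimensional data; it omits practical details such as handling empty clusters or choosing good starting centroids:

```python
# Made-up 1-D data with two obvious groups, and two rough starting centroids.
data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids = [0.0, 10.0]

for _ in range(10):  # a fixed number of assign/update rounds
    clusters = [[], []]
    for x in data:
        # Assign each point to its nearest centroid.
        nearest = min(range(2), key=lambda i: abs(x - centroids[i]))
        clusters[nearest].append(x)
    # Move each centroid to the mean of its assigned points
    # (assumes no cluster ends up empty, which holds for this data).
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # → roughly [1.0, 8.0]
```

Note that no labels appear anywhere: the algorithm discovers the two groups purely from the structure of the data, which is the defining trait of unsupervised learning.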

SEMI-SUPERVISED LEARNING

Semi-supervised learning is an approach that combines labelled and unlabelled
data to harness the benefits of both supervised and unsupervised learning. While
supervised learning relies on labelled data with known outcomes and
unsupervised learning explores unlabelled data to identify patterns,
semi-supervised learning uses a smaller set of labelled data to guide the learning
process. Simultaneously, it utilises a larger set of unlabelled data containing
valuable information.

By leveraging the combined dataset, the algorithm learns from the labelled
examples and applies that knowledge to predict outcomes for the unlabelled data,
revealing additional patterns and enhancing the model's understanding of the
problem's underlying structure. This approach is particularly advantageous when
acquiring labelled data is expensive or time-consuming, allowing for optimal
resource utilisation and the potential for improved results compared to using just
one data type.
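The idea can be sketched in a few lines of self-training, one common semi-supervised scheme: a model trained on the small labelled set assigns pseudo-labels to unlabelled points, which are then folded into the training data. The toy rule below (nearest labelled value on a 1-D feature) and all data are invented for illustration:

```python
# Invented data: a 1-D feature with two labelled points and three unlabelled.
labelled = [(1.0, "spam"), (9.0, "ham")]
unlabelled = [1.5, 8.5, 2.0]

for x in unlabelled:
    # Pseudo-label x with the label of the nearest labelled example,
    # then add it to the labelled pool so later points can learn from it.
    _, label = min(labelled, key=lambda pair: abs(x - pair[0]))
    labelled.append((x, label))

print(labelled[2:])  # the three newly pseudo-labelled points
```

In practice a real classifier replaces the nearest-value rule, and only pseudo-labels the model is confident about are kept, but the loop structure is the same.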
Compulsory Task 1

Answer the following in the provided document titled machine_learning.ipynb.

1. For each of the following examples, describe at least one possible input and
one possible output. Justify your answers:
1.1. A self-driving car
1.2. Netflix recommendation system
1.3. Signature recognition
1.4. Medical diagnosis

2. For each of the following case studies, determine whether it is appropriate
to utilise regression or classification machine learning algorithms. Justify
your answers:
2.1. Classifying emails as promotional or social based on their content and
metadata.
2.2. Forecasting the stock price of a company based on historical data and
market trends.
2.3. Sorting images of animals into different species based on their visual
features.
2.4. Predicting the likelihood of a patient having a particular disease based
on medical history and diagnostic test results.

3. For each of the following real-world problems, determine whether it is
appropriate to utilise a supervised or unsupervised machine learning
algorithm. Justify your answers:
3.1. Detecting anomalies in a manufacturing process using sensor data
without prior knowledge of specific anomaly patterns.
3.2. Predicting customer lifetime value based on historical transaction data
and customer demographics.
3.3. Segmenting customer demographics based on their purchase history,
browsing behaviour, and preferences.
3.4. Analysing social media posts to categorise them into different themes.

4. For each of the following real-world problems, determine whether it is
appropriate or inappropriate to utilise semi-supervised machine learning
algorithms. Justify your answers:
4.1. Predicting fraudulent financial transactions using a dataset where
most transactions are labelled as fraudulent or legitimate.
4.2. Analysing customer satisfaction surveys where only a small portion of
the data is labelled with satisfaction ratings.
4.3. Identifying spam emails in a dataset where the majority of emails are
labelled.
4.4. Predicting the probability of default for credit card applicants based on
their complete financial and credit-related information.

HyperionDev strives to provide internationally-excellent course content that helps you
achieve your learning outcomes.

Think that the content of this task, or this course as a whole, can be improved, or think
we’ve done a good job?

Click here to share your thoughts anonymously.
