Linear Algebra For Machine Learning
PRESENTATION BY UPLATZ
https://training.uplatz.com
Email: [email protected]
Phone: +44 7836 212635
CONTENT
➢ Basic introduction to linear algebra and mathematical equations for Machine Learning
➢ Matrices
➢ Vectors in linear algebra
What is Linear Algebra?
Scalar
A scalar is simply a single number, for example 24.
Vector
A Vector is an ordered array of numbers and can be in a row or a column.
Matrix
A Matrix is an ordered 2D array of numbers and it has two indices. The first one
points to the row and the second one to the column.
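The three definitions above can be illustrated with a minimal NumPy sketch. The variable names and the example values other than 24 are chosen here purely for illustration.

```python
import numpy as np

# A scalar: a single number.
scalar = 24

# A vector: an ordered array of numbers, written as a row or as a column.
row_vector = np.array([[1, 2, 3]])         # 1 row, 3 columns
column_vector = np.array([[1], [2], [3]])  # 3 rows, 1 column

# A matrix: an ordered 2D array with a row index and a column index.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])             # 2 rows, 3 columns

print(np.ndim(scalar))       # number of dimensions of the scalar
print(row_vector.shape)      # (rows, columns) of the row vector
print(column_vector.shape)   # (rows, columns) of the column vector
print(matrix.shape)          # (rows, columns) of the matrix
```

The shape tuple always lists the number of rows first and the number of columns second, matching the order of the two matrix indices.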
Matrix
A matrix is a way of writing similar things together so that they can be handled and manipulated easily. In Data Science, it
is generally used to store information such as the weights of an Artificial Neural Network while training various algorithms. You will be able
to understand my point by the end of this article.
Technically, a matrix is a 2-D array of numbers (as far as Data Science is concerned). For example, look at the matrix A below.
Generally, rows are denoted by ‘i’ and columns by ‘j’. Each element is indexed by its ‘i’th row and ‘j’th column. We
denote the matrix by a letter, e.g. A, and its elements by A(ij).
In the above matrix, whose first row is 1 2 3, A12 = 2.
To reach the result, go along the first row until you reach the second column.
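The lookup of A12 can be sketched in NumPy as follows. The 3x3 matrix is an assumed example (only the value A12 = 2 comes from the text), and the helper `element` is a hypothetical name introduced here. Note that NumPy counts rows and columns from 0, while the A(ij) notation counts from 1.

```python
import numpy as np

# Assumed example matrix; only A12 = 2 is taken from the text.
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

def element(matrix, i, j):
    """Return the element in row i, column j using 1-based indices."""
    return matrix[i - 1, j - 1]

a12 = element(A, 1, 2)   # first row, second column
print(a12)
print(A.shape)           # order of the matrix as (rows, columns)
```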
Order of matrix
7 8 9
Square matrix
Diagonal matrix
Upper triangular matrix
Lower triangular matrix
Scalar matrix
Identity matrix
Matrix
Column matrix
Row matrix
Trace
What is an identity
matrix?
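An identity matrix is a square matrix with ones on the main diagonal and zeros everywhere else, so multiplying any matrix by it leaves that matrix unchanged. The matrix types named above can be sketched with standard NumPy helpers; the matrix `M` and all values below are illustrative assumptions.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])                     # square matrix of order 3 x 3

identity = np.eye(3, dtype=int)               # ones on the diagonal, zeros elsewhere
diagonal = np.diag([1, 2, 3])                 # non-zero entries only on the diagonal
scalar_matrix = 5 * np.eye(3, dtype=int)      # diagonal matrix with equal diagonal entries
upper = np.triu(M)                            # entries below the diagonal set to zero
lower = np.tril(M)                            # entries above the diagonal set to zero
row_matrix = M[:1, :]                         # a single row, shape (1, 3)
column_matrix = M[:, :1]                      # a single column, shape (3, 1)
trace = np.trace(M)                           # sum of the diagonal entries: 1 + 5 + 9

print(identity @ M)   # multiplying by the identity returns M unchanged
print(trace)
```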
Introduction to Machine Learning
Machine learning uses various algorithms for building
mathematical models and making predictions using historical
data or information.
What is Machine Learning?
Field of study that gives computers the capability to learn without
being explicitly programmed.
How is it different from traditional
programming?
In traditional programming, we would feed the input data and a well written and
tested program into a machine to generate output. When it comes to machine
learning, input data along with the output associated with the data is fed into the
machine during the learning phase, and it works out a program for itself.
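The contrast can be made concrete with a toy sketch. In the first function a person writes the rule by hand; in the second part only inputs and their known outputs are supplied, and a least-squares line fit (used here as a stand-in for a learning algorithm) works out the rule. The temperature conversion task and all names are assumptions made for illustration.

```python
import numpy as np

# Traditional programming: the rule is written by hand.
def celsius_to_fahrenheit(celsius):
    return 1.8 * celsius + 32.0

# Machine learning: inputs and their associated outputs are supplied,
# and the "program" (here, a slope and an intercept) is learned from them.
inputs = np.array([-10.0, 0.0, 15.0, 30.0, 100.0])
outputs = np.array([14.0, 32.0, 59.0, 86.0, 212.0])

slope, intercept = np.polyfit(inputs, outputs, deg=1)

def learned_program(celsius):
    return slope * celsius + intercept

print(round(slope, 3), round(intercept, 3))
print(learned_program(37.0), celsius_to_fahrenheit(37.0))
```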
Why do we need Machine
Learning?
➢ Machine Learning can automate many tasks, especially the ones
that only humans can perform with their innate intelligence.
➢ With the help of Machine Learning, businesses can automate
routine tasks. It also helps to automate and quickly create
models for data analysis.
➢ Image recognition, text generation, and many other use-cases are
finding applications in the real world.
Features of Machine Learning
➢ Automation: Your Gmail account has a spam folder that
contains all the spam emails. You might wonder how Gmail knows
that all these emails are spam. This is the work of machine learning,
which recognizes spam emails automatically.
➢ Improved customer experience: For any business, one of the most crucial
ways to drive engagement, promote brand loyalty and establish long-lasting
customer relationships is by providing a customized experience and providing
better services.
➢ Automated data visualization: With the help of user-friendly automated data
visualization platforms such as AutoViz, businesses can obtain a wealth of new
insights in an effort to increase productivity in their processes.
➢ Business intelligence.
Transpose
What is Machine Learning
Machine Learning is a subset of artificial intelligence that is mainly
concerned with the development of algorithms which allow a computer to learn
from data and past experience on its own. The term machine learning was
first introduced by Arthur Samuel in 1959.
A machine has the ability to learn if it can improve its performance by gaining
more data.
A Machine Learning system learns from historical data, builds the prediction
models, and whenever it receives new data, predicts the output for it.
Features of Machine Learning:
➢ Machine learning uses data to detect various patterns in a given
dataset.
➢ It can learn from past data and improve automatically.
➢ It is a data-driven technology.
➢ Machine learning is very similar to data mining, as it also deals
with huge amounts of data.
Need for Machine Learning:
The importance of machine learning can be easily understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud
detection, face recognition, friend suggestions on Facebook, and more. Various top
companies such as Netflix and Amazon have built machine learning models that
use vast amounts of data to analyze user interests and recommend products
accordingly.
Types of Machine Learning
Machine learning is broadly divided into three
categories:
➢ Supervised Learning
➢ Unsupervised Learning
➢ Reinforcement Learning
What is Supervised Learning?
➢ Supervised learning is a type of machine learning method in
which we provide sample labelled data to the machine learning
system in order to train it, and on that basis, it predicts the output.
➢ The goal of supervised learning is to map input data to the
output data. Supervised learning is based on supervision,
much as a student learns under the supervision of
a teacher. An example of supervised learning is spam filtering.
➢ Supervised learning can be further grouped into two categories of
algorithms:
➢ Classification
➢ Regression
What is Supervised Learning?
Let us start with an easy example, say you are teaching a kid to
differentiate dogs from cats. How would you do it?
What is Supervised Learning?
4. Product recommendations:
5. Self-driving cars:
A 38 48000 No
B 43 45000 Yes
C 30 54000 No
D 48 65000 No
E 40 Yes
F 35 58000 Yes
Need of Dataset
During the development of the ML project, the developers
completely rely on the datasets. In building ML applications, datasets
are divided into two parts:
Training dataset:
Test Dataset
Below is a list of sources of datasets that are freely available for the public
to work on:
1. Kaggle Datasets
2. UCI Machine Learning Repository
3. Datasets via AWS
4. Google's Dataset Search Engine
Data Pre-processing in Machine
learning
Data pre-processing is a process of preparing the raw data and
making it suitable for a machine learning model. It is the first and
crucial step while creating a machine learning model.
It involves the following steps:
Getting the dataset
Importing libraries
Importing datasets
Finding Missing Data
Encoding Categorical Data
Splitting dataset into training and test set
Feature scaling
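The last four steps in this list can be sketched with pandas alone. Everything below is a minimal illustration under stated assumptions: the table is a small hypothetical dataset, the column names are invented, missing values are filled with the column mean, categories are one-hot encoded, 20% of the rows are held out for testing, and scaling is plain standardization. Other choices are equally valid for each step.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with two missing values (np.nan).
data = pd.DataFrame({
    "Country": ["India", "France", "Germany", "France", "Germany", "India"],
    "Age": [38.0, 43.0, 30.0, 48.0, 40.0, np.nan],
    "Salary": [68000.0, 45000.0, 54000.0, 65000.0, np.nan, 58000.0],
    "Purchased": ["No", "Yes", "No", "No", "Yes", "Yes"],
})
numeric_cols = ["Age", "Salary"]

# Finding missing data: count the gaps, then fill each with its column mean.
missing_counts = data.isna().sum()
data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].mean())

# Encoding categorical data: one 0/1 column per country, 0/1 for the target.
features = pd.get_dummies(data[["Country", "Age", "Salary"]],
                          columns=["Country"], dtype=float)
target = data["Purchased"].map({"No": 0, "Yes": 1})

# Splitting into training and test sets: shuffle the rows, hold out 20%.
shuffled = features.sample(frac=1.0, random_state=0).index
n_test = max(1, int(0.2 * len(features)))
test_idx, train_idx = shuffled[:n_test], shuffled[n_test:]
x_train, x_test = features.loc[train_idx], features.loc[test_idx]
y_train, y_test = target.loc[train_idx], target.loc[test_idx]

# Feature scaling: standardize using statistics from the training set only.
mean = x_train[numeric_cols].mean()
std = x_train[numeric_cols].std(ddof=0)
x_train_scaled, x_test_scaled = x_train.copy(), x_test.copy()
x_train_scaled[numeric_cols] = (x_train[numeric_cols] - mean) / std
x_test_scaled[numeric_cols] = (x_test[numeric_cols] - mean) / std

print(missing_counts)
print(x_train_scaled)
```

The scaling statistics are computed on the training rows only so that no information from the test set leaks into training.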
1) Get the Dataset
To create a machine learning model, the first thing we require is a
dataset, as a machine learning model works entirely on data. The
data collected for a particular problem, in a proper format, is known as
the dataset.
2) Importing Libraries
In order to perform data pre-processing using Python, we need to
import some predefined Python libraries. These libraries are used to
perform some specific jobs. There are three specific libraries that we
will use for data pre-processing, which are:
3) Importing the Datasets
Now we need to import the datasets which we have collected for our
machine learning project. But before importing a dataset, we need to
set the current directory as a working directory.
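A minimal sketch of this step with pandas follows. The file name `Data.csv`, the column names, and the three sample rows are assumptions; an in-memory string stands in for the file so that the example runs anywhere.

```python
import io
import os

import pandas as pd

# The working directory is where relative file names are looked up.
print(os.getcwd())

# Stand-in for the contents of a CSV file (hypothetical columns and rows).
csv_text = """Country,Age,Salary,Purchased
India,38,68000,No
France,43,45000,Yes
Germany,30,54000,No
"""

# With a real file in the working directory this would be:
#     data_set = pd.read_csv("Data.csv")   # hypothetical file name
data_set = pd.read_csv(io.StringIO(csv_text))

print(data_set.shape)    # (number of rows, number of columns)
print(data_set.head())   # first few rows
```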
Extracting dependent and
independent variables:
Extracting independent variable:
To extract the independent variables, we will use the iloc[ ] indexer of the Pandas
library. It is used to extract the required rows and columns from the dataset.
x = data_set.iloc[:, :-1].values
In the above code, the first colon (:) takes all the rows, and the
second part (:-1) takes all the columns except the last one. We exclude
the last column because it contains the dependent variable. So
by doing this, we get the matrix of features.
By executing the above code, we will get output as:
[['India' 38.0 68000.0]
 ['France' 43.0 45000.0]
 ['Germany' 30.0 54000.0]
 ['France' 48.0 65000.0]
 ['Germany' 40.0 nan]
 ['India' 35.0 58000.0]
 ['Germany' nan 53000.0]
 ['France' 49.0 79000.0]
 ['India' 50.0 88000.0]
 ['France' 37.0 77000.0]]
Extracting dependent variable:
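The dependent variable can be extracted in the same way, by selecting only the last column. The sketch below rebuilds the feature values from the output shown above; the column names and the values in the last column are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

data_set = pd.DataFrame({
    "Country": ["India", "France", "Germany", "France", "Germany",
                "India", "Germany", "France", "India", "France"],
    "Age": [38, 43, 30, 48, 40, 35, np.nan, 49, 50, 37],
    "Salary": [68000, 45000, 54000, 65000, np.nan,
               58000, 53000, 79000, 88000, 77000],
    # Assumed labels for the dependent variable.
    "Purchased": ["No", "Yes", "No", "No", "Yes",
                  "Yes", "No", "Yes", "No", "Yes"],
})

# All rows, every column except the last: the matrix of features.
x = data_set.iloc[:, :-1].values

# All rows, only the last column: the dependent variable.
y = data_set.iloc[:, -1].values

print(x.shape)
print(y)
```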
➢ Spam Filtering,
➢ Random Forest
➢ Decision Trees
➢ Logistic Regression
➢ Support vector Machines
Unsupervised Machine Learning
As the name suggests, unsupervised learning is a machine learning
technique in which models are not supervised using a training dataset.
Instead, the models themselves find the hidden patterns and insights in the
given data. It can be compared to the learning that takes place in the
human brain while learning new things. It can be defined as:
Supervised Learning | Unsupervised Learning
Supervised learning algorithms are trained using labelled data. | Unsupervised learning algorithms are trained using unlabelled data.
A supervised learning model takes direct feedback to check whether it is predicting the correct output. | An unsupervised learning model does not take any feedback.
A supervised learning model predicts the output. | An unsupervised learning model finds the hidden patterns in data.
In supervised learning, input data is provided to the model along with the output. | In unsupervised learning, only input data is provided to the model.
The goal of supervised learning is to train the model so that it can predict the output when it is given new data. | The goal of unsupervised learning is to find the hidden patterns and useful insights in an unknown dataset.
Supervised learning needs supervision to train the model. | Unsupervised learning does not need any supervision to train the model.
Supervised learning can be categorized into Classification and Regression problems. | Unsupervised learning can be categorized into Clustering and Association problems.
Supervised learning can be used where we know the inputs as well as the corresponding outputs. | Unsupervised learning can be used where we have only input data and no corresponding output data.
A supervised learning model generally produces more accurate results. | An unsupervised learning model may give less accurate results than supervised learning.
Supervised learning is not close to true Artificial Intelligence, as we first train the model for each data, and then only ... | Unsupervised learning is closer to true Artificial Intelligence, as it learns in the way a child learns daily routine things by his ...
It includes various algorithms such as Linear Regression, Logistic Regression, Support Vector Machine, Multi-class ... | It includes various algorithms such as Clustering and the Apriori algorithm.
➢ Supervised Learning,
➢ Unsupervised Learning,
➢ Reinforcement Learning.
➢ Machine Learning Lifecycle.
Supervised Learning
Supervised learning, as the name indicates, involves a
supervisor acting as a teacher. Basically, supervised learning is learning in
which we teach or train the machine using data which is well
labelled, meaning that the data is already tagged with the correct
answer.
After that, the machine is provided with a new set of examples (data),
and the supervised learning algorithm, having analysed the training data (the set of
training examples), produces an outcome for them based on the labelled
data.
Applications of Supervised
Learning
Sentiment Analysis
Recommendations
Spam Filtration
BioInformatics
Speech Recognition
Object-Recognition for Vision
How Supervised Learning Works
For example, you want to train a machine to help you predict how
long it will take you to drive home from your workplace. Here, you
start by creating a set of labelled data. This data includes:
Weather conditions
Time of the day
Holidays
Types of Supervised Machine Learning Algorithms
Regression
Logistic Regression
Classification
Naïve Bayes Classifiers
Decision Trees
Support Vector Machine
Challenges in Supervised
machine learning
➢ Irrelevant input features in the training data could give inaccurate
results.
➢ Data preparation and pre-processing is always a challenge.
➢ Accuracy suffers when impossible, unlikely, or incomplete
values have been entered as training data.
➢ If a domain expert is not available, the other approach is
"brute force": you have to work out for yourself the right features
(input variables) to train the machine on, which could be inaccurate.
Advantages of Supervised Learning
Summary
➢ In Supervised learning, you train the machine using data which is well
"labelled."
➢ Training a machine to predict how long it will take you
to drive home from your workplace is an example of supervised learning.
➢ Regression and Classification are two types of supervised machine learning
techniques.
➢ Supervised learning is a simpler method while Unsupervised learning is a
complex method.
In order to solve a given problem of supervised learning,
one has to perform the following steps:
➢ Types of Regression:
Linear Regression:
➢ Linear regression is a statistical regression method which is used for
predictive analysis.
➢ It is one of the simplest regression algorithms, and it
shows the relationship between continuous variables.
➢ It is used for solving the regression problem in machine learning.
➢ Linear regression shows the linear relationship between the independent
variable (X-axis) and the dependent variable (Y-axis), hence called
linear regression.
➢ If there is only one input variable (x), then such linear regression is
called simple linear regression. And if there is more than one input
variable, then such linear regression is called multiple linear
regression.
➢ The relationship between variables in the linear regression model can be
explained using the below image. Here we are predicting the salary of
an employee on the basis of years of experience.
Y= aX+b
Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (a is the slope and b is the intercept)
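As a minimal sketch, the coefficients a and b of Y = aX + b can be estimated by ordinary least squares. The experience and salary figures below are hypothetical values invented for illustration, and least squares is one common way to fit the line, not necessarily the one used in the original slides.

```python
import numpy as np

# Hypothetical training data: years of experience (X) and salary (Y).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([39000.0, 46000.0, 50000.0, 57000.0, 63000.0])

# Ordinary least squares for a single input variable:
#   a = sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
#   b = mean(Y) - a * mean(X)
x_mean, y_mean = X.mean(), Y.mean()
a = np.sum((X - x_mean) * (Y - y_mean)) / np.sum((X - x_mean) ** 2)
b = y_mean - a * x_mean

# Predict the salary for 6 years of experience using Y = aX + b.
predicted_salary = a * 6.0 + b

print(a, b)
print(predicted_salary)
```

The same coefficients can be obtained with `np.polyfit(X, Y, deg=1)`, which returns the slope first and the intercept second.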
Some popular applications of linear regression are:
Analyzing trends and sales estimates
Salary forecasting
Real estate prediction
Arriving at ETAs in traffic.
Logistic Regression:
➢ Used to solve the classification problems.
➢ Logistic regression algorithm works with the categorical variable
such as 0 or 1, Yes or No, True or False, Spam or not spam, etc.
➢ It is a predictive analysis algorithm which works on the concept of
probability.
➢ Logistic regression uses the sigmoid function, also called the
logistic function, which maps any real-valued input to a value
between 0 and 1. This sigmoid function is used to
model the data in logistic regression. The function can be
represented as:
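In its standard form the sigmoid is f(x) = 1 / (1 + e^(-x)). A minimal Python sketch of it follows; the function names and the 0.5 decision threshold are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Standard logistic function 1 / (1 + e^(-z)), returning a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z)
    # so that math.exp never receives a large positive argument.
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

def predict_label(z, threshold=0.5):
    """Turn the probability into a 0/1 class using an assumed threshold."""
    return 1 if sigmoid(z) >= threshold else 0

print(sigmoid(0.0))
print(sigmoid(4.0), sigmoid(-4.0))
print(predict_label(3.0), predict_label(-3.0))
```

Large positive inputs give values close to 1 and large negative inputs give values close to 0, which is why the output can be read as a probability for categories such as Yes or No.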