6 Introduction To Machine Learning With Python
• Please log in 10 mins before the class starts and check your internet connection to avoid any network issues during the LIVE
session
• All participants will be on mute by default to avoid background noise. However, the instructor will unmute you if required. Please use the “Questions” tab on your webinar tool to interact with the instructor at any point during the class
• Feel free to ask and answer questions to make your learning interactive. The instructor will address your queries at the end of the ongoing topic
• Raise a ticket through your LMS in case of any queries. Our dedicated support team is available 24 x 7 for your assistance
• Your feedback is very much appreciated. Please share feedback after each class, which will help us enhance your learning
experience
▪ Machine Learning
▪ Supervised Learning
▪ Linear Regression
Hey Dear,
Surprises are waiting at your door!!!!
Reply to the same mail and see the
magic.
Best Regards
Anonymous
many more
To IT Support
& Helpdesk
For example, if a mail contains a keyword such as ‘lottery’, mark the mail as spam
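The keyword rule above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `is_spam`, the keyword set, and the sample mails are assumptions made for the example; only the 'lottery' keyword comes from the slide.

```python
# Hypothetical keyword set; only 'lottery' comes from the example above.
SPAM_KEYWORDS = {"lottery"}

def is_spam(mail_text, keywords=SPAM_KEYWORDS):
    """Return True if any spam keyword appears as a word in the mail."""
    # Lower-case each word and strip surrounding punctuation before matching.
    words = {word.strip(".,!?'\"").lower() for word in mail_text.split()}
    return any(keyword in words for keyword in keywords)

# Hypothetical sample mails.
mails = [
    "You have won a lottery! Reply to claim your prize.",
    "Minutes of the leadership meeting are attached.",
]
labels = ["spam" if is_spam(mail) else "not spam" for mail in mails]
print(labels)
```

A hand-written rule like this has to be extended for every new spam pattern, which is the limitation that motivates learning the rule from data instead.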
How can I
improve the sales
of my company?
In a leadership meeting, the team noticed that the company has a lot of unused data that could be used to boost sales
Purchase History
Search History
Products in Cart
Address
Recommended Products
[Figure: training data is used to build a model, which produces the output]
Other Recommendations
Mobile Cover
Tempered Glass
Phase 1: Training (training data is fed to learning algorithms)
Phase 2: Testing
1 Collecting Data: data is collected from various sources in a server
2 Data Wrangling: data acquired from sources is filtered to produce clean data
3 Analyze Data: feature selection
4 Train Algorithm: the training dataset is fed to the algorithm
5 Test Algorithm
6 Deployment
Copyright © 2017, edureka and/or its affiliates. All rights reserved.
Test Algorithm
▪ The testing dataset determines the accuracy of our model
[Figure: the testing dataset is fed to the trained model, and the predicted output is checked for accuracy]
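One common way to measure accuracy on a testing dataset is the fraction of test examples that the model predicts correctly. The sketch below assumes that definition; the function name `accuracy` and the labels are hypothetical and stand in for a real model's output.

```python
def accuracy(actual, predicted):
    """Fraction of test examples whose prediction matches the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Hypothetical labels for a five-example testing dataset.
actual_labels = ["spam", "not spam", "spam", "not spam", "spam"]
predicted_labels = ["spam", "not spam", "not spam", "not spam", "spam"]

test_accuracy = accuracy(actual_labels, predicted_labels)
print(f"Accuracy: {test_accuracy:.0%}")
```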
Deployment
▪ If the speed and accuracy of the model are acceptable, the model should be deployed in the real system
Deployment
▪ After the model is deployed, it is updated and improved based on its performance; if there is a dip in performance, the model is retrained
Let’s move forward and understand Different Types
of Machine Learning through various Use-Cases
Apple: big, red, rounded shape, depression at the top
Banana: big, green or yellow, long curved, cylindrical shape
Grapes: small, green, round to oval, bunch shape, cylindrical
[Figure: supervised learning workflow. Training: raw data and labels → feature extraction → train the model → evaluate model. Prediction: new data → feature extraction → predict labels]
2. We have historical data from which the machine can find the relationship between the input and the output
Based on the model created from the training data, the machine can now classify new inputs into predefined classes, which in this case are Apple, Banana, or Grapes.
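The fruit example can be sketched with scikit-learn. The numeric encoding of each fruit (size in cm, a round flag, a curved flag) and all the values are assumptions made for illustration, and a decision tree is just one of several classifiers that could be used here.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical encoding of each fruit: [size_cm, is_round (1/0), is_curved (1/0)]
X_train = [
    [8, 1, 0], [7, 1, 0],      # Apple
    [18, 0, 1], [20, 0, 1],    # Banana
    [2, 1, 0], [1.5, 1, 0],    # Grapes
]
# Labels (the predefined classes) for the training data.
y_train = ["Apple", "Apple", "Banana", "Banana", "Grapes", "Grapes"]

# Learn the relationship between the features and the labels.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Classify new, unlabeled fruits into the predefined classes.
new_fruits = [[7.5, 1, 0], [19, 0, 1], [1.8, 1, 0]]
predictions = list(model.predict(new_fruits))
print(predictions)
```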
1 Supervised Learning
2 Unsupervised Learning
3 Reinforcement Learning
Suppose you take the fruits one at a time and arrange them by considering the physical characteristics of each fruit
Here you have not learned anything beforehand, which means there is no training data and no target variable
2. The output is dynamic with respect to the input values; when new values are input, the output might change
3. There are no predefined output classes. The machine can only group the data into clusters, based on its characteristics, at runtime
▪ This model can be used to cluster the input data into classes on the basis of their statistical properties
Cluster - 1 Cluster - 2
Example: For a basket full of vegetables, we can cluster different vegetables based upon their
colour or size
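The vegetable basket example can be sketched with k-means, one common clustering algorithm. The measurements (length and weight) are hypothetical, and choosing two clusters is an assumption matching the two clusters named above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical vegetable measurements: [length_cm, weight_g]. No labels are given.
vegetables = np.array([
    [3, 10], [4, 12], [3.5, 11],        # small vegetables
    [25, 400], [28, 450], [26, 420],    # large vegetables
])

# Group the vegetables into two clusters based only on their measurements.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(vegetables)
print(cluster_ids)
```

The cluster numbers themselves carry no meaning; only the grouping does, since there are no predefined output classes.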
▪ We want the machine to learn from events and the results of its actions
Reward (Good Job!) or Penalty (Bad Job!)
▪ An RL agent learns from the consequences of its actions, rather than from being taught explicitly
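Learning from rewards and penalties can be sketched with a very small agent that keeps a running average of the reward seen for each action. This uses an epsilon-greedy strategy, which is one simple approach among many; the action names, the reward values, and the function `train_agent` are all hypothetical.

```python
import random

def train_agent(rewards, episodes=200, epsilon=0.1, seed=0):
    """Estimate the value of each action from the rewards it produces."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in rewards}   # estimated value per action
    counts = {action: 0 for action in rewards}     # times each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(list(rewards))        # explore: try a random action
        else:
            action = max(values, key=values.get)      # exploit: best action so far
        reward = rewards[action]                      # consequence of the action
        counts[action] += 1
        # Update the running average of the rewards seen for this action.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical environment: one action is penalized (-1), the other rewarded (+1).
rewards = {"touch_fire": -1.0, "fetch_ball": 1.0}
values = train_agent(rewards)
best_action = max(values, key=values.get)
print(values, best_action)
```

Nobody tells the agent which action is correct; it only observes the reward or penalty after acting.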
Siri
HealthCare
Financial Services
Biometrics
Fingerprint Optical
Scanner
▪ In most cases no image of the fingerprint is actually created, only a set of data
that can be used for comparison
Function: Y=F(X)
[Figure: historical data is split by random sampling into a training dataset (70%) and a test dataset (30%); machine learning on the training dataset produces a statistical model, which is tested on the test dataset and then used for prediction]
The model is used for predicting the outcome of a new dataset. Whenever the performance of the model degrades, the model is retrained
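The 70%/30% random sampling of historical data can be sketched with NumPy alone. The data here is synthetic and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical historical data: 100 rows of inputs X and outputs y = F(X).
X = np.arange(100).reshape(-1, 1)
y = 2 * X.ravel() + 1

# Random sampling: shuffle the row indices, then cut at the 70% mark.
shuffled = rng.permutation(len(X))
split_at = len(X) * 70 // 100
train_idx, test_idx = shuffled[:split_at], shuffled[split_at:]

X_train, y_train = X[train_idx], y[train_idx]   # used to build the model
X_test, y_test = X[test_idx], y[test_idx]       # held back for testing
print(len(X_train), len(X_test))
```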
Linear Regression
Logistic Regression
Used to estimate discrete values (binary values like 0/1, yes/no, true/false) based on a given set of independent variable(s)
Decision Tree
Random Forest
Random Forest is an ensemble of decision trees. It typically gives better predictive accuracy than a single decision tree
Naïve Bayes Classifier
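As an illustration of estimating a binary value from an independent variable, here is a minimal logistic regression sketch with scikit-learn. The data (hours studied versus pass/fail) is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied (independent variable) vs. pass/fail (1/0).
hours = np.array([[0.5], [1.0], [1.5], [2.0], [4.0], [4.5], [5.0], [5.5]])
passed = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression()
clf.fit(hours, passed)

# predict() returns the discrete class; predict_proba() returns class probabilities.
predictions = clf.predict([[1.0], [5.0]])
probabilities = clf.predict_proba([[1.0], [5.0]])
print(predictions, probabilities[:, 1])
```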
▪ An Independent Variable (IDV) is the variable used to explain or predict the Dependent Variable in a regression equation
For example, in Y = a + bX, Y is the Dependent Variable and X is the Independent Variable
▪ Y-intercept (a) is the value of the Dependent Variable (Y) when the value of the Independent Variable (X) is
zero. It is the point at which the line cuts the y-axis.
▪ Slope (b) is the change in the Dependent Variable for a unit increase in the Independent Variable. It is the
tangent of the angle made by the line with the x-axis.
This technique is used for finding the “best-fitting line” using the “least squares method”.
Regression Line
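For a single independent variable, the least squares estimates of the slope and intercept in Y = a + bX have a closed form: b is the sum of the products of the deviations of X and Y from their means, divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X. The sketch below applies these formulas to hypothetical data points and cross-checks the result against NumPy's own degree-1 fit.

```python
import numpy as np

# Hypothetical data points (X, Y).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Least squares estimates for the line Y = a + bX.
b = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)  # slope
a = Y.mean() - b * X.mean()                                                # y-intercept
print(f"a = {a:.3f}, b = {b:.3f}")

# Cross-check: np.polyfit returns coefficients from the highest degree down.
b_check, a_check = np.polyfit(X, Y, 1)
```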
Matplotlib: It enables you to make bar charts, scatter plots, line charts, histograms, pie charts, contour plots, and quiver plots
Numpy: Stands for Numerical Python; provides an abundance of useful features for operations on n-arrays and matrices in Python
Scatter Plot
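A scatter plot of one independent variable against the dependent variable can be drawn with Matplotlib as sketched below. The data is synthetic, and the output file name `scatter_plot.png` is an assumption; in an interactive session `plt.show()` could be used instead of saving to a file.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: a roughly linear relationship with some noise.
rng = np.random.default_rng(seed=0)
x = np.linspace(0, 10, 50)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=x.size)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.set_xlabel("x (independent variable)")
ax.set_ylabel("y (dependent variable)")
ax.set_title("Scatter Plot")
fig.savefig("scatter_plot.png")
```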
Important Terms
Estimator Description
lm.fit() Fits a linear model
Scikit-learn provides a function called train_test_split to split the data into training and test sets
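A minimal sketch of the split, using synthetic data in place of the course dataset. The 30% test size and the `random_state` value are assumptions chosen for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with one feature each.
x = np.arange(10).reshape(-1, 1)
y = 3 * x.ravel() + 2

# Hold back 30% of the rows for testing; random_state makes the shuffle repeatable.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)
print(x_train.shape, x_test.shape)
```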
Before moving ahead and building a model, let me introduce you to ‘Model Fitting’
What is Model Fitting?
Fitting a model means that you're making your algorithm learn the relationship between predictors and
outcome so that you can predict the future values of the outcome
So the best fitted model has a specific set of parameters which best defines the problem at hand
[Figure: Forecast; labels: Forest Land, Human Population]
Types of Model Fitting: Underfitting And Overfitting
▪ Machine Learning algorithms first attempt to solve the problem of underfitting; that is, they take a line that does not approximate the data well and make it approximate the data better
The machine doesn’t know where to stop: in trying to solve the problem, it can go past an Appropriate fit to an Overfit model. When we say a model overfits a dataset, we mean that it may have a low error rate on the training data but may not generalize well to the overall population of data we’re interested in
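The difference between underfitting and overfitting can be illustrated by fitting polynomials of increasing degree to noisy data. In this sketch the true curve is quadratic (an assumption made for the example), so degree 1 is too simple, degree 2 is appropriate, and degree 9 is flexible enough to start fitting the noise. All data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def true_curve(x):
    """Hypothetical underlying relationship (quadratic)."""
    return 1.0 + 2.0 * x - 3.0 * x ** 2

# Small noisy training set and a separate noisy test set.
x_train = np.linspace(-1, 1, 12)
y_train = true_curve(x_train) + rng.normal(scale=0.2, size=x_train.size)
x_test = np.linspace(-0.95, 0.95, 50)
y_test = true_curve(x_test) + rng.normal(scale=0.2, size=x_test.size)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the given data."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 2, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree {degree}: train MSE = {errors[degree][0]:.4f}, "
          f"test MSE = {errors[degree][1]:.4f}")
```

The training error can only go down as the degree increases, so a low training error by itself says little; comparing it with the test error is what reveals whether the model generalizes.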
from sklearn.linear_model import LinearRegression
# Fit the model on the training data
lm = LinearRegression()
model = lm.fit(x_train, y_train)
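A self-contained version of the fitting step might look like the sketch below. It uses synthetic data as a stand-in for the course dataset (an assumption), fits the model on the training split only, and then checks it on the held-back test split.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption): y = 4 + 2.5x plus a little noise.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=(100, 1))
y = 4.0 + 2.5 * x.ravel() + rng.normal(scale=0.5, size=100)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

lm = LinearRegression()
model = lm.fit(x_train, y_train)   # learn the intercept and slope from training data

print("intercept:", model.intercept_)
print("slope:", model.coef_[0])
print("R^2 on test data:", model.score(x_test, y_test))
```

Here `fit` returns the estimator itself, so `model` and `lm` refer to the same fitted object.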
Testing the Algorithm
Deployment
Download the complete code for Linear regression on the Boston Dataset from the LMS
Summary
▪ Express Machine Learning
▪ Supervised Learning