Python
Experiment-2.2
Understand supervised learning to train and develop classifier models. CO2, CO4
Google Colaboratory
Theory:
Supervised Learning
Supervised learning is a type of machine learning where the model learns from labeled data. In this approach, the dataset
provided to the model contains input features (independent variables) and corresponding target labels (dependent variable). The
model learns the relationship between the inputs and the outputs to make predictions on new, unseen data.
Training Data: The subset of the dataset used to train the model.
Testing Data: The subset used to evaluate the model's performance.
The dataset is typically split into 70–80% training data and 20–30% testing data.
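The split described above can be sketched in plain Python. This is a minimal illustration only: the helper name `train_test_split_rows`, the toy `rows` data, and the 25% test fraction are assumptions made for the example, not part of the experiment. Libraries such as scikit-learn provide an equivalent `train_test_split` function.

```python
import random

def train_test_split_rows(rows, test_fraction=0.25, seed=42):
    """Shuffle rows and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy dataset: (feature, label) pairs
rows = [(x, int(x >= 50)) for x in range(100)]
train_rows, test_rows = train_test_split_rows(rows, test_fraction=0.25)
print(len(train_rows), len(test_rows))  # 75 training rows, 25 testing rows
```

Shuffling before splitting matters when the rows are ordered (as they are here, by feature value); otherwise the test set would contain only one class.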
3. Objective:
The goal is to minimize the error between the predicted and actual outputs and to generalize well to unseen data.
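As a rough sketch of this objective, the example below fits a one-feature threshold classifier by choosing the threshold with the lowest training error, then measures the error on separate test data. The functions `error_rate` and `fit_threshold` and the toy data are hypothetical, chosen only to make the idea concrete.

```python
def error_rate(rows, threshold):
    """Fraction of rows misclassified by the rule: predict 1 if x >= threshold."""
    wrong = sum(1 for x, y in rows if int(x >= threshold) != y)
    return wrong / len(rows)

def fit_threshold(train):
    """Return the candidate threshold with the lowest training error."""
    candidates = sorted({x for x, _ in train})
    return min(candidates, key=lambda t: error_rate(train, t))

# Hypothetical labeled data: (feature, label) pairs
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
test = [(1.5, 0), (2.5, 0), (4.5, 1), (5.5, 1)]

threshold = fit_threshold(train)
train_error = error_rate(train, threshold)  # error the model was fitted to minimize
test_error = error_rate(test, threshold)    # error on unseen data (generalization)
print(threshold, train_error, test_error)
```

A low training error together with a low test error suggests the model generalizes; a low training error with a high test error suggests overfitting.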
https://lms.cuchd.in/mod/page/view.php?id=1883200 05/03/25, 9 59 PM
Page 1 of 10
Classification in Supervised Learning:
Classification is a supervised learning task where the output variable is categorical. Examples include:
Binary Classification: Predicting one of two categories (e.g., spam or not spam).
Multi-class Classification: Predicting one of multiple categories (e.g., types of fruits).
1. Data Preprocessing:
5. Model Evaluation:
Use metrics like accuracy, precision, recall, F1-score, and ROC-AUC to assess the model's performance.
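To make these metrics concrete, the sketch below computes accuracy, precision, recall, and F1-score from the confusion counts of a binary classifier. The function name `binary_classification_metrics` and the sample labels are illustrative assumptions.

```python
def binary_classification_metrics(y_true, y_pred, positive=1):
    """Return confusion counts plus accuracy, precision, recall and F1-score."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical actual and predicted labels
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

Precision answers "of the rows predicted positive, how many were right?", while recall answers "of the rows that are actually positive, how many were found?". F1-score is their harmonic mean.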
6. Hyperparameter Tuning:
Optimize the model's performance by adjusting hyperparameters using techniques like Grid Search or Random
Search.
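The two search strategies can be sketched as follows. Everything here is an assumption for illustration: the toy data, the small k-nearest-neighbours classifier, and the helper names `grid_search` and `random_search`. Grid Search tries every combination in the grid, whereas Random Search samples a fixed number of combinations.

```python
import itertools
import random

# Hypothetical toy data: (feature, label) pairs
train = [(1, 0), (2, 0), (3, 0), (5, 0), (4, 1), (6, 1), (7, 1), (8, 1)]
valid = [(1.5, 0), (2.5, 0), (4.4, 1), (6.5, 1), (7.5, 1)]

def knn_predict(x, k, weighted):
    """Classify x by an (optionally distance-weighted) vote of its k nearest training points."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    vote = 0.0
    for feature, label in neighbours:
        weight = 1.0 / (abs(feature - x) + 1e-9) if weighted else 1.0
        vote += weight if label == 1 else -weight
    return 1 if vote > 0 else 0

def validation_accuracy(k, weighted):
    """Score one hyperparameter combination on the validation data."""
    correct = sum(knn_predict(x, k, weighted) == y for x, y in valid)
    return correct / len(valid)

def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid and keep the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def random_search(param_grid, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly sampled combinations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_grid.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

param_grid = {"k": [1, 3, 5, 7], "weighted": [False, True]}
grid_params, grid_score = grid_search(param_grid, validation_accuracy)
rand_params, rand_score = random_search(param_grid, validation_accuracy, n_iter=4)
print("grid search:  ", grid_params, grid_score)
print("random search:", rand_params, rand_score)
```

Hyperparameters are scored on validation data rather than on the final test set, so that the test set still gives an unbiased estimate afterwards. PyCaret provides a `tune_model` function for this purpose.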
Common Use Cases of Classification Models:
The coding example will help you understand how to implement these concepts practically.
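# Import required libraries
from pycaret.datasets import get_data # To load example datasets
from pycaret.classification import * # Provides setup, create_model, plot_model, etc.
# Load the diabetes dataset
diabetesDataSet = get_data("diabetes")
# The target column, "Class variable", has two classes (binary values)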
# Set up the classification environment
s = setup(data=diabetesDataSet, target='Class variable')
# Specifies the dataset and the target column to be used for training
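# Train a Random Forest classifier ('rf' is PyCaret's model ID for Random Forest)
rfModel = create_model('rf')
# Plot the confusion matrix
plot_model(rfModel, plot='confusion_matrix')
# Visualizes the confusion matrix for the Random Forest model to evaluate its performance
# Evaluate the model
evaluate_model(rfModel)
# Generates standard evaluation plots like ROC curve, Precision-Recall curve, etc.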
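The area under the ROC curve can also be computed by hand, which helps when interpreting the plot. A minimal sketch, assuming binary labels (0 and 1) and predicted scores: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `roc_auc` and the sample values are illustrative.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # A correctly ordered pair counts 1, a tie counts 0.5, a misordered pair counts 0
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities of the positive class
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.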
# Save the trained Random Forest model to a file
sm = save_model(rfModel, 'rfModelFile')
# Saves the trained model to a file named 'rfModelFile.pkl' for future use
# Plot the feature importance
plot_model(rfModel, plot='feature')
# Visualizes the importance of features in making predictions with the Random Forest model
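# Load new data for prediction
newDataSet = get_data("diabetes").iloc[:10]
# Loads a fresh copy of the diabetes dataset and selects the first 10 rows for testing
# Make predictions on the new data (the variable name newPredictions is illustrative)
newPredictions = predict_model(rfModel, data=newDataSet)
# Uses the trained Random Forest model to predict the class labels for the new data
newPredictions
# Outputs the predictions, including the class labels and probabilities for the new dataset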
Additional Resources
1. OpenClassrooms Tutorial: https://openclassrooms.com/en/courses/6389626-train-a-supervised-machine-learning-model/6405911-build-and-evaluate-a-classification-model
2. DataCamp Tutorial: https://www.datacamp.com/blog/classification-machine-learning
3. GeeksforGeeks: https://www.geeksforgeeks.org/basic-concept-classification-data-mining/
Video Links
1. Machine Learning in Python: Building a Classification Model
2. Random Forest Algorithm Explained with Python
TEXT BOOKS/REFERENCE BOOKS
TEXT BOOKS
T1: Data Science from Scratch, Joel Grus, Shroff Publishers / O'Reilly Media, 2019.
https://drive.google.com/file/d/1qv89LVaEshX9hcmSS9KDMsvBP-UYC78h/view?usp=sharing
T2: Artificial Intelligence: A Modern Approach, 3rd Edition, by Stuart Russell and Peter Norvig, Pearson Publisher, 2010.
https://drive.google.com/file/d/1G-s5fsBh5rLMdWmIYvyeI2zclcDCAA_D/view?usp=sharing
https://drive.google.com/file/d/1IBgLq2GvyEXURAPfSDm-Eep94X0vYXDb/view?usp=sharing
REFERENCE BOOKS
RB1: Philipp Janert, Data Analysis with Open-Source Tools, Shroff Publishers / O'Reilly Media.
https://drive.google.com/file/d/1SVtjE5XEih7_aU433_cAJKiDF41-KuzU/view?usp=sharing
RB2: Andreas C. Müller & Sarah Guido, Introduction to Machine Learning with Python, O'Reilly Media.
https://www.nrigroupindia.com/e-book/Introduction%20to%20Machine%20Learning%20with%20Python%20(%20PDFDrive.com%20)-min.pdf
RB3: Ms. Anitha Patibandla, Dr. B. Jyothi, Ms. K. Bhavana, Artificial Intelligence & Machine Learning, lecture notes.
https://mrcet.com/downloads/digital_notes/ECE/III%20Year/AI%20&%20ML%20DIGITAL%20NOTES.pdf