
The Dalai Lama Institute for Higher Education, Bangalore

Affiliated to Bangalore University

Computer Application Department

VI Semester
MACHINE LEARNING
Course Code: CA-C27T
Supervised learning
Supervised learning is a type of machine learning where the model learns from labeled data,
meaning each training example consists of input data (features) along with their corresponding
correct output (labels). The goal of supervised learning is to learn a mapping from input to
output based on the available labeled data, such that the model can make accurate predictions
on new, unseen data.
Types of Supervised Learning:
 Classification :
• Classification is a type of supervised learning where the goal is to predict the category
or class label of input data.
• The output variable is discrete and categorical, with a finite number of possible values
or classes.
• Example: Email spam detection, sentiment analysis, image classification.
 Regression :
• Regression is a type of supervised learning where the goal is to predict a continuous
output variable based on input features.
• The output variable is numerical, representing a quantity or value along a continuous
scale.
• Example: Predicting house prices, stock prices, temperatures, or a person's salary.
Classification
 Binary Classifier:
 A binary classifier is a type of classification model that predicts one of two possible
outcomes or classes for each input instance. The output variable is binary, meaning it has
only two possible values.
 Examples:
Logistic Regression : A popular linear model used for binary classification tasks.
Support Vector Machines (SVM) : Effective for separating two classes with a hyperplane in the
feature space.

Decision Trees : Can be used for binary classification by splitting the feature space into
regions corresponding to each class.
 Multi-class Classifier:
 A multi-class classifier is a type of classification model that assigns each input instance
to one of several possible classes. The output variable can take more than two possible
values.
 Examples:
1. Random Forest : Can handle multi-class classification tasks by combining multiple decision
trees.
2. K-Nearest Neighbors (KNN): Can be used for multi-class classification by assigning the
majority class among the K nearest neighbors.
 Lazy Learner (Instance-based Learning):
 Lazy learners, also known as instance-based learners, delay the process of learning until a
new instance needs to be classified. They store the training instances and perform
classification based on similarity measures between the new instance and the stored
instances.
 Example:
1. K-Nearest Neighbors (KNN): A lazy learning algorithm that stores all instances of the
training data and classifies new instances based on the majority class among its nearest
neighbors.
 Eager Learner (Model-based Learning):
 Eager learners, also known as model-based learners, construct a generalized model from
the training data during the learning phase. This model is then used to make predictions on
new instances without requiring the entire training data.
 Example:
1. Decision Trees : An eager learning algorithm that constructs a tree-based model by
recursively partitioning the feature space based on the training data.
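The lazy/eager distinction shows up directly in where the computational work happens. As a small illustrative sketch (assuming scikit-learn; the exact timings will vary by machine), a lazy KNN does almost nothing at fit time and the real work at prediction time, while an eager decision tree builds its model up front:

```python
import time

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

for name, model in [("KNN (lazy)", KNeighborsClassifier()),
                    ("Decision tree (eager)", DecisionTreeClassifier())]:
    t0 = time.perf_counter()
    model.fit(X, y)       # lazy learner: essentially just stores the data
    t1 = time.perf_counter()
    model.predict(X)      # lazy learner: distance computations happen here
    t2 = time.perf_counter()
    print(f"{name}: fit {t1 - t0:.4f}s, predict {t2 - t1:.4f}s")
```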
Classification learning steps
1. Data Collection and Preprocessing :
Gather labeled data, where each instance is associated with a class label. Preprocess the data by handling missing values, outliers, and scaling features if necessary.
2. Feature Selection and Engineering :
Select relevant features that contribute to the predictive task. Create new features or transform existing ones to enhance the model's performance.
3. Model Selection :
Choose an appropriate classification algorithm based on the nature of the problem, the size of the dataset, and the computational resources available.
4. Training the Model :
Train the selected model on the labeled training data. The model learns to identify patterns and relationships between input features and class labels.
5. Model Evaluation :
Evaluate the trained model's performance using metrics such as accuracy, precision, recall, and F1-score on a separate validation set or through cross-validation.
6. Hyperparameter Tuning :
Fine-tune the model's hyperparameters to optimize its performance. This may involve grid search, random search, or other optimization techniques.
7. Testing and Deployment :
Assess the model's performance on unseen data using the test set. If satisfactory, deploy the model into production for real-world use.
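A compact end-to-end sketch of these seven steps (assuming scikit-learn, and using the built-in Iris dataset and a small hyperparameter grid purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

# Step 1: a toy labeled dataset, followed by a train/test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Steps 2-4: preprocessing, model selection, and training in one pipeline.
pipe = Pipeline([("scale", StandardScaler()),
                 ("knn", KNeighborsClassifier())])

# Steps 5-6: hyperparameter tuning, with evaluation happening inside
# 5-fold cross-validation.
grid = GridSearchCV(pipe, {"knn__n_neighbors": [1, 3, 5, 7, 9]}, cv=5)
grid.fit(X_train, y_train)

# Step 7: final assessment on unseen test data before deployment.
print("Best k:", grid.best_params_)
print(classification_report(y_test, grid.predict(X_test)))
```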
K-Nearest Neighbors (KNN)
 K-Nearest Neighbors (KNN) is a simple and effective classification algorithm used for both
binary and multi-class classification tasks. It's a type of lazy learning algorithm that stores
all instances of the training data and makes predictions for new instances based on their
similarity to the nearest neighbors.
KNN working steps
 The K-Nearest Neighbors (KNN) algorithm is a simple and intuitive classification algorithm.
Here are the steps involved in its working:
 Initialize the Training Data :
• Store all instances of the training data along with their corresponding class labels
 Calculate Distance :
• For a new instance (datapoint), calculate the distance between this datapoint and all
other datapoints in the training set.
• Common distance metrics include
• A. Euclidean distance,
• B. Manhattan distance,
• C. Minkowski distance.
 Select Nearest Neighbors :
• Choose the K nearest neighbors to the new instance based on the calculated distances.
K is a predefined hyperparameter.
 Majority Voting :
• For classification tasks, count the class labels of the K nearest neighbors.
• Assign the most common class label among the K nearest neighbors as the predicted
class for the new instance.
 Prediction :
• Once the majority class label (or average value for regression) is determined, assign it as
the prediction for the new instance.
Hyperparameter Tuning :
1. Experiment with different values of K and choose the one that gives the best performance on a
validation set or through cross-validation.
Model Deployment :
1. Deploy the trained KNN model into production for making predictions on new, unseen data.
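To make these steps concrete, here is a from-scratch sketch in plain Python (illustrative only; the tiny dataset and the helper names euclidean and knn_predict are invented for this example):

```python
import math
from collections import Counter

def euclidean(a, b):
    # Step 2: Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    # Step 2: distance from the query point to every training point.
    dists = [(euclidean(x, query), label) for x, label in zip(train_X, train_y)]
    # Step 3: keep the k nearest neighbors.
    nearest = sorted(dists)[:k]
    # Steps 4-5: majority vote among the neighbors' class labels.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train_X = [(1, 1), (2, 1), (8, 8), (9, 9)]
train_y = ["A", "A", "B", "B"]
print(knn_predict(train_X, train_y, (8, 7), k=3))  # -> "B"
```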
 Advantages of K-Nearest Neighbors (KNN):
1. Simple and Intuitive: KNN is easy to understand and implement, making it suitable for
beginners in machine learning.
2. Non-parametric: KNN doesn't make any assumptions about the underlying data
distribution, making it robust to noisy data and outliers.
3. Versatile: KNN can be applied to both classification and regression tasks, making it a
versatile algorithm.
4. No Training Phase: KNN is a lazy learner with no explicit training phase; it simply
memorizes the training data, so training is computationally cheap.
5. Adaptability to New Data: KNN can easily adapt to new data without needing to retrain the
model.
 Disadvantages of K-Nearest Neighbors (KNN):
1. Computationally Expensive: KNN requires calculating distances between the new
instance and all instances in the training data, which can be computationally expensive,
especially for large datasets.
2. High Memory Requirement: Since KNN stores all training instances in memory, it can
require a significant amount of memory, especially for large datasets.
3. Sensitive to Irrelevant Features: KNN is sensitive to irrelevant features and can be
biased towards features with higher magnitudes.
4. Need for Optimal K: The choice of the hyperparameter K (number of neighbors)
significantly affects the model's performance. Finding the optimal value of K can be
challenging and may require experimentation.
 Applications:
1. Classification :
1. KNN is commonly used for classification tasks such as email spam detection, sentiment analysis,
and image recognition.
2. Regression :
1. KNN can be applied to regression problems such as predicting house prices, stock prices, and
demand forecasting.
3. Recommendation Systems :
1. KNN-based collaborative filtering algorithms are used in recommendation systems to suggest items
to users based on their similarity to other users.
4. Anomaly Detection :
1. KNN can be used for anomaly detection in cybersecurity, fraud detection, and network intrusion
detection.
5. Medical Diagnosis :
1. KNN is utilized in medical diagnosis to classify diseases based on patient symptoms and medical
history.
KNN PROGRAM
 This program does the following:
1. Loads the Iris dataset using load_iris() from scikit-learn.
2. Splits the dataset into features (X) and target (y).
3. Splits the data into training and testing sets using train_test_split().
4. Scales the features using StandardScaler() to ensure mean=0 and variance=1.
5. Initializes the KNN classifier with a specified number of neighbors (k).
6. Trains the KNN classifier on the scaled training data.
7. Makes predictions on the scaled testing data.
8. Evaluates the accuracy of the classifier using accuracy_score().
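A minimal sketch of the program described above, assuming scikit-learn's standard API (k=5 and the 80/20 split are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# 1-2. Load the Iris dataset and split into features (X) and target (y).
iris = load_iris()
X, y = iris.data, iris.target

# 3. Split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# 4. Scale features to mean=0, variance=1 (fit the scaler on training data only).
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# 5-6. Initialize the KNN classifier with k neighbors and train it.
k = 5
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train_scaled, y_train)

# 7-8. Predict on the scaled test data and evaluate accuracy.
y_pred = knn.predict(X_test_scaled)
print("Accuracy:", accuracy_score(y_test, y_pred))
```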
Decision trees

 Decision trees are a supervised learning algorithm used in machine learning for
classification and regression modeling. They can be used to predict whether an
event will occur or not (classification) or to predict continuous values based on
previous data (regression).
DECISION TREE TERMINOLOGIES
1. Root Node :
The topmost node of the decision tree, representing the feature that best splits the dataset into subsets. It has no incoming edges.
2. Leaf Node (Terminal Node) :
Nodes at the bottom of the decision tree that do not split further. Each leaf node represents a class label (in classification) or a predicted value (in regression).
3. Decision Rule :
The rule or condition based on which the dataset is split at each node. It involves comparing the value of a feature with a threshold.
4. Impurity :
A measure of the disorder or uncertainty in a dataset. Decision trees aim to minimize impurity at each node to make the splits more informative.
5. Gini Impurity :
A measure of impurity used in classification tasks. It calculates the probability of misclassifying a randomly chosen data point if it were labeled according to the distribution of classes in the subset (see the sketch after this list for the formula).
6. Entropy :
Another measure of impurity used in classification tasks. It quantifies the uncertainty in a dataset's class distribution.
7. Pruning :
The process of removing nodes or branches from the decision tree to prevent overfitting. Pruning helps simplify the tree and improve its generalization ability.
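For reference, over the class proportions pᵢ at a node, Gini impurity is 1 − Σ pᵢ² and entropy is −Σ pᵢ log₂ pᵢ. A small Python sketch of both measures (the helper names are invented for illustration):

```python
import math

def gini(proportions):
    # Gini impurity: 1 - sum(p_i^2) over the class proportions p_i.
    return 1.0 - sum(p * p for p in proportions)

def entropy(proportions):
    # Entropy: -sum(p_i * log2(p_i)), skipping empty classes.
    return -sum(p * math.log2(p) for p in proportions if p > 0)

# A node holding 80% of one class and 20% of another:
print(gini([0.8, 0.2]))     # 0.32
print(entropy([0.8, 0.2]))  # ~0.722 bits
```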
 Working Principle:
1. Splitting Criteria :
1. Decision trees recursively split the data into subsets based on the values of features. The
algorithm selects the best feature and split point that maximizes the homogeneity (or purity) of the
subsets.
2. Tree Construction :
1. Starting from the root node, the algorithm iteratively splits the data at each node based on the
selected splitting criteria until a stopping criterion is met. This process continues until the subsets
are pure (all instances belong to the same class) or the tree reaches a predefined depth.
3. Decision Rules :
1. At each internal node, the decision tree applies a decision rule based on a feature value.
4. Leaf Nodes :
1. When a stopping criterion is met (e.g., maximum depth reached, no further improvement in purity),
the algorithm creates a leaf node representing the majority class (in classification) or the average
value (in regression) of the instances in that subset.
5. Prediction :
1. To make predictions for new instances, the decision tree traverses from the root node down to a
leaf node based on the decision rules defined at each node. The predicted class or value at the
leaf node is assigned to the new instance.
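A minimal sketch of this workflow with scikit-learn (the Iris dataset, the Gini criterion, and max_depth=3 are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# max_depth limits tree growth, a simple stopping criterion (pre-pruning).
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)

print("Test accuracy:", tree.score(X_test, y_test))
# Print the learned decision rules, node by node.
print(export_text(tree, feature_names=load_iris().feature_names))
```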
 Advantages of Decision Trees:
1. Interpretability :
1. Decision trees are easy to understand and interpret, making them suitable for explaining the
decision-making process to non-experts.
2. Handles Mixed Data Types :
1. Decision trees can handle both numerical and categorical data without the need for feature
scaling or one-hot encoding.
3. Implicit Feature Selection :
1. Decision trees perform implicit feature selection by selecting the most informative features for
splitting the data at each node.
4. Handles Missing Values and Outliers :
1. Decision trees can handle missing values and outliers in the data by choosing alternative
splits.
 Disadvantages of Decision Trees:
1. Overfitting :
1. Decision trees are prone to overfitting, especially on noisy or high-dimensional data.
2. Instability :
1. Small variations in the data or random noise can lead to different decision tree structures,
making them sensitive to the specific training data used.
3. Limited Expressiveness :
1. Decision trees may fail to capture complex relationships in the data, especially when the
decision boundaries are not axis-aligned.
APPLICATIONS OF DECISION TREES

 Decision trees find applications in various domains due to their simplicity and
interpretability:
1. Medical Diagnosis : Decision trees aid in diagnosing diseases based on symptoms,
guiding healthcare professionals in treatment decisions.
2. Credit Scoring : Financial institutions employ decision trees for credit scoring,
assessing creditworthiness and determining loan approvals.
3. Fraud Detection : Decision trees help detect fraudulent activities in banking,
insurance, and e-commerce by identifying suspicious patterns.
4. Product Recommendation : E-commerce platforms utilize decision trees for
personalized product recommendations, enhancing user experience and boosting
sales.
NAIVE BAYES CLASSIFIER.
 The Naive Bayes classifier is a simple probabilistic machine learning algorithm based
on Bayes' Theorem with an assumption of independence between features. Despite its
simplicity, it is surprisingly effective for classification tasks, especially for text
classification and document categorization.
 Working Principle:
1. Bayes' Theorem :
1. Naive Bayes classifier calculates the probability of a class label given a set of features using
Bayes' Theorem.
2. It assumes that the presence of a particular feature is independent of the presence of any
other feature, hence the term "naive".
2. Training :
1. The classifier estimates the probabilities of each class and the conditional probabilities of
each feature given the class labels from the training data.
3. Classification :
1. To classify a new instance, the classifier calculates the probability of each class given the
features using Bayes' Theorem and selects the class with the highest probability.
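In symbols, the classifier picks the class C that maximizes P(C) · Π P(xᵢ | C). A minimal sketch with scikit-learn's GaussianNB, which additionally assumes each feature is normally distributed within a class (the Iris dataset is used purely for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Training estimates the class priors P(C) and, per class, the mean and
# variance of each feature for the conditional probabilities P(x_i | C).
nb = GaussianNB()
nb.fit(X_train, y_train)

print("Accuracy:", nb.score(X_test, y_test))
print("Class probabilities for one instance:", nb.predict_proba(X_test[:1]))
```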
APPLICATIONS OF NAIVE BAYES CLASSIFIER
1. Text Classification : Naive Bayes classifiers are widely used for document classification tasks
such as spam detection, sentiment analysis, and topic categorization.
2. Email Filtering : Naive Bayes classifiers are employed in email filtering systems to classify
emails as spam or non-spam based on the presence of specific words or features.
3. Medical Diagnosis : Naive Bayes classifiers aid in medical diagnosis by predicting the
likelihood of diseases based on patient symptoms and diagnostic test results.
4. Recommendation Systems : Naive Bayes classifiers contribute to recommendation systems
by predicting user preferences and suggesting relevant items or content.
5. Customer Segmentation : Naive Bayes classifiers help in segmenting customers based on
their behavior, demographics, and preferences, enabling targeted marketing strategies.
6. Fraud Detection : Naive Bayes classifiers assist in fraud detection by identifying anomalous
patterns in financial transactions or user behavior.
7. Image Classification : Naive Bayes classifiers are used in image classification tasks, such as
identifying objects in images or recognizing handwritten digits.
 Advantages of Naive Bayes Classifier:
1. Simple and Fast : Naive Bayes is computationally efficient and easy to implement, making
it suitable for large datasets and real-time applications.
2. Handles High-Dimensional Data : Performs well even with high-dimensional data and
irrelevant features due to its assumption of feature independence.
3. Effective with Small Data : Requires a small amount of training data to estimate
parameters accurately, making it suitable for small datasets.
 Disadvantages of Naive Bayes Classifier:
1. Assumption of Feature Independence : The assumption of feature independence may not
hold true in real-world datasets, leading to suboptimal performance.
2. Sensitive to Input Data Quality : Naive Bayes can be sensitive to the quality of input data,
especially when features are highly correlated or contain missing values.
3. Limited Expressiveness : Naive Bayes has limited expressive power compared to more
complex models, which may result in lower accuracy for certain tasks.
REGRESSION
Simple Linear Regression is a statistical method used to model the relationship between
a single independent variable (predictor) and a continuous dependent variable
(response). It assumes a linear relationship between the predictor and the response,
which can be represented by a straight line of the form Y = β0 + β1X, where β0 is the
intercept and β1 is the slope.
SIMPLE LINEAR REGRESSION
 Working of Simple Linear Regression:
1. Data Collection : Gather a dataset containing observations of both the independent
variable (predictor) and the dependent variable (response).
2. Model Training : Fit a linear regression model to the data by estimating the values of
β0 and β1 that minimize the sum of squared errors (SSE) between the observed and
predicted values of Y.
3. Model Evaluation : Assess the goodness-of-fit of the model using metrics such as the
coefficient of determination (R-squared), mean squared error (MSE), or residual plots.
4. Prediction : Once the model is trained and evaluated, use it to make predictions on new
or unseen data by plugging in values of X to estimate corresponding values of Y.
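A minimal sketch of these four steps with scikit-learn (the experience/salary numbers are made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# 1. Data collection: X = years of experience, Y = salary (made-up numbers).
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([30, 35, 42, 48, 55, 61])

# 2. Model training: least squares estimates beta0 (intercept_) and beta1 (coef_).
model = LinearRegression().fit(X, y)
print("beta0 (intercept):", model.intercept_)
print("beta1 (slope):", model.coef_[0])

# 3. Model evaluation: goodness-of-fit metrics.
y_pred = model.predict(X)
print("R-squared:", r2_score(y, y_pred))
print("MSE:", mean_squared_error(y, y_pred))

# 4. Prediction for a new value of X.
print("Predicted Y for X=7:", model.predict([[7]])[0])
```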
 Advantages of Simple Linear Regression:
1. Interpretability : The coefficients β0 and β1 have clear interpretations, allowing for
easy understanding of the relationship between the variables.
2. Computational Efficiency : Simple linear regression is computationally efficient and can
handle large datasets with ease.
3. Visualization : The linear relationship can be visualized easily using scatter plots and
regression lines.
 Disadvantages of Simple Linear Regression:
1. Assumption of Linearity : Simple linear regression assumes a linear relationship between
the predictor and response variables, which may not always hold true in real-world
scenarios.
2. Sensitive to Outliers : Outliers in the data can significantly impact the estimation of
regression coefficients and reduce the accuracy of predictions.
3. Limited Predictive Power : Simple linear regression may not capture complex
relationships between variables, leading to limited predictive power compared to more
sophisticated models.
Logistic Regression

 Logistic Regression is a statistical method used for binary classification tasks, where the
outcome variable is categorical with two possible classes (e.g., yes/no, 1/0). Despite its
name, logistic regression is a classification algorithm rather than a regression algorithm. It
models the probability that an instance belongs to a particular class using the logistic
function.
 Logistic Function:
 The logistic function, also known as the sigmoid function, is the core component of logistic
regression. It is defined as:

σ(x) = 1 / (1 + e^(-x))

Where x is the linear combination of feature values and model coefficients. The
logistic function maps the input (x) to a value between 0 and 1, representing the
probability of belonging to the positive class.
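A minimal sketch of binary classification with scikit-learn's LogisticRegression (the breast-cancer dataset is used purely as an illustration):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scaling helps the solver converge; the model then estimates
# P(y=1 | x) = sigmoid(w . x + b) and thresholds it at 0.5 to classify.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)

print("Accuracy:", clf.score(X_test, y_test))
print("[P(class 0), P(class 1)] for one instance:", clf.predict_proba(X_test[:1]))
```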
 Application:
 Logistic regression is widely used in various fields, including:
• Credit scoring
• Disease diagnosis
• Spam filtering
• Market segmentation
