1 - Supervised Learning & Its Types
a. Classification
b. Regression
a. Classification:
• Classification is a type of supervised learning that categorizes input data into predefined
labels.
• It involves training a model on labeled examples to learn patterns between input features
and output classes.
• In classification, the target variable is categorical: it takes one of a fixed set of classes.
Binary problems have two classes, such as Yes-No, Male-Female, or True-False; multiclass
problems have more than two.
• The model aims to generalize this learning to make accurate predictions on new, unseen
data.
• Commonly used classification algorithms include:
• Random Forest
• Decision Trees
• Logistic Regression
• Example (spam filtering): The classification process aims to accurately label incoming
emails as either spam or non-spam to prevent unwanted emails from reaching users' inboxes.
• Market basket analysis is a modeling technique that finds combinations of items that are
frequently bought together in the same transaction.
Example: Amazon and many other retailers use this technique. While a customer views a product,
the site suggests items that other customers bought along with it in the past.
c. Weather Forecasting:
Types of Algorithms: Classification
Decision Tree Classification: This type splits a dataset into segments based on feature values,
making a decision at each node until a data point reaches a leaf that assigns its class.
K-Nearest Neighbors: This type of classification identifies the K closest data points to a given
observation and classifies it based on the majority class among its neighbors.
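The K-Nearest Neighbors description above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name knn_predict, the toy points, and the labels are made up, and Euclidean distance is assumed because the text does not name a distance measure.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Return the most common class among those neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: two well-separated groups.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(train_points, train_labels, (2.0, 2.0), k=3)
print(prediction)
```

Choosing an odd K for a two-class problem avoids tied votes.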
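Logistic Regression: This classification type predicts the probability that the output Y belongs
to a given class for an input X, by passing a linear combination of the input variables through
the logistic (sigmoid) function.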
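As a rough sketch of how logistic regression turns an input X into a probability for Y, the snippet below applies the sigmoid function to a one-feature linear score. The weight and bias values are hypothetical stand-ins for parameters that would normally be learned from labeled data, and the 0.5 threshold is an assumed default.

```python
import math


def predict_probability(x, weight, bias):
    """Return P(Y = 1 | x) for a one-feature logistic model."""
    z = weight * x + bias              # linear score
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid maps the score into (0, 1)


def predict_class(x, weight, bias, threshold=0.5):
    """Turn the probability into a class label using a threshold."""
    return 1 if predict_probability(x, weight, bias) >= threshold else 0


# Hypothetical, already-fitted parameters (not learned here).
weight, bias = 1.2, -6.0

p_low = predict_probability(2.0, weight, bias)   # score below zero, probability under 0.5
p_high = predict_probability(8.0, weight, bias)  # score above zero, probability over 0.5
print(p_low, p_high, predict_class(2.0, weight, bias), predict_class(8.0, weight, bias))
```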
Naïve Bayes: This classifier is a simple yet effective classification algorithm based on Bayes'
theorem, which calculates the probability of an event given prior knowledge of conditions related
to the event. It is called naïve because it assumes the features are independent of one another
given the class.
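The Bayes' theorem calculation behind Naïve Bayes can be sketched as follows. All probabilities here are invented for illustration (they are not estimated from any dataset), and the function name naive_bayes_posterior is hypothetical. The product over features relies on the assumption that features are conditionally independent given the class.

```python
def naive_bayes_posterior(priors, likelihoods, observed_features):
    """Return P(class | features) for each class via Bayes' theorem."""
    scores = {}
    for cls, prior in priors.items():
        score = prior
        # Multiply in P(feature | class) for every observed feature.
        for feature in observed_features:
            score *= likelihoods[cls][feature]
        scores[cls] = score
    # Normalise so the posteriors sum to 1.
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Made-up numbers: P(class) and P(word appears | class).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["free"])
print(posterior)
```

With these numbers, seeing the word "free" raises the spam probability from the prior of 0.4 to 0.2 / 0.26, roughly 0.77.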
Random Forest Classification: It combines multiple decision trees, each predicting the
probability of the target variable. The final output is determined by averaging these
probabilities.
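The averaging step described above can be illustrated on its own. This sketch does not train any trees: the per-tree probabilities are hypothetical outputs written by hand, and average_probabilities is an illustrative helper name.

```python
def average_probabilities(tree_probabilities):
    """Average the per-class probabilities predicted by individual trees."""
    classes = tree_probabilities[0].keys()
    n_trees = len(tree_probabilities)
    return {
        cls: sum(probs[cls] for probs in tree_probabilities) / n_trees
        for cls in classes
    }


# Hypothetical outputs of three already-trained trees for one email.
tree_probabilities = [
    {"spam": 0.9, "ham": 0.1},
    {"spam": 0.6, "ham": 0.4},
    {"spam": 0.3, "ham": 0.7},
]

forest_probabilities = average_probabilities(tree_probabilities)
# The forest predicts the class with the highest averaged probability.
predicted_class = max(forest_probabilities, key=forest_probabilities.get)
print(forest_probabilities, predicted_class)
```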
Support Vector Machines: These use support vector classifiers, extended with
kernels, to fit non-linear decision boundaries effectively by implicitly enlarging
the feature space.
Advantages: Classification
• Helps banks and financial institutions identify likely defaulters when deciding whether to
approve cards, loans, etc.
Disadvantages: Classification
1. Privacy: When customer data is collected, there is a chance that a company may give some
information about its customers to other vendors or use this information for its profit.
2. Accuracy Problem: The results are only as good as the chosen model, so an appropriate
model must be selected to get the best accuracy.
APPLICATIONS: Classification
• Image Recognition
• Face Recognition
• Disease Diagnosis
• Voice Recognition
b. Regression:
• Predicting House Prices: Regression analysis can be used to predict house prices
based on factors such as square footage, number of bedrooms and bathrooms,
location, and other relevant features.
• Forecasting Sales: Regression analysis can help businesses forecast future sales
based on historical data, market trends, advertising expenditure, and other variables.
• Predicting Student Performance: Regression analysis can be applied to
predict student performance based on factors such as study hours, attendance,
socioeconomic background, and previous academic achievements.
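Types of Algorithms: Regression
Decision Tree Regression: This regression divides the dataset into smaller and smaller
subsets based on feature values and predicts, for any data point, a value computed from
the subset it falls into (typically the mean of that subset's target values).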
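To make the idea of dividing the dataset into subsets concrete, here is a minimal sketch of a one-split regression tree (often called a stump). It is a simplification under stated assumptions: a single feature, a single split chosen by squared error, and leaf predictions equal to the mean of each subset. The function names and the toy data are invented for illustration.

```python
def fit_regression_stump(xs, ys):
    """Find the one threshold on x that minimises squared error when each
    side of the split predicts the mean of its own target values.
    Assumes xs contains at least two distinct values."""

    def sse(values):
        # Sum of squared errors around the mean of `values`.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct x values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold, sum(left) / len(left), sum(right) / len(right))
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    """Predict the mean of whichever subset x falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Toy data, invented for illustration: two clusters of x values.
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 7, 20, 21, 22]

stump = fit_regression_stump(xs, ys)
print(stump, predict_stump(stump, 2.5), predict_stump(stump, 11))
```

A full decision tree repeats this split search recursively inside each subset.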
Polynomial Regression: This type fits a non-linear relationship by modeling the dependent
variable as a polynomial function of an independent variable.
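A hedged sketch of polynomial regression in plain Python follows. It assumes a least-squares fit solved through the normal equations with Gaussian elimination, which is one common approach and not necessarily the one a given library uses. The function names and the toy data (generated from y = 1 + 2x + 3x^2) are made up for illustration.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares fit of y = c0 + c1*x + ... + cd*x^d (normal equations)."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def predict_polynomial(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** power for power, c in enumerate(coeffs))


# Toy data generated from y = 1 + 2x + 3x^2, so the fit should recover it.
xs = [-2, -1, 0, 1, 2]
ys = [9, 2, 1, 6, 17]

coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs, predict_polynomial(coeffs, 3))
```

The model is non-linear in x but still linear in its coefficients, which is why ordinary least squares can fit it.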
Random Forest Regression: It is a popular method in Machine Learning that utilizes
multiple decision trees, each trained on data points randomly selected from the
dataset, and predicts the output by averaging the predictions of the individual trees.
Simple Linear Regression: This type is the least complicated form of regression: it models
a continuous dependent variable as a straight-line function of a single independent variable.
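Simple linear regression can be sketched with the closed-form least-squares formulas for slope and intercept. The variable names and the study-hours data below are made up for illustration; the scores were chosen to lie exactly on the line score = 50 + 2 * hours so the result is easy to check.

```python
def fit_simple_linear(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: exam score as a function of study hours.
study_hours = [1, 2, 3, 4, 5]
exam_scores = [52, 54, 56, 58, 60]

slope, intercept = fit_simple_linear(study_hours, exam_scores)
predicted_score = intercept + slope * 6  # prediction for 6 study hours
print(slope, intercept, predicted_score)
```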
Support Vector Regression: This regression type handles both linear and non-linear
relationships. For non-linear ones it uses kernel functions, such as polynomial kernels,
to find an optimal solution.
Advantages: Regression
• Quantifies Relationships
• Identification of Significance
• Decision Support
Disadvantages: Regression
APPLICATIONS: Regression
• Healthcare
• Environmental Science
• The prediction task is classification when the target variable is discrete. An example is
identifying the underlying sentiment of a piece of text.
• The prediction task is regression when the target variable is continuous. An example is
predicting a person’s salary given their educational degree, previous work experience,
geographical location, and level of seniority.