Bike Buyer Prediction Using Classification Algorithm

This document discusses machine learning algorithms and their applications for classification problems. It describes supervised, unsupervised, and semi-supervised learning approaches. Popular classification algorithms like K-nearest neighbors, support vector machines, decision trees, and random forests are explained. The benefits of machine learning include simplifying marketing, improving medical diagnoses and data entry while limitations involve requiring large training data and potential biases. Applications span personal assistants, surveillance, social media services, spam filtering and fraud detection.

Uploaded by

chaitra pujar

Bike Buyer Prediction using Classification Algorithm
Machine learning

 Machine learning (ML) is the scientific study of algorithms
and statistical models that computer systems use to
perform a specific task effectively without explicit
instructions, relying on patterns and inference instead.
 It is a branch of artificial intelligence based on the idea that
systems can learn from data, identify patterns, and make
decisions with minimal human intervention.
 Machine learning algorithms build a mathematical model
based on sample data, known as "training data", in order to
make predictions or decisions without being explicitly
programmed to perform the task.
Purpose of Machine Learning

 The purpose of machine learning is not to build an
automated duplicate of intelligent behaviour, but
to use the power of computers to complement and
supplement human intelligence.
 For example, machine learning programs can scan and
process huge databases, detecting patterns that are
beyond the scope of human perception.
 As a field of science, machine learning shares common
concepts with other disciplines such as statistics,
information theory, game theory, and optimization.
CLASSIFICATION

 Machine learning implementations are classified into
three major categories depending on the nature of the
learning “signal” or “response” available to a learning
system:
1: Supervised learning
2: Unsupervised learning
3: Semi-supervised learning
Supervised learning
 Supervised learning is when an algorithm learns from example data and
associated target responses, which can be numeric values or string
labels such as classes or tags, in order to predict the
correct response when later posed with new examples.
 Supervised learning is classified into:
1: Classification: inputs are divided into two or more
classes, and the learner must produce a model that assigns unseen
inputs to one or more of these classes.
2: Regression: also a supervised problem, but one in which
the outputs are continuous rather than discrete.
Unsupervised learning

 Unsupervised learning is when an algorithm learns from plain examples without
any associated response, leaving the algorithm to
determine the data patterns on its own.
 In unsupervised learning the algorithm builds a
mathematical model from a set of data which contains
only inputs and no desired output labels.
 Unsupervised learning is classified into:
Clustering: a set of inputs is divided into
groups that are not known beforehand.
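As an illustration of clustering, the sketch below uses k-means, one common clustering algorithm that the slides do not name. The data values and the starting centroids are invented for the example, and the function name `kmeans_1d` is hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group scalar values around centroids without using any labels."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Invented, unlabeled inputs and hand-picked starting centroids.
values = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(values, centroids=[0, 5])
print(centroids, clusters)
```

No target labels are passed in anywhere: the two groups emerge only from how close the inputs are to each other.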
Semi-supervised Learning

 If some learning samples are labeled but others are
not, then it is semi-supervised learning.
 It makes use of a large amount of unlabeled data together
with a small amount of labeled data for training.
 Semi-supervised learning is applied in cases where it is
expensive to acquire a fully labeled dataset but
practical to label a small subset.
DATASETS USED IN MACHINE LEARNING

There are 3 types of datasets:
 Training Dataset
 Validation Dataset
 Testing Dataset
Training Dataset
 The sample of data used to fit the model. This is the actual dataset
that we use to train the model (its weights and biases, in the case
of a neural network). The model sees and learns from this data.
Validation Dataset:
 The sample of data used to provide an unbiased evaluation of a
model fit on the training dataset while tuning model
hyperparameters.
 The validation set is used to evaluate a given model frequently
during development, for example after each round of hyperparameter tuning.
Test Dataset:
 The sample of data used to provide an unbiased
evaluation of a final model fit on the training dataset.
The test dataset provides the gold standard used to
evaluate the model. It is used only once the model is
completely trained.
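The three datasets can be produced from one pool of records. The following is a minimal sketch, assuming a 60/20/20 split and a fixed shuffle seed; neither the proportions nor the function name `split_dataset` comes from the slides.

```python
import random

def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the rows, then cut them into training, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                 # used to fit the model
    val = shuffled[n_train:n_train + n_val]    # used while tuning hyperparameters
    test = shuffled[n_train + n_val:]          # used once, after training is done
    return train, val, test

rows = list(range(10))  # stand-in for 10 customer records
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

Shuffling before cutting keeps any ordering in the original records from leaking into one of the three sets.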
TYPES OF ALGORITHMS

 K-Nearest neighbors
 Support Vector Machines
 Decision Tree Classification
 Random Forest Classification
KNN (K- Nearest Neighbors)
 It’s a simple algorithm that stores all available cases and
classifies a new case by taking a majority vote of its
k nearest neighbours. The case is assigned to the class that is
most common among those neighbours.
 Things to consider before selecting KNN:
1: KNN is computationally expensive, since each prediction
compares the new case with the stored cases
2: Variables should be normalized, or else higher
range variables can bias the algorithm
3: Data still needs to be pre-processed
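The majority vote and the normalization point above can be sketched as follows. The customer records, labels, and helper names are invented for illustration, and min-max scaling is only one possible way to normalize.

```python
import math
from collections import Counter

def fit_min_max(rows):
    """Record each feature's minimum and maximum over the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def scale(row, lows, highs):
    """Map each feature to [0, 1] so large-range features do not dominate."""
    return tuple(
        (v - lo) / (hi - lo) if hi > lo else 0.0
        for v, lo, hi in zip(row, lows, highs)
    )

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    by_distance = sorted(
        zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (age, yearly income) records; the labels are invented.
customers = [(25, 30000), (30, 42000), (35, 40000),
             (45, 90000), (50, 85000), (55, 95000)]
bought_bike = ["no", "no", "no", "yes", "yes", "yes"]

lows, highs = fit_min_max(customers)
train_x = [scale(c, lows, highs) for c in customers]
prediction = knn_predict(train_x, bought_bike, scale((48, 88000), lows, highs))
print(prediction)
```

Without the scaling step, income (tens of thousands) would swamp age (tens) in the distance, which is the bias that point 2 warns about.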
SVM (Support Vector Machine)
 SVM is a method of classification in which you plot raw
data as points in an n-dimensional space (where n is the
number of features you have).
 The value of each feature is tied to a particular
coordinate of the point. A line (in general, a hyperplane)
that separates the classes acts as the classifier, and SVM
chooses the one with the widest margin to the nearest points.
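To make the separating line concrete, the sketch below classifies points by which side of a line they fall on. It assumes a linear SVM that has already been trained: the weights `w` and offset `b` are hand-picked hypothetical values, not fitted ones, and the function names are illustrative.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the line x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 or -1 depending on the side of the separating line."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def hinge_loss(w, b, x, y):
    """Loss used by SVM training: zero once x is on the correct side with margin >= 1."""
    return max(0.0, 1.0 - y * decision_value(w, b, x))

# Hypothetical line x1 + x2 = 3 in a two-feature space.
w, b = [1.0, 1.0], -3.0
print(classify(w, b, [2.5, 2.0]), classify(w, b, [0.5, 1.0]))
print(hinge_loss(w, b, [2.0, 1.5], 1))
```

The point (2.0, 1.5) is on the correct side but inside the margin, so it still incurs some loss; pushing such points out is what widens the margin during training.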
Decision Tree
 One of the most popular machine learning algorithms in
use today, this is a supervised learning algorithm that is
used for classification problems.
 It works well for both categorical and
continuous dependent variables. In this algorithm, we split
the population into two or more homogeneous sets based
on the most significant attributes/independent variables.
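One way to pick the "most significant" split is to try each candidate threshold and keep the one that leaves the purest groups. The sketch below does this for a single numeric attribute using Gini impurity, which is an assumed criterion (the slides do not say which measure is used); the income figures and labels are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure group, larger for a mixed one."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single split."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))  # baseline: no split at all
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for v, y in pairs if v <= threshold]
        right = [y for v, y in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical yearly income (in thousands) and invented bike-buyer labels.
income = [30, 40, 42, 85, 90, 95]
bought_bike = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(income, bought_bike)
print(threshold, impurity)
```

A full tree would repeat this search on each resulting group, across every attribute, until the groups are homogeneous.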
Random Forest
 A collection of decision trees is called a Random Forest. To
classify a new object based on its attributes, each tree gives
a classification, and we say the tree “votes” for that class. The forest
chooses the classification having the most votes.
 Each tree is planted & grown as follows:
 If the number of cases in the training set is N, then a sample
of N cases is taken at random, with replacement. This sample will be the
training set for growing the tree.
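 If there are M input variables, a number m << M is specified such that
at each node, m variables are selected at random out of the M, and the
best split on these m is used to split the node.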
 Each tree is grown to the largest extent possible. There is no
pruning.
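The sampling and voting steps can be sketched on their own, leaving out the tree growing itself. The helper names, the seed, and the example values are hypothetical; drawing cases with replacement and picking a random subset of the input variables follow the usual description of random forests.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) cases at random, with replacement, for one tree."""
    return [rng.choice(rows) for _ in rows]

def random_feature_subset(n_features, m, rng):
    """Pick m distinct input-variable indices out of n_features."""
    return sorted(rng.sample(range(n_features), m))

def forest_vote(tree_predictions):
    """Return the class that received the most votes from the trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

rng = random.Random(42)                      # fixed seed, for repeatability only
rows = ["case%d" % i for i in range(8)]      # stand-in training cases
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(n_features=5, m=2, rng=rng)
winner = forest_vote(["yes", "no", "yes", "yes", "no"])
print(len(sample), features, winner)
```

Because every tree sees a different sample and different variables, the trees make different mistakes, and the vote averages those mistakes out.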
BENEFITS

 Simplifies Product Marketing and Assists in Accurate
Sales Forecasts
 Facilitates Accurate Medical Predictions and Diagnoses
 Simplifies Time-Intensive Documentation in Data Entry
 Improves Precision of Financial Rules and Models
LIMITATIONS

 Machine learning algorithms require massive stores of
training data.
 Training data can contain bias, which the model then learns.
 AI algorithms don’t collaborate.
 Time constraints in learning.
APPLICATIONS

 Virtual Personal Assistants
 Video Surveillance
 Social Media Services
 Email Spam and Malware Filtering
 Online Fraud Detection
THANK YOU
