ML Unit-1 (CEC)

Machine learning is a branch of Artificial Intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.

Machine learning uses various algorithms to build mathematical and statistical models and to make predictions using historical data or information. Currently, it is used for various tasks such as image recognition, speech recognition, email filtering, Facebook auto-tagging, recommender systems, and many more.
 “The function of a machine learning system can be descriptive, meaning that the system uses the data to explain what happened; predictive, meaning the system uses the data to predict what will happen; or prescriptive, meaning the system will use the data to make suggestions about what action to take.”
Features of Machine Learning:
 Machine learning uses data to detect various patterns in a
given dataset.
 It can learn from past data and improve automatically.

 It is a data-driven technology.

 Machine learning is similar to data mining, as both deal with huge amounts of data.
Importance of Machine Learning: Machine learning can be easily understood through its use cases. Currently, machine learning is used in:
 Self-driving cars
 Cyber fraud detection
 Face recognition
 Facebook
 Netflix
 Recommender systems
 Handling the rapid increase in the production of data
 Solving complex problems that are difficult for a human
 Decision making in various sectors, including finance
 Finding hidden patterns and extracting useful information from data.
Classification of Machine Learning
At a broad level, machine learning can be classified into three
types:

 Supervised learning
 Unsupervised learning
 Reinforcement learning
 Supervised learning is a type of machine learning method in
which we provide sample labeled data to the machine learning
system in order to train it, and on that basis, it predicts the
output.
 The system creates a model using labeled data to understand the dataset and learn about each example. Once training and processing are done, we test the model by providing sample data to check whether it predicts the correct output.
 The goal of supervised learning is to map input data to output data. Supervised learning is based on supervision, just as a student learns under the supervision of a teacher. An example of supervised learning is spam filtering.
 Supervised learning can be grouped further in two categories
of algorithms:

 Classification
 Regression
How does supervised machine learning work?
Supervised learning algorithms are good for the
following tasks:
 Binary classification: Dividing data into two
categories.
 Multi-class classification: Choosing among more than two classes.
 Regression modeling: Predicting continuous values.
 Ensembling: Combining the predictions of multiple machine learning models to produce a more accurate prediction.
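To make the supervised workflow concrete, here is a minimal, illustrative sketch in Python (assuming scikit-learn is installed; the dataset is synthetic and the parameter choices are arbitrary), showing a binary classifier trained on labeled data and checked on held-out examples:

# Minimal supervised-learning sketch (assumes scikit-learn is installed).
# A synthetic labeled dataset is split into train/test parts, a logistic
# regression classifier is fitted on the labeled training data, and its
# predictions are checked against the held-out labels.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)          # learn from labeled examples
y_pred = model.predict(X_test)       # predict labels for unseen inputs
print("test accuracy:", accuracy_score(y_test, y_pred))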
Unsupervised learning is a learning method in which a machine
learns without any supervision.

 The training is provided to the machine with a set of data that has not been labeled, classified, or categorized, and the algorithm needs to act on that data without any supervision. The goal of unsupervised learning is to restructure the input data into new features or a group of objects with similar patterns.
 In unsupervised learning, we don't have a predetermined result. The machine tries to find useful insights from the huge amount of data. It can be further classified into two categories of algorithms:
 Clustering
 Association
 How does Unsupervised Machine Learning work?
Unsupervised learning algorithms are good for the following
tasks:
 Clustering: Splitting the dataset into groups based on
similarity.

 Anomaly detection: Identifying unusual data points in a data set.
 Association mining: Identifying sets of items in a data set that frequently occur together.
 Dimensionality reduction: Reducing the number of variables in a data set.
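As a small illustration of clustering, the following sketch (assuming scikit-learn is available; the blob data and the choice of 3 clusters are made up for the example) groups unlabeled points purely by similarity:

# Minimal unsupervised-learning sketch (assumes scikit-learn is installed).
# KMeans groups unlabeled points into clusters purely from similarity;
# no target labels are given to the algorithm.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)   # labels are ignored
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments:", kmeans.labels_[:10])
print("cluster centres:", kmeans.cluster_centers_)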
 Reinforcement learning is a feedback-based learning method in which a learning agent gets a reward for each right action and a penalty for each wrong action. The agent learns automatically from this feedback and improves its performance. In reinforcement learning, the agent interacts with the environment and explores it. The goal of the agent is to collect the most reward points, and in doing so it improves its performance.
 A robotic dog that automatically learns the movement of its limbs is an example of reinforcement learning.
Reinforcement learning is often used in areas such as:

 Robotics: Robots can learn to perform tasks in the physical world using this technique.
 Video game play: Reinforcement learning has been used to teach bots to play a number of video games.
 Resource management: Given finite resources and a defined goal, reinforcement learning can help enterprises plan how to allocate resources.
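The reward/penalty loop can be illustrated with a tiny tabular Q-learning sketch in plain Python. The 5-cell corridor environment below is hypothetical, not a standard library environment; it only shows how reward feedback updates the agent's action values:

# Toy reinforcement-learning sketch: tabular Q-learning on a hypothetical
# 5-cell corridor. The agent starts at cell 0, can move left or right, and
# earns a reward of +1 only when it reaches the rightmost cell.
import random

n_states, n_actions = 5, 2             # actions: 0 = left, 1 = right
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration
Q = [[0.0] * n_actions for _ in range(n_states)]

for episode in range(200):
    s = 0
    while s != n_states - 1:
        if random.random() < epsilon:
            a = random.randrange(n_actions)                   # explore
        else:
            a = max(range(n_actions), key=lambda x: Q[s][x])  # exploit best known action
        s_next = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # reward feedback updates the action-value estimate
        Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
        s = s_next

print("learned Q-values (one row per state):")
for row in Q:
    print([round(v, 2) for v in row])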

 Regression and Classification algorithms are Supervised Learning algorithms. Both are used for prediction in machine learning and work with labeled datasets, but they differ in how they are applied to different machine learning problems.
 The main difference between Regression and Classification algorithms is that Regression algorithms are used to predict continuous values such as price, salary, age, etc., while Classification algorithms are used to predict/classify discrete values such as Male or Female, True or False, Spam or Not Spam, etc.
 Classification is a process of finding a function which
helps in dividing the dataset into classes based on
different parameters.

 In Classification, a computer program is trained on the training dataset and, based on that training, it categorizes the data into different classes. The task of the classification algorithm is to find the mapping function that maps the input (x) to the discrete output (y).
 Example: The best example to understand the
Classification problem is Email Spam Detection. The
model is trained on the basis of millions of emails on
different parameters, and whenever it receives a new
email, it identifies whether the email is spam or not.
If the email is spam, then it is moved to the Spam
folder.
 Types of ML Classification Algorithms:
Classification Algorithms can be further divided
into the following types:
 Logistic Regression
 K-Nearest Neighbours (KNN)
 Support Vector Machines
 Kernel SVM
 Naive Bayes
 Decision Tree Classification
 Random Forest Classification
 Regression:

 Regression is a process of finding the correlations between dependent and independent variables. It helps in predicting continuous variables, such as market trends or house prices.
 The task of the Regression algorithm is to find the mapping function that maps the input variable (x) to the continuous output variable (y).
 Example: Suppose we want to do weather
forecasting, so for this, we will use the Regression
algorithm. In weather prediction, the model is trained
on the past data, and once the training is completed, it
can easily predict the weather for future days.
 Types of Regression Algorithm
 Simple Linear Regression
 Multiple Linear Regression
 Polynomial Regression
 Support Vector Regression (SVR)
 Decision Tree Regression
 Random Forest Regression
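As a small illustration of regression, the sketch below (assuming scikit-learn and NumPy are installed; the data are synthetic with a known slope and intercept) fits a simple linear regression and predicts a continuous value:

# Minimal regression sketch: a simple linear regression maps a continuous
# input x to a continuous output y. The data here are purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))                # single input feature
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 1, 100)    # y is roughly 3x + 5 plus noise

reg = LinearRegression().fit(X, y)
print("learned slope:", reg.coef_[0], "intercept:", reg.intercept_)
print("prediction for x=7:", reg.predict([[7.0]])[0])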
 Machine learning gives computer systems the ability to learn automatically without being explicitly programmed. But how does a machine learning system work? It can be described using the machine learning process, a life-cycle process for building an efficient machine learning project. The main purpose of the life cycle is to find a solution to the problem or project.
 The machine learning process involves seven major steps, which are given below:
 Gathering Data
 Data preparation
 Data Wrangling/Data Preprocessing
 Analyse Data
 Train the Model
 Test the Model
 Deployment
 The most important thing in the complete process is to understand the problem and to know its purpose. Therefore, before starting the process, we need to understand the problem, because a good result depends on a good understanding of the problem.
 In the complete life-cycle process, to solve a problem we create a machine learning system called a "model", and this model is created by providing "training". But to train a model we need data; hence, the life cycle starts with collecting data.
 Data gathering is the first step of the machine learning process. The goal of this step is to identify and obtain all the data related to the problem.
 In this step, we need to identify the different data sources, as data can be collected from various sources such as files, databases, the internet, or mobile devices. It is one of the most important steps of the process. The quantity and quality of the collected data determine the efficiency of the output: the more data we have, the more accurate the prediction will be.
 This step includes the tasks below:
 Identify various data sources
 Collect data
 Integrate the data obtained from different sources
 By performing the above tasks, we get a coherent set of data, also called a dataset, which will be used in further steps.

 After collecting the data, we need to prepare it for further steps. Data preparation is the step where we put our data into a suitable place and prepare it for use in machine learning training.
Data Exploration:

It is used to understand the nature of the data we have to work with. We need to understand the characteristics, format, and quality of the data. A better understanding of the data leads to an effective outcome. In this step, we find correlations, general trends, and outliers.
 Data wrangling is the process of cleaning and converting raw data into a usable format. It is the process of cleaning the data, selecting the variables to use, and transforming the data into a proper format to make it more suitable for analysis in the next step. It is one of the most important steps of the complete process. Cleaning of data is required to address quality issues.
 Collected data may have various issues, including:

 Missing Values
 Duplicate data
 Invalid data
 Noise
 So, we use various filtering techniques to clean the data, as sketched below.
 It is mandatory to detect and remove the above issues because they can negatively affect the quality of the outcome.
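A minimal wrangling sketch (assuming pandas is installed; the tiny DataFrame is hypothetical) showing how missing values, duplicate data, and invalid data might be handled:

# Data-wrangling sketch: drop duplicates, remove clearly invalid rows,
# and fill missing values. The values below are made up for illustration.
import pandas as pd

raw = pd.DataFrame({
    "age":    [25, 25, None, 40, -3],        # None = missing, -3 = invalid
    "salary": [50000, 50000, 62000, None, 45000],
})

clean = raw.drop_duplicates()                              # duplicate data
clean = clean[clean["age"].isna() | (clean["age"] > 0)].copy()  # invalid data
clean["age"] = clean["age"].fillna(clean["age"].median())        # missing values
clean["salary"] = clean["salary"].fillna(clean["salary"].median())
print(clean)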
 Now the cleaned and prepared data is passed on to the analysis step. This step involves:
 Selection of analytical techniques
 Building models
 Reviewing the results
 The aim of this step is to build a machine learning model to analyze the data using various analytical techniques and to review the outcome. It starts with determining the type of problem, where we select a machine learning technique such as Classification, Regression, Cluster analysis, or Association; we then build the model using the prepared data and evaluate it.

 Hence, in this step, we take the data and use machine learning algorithms to build the model.
 The next step is to train the model. In this step, we train our model to improve its performance and obtain a better outcome for the problem.
 We use datasets to train the model with various machine learning algorithms. Training a model is required so that it can learn the various patterns, rules, and features.
 Once our machine learning model has been trained on
a given dataset, then we test the model. In this step,
we check for the accuracy of our model by providing
a test dataset to it.

 Testing the model determines the percentage accuracy of the model as per the requirement of the project or problem.
 If the above-prepared model is producing an accurate
result as per our requirement with acceptable speed,
then we deploy the model in the real system. But
before deploying the project, we will check whether
it is improving its performance using available data
or not. The deployment phase is similar to making the
final report for a project.
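One common way to move a tested model into a real system is to serialise it and load it inside the serving application. The sketch below is only illustrative (it assumes scikit-learn and joblib are installed, uses the built-in Iris data, and the file name is an arbitrary choice):

# Deployment-flavoured sketch: persist a fitted model, then reload it the
# way a deployed service would before serving predictions.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

joblib.dump(model, "model.joblib")       # save the trained model to disk
served = joblib.load("model.joblib")     # what the live system would load
print(served.predict(X[:3]))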
 Weights and biases (commonly referred to as w and b) are the learnable parameters of some machine learning models.
 Inputs: Inputs are the set of values for which we need to predict an output value. They can be viewed as features or attributes in a dataset.
 Weights: Weights are real values attached to each input/feature; they convey the importance of the corresponding attribute in predicting the final output.
 Bias: Bias is used to shift the activation function towards the left or right; you can compare it to the y-intercept in the equation of a line.
 Summation function: The job of the summation function is to bind the weights and inputs together and calculate their sum.

 Activation Function: It is used to introduce non-linearity in the model.
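Putting these pieces together, a single artificial neuron can be sketched in a few lines of plain Python; the input values, weights, and bias below are made-up numbers chosen only to show the summation and activation steps:

# Sketch of a single artificial neuron (no libraries beyond math needed):
# the summation function combines inputs, weights, and bias, and a
# non-linear activation (here a sigmoid) produces the output.
import math

def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias   # summation function
    return 1.0 / (1.0 + math.exp(-z))                        # sigmoid activation

inputs  = [0.5, 1.5, -2.0]     # feature values
weights = [0.8, -0.4, 0.3]     # importance of each feature
bias    = 0.1                  # shifts the activation left or right
print("neuron output:", neuron(inputs, weights, bias))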



 Overfitting and Underfitting are the two main
problems that occur in machine learning and degrade
the performance of the machine learning models.

 The main goal of each machine learning model is to generalize well. Here, generalization means the ability of an ML model to provide a suitable output when adapting to a given set of unknown inputs. It means that after being trained on the dataset, the model can produce reliable and accurate output.
 Hence, underfitting and overfitting are two terms that need to be checked to assess the performance of the model and whether it is generalizing well or not.
Before understanding overfitting and underfitting, let's understand some basic terms that will help to understand this topic well:
 Noise: Noise is unnecessary and irrelevant data that
reduces the performance of the model.
 Bias: Bias is a prediction error that is introduced in
the model due to oversimplifying the machine
learning algorithms. Or it is the difference between
the predicted values and the actual values.
 Variance: If the machine learning model performs
well with the training dataset, but does not perform
well with the test dataset, then variance occurs.
Overfitting
 Overfitting occurs when our machine learning model tries to cover all the data points, or more than the required data points, present in the given dataset. Because of this, the model starts capturing noise and inaccurate values present in the dataset, and all these factors reduce the efficiency and accuracy of the model. The overfitted model has low bias and high variance.
 The chances of overfitting increase the more training we give our model: the more we train the model, the greater the chance of producing an overfitted model.
 Overfitting is the main problem that occurs in supervised learning.
 Both overfitting and underfitting degrade the performance of the machine learning model, but overfitting is the more common cause, so there are some ways by which we can reduce its occurrence in our model:
 Cross-Validation
 Training with more data
 Removing Noise
 Regularization
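As an illustration of two of these remedies, the sketch below (assuming scikit-learn is installed; the synthetic data and the alpha value are arbitrary) combines L2 regularization (Ridge) with 5-fold cross-validation:

# Ridge adds an L2 regularization penalty, and cross_val_score estimates
# performance with cross-validation instead of a single train/test split.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=30, noise=10.0, random_state=0)
model = Ridge(alpha=1.0)                      # alpha controls regularization strength
scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation (R^2 scores)
print("mean CV score:", scores.mean())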
 Underfitting: A statistical model or a machine learning algorithm is said to underfit when it cannot capture the underlying trend of the data, i.e., it performs poorly even on the training data and therefore also on new data. Underfitting destroys the accuracy of our machine learning model. Its occurrence simply means that our model or algorithm does not fit the data well enough. Underfitting can be avoided by using a more expressive model and more informative features, as listed in the techniques below.
Reasons for Underfitting:
 High bias and low variance
 The size of the training dataset used is not enough.
 The model is too simple.
 Training data is not clean and contains noise.
Techniques to reduce underfitting:
 Increase model complexity
 Increase the number of features
 Remove noise from the data.
 Increase the duration of training to get better results.
 Curse of Dimensionality refers to a set of problems that
arise when working with high-dimensional data. The
dimension of a dataset corresponds to the number of
attributes/features that exist in a dataset. A dataset with a
large number of attributes, generally of the order of a
hundred or more, is referred to as high dimensional data.
Some of the difficulties that come with high dimensional
data manifest during analyzing or visualizing the data to
identify patterns, and some manifest while training machine
learning models. The difficulties related to training machine
learning models due to high dimensional data are referred to
as the ‘Curse of Dimensionality’.
 The curse of dimensionality applies to our machine learning algorithms because, as the number of input dimensions gets larger, we will need more data to enable the algorithm to generalise sufficiently well: as the dimensionality grows, so will the number of data points we need. For this reason, we often have to be careful about what information we give to the algorithm, meaning that we need to understand something about the data in advance.
 As the dimensionality increases, the number of data
points required for good performance of any machine
learning algorithm increases exponentially.
 The curse of dimensionality basically means that the
error increases with the increase in the number of
features.
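One way to see this effect numerically is the following NumPy sketch (the sample sizes and dimensions are arbitrary): as the dimension grows, the nearest and farthest pairwise distances in a random sample become almost equal, which makes distance-based learning harder.

# Curse-of-dimensionality illustration: the ratio between the smallest and
# largest pairwise distances approaches 1 as the dimension d increases.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 1000):
    X = rng.random((100, d))                      # 100 random points in d dimensions
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # all pairwise Euclidean distances
    dist = dist[np.triu_indices(100, k=1)]        # keep each pair once
    print(f"d={d:5d}  min/max distance ratio = {dist.min() / dist.max():.3f}")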
 The validation set is used to evaluate a given model, but this is done frequently during development.
 A validation dataset is a sample of data held back from training your model that is used to estimate model skill while tuning the model's hyperparameters (to maximize the model's performance by minimizing a predefined loss function, producing better results with fewer errors).
 Training sets
 Initially, the development method involves initial inputs within specified project parameters. The process also requires expert setting of the weightings between the various connections of so-called neurons within the ML model or estimator.
 After the introduction of this first dataset, developers compare the resulting output to target answers. Next, they adjust the model's parameters, weighting, and functionality as needed.
 More than one epoch or iteration of this adjustment loop
is often necessary. The goal is to achieve a trained or
fitted model that relates to and corresponds with the
expected range of new, unknown data.
 Validation sets
 The next stage involves using a validation dataset to estimate the
accuracy of the ML model concerned. During this phase, developers
ensure that new data classification is precise and results are
predictable.

 Validation datasets comprise unbiased inputs and expected results designed to check the function and performance of the model. Different methods of cross-validation (CV) exist, though all aim to ensure stability by estimating how a predictive model will perform. An example is the usage of rotation estimation or out-of-sample testing to assure reasonable precision.

 Validation and fine-tuning involve various iterations. Whatever the methodology, these verification techniques aim to assess the results and check them against independent inputs. It is also possible to adjust the hyperparameters, i.e., the values used to control the overall process.

 Test sets
 The final step is to use a test set to verify the model's
functionality. Some publications refer to the validation
dataset as a test set, especially if there are only two
subsets instead of three. Similarly, if records in this final
test set have not formed part of a previous evaluation or
cross-validation, they might also constitute a holdout set.

 Test samples provide a simulated real-world check using unseen inputs and expected results. In practice, there could be some overlap between validation and testing. Each procedure shows that the ML model will function in a live environment once out of testing.
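A hedged sketch of the three-subset workflow (assuming scikit-learn is installed; the dataset, the candidate C values, and the split sizes are arbitrary choices): the validation set is used to pick a hyperparameter, and the untouched test set gives the final check.

# Train / validation / test split: tune a hyperparameter on the validation
# set, then report a final score on the held-out test set.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_C, best_score = None, -1.0
for C in (0.01, 0.1, 1.0, 10.0):                                   # hyperparameter tuning
    clf = make_pipeline(StandardScaler(), LogisticRegression(C=C, max_iter=1000))
    clf.fit(X_train, y_train)
    score = clf.score(X_val, y_val)                                # evaluate on validation set
    if score > best_score:
        best_C, best_score = C, score

final = make_pipeline(StandardScaler(), LogisticRegression(C=best_C, max_iter=1000))
final.fit(X_train, y_train)
print("chosen C:", best_C, " test accuracy:", final.score(X_test, y_test))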
 A confusion matrix presents a table layout of the predicted and actual outcomes of a classification problem and helps visualize them. It tabulates all the predicted and actual values of a classifier.
 A confusion matrix is a tabular summary of the
number of correct and incorrect predictions made
by a classifier. It is used to measure the performance
of a classification model. It can be used to evaluate
the performance of a classification model through the
calculation of performance metrics like accuracy,
precision, recall, and F1-score.
 The confusion matrix is a matrix used to determine
the performance of the classification models for a
given set of test data. It can only be determined if the
true values for test data are known. The matrix itself
can be easily understood, but the related
terminologies may be confusing. Some features of
Confusion matrix are given below:
 For a classifier with 2 prediction classes, the matrix is a 2x2 table; for 3 classes, it is a 3x3 table, and so on.
 The matrix is divided into two dimensions, that
are predicted values and actual values along with
the total number of predictions.
 Predicted values are those values, which are predicted
by the model, and actual values are the true values for
the given observations.

 A good model is one which has high TP and TN rates and low FP and FN rates.

 If you have an imbalanced dataset to work with, it’s always better to use the confusion matrix as the evaluation criterion for your machine learning model.
 True Positive: The number of times our actual positive
values are equal to the predicted positive. You predicted
a positive value, and it is correct.
 False Positive: The number of times our model wrongly predicts a positive value when the actual value is negative. You predicted a positive value, and it is actually negative.
 True Negative: The number of times our actual
negative values are equal to predicted negative values.
You predicted a negative value, and it is actually
negative.
 False Negative: The number of times our model wrongly predicts a negative value when the actual value is positive. You predicted a negative value, and it is actually positive.
Sensitivity (recall) tells us what proportion of the positive class got correctly classified.
 Misclassification rate: It is also termed the error rate, and it describes how often the model gives wrong predictions. The error rate is calculated as the ratio of the number of incorrect predictions to the total number of predictions made by the classifier. The formula is given below:
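Error rate = (FP + FN) / (TP + TN + FP + FN), i.e. 1 - Accuracy. The sketch below (assuming scikit-learn is installed; the two label vectors are made-up examples) derives TP, TN, FP, FN from a confusion matrix and computes the metrics discussed next:

# Confusion-matrix sketch: extract TP, TN, FP, FN and compute the usual metrics.
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]   # actual values
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]   # predicted values

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("accuracy  :", accuracy_score(y_true, y_pred))
print("precision :", precision_score(y_true, y_pred))
print("recall    :", recall_score(y_true, y_pred))
print("F1-score  :", f1_score(y_true, y_pred))
print("error rate:", (fp + fn) / (tp + tn + fp + fn))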
When to use Accuracy / Precision / Recall / F1-Score?
 a. Accuracy is used when the True Positives and True
Negatives are more important. Accuracy is a better
metric for Balanced Data.
 b. Whenever False Positive is much more important
use Precision.
 c. Whenever False Negative is much more important
use Recall.
 d. F1-Score is used when the False Negatives and
False Positives are important. F1-Score is a better
metric for Imbalanced Data.
 ROC or Receiver Operating Characteristic curve
represents a probability graph to show the
performance of a classification model at different
threshold levels. The curve is plotted between two
parameters, which are:
 True Positive Rate or TPR
 False Positive Rate or FPR
 In the curve, TPR is plotted on Y-axis, whereas FPR
is on the X-axis.
 AUC stands for Area Under the ROC Curve. As its name suggests, AUC measures the two-dimensional area under the entire ROC curve, ranging from (0,0) to (1,1).
 In the ROC curve, AUC computes the performance of
the binary classifier across different thresholds and
provides an aggregate measure. The value of AUC
ranges from 0 to 1, which means an excellent model
will have AUC near 1, and hence it will show a good
measure of Separability.
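A short sketch of how the curve is obtained in practice (assuming scikit-learn and matplotlib are installed; the synthetic dataset and the choice of logistic regression are arbitrary): roc_curve sweeps the decision threshold to produce FPR/TPR pairs, and roc_auc_score aggregates them into a single value.

# ROC/AUC sketch: plot TPR against FPR across thresholds and report the AUC.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=1000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, probs)   # one (FPR, TPR) pair per threshold
print("AUC:", roc_auc_score(y_te, probs))
plt.plot(fpr, tpr)
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.show()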
 Classification of 3D models
The curve can be used to classify 3D models and separate them from normal models. With a specified threshold level, the curve separates the 3D models from the non-3D ones.

 Healthcare
The curve has various applications in the healthcare sector. It can be
used to detect cancer disease in patients. It does this by using false
positive and false negative rates, and accuracy depends on the
threshold value used for the curve.

 Binary Classification
AUC-ROC curve is mainly used for binary classification problems
to evaluate their performance.
 While making predictions, a difference occurs
between prediction values made by the model and
actual values/expected values, and this difference is
known as bias errors or Errors due to bias. It can be
defined as an inability of machine learning algorithms
such as Linear Regression to capture the true
relationship between the data points. Each algorithm
begins with some amount of bias because bias occurs
from assumptions in the model, which makes the target
function simple to learn. A model has either:
 Low Bias: A low bias model will make fewer
assumptions about the form of the target function.
 High Bias: A model with a high bias makes more
assumptions, and the model becomes unable to
capture the important features of our dataset. A high
bias model also cannot perform well on new data.
 Some examples of machine learning algorithms with low bias are Decision Trees, k-Nearest Neighbours and Support Vector Machines. On the other hand, algorithms with high bias include Linear Regression, Linear Discriminant Analysis and Logistic Regression.
 Variance specifies the amount of variation in the prediction if different training data were used. In simple words, variance tells how much a random variable differs from its expected value. Ideally, a model should not vary too much from one training dataset to another, which means the algorithm should be good at understanding the hidden mapping between input and output variables. Variance errors are either low variance or high variance.
 Low variance means there is a small variation in the
prediction of the target function with changes in the
training data set.
 High variance shows a large variation in the
prediction of the target function with changes in the
training dataset.
 Low-Bias, Low-Variance:
The combination of low bias and low variance shows
an ideal machine learning model.

 Low-Bias, High-Variance: With low bias and high variance, model predictions are inconsistent but accurate on average. This case occurs when the model learns with a large number of parameters and hence leads to overfitting.
 High-Bias, Low-Variance: With high bias and low variance, predictions are consistent but inaccurate on average. This case occurs when a model does not learn well from the training dataset or uses a small number of parameters. It leads to underfitting problems in the model.

 High-Bias, High-Variance:
With high bias and high variance, predictions are
inconsistent and also inaccurate on average.
 While building the machine learning model, it is
really important to take care of bias and variance in
order to avoid overfitting and underfitting in the
model. If the model is very simple with fewer
parameters, it may have low variance and high bias.
Whereas, if the model has a large number of
parameters, it will have high variance and low bias.
So, it is required to make a balance between bias and
variance errors, and this balance between the bias
error and variance error is known as the Bias-
Variance trade-off.
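The trade-off can be made visible with a small experiment (assuming scikit-learn and NumPy are installed; the sine-shaped data and the polynomial degrees are arbitrary choices): a very low-degree model underfits (high bias), a very high-degree model overfits (high variance), and an intermediate degree balances both.

# Bias-variance illustration: compare train and test error for polynomial
# models of increasing complexity on noisy sine-shaped data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X.ravel()) + rng.normal(0, 0.2, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    print(f"degree {degree:2d}: "
          f"train MSE = {mean_squared_error(y_tr, model.predict(X_tr)):.3f}, "
          f"test MSE = {mean_squared_error(y_te, model.predict(X_te)):.3f}")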
