ML Unit-1
Introduction-
• Machine Learning is a branch of artificial intelligence (AI) that focuses on
creating systems capable of learning and making decisions without being
explicitly programmed.
• The idea is to enable computers to learn from experience, analyze data, and
improve their performance over time.
• Machine Learning is like teaching computers to learn from examples.
Instead of giving them strict instructions, we show them lots of examples,
and they figure out patterns by themselves.
• The core idea is to allow machines to automatically improve and adapt their
performance based on experience, uncovering patterns, and making
intelligent choices in diverse applications.
AI vs ML vs DL
Machine Learning (ML) is a subset of artificial intelligence (AI) that focuses on the
development of algorithms and statistical models that enable computers to perform
tasks without explicit programming. ML is widely used across various industries for
its ability to learn from data, recognize patterns, and make predictions or decisions.
Using Traditional Programming Techniques vs Machine Learning
1. First, you would consider what spam typically looks like. You might notice that
some words or phrases (such as “4U,” “credit card,” “free,” and “amazing”) tend to
come up a lot in the subject line.
2. You would write a detection algorithm for each of the patterns that you noticed,
and your program would flag emails as spam if a number of these patterns were
detected.
3. You would test your program and repeat steps 1 and 2 until it was good enough
to launch.
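As a contrast, a minimal sketch (in Python) of the traditional, hand-written rule approach described above; the keyword list and threshold are purely illustrative:

# Traditional programming: every rule is written and tuned by hand.
# The patterns and threshold below are hypothetical examples.
SPAM_PATTERNS = ["4u", "credit card", "free", "amazing"]

def looks_like_spam(subject: str, threshold: int = 2) -> bool:
    # Flag an email as spam if enough hand-picked patterns appear in the subject.
    subject = subject.lower()
    hits = sum(1 for pattern in SPAM_PATTERNS if pattern in subject)
    return hits >= threshold

print(looks_like_spam("Amazing offer: a FREE credit card 4U"))  # True
print(looks_like_spam("Meeting agenda for Monday"))             # False

With machine learning, this hand-maintained rule list is replaced by a model that learns the patterns from labelled example emails.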
Another area where Machine Learning shines is for problems that either are too
complex for traditional approaches or have no known algorithm.
1. Image Recognition:
It is based on the Facebook project named "Deep Face," which is responsible for
face recognition and person identification in the picture.
2. Speech Recognition
While using Google, we get an option of "Search by voice," it comes under speech
recognition, and it's a popular application of machine learning.
3. Traffic prediction:
If we want to visit a new place, we take help of Google Maps, which shows us the
correct path with the shortest route and predicts the traffic conditions.It predicts the
traffic conditions such as whether traffic is cleared, slow-moving, or heavily
congested with the help of two ways:
o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time.
4. Product recommendations:
5. Self-driving cars:
6. Email spam and malware filtering:
o Content Filter
o Header filter
o General blacklists filter
o Rules-based filters
o Permission filters
7. Online fraud detection:
Machine learning is making our online transactions safe and secure by detecting
fraudulent transactions. Whenever we perform an online transaction, there are various
ways a fraudulent transaction can take place, such as fake accounts, fake IDs, and
money being stolen in the middle of a transaction. To detect this, a feed-forward
neural network helps us by checking whether a transaction is genuine or fraudulent.
8. Stock market trading:
Machine learning is widely used in stock market trading. In the stock market, there
is always a risk of ups and downs in shares, so a long short-term memory (LSTM)
neural network is used for the prediction of stock market trends.
9. Medical diagnosis:
In medical science, machine learning is used for disease diagnosis. With it,
medical technology is growing very fast and is able to build 3D models that can predict
the exact position of lesions in the brain.
Machine Learning Life Cycle
The machine learning life cycle involves seven major steps, which are given below:
o Gathering Data
o Data preparation
o Data Wrangling
o Analyse Data
o Train the model
o Test the model
o Deployment
1. Gathering Data:
Data can be collected from various sources such as files, databases, the internet,
or mobile devices. The quantity and quality of the collected data will determine the
efficiency of the output: the more data we have, the more accurate the prediction
will be. This step includes the below tasks:
o Identify various data sources
o Collect data
o Integrate the data obtained from different sources
By performing the above tasks, we get a coherent set of data, also called a dataset.
2. Data preparation
o Data exploration:
It is used to understand the nature of the data that we have to work with. We
need to understand the characteristics, format, and quality of the data. A better
understanding of the data leads to an effective outcome. In this step, we find
correlations, general trends, and outliers.
o Data pre-processing:
Now the next step is preprocessing of data for its analysis.
3. Data Wrangling
1. Data wrangling is the process of cleaning and converting raw data into a
usable format. Cleaning of data is required to address quality issues.
2. The data we have collected is not always useful to us, as some of it may be
irrelevant. In real-world applications, collected data may have
various issues, including:
o Missing Values
o Duplicate data
o Invalid data
o Noise
So, we use various filtering techniques to clean the data; a small pandas sketch follows.
It is mandatory to detect and remove the above issues because they can negatively
affect the quality of the outcome.
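A minimal pandas sketch of these cleaning steps; the column names, the invalid value (-1), and the imputation choice are hypothetical:

import numpy as np
import pandas as pd

# Hypothetical raw data with a missing value, a duplicate row, and an invalid age.
raw = pd.DataFrame({
    "age":    [25, 32, np.nan, 32, -1],
    "salary": [50000, 60000, 55000, 60000, 58000],
})

clean = raw.drop_duplicates()                         # remove duplicate rows
clean = clean.replace({"age": {-1: np.nan}})          # treat invalid ages as missing
clean = clean.fillna({"age": clean["age"].median()})  # impute missing ages with the median
print(clean)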
4. Data Analysis
Now the cleaned and prepared data is passed on to the analysis step. This step
involves:
o Selection of analytical techniques
o Building models
o Review the result
In this step, we take the data and use machine learning algorithms to build the model.
5. Train Model
In this step, we train our model to improve its performance and obtain a better
outcome for the problem.
• We use datasets to train the model using various machine learning algorithms.
Training a model is required so that it can understand the various patterns,
rules, and features.
6. Test Model
• Once our machine learning model has been trained on a given dataset, then
we test the model. In this step, we check for the accuracy of our model by
providing a test dataset to it.
Testing the model determines the percentage accuracy of the model as per the
requirements of the project or problem.
7. Deployment
• The last step of machine learning life cycle is deployment, where we deploy
the model in the real-world system.
• If the above-prepared model is producing an accurate result as per our
requirement with acceptable speed, then we deploy the model in the real
system.
But before deploying the project, we check whether it improves its
performance using the available data or not.
There are so many different types of Machine Learning systems that it is useful to
classify them in broad categories, based on the following criteria:
1. Whether or not they are trained with human supervision
a) Supervised
b) Unsupervised
c) Semi-supervised or Reinforcement Learning
2. Whether or not they can learn incrementally on the fly
a) Online learning
b) Batch learning
3. Based on how the system generalizes from the training data to make
predictions on new, unseen data.
a) instance-based learning
b) model-based learning
Supervised Learning
The working of supervised learning can be understood with a simple example: the model
is first trained on labelled input-output pairs and is then asked to predict the labels
of new, unseen inputs, as in the sketch below.
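A minimal supervised learning sketch using scikit-learn; the tiny dataset (hours studied, hours slept → pass/fail) is made up purely for illustration:

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Hypothetical labelled data: [hours studied, hours slept] -> pass (1) / fail (0)
X = [[1, 4], [2, 5], [3, 6], [6, 7], [7, 8], [8, 6], [2, 3], [9, 7]]
y = [0, 0, 0, 1, 1, 1, 0, 1]

# Split into a training set (to learn from) and a test set (to evaluate on).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)                           # learn patterns from labelled examples
print(accuracy_score(y_test, model.predict(X_test)))  # accuracy on unseen data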
Unsupervised Learning
5. Dimensionality Reduction:
• Definition: Reducing the number of features while preserving
important information.
• Example Algorithms: Principal Component Analysis (PCA), t-Distributed
Stochastic Neighbor Embedding (t-SNE).
• Use Cases: Visualization, feature engineering, noise reduction (a PCA sketch
follows this list).
6. Association:
• Definition: Discovering relationships or associations between
variables in the dataset.
• Example Algorithms: Apriori algorithm for frequent itemset mining.
• Use Cases: Market basket analysis, recommendation systems.
7. Evaluation in Unsupervised Learning:
• Evaluation is often more subjective compared to supervised learning.
• Metrics may depend on the specific task; for example, silhouette score
for clustering.
8. Challenges:
• Lack of clear objectives can make evaluation challenging.
• Interpretability of results may be difficult.
9. Applications:
• Anomaly detection, pattern recognition, exploratory data analysis.
10. Considerations:
• The choice of algorithm depends on the nature of the data and the desired
outcome.
• Preprocessing and scaling may still be necessary for certain unsupervised
learning tasks.
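A minimal dimensionality-reduction sketch with scikit-learn's PCA, referenced from item 5 above; the random data is only a stand-in for real features:

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical data: 100 samples with 10 features, two of which are strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X[:, 1] = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)

pca = PCA(n_components=2)            # keep only the 2 most informative directions
X_reduced = pca.fit_transform(X)     # project the data onto those directions

print(X_reduced.shape)               # (100, 2)
print(pca.explained_variance_ratio_) # share of variance kept by each component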
In association rule learning, we typically deal with data in the form of transactions,
where items are purchased together. Let's modify the example to represent a
dataset suitable for association rule learning, specifically for identifying
associations between different employee characteristics:
After applying an association rule learning algorithm, the results might include
rules such as:
• {Monthly Income ($) > 6000} => {Master's}
• This rule suggests that employees with a monthly income greater than
$6000 are likely to have a Master's degree.
• {Years of Experience < 3} => {High School}
• This rule implies that employees with less than 3 years of experience
are likely to have a High School education.
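A minimal sketch of how the support and confidence behind such rules could be computed by hand in Python; the employee records below are made up:

# Hypothetical employee "transactions": each record is a set of attribute items.
records = [
    {"Income>6000", "Master's"},
    {"Income>6000", "Master's"},
    {"Income>6000", "Bachelor's"},
    {"Experience<3", "High School"},
    {"Experience<3", "High School"},
]

def support(itemset):
    # Fraction of records that contain every item in the itemset.
    return sum(itemset <= r for r in records) / len(records)

def confidence(antecedent, consequent):
    # P(consequent | antecedent): support of both divided by support of the antecedent.
    return support(antecedent | consequent) / support(antecedent)

# Rule: {Income>6000} => {Master's}
print(support({"Income>6000", "Master's"}))       # 0.4
print(confidence({"Income>6000"}, {"Master's"}))  # about 0.67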
Data plays a significant role in the machine learning process. One of the
significant issues that machine learning professionals face is the absence of good
quality data. Unclean and noisy data can make the whole process extremely
exhausting.
Inadequate Training Data / Insufficient Quantity of Training Data
The major issue that comes while using machine learning algorithms is the lack of
quality as well as quantity of data. Although data plays a vital role in the processing
of machine learning algorithms, many data scientists claim that inadequate, noisy, and
unclean data make training machine learning algorithms extremely difficult. For example,
a simple task requires thousands of data samples, while an advanced task such as speech
or image recognition needs millions of sample examples. Further, data quality is also
important for the algorithms to work ideally, but poor data quality is also commonly
found in machine learning applications. Data quality can be affected by factors such as
noise, incorrect values, and unrepresentative samples.
Data Mismatch
In some cases, it’s easy to get a large amount of data for training,
but this data probably won’t be perfectly representative of the data that will be
used in production. For example, suppose you want to create a mobile app to take
pictures of flowers and automatically determine their species. You can easily
download millions of pictures of flowers on the web, but they won’t be perfectly
representative of the pictures that will actually be taken using the app on a mobile
device.
Statistical Learning
• Statistical learning, also known as statistical machine learning, is a
field of study that focuses on developing and utilizing algorithms and
statistical models to analyze and interpret data.
• The primary goal of statistical learning is to make predictions or
inferences based on data, often with an emphasis on understanding
underlying patterns and relationships.
Training, Testing & Validation Datasets
Training Set
This is the actual dataset from which a model trains, i.e., the model sees and
learns from this data to predict the outcome or to make the right decisions.
Testing Set
This dataset is independent of the training set but has a somewhat similar type
of probability distribution of classes and is used as a benchmark to evaluate
the model, used only after the training of the model is complete.
Validation Set
The validation set is used to fine-tune the hyperparameters of the model and is
considered a part of the training of the model. The model only sees this data
for evaluation and does not learn from it; a small split sketch follows.
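A minimal sketch of carving one dataset into training, validation, and test sets with scikit-learn; the 60/20/20 split ratio is just one common choice:

import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 100 samples with 5 features each.
X, y = np.arange(500).reshape(100, 5), np.arange(100)

# First carve off 20% as the final test set.
X_temp, X_test, y_temp, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Then split the remainder into training (75% of it) and validation (25% of it).
X_train, X_val, y_train, y_val = train_test_split(X_temp, y_temp, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20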
Training Loss and Test Loss
Training Loss
1. Definition:
• The training loss measures how well a machine learning model
performs on the training data. It is calculated using a loss function that
quantifies the difference between the model's predictions and the
actual target values in the training set.
2. Optimization Objective:
• During training, the goal is to minimize the training loss. Optimization
algorithms, such as gradient descent, are used to adjust the model
parameters to achieve this objective (a minimal gradient-descent sketch
appears after this discussion).
3. Overfitting:
• A very low training loss might indicate that the model is fitting the
training data too closely, potentially capturing noise and patterns that
do not generalize well to new, unseen data.
Test Loss:
1. Definition:
• The test loss, also known as validation loss, measures how well a
trained model generalizes to new, unseen data. It is calculated using
the same loss function but on a separate dataset that the model has not
seen during training.
2. Generalization:
• The test loss is crucial for assessing the model's ability to generalize.
A low test loss indicates that the model is making accurate predictions
on data it has never encountered before.
3. Overfitting Detection:
• Comparing the training loss and test loss helps in detecting
overfitting. If the training loss is significantly lower than the test loss,
it suggests overfitting, and adjustments to the model complexity may
be needed.
Trade-off:
• There is often a trade-off between minimizing training loss and
achieving good generalization. Striking the right balance is essential
to prevent overfitting and ensure the model performs well on new
data.
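A minimal sketch of gradient descent minimizing the training loss for a one-parameter model, as referenced in the optimization objective above; the data, learning rate, and number of steps are illustrative:

import numpy as np

# Hypothetical training data generated from y = 3x plus a little noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=50)
y = 3.0 * x + rng.normal(scale=0.1, size=50)

w = 0.0     # single model parameter for the prediction y_hat = w * x
lr = 0.1    # learning rate

for step in range(200):
    y_hat = w * x
    grad = np.mean(2 * (y_hat - y) * x)  # gradient of the mean squared error w.r.t. w
    w -= lr * grad                       # step against the gradient to reduce training loss

print(w)                                 # ends up close to 3.0
print(np.mean((w * x - y) ** 2))         # final training loss (MSE)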
Mean Squared Error (MSE)
• Formula: MSE = (1/n) * ∑i (yi − ŷi)², where ŷi is the prediction and yi is the actual value.
• Definition: MSE measures the average squared difference between the predicted
values and the actual values. Squaring the differences penalizes larger errors
more heavily than smaller ones.
• Interpretation: MSE is sensitive to outliers and tends to amplify their impact
Root Mean Squared Error (RMSE)
• Definition: RMSE is the square root of the MSE. It provides a measure of the
average magnitude of the errors in the predicted values, in the same units as the
target variable.
• Formula: RMSE = √MSE
• Interpretation: RMSE is more interpretable than MSE as it is in the same units as
the target variable.
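A minimal sketch computing MSE and RMSE for a handful of hypothetical predictions:

import numpy as np

y_true = np.array([3.0, 5.0, 2.5, 7.0])  # actual target values (made up)
y_pred = np.array([2.5, 5.0, 4.0, 8.0])  # model predictions (made up)

mse = np.mean((y_true - y_pred) ** 2)    # average squared error
rmse = np.sqrt(mse)                      # same units as the target variable
print(mse, rmse)                         # 0.875  ~0.935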
Cross-Validation:
Cross-validation splits the training data into k folds; the model is trained on k−1
folds and validated on the remaining fold, rotating the held-out fold so that every
example is used for validation once, and the fold scores are averaged for a more
reliable estimate of generalization performance.
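A minimal 5-fold cross-validation sketch with scikit-learn; the regression data is synthetic:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression data: y depends linearly on the single feature plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=100)

# Train on 4 folds, validate on the 5th, rotate, then average the scores.
scores = cross_val_score(LinearRegression(), X, y, cv=5,
                         scoring="neg_mean_squared_error")
print(-scores.mean())  # average validation MSE across the 5 folds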
Empirical Risk Minimization (ERM)
Empirical risk minimization (ERM) is a concept in machine learning where the goal is to find a
model that performs well on the training data. In simple terms, it's about minimizing the error
on the data you have.
1. Empirical Risk:
• Definition: The average error of a model on the training data.
• Example: If you have a dataset of student exam scores and you're
trying to predict grades, the empirical risk would be how well your
model predicts the actual grades for the students in your training set.
2. Minimization:
• Definition: The process of finding the model that minimizes the
empirical risk.
• Example: Adjusting the parameters of your grade prediction model
(like changing weights in a linear regression) to make sure it predicts
the training data grades as accurately as possible.
3. Task:
• Definition: The overall objective you want your model to achieve,
often stated as a minimization problem.
• Example: Your task is to build a model that predicts student grades
accurately. In ERM, you're adjusting the model to minimize the
difference between predicted and actual grades on the training data.
4. Formally, if we have a dataset consisting of input-output pairs (x,y), the
empirical risk of a model f(x) is given by:
R_emp(f) = (1/n) * ∑i L(yi, f(xi))
where n is the size of the dataset, and L(yi, f(xi)) is the loss function that measures
the discrepancy between the model's prediction f(xi) and the true output yi for each
example i.
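A minimal sketch of the formula above, using squared loss as L; the model and data are hypothetical:

# Empirical risk R_emp(f) = (1/n) * sum_i L(yi, f(xi)), here with squared loss.
def f(x):
    # A hypothetical model: predict the grade as 10 * hours studied.
    return 10 * x

def squared_loss(y, y_hat):
    return (y - y_hat) ** 2

data = [(2, 25), (4, 38), (6, 55), (8, 82)]  # (hours studied, actual grade), made up

emp_risk = sum(squared_loss(y, f(x)) for x, y in data) / len(data)
print(emp_risk)  # 14.5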
Structural Risk Minimization (SRM)
1. Structural Risk:
• Definition: The risk associated with the complexity of a model.
• Example: If you have a model that can perfectly memorize every student's
grade in your training data (high complexity), it might not generalize well
to new students. The structural risk is the risk of overfitting.
2. Minimization:
• Definition: The process of finding a model that minimizes both training
error and model complexity.
• Example: Adjusting the model parameters to balance accurate predictions
on the training data while avoiding unnecessary complexity that might
lead to overfitting.
3. Task:
• Definition: The overall objective of finding a model that generalizes well
to new, unseen data.
• Example: Your task is not only to predict grades accurately on the training
data but also to ensure that your model's predictions generalize well to
new students, maintaining a balance between fitting the training data and
avoiding overly complex models.
To sum up, structural risk minimization involves finding a model that not only fits the
training data well but also avoids being too complex. This helps prevent overfitting and
ensures that the model performs well on new, unseen data by striking a balance
between accuracy and simplicity.
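A minimal sketch of this trade-off: adding a complexity penalty to the empirical risk, in the spirit of ridge regression; the data and penalty weight alpha are illustrative:

import numpy as np
from numpy.linalg import solve

# Synthetic data: y really depends on the first feature only; the rest is noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=30)

def fit(X, y, alpha):
    # Minimize empirical risk + alpha * ||w||^2 (a ridge-style complexity penalty).
    return solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

w_erm = fit(X, y, alpha=0.0)   # pure empirical risk minimization
w_srm = fit(X, y, alpha=10.0)  # penalized: weights shrunk toward a simpler model

print(np.round(w_erm, 2))      # may pick up noise in the irrelevant features
print(np.round(w_srm, 2))      # smaller, more conservative weights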