ML Unit 1 Solution
SRES’s
SHREE RAMCHANDRA COLLEGE OF ENGINEERING
Lonikand, Pune – 412216
Ref. №: SRCOE/COMP /2024-25/ Date:
UNIT I
Introduction to Machine Learning
Comparison of AI and Machine Learning:
1. In AI, we make intelligent systems to perform any task like a human. In ML, we teach machines with data to perform a particular task and give an accurate result.
2. Machine learning and deep learning are the two main subsets of AI. Deep learning is the main subset of machine learning.
3. AI has a very wide scope. Machine learning has a limited scope.
4. The main applications of AI are Siri, customer support using chatbots, expert systems, and online games. The main applications of machine learning are online recommender systems, Google search algorithms, and Facebook auto friend tagging.
5. On the basis of capabilities, AI can be divided into three types: Weak AI, General AI, and Strong AI. Machine learning can also be divided into three main types: Supervised learning, Unsupervised learning, and Reinforcement learning.
The problem is that the actual unknown underlying function may not be a linear function like
a line. It could be almost a line and require some minor transformation of the input data to
work right. Or it could be nothing like a line, in which case the assumption is wrong and
the approach will produce poor results.
3. Explain various Data formats that conform ML elements. Apr 2022 [5]
Ans:-
Data Formats in Machine Learning
Each data format represents how the input data is represented in memory.
This is important as each machine learning application performs well for a particular data
format and worse for others.
Interchanging between various data formats and choosing the correct format is a major
optimization technique.
NHWC:-
NHWC denotes (Batch size, Height, Width, Channel). This means the data is a 4D array
whose first dimension is the batch size, followed by the height, width, and channel
dimensions. This 4D array is laid out in memory in row-major order. Hence, you can
visualize the memory layout to see which operations will access consecutive memory (fast)
and which will access memory separated by other data (slow).
NCHW:-
NCHW denotes (Batch size, Channel, Height, Width). This means the data is a 4D array
whose first dimension is the batch size, followed by the channel, height, and width
dimensions. This 4D array is laid out in memory in row-major order.
NCDHW:-
NCDHW denotes (Batch size, Channel, Depth, Height, Width). This means the data is a 5D
array whose first dimension is the batch size, followed by the channel, depth, height, and
width dimensions. This 5D array is likewise laid out in memory in row-major order.
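As a rough sketch (assuming NumPy; the shapes below are invented for illustration), converting a batch between the NHWC and NCHW layouts is just an axis permutation:

```python
import numpy as np

# Hypothetical batch: 8 RGB images of size 224x224 stored as NHWC.
batch_nhwc = np.random.rand(8, 224, 224, 3).astype(np.float32)

# Re-order the axes to NCHW (batch, channel, height, width).
batch_nchw = np.transpose(batch_nhwc, (0, 3, 1, 2))

print(batch_nhwc.shape)  # (8, 224, 224, 3)
print(batch_nchw.shape)  # (8, 3, 224, 224)

# transpose only changes the view; ascontiguousarray actually rewrites
# the data so the new layout is contiguous (row-major) in memory.
batch_nchw = np.ascontiguousarray(batch_nchw)
```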
Continuous variables are numeric variables that have an infinite number of values between
any two values. A continuous variable can be numeric or date/time. For example, the length
of a part or the date and time a payment is received.
Regression algorithms are used if there is a relationship between the input variable and the
output variable.
It is used for the prediction of continuous variables, such as Weather forecasting, Market
Trends, etc.
1. Linear Regression
2. Regression Trees
3. Non-Linear Regression
4. Bayesian Linear Regression
5. Polynomial Regression
6. Logistic Regression
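A minimal sketch of one of these algorithms, linear regression, using scikit-learn (the tiny dataset below is invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: hours of sunshine (input) vs. temperature in degrees C (continuous output).
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([15.0, 18.0, 21.5, 24.0, 27.5])

model = LinearRegression()
model.fit(X, y)

# Predict the temperature for 6 hours of sunshine.
print(model.predict([[6]]))
print(model.coef_, model.intercept_)  # learned slope m and intercept c
```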
Classification
Classification attempts to find the appropriate class label, such as analyzing
positive/negative sentiment, male and female persons, benign and malignant tumors, secure
and unsecure loans etc.
Classification algorithms are used when the output variable is categorical, i.e., there are
two or more classes such as Yes-No, Male-Female, True-False, etc.
1. Decision Trees
2. Random Forest
3. Support Vector Machines
4. Neural Networks
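As a small illustrative sketch (invented data), a decision tree classifier from the list above can be trained with scikit-learn:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data: [age, income in thousands] -> loan approved (1) or not (0).
X = [[25, 30], [40, 80], [35, 60], [22, 20], [50, 120], [28, 45]]
y = [0, 1, 1, 0, 1, 0]

clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Predict the class label for a new applicant.
print(clf.predict([[30, 55]]))
```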
Cluster analysis finds the commonalities between the data objects and categorizes them as
per the presence and absence of those commonalities.
Association:
An association rule is an unsupervised learning method which is used for finding
relationships between variables in a large database. It determines the set of items that occur
together in the dataset. Association rules make marketing strategy more effective; for
example, people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A
typical example of association rule learning is Market Basket Analysis.
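A minimal sketch of the market-basket idea (pure Python, with invented transactions), counting how often pairs of items are bought together:

```python
from itertools import combinations
from collections import Counter

# Invented shopping baskets.
transactions = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Support of a pair = co-occurrence count / number of transactions.
for pair, count in pair_counts.most_common(3):
    print(pair, "support =", count / len(transactions))
```

Algorithms such as Apriori (listed below) build on exactly these co-occurrence counts, but prune the search so it scales to large databases.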
The list of some popular unsupervised learning algorithms:
1. K-means clustering
2. KNN (K-Nearest Neighbors)
3. Hierarchical clustering
4. Anomaly detection
5. Neural Networks
6. Principal Component Analysis
7. Independent Component Analysis
8. Apriori algorithm
9. Singular Value Decomposition
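As a hedged sketch of the first algorithm in this list, K-means clustering on a handful of invented 2-D points:

```python
import numpy as np
from sklearn.cluster import KMeans

# Invented 2-D points forming two loose groups.
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.0, 0.6],
              [8.0, 8.0], [9.0, 9.5], [8.5, 7.8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(X)

print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # centroids found by the algorithm
```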
student has to figure out a concept himself, and Semi-Supervised learning is where a teacher
teaches a few concepts in class and gives homework questions based on similar concepts.
6) Problem Framing:
Requires the use of exploratory data analysis and data mining.
7) Data Cleaning:
Requires the use of outlier detection, imputation and more (a small imputation sketch follows this list).
8) Data Selection:
Requires the use of data sampling and feature selection methods.
9) Model Configuration:
Requires the use of statistical hypothesis tests and estimation statistics.
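As a concrete illustration of the data-cleaning step (invented values, using scikit-learn's SimpleImputer), missing entries can be filled in by imputation:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# A small invented table with missing values (np.nan): [age, income].
X = np.array([[25.0, 50000.0],
              [np.nan, 60000.0],
              [35.0, np.nan],
              [40.0, 80000.0]])

# Replace each missing entry with the mean of its column.
imputer = SimpleImputer(strategy="mean")
X_clean = imputer.fit_transform(X)
print(X_clean)
```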
6. Compare Machine Learning with traditional programming. Discuss
types of Machine Learning with suitable examples. Apr 2023 [5]
Ans:-
Comparison of Machine Learning with Traditional Programming
In traditional programming, for any solution the first task is the creation of the most suitable
algorithm and writing the code by hand; in machine learning, by contrast, the machine learns
the rules directly from data examples.
Machine learning’s impact extends to autonomous vehicles, drones, and robots, enhancing
their adaptability in dynamic environments. This approach marks a breakthrough where
machines learn from data
examples to generate accurate outcomes, closely intertwined with data mining and data
science.
With an understanding of the common machine learning uses, let’s explore some
examples of the popular applications in the market that rely heavily on machine learning.
2. Transportation (Uber)
Uber is a customized cab application that relies on machine learning to automatically
locate a rider, and offer options to travel home, to work, or to any other regular location
based on the rider’s history and patterns. Moreover, the app further uses ML algorithms to
make precise predictions of the Estimated Time of Arrival (ETA) to a particular
destination by analyzing traffic conditions.
3. Language Translation (Google Translate)
To break all language barriers and make traveling to foreign countries easy, Google
Translate employs Google Neural Machine Translation (GNMT) which relies on Natural
Language Processing (NLP) to translate words across thousands of languages and
dictionaries. It also makes use of POS Tagging, Named Entity Recognition (NER), and
Chunking to maintain the words’ tonality.
4.Image Recognition:
Image recognition is one of the most common applications of machine learning. It is
used to identify objects, persons, places, digital images, etc. The popular use case of image
recognition and face detection is, Automatic friend tagging suggestion:
Facebook provides us with a feature of automatic friend tagging suggestions. Whenever we
upload a photo with our Facebook friends, we automatically get a tagging suggestion with
their names, and the technology behind this is machine learning's face detection and
recognition algorithm.
It is based on the Facebook project named "Deep Face," which is responsible for face
recognition and person identification in the picture.
5.Speech Recognition:
While using Google, we get an option of "Search by voice"; this comes under speech
recognition, and it's a popular application of machine learning.
Speech recognition is a process of converting voice instructions into text, and it is also
known as "Speech to text", or "Computer speech recognition." At present, machine learning
algorithms are widely used in various applications of speech recognition. Google
Assistant, Siri, Cortana, and Alexa use speech recognition technology to follow voice
instructions.
6.Traffic prediction:
If we want to visit a new place, we take the help of Google Maps, which shows us the
correct path with the shortest route and predicts the traffic conditions. It predicts the traffic
conditions, such as whether traffic is clear, slow-moving, or heavily congested, in two ways:
o Real-time location of the vehicle from the Google Maps app and sensors
o Average time taken on past days at the same time.
Everyone who uses Google Maps is helping to make the app better. It takes information
from the user and sends it back to its database to improve performance.
7.Product recommendations:
Machine learning is widely used by various e-commerce and entertainment companies
such as Amazon, Netflix, etc., for product recommendation to the user. Whenever we search
for some product on Amazon, we start getting advertisements for the same product while
surfing the internet on the same browser, and this is because of machine learning. Google
understands the user's interest using various machine learning algorithms and suggests
products as per the customer's interest.
Similarly, when we use Netflix, we find recommendations for entertainment series,
movies, etc., and this is also done with the help of machine learning.
8.Self-driving cars:
One of the most exciting applications of machine learning is self-driving cars. Machine
learning plays a significant role in self-driving cars. Tesla, a well-known car manufacturer,
is working on self-driving cars and uses machine learning methods to train its car models
to detect people and objects while driving.
o Permission filters
Linear models are parametric, which means that they have a fixed form with a small number
of numeric
parameters that need to be learned from data.
For example, in f (x) = mx + c, m and c are the parameters that we are trying to learn from
the data.
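As an illustrative sketch (with invented data points), the two parameters m and c of f(x) = mx + c can be estimated with a simple least-squares fit:

```python
import numpy as np

# Invented noisy observations of a roughly linear relationship.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1, 11.0])

# Fit a degree-1 polynomial: returns the slope m and the intercept c.
m, c = np.polyfit(x, y, 1)
print("m =", m, "c =", c)

# The learned model is then f(x) = m*x + c.
print("f(6) =", m * 6 + c)
```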
This technique is different from tree or rule models, where the structure of the model (e.g.,
which features to use in the tree, and where) is not fixed in advance.
Linear models are stable, i.e., small variations in the training data have only a limited impact
on the learned model.
In contrast, tree models tend to vary more with the training data, as the choice of a different
split at the root of the tree typically means that the rest of the tree is different as well.
As a result of having relatively few parameters, Linear models have low variance and high
bias. This implies that Linear models are less likely to overfit the training data than some
other models. However, they are more likely to underfit.
For example, if we want to learn the boundaries between countries based on labeled data,
then linear models are not likely to give a good approximation.
Distance Model
Distance-based models are the second class of Geometric models.
Like Linear models, distance-based models are based on the geometry of data.
As the name implies, distance-based models work on the concept of distance.
In the context of Machine learning, the concept of distance is not based on merely the
physical distance between two points.
Instead, we could think of the distance between two points considering the mode of
transport between two points.
Travelling between two cities by plane covers less distance physically than by train, because
the plane is unrestricted and can fly in a straight line.
Similarly, in chess, the concept of distance depends on the piece used – for example, a
Bishop can move diagonally.
Thus, depending on the entity and the mode of travel, the concept of distance can be
experienced differently.
The distance metrics commonly used are Euclidean, Minkowski, Manhattan, and
Mahalanobis. Distance is applied through the concept of neighbors and exemplars.
Neighbors are points in proximity with respect to the distance measure expressed through
exemplars.
Exemplars are either centroids that find a centre of mass according to a chosen distance
metric or medoids that find the most centrally located data point.
The most commonly used centroid is the arithmetic mean, which minimizes squared
Euclidean distance to all other points.
The algorithms under Geometric Model: KNN, Linear Regression, SVM, Logistic
Regression etc.
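A small sketch (with invented points) of the Euclidean and Manhattan distances mentioned above, and of the arithmetic-mean centroid:

```python
import numpy as np

a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# Euclidean distance: straight-line distance between the two points.
euclidean = np.sqrt(np.sum((a - b) ** 2))

# Manhattan distance: sum of absolute coordinate differences.
manhattan = np.sum(np.abs(a - b))

print(euclidean, manhattan)  # 5.0 7.0

# The arithmetic mean of a set of points is the centroid that minimizes
# the total squared Euclidean distance to all of the points.
points = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]])
centroid = points.mean(axis=0)
print(centroid)  # [3. 2.]
```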
2. Probabilistic Models
The third family of machine learning algorithms is the probabilistic models.
The k-nearest neighbour algorithm uses the idea of distance (e.g., Euclidean distance) to
classify
entities, and logical models use a logical expression to partition the instance space.
Here the probabilistic models use the idea of probability to classify new entities.
Probabilistic models see features and target variables as random variables.
The process of modeling represents and manipulates the level of uncertainty with respect to
these variables.
There are two types of probabilistic models: Predictive and Generative.
Predictive probability models use the idea of a conditional probability distribution P (Y |X)
from which Y can be predicted from X.
Generative models estimate the joint distribution P (Y, X). Once we know the joint
distribution for the generative models, we can derive any conditional or marginal
distribution involving the same variables.
Thus, the generative model is capable of creating new data points and their labels, knowing
the joint probability distribution.
The joint distribution looks for a relationship between two variables.
Once this relationship is inferred, it is possible to infer new data points.
The algorithms under Probabilistic Models: Naïve Bayes, Gaussian Process Regression, etc.
Naïve Bayes is an example of a probabilistic classifier.
The goal of any probabilistic classifier is, given a set of features (x_0 through x_n) and a set
of classes (c_0 through c_k), to determine the probability of the features occurring in each
class and to return the most likely class.
Therefore, for each class, we need to calculate P(c_i | x_0, …, x_n).
We can do this using Bayes' rule: P(c_i | x_0, …, x_n) = P(x_0, …, x_n | c_i) P(c_i) / P(x_0, …, x_n).
The Naïve Bayes algorithm is based on the idea of Conditional Probability.
Conditional probability is based on finding the probability that something will happen, given
that something else has already happened.
The task of the algorithm then is to look at the evidence and to determine the likelihood of a
specific class and assign a label accordingly to each entity.
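A minimal sketch of a Naïve Bayes classifier with scikit-learn (the feature values below are invented purely for illustration):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Toy features: [word count, number of links] for messages labelled
# spam (1) or not spam (0).
X = np.array([[120, 5], [80, 4], [300, 0], [250, 1], [90, 6], [400, 0]])
y = np.array([1, 1, 0, 0, 1, 0])

model = GaussianNB()
model.fit(X, y)

# Estimated P(class | features) for a new message, and the most likely label.
print(model.predict_proba([[100, 3]]))
print(model.predict([[100, 3]]))
```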
Features should be selected so that a minimum correlation exists between them and a
maximum correlation exists between the selected features and output.
Feature engineering is the process of transforming the original data into new, potentially
more informative features.
In simple words, feature engineering is converting raw data into useful features, i.e., getting
the maximum out of the original data.
Feature engineering is arguably the most crucial and time-consuming step of the ML
pipeline.
Feature selection and engineering answers questions – Are these features going to make
any sense in our prediction? It deals with the accuracy and precision of data.
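As a hedged illustration of feature engineering (invented data, assuming pandas is available), new features can be derived from a raw timestamp column:

```python
import pandas as pd

# Invented raw data: a purchase timestamp and an amount.
df = pd.DataFrame({
    "purchase_time": pd.to_datetime(["2024-01-05 09:30", "2024-01-06 18:45"]),
    "amount": [250.0, 90.0],
})

# Engineer new features from the raw timestamp.
df["hour"] = df["purchase_time"].dt.hour
df["day_of_week"] = df["purchase_time"].dt.dayofweek
df["is_weekend"] = df["day_of_week"] >= 5

print(df)
```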
4. Model Training:
After the first three steps are completed, we enter the model training phase.
This is the first step where the developer officially trains the model on the data.
To train the model, data is split into three parts- Training data, validation data, and test
data.
Around 70%-80% of data goes into the training data set which is used in training the
model.
Validation data is also known as development set or dev set and is used to avoid overfitting
or underfitting situations i.e. enabling hyperparameter tuning.
Hyperparameter tuning is a technique used to combat overfitting and underfitting.
Validation data is used during model evaluation.
Around 10%-15% of data is used as validation data.
The remaining 10%-15% of the data goes into the test data set. The test data set is used for
testing after the model preparation.
It is crucial to randomize data sets while splitting the data to get an accurate model.
Data can be randomized using scikit-learn in Python, as sketched below.
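A hedged sketch of the roughly 70/15/15 split described above, using scikit-learn's train_test_split (the dummy arrays are invented for illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)   # dummy features (100 samples)
y = np.arange(100)                   # dummy targets

# First carve off 70% for training; shuffle=True randomizes the rows.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=0.7, shuffle=True, random_state=42)

# Split the remaining 30% equally into validation and test sets.
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=0.5, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 70 15 15
```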
5. Model Evaluation:
After the model training, validation, or development data is used to evaluate the model.
To get the most accurate assessment, the test data may be used for further model evaluation.
A confusion matrix is created after model evaluation to calculate accuracy and precision
numerically. After model evaluation, our model enters the final stage that is prediction.
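As an illustration (the label vectors below are invented), accuracy and a confusion matrix can be computed with scikit-learn:

```python
from sklearn.metrics import confusion_matrix, accuracy_score, precision_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # labels predicted by the model

print(confusion_matrix(y_true, y_pred))
# [[3 1]
#  [1 3]]
print(accuracy_score(y_true, y_pred))   # 0.75
print(precision_score(y_true, y_pred))  # 0.75
```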
6. Prediction:
In the prediction phase, the developer deploys the model.
After model deployment, it becomes ready to make predictions.
Predictions are made on training data and test data to get a better understanding of the
built model.
The deployment of the model isn't a one-time exercise. As more and more data gets
generated, the model is trained on the new data, evaluated again, and deployed again. Model
training, model evaluation, and prediction thus form a continuous cycle.