Machine Learning Unit 1
UNIT 1
MACHINE LEARNING
Machine learning is a subfield of artificial intelligence (AI) that
focuses on the development of algorithms and statistical models that
enable computers to perform tasks without explicit programming
instructions. The primary goal of machine learning is to allow
computers to learn from data and make predictions or decisions based
on that learning.
Types of Machine Learning:
1. Supervised Learning:
In supervised learning, the algorithm learns from labeled data, where each input-output pair is
provided during the training phase. The algorithm aims to learn the mapping from input to
output.
2. Unsupervised Learning:
Unsupervised learning involves training algorithms on data without labeled responses. The
algorithm tries to find patterns or structures within the data.
3. Semi-supervised Learning:
This approach combines both labeled and unlabeled data for training. It's particularly useful
when labeled data is scarce.
4. Reinforcement Learning:
Reinforcement learning involves training algorithms to make sequential decisions. The
algorithm learns to interact with an environment by receiving feedback in the form of rewards or
penalties.
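To make supervised learning concrete, here is a minimal sketch of learning from labeled input-output pairs. It uses a 1-nearest-neighbour rule, which is only one possible choice of algorithm; the data values and the names `labeled_data` and `predict_1nn` are made up for illustration.

```python
# Toy labeled data (made-up values): (hours_studied, hours_slept) -> result.
labeled_data = [
    ((1.0, 4.0), "fail"),
    ((2.0, 5.0), "fail"),
    ((6.0, 7.0), "pass"),
    ((8.0, 8.0), "pass"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict_1nn(training_pairs, new_input):
    """Return the label of the training input closest to new_input."""
    _, nearest_label = min(
        training_pairs,
        key=lambda pair: squared_distance(pair[0], new_input),
    )
    return nearest_label


# The "mapping from input to output" is applied to inputs not seen before.
print(predict_1nn(labeled_data, (7.0, 8.0)))
print(predict_1nn(labeled_data, (1.0, 3.0)))
```

In unsupervised learning the same inputs would be given without the "pass"/"fail" labels, and the algorithm would have to group them on its own.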
Applications of Machine learning
■ Machine learning is one of today's most talked-about technologies, and its use is growing rapidly.
We use machine learning in our daily life, often without noticing it, through products such as Google
Maps, Google Assistant, and Alexa. Below are some popular real-world applications of
machine learning:
1. Image Recognition
■ Image recognition, a key application of machine learning, helps computers identify
objects, people, places, and more in digital images.
■ One popular example is automatic friend tagging on Facebook:
When we upload a photo with our friends on Facebook, the platform automatically
suggests tagging them by name. This is made possible by machine learning algorithms
that detect and recognize faces.
Behind the scenes, Facebook has used a deep learning system called "DeepFace" to recognize faces and
identify people in photos. It makes it easier for us to tag our friends in
pictures without having to do it manually.
2. Speech Recognition
■ Speech recognition, a common use of machine learning, powers features like "Search by
voice" in Google.
■ It works by converting spoken words into text, also known as "Speech to text."
■ Many applications, like Google Assistant, Siri, Cortana, and Alexa, use machine learning
algorithms for speech recognition. They understand and respond to voice commands,
making it easier for users to interact with technology using their voice.
3. Traffic prediction
■ When we use Google Maps to navigate to a new destination, it not only shows us a suitable route
(typically the fastest one) but also predicts the traffic conditions along the way. It relies on two kinds of data:
1. Real-time data: Google Maps gathers information about the current location of vehicles through its
app and various sensors. It tracks the movement of cars on the roads to determine if traffic is flowing
smoothly, moving slowly, or heavily congested.
2. Historical data: Google Maps also considers past traffic patterns for the same route and time of day.
By analyzing data from previous days, it estimates how long it typically takes to travel the route at that
particular time.
■ By combining real-time and historical data, Google Maps can provide accurate
predictions about traffic conditions, helping users plan their journeys more efficiently.
4. Product recommendations
■ Many e-commerce and entertainment companies like Amazon and Netflix use
machine learning to recommend products to users.
For example, when we search for a product on Amazon, we might start seeing ads for
similar products while browsing the internet. This is because Amazon's machine
learning algorithms understand our interests and suggest relevant products.
Similarly, when we use Netflix, we receive recommendations for movies and TV shows
based on our viewing history. Machine learning helps Netflix understand our
preferences and suggests content we're likely to enjoy.
5. Self-driving cars
■ Self-driving cars represent one of the most fascinating applications of machine learning. Tesla,
a leading car manufacturer, is at the forefront of developing self-driving technology.
■ In self-driving cars, machine learning algorithms play a crucial role in enabling vehicles to
navigate and make decisions autonomously. Tesla trains deep neural networks, largely on labeled
camera data (supervised learning), to detect and recognize people, objects, and obstacles while driving.
■ This technology allows self-driving cars to perceive their surroundings and make real-time
decisions, with the goal of navigating roads safely with little or no human intervention. It's an exciting
advancement that holds the promise of revolutionizing transportation and improving road safety in the future.
Key Algorithms
1. Linear Regression: A basic algorithm for modeling the relationship
between a dependent variable and one or more independent variables.
2. Logistic Regression: For binary classification problems, logistic
regression estimates the probability that an instance belongs to a
particular class.
3. Decision Trees: Decision trees recursively split the data based on
features, resulting in a tree-like structure used for classification or
regression tasks.
4. Random Forests: An ensemble learning method that constructs
multiple decision trees during training and outputs the mode of the
classes (classification) or the average prediction (regression).
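5. Support Vector Machines (SVM): SVM finds the hyperplane that
separates the classes with the largest possible margin. With kernel
functions, it can also separate classes that are not linearly separable
in the original feature space.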
6. Neural Networks: Inspired by the human brain, neural networks consist of
interconnected layers of nodes (neurons) that process input data. Deep neural networks
(DNNs) are neural networks with multiple hidden layers.
7. Clustering Algorithms: Such as K-means, hierarchical clustering, etc., used in
unsupervised learning to group similar data points together.
8. Dimensionality Reduction Techniques: Like Principal Component Analysis (PCA) and
t-distributed Stochastic Neighbor Embedding (t-SNE), used to reduce the number of
features in a dataset while preserving its essential characteristics.
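As one concrete instance from the list above, here is a minimal sketch of linear regression with a single independent variable, fitted with the standard least-squares formulas. The function name `fit_simple_linear_regression` and the data are made up for illustration; real projects would normally use a library.

```python
def fit_simple_linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(xs, ys)
print(slope, intercept)

# Use the fitted line to predict the output for a new input.
print(slope * 5.0 + intercept)
```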
Evaluation Metrics:
Various metrics are used to assess the performance of machine learning models. For classification, common
choices are accuracy, precision, recall, F1-score, and ROC-AUC; for regression, a common choice is
mean squared error (MSE).
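The sketch below shows how several of these metrics can be computed by hand for a binary classifier, plus MSE for a regression model. The helper names and the example labels are made up for illustration; the formulas themselves are the standard definitions.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up actual labels and model predictions (1 = positive class).
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)

print(mean_squared_error([3.0, 5.0], [2.0, 7.0]))
```

Precision asks "of the items predicted positive, how many really were?", while recall asks "of the items that really were positive, how many did we find?". F1 is the harmonic mean of the two.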
■ The machine learning life cycle involves seven major steps, which are given below:
■ 1. Gathering Data:
➢ The first step in the machine learning journey is gathering data. This means finding and
collecting all the information we need.
➢ In this step, we look for data in different places like files, databases, the internet, or even
from mobile devices.
➢ This step is very important because the amount and quality of data we collect
determine how well our predictions will work. In general, the more good-quality data we have,
the better our predictions will be.
➢ We do a few things in this step:
• Figure out where to get data from
• Collect the data
• Put all the data together from different places into one big set called a dataset.
Once we've got our dataset, we can move on to the next steps in our machine learning
adventure!
■ 2. Data preparation
After we collect our data, the next step is to prepare it for the rest of the machine learning
process. This step is called data preparation.
Here's what we do in data preparation:
1. Put Data Together and Randomize: First, we gather all our data into one place and mix it up
so it's not in any particular order.
2. Data Exploration:
➢ We take some time to understand the data we have. This means looking at what kind of data it
is, how it's structured, and if there are any mistakes or missing pieces.
➢ Understanding our data well helps us get better results later on. During this stage, we try to
find patterns, trends, and any unusual data bits.
3. Data Pre-processing:
➢ Once we understand our data well, we pre-process it so that it is ready for analysis. So, in simple terms, data
preparation involves getting all data organized, understanding it, and then getting it ready to be
analyzed by our machine learning system.
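The randomizing and exploration steps above can be sketched as follows. This is a minimal illustration on made-up records; the column names and the helpers `randomize` and `explore` are assumptions, not part of any particular library.

```python
import random

# Made-up records gathered into one place; None marks a missing value.
rows = [
    {"age": 25, "income": 30000},
    {"age": None, "income": 42000},
    {"age": 31, "income": None},
    {"age": 47, "income": 58000},
]


def randomize(records, seed=0):
    """Return a shuffled copy so the order of collection does not matter."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled


def explore(records):
    """Per column: how many values are missing, plus the smallest and largest."""
    summary = {}
    for column in records[0]:
        values = [r[column] for r in records if r[column] is not None]
        summary[column] = {
            "missing": len(records) - len(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


shuffled_rows = randomize(rows)
summary = explore(shuffled_rows)
print(summary)
```

A fixed `seed` is used only so that the shuffle is repeatable from run to run.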
■ 3. Data Wrangling
Data wrangling is all about getting your data into shape so it's ready for analysis. This
process involves cleaning up the data, selecting the important parts, and transforming it
into a format that's easy to work with in the next steps.
Here's why data wrangling is so important:
1. Cleaning Data: Sometimes, the data we collect isn't perfect. It might have missing
values, duplicates, invalid entries, or even random noise. Data wrangling helps us
identify and fix these issues to make our data more reliable.
2. Selecting Variables: Not all the data we collect will be useful for our analysis. Data
wrangling allows us to pick out the important variables that we actually need.
By using various techniques like filtering, we can clean up our data and make sure it's in
good shape for the next stages of our project. This is crucial because the quality of our
data directly impacts the quality of our final results.
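Here is a minimal sketch of the two wrangling tasks described above: cleaning (missing values, duplicates, invalid entries) and selecting variables. The records, the valid age range of 0 to 120, and the helper names are assumptions made for illustration.

```python
# Made-up raw records with typical problems.
raw_rows = [
    {"id": 1, "age": 25, "city": "Pune", "income": 30000},
    {"id": 2, "age": None, "city": "Delhi", "income": 42000},   # missing value
    {"id": 1, "age": 25, "city": "Pune", "income": 30000},      # duplicate
    {"id": 3, "age": -4, "city": "Mumbai", "income": 51000},    # invalid age
    {"id": 4, "age": 47, "city": "Chennai", "income": 58000},
]


def clean(records):
    """Drop rows with missing values, invalid ages, or exact duplicates."""
    seen = set()
    cleaned = []
    for row in records:
        if any(value is None for value in row.values()):
            continue  # missing value
        if not 0 <= row["age"] <= 120:
            continue  # invalid entry (assumed valid range: 0 to 120)
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # duplicate of a row we already kept
        seen.add(key)
        cleaned.append(row)
    return cleaned


def select_variables(records, columns):
    """Keep only the columns we actually need for the analysis."""
    return [{c: row[c] for c in columns} for row in records]


prepared = select_variables(clean(raw_rows), ["age", "income"])
print(prepared)
```

Dropping problem rows is the simplest filtering choice. Another common option is to fill in missing values (for example with the column average) instead of discarding the row.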
■ 4. Data Analysis
Now that we've got our data all cleaned up and ready to go, it's time to start analyzing it. This step
involves a few key tasks:
1. Choosing Analysis Techniques: Just like picking the right tools for a job, we need to select the
best techniques for analyzing our data. This might involve methods like sorting, categorizing,
or finding patterns.
2. Building Models: Think of this like building a blueprint for a house. We use our data to create
models that help us understand and predict outcomes. We might use different types of
models depending on the problem we're trying to solve.
3. Reviewing Results: Once our models are built, we take a look at what they tell us. Are they
giving us the insights we expected? Do they accurately represent our data? This step helps us
fine-tune our analysis and make any necessary adjustments.
So, in simple terms, in this step,
we take our cleaned-up data and use special algorithms to build models that help us
understand it better.
■ 5. Train Model
In the "Train Model" step, we teach our model to get better at its job. Here's how it works:
1. Training the Model: Just like teaching a student, we show our model lots of examples from
our datasets. This helps it learn different patterns, rules, and features in the data.
2. Using Machine Learning Algorithms: Think of these as different teaching methods. We use
various algorithms to train our model, each one helping it understand the data differently.
The goal here is to improve the model's performance so that it can give us better results when
we apply it to real-world problems.
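Training can be pictured as repeatedly showing the model the examples and nudging its parameters to reduce its error. The sketch below does this for a one-variable linear model using gradient descent, which is just one possible training algorithm; the data, learning rate, and number of epochs are made-up values chosen for illustration.

```python
def mean_squared_error(weight, bias, xs, ys):
    """Average squared error of the model y = weight * x + bias."""
    return sum((weight * x + bias - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit weight and bias by gradient descent on the mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error of the current model on every training example.
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to weight and bias.
        grad_weight = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_bias = (2.0 / n) * sum(errors)
        # Move both parameters a small step in the direction that lowers the error.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Made-up training examples that follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

loss_before = mean_squared_error(0.0, 0.0, xs, ys)
weight, bias = train_linear_model(xs, ys)
loss_after = mean_squared_error(weight, bias, xs, ys)
print(loss_before, loss_after)
print(weight, bias)
```

The error after training is much smaller than before training, which is what "improving the model's performance" means in practice.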
■ 6. Test Model
In the "Test Model" step, we evaluate how well our trained machine learning model
performs. Here's what happens:
1. Testing the Model: After we've trained our model on a specific dataset, we give it a
different dataset that it has not seen during training to see how well it performs. This dataset is called a test dataset.
2. Checking Accuracy: We measure how accurate our model is by comparing its predictions
with the actual outcomes in the test dataset.
This gives us a percentage that shows how well the model performs according to the
project's requirements or the problem we're trying to solve.
The goal of this step is to ensure that our model is reliable and gives accurate predictions
when applied to new data.
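The accuracy check described above can be sketched in a few lines. Here `threshold_model` is a hypothetical stand-in for a trained model, and the test examples are made up; the point is only to show predictions being compared with actual outcomes.

```python
def threshold_model(hours_studied):
    """Hypothetical stand-in for a trained model: predicts from one feature."""
    return "pass" if hours_studied >= 5 else "fail"


def accuracy_percent(model, test_examples):
    """Percentage of test examples where the prediction matches the actual label."""
    correct = sum(1 for features, label in test_examples
                  if model(features) == label)
    return 100.0 * correct / len(test_examples)


# Made-up test dataset: (hours_studied, actual_result), not used for training.
test_dataset = [(2, "fail"), (4, "pass"), (6, "pass"), (9, "pass")]

test_accuracy = accuracy_percent(threshold_model, test_dataset)
print(test_accuracy)
```

The model gets the second example wrong (it predicts "fail" for 4 hours), so three of the four predictions match the actual outcomes.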
■ 7. Deployment
In the "Deployment" phase, we put our machine learning model to work in the real world. Here's
how it unfolds:
1. Real-world Implementation: After ensuring our model delivers accurate results at an
acceptable speed, we integrate it into the actual system where it will be used.
2. Performance Monitoring: Both before the project is fully deployed and afterwards, we keep monitoring the
model's performance with real-world data. This ensures that it maintains its accuracy and
effectiveness over time.
Think of the deployment phase as finalizing and presenting our project's findings. It's the
culmination of all the hard work put into developing and refining the machine learning model.
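Performance monitoring can be as simple as tracking accuracy over the most recent predictions and raising a flag when it drops. The sketch below is one possible way to do this; the class name `AccuracyMonitor`, the window size, and the alert threshold are all assumptions made for illustration.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent predictions of a deployed model."""

    def __init__(self, window_size=100, alert_below=0.8):
        # Only the latest `window_size` outcomes are kept; older ones drop off.
        self.outcomes = deque(maxlen=window_size)
        self.alert_below = alert_below

    def record(self, prediction, actual):
        """Store whether one real-world prediction turned out to be correct."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Fraction of correct predictions in the window, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """True when recent accuracy has fallen below the alert threshold."""
        current = self.accuracy()
        return current is not None and current < self.alert_below


monitor = AccuracyMonitor(window_size=4, alert_below=0.75)
for prediction, actual in [("pass", "pass")] * 4:
    monitor.record(prediction, actual)
healthy_accuracy = monitor.accuracy()
healthy_flag = monitor.needs_attention()

# Two wrong predictions arrive; the window now holds 2 correct and 2 wrong.
for prediction, actual in [("pass", "fail")] * 2:
    monitor.record(prediction, actual)
print(monitor.accuracy(), monitor.needs_attention())
```

When the flag is raised, the usual response is to investigate the new data and, if needed, retrain the model.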
Previous year Questions
■ Write the applications of machine learning (5 marks, 2024)
■ Explain the challenges of machine learning (8 marks, 2024)
■ Explain the life cycle of developing a working model of machine learning (8 marks)