
Khwaja Fareed University of Engineering & Information Technology, Rahim Yar Khan

Department of Data Science & AI

Program: BS Data Science
Fall 2023
Assignment 1
Name: Muhammad Sarmad Iqbal
Roll No: COSC-222102008
Course Code: COSC-4121
Course Instructor: Shahzad Hussain
Course Name: Deep Learning and Applications
Total Marks: 10
Weightage: 2.50
Deadline: 15th October, 2023

Note: Dear students, please complete this assignment within the due date and time.
No assignment will be accepted after the due date and time. Upload the assignment on the LMS.
1.How would you define Machine Learning?
Machine Learning is a subfield of artificial intelligence that focuses on the
development of algorithms and statistical models that enable computers to
learn and make predictions or decisions without being explicitly programmed.
It involves the use of data to train and improve the performance of these
algorithms, allowing them to recognize patterns, make predictions, and adapt
to new information. Machine Learning is used in various applications, such as
image recognition, natural language processing, recommendation systems, and
more.
2.Can you name four types of problems where it shines?

A:Image Recognition: Machine Learning is widely used for tasks like object
recognition, facial recognition, and character recognition in handwritten
documents. Applications include self-driving cars, medical image analysis, and
security systems.

B:Natural Language Processing (NLP): Machine Learning plays a crucial
role in NLP, enabling applications like sentiment analysis, language translation,
chatbots, and text summarization. It’s used to understand and generate human
language.

C:Recommendation Systems: Machine Learning algorithms power
recommendation engines, as seen in platforms like Netflix, Amazon, and
Spotify. They analyze user behavior and preferences to suggest relevant
products, movies, or music.

D:Predictive Analytics: Machine Learning is used for predictive modeling,
such as financial forecasting, fraud detection, and demand forecasting. It can
help businesses make data-driven decisions and anticipate future trends.
3.What is a labeled training set?
A labeled training set, in the context of machine learning, is a
dataset consisting of input data (features) and their corresponding
output labels. Each data point in the training set is paired with a
label that represents the correct or expected outcome. The purpose
of a labeled training set is to train a machine learning model to
learn the relationships and patterns between the input features and
the labels.
For example, if you were building a machine learning model to classify images
of animals as either “cat” or “dog,” a labeled training set would include images
of cats and dogs, with each image associated with the correct label (“cat” or
“dog”). The model learns from this training set, identifying features in the
images that distinguish between cats and dogs, so it can make accurate
predictions on new, unlabeled data.
Labeled training sets are essential for supervised learning, where the model is
guided by the labeled data during training to make predictions on similar,
unlabeled data in the future.
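To make this concrete, a labeled training set can be sketched in Python as a list of (features, label) pairs; the feature names and values below are invented purely for illustration:

```python
# A toy labeled training set: each example pairs input features
# with the correct output label ("cat" or "dog").
# Feature values (weight_kg, ear_length_cm) are made up for illustration.
training_set = [
    ((4.0, 6.5), "cat"),
    ((30.0, 10.0), "dog"),
    ((3.5, 7.0), "cat"),
    ((25.0, 9.0), "dog"),
]

# A supervised learner is trained on the features and their labels.
features = [x for x, _ in training_set]
labels = [y for _, y in training_set]
print(labels)  # ['cat', 'dog', 'cat', 'dog']
```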
4.What are the two most common supervised tasks?
The two most common supervised machine learning tasks are:
A:Classification: In classification tasks, the goal is to assign input data to
predefined categories or classes. The model learns to map input features to
discrete labels. For example, classifying emails as “spam” or “not spam,”
identifying whether an image contains a “cat” or a “dog,” or diagnosing
diseases based on medical data are all classification problems.
B:Regression: Regression tasks involve predicting a continuous numeric value
or quantity. The model learns to establish a relationship between input features
and a numerical target variable. For instance, predicting house prices based on
features like square footage, number of bedrooms, and location is a regression
problem. Similarly, forecasting stock prices or estimating a person’s age from
their biometric data are regression tasks.
These two supervised learning tasks form the foundation for a wide range of
applications in machine learning, spanning various domains and industries.
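The two tasks can be sketched with scikit-learn on tiny invented datasets (the feature values and the library choice are illustrative, not part of the assignment):

```python
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: map feature vectors to discrete labels.
X_cls = [[0, 1], [1, 0], [0, 2], [2, 0]]   # e.g. counts of two key words
y_cls = ["spam", "ham", "spam", "ham"]
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[0, 3]]))               # classified as 'spam'

# Regression: map square footage to a continuous price.
X_reg = [[1000], [1500], [2000], [2500]]
y_reg = [100.0, 150.0, 200.0, 250.0]       # price in $1000s (invented)
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[1800]])[0])            # about 180.0
```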
5.Can you name four common unsupervised tasks?
A:Clustering: Clustering involves grouping similar data points together based
on their intrinsic characteristics, without any predefined labels. K-Means
clustering and hierarchical clustering are examples of techniques used for this
task. It’s often used for customer segmentation, image segmentation, and
anomaly detection.
B:Dimensionality Reduction: Dimensionality reduction techniques, such as
Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor
Embedding (t-SNE), aim to reduce the number of features or variables in a
dataset while preserving essential information. This is useful for visualization,
feature selection, and simplifying complex data.
C:Association Rule Mining: In this task, algorithms like Apriori and
FP-Growth are used to discover interesting relationships, patterns, and
associations within a dataset. It’s commonly employed in market basket
analysis to identify which items are frequently purchased together, aiding in
product recommendations.
D:Density Estimation: Density estimation involves modeling the probability
distribution of data within a dataset. Techniques like Gaussian Mixture Models
(GMM) and Kernel Density Estimation (KDE) are used to estimate the
underlying probability distribution, which can be helpful for anomaly detection,
data generation, and outlier identification.
Unsupervised learning tasks are valuable for uncovering hidden structures,
patterns, and insights within data when there are no predefined target labels.
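As a small illustration of one of these tasks, here is a dimensionality-reduction sketch with PCA on synthetic 3-D data that actually lies along a single direction (the data is generated here only for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data: three features that are all scalings of one hidden
# factor t, plus a little noise, so one component captures almost everything.
rng = np.random.default_rng(0)
t = rng.normal(size=(100, 1))
X = np.hstack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(100, 3))

pca = PCA(n_components=1)
X_reduced = pca.fit_transform(X)   # 100 x 1 instead of 100 x 3
print(X_reduced.shape)             # (100, 1)
print(pca.explained_variance_ratio_[0])  # close to 1.0
```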
6.What type of Machine Learning algorithm would you use to allow a
robot to walk in various unknown terrains?
To enable a robot to walk in various unknown terrains, you would typically use
a type of Reinforcement Learning (RL) algorithm. Reinforcement Learning is
well-suited for training agents, such as robots, to interact with their
environment and learn through trial and error.
Specifically, you might consider using algorithms like Proximal Policy
Optimization (PPO) or Deep Deterministic Policy Gradients (DDPG) for
robotic locomotion tasks. These algorithms allow the robot to explore different
actions and learn a policy that maximizes a reward signal. In the case of a
walking robot, the reward signal could be based on objectives like maintaining
balance, making forward progress, or avoiding obstacles.
Reinforcement Learning algorithms can adapt to various terrains and learn to
navigate in complex, unknown environments, making them a suitable choice
for tasks like walking or locomotion. Keep in mind that training a walking
robot can be a challenging and time-consuming process, as it involves a lot of
trial-and-error learning in simulation or real-world environments.
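PPO and DDPG are deep RL methods well beyond a few lines, but the trial-and-error idea they share can be sketched with tabular Q-learning on a toy corridor-walking task (the environment, rewards, and hyperparameters here are entirely invented for illustration):

```python
import random

# A 1-D corridor: states 0..4, goal at state 4, actions step left/right.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]
Q = [[0.0, 0.0] for _ in range(N_STATES)]   # Q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.2
random.seed(0)

for _ in range(200):                        # episodes of trial and error
    s = 0
    while s != GOAL:
        # Epsilon-greedy: mostly exploit the best known action.
        a = random.randrange(2) if random.random() < epsilon \
            else max(range(2), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0      # reward only at the goal
        # Q-learning update rule.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The learned policy: which action each state prefers (1 = step right).
policy = [max(range(2), key=lambda i: Q[s][i]) for s in range(N_STATES)]
print(policy[:4])
```

The reward signal plays the same role here as balance or forward progress would for a walking robot: the agent never sees labeled examples, only consequences of its actions.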
7.What type of algorithm would you use to segment your customers into
multiple groups?
One of the most popular clustering algorithms is K-Means, which partitions
the data into clusters based on similarity. Other clustering methods like
Hierarchical Clustering and DBSCAN are also commonly used, depending
on the nature of the data and the specific goals of segmentation.
Customer segmentation can help businesses tailor their marketing strategies,
product offerings, and customer service to better meet the needs and
preferences of different customer groups.
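A minimal K-Means segmentation sketch with scikit-learn, using two invented behavioural features (annual spend and visits per month):

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy customer data: (annual spend in $, visits per month), invented.
customers = np.array([
    [200,  1], [250,  2], [220,  1],    # low-spend, infrequent visitors
    [5000, 8], [5200, 9], [4800, 7],    # high-spend, frequent visitors
], dtype=float)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(km.labels_)   # two groups; label numbering is arbitrary
```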
8.Would you frame the problem of spam detection as a supervised learning
problem or an unsupervised learning problem?
The problem of spam detection is typically framed as a supervised learning
problem.
In spam detection, you have a labeled dataset where each email or message is
already categorized as either “spam” or “not spam” (ham). Supervised learning
algorithms are used to train a model on this labeled data, learning the patterns
and characteristics of both spam and non-spam messages. Once trained, the
model can then make predictions on new, unlabeled messages to determine
whether they are spam or not.
This approach is effective because you have clear, labeled examples to teach
the model what constitutes spam. Unsupervised learning is not commonly used
for spam detection because it doesn’t have the benefit of labeled data to guide
the learning process.
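A supervised spam classifier can be sketched in a few lines with a bag-of-words Naive Bayes model; the tiny corpus below is invented for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Labeled examples: each message already tagged "spam" or "ham".
texts  = ["win free money now", "free prize win",
          "meeting at noon", "project report attached"]
labels = ["spam", "spam", "ham", "ham"]

vec = CountVectorizer()            # bag-of-words features
X = vec.fit_transform(texts)
model = MultinomialNB().fit(X, labels)

# Predict on a new, unlabeled message.
print(model.predict(vec.transform(["win a free prize"])))  # ['spam']
```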
9.What is an online learning system?
An online learning system, in the context of machine learning, is a type of
machine learning system that continuously learns and adapts to new data as it
becomes available. It is also referred to as incremental learning or streaming
machine learning.
Unlike traditional batch learning, where models are trained on fixed datasets,
online learning systems are designed to handle a continuous stream of data,
making predictions or updates in real-time. They are well-suited for
applications where data arrives sequentially, and the model needs to adapt to
changing patterns and make immediate decisions.
Online learning systems are commonly used in various domains, including
recommendation systems, fraud detection, and anomaly detection, where new
data is generated continuously, and the model must stay up-to-date to provide
accurate and timely predictions.
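Scikit-learn supports this style of learning through `partial_fit`; the sketch below simulates a stream of mini-batches with synthetic data (the data-generating rule is invented for illustration):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(random_state=0)
classes = np.array([0, 1])          # must be declared up front

rng = np.random.default_rng(0)
for _ in range(50):                 # 50 mini-batches arriving over time
    X_batch = rng.normal(size=(20, 2))
    y_batch = (X_batch[:, 0] + X_batch[:, 1] > 0).astype(int)
    # Incremental update: no retraining from scratch on all past data.
    model.partial_fit(X_batch, y_batch, classes=classes)

print(model.predict([[2.0, 2.0], [-2.0, -2.0]]))
```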
10.What is a test set, and why would you want to use it?
A test set is a portion of a dataset that is set aside and not used during the
training of a machine learning model. It is used to evaluate the model’s
performance and assess how well it generalizes to new, unseen data.
The primary reasons to use a test set are:
A:Performance Evaluation: The test set allows you to measure how well your
machine learning model is likely to perform on real-world, unseen data. By
making predictions on the test set and comparing them to the known, true
values (labels), you can calculate metrics like accuracy, precision, recall,
F1 score, or mean squared error, depending on the type of problem
(classification or regression).
B:Preventing Overfitting: A test set helps you detect and prevent overfitting. If a
model is too complex and has memorized the training data rather than learning to
generalize, its performance on the test set is likely to be worse than on the training
data, indicating a problem.
C:Keeping Evaluation Honest: Hyperparameters such as learning rates or
regularization strength should be tuned on a separate validation set rather than
the test set. If the test set is reused to choose hyperparameters, it no longer
provides an unbiased estimate of performance on unseen data; it should be
touched only for the final evaluation.
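Holding out a test set is a one-liner with scikit-learn's `train_test_split`; the synthetic dataset below is generated only for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)

# 20% of the data is set aside and never seen during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(model.score(X_test, y_test))   # accuracy on unseen data
```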
11.What is the purpose of a validation set?
The purpose of a validation set in machine learning is to fine-tune and optimize
the model’s hyperparameters and to provide an estimate of how well the model
is likely to perform on new, unseen data.
A validation set is important for several reasons:
A.Hyperparameter Tuning: Machine learning models often have
hyperparameters, which are settings that control the learning process but are not
learned from the data. These include parameters like the learning rate, the
number of hidden layers in a neural network, or the depth of a decision tree.
The validation set is used to evaluate the model’s performance with different
hyperparameter configurations, helping to select the best settings.

B.Preventing Overfitting: By monitoring the model’s performance on the
validation set during training, you can detect signs of overfitting. If the model’s
performance on the validation set starts to degrade while improving on the
training set, it suggests that the model is overfitting the training data.
C.Model Selection: In some cases, you might be comparing multiple models or
algorithms. The validation set is used to compare their performance and select
the best model for your specific problem.
It’s important to note that the validation set is distinct from the test set. The test
set is used for the final evaluation of the model’s performance after all
hyperparameter tuning and model selection have been completed. The
validation set is used during the training process to guide these decisions and
prevent overfitting.
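The hyperparameter-tuning role of the validation set can be sketched as follows, using a synthetic dataset and the regularisation strength `C` of logistic regression as the hyperparameter being chosen (both are illustrative choices):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, random_state=0)

# Split into train / validation / test (60% / 20% / 20%).
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0)

# Pick the hyperparameter using the validation set only.
best_C, best_score = None, -1.0
for C in [0.01, 0.1, 1.0, 10.0]:
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    score = model.score(X_val, y_val)
    if score > best_score:
        best_C, best_score = C, score

# The test set is touched only once, for the final evaluation.
final = LogisticRegression(C=best_C, max_iter=1000).fit(X_train, y_train)
print(best_C, final.score(X_test, y_test))
```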
12.What is the train-dev set, when do you need it, and how do you use it?
The train-dev set, also known as the training development set, is a dataset used
in machine learning for certain scenarios where traditional training, validation,
and test sets might not be sufficient. Its purpose is to help diagnose and
mitigate data mismatch or distributional shift issues.
Here’s when and how you might use a train-dev set:
Data Distribution Mismatch: If you suspect that the distribution of data in
your training set is significantly different from the distribution of data that your
model will encounter in production, a train-dev set can be useful.
Scenario: For example, consider a scenario where your training data contains
images of cats and dogs taken indoors, but in the real world, your model will be
deployed to identify cats and dogs in outdoor environments. The distribution of
data in the training set doesn’t match the real-world distribution.
Usage: You would create a train-dev set by splitting a portion of your training
data and setting it aside as the train-dev set. You would use this set during
model development to diagnose problems related to data distribution mismatch.
If your model performs well on the training set but poorly on the train-dev set,
it suggests issues with distribution mismatch that need to be addressed.
The train-dev set is not always necessary, and its use depends on the specific
challenges you encounter when deploying a machine learning model in
real-world scenarios where the data distribution may differ from your training
data. It’s a tool to help identify and correct distributional shift problems during
model development.
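Mechanically, the train-dev set is just another split carved out of the training data, so it shares the training distribution; the sketch below shows the split and the diagnostic logic (the data here is synthetic and purely illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the training distribution (e.g. indoor images).
rng = np.random.default_rng(0)
X_train_full = rng.normal(size=(1000, 5))
y_train_full = (X_train_full[:, 0] > 0).astype(int)

# Hold out part of the *training* data as the train-dev set.
X_train, X_traindev, y_train, y_traindev = train_test_split(
    X_train_full, y_train_full, test_size=0.2, random_state=0)

# Diagnostic logic (conceptual):
#   high train-dev error                      -> overfitting
#   low train-dev error but high dev/test
#   error on real-world-distribution data    -> data distribution mismatch
print(X_train.shape, X_traindev.shape)   # (800, 5) (200, 5)
```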
