ML MQP1 Solved

Uploaded by

csumant94

Machine Learning

MQP-1

PART-A
I. 4*2=8

1. What is ML? Give an example.

Ans: Machine learning (ML) is a subfield of artificial intelligence (AI).

• ML is a “field of study that gives computers the capability to learn without being
explicitly programmed”.

OR

• Machine learning is a branch of artificial intelligence that enables algorithms to uncover
hidden patterns within datasets, allowing them to make predictions on new, similar data
without explicit programming for each task.

Ex: Speech and image recognition, chatbots, traffic alerts in Google Maps, Google Translate,
etc.

2. What is scikit-learn?

Ans:

• Scikit-learn (sklearn) is one of the most widely used and robust libraries for machine
learning in Python.

• It provides a selection of efficient tools for machine learning and statistical modeling,
including classification, regression, clustering and dimensionality reduction, via a
consistent interface in Python.

• This library, which is largely written in Python, is built upon NumPy and SciPy, and is
commonly used together with pandas and Matplotlib.


Installing scikit-learn on Windows

One of the easiest ways to install scikit-learn is with pip, which also installs NumPy and
SciPy if they are not already present.

Using pip

The following command installs scikit-learn via pip:

pip install scikit-learn

To verify your installation, you can use the following command:

python -m pip show scikit-learn
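
The installation can also be checked from inside Python. The lines below are a minimal
sketch, assuming the pip command above completed without errors:

```python
# Import the package and print its version to confirm that it is importable.
import sklearn

version = sklearn.__version__
print("scikit-learn version:", version)
```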

3. What is labelled data and unlabelled data? Give an example.

Ans: Labelled data:

• Labelled data is data that carries predefined tags such as a name, type, or number, for
example emails tagged "spam" or "not spam".

• Labelled data is used in supervised learning techniques.

• It is harder to obtain and store (labelling can be time-consuming and costly).

• It can be used to produce actionable insights, such as predictions.

Unlabelled data:

• Unlabelled data, on the other hand, has no meaningful tags or labels and usually consists
of natural or human-created samples such as photos, audio recordings, videos, news
articles, tweets, or X-rays that can be obtained easily.

• Unlabelled data is used in unsupervised machine learning.

• It is easier to obtain and store.

• It has fewer direct uses (however, unsupervised learning methods can help uncover new
data clusters for additional categories).
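
The difference can be shown with a small sketch in Python. The email records and their
field names (contains_offer, num_links) are made up purely for illustration:

```python
# Labelled data: every sample (its features) comes with a tag (its label).
labelled_emails = [
    ({"contains_offer": True, "num_links": 7}, "spam"),
    ({"contains_offer": False, "num_links": 1}, "not spam"),
]

# Unlabelled data: the same kind of samples, but without any tag.
unlabelled_emails = [
    {"contains_offer": True, "num_links": 5},
    {"contains_offer": False, "num_links": 0},
]

# Supervised learning needs both parts of each (features, label) pair.
features = [sample for sample, label in labelled_emails]
labels = [label for sample, label in labelled_emails]
print(labels)  # ['spam', 'not spam']
```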

4. What is classification? Give an example.

Ans: Classification is a process of categorizing data or objects into predefined classes or
categories based on their features or attributes.

• Machine learning classification is a type of supervised learning technique where an
algorithm is trained on a labelled dataset to predict the class or category of new, unseen
data.

• For example, a classification model might be trained on a dataset of images labelled as
either dogs or cats and then used to predict the class of new, unseen images of dogs or
cats based on their features such as color, texture, and shape.
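
A minimal sketch of the idea in Python follows. It uses a 1-nearest-neighbour rule, chosen
here only because it is short, and the two features (weight in kg, ear length in cm) and
their values are hypothetical:

```python
import math

# Hypothetical labelled training data: (weight_kg, ear_length_cm) -> class.
training_data = [
    ((4.0, 6.5), "cat"),
    ((3.5, 6.0), "cat"),
    ((20.0, 11.0), "dog"),
    ((30.0, 12.5), "dog"),
]

def predict(sample):
    """Return the label of the training point closest to `sample`."""
    nearest_features, nearest_label = min(
        training_data, key=lambda pair: math.dist(pair[0], sample)
    )
    return nearest_label

print(predict((4.2, 6.2)))    # cat
print(predict((25.0, 12.0)))  # dog
```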

5. What is Clustering?

Ans: The task of grouping data points based on their similarity to each other is called
clustering or cluster analysis.

• This method falls under unsupervised learning, which aims at gaining insights from
unlabelled data points.

• Unlike supervised learning, there is no target variable.

Clustering algorithms:

• Centroid-based clustering (partitioning methods, e.g. K-means)

• Density-based clustering (e.g. DBSCAN)

• Connectivity-based clustering (hierarchical clustering)

• Distribution-based clustering (model-based methods)

6. What is DBSCAN?

Ans: DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise.

• It is an unsupervised clustering algorithm.

• DBSCAN can find clusters of any size and shape in large amounts of data and can work with
datasets containing a significant amount of noise.

• It is based on the criterion of a minimum number of points within a region.

DBSCAN Algorithm:

DBSCAN groups densely packed points into one cluster. It can identify local density among
the data points in large datasets, and it handles outliers effectively by marking them as
noise. An advantage of DBSCAN over the K-means algorithm is that the number of clusters
need not be known beforehand.

The DBSCAN algorithm depends on two parameters: epsilon, the radius of the neighbourhood
around a point, and minPoints, the minimum number of points that must lie within that radius
for the region to count as dense.
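
The two parameters can be tried out with scikit-learn's DBSCAN class, where eps plays the
role of epsilon and min_samples the role of minPoints. This is a minimal sketch on made-up
points, assuming scikit-learn and NumPy are installed:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two small dense groups plus one isolated point (made-up data).
points = np.array([
    [0.0, 0.0], [0.0, 0.1], [0.1, 0.0],
    [5.0, 5.0], [5.0, 5.1], [5.1, 5.0],
    [20.0, 20.0],
])

# eps is the neighbourhood radius; min_samples counts the point itself.
model = DBSCAN(eps=0.5, min_samples=3)
labels = model.fit_predict(points)

# Points labelled -1 are treated as noise (outliers).
print(labels)
```

Note that the number of clusters is never passed in; it follows from the density of the
data.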

PART-B
II. 4*5=20

7. Why use ML?

Ans: Machine learning is important because it allows computers to learn from data and improve
their performance on specific tasks without being explicitly programmed.

• This ability to learn from data and adapt to new situations makes machine learning
particularly useful for tasks that involve large amounts of data, complex decision-making,
and dynamic environments.

Uses:

• Prediction and Forecasting

• Improved Decision Making

• Real-time Data Analysis

• Pattern Recognition

• Automation and Optimization

• Personalization

• Efficient Resource Utilization

• Natural Language Processing (NLP)

• Healthcare and Medicine

8. Write the applications of ML.

Ans:
9. What is Feature Engineering? Explain the key components of Feature Engineering.

Ans: Feature Engineering is the process of creating new features or transforming existing
features to improve the performance of a machine-learning model.

• It involves selecting relevant information from raw data and transforming it into a
format that can be easily understood by a model.

• The goal is to improve model accuracy by providing more meaningful and relevant
information.

Key components of Feature Engineering are:

• Feature Creation

• Feature Transformation

• Feature Extraction

• Feature Selection

• Feature Scaling

• Feature Creation

Feature Creation is the process of generating new features based on domain knowledge
or by observing patterns in the data. It is a form of feature engineering that can
significantly improve the performance of a machine-learning model.

• Feature Transformation

Feature Transformation is the process of transforming the features into a more
suitable representation for the machine learning model. This is done to ensure that
the model can effectively learn from the data.

• Feature Extraction

Feature Extraction is the process of creating new features from existing ones to
provide more relevant information to the machine learning model. This is done by
transforming, combining, or aggregating existing features.

• Feature Selection

Feature Selection is the process of selecting a subset of relevant features from the
dataset to be used in a machine-learning model. It is an important step in the feature
engineering process as it can have a significant impact on the model’s performance.

• Feature Scaling

Feature Scaling is the process of transforming the features so that they have a
similar scale. This is important in machine learning because the scale of the features
can affect the performance of the model.
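
Three of these components can be illustrated with a short Python sketch. The records, the
column names and the choice of body-mass index as the derived feature are assumptions made
only for this example:

```python
# Hypothetical raw records: height in metres, weight in kilograms.
records = [
    {"height_m": 1.60, "weight_kg": 50.0},
    {"height_m": 1.75, "weight_kg": 70.0},
    {"height_m": 1.80, "weight_kg": 90.0},
]

# Feature creation: derive body-mass index from two existing features.
for row in records:
    row["bmi"] = row["weight_kg"] / row["height_m"] ** 2

# Feature scaling: min-max scale the weight column into the range [0, 1].
weights = [row["weight_kg"] for row in records]
low, high = min(weights), max(weights)
for row in records:
    row["weight_scaled"] = (row["weight_kg"] - low) / (high - low)

# Feature selection: keep only the columns the model will use.
selected = [[row["bmi"], row["weight_scaled"]] for row in records]
print(selected)
```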

10. How does the Naive Bayes classifier work?

Ans:
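
A Naive Bayes classifier applies Bayes' theorem under the "naive" assumption that the
features are independent given the class, and picks the class with the largest value of
P(class) multiplied by the product of P(feature | class). The sketch below is one possible
illustration for categorical features with Laplace smoothing; the toy weather data and all
names in it are hypothetical:

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (outlook, windy) -> play?
samples = [
    (("sunny", "no"), "yes"),
    (("sunny", "yes"), "no"),
    (("rainy", "yes"), "no"),
    (("overcast", "no"), "yes"),
    (("rainy", "no"), "yes"),
    (("sunny", "no"), "yes"),
]

class_counts = Counter(label for _, label in samples)
# value_counts[(feature_index, label)][value] = number of occurrences
value_counts = defaultdict(Counter)
# Distinct values seen for each feature, needed for Laplace smoothing.
feature_values = defaultdict(set)
for features, label in samples:
    for i, value in enumerate(features):
        value_counts[(i, label)][value] += 1
        feature_values[i].add(value)

def predict(features):
    """Pick the class with the highest prior times product of likelihoods."""
    scores = {}
    for label, count in class_counts.items():
        score = count / len(samples)  # prior P(class)
        for i, value in enumerate(features):
            # Laplace-smoothed estimate of P(feature_i = value | class)
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score *= numerator / denominator
        scores[label] = score
    return max(scores, key=scores.get)

print(predict(("sunny", "no")))   # yes
print(predict(("rainy", "yes")))  # no
```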
11. How does K-means clustering work? Write the algorithm.

Ans:
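
In outline, K-means (1) starts from k initial centroids, (2) assigns every point to its
nearest centroid, (3) moves each centroid to the mean of the points assigned to it, and
(4) repeats steps 2 and 3 until the assignments stop changing. The following plain-Python
sketch follows these steps; the function name k_means, the sample points and the
hand-picked starting centroids are assumptions for illustration:

```python
import math

def k_means(points, initial_centroids, max_iterations=100):
    """Cluster `points` starting from the given initial centroids."""
    centroids = list(initial_centroids)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return assignments, centroids

points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
# Starting centroids are picked by hand here; they are usually chosen at random.
assignments, centroids = k_means(points, [(1.0, 1.0), (9.0, 8.0)])
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```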
12. Write Python code to demonstrate K-means clustering.

Ans:
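
One possible demonstration uses the KMeans class from scikit-learn. This is a minimal
sketch on made-up 2-D points, assuming scikit-learn and NumPy are installed; the data and
the choice of two clusters are illustrative only:

```python
import numpy as np
from sklearn.cluster import KMeans

# Six made-up points forming two visibly separate groups.
points = np.array([
    [1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
    [8.0, 8.0], [9.0, 8.0], [8.0, 9.0],
])

# n_clusters is the value of k; random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

print("Cluster labels:", labels)
print("Cluster centres:", model.cluster_centers_)

# Assign new, unseen points to the nearest learned centre.
new_labels = model.predict(np.array([[0.0, 0.0], [10.0, 10.0]]))
print("Labels for new points:", new_labels)
```

The numeric label given to each cluster (0 or 1) is arbitrary; only the grouping matters.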
PART-C
III. 4*8=32

13. Explain the types of ML.

Ans:

• Machine learning is a subfield of artificial intelligence (AI).

• ML is a “field of study that gives computers the capability to learn without being
explicitly programmed”.

• Ex: Speech and image recognition, chatbots, traffic alerts in Google Maps, Google
Translate, etc.

There are three types:

i) Supervised Machine Learning: the model is trained on labelled data.

ii) Unsupervised Machine Learning: the model finds patterns in unlabelled data.

iii) Reinforcement Learning: an agent learns by trial and error from rewards and penalties.
14. Explain the essential libraries and tools required for ML projects.

Ans:
15. a. Discuss the sources of real-world data.

b. Explain the process of selecting and training an ML model.

Ans: (a)
(b)
16. Explain how to discover and visualize the data to gain insights in data preparation.

Ans:
17. a. Write Python code to demonstrate a classification task using CART.

b. Write the applications of K-means clustering.

Ans: (a)
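
One way to demonstrate this is with scikit-learn's DecisionTreeClassifier, which the
scikit-learn documentation describes as an optimised version of the CART algorithm. The
sketch below assumes scikit-learn is installed; the Iris dataset, the 70/30 split and the
depth limit are choices made only for this example:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small labelled dataset: flower measurements and species labels.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples to test the trained tree.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Gini impurity is the splitting criterion usually associated with CART.
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# Predict the class of the unseen samples and measure accuracy.
predictions = tree.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print("Test accuracy:", accuracy)
```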
(b)

18. a. How is clustering used in semi-supervised learning?

b. Write a note on i) Mean-shift. ii) Affinity propagation.

Ans: (a)
(b)
