ML MQP1 Solved
MQP-1
PART-A
I. 4*2=8
"OR"
Ex: Speech and image recognition, chatbots, traffic alerts using Google Maps, Google Translate, etc.
2. What is scikit-learn?
Ans:
Scikit-learn (sklearn) is a widely used, robust open-source library for machine
learning in Python.
Using pip
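The usual pip command is `pip install scikit-learn`. As a small sketch, assuming the install succeeded, the import can be checked from Python by printing the installed version:

```python
# Check that scikit-learn is importable and report its version.
import sklearn

version = sklearn.__version__
print("scikit-learn version:", version)
```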
Labeled data is data that has some predefined tags, such as a name, type, or number.
It’s harder to obtain and store (can be time consuming and costly).
Unlabeled data:
Unlabeled data, on the other hand, doesn't have any meaningful tags or labels and usually
consists of natural or human-created samples such as photos, audio recordings, videos,
news articles, tweets, or X-rays that can be easily obtained.
It doesn't have as many uses (however, unsupervised learning methods can help
uncover new data clusters for additional categories).
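The difference can be illustrated with a tiny, made-up example. The field names (`height_cm`, `weight_kg`) and the tags (`cat`, `dog`) are hypothetical and only serve the illustration:

```python
# Labeled data: each sample carries a predefined tag (the label).
labeled = [
    ({"height_cm": 30, "weight_kg": 4}, "cat"),
    ({"height_cm": 60, "weight_kg": 25}, "dog"),
]

# Unlabeled data: the same kind of samples, but with no tag attached.
unlabeled = [
    {"height_cm": 28, "weight_kg": 5},
    {"height_cm": 65, "weight_kg": 30},
]

# Supervised learning uses both parts of the labeled pairs;
# unsupervised learning has only the features to work with.
features = [x for x, _ in labeled]
labels = [y for _, y in labeled]
print(labels)
```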
5. What is Clustering?
Ans: The task of grouping data points based on their similarity with each other is called
Clustering or Cluster Analysis.
This method is defined under the branch of Unsupervised Learning, which aims at
gaining insights from unlabelled data points.
Clustering algorithms:
Distribution-based Clustering
6. What is DBSCAN?
Ans: DBSCAN is the abbreviation for Density-Based Spatial Clustering of Applications with Noise.
DBSCAN can find clusters of arbitrary shape and size in large datasets and
can work with datasets containing a significant amount of noise.
DBSCAN Algorithm:
The DBSCAN algorithm groups densely packed points into one cluster. It identifies
regions of high local density among the data points, even in large datasets, and marks
points in low-density regions as outliers (noise). An advantage of DBSCAN over the K-means
algorithm is that the number of clusters need not be known beforehand.
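A minimal sketch of this behaviour, assuming scikit-learn is installed. The data points and the values of `eps` and `min_samples` are made up for illustration:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups of points plus one far-away outlier (made-up data).
X = np.array([
    [1.0, 1.0], [1.1, 1.0], [0.9, 1.1], [1.0, 0.9],
    [5.0, 5.0], [5.1, 5.0], [4.9, 5.1], [5.0, 4.9],
    [20.0, 20.0],
])

# eps: neighbourhood radius; min_samples: points needed to form a dense region.
model = DBSCAN(eps=0.5, min_samples=3)
labels = model.fit_predict(X)

# scikit-learn marks noise points with the label -1.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("labels:", labels)
print("clusters found:", n_clusters)
```

Note that the number of clusters is never passed in; it falls out of the density parameters.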
PART-B
II. 4*5=20
Ans: Machine learning is important because it allows computers to learn from data and improve
their performance on specific tasks without being explicitly programmed.
This ability to learn from data and adapt to new situations makes machine learning
particularly useful for tasks that involve large amounts of data, complex decision-making,
and dynamic environments.
Uses:
Pattern Recognition
Personalization
Ans:
9. What is Feature Engineering? Explain the key components of Feature Engineering.
Ans: Feature Engineering is the process of creating new features or transforming existing
features to improve the performance of a machine-learning model.
It involves selecting relevant information from raw data and transforming it into a
format that can be easily understood by a model.
The goal is to improve model accuracy by providing more meaningful and relevant
information.
Feature Creation
Feature Transformation
Feature Extraction
Feature Selection
Feature Scaling
Feature Creation
Feature Creation is the process of generating new features based on domain knowledge
or by observing patterns in the data. It is a form of feature engineering that can
significantly improve the performance of a machine-learning model.
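As a small illustration, a new feature can be derived from two existing ones using domain knowledge. The records and field names below are hypothetical:

```python
# Made-up records with two raw features.
records = [
    {"height_m": 1.60, "weight_kg": 64.0},
    {"height_m": 1.80, "weight_kg": 81.0},
]

# Create a new feature (body mass index) from the existing ones.
for r in records:
    r["bmi"] = r["weight_kg"] / r["height_m"] ** 2

print(records)
```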
Feature Transformation
Feature Extraction
Feature Extraction is the process of creating new features from existing ones to
provide more relevant information to the machine learning model. This is done by
transforming, combining, or aggregating existing features.
Feature Selection
Feature Selection is the process of selecting a subset of relevant features from the
dataset to be used in a machine-learning model. It is an important step in the feature
engineering process as it can have a significant impact on the model’s performance.
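One simple selection strategy, shown here only as a sketch, is to drop features that never vary. This assumes scikit-learn is installed and uses its `VarianceThreshold` selector on made-up data:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Made-up data: the middle column is constant and carries no information.
X = np.array([
    [1.0, 7.0, 10.0],
    [2.0, 7.0, 20.0],
    [3.0, 7.0, 15.0],
    [4.0, 7.0, 30.0],
])

# Keep only the features whose variance is above the threshold.
selector = VarianceThreshold(threshold=0.0)
X_selected = selector.fit_transform(X)

kept = selector.get_support()  # boolean mask of the kept columns
print("kept columns:", kept)
print("shape after selection:", X_selected.shape)
```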
Feature Scaling
Feature Scaling is the process of transforming the features so that they have a
similar scale. This is important in machine learning because the scale of the features
can affect the performance of the model.
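Two common scaling methods, min-max normalization and standardization, can be sketched with NumPy as follows. The data (an age column and an income column) is made up:

```python
import numpy as np

# Made-up data: column 0 is an age, column 1 is an income.
X = np.array([
    [20.0, 30000.0],
    [35.0, 60000.0],
    [50.0, 90000.0],
])

# Min-max normalization: rescale each column to the range [0, 1].
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Standardization: give each column zero mean and unit standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_minmax)
print(X_std)
```

After either transformation the two columns are on a comparable scale, so the income column no longer dominates distance-based computations.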
Ans:
11. How does K-means Clustering work? Write an algorithm.
Ans:
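A common formulation of K-means (Lloyd's algorithm) repeats two steps until the centroids stop moving: assign every point to its nearest centroid, then move every centroid to the mean of its assigned points. The sketch below assumes Euclidean distance; the function name `kmeans`, its parameters, and the data are illustrative choices:

```python
import numpy as np

def kmeans(X, k, init_centroids=None, n_iters=100, seed=0):
    """Plain K-means sketch; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Step 1: choose k initial centroids (random data points by default).
    if init_centroids is None:
        centroids = X[rng.choice(len(X), size=k, replace=False)]
    else:
        centroids = np.asarray(init_centroids, dtype=float)

    for _ in range(n_iters):
        # Step 2: assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: move each centroid to the mean of its assigned points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Made-up data with two well-separated groups.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [8.0, 8.0], [8.5, 9.0], [9.0, 8.0]])

# Fixed starting centroids keep this run reproducible.
labels, centroids = kmeans(X, k=2, init_centroids=X[[0, 3]])
print(labels)
print(centroids)
```

K-means is sensitive to the starting centroids, which is why practical implementations usually try several initializations and keep the best result.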
12. Write Python code to demonstrate K-means clustering.
Ans:
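One possible demonstration, assuming scikit-learn is installed, uses its `KMeans` class. The data points and the parameter values are made up for illustration:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up data with two well-separated groups.
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [8.0, 8.0], [8.5, 9.0], [9.0, 8.0]])

# n_clusters: number of clusters to find; n_init: number of restarts.
model = KMeans(n_clusters=2, n_init=10, random_state=42)
labels = model.fit_predict(X)

print("cluster labels:", labels)
print("cluster centres:", model.cluster_centers_)

# Assign previously unseen points to the nearest learned cluster.
new_points = np.array([[0.0, 0.0], [10.0, 10.0]])
pred = model.predict(new_points)
print("predictions:", pred)
```

The numeric label given to each cluster is arbitrary; only the grouping is meaningful.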
PART-C
III. 4*8=32
Ans:
Ex: Speech and image recognition, chatbots, traffic alerts using Google Maps, Google
Translate, etc.
iii) Reinforcement Learning
14. Explain the essential Libraries and Tools required for ML projects?
Ans:
15. a. Discuss the sources of real-world data.
Ans: (a)
(b)
16. Explain how to discover and visualize the data to gain insights in data preparation.
Ans:
17. a. Write Python code to demonstrate a classification task using CART.
Ans: (a)
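One possible demonstration, assuming scikit-learn is installed: its `DecisionTreeClassifier` implements an optimized version of CART. The choice of the Iris dataset, the split ratio, and the tree depth are illustrative assumptions:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Gini impurity is the usual CART splitting criterion for classification.
clf = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=42)
clf.fit(X_train, y_train)

# Evaluate on the held-out samples.
y_pred = clf.predict(X_test)
acc = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {acc:.2f}")
```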
(b)
Ans: (a)
(b)