ML Merged
Name: __________________________________________________________
_______________
Student’s Signature
DEPARTMENT OF COMPUTER ENGINEERING
Certificate
Seal of
Institution
INSTRUCTION FOR STUDENTS
Students shall read the points given below for understanding the theoretical concepts and
practical applications.
1) Listen carefully to the teacher's lecture on the importance of the subject, curriculum
philosophy, learning structure, skills to be developed, information about equipment,
instruments, procedure, method of continuous assessment, tentative plan of work in the
laboratory and total amount of work to be done in a semester.
2) Student shall undertake a study visit to the laboratory to see the types of equipment,
instruments and software to be used before performing experiments.
3) Read the write up of each experiment to be performed, a day in advance.
4) Organize the work in the group and make a record of all observations.
5) Understand the purpose of experiment and its practical implications.
6) Write the answers to the questions allotted by the teacher during practical hours if
possible, or immediately afterwards.
7) Student should not hesitate to ask about any difficulty faced while conducting a
practical/exercise.
8) The student shall study all the questions given in the laboratory manual and practice to
write the answers to these questions.
9) Student shall develop maintenance skills as expected by the industries.
10) Student should develop the habit of pocket discussion/group discussion related to the
experiments/exercises so that exchanges of knowledge/skills could take place.
11) Student shall attempt to develop related hands-on-skills and gain confidence.
12) Student shall focus on development of skills rather than theoretical or codified
knowledge.
13) Student shall visit nearby workshops, workstations, industries, laboratories, technical
exhibitions, trade fairs, etc., even if not included in the lab manual. In short, students should
gain exposure to the area of work during their student years.
14) Student shall insist on the completion of recommended laboratory work, industrial
visits, answers to the given questions, etc.
15) Student shall develop the habit of evolving ideas, innovations, skills, etc. beyond those
included in the scope of the manual.
16) Student shall refer to technical magazines, seminar proceedings and websites
related to the scope of the subject and update his/her knowledge and skills.
17) Student should develop the habit of not depending totally on teachers and should develop
self-learning techniques.
18) Student should develop the habit of interacting with the teacher without hesitation with
respect to the academics involved.
19) Student should develop the habit of submitting the practicals and exercises continuously and
progressively on the scheduled dates and should get the assessment done.
20) Student should be well prepared while submitting the write up of the exercise. This will
develop the continuity of the studies and he/she will not be overloaded at the end of the
term.
GUIDELINES FOR TEACHERS
Teachers shall discuss the following points with students before the start of the practicals of the subject.
1) Learning Overview: To develop a better understanding of the importance of the subject and to
know the related skills to be developed, such as intellectual skills and motor skills.
2) Learning Structure: Topics and subtopics are organized in a systematic way so that the
ultimate purpose of learning the subject is achieved. This is arranged in the form of fact,
concept, principle, procedure, application and problem.
3) Know your Laboratory Work: To understand the layout of the laboratory, specifications of
equipment/instruments/materials, procedure, working in groups, planning time, etc.,
and to know the total amount of work to be done in the laboratory.
4) Teacher shall ensure that the required equipment is in working condition before the start of
the experiment and keep the operating instruction manual available.
5) Explain prerequisite concepts to the students before starting each experiment.
6) Involve students actively during the conduct of each experiment.
7) While taking readings/observations, each student shall be given a chance to perform or
observe the experiment.
8) If the experimental set up has variations in the specifications of the equipment, the
teachers are advised to make the necessary changes, wherever needed.
9) Teacher shall assess the performance of students continuously as per the norms prescribed
by the University of Mumbai and the guidelines provided by the IQAC.
10) Teacher should ensure that the respective skills and competencies are developed in the
students after the completion of the practical exercise.
11) Teacher is expected to share the skills and competencies to be developed in the students.
12) Teacher may provide additional knowledge and skills to the students even though not
covered in the manual but are expected from the students by the industries.
13) Teachers shall ensure that industrial visits, if recommended in the manual, are covered.
14) Teacher may suggest that students refer to additional related literature such as technical
papers, reference books, seminar proceedings, etc.
15) During assessment, the teacher is expected to ask the students questions to gauge their
achievements regarding related knowledge and skills so that students can prepare while
submitting the record of the practicals. Focus should be on the development of the enlisted
skills rather than theoretical/codified knowledge.
16) Teacher should enlist the skills to be developed in the students that are expected by the
industry.
17) Teacher should organize group discussions/brainstorming sessions/seminars to
facilitate the exchange of knowledge amongst the students.
18) Teacher should ensure that revised assessment norms are followed simultaneously and
progressively.
19) Teacher should give more focus to hands-on skills and should actually share the same.
20) Teacher shall also refer to the circulars related to the supervision and assessment of
practicals for additional guidelines.
Student’s Progress Assessments
Student Name: __________________________________ Roll No.: ______________________
Class/Semester: BE CS/SEM-VII Academic Year: 2024-2025
Course Name: Machine Learning Laboratory Course Code: CSL701
Assessment Parameters for Practicals/Mini Project/Assignments
Exp. No. | Title of Experiment | PE (out of 3) | KT (out of 3) | DR (out of 3) | DN (out of 3) | PL (out of 3) | Total (out of 15) | Average (out of 3) | Lab Objectives
3 | To implement ensemble learning (bagging and boosting) | | | | | | | | LO2
4 | To implement multivariate Linear Regression | | | | | | | | LO1
Average Marks | | | | | | | | |
Criteria for Grading – Preparedness and Efforts (PE), Knowledge of tools (KT), Debugging and results (DR),
Documentation (DN), Punctuality & Lab Ethics (PL).
Assignments | TS (out of 3) | OM (out of 3) | NT (out of 3) | IS (out of 3) | Total (out of 12) | Average (out of 3) | Covered COs
Assignment No. 1 | | | | | | | CO1-CO3
Assignment No. 2 | | | | | | | CO4-CO6
Average Marks | | | | | | |
Criteria for Grading – Timely submission (TS), Originality of the material (OM), Neatness (NT), Innovative solution (IS)
Grades – Meet Expectations(3 Marks), Moderate Expectations (2 Marks), Below Expectations (1 Mark)
Judge your ability with regard to the following points by putting a (√), on the scale of 1 (lowest) to
5 (highest), based on the knowledge and skills you attained from this course.
Sr. No. | Your ability to | 1 (Lowest) | 2 | 3 | 4 | 5 (Highest)
_______________ _______________
Student’s Signature Date
Programme Outcome (PO & PSOs)
Programme Outcomes are the skills and knowledge that students have at the time of graduation. They indicate
what a student can do with the subject-wise knowledge acquired during the programme.
PO | Short title of the PO | Description of the Programme Outcome as defined by the NBA
PO-1 | Engineering knowledge | Apply the knowledge of mathematics, science, engineering fundamentals, and an engineering specialization to the solution of complex engineering problems.
PO-2 | Problem analysis | Identify, formulate, review research literature, and analyze complex engineering problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and engineering sciences.
PO-3 | Design/development of solutions | Design solutions for complex engineering problems and design system components or processes that meet the specified needs with appropriate consideration for the public health and safety, and the cultural, societal, and environmental considerations.
PO-4 | Conduct investigations of complex problems | Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
PO-5 | Modern tool usage | Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
PO-6 | The engineer and society | Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
PO-7 | Environment and sustainability | Understand the impact of the professional engineering solutions in societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable development.
PO-8 | Ethics | Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
PO-9 | Individual and team work | Function effectively as an individual, and as a member or leader in diverse teams, and in multidisciplinary settings.
PO-10 | Communication | Communicate effectively on complex engineering activities with the engineering community and with society at large, such as, being able to comprehend and write effective reports and design documentation, make effective presentations, and give and receive clear instructions.
PO-11 | Project management and finance | Demonstrate knowledge and understanding of the engineering and management principles and apply these to one’s own work, as a member and leader in a team, to manage projects and in multidisciplinary environments.
PO-12 | Life-long learning | Recognize the need for, and have the preparation and ability to engage in independent and life-long learning in the broadest context of technological change.
Program Specific Outcomes (PSOs) defined by the programme. Baseline-Rational Unified Process(RUP)
PSO-1 | Computing solution to solve real life problem | The graduate must be able to develop, deploy, test and maintain the software or computing hardware solutions to solve real life problems using state of the art technologies, standards, tools and programming paradigms.
PSO-2 | Computer Engineering knowledge and skills | The graduate should be able to adapt Computer Engineering knowledge and skills to create career paths in industries or business organizations or institutes of repute.
CSL701 Machine Learning Lab
Seventh Semester, 2024-2025 (Odd Semester)
Name of Student :
Roll No. :
Division :
Assignment No. :
Outcome :
Task :
Date of Assignment :
Date of Submission :
Particulars | Max. Marks | Marks Obtained
Timely Submission (TS) | 3 |
Originality of the Material (OM) | 3 |
Neatness (NT) | 3 |
Innovative Solution (IS) | 3 |
Total | 12 |
Grades – Meet Expectations (3 Marks), Moderate Expectations (2 Marks), Below Expectations (1 Mark)
Ans: Multiclass classification is a common machine learning task where you need to classify
data points into one of multiple possible classes or categories. There are several popular
algorithms and approaches for multiclass classification, each with its own strengths and
weaknesses. Here are some of the most popular algorithms for multiclass classification:
Logistic Regression: While often used for binary classification, logistic regression can be
extended to multiclass problems through techniques like one-vs-all (OvA) or softmax
regression. It trains multiple binary classifiers or a single classifier with multiple output classes.
Decision Trees: Decision trees, and tree ensembles built from them such as Random Forest
and Gradient Boosting, can be used for multiclass classification. They partition the feature
space into regions and assign a class to each region.
k-Nearest Neighbors (k-NN): k-NN is a simple yet effective algorithm for multiclass
classification. It assigns a data point to the majority class among its k-nearest neighbors in the
feature space.
Naive Bayes: Naive Bayes algorithms, such as Gaussian Naive Bayes or Multinomial Naive
Bayes, are probabilistic classifiers that work well for multiclass problems, especially in text
classification.
Support Vector Machines (SVM): SVMs can handle multiclass problems through one-vs-
one (OvO) or one-vs-all (OvA) strategies. SVMs aim to find a hyperplane that best separates
the classes.
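As an illustration of the k-NN entry in the list above, the following is a minimal sketch of majority voting over three classes in plain Python. The function name `knn_predict`, the toy points and the labels "a", "b", "c" are made up for this example and are not part of the assignment.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Count the labels of the k nearest neighbours and keep the most common one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): three classes in a 2-D feature space.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.1, 4.8), (4.9, 5.2),
          (0.0, 5.0), (0.2, 5.1), (0.1, 4.8)]
labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

prediction_near_b = knn_predict(points, labels, (4.8, 5.1), k=3)
prediction_near_c = knn_predict(points, labels, (0.3, 4.9), k=3)
print(prediction_near_b, prediction_near_c)
```

Because the vote is taken directly over class labels, k-NN needs no one-vs-all or one-vs-one wrapper to handle more than two classes.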
3. What is Graph Based Clustering?
Ans: Graph-based clustering is a technique that organizes data points into groups (or
clusters) by modeling them as nodes in a graph, where edges represent the relationships or
similarities between these points. This method capitalizes on the connectivity of the graph to
identify clusters, making it particularly effective for complex datasets where traditional
clustering algorithms may struggle. For instance, in social network analysis, individuals can
be represented as nodes, and their interactions (like friendships or collaborations) as edges.
By analyzing the structure of this graph, one can identify tightly-knit communities or groups
of users with similar interests.
A common algorithm used in graph-based clustering is Spectral Clustering. This method
constructs a graph Laplacian from the similarity matrix of the data, computes its eigenvalues
and eigenvectors, and then clusters the data points in the space spanned by the eigenvectors
with the smallest eigenvalues. For example, consider a dataset of images where
each image is a node, and edges represent similarity based on visual features. By applying
Spectral Clustering, we can group similar images together, allowing for tasks like
automatic categorization or retrieval based on visual similarity. Because the clusters follow
the connectivity of the graph rather than the distance to a centroid, the method can recover
groups with irregular shapes.
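The Laplacian and eigenvector steps can be sketched on a very small graph. This is a minimal example assuming NumPy is available; the six-node graph and its edge weights are hypothetical, and the unnormalized Laplacian L = D - W is only one of several variants in use. For two clusters, the sign of the eigenvector with the second-smallest eigenvalue (often called the Fiedler vector) is enough to split the nodes.

```python
import numpy as np

# Hypothetical similarity graph: two triangles {0, 1, 2} and {3, 4, 5}
# joined by one weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for i, j, weight in [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
                     (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
                     (2, 3, 0.1)]:
    W[i, j] = W[j, i] = weight

# Degree matrix and unnormalized graph Laplacian.
D = np.diag(W.sum(axis=1))
L = D - W

# eigh handles symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(L)

# The eigenvector of the second-smallest eigenvalue changes sign across the weak edge.
fiedler = eigenvectors[:, 1]
cluster = (fiedler > 0).astype(int)
print(cluster)
```

The overall sign of an eigenvector is arbitrary, so which triangle receives label 0 and which receives label 1 may differ between runs or platforms; only the grouping is meaningful.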
4. Write short note on Epsilon neighborhood graph.
Ans: K-Means and Spectral Clustering are two different approaches to clustering data, each
with its own strengths and weaknesses. Let's explore each of them in more detail:
K-Means Clustering:
Basic Idea: K-Means is a partition-based clustering algorithm that aims to group data points
into K clusters, where K is a predefined number of clusters.
Clustering Process:
Initialization: K initial cluster centroids are randomly or strategically chosen from the data
points.
Assignment: Each data point is assigned to the cluster whose centroid is closest (usually based
on Euclidean distance).
Update: The centroids of the clusters are recalculated as the mean of all data points assigned
to that cluster.
Repeat Assignment and Update: The assignment and update steps are repeated until
convergence (i.e., when the centroids no longer change significantly) or for a specified number
of iterations.
Strengths:
Simplicity and efficiency: K-Means is computationally efficient and easy to implement.
Works well for spherical clusters: It performs well when clusters are roughly spherical, evenly
sized, and have similar densities.
Weaknesses:
Sensitive to initialization: The choice of initial centroids can affect the final clustering result,
leading to suboptimal solutions.
Assumes equal-sized, spherical clusters: K-Means may struggle with non-convex clusters,
uneven cluster sizes, and clusters with varying densities.
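The initialization, assignment, update and repeat steps described above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `k_means`, the optional `initial_centroids` parameter and the toy points are assumptions made for this example.

```python
import math
import random

def k_means(points, k, initial_centroids=None, max_iters=100, seed=0):
    """Minimal K-Means sketch. Returns (centroids, labels)."""
    # Initialization: random data points unless starting centroids are supplied.
    if initial_centroids is None:
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(initial_centroids)

    def nearest(p):
        # Index of the centroid closest to p (Euclidean distance).
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(max_iters):
        # Assignment: label each point with its nearest centroid.
        labels = [nearest(p) for p in points]
        # Update: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        # Convergence: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    labels = [nearest(p) for p in points]
    return centroids, labels

# Toy data (hypothetical): two well-separated groups in 2-D.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
# One starting centroid per group keeps the run reproducible.
centroids, labels = k_means(points, k=2,
                            initial_centroids=[points[0], points[3]])
print(labels)
```

Passing `initial_centroids` removes the dependence on the random start, which is the sensitivity noted under Weaknesses; leaving it out falls back to a seeded random choice of data points.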
Spectral Clustering:
Basic Idea: Spectral Clustering is a graph-based clustering algorithm that leverages spectral
graph theory to find clusters in data.
Clustering Process:
Construct Similarity Graph: A similarity graph (e.g., Epsilon Neighborhood Graph or K-
Nearest Neighbors Graph) is created based on pairwise similarities between data points.
Graph Laplacian: A graph Laplacian matrix is derived from the similarity graph.
Eigenvector Decomposition: The eigenvectors and eigenvalues of the Laplacian matrix are
computed.
Dimension Reduction: A subset of the eigenvectors (usually corresponding to the smallest
eigenvalues) is selected to reduce the dimensionality of the data.
Clustering: Traditional clustering techniques like K-Means are applied in the reduced-
dimensional space.
Strengths:
Handles non-convex clusters: Spectral Clustering is effective at finding clusters with complex
shapes, as it captures the underlying data structure.
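Less sensitive to initialization: The graph construction and eigenvector steps are deterministic.
Only the final K-Means step depends on the initial choice of cluster centers, and it runs in a
space where the clusters are usually easier to separate.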
Can uncover hidden structures: It can discover clusters that may not be apparent in the original
feature space.
Weaknesses:
Parameter tuning: Choosing the number of clusters (K) and graph-related parameters (e.g.,
epsilon or the number of nearest neighbors) can be challenging.
Computationally intensive: Spectral Clustering can be computationally expensive, especially
for large datasets, due to eigenvalue decomposition.
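The similarity-graph step of the Spectral Clustering process above mentions the Epsilon Neighborhood Graph. A minimal sketch in plain Python follows, assuming the common convention that two points are joined by an edge when their Euclidean distance is at most epsilon. The function names, the toy points and the value epsilon = 1.5 are hypothetical.

```python
import math

def epsilon_neighborhood_graph(points, epsilon):
    """Adjacency sets: i and j are connected when dist(i, j) <= epsilon."""
    n = len(points)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= epsilon:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def connected_components(adjacency):
    """Group nodes that are reachable from one another (depth-first search)."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components

# Toy data (hypothetical): three nearby points and a distant pair.
points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
graph = epsilon_neighborhood_graph(points, epsilon=1.5)
components = connected_components(graph)
print(graph)
print(components)
```

A small epsilon can leave the graph fragmented into many components, while a large one connects everything, which is why the choice of epsilon is listed among the tuning difficulties above.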
Ans: Dimension reduction is a crucial step in machine learning for several reasons:
Curse of Dimensionality: As the number of features (dimensions) in a dataset increases, the
amount of data required to adequately cover that space grows exponentially. This phenomenon
is known as the "curse of dimensionality." With high-dimensional data, the dataset can become
sparse, making it challenging to find meaningful patterns and relationships. Dimension
reduction helps mitigate this problem by reducing the number of features while retaining
important information.
Computational Efficiency: High-dimensional data requires more computational resources for
training machine learning models, making the process slow and resource-intensive. Dimension
reduction can significantly speed up training and prediction times by reducing the feature
space's dimensionality.
Overfitting Reduction: High-dimensional datasets are more prone to overfitting, where a
model fits the noise in the data rather than the underlying patterns. Reducing the dimensionality
can help reduce overfitting and improve a model's generalization to unseen data.
Visualization: Visualizing data in high dimensions is challenging. Humans are limited in their
ability to comprehend and visualize data beyond three dimensions. Dimension reduction
techniques, such as Principal Component Analysis (PCA) or t-SNE, project data into lower-
dimensional spaces that can be visualized more easily, helping analysts and data scientists gain
insights.
Feature Engineering: Dimension reduction can assist in feature engineering by identifying
which features contribute the most to explaining the data's variance or target variable. This
knowledge can guide feature selection and the creation of more informative features.
Improved Model Performance: Removing irrelevant or redundant features through
dimension reduction can lead to a simpler and more interpretable model, improving model
performance and reducing the risk of overfitting.
Noise Reduction: High-dimensional data often contains noise or irrelevant information.
Dimension reduction methods aim to preserve the most informative features while discarding
less useful ones, effectively reducing the impact of noise.
Interpretability: Simplifying the dataset through dimension reduction can make it easier to
interpret and understand the relationships between variables. This is especially important in
fields like healthcare and finance, where interpretability is crucial.
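As one concrete instance of dimension reduction, Principal Component Analysis (mentioned above under Visualization) can be sketched with NumPy. This is a minimal example assuming NumPy is available; the function name `pca` and the five 2-D points, which lie close to the line y = 2x, are made up for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components using the SVD."""
    # Center each feature so the components describe variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance explained by each direction, as a fraction of the total.
    variances = singular_values ** 2 / (X.shape[0] - 1)
    explained_ratio = variances / variances.sum()
    projected = X_centered @ components.T
    return projected, components, explained_ratio[:n_components]

# Toy data (hypothetical): two strongly correlated features.
X = np.array([[1.0, 2.1],
              [2.0, 3.9],
              [3.0, 6.2],
              [4.0, 7.8],
              [5.0, 10.1]])

projected, components, explained_ratio = pca(X, n_components=1)
print(projected.shape, explained_ratio)
```

Here one component retains almost all of the variance, so the second dimension can be dropped with little loss of information. The sign of each principal direction is arbitrary.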
CSL702 Machine Learning Lab
Seventh Semester, 2024-25 (Odd Semester)
Name of Student :
Roll No. :
Batch :
Date of Implementation:
Date of Submission :