Coding Assignment Report
Problem 1
Here is the link to the Colab notebook which contains the code that solves Problem 1.
The explanation of the difference between DTW distance and Euclidean distance is
already given in the notebook, so it is not repeated in full here.
Why was DTW distance used to construct the kernel and perform the diffusion map
embedding instead of Euclidean distance?
➔ The reason lies in the nature of the UCI HAR dataset itself. DTW was selected in
place of Euclidean distance because of three significant properties of human activity data:
• Temporal warping tolerance: Aligns time-shifted activity patterns (e.g., walking
upstairs vs. walking downstairs) by warping the time axis nonlinearly.
• Phase variation robustness: Aligns sequences performed at different speeds via
dynamic-programming alignment.
• Shape-based similarity: Preserves the morphological features of the sensor signals
irrespective of local time offsets.
Visual inspection showed that DTW produces large warping paths between activities
(see the image below), while Euclidean distance produces artificial mismatches due to
its rigid point-to-point alignment.
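Since the question above concerns the kernel construction, here is a minimal sketch (not the notebook's actual code) of how a precomputed DTW distance matrix could be turned into a diffusion map embedding; the Gaussian kernel, the median-based bandwidth heuristic, and all variable names are illustrative assumptions.

import numpy as np

def diffusion_map(D, n_components=2, t=1, eps=None):
    # D: (n, n) matrix of pairwise DTW distances between windows
    if eps is None:
        eps = np.median(D) ** 2               # assumed bandwidth heuristic
    K = np.exp(-D ** 2 / eps)                 # Gaussian kernel on DTW distances
    P = K / K.sum(axis=1, keepdims=True)      # row-normalize -> Markov matrix
    vals, vecs = np.linalg.eig(P)             # P is non-symmetric, so use eig
    order = np.argsort(-vals.real)
    vals, vecs = vals.real[order], vecs.real[:, order]
    # Drop the trivial first eigenpair (constant eigenvector, eigenvalue 1)
    return (vals[1:n_components + 1] ** t) * vecs[:, 1:n_components + 1]

Each 128-reading HAR window contributes one row and column of D, so the returned array gives one embedding point per window.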
In conclusion:
• Euclidean Distance: Measures the straight-line distance between two sequences
of equal length. It is sensitive to shifts and distortions in the time axis.
• DTW Distance: Allows flexible alignment by warping the time axis, making it more
robust for comparing time series with varying speeds or misalignments.
Thus, DTW is more suitable for time-dependent sequences, while Euclidean distance
is a simpler, rigid comparison method. The HAR dataset contains exactly such
time-dependent sequences, since the data were sampled in fixed-width sliding windows
of 2.56 seconds with 50% overlap (128 readings per window).
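To make the contrast concrete, here is a minimal pure-NumPy sketch of the classic dynamic-programming DTW distance, compared against Euclidean distance on a phase-shifted sine wave; this is an illustrative toy, not the notebook's implementation (which could equally use a library such as tslearn or fastdtw).

import numpy as np

def dtw_distance(a, b):
    # Classic O(len(a) * len(b)) dynamic-programming DTW
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

t = np.linspace(0, 2 * np.pi, 128)          # mimics a 128-reading window
x, y = np.sin(t), np.sin(t + 0.5)           # same shape, phase-shifted
print("Euclidean:", np.linalg.norm(x - y))  # penalizes the shift heavily
print("DTW:      ", dtw_distance(x, y))     # warps the time axis, stays small

The phase-shifted pair mirrors the point above: Euclidean distance treats the shift as a large mismatch, while DTW aligns the two waves almost perfectly.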
Problem 2
1. Nelder-Mead Method
Strengths:
Low computational cost: Converged in fewer than 146 iterations when optimizing the
Rosenbrock function (compared to 4,118 for SA).
Simplicity: No gradient calculations are required.
Rapid local convergence: Best suited to smooth, convex problems.
Weaknesses:
Poor global search: Failed to minimize the Rastrigin function (final value = 7.96) and
the Ackley function (final value = 6.56) due to trapping in local minima.
Dimensionality constraints: Struggled with tuning SVM hyperparameters
(accuracy = 0.91 vs. CMA-ES's 0.90 with half the evaluations).
Best Use Case: Low-dimensional, unimodal problems with limited compute budgets.
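As a usage reference, the Nelder-Mead run could look like the following SciPy sketch; the starting point and tolerances are illustrative, not the exact configuration used in the experiments.

from scipy.optimize import minimize, rosen

# rosen is SciPy's built-in Rosenbrock function
res = minimize(rosen, x0=[-1.2, 1.0], method="Nelder-Mead",
               options={"xatol": 1e-8, "fatol": 1e-8})
print(res.x, res.fun, res.nit)  # solution, objective value, iteration count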
2. Simulated Annealing (SA)
Strengths:
Global search: Reached accurate solutions on the multimodal Rastrigin and Ackley
functions where Nelder-Mead was trapped in local minima.
Weaknesses:
High compute cost: Required over 4,000 iterations on the test functions.
Parameter sensitivity: Performance depends heavily on the tuning of the cooling
schedule.
Inefficient convergence: Took 6,009 evaluations for SVM tuning versus Nelder-Mead's
49 evaluations.
Best Use Case: Multimodal problems where the quality of the global optimum justifies
the computational cost.
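A comparable SA-style run can be sketched with SciPy's dual_annealing (a generalized simulated annealing, so not necessarily the exact SA variant benchmarked here); the bounds, seed, and dimensionality below are illustrative.

import numpy as np
from scipy.optimize import dual_annealing

def rastrigin(x):
    # Global minimum 0 at the origin, surrounded by many local minima
    return 10 * len(x) + np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x))

res = dual_annealing(rastrigin, bounds=[(-5.12, 5.12)] * 2, seed=0)
print(res.x, res.fun, res.nfev)  # solution, objective value, evaluation count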
3. CMA-ES
Strengths:
Balanced search: Solved the Rosenbrock and Ackley functions to high precision
(function value ≈ 1e-12) with a moderate number of iterations.
Flexibility: The self-adapting covariance matrix accommodates ill-conditioned
landscapes.
Sample efficiency: Finished SVM tuning in 30 evaluations (about 1/200 of SA's cost).
Weaknesses:
Premature convergence: Produced a poor Rastrigin solution (function value = 0.99)
due to population-size limitations.
Memory overhead: Stores a full covariance matrix (O(n²) memory).
Discrete parameter problems: Struggled with categorical kernel selection for the SVM.
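For completeness, a CMA-ES run on the Rosenbrock function can be sketched with the third-party pycma package (pip install cma); the initial point and step size are illustrative assumptions.

import cma  # third-party pycma package

es = cma.CMAEvolutionStrategy(x0=[0.0, 0.0], sigma0=0.5)
es.optimize(cma.ff.rosen)                # pycma ships test functions in cma.ff
print(es.result.xbest, es.result.fbest)  # best solution and objective value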
Implications:
Rosenbrock-type problems: Use Nelder-Mead for a fast search on smooth landscapes.
Multimodal landscapes (Rastrigin/Ackley): Prefer SA despite its computational expense.
ML hyperparameter tuning: Use CMA-ES for continuous parameters and Nelder-Mead
for mixed spaces (a sketch of the CMA-ES approach follows below).
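To illustrate the hyperparameter-tuning comparison, here is a hedged sketch of tuning an RBF-kernel SVM's continuous C and gamma with CMA-ES in log10 space under a budget of 30 evaluations (matching the count reported above); the dataset, search ranges, and cross-validation setup are illustrative assumptions, not the exact experimental protocol.

import cma
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

def neg_cv_accuracy(log_params):
    # Search in log10 space so C and gamma remain positive
    C, gamma = 10.0 ** log_params[0], 10.0 ** log_params[1]
    return -cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

xbest, es = cma.fmin2(neg_cv_accuracy, x0=[0.0, 0.0], sigma0=1.0,
                      options={"maxfevals": 30, "verbose": -9})
print("best C, gamma:", 10.0 ** xbest[0], 10.0 ** xbest[1])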
In conclusion, the trade-offs are:
Accuracy vs. speed: SA's results are quite accurate but slow to obtain, while
Nelder-Mead was fast but inaccurate on the Rastrigin and Ackley functions.
Generality vs. specialization: Nelder-Mead's simplex degrades in high dimensions
but performs well on low-dimensional functions.
Automation vs. control: CMA-ES makes tuning easier but is harder to interpret.