Data Science Full Archive Notes
avichawla.substack.com
Table of Contents
Breathing KMeans: A Better and Faster Alternative to KMeans ........................ 8
How Many Dimensions Should You Reduce Your Data To When Using PCA? .....11
Mito Just Got Supercharged With AI! ..........................................................14
Be Cautious Before Drawing Any Conclusions Using Summary Statistics ..........16
Use Custom Python Objects In A Boolean Context............................................18
A Visual Guide To Sampling Techniques in Machine Learning ...........................20
You Were Probably Given Incomplete Info About A Tuple's Immutability .........24
A Simple Trick That Significantly Improves The Quality of Matplotlib Plots ......26
A Visual and Overly Simplified Guide to PCA ....................................................28
Supercharge Your Jupyter Kernel With ipyflow ................................................31
A Lesser-known Feature of Creating Plots with Plotly ......................................33
The Limitation Of Euclidean Distance Which Many Often Ignore ......................35
Visualising The Impact Of Regularisation Parameter .......................................38
AutoProfiler: Automatically Profile Your DataFrame As You Work....................40
A Little Bit Of Extra Effort Can Hugely Transform Your Storytelling Skills ..........42
A Nasty Hidden Feature of Python That Many Programmers Aren't Aware Of .44
Interactively Visualise A Decision Tree With A Sankey Diagram .......................47
Use Histograms With Caution. They Are Highly Misleading! ............................49
Three Simple Ways To (Instantly) Make Your Scatter Plots Clutter Free ............51
A (Highly) Important Point to Consider Before You Use KMeans Next Time ......54
Why You Should Avoid Appending Rows To A DataFrame ................................57
Matplotlib Has Numerous Hidden Gems. Here's One of Them..........................59
A Counterintuitive Thing About Python Dictionaries ........................................61
Probably The Fastest Way To Execute Your Python Code ..................................64
Are You Sure You Are Using The Correct Pandas Terminologies? ......................66
Is Class Imbalance Always A Big Problem To Deal With? ..................................69
A Simple Trick That Will Make Heatmaps More Elegant ..................................71
A Visual Comparison Between Locality and Density-based Clustering ..............73
Why Don't We Call It Logistic Classification Instead? .......................................74
A Typical Thing About Decision Trees Which Many Often Ignore ......................76
Always Validate Your Output Variable Before Using Linear Regression ............77
A Counterintuitive Fact About Python Functions ..............................................78
Why Is It Important To Shuffle Your Dataset Before Training An ML Model ......79
The Limitations Of Heatmap That Are Slowing Down Your Data Analysis .........80
The Limitation Of Pearson Correlation Which Many Often Ignore ....................81
Why Are We Typically Advised To Set Seeds for Random Generators? ..............82
An Underrated Technique To Improve Your Data Visualizations .......................83
A No-Code Tool to Create Charts and Pivot Tables in Jupyter............................84
If You Are Not Able To Code A Vectorized Approach, Try This. ..........................85
Why Are We Typically Advised To Never Iterate Over A DataFrame?................87
Manipulating Mutable Objects In Python Can Get Confusing At Times ............88
This Small Tweak Can Significantly Boost The Run-time of KMeans .................90
Most Python Programmers Don't Know This About Python OOP .....................92
Who Said Matplotlib Cannot Create Interactive Plots? ....................................94
Don't Create Messy Bar Plots. Instead, Try Bubble Charts! ...............................95
You Can Add a List As a Dictionary's Key (Technically)! .....................................96
Most ML Folks Often Neglect This While Using Linear Regression ....................97
35 Hidden Python Libraries That Are Absolute Gems .......................................98
Use Box Plots With Caution! They May Be Misleading. ....................................99
An Underrated Technique To Create Better Data Plots ...................................100
The Pandas DataFrame Extension Every Data Scientist Has Been Waiting For ..... 101
Supercharge Shell With Python Using Xonsh .................................................102
Most Command-line Users Don't Know This Cool Trick About Using Terminals .....103
A Simple Trick to Make The Most Out of Pivot Tables in Pandas .....................104
Why Python Does Not Offer True OOP Encapsulation.....................................105
Never Worry About Parsing Errors Again While Reading CSV with Pandas .....106
An Interesting and Lesser-Known Way To Create Plots Using Pandas .............107
Most Python Programmers Don't Know This About Python For-loops ............108
How To Enable Function Overloading In Python .............................................109
Generate Helpful Hints As You Write Your Pandas Code .................................110
Speedup NumPy Methods 25x With Bottleneck .............................................111
Visualizing The Data Transformation of a Neural Network ............................112