Machine Learning Algorithm Cheat Sheet
| Algorithm | Advantages | Disadvantages | Use Cases | Key Parameters | Bias-Variance | Complexity (Class) | Outlier Sensitivity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Linear Regression | Simple, interpretable, fast | Assumes linearity, sensitive to outliers | Predictive analytics, trend analysis | Learning rate, L1/L2 regularization | High bias, low variance | O(n) (Low) | High |
| Logistic Regression | Probabilistic output, efficient | Requires feature scaling, linear boundaries | Binary classification | Learning rate, L1/L2 regularization | High bias, low variance | O(n) (Low) | Moderate |
| Decision Trees | Easy interpretation, handles non-linearity | Prone to overfitting, sensitive to noise | Classification, regression, feature importance | Max depth, min samples split | Low bias, high variance | O(n log n) (Medium) | High |
| Random Forest | Reduces overfitting, handles missing values | Computationally expensive, less interpretable | Classification, regression, feature importance | Number of estimators, max depth | Moderate bias, moderate variance | O(n log n) (Medium) | Low |
| Support Vector Machines (SVM) | Effective in high dimensions, robust | Slow training, memory-intensive | Classification, regression, outlier detection | Kernel type, C, gamma | Low bias, high variance | O(n²) to O(n³) (High) | High |
| K-Nearest Neighbors (KNN) | Simple, non-parametric, adaptable | Computationally expensive, sensitive to noise | Classification, regression, recommendation | Number of neighbors (K), distance metric | Low bias, high variance | O(n) per query (Low) | High |
| Naive Bayes | Fast, requires less data, good for text | Assumes feature independence, limited complexity | Text classification, spam filtering, sentiment analysis | Smoothing parameter | High bias, low variance | O(n) (Low) | Low |
| Gradient Boosting (XGBoost, LightGBM, CatBoost) | High accuracy, handles missing data | Computationally expensive, prone to overfitting | Classification, regression, ranking | Learning rate, number of trees, max depth | Low bias, high variance | O(n·m·d) (High) | Moderate |
| Neural Networks (Deep Learning) | Learns complex patterns, handles large data | Requires large data, computationally expensive | Image/speech recognition, NLP | Layers, learning rate, activation | Low bias, high variance | Varies (High) | Moderate to High |
| K-Means Clustering | Simple, scalable | Sensitive to initialization, assumes spherical clusters | Clustering, segmentation, anomaly detection | Number of clusters (K), distance metric | N/A | O(n·k·i) (Medium) | High |
| Hierarchical Clustering | No need to choose K, interpretable dendrogram | Computationally expensive, difficult for large datasets | Taxonomy, grouping, bioinformatics | Linkage criteria (single, complete, average) | N/A | O(n² log n) (High) | Moderate |
| Principal Component Analysis (PCA) | Reduces dimensionality, removes multicollinearity | Loss of interpretability, requires standardization | Dimensionality reduction, feature extraction, noise reduction | Number of components | N/A | O(n·m²) (Medium) | High |
| Reinforcement Learning (Q-Learning, DDPG, PPO) | Adapts to environment, long-term optimization | Requires extensive training, complex tuning | Robotics, game playing, finance | Discount factor, learning rate, exploration rate | Varies | Varies greatly (High) | Varies |
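For a concrete sense of how the "Key Parameters" column surfaces in code, here is a minimal sketch, assuming scikit-learn (the cheat sheet itself names no library); the estimator and argument names below follow scikit-learn's API, and the data is synthetic toy data, not part of the cheat sheet.

```python
# Minimal sketch: mapping the "Key Parameters" column to scikit-learn arguments.
# Assumes scikit-learn is installed; all data below is synthetic.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

models = {
    # L1/L2 regularization is set via penalty and C
    "logistic": LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
    # Number of estimators and max depth, as listed for Random Forest
    "forest": RandomForestClassifier(n_estimators=100, max_depth=5, random_state=0),
    # Kernel type, C, and gamma, as listed for SVM
    "svm": SVC(kernel="rbf", C=1.0, gamma="scale"),
    # Number of neighbors (K) and distance metric, as listed for KNN
    "knn": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
    # Learning rate, number of trees, and max depth, as listed for boosting
    "boosting": GradientBoostingClassifier(learning_rate=0.1, n_estimators=100, max_depth=3),
}

for name, model in models.items():
    print(name, model.fit(X, y).score(X, y))
```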
Evaluation Metrics
| Metric Type | Metric Name | Description and Formula | Use Case |
| --- | --- | --- | --- |
| Regression | Root Mean Squared Error (RMSE) | $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$ | Same units as target, common for regression |
| Regression | R-squared (Coefficient of Determination) | $1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$ | Model fit, variance explained |
| Clustering | Silhouette Score | Range $[-1, 1]$; compares similarity to own cluster vs. other clusters | Cluster quality, validation |
| Clustering | Davies-Bouldin Index | Lower is better; within-cluster scatter vs. between-cluster separation | Cluster quality, validation |
| Dimensionality Reduction (PCA) | Explained Variance Ratio | Fraction of total variance captured along each principal component | Feature reduction, data understanding |
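Likewise, a minimal sketch of computing these metrics, again assuming scikit-learn and NumPy; the arrays and the clustering setup are illustrative only and not from the cheat sheet.

```python
# Minimal sketch: computing the metrics above with scikit-learn.
# Assumes scikit-learn and NumPy are installed; all data is synthetic.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score, silhouette_score, davies_bouldin_score
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Regression metrics: RMSE (same units as target) and R-squared
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 6.9, 9.4])
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
r2 = r2_score(y_true, y_pred)

# Clustering metrics: silhouette (higher is better), Davies-Bouldin (lower is better)
X = np.random.RandomState(0).rand(100, 4)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
sil = silhouette_score(X, labels)
dbi = davies_bouldin_score(X, labels)

# PCA: explained variance ratio per principal component
evr = PCA(n_components=2).fit(X).explained_variance_ratio_

print(f"RMSE={rmse:.3f}  R2={r2:.3f}  silhouette={sil:.3f}  DBI={dbi:.3f}")
print("explained variance ratio:", evr)
```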