
Machine Learning Algorithms Explained

KNN (K-Nearest Neighbors)


- Type: Supervised, Non-parametric
- How it works:
- Stores all training data.
- For a new point, it finds the 'K' closest data points.
- Classification: majority vote.
- Regression: average value.
- Pros: Simple; no explicit training phase (the model just stores the data).
- Cons: Prediction is slow on large datasets; sensitive to irrelevant features.
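
A minimal classification sketch using scikit-learn (assumed available) and its bundled Iris dataset; K=5 is an arbitrary illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Fitting" only stores the training data; the real work happens at
# prediction time, when the 5 nearest neighbors vote on each test point.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
print("Test accuracy:", knn.score(X_test, y_test))
```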

Random Forest
- Type: Ensemble of Decision Trees
- How it works:
- Builds many decision trees using random subsets of data/features.
- Classification: majority voting.
- Regression: average result.
- Pros: High accuracy, handles missing values.
- Cons: Slower to train and less interpretable than a single tree.
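
A minimal sketch with scikit-learn on synthetic data (the dataset and the choice of 100 trees are illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree sees a bootstrap sample of rows and a random subset of
# features at every split; class predictions are majority-voted.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
print("Test accuracy:", rf.score(X_test, y_test))
print("Feature importances:", rf.feature_importances_.round(2))
```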

SVR (Support Vector Regression)


- Type: Regression (based on SVM)
- How it works:
- Fits the best function within a margin of width epsilon (errors inside the margin are ignored).
- Points outside the margin are penalized.
- Uses kernels for non-linear data.
- Pros: Works well in high dimensions, kernel flexibility.
- Cons: Hard to tune, slow for large data.
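
A minimal sketch using scikit-learn's SVR on a toy noisy sine curve; the C and epsilon values are illustrative, and in practice features are usually scaled first:

```python
import numpy as np
from sklearn.svm import SVR

# Toy data: a noisy sine curve.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# The RBF kernel handles the non-linearity; errors smaller than
# epsilon are ignored, larger ones are penalized in proportion to C.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.1)
svr.fit(X, y)
print("Training R^2:", round(svr.score(X, y), 3))
```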

Gradient Boosting
- Type: Ensemble (Boosting)
- How it works:
- Sequentially builds trees, each one correcting the errors of the previous ones.
- Focuses more on difficult cases.
- Pros: High accuracy, captures complex patterns.
- Cons: Slow training, prone to overfitting.
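
A minimal regression sketch with scikit-learn; the number of trees, learning rate, and depth are illustrative values, not tuned:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each shallow tree is fit to the residual errors of the ensemble so
# far; the learning rate shrinks each tree's contribution.
gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.05,
                                max_depth=3, random_state=0)
gbr.fit(X_train, y_train)
print("Test R^2:", round(gbr.score(X_test, y_test), 3))
```
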
XGBoost (Extreme Gradient Boosting)
- Type: Optimized Gradient Boosting
- How it works:
- Similar to Gradient Boosting, but faster and regularized.
- Adds tree pruning, regularization, and parallel processing.
- Pros: Fast, accurate, widely used.
- Cons: Complex to tune.
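
A minimal sketch assuming the separate xgboost package is installed (pip install xgboost); the hyperparameter values are illustrative starting points:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# reg_lambda adds L2 regularization on leaf weights; max_depth and
# learning_rate are typically the first knobs to tune.
xgb = XGBClassifier(n_estimators=200, learning_rate=0.1, max_depth=4,
                    reg_lambda=1.0, random_state=0)
xgb.fit(X_train, y_train)
print("Test accuracy:", xgb.score(X_test, y_test))
```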

AdaBoost (Adaptive Boosting)


- Type: Ensemble (Boosting)
- How it works:
- Trains weak learners sequentially.
- Each focuses on previous errors.
- Assigns weights to predictions.
- Pros: Simple, effective on clean data.
- Cons: Sensitive to noise and outliers.
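
A minimal sketch with scikit-learn, whose AdaBoostClassifier uses depth-1 decision trees ("stumps") as the default weak learner:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Misclassified samples are up-weighted so that later stumps focus on
# them; each stump's vote is weighted by how accurate it was.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
ada.fit(X_train, y_train)
print("Test accuracy:", ada.score(X_test, y_test))
```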

Extra Trees (Extremely Randomized Trees)


- Type: Ensemble of Decision Trees
- How it works:
- Similar to Random Forest, but with more randomness.
- Split thresholds are chosen at random instead of being optimized.
- Pros: Fast, reduces overfitting.
- Cons: Might be less accurate than Random Forest.
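
A minimal sketch comparing scikit-learn's ExtraTreesClassifier with a Random Forest on the same synthetic data (default settings, for illustration only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Extra Trees draws split thresholds at random instead of searching for
# the best one, which trains faster and adds extra variance reduction.
for model in (RandomForestClassifier(random_state=0),
              ExtraTreesClassifier(random_state=0)):
    model.fit(X_train, y_train)
    print(type(model).__name__, "accuracy:", model.score(X_test, y_test))
```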

Comparison Table
| Feature | KNN | Random Forest | SVR | Gradient Boosting | XGBoost | AdaBoost | Extra Trees |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Type | Lazy, instance-based | Ensemble (Bagging) | Margin-based regression | Ensemble (Boosting) | Gradient Boosting | Boosting | Ensemble (Bagging) |
| Training Time | Fast | Medium | Slow | Slow | Fast | Medium | Fast |
| Prediction Time | Slow | Medium | Medium | Slow | Fast | Medium | Fast |
| Accuracy | Medium | High | Medium | High | Very High | High | High |
| Overfitting Risk | High | Low | Medium | High (if not tuned) | Low (with tuning) | High (with noise) | Low |
| Handles Non-linearity | Yes (distance-based) | Yes | Yes (with kernels) | Yes | Yes | Yes | Yes |
| Feature Importance | No | Yes | No | Yes | Yes | Yes | Yes |
| Robust to Outliers | No | Yes | No | No | Yes | No | Yes |
