Machine Learning Notes
STATISTICS
INFERENCE -> a sample is a subset of the population. Inference takes the sample
dataset and uses it to estimate the outcome for the whole population.
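A minimal sketch of the idea above, assuming a synthetic population (the numbers are made up for illustration, not real data): draw a random sample and use its mean as an estimate of the population mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 synthetic values (not real data).
population = [random.gauss(50, 10) for _ in range(10_000)]

# The sample is a random subset of the population.
sample = random.sample(population, 200)

# Use the sample statistic as an estimate of the population value.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.2f}")
print(f"sample estimate: {sample_mean:.2f}")
```

The two values should be close but not identical; a larger sample generally gives a closer estimate.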
Quantitative ->
Discrete is countable; Continuous is measurable, e.g. product price
21/05/2025
A uniform distribution implies that all outcomes within a specific range have an
equal probability of occurring, while a normal distribution (also known as the
Gaussian distribution) describes a probability distribution where most data points
cluster around the mean, tapering off symmetrically toward the extremes.
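The difference can be checked with a quick simulation. This is only a sketch: the range 0 to 10, the mean 5, the standard deviation 1, and the helper name `share_near_mean` are all assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable
n = 10_000

# Uniform: every value between 0 and 10 is equally likely.
uniform_draws = [random.uniform(0, 10) for _ in range(n)]
# Normal: values cluster around the mean (5) with standard deviation 1.
normal_draws = [random.gauss(5, 1) for _ in range(n)]

def share_near_mean(values, center=5.0, width=1.0):
    """Fraction of values that lie within `width` of `center`."""
    return sum(abs(v - center) <= width for v in values) / len(values)

uniform_share = share_near_mean(uniform_draws)
normal_share = share_near_mean(normal_draws)

print(f"uniform, within 1 of the mean: {uniform_share:.2f}")
print(f"normal,  within 1 of the mean: {normal_share:.2f}")
```

With these settings the uniform share should come out near 0.2 and the normal share near 0.68, which is the "clustering around the mean" described above.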
Distribution types:
Discrete: Binomial, Bernoulli
Continuous: Normal, Uniform
Problems we face in ML -> missing values, bias, imbalance, choosing the right
algorithm, getting labelled data
When to use ML vs DL -> if only limited data points are available then ML is
preferred, else DL is used.
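ENCODING -> converts textual data to numerical data, because ML can't process text
directly. The result is like a matrix with one column per value (e.g. one-hot
encoding); for n values consider only n-1 cols in the matrix, since the dropped
value is implied when all the other cols are 0.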
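A small sketch of the n-1 column idea in plain Python. The function name `one_hot_drop_first` and the colour data are hypothetical, chosen only for illustration.

```python
def one_hot_drop_first(values):
    """One-hot encode `values`, dropping the first category as the baseline."""
    categories = sorted(set(values))
    kept = categories[1:]  # n categories -> n - 1 columns
    # Each row has a 1 in the column matching its value, 0 elsewhere.
    rows = [[1 if v == c else 0 for c in kept] for v in values]
    return kept, rows

colors = ["red", "green", "blue", "green"]  # hypothetical example data
columns, encoded = one_hot_drop_first(colors)

print(columns)  # ['green', 'red']
print(encoded)  # [[0, 1], [1, 0], [0, 0], [1, 0]]
```

"blue" is the dropped category, so it shows up as a row of all zeros. In practice a library call does the same job, for example `pandas.get_dummies(..., drop_first=True)` or scikit-learn's `OneHotEncoder(drop="first")`.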
Library: scikit-learn (imported as sklearn).
How outliers are treated -> imputation, removal, transformation, capping, binning
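As one illustration of capping, here is a sketch that clips values to the common 1.5 x IQR fences. The IQR rule, the function name `cap_outliers_iqr`, and the sample data are assumptions for this example; other capping rules (e.g. fixed percentiles) are equally valid.

```python
import statistics

def cap_outliers_iqr(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Values inside the fences are kept; values outside are pulled to the fence.
    return [min(max(v, low), high) for v in values]

data = [10, 12, 11, 13, 12, 11, 95]  # hypothetical data; 95 is the outlier
capped = cap_outliers_iqr(data)
print(capped)
```

Only the extreme value is changed; the remaining points stay as they were.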
Problems when there are many features -> overfitting, time, complexity; performance
will decrease (inaccurate predictions)
When the problem is having too many features -> use dimensionality reduction
techniques (PCA for linear data, t-SNE for non-linear data).
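A rough sketch of PCA using NumPy's SVD, assuming synthetic data in which the third feature nearly duplicates the first. The function name `pca` and the data are hypothetical; this is not the scikit-learn implementation.

```python
import numpy as np

def pca(X, n_components):
    """Project X (samples x features) onto its top principal components."""
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centred data
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

rng = np.random.default_rng(0)
# Hypothetical data: 100 samples, 3 features, third feature ~ first feature.
base = rng.normal(size=(100, 2))
noise = 0.01 * rng.normal(size=100)
X = np.column_stack([base[:, 0], base[:, 1], base[:, 0] + noise])

X_reduced = pca(X, n_components=2)
print(X_reduced.shape)  # (100, 2)
```

In scikit-learn the corresponding classes are `sklearn.decomposition.PCA` and `sklearn.manifold.TSNE`.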
23/05/2025
Gradient Descent -> start with a guess, calculate the error, calculate the
gradient (go through videos), update the values of m and b, repeat to find the min
value of the cost function.
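The steps above can be sketched for a straight line y = m*x + b. This assumes mean squared error as the cost function; the learning rate, step count, and data are illustrative choices, not values from these notes.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = m*x + b by minimising mean squared error."""
    m, b = 0.0, 0.0  # 1. start with a guess
    n = len(xs)
    for _ in range(steps):  # 5. repeat
        # 2. calculate the error of the current line on every point
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # 3. calculate the gradient of the cost with respect to m and b
        grad_m = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # 4. update m and b by stepping against the gradient
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # hypothetical data generated from y = 2x + 1
m, b = gradient_descent(xs, ys)
print(f"m = {m:.3f}, b = {b:.3f}")
```

On this data m should approach 2 and b should approach 1. A learning rate that is too large makes the updates overshoot and diverge.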
Gradient descent has three main types: batch, stochastic, and mini-batch.
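The three types differ only in how many data points are used per update. A sketch under the same assumptions as a simple line fit (mean squared error, illustrative data and hyperparameters; the function name is hypothetical): `batch_size = len(xs)` gives batch, `batch_size = 1` gives stochastic, anything in between is mini-batch.

```python
import random

def minibatch_gradient_descent(xs, ys, batch_size, learning_rate=0.01,
                               epochs=3000, seed=0):
    """Fit y = m*x + b, updating m and b once per batch of points."""
    rng = random.Random(seed)
    m, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the points in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            errors = [(m * xs[i] + b) - ys[i] for i in batch]
            # Gradient of the mean squared error over this batch only.
            grad_m = (2 / len(batch)) * sum(e * xs[i] for e, i in zip(errors, batch))
            grad_b = (2 / len(batch)) * sum(errors)
            m -= learning_rate * grad_m
            b -= learning_rate * grad_b
    return m, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # hypothetical data generated from y = 2x + 1

batch_fit = minibatch_gradient_descent(xs, ys, batch_size=len(xs))  # batch
stochastic_fit = minibatch_gradient_descent(xs, ys, batch_size=1)   # stochastic
mini_fit = minibatch_gradient_descent(xs, ys, batch_size=2)         # mini-batch
print(batch_fit, stochastic_fit, mini_fit)
```

All three should land near m = 2, b = 1 here because the data has no noise; on noisy data the smaller batches give noisier updates but cheaper steps.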