Intro To ML Notes

The document discusses machine learning topics such as principal component analysis, different machine learning tasks, and types of learning. It also covers math concepts for machine learning like eigenvectors and eigenvalues as they relate to finding the direction of maximum variance in data. Standardizing data and finding the principal components are described as ways to maximize the variance in projections of the data.

Intro to ML

Lec-1:
- PCA contd.
- What is ML?
- ML and SDE
- ML Tasks
- Types of learning

WhatsApp #: 9650204730
Email: jitn.Gupta

PCA → short notes + cheat sheet after Lec10 (Math for ML); covered in today's lecture notes as well.
[Figure: scatter of datapoints on Weight vs Height axes, with the projection direction u drawn through the data.]

S1) Standardize the columns → X

S2) Find the direction with maximum variance

Goal → maximize the variance of the projections:

    u* = argmax_u (1/N) Σ_{i=1}^{N} (u^T x_i)²    s.t.  ||u|| = 1

Write every datapoint x_i^T as a row of X, so all N projections become one matrix-vector product:

    Xu = [x_1^T u, x_2^T u, ..., x_N^T u]^T

where X is N×d, u is d×1, and Xu is N×1.

    ||Xu||² = (x_1^T u)² + (x_2^T u)² + ... + (x_N^T u)²

    ||Xu||² = (Xu)^T (Xu) = u^T X^T X u      [identity: (AB)^T = B^T A^T]

    u* = argmax_{||u||=1} (1/N) u^T X^T X u = argmax_{||u||=1} u^T S u

where S = (1/N) X^T X is the covariance matrix (of the standardized data).
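The identity above can be checked numerically; a minimal sketch with assumed random data, confirming that the variance of the projections (1/N)||Xu||² equals the quadratic form u^T S u:

```python
# Check: (1/N) * sum_i (u^T x_i)^2  ==  u^T S u  with  S = (1/N) X^T X
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))               # N=200 datapoints, d=3 (made up)
X = (X - X.mean(axis=0)) / X.std(axis=0)    # S1) standardize the columns

u = np.array([1.0, 2.0, -1.0])
u = u / np.linalg.norm(u)                   # unit-norm direction, ||u|| = 1

S = (X.T @ X) / len(X)                      # covariance of standardized data

proj_var = np.sum((X @ u) ** 2) / len(X)    # variance of the projections
quad_form = u @ S @ u                       # u^T S u

print(np.isclose(proj_var, quad_form))      # the two expressions agree
```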

Fold the constraint in with a Lagrange multiplier λ:

    u* = argmax_u  u^T S u + λ(1 − u^T u)

Take the gradient and equate it to zero:

    ∇_u [u^T S u + λ(1 − u^T u)] = 2Su − 2λu = 0

    ⇒ Su = λu

Here S is d×d, u is d×1, and λ is a constant, so Su = λu says the matrix maps u to a scaled copy of itself.

What happens when a matrix is multiplied with a vector? In general, it rotates and scales the vector.

Eg. for a general S and u, the product Su points in a new direction: the vector is both rotated and scaled.

Eg. if Su = (6, 24)^T = 6 · (1, 4)^T, the vector u = (1, 4) is not rotated at all, only scaled by 6:

    (1, 4) is an eigenvector of the matrix S, and 6 is its eigenvalue.
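The worked example in the notes is partly garbled, so the matrix below is an assumed stand-in (not necessarily the one from class), chosen only so that (1, 4) is an eigenvector with eigenvalue 6:

```python
import numpy as np

# Assumed matrix, picked so that S @ (1, 4) = (6, 24) = 6 * (1, 4)
S = np.array([[2.0, 1.0],
              [4.0, 5.0]])
u = np.array([1.0, 4.0])

print(S @ u)                    # [ 6. 24.]  ->  u is only scaled, not rotated

vals, vecs = np.linalg.eig(S)   # full eigendecomposition: eigenvalues 1 and 6
```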


Eigenvectors and Eigenvalues

    Su = λu   ⇒   u is an eigenvector of S and λ is its eigenvalue.

⇒ The covariance matrix S (d×d) will have d eigenvectors, and each eigenvector has an eigenvalue.
⇒ The direction with maximum variance is the eigenvector with the largest eigenvalue.

- We will find all the eigenvector-eigenvalue pairs.
- We will sort them in the descending order of eigenvalues:
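These two steps can be sketched in NumPy (assumed random data): compute all eigenpairs of the covariance matrix and sort them by decreasing eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))               # made-up data, d = 4
X = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize

S = (X.T @ X) / len(X)                      # d x d covariance matrix

vals, vecs = np.linalg.eigh(S)              # eigh: S is symmetric
order = np.argsort(vals)[::-1]              # indices of descending eigenvalues
vals, vecs = vals[order], vecs[:, order]    # lambda_1 >= ... >= lambda_d

print(vals)   # all d eigenvalues, largest (max-variance direction) first
```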

    u_1, u_2, u_3, ..., u_d    with    λ_1 ≥ λ_2 ≥ λ_3 ≥ ... ≥ λ_d    (decreasing eigenvalue)

Eg. keep d′ = 3 directions → u_1, u_2, u_3: we will take these as our final directions.

Now we just need to find the coordinates of each d-dimensional datapoint along these directions → take the projection of every datapoint along u_1, u_2, u_3.

    x_1 → (x_1^T u_1, x_1^T u_2, ..., x_1^T u_{d′})
    x_2 → (x_2^T u_1, x_2^T u_2, ..., x_2^T u_{d′})
    ...
    x_N → (x_N^T u_1, x_N^T u_2, ..., x_N^T u_{d′})

Eg. d = 4, d′ = 2: each 4-dimensional datapoint x is replaced by the 2-dimensional point (x^T u_1, x^T u_2), computed term by term, e.g. x^T u_1 = 4 + 10 + 2 + 0 = 16.
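The projection step is a single matrix product; a sketch with made-up datapoints and made-up directions (a real PCA would use the top-d′ eigenvectors here):

```python
import numpy as np

X = np.array([[4.0, 5.0, 2.0, 1.0],
              [1.0, 0.0, 3.0, 2.0]])   # two 4-d datapoints (made up)

# assumed top-2 directions u_1, u_2 as columns of U (illustrative values)
U = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [1.0, 1.0],
              [0.0, 0.0]])

Z = X @ U     # N x d' matrix of new coordinates (x_i^T u_1, x_i^T u_2)
print(Z[0])   # first point: (4 + 10 + 2 + 0, 2) = [16.  2.]
```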
Information retained after PCA

Directions u_1, u_2, u_3, ..., u_d with eigenvalues λ_1, λ_2, λ_3, ..., λ_d:

    explained variance (fraction of info retained)
        = (λ_1 + λ_2 + ... + λ_{d′}) / (λ_1 + λ_2 + ... + λ_d)
        = Σ_{j=1}^{d′} λ_j / Σ_{j=1}^{d} λ_j
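A one-line computation; the eigenvalues below are made up and already sorted in descending order:

```python
import numpy as np

lam = np.array([5.0, 3.0, 1.5, 0.5])   # lambda_1 >= ... >= lambda_d (assumed)
d_prime = 2

explained = lam[:d_prime].sum() / lam.sum()
print(explained)   # (5 + 3) / (5 + 3 + 1.5 + 0.5) = 0.8 -> 80% retained
```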
So far: mathematical implementation

- DS libraries (NumPy and Pandas) → data manipulation
- Data viz (Matplotlib and Seaborn) → EDA & preprocessing
- Prob & stats → marginal & conditional prob, prob distributions, hypothesis testing → conditional probabilities in naive Bayes models
- Linear algebra & coordinate geometry → build intuitions & implement them mathematically
- Calculus & optimization → gradients, maxima & minima → Gradient Descent

Visualization (basic revision)

- 1 variable:
    ↳ Numerical → histogram, KDE, boxplot, ...
    ↳ Categorical → count plot, barplot, ...
- 2 variables:
    ↳ Num-Num → scatterplot
    ↳ Cat-Cat → pd.crosstab (contingency table), barplots
    ↳ Num-Cat → boxplot
- 3 variables:
    ↳ Num-Num-Cat → scatter plot with hue, e.g. weight vs height with hue = gender (Male / Female)

Quiz) Students' marks in a class → Numerical → histogram, boxplot
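The plots above are built from simple numerical summaries; a minimal sketch with made-up data, using np.histogram for one numerical variable (the quiz case) and pd.crosstab for Cat-Cat:

```python
import numpy as np
import pandas as pd

marks = np.array([35, 42, 58, 61, 64, 70, 71, 88, 90, 95])   # made-up marks
counts, edges = np.histogram(marks, bins=4)   # the bars of a histogram
print(counts)                                 # students per bin

df = pd.DataFrame({
    "gender":  ["M", "F", "M", "F", "M", "F"],   # made-up Cat-Cat data
    "segment": ["A", "A", "B", "B", "B", "A"],
})
print(pd.crosstab(df["gender"], df["segment"]))  # contingency table
```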

ML vs Classical programming

Email → spam / non-spam (ham)

Classical programming perspective:
* Identify commonly occurring words/phrases in spam emails
* Store them → spam_words
* If a lot of spam_words occur in an email → spam, else → ham
* Rules / logic → final decision

* Rules are rigid → spammers can learn the rules
* A lot of hard coding → every new pattern needs to be added separately

Can we create algorithms which can "mine" the rules and patterns from the data?
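The classical approach above can be sketched directly; all words and the threshold are made up, which is exactly the hard-coding problem the notes point out:

```python
# Hard-coded rules over a hand-maintained spam_words list (all values made up)
SPAM_WORDS = {"lottery", "winner", "free", "prize", "claim"}

def classify(email: str, threshold: int = 2) -> str:
    """Count spam words; 'a lot' of them (>= threshold) -> spam, else ham."""
    words = email.lower().split()
    hits = sum(1 for w in words if w in SPAM_WORDS)
    return "spam" if hits >= threshold else "ham"

print(classify("You are a lottery winner claim your free prize"))  # spam
print(classify("Meeting notes attached for tomorrow"))             # ham
```

Every new spam pattern means editing SPAM_WORDS by hand, which is why the notes ask whether the rules can be mined from data instead.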
ML perspective

    Input → Model (machine learning) → Decision (Spam / Non-Spam)

The machine learns patterns from the data; decisions are made on the basis of the learnt patterns.

Visualize the ML approach:

    Training data: emails E_1, E_2, ..., E_N (N ≈ 1000), each labelled S / NS
        → model learns the patterns from the training data
        → trained model
        → predict whether a new email is spam / non-spam
What is ML?

Task (T): eg. classification of emails as spam or non-spam
Experience (E): training data, eg. labelled emails
Performance (P): metric which quantifies how well your model is working,
    eg. % of emails accurately classified:

        Actual   Predicted
    ①   S        NS
    ②   S        S
    ③   NS       NS
    ④   NS       NS
    ...
    80/100 × 100 = 80%

As you provide more experience (training data), your performance should increase.

TEP model → Task, Experience, Performance

    Task                           Experience           Performance
    ① Predict stock value         Historical prices    Difference in the actual
                                   of the stock         and predicted prices
    ② Categorize images into      Labelled images      % of images accurately
       cat, dog, rabbit                                 labelled as C/D/R
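The performance metric P from the spam example is a few lines of code; the labels below are made up ('S' = spam, 'NS' = non-spam):

```python
# Accuracy = % of emails whose predicted label matches the actual label
actual    = ["S", "S", "NS", "NS", "S"]
predicted = ["NS", "S", "NS", "NS", "S"]   # first email misclassified

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual) * 100
print(accuracy)   # 4 of 5 correct -> 80.0
```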
Types of tasks

* Classification → classifying data points into one of the categories
* Regression → predict a numerical value
* Clustering → group similar samples / data points
* Recommendation → recommend a video, movie, product, etc.
* Time series forecasting → predicting stock values, future sales revenue, ...
Classification task eg. → using labelled data → Supervised algo
- Fraud detection → classifying a transaction as fraud or not-fraud
- Classifying mail as spam / non-spam

Regression task eg. → using labelled data → Supervised algo
- Predicting house prices using # of rooms, area of house, locality, etc.
- Spinny → predicting old car prices using model, age, mileage, odometer, ...

Clustering task eg. → Unsupervised learning → does not require labelled data
- Amazon is grouping its customers based on their profile and purchasing patterns.

  (profile: address, total purchases, amount spent, # of times the customer visited, ...)

- Clustering transactions data: plotting # of times a transaction was rejected against the monetary amount of the transaction.
  [Figure: transactions fall into two separate clusters in this plane.]

- A lot of random images → group all images which contain dog, cat, rabbit, ...
  [Figure: the images separate into a Dog group and a Rabbit group.]
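One concrete clustering algorithm (not named in the notes) is k-means; a tiny NumPy sketch on made-up transaction features (# of rejections, monetary amount), with two well-separated groups:

```python
import numpy as np

rng = np.random.default_rng(2)
# made-up data: "normal" transactions near (1, 10), suspicious near (8, 90)
X = np.vstack([rng.normal([1.0, 10.0], 1.0, size=(50, 2)),
               rng.normal([8.0, 90.0], 1.0, size=(50, 2))])

k = 2
centers = X[[0, -1]].copy()      # init: one point from each end of the data
for _ in range(10):
    # assignment step: each point goes to its nearest center
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    # update step: each center moves to the mean of its assigned points
    centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])

print(np.round(centers, 1))      # close to the two group centers
```

No labels were used anywhere: the groups emerge from the data alone, which is what makes this unsupervised.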
Supervised & Unsupervised Learning

Supervised → they learn using labelled data:

    features f_1, f_2, f_3, ..., f_d  +  target / labels y

Unsupervised → training data has no labels:

    features f_1, f_2, ..., f_d only → clustering, recommendation, ...
Quiz) Customer → "show me all the houses with 3 rooms"
      Seller → group the houses → clustering

Customer surveys → field 1, field 2, field 3, ...
Grouping customers on the basis of their survey responses
→ clustering → unsupervised learning
Quiz) Predicting stock prices → historical data (price, volume, indicators of the stock); labels = the future prices → supervised learning
Quiz) HR classifying job aspirants / candidates:
    notice period < 1 month, salary expectation < 12 LPA → favorable

    candidate 1 → favorable
    candidate 2 → unfavorable
    ...
    → train a model on the labelled candidates → classification (supervised)
Doubts

Movie dataset:

    Name        Genre               Director   Year
    M1  Matrix     Sci-fi             ...        ...
    M2  Avatar     Sci-fi             ...        ...
    M3  Toy Story  Animation/Comedy   ...        ...
    ...
    M10 ...

→ categorical fields (genre, director, actors, year) need encoding before a model can use them.

User data: one row per user (U_1, U_2, ...), one column per movie (M_1, M_2, ..., M_100) holding watch time, plus user features such as nationality → the basis for recommendation models.

ML and SDE → the ML model is one component; SDE work builds the product around it.
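The encoding step mentioned above can be sketched with pd.get_dummies; the tiny movie table is reconstructed from the notes, and only the Genre column is encoded here:

```python
import pandas as pd

movies = pd.DataFrame({
    "Name":  ["Matrix", "Avatar", "Toy Story"],
    "Genre": ["Sci-fi", "Sci-fi", "Animation"],
})

encoded = pd.get_dummies(movies, columns=["Genre"])
print(encoded)
# Name stays; Genre becomes 0/1 columns Genre_Animation and Genre_Sci-fi
```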
