Machine Learning

This document provides an overview of the topics covered in an ML course (2015 fall), week by week. Week 1 introduces probability and random events. Week 2 covers discrete and continuous random variables and related formulas. Week 3 presents the Gaussian distribution and elements of information theory. Weeks 4-5 cover decision trees, the ID3 algorithm, and its extensions. Weeks 6-7 cover Bayesian classifiers such as Naive Bayes. Week 8 is the midterm exam. Week 9 covers instance-based learning and the k-NN algorithm. Weeks 10-14 cover clustering: hierarchical clustering, k-means, MAP/MLE parameter estimation, and the EM algorithm for Gaussian mixtures. Weeks 15-16 are the final exam. For each week the document lists theoretical concepts, formulas, and exercises from Ciortuz et al.'s exercise book, and it ends with an errata list for a preliminary draft of that book.

ML course, 2015 fall

What you should know:


Week 1: Random events
Concepts/definitions:
event/sample space, random event
probability function
conditional probabilities
independent random events (2 forms);
conditionally independent random events
(2 forms)

Theoretical results/formulas:
elementary probability formula: P(A) = (# favorable cases) / (# all possible cases)
the multiplication rule; the chain rule
total probability formula (2 forms)
Bayes formula (2 forms)

Exercises illustrating the above concepts/definitions and theoretical results/formulas,


in particular: proofs for certain properties derived from the definition of the probability function
for instance: P(∅) = 0; P(Ā) = 1 − P(A); A ⊆ B ⇒ P(A) ≤ P(B)
Bonferroni's inequality: P(A ∩ B) ≥ P(A) + P(B) − 1
Ciortuz et al.'s exercise book: ch. 1, ex. 1-5 [6-7], 42-45 [46-48]
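As a quick numeric illustration of the total probability and Bayes formulas listed above (a minimal sketch, not taken from the exercise book; the prior and likelihood values are invented), the posterior P(D | +) of a condition D given a positive test result can be computed as follows:

    # Bayes formula: P(D | +) = P(+ | D) P(D) / P(+),
    # with P(+) obtained via the total probability formula.
    p_d = 0.01                 # prior P(D)               (illustrative value)
    p_pos_given_d = 0.95       # likelihood P(+ | D)      (illustrative value)
    p_pos_given_not_d = 0.05   # false-positive rate P(+ | not D)

    p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)   # total probability
    p_d_given_pos = p_pos_given_d * p_d / p_pos                   # Bayes formula
    print(f"P(+) = {p_pos:.4f}, P(D | +) = {p_d_given_pos:.4f}")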

Week 2: Random variables


Concepts/definitions:
random variables;
random variables obtained through function composition
cumulative distribution function
discrete random variables; probability mass function (pmf)
examples: Bernoulli, binomial, geometric, Poisson distributions
continuous random variables; probability density function (pdf)
examples: Gaussian, exponential, Gamma distributions
expectation (mean), variance, standard deviation; covariance. (See the definitions!)
multi-valued random variables;
joint, marginal, conditional distributions
independence of random variables;
conditional independence of random variables
Advanced issues:
vector of random variables;
covariance matrix for a vector of random variables;
positive [semi-]definite matrices, negative [semi-]definite matrices

Theoretical results/formulas:
for any discrete variable X: Σ_x p(x) = 1, where p is the pmf of X
for any continuous variable X: ∫ p(x) dx = 1, where p is the pdf of X
E[X + Y] = E[X] + E[Y]
E[aX + b] = aE[X] + b
Var[aX] = a² Var[X]
Var[X] = E[X²] − (E[X])²
Cov(X, Y) = E[XY] − E[X]E[Y]
X, Y independent variables ⇒ Var[X + Y] = Var[X] + Var[Y]
X, Y independent variables ⇒ Cov(X, Y) = 0, i.e. E[XY] = E[X]E[Y]
Advanced issues:
for any vector of random variables, the covariance matrix is symmetric and positive semi-definite.

Exercises illustrating the above concepts/definitions and theoretical results/formulas, concentrating especially on:
identifying in a given problem's text the underlying probabilistic distribution: either a basic
one (e.g., Bernoulli, binomial, categorical, multinomial, etc.), or one derived [by function
composition or] by summation of identically distributed random variables (see for instance
pr. 14, 50, 51);
computing probabilities (see for instance pr. 50, 51a,c);
computing means / expected values of random variables (see for instance pr. 2, 14, 51b,d);
verifying the [conditional] independence of two or more random variables (see for instance
pr. 12, 17, 51, 52).
Ciortuz et al.'s exercise book: ch. 1, ex. 8-17 [18-22], 49-54 [55-57]
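As a small illustration of the pmf-based formulas above (a sketch in Python, not one of the referenced exercises), the snippet below builds the Binomial(n, p) pmf, checks that it sums to 1, and verifies Var[X] = E[X²] − (E[X])² against the known values np and np(1 − p):

    from math import comb

    # Binomial(n, p) pmf: p(k) = C(n, k) * p^k * (1 - p)^(n - k)
    n, p = 4, 0.3
    pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

    total = sum(pmf.values())                        # should be 1
    mean = sum(k * pk for k, pk in pmf.items())      # E[X] = n*p = 1.2
    e_x2 = sum(k**2 * pk for k, pk in pmf.items())   # E[X^2]
    var = e_x2 - mean**2                             # Var[X] = n*p*(1-p) = 0.84
    print(total, mean, var)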

Week 3:
PART I: The Gaussian distribution (only at the lecture):
the uni-variate case:
pdf, mean and variance, and cdf;
the variable transformation that takes the non-standard (general) case to the standard case;
the Central Limit Theorem (the i.i.d. case)
[advanced issues: the multi-variate case: pdf, the case of a diagonal covariance matrix]
Ciortuz et al.'s exercise book: ch. 1, ex. 23, 24, 27; [25, 26]
(During the lecture, we briefly presented problems 23, 24 and 27.)
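A small simulation sketch (not part of the course materials; sample sizes and parameters are arbitrary) of the standardizing transformation Z = (X − μ)/σ and of the CLT in the i.i.d. case:

    import random, statistics

    random.seed(0)
    mu, sigma = 5.0, 2.0

    # Standardization: if X ~ N(mu, sigma^2), then Z = (X - mu)/sigma ~ N(0, 1).
    xs = [random.gauss(mu, sigma) for _ in range(100_000)]
    zs = [(x - mu) / sigma for x in xs]
    print(round(statistics.mean(zs), 3), round(statistics.stdev(zs), 3))   # ~0 and ~1

    # CLT (i.i.d. case): standardized sums of uniform variables look Gaussian.
    n = 50
    unif_mean, unif_var = 0.5, 1 / 12     # mean and variance of Uniform(0, 1)
    sums = [sum(random.random() for _ in range(n)) for _ in range(10_000)]
    z_sums = [(s - n * unif_mean) / (n * unif_var) ** 0.5 for s in sums]
    # The fraction within one standard deviation should approach ~0.683.
    print(sum(abs(z) <= 1 for z in z_sums) / len(z_sums))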
PART II: Elements of Information Theory
Concepts/definitions:
entropy;
specific conditional entropy;
average conditional entropy;
joint entropy;
information gain (mutual information)
Advanced issues:
relative entropy;
cross-entropy

Theoretical results/formulas:
0 ≤ H(X) ≤ H(1/n, 1/n, . . . , 1/n)   (with n arguments equal to 1/n)
IG(X; Y) ≥ 0
H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)   (generalisation: the chain rule)
IG(X; Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
H(X, Y) = H(X) + H(Y) iff X and Y are independent
IG(X; Y) = 0 iff X and Y are independent

Exercises illustrating the above concepts/definitions and theoretical results/formulas, concentrating especially on:
computing different types of entropies (see ex. 34 and 36);
proofs of some basic properties (see ex. 33a, 62), including the functional analysis of the
entropy of the Bernoulli distribution, as a basis for drawing its plot;
finding the best IG(Xi; Y), given a dataset with X1, . . . , Xn as input variables and Y as the
output variable (see CMU, 2013 fall, E. Xing, W. Cohen, Sample Questions, pr. 4).
Ciortuz et al.'s exercise book: ch. 1, ex. 32ab-34, 36, 61-65 [32c-g, 35, 37-40, 66]
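A minimal sketch of the entropy and information-gain formulas above, for a binary target Y and one categorical input attribute X (the helper names entropy/information_gain and the toy counts are invented for illustration, not taken from the exercise book):

    from math import log2
    from collections import Counter

    def entropy(labels):
        # H(Y) = -sum_y p(y) log2 p(y), estimated from label counts
        n = len(labels)
        return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

    def information_gain(xs, ys):
        # IG(Y; X) = H(Y) - H(Y | X), where H(Y | X) is the average conditional entropy
        n = len(ys)
        h_cond = 0.0
        for v in set(xs):
            subset = [y for x, y in zip(xs, ys) if x == v]
            h_cond += (len(subset) / n) * entropy(subset)
        return entropy(ys) - h_cond

    # Toy data: X = weather, Y = play.
    xs = ["sunny", "sunny", "rainy", "rainy", "rainy", "sunny"]
    ys = ["yes",   "yes",   "no",    "no",    "yes",   "yes"]
    print(round(entropy(ys), 3), round(information_gain(xs, ys), 3))   # 0.918, 0.459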

Weeks 4 and 5: Decision Trees


Important Note:
The Overview (Rom.: Privire de ansamblu) of Chapter 2 in Ciortuz et al.'s exercise
book is in fact a road map for what we will be doing here. (This note applies to
all the chapters.)
Week 4: application of the ID3 algorithm; properties of ID3 and decision trees
Ciortuz et al.'s exercise book, ch. 2, ex. 1-7, 20-27
Week 5: extensions to the ID3 algorithm
handling of continuous attributes: Ciortuz et al.'s ex. book, ch. 2, ex. 8, 9 [10], 28-29 [30-32]
overfitting: Ciortuz et al.'s ex. book, ch. 2, ex. 8, 19bc, 28, 36b
pruning strategies for decision trees: Ciortuz et al.'s ex. book, ch. 2, ex. 17, 18, 33, 34
using other impurity measures as [local] optimality criterion in ID3: Ciortuz et al.'s ex. book, ch. 2, ex. 13
other issues: Ciortuz et al.'s ex. book, ch. 2, ex. 11, 12, 14-16, 19ad, 35, 36a
Implementation exercises:
0. CMU, 2012 fall, Ziv Bar-Joseph, HW1, pr. 2
given a Matlab/Octave implementation for ID3, work on synthetic, noisy data, and study
the relationship between model complexity, training set size, train and test accuracy;
1. CMU, 2012 spring, Roni Rosenfeld, HW3
complete a given (incomplete) C implementation of ID3; work on a simple example
(Play Tennis, from T. Mitchell's ML book) and on a real dataset (Agaricus-Lepiota Mushrooms);
perform reduced-error (top-down vs bottom-up) pruning to reduce the overfitting;
CMU, 2011 spring, T. Mitchell, A. Singh, HW1, pr. 3
similar to the above one, except that pruning a node is conditioned on obtaining at least
some increase in accuracy;
CMU, 2011 spring, Roni Rosenfeld, HW3
similar to the previous two HWs, except that instead of mushrooms it uses a chess dataset;
2. CMU, 2009 spring, Tom Mitchell, HW1
do an ID3 implementation, including reduced-error pruning and rule post-pruning; work
on a real dataset: predicting the votes in the US House of Representatives;
3. CMU, 2011 fall, T. Mitchell, A. Singh, HW1, pr. 2
working with continuous attributes, complete a given Matlab/Octave implementation of
ID3, perform reduced-error pruning; implement another splitting criterion: the weighted
misclassification rate; work on a real dataset: Breast Cancer;
4. CMU, 2008 spring, T. Mitchell, HW1, pr. 2 asked for an ID3 implementation,
including reduced-error pruning; work on a real dataset: German Credit Approval
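As a rough sketch of the core of ID3 (information-gain-based attribute selection plus recursive splitting, categorical attributes only; an illustration, not the exercise-book pseudo-code; the function names and the tiny Play-Tennis-style rows are invented):

    from math import log2
    from collections import Counter

    def entropy(rows, target):
        n = len(rows)
        return -sum((c / n) * log2(c / n)
                    for c in Counter(r[target] for r in rows).values())

    def best_attribute(rows, attributes, target):
        # pick the attribute with the highest information gain IG(target; A)
        def ig(a):
            n, rem = len(rows), 0.0
            for v in {r[a] for r in rows}:
                subset = [r for r in rows if r[a] == v]
                rem += len(subset) / n * entropy(subset, target)
            return entropy(rows, target) - rem
        return max(attributes, key=ig)

    def id3(rows, attributes, target):
        labels = {r[target] for r in rows}
        if len(labels) == 1:                          # pure node -> leaf
            return labels.pop()
        if not attributes:                            # no attributes left -> majority leaf
            return Counter(r[target] for r in rows).most_common(1)[0][0]
        a = best_attribute(rows, attributes, target)
        rest = [x for x in attributes if x != a]
        return {a: {v: id3([r for r in rows if r[a] == v], rest, target)
                    for v in {r[a] for r in rows}}}

    data = [
        {"Outlook": "sunny",    "Wind": "weak",   "Play": "no"},
        {"Outlook": "sunny",    "Wind": "strong", "Play": "no"},
        {"Outlook": "rain",     "Wind": "weak",   "Play": "yes"},
        {"Outlook": "rain",     "Wind": "strong", "Play": "no"},
        {"Outlook": "overcast", "Wind": "weak",   "Play": "yes"},
    ]
    print(id3(data, ["Outlook", "Wind"], "Play"))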

Weeks 6 and 7: Bayesian Classifiers


Week 6: application of the Naive Bayes and Joint Bayes algorithms:
Ciortuz et al.'s exercise book, ch. 3, ex. 1-4, 11-14
Week 7:
computation of the [training] error rate of Naive Bayes:
Ciortuz et al.'s exercise book, ch. 3, ex. 5-7, 15-17;
some properties of the Naive Bayes and Joint Bayes algorithms:
Ciortuz et al.'s exercise book, ch. 3, ex. 8, 20;
comparisons with other classifiers: Ciortuz et al.'s exercise book, ch. 3, ex. 18-19;
Advanced issues:
classes of hypotheses: MAP hypotheses vs. ML (maximum likelihood) hypotheses:
Ciortuz et al.'s exercise book, ch. 3, ex. 9-10.
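A small sketch of Naive Bayes for categorical attributes (MAP prediction from class priors and per-attribute conditional probabilities); this is only an illustration with invented helper names (train_nb, predict_nb) and toy counts, and it omits Laplace smoothing:

    from collections import Counter, defaultdict

    def train_nb(rows, attributes, target):
        # estimate P(y) and P(X_a = v | y) by relative frequencies
        n = len(rows)
        prior = {y: c / n for y, c in Counter(r[target] for r in rows).items()}
        cond = defaultdict(dict)          # cond[(a, v)][y] = P(X_a = v | Y = y)
        for y in prior:
            rows_y = [r for r in rows if r[target] == y]
            for a in attributes:
                for v, c in Counter(r[a] for r in rows_y).items():
                    cond[(a, v)][y] = c / len(rows_y)
        return prior, cond

    def predict_nb(x, prior, cond):
        # return argmax_y P(y) * prod_a P(x_a | y)   (the Naive Bayes decision rule)
        def score(y):
            s = prior[y]
            for a, v in x.items():
                s *= cond.get((a, v), {}).get(y, 0.0)
            return s
        return max(prior, key=score)

    data = [
        {"Outlook": "sunny",    "Wind": "weak",   "Play": "no"},
        {"Outlook": "sunny",    "Wind": "strong", "Play": "no"},
        {"Outlook": "rain",     "Wind": "weak",   "Play": "yes"},
        {"Outlook": "overcast", "Wind": "weak",   "Play": "yes"},
    ]
    prior, cond = train_nb(data, ["Outlook", "Wind"], "Play")
    print(predict_nb({"Outlook": "rain", "Wind": "weak"}, prior, cond))   # -> "yes"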
Week 8: midterm EXAM
Week 9: Instance-Based Learning
application of the k-NN algorithm:
Ciortuz et al.'s exercise book, ch. 4, pr. 1-6; 14-20, 22, [15-21, 23, in the 2016 version]
comparisons with the ID3 algorithm:
Ciortuz et al.'s exercise book, ch. 4, pr. 7, 12a, 13.
Shepard's algorithm: Ciortuz et al.'s exercise book, ch. 4, pr. 8.
Advanced issues:
some theoretical properties of the k-NN algorithm: Ciortuz et al.'s exercise book, ch. 4,
pr. 10, 11 [and 14, in the 2016 version]; 21 [22, in the 2016 version].
Implementation exercises:
0. Stanford, 2013 fall, CS106L (Standard C++ Programming Lab) course, Cristian Cibils
Bernardes, HW3:
implementing kd-trees, in C++; use them in conjunction with k-NN;
1. CMU, ? spring, 10-711 (ML) course, HW1, pr. 5
implement k-NN in Matlab; study the evolution of the test error with k, on a dataset
from R2 ;
advanced issues:
compare the performances of k-NN (where k is selected according to the previous task)
and Gaussian Naive Bayes (whose pseudo-code is given in the problem) on the given
dataset;
2. MPI Informatik, Saarbrücken (Germany), 2005 spring, Jörg Rahnenführer and Adrian Alexa, HW4, pr. 11
use k-NN from R; do training on the Breast Cancer dataset (full and reduced version),
using k-NN (with k selected via cross-validation) preceded by the application of a (given)
statistical feature selection filter; make predictions for 3 test probes;

3. CMU, 2004 fall, Carlos Guestrin (?), HW4, pr. 3.2-8


implement k-NN in Matlab; apply it to handwritten-character recognition (on the given
dataset); select k based on n-fold cross-validation and/or CVLOO;
4. CMU, 2005 spring, Carlos Guestrin (?), HW3, pr. 3.1-4[-7]
implement k-NN in Matlab; apply it for text classification (the dataset is provided), using
the cosine distance (its implementation in Matlab is provided); implement n-fold CV; for
the cosine distance, see http://en.wikipedia.org/wiki/Cosine similarity;
advanced issues:
make comparisons between k-NN and SVM, using the libSVM implementation (Matlab
scripts for train and test with SVM are provided);
5. CMU, 2010 spring, Eric Xing, Tom Mitchell, Aarti Singh, HW2, pr. 2.1
implement k-NN in Matlab, using the Euclidean distance; use it for face recognition on
the provided ORL database (perform 10-fold cross-validation);
advanced issues:
CMU, 2011 fall, Eric Xing, HW5, pr. 1.1-2
use 1-NN from Matlab, in conjunction with PCA (Principal Component Analysis) on the
same ORL database of faces.
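A bare-bones k-NN classifier in the spirit of the exercises above (a sketch, not any of the referenced homework solutions; knn_predict and the toy 2-D points are invented), using Euclidean distance and majority vote:

    from collections import Counter

    def knn_predict(train, query, k):
        # train: list of (point, label); returns the majority label among the
        # k nearest neighbours of `query` under the Euclidean distance
        dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        neighbours = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
        return Counter(label for _, label in neighbours).most_common(1)[0][0]

    train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
             ((5, 5), "B"), ((6, 5), "B"), ((5, 6), "B")]
    print(knn_predict(train, (1, 1), k=3))   # -> "A"
    print(knn_predict(train, (5, 4), k=3))   # -> "B"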

Weeks 10-14: Clustering


Week 10:
Hierarchical Clustering:
see section B in the overview (Rom.: Privire de ansamblu) of the Clustering chapter in
Ciortuz et al.'s exercise book;
pr. 1-6; 23-26, 38
Week 11:
The k-Means Algorithm
see section C1 in the overview of the Clustering chapter in Ciortuz et al.'s exercise book;
pr. 7-12, 21 [7-13, in the 2016 version]; 27-32
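A compact sketch of the k-Means (Lloyd) iteration, alternating the assignment step and the centroid-update step; kmeans_1d and the 1-D toy points are invented for brevity and are not the exercise-book pseudo-code:

    def kmeans_1d(points, centroids, n_iter=10):
        # alternate: assign each point to the nearest centroid, then move each
        # centroid to the mean of its assigned points
        for _ in range(n_iter):
            clusters = {c: [] for c in range(len(centroids))}
            for x in points:
                nearest = min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))
                clusters[nearest].append(x)
            centroids = [sum(xs) / len(xs) if xs else centroids[c]
                         for c, xs in clusters.items()]
        return centroids, clusters

    points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]
    centroids, clusters = kmeans_1d(points, centroids=[1.0, 9.0])
    print(centroids)   # -> [1.5, 8.5]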
Week 12:
Estimating the parameters of probabilistic distributions: the MAP and MLE methods;
Ciortuz et al.'s exercise book, ch. 1, pr. 28-31; 58-60 [58-63, in the 2016 version]
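To illustrate the MLE method mentioned for week 12 (a simulation sketch, not a problem from the book): for i.i.d. Gaussian data the maximum-likelihood estimates are the sample mean and the (biased) sample variance, which the snippet checks numerically:

    import random

    random.seed(1)
    mu_true, sigma_true = 3.0, 1.5
    xs = [random.gauss(mu_true, sigma_true) for _ in range(50_000)]

    # Gaussian MLE: mu_hat = (1/N) sum_i x_i,  sigma2_hat = (1/N) sum_i (x_i - mu_hat)^2
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    print(round(mu_hat, 3), round(sigma2_hat, 3))   # close to 3.0 and 2.25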
Week 13:
using the EM algorithm to solve GMMs (Gaussian Mixture Models), the uni-variate case:
see section C2 in the overview of the Clustering chapter in Ciortuz et al.'s exercise book;
pr. 13bc, 14, 15, 33 [14bc-17, in the 2016 version];
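A rough sketch of EM for a univariate Gaussian mixture (E-step: responsibilities; M-step: mixing weights, means, variances); em_gmm_1d, the initial values and the toy data are invented, and a fixed iteration count is used instead of a convergence test:

    from math import exp, pi, sqrt

    def normal_pdf(x, mu, var):
        return exp(-(x - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)

    def em_gmm_1d(xs, pis, mus, vars_, n_iter=50):
        k = len(mus)
        for _ in range(n_iter):
            # E-step: responsibilities r[i][j] = P(component j | x_i)
            r = []
            for x in xs:
                num = [pis[j] * normal_pdf(x, mus[j], vars_[j]) for j in range(k)]
                s = sum(num)
                r.append([v / s for v in num])
            # M-step: re-estimate mixing weights, means and variances
            for j in range(k):
                nj = sum(r[i][j] for i in range(len(xs)))
                pis[j] = nj / len(xs)
                mus[j] = sum(r[i][j] * xs[i] for i in range(len(xs))) / nj
                vars_[j] = sum(r[i][j] * (xs[i] - mus[j]) ** 2 for i in range(len(xs))) / nj
        return pis, mus, vars_

    xs = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
    print(em_gmm_1d(xs, pis=[0.5, 0.5], mus=[0.0, 6.0], vars_=[1.0, 1.0]))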
Week 14:
EM for GMMs (Gaussian Mixture Models), the multi-variate case:
Ciortuz et al.'s exercise book, the Clustering chapter: pr. 16-19, [20], 34-37 [18-21, 22, 34-37 in the 2016 version];
The EM Algorithmic Schemata: Tom Mitchell, Machine Learning, sect. 6.12.2.
Ciortuz et al.'s exercise book, the EM Algorithm chapter; pr. 2 [3-4
Weeks 15-16: final EXAM

Errata for [the preliminary draft of] the book
"Exercitii de invatare automata" (Machine Learning Exercises)
by L. Ciortuz, A. Munteanu, E. Badarau

Note:
The corrections written in blue were made by Mr. Dulman, editor at Editura UAIC, Iasi. Many thanks.

1. p. 4, line 7 from the top: "un un" → "un"
2. p. 4, line 9 from the top: "mentuinati" → "mentionati"
Ch. 1. Probability and statistics
1. p. 5, line 6 from the top: "procedeie" → "procedee"
2. p. 5, line 2 from the bottom: "probabiltati" → "probabilitati"
3. p. 6, line 8 from the bottom: "etropie" → "entropie"
4. p. 7, line 13 from the top: "de a anulare" → "de anulare"
5. problem 2, p. 8, line 2 from the bottom: a ∈ (0, 1] → a ∈ (0, 1)
6. problem 3, p. 10, line 2 (counting from the top, but not counting the problem's header): "vede" → "verde"
7. (Stefan Pantiru)
problem 4, p. 11:
line 11: 14 → 15
between line 11 and line 12 the following line must be inserted: S = 2 : (1, 1)
line 16: 14/36 = 7/18 → 15/36
line 18: (2/36) / (14/36) = 1/7 → (2/36) / (15/36) = 2/15
line 19: 1/7 → 2/15
8. (student Grigorita Mihai)
problem 8, p. 15, line 11 from the top:
"cele doua usi ramase (usa 2 si usa 3)" → "cele doua usi ramase (usa 1 si usa 3)"
9. (student Dobranici Alexandru)
problem 10, p. 19, line 7 from the top:
= Σ_{w ∈ Val(W)} P({ω | W(ω) = w}) + Σ_{z ∈ Val(Z)} P({ω | Z(ω) = z})
→ = Σ_{w ∈ Val(W)} w · P({ω | W(ω) = w}) + Σ_{z ∈ Val(Z)} z · P({ω | Z(ω) = z})
10. problem 13, p. 22, line 7 from the bottom: "in" → "în"
11. problem 14, p. 23
line 17 from the bottom: "de-a-n saritelea" → "de-a saritelea"
lines 10 and 11 from the bottom:
"pozitia medie la care ne asteptam sa fie iepurasul (adica media tuturor pozitiilor in care poate ajunge iepurasul)" → "pozitia (medie) la care ne asteptam sa fie iepurasul"
12. problem 18, p. 28:
line 12 from the bottom: W si Z → W : Ω → R si Z : Ω → R
line 5 from the bottom: X si Y → X : Ω → R si Y : Ω → R
13. problem 19, p. 28, line 12 from the bottom: log → ln
14. problem 22:
p. 31, line 11 from the bottom (excluding the formulas): "triunghilara" → "triunghiulara"
p. 31, line 2 from the bottom (excluding the footnote): "Fie D este" → "Fie D"
p. 32, line 12 from the bottom: "in sensul" → "în sensul"
p. 34, line 3 from the top: "probabiltati" → "probabilitati"
p. 34, line 6 from the top: "verosimiltatii" → "verosimilitatii"

15. problem 23
p. 35, line 3 from the bottom: "fom" → "vom"
p. 36, line 4 from the bottom: correction to the displayed integral of the form ∫ (σ²v² + 2σμv + μ²) e^(−v²/2) dv
16. problem 25, p. 38, line 12 from the bottom: "defintia" → "definitia"

17. problem 26:
p. 39, line 8 (counting from the top, but not counting the problem's header): p(x, μ, Σ) → p(x; μ, Σ)
p. 40, line 7 from the top: correction to the displayed expression of the bivariate Gaussian density p(x; μ, Σ) for a diagonal covariance matrix Σ = diag(σ1², σ2²) (the normalisation factor 1/(2π σ1 σ2) and the quadratic form in (x1 − μ1, x2 − μ2))
p. 40, line 11 from the bottom: "aletoare" → "aleatoare"
18. problem 27, p. 41:
line 2: "ceasta" → "aceasta"
line 11: "variatia" → "varianta"
line 21: "rezulta pentru orice a > 0" → "rezulta"
lines 25-27 must be deleted
19. problem 27, p. 42, line 3: "centrale" → "centrala"
20. problem 29, p. 44
line 7 from the top: "indepenedent" → "independent"
line 10 from the top: correction involving the condition N > 0
21. problem 30:
p. 46, line 5 from the bottom, the right-hand side of that equality: correction to the displayed log-likelihood expression (the sum over i = 1, . . . , N of the logarithmic terms and of (xi − μ)²/(2σ²))
p. 47, last line, the right-hand side of that equality: correction to the displayed expression (the same sum over i = 1, . . . , N, together with the term 2σ² log C)

22. problem 31, p. 51, line 4 from the bottom:
the "=" sign is overwritten by the expression p(θ | x1, . . . , xn); this has been corrected
23. (Stefan Pantiru)
problem 32, p. 53, line 5 from the bottom:
H(1/3, 1/3, 1/6) = (1/2) H(1/2, 1/2) + (1/2) H(2/3/1/3)
→ H(1/2, 1/3, 1/6) = H(1/2, 1/2) + (1/2) H(2/3, 1/3).
24. problem 32, p. 56, line 3 from the bottom: correction to the displayed expression (the terms of the form (1/m) log t and (1/n) log s)
25. problem 34, p. 59
"aruncarea unui zar perfect cu 6 fete" → "aruncarea a doua zaruri perfecte, cu 6 fete"
26. problem 34, p. 60
line 6: "Informtion" → "Information"; "Surpise" → "Surprise"
line 8: "suprins" → "surprins"
27. problem 34, p. 61:
in the second row of the table at point d., all the fractions 1/36 must be replaced by 1/6
28. problem 35, p. 61, first line of footnote 8:
a ≠ 1 → a ≠ 1 si c ≠ 1
29. problem 36:
p. 63, first line of the third table on this page: XI → Xi
p. 63, line 6 from the bottom:
MI(X4, Y) = 0.3113 MI(X5, Y) = 0.3113 → MI(X4, Y) = 0.3113 si MI(X5, Y) = 0.3113
p. 64, line 1 from the top: "adaga" → "adauga"
30. problem 37: p. 64, line 10 from the top: "prelucare" → "prelucrare"
p. 65, lines 9 and 10 from the top:
"in raport cu Y Y pe de o parte" → "in raport cu Y pe de o parte,"
31. problem 39, p. 69
line 10 from the top: "guaranta" → "garanta"
line 18 from the top: "posibl" → "posibil"
32. problem 40, p. 70
line 10 from the top: "Acesta" → "aceasta"
line 10 from the top: "avem nevoie a pentru a justifica" → "avem nevoie pentru a justifica"
33. problem 41, p. 70, line 9 from the bottom: "Enunt" → "Enuntul"

34. problem 45, p. 72: "venimente" → "evenimente"
35. problem 50, p. 74, line 7 from the bottom: P(X0) = 0 → X0 = 0 cu probabilitate 1
36. problem 55, p. 76, line 9 from the bottom:
(4/3)(1 − x²) → (4/3)(1 − x³)
37. problem 58, p. 77, line 8 from the bottom: x1, x2, . . . , xn → x1, x2, . . . , xn (the index 2 of x2 set as a subscript)
38. (student Nita Axenia, FII)
problem 58, p. 78, line 8 from the top:
(x1, y1), (x1, y1), . . ., (xn, yn) → (x1, y1), (x2, y2), . . ., (xn, yn)
39. problem 59, p. 78, line 3 from the bottom:
xi Σ_i sin(wxi) cos(wxi) = Σ_i xi yi cos(wxi) → Σ_i xi sin(wxi) cos(wxi) = Σ_i xi yi cos(wxi)
40. p. 80: the title of the subsection "Elemente de teoria informatiei" must be moved to the beginning of this page (that is, immediately above problem 61)
41. problem 62, p. 80, line 5 from the bottom (not counting the footnote): "aleatore" → "aleatoare"
Ch. 2. Decision trees
1. (student Cojocaru Cristian, FII)
p. 84, line 4 from the top: "gready" → "greedy"
2. p. 85, line 12 from the top: "arborele" → "arbore"
3. p. 86, lines 7-8 from the top: "superiora" → "superioara"
4. problem 2
(students Munteanu Petru, Sutea-Dragan Silviu)
p. 91, in the computation of H_Comestibila, second line: log2 5 → (5/8) log2 5
p. 93, line 8 from the bottom (not counting the figures): "porceda" → "proceda"
p. 95:
line 1 from the top: "facand" → "făcând"
line 6 from the top: "nvatare" → "învatare"

5. problem 3
p. 95, line 11 from the top: "prelucare" → "prelucrare"
p. 95, line 3 from the bottom:
"valoarea maxima a entropiei unei variabile boolene" → "valoarea maxima a entropiei [conditionale medii a] unei variabile boolene"
p. 96, line 3 from the top:
"Cele doua entropii sunt egale" → "Cele doua entropii [conditionale medii] sunt egale"
6. problem 4, p. 97, line 14 from the bottom: "suprevietuitor" → "Supravietuitor"
7. problem 5
p. 99, line 9 from the top: "Care va fi eroarea de antrenare" → "Care va fi eroarea la antrenare"
p. 99, line 16 from the bottom: "eroarea la antrenare" → "rata [medie a] erorii"
p. 99, line 11 from the bottom: "rata de eroare" → "eroarea la antrenare"
p. 99, line 9 from the bottom: "rata de eroare" → "eroare [la antrenare]"
p. 99, line 4 from the bottom: "rata maxima a erorii" → "eroarea maxima"
p. 99, line 9 from the bottom: "fiecarei" → "fiecărei"
p. 100, line 2 from the top:
"(ca tupluri de valori ale atributelor de intrare)" → "ca tupluri de valori ale atributelor de intrare"
p. 100, line 6 from the top: "Rata maxima de eroare" → "Eroarea maxima la antrenare"
8. problem 8, p. 103:
in the first figure on this page, the instances X = 4, X = 6 and X = 8.5 must be circled;
in the second figure on this page, the instances X = 4, X = 6, X = 8, X = 8.5 and X = 9 must be circled;
line 8 from the top: "se de exemple" → "set de exemple"
lines 12-13 from the bottom must be rewritten as follows:
"Este imediat ca pentru restul de 7 cazuri (X = 1, 2, 3, 7, 8, 9, 10), arborele determinat de algoritmul DT2 va clasifica corect punctul X."
9. problem 9
p. 104, line 22 from the bottom: "atentie" → "atenție"
p. 104, line 7 from the bottom: "nod frunza a acestui arbore" → "nod frunza al acestui arbore"
p. 105, line 5 from the top: "nod frunza a acestui arbore" → "nod frunza al acestui arbore"
10. problem 11:
p. 107, lines 1 and 2 from the top:
"entropia conditionala ai multimii S" → "entropia conditionala medie a multimii S"
p. 107, line 4 from the top: "atributul A" → "atributului A"
p. 108, line 9 from the bottom: "face" → "facă"
11. problem 12, p. 109, line 1: "entropia [specifica]" → "entropia conditionala specifica"
12. problem 13
p. 111, line 9 from the bottom (and 6 more times further down): "ne-omogeneitate" → "ne-omogenitate"
p. 112, line 16 from the bottom: "demosntrat" → "demonstrat"
13. problem 15
p. 115, line 19 from the bottom: "în a avans" → "în avans"
p. 115, line 17 from the bottom: "acceasi" → "aceeasi"
p. 116, line 9 from the bottom: "acceasi" → "aceeasi"
14. problem 16, p. 120, line 3 from the bottom (not counting the formulas at the bottom of the page):
"calculul castigul de informatie" → "calculul castigului de informatie"
15. problem 17
p. 121, line 14 from the bottom: "truchiere" → "trunchiere"
p. 122, line 5 from the top: "al caror un castig" → "al caror castig"
p. 122, line 10 from the top: "vice versa" → "viceversa"
p. 123, line 20 from the bottom: correction to the inequality between the two IG expressions
16. problem 18
p. 124, line 14 from the top: "respingea" → "respingerea"
p. 124, line 3 from the bottom: "matrice de contingenta" → "matrice de contingenta,"
p. 127, in the last formula: correction to the squared terms involving the fraction 1/3

17. problem 19, p. 128
line 10 from the bottom: "ca" → "că"
line 7 from the bottom: "inconsistente" → "inconsistente,"
18. problem 20, p. 129, line 2 of the statement: "corespunde" → "corespunde/corespund"
19. problem 21, p. 129, line 7 from the top: "funtiile" → "functiile"
20. problem 22, p. 129:
line 15 from the bottom: "acesta" → "aceasta"
line 8 from the bottom: "acesta" → "aceasta"
21. problem 27, p. 132, line 5 from the top: "costruit" → "construit"
22. problem 28, p. 132, line 14 from the bottom: "corss-validare" → "cross-validare"
23. problem 34, p. 139, line 21 from the top: "ca care are" → "ca are"
24. problem 35, p. 140, line 7 from the bottom (not counting the footnote): IG(Y | Xi) → IG(Y, Xi)
Ch. 3. Bayesian classification
1. p. 143, line 7 from the bottom (not counting the footnote):
"probabiltati" → "probabilitati"
2. problem 4, p. 150, line 16 from the bottom: "tablelului" → "tabelului"
3. problem 5
p. 155, lines 1 and 3 from the bottom (not counting the formulas and the table in the lower part of the page): "Eroarea medie" → "Rata medie a erorii"
p. 156, line 5 from the top (not counting the table in the first part of the page): "Eroarea medie" → "Rata medie a erorii"
4. problem 6, p. 156, line 8 from the bottom:
"sa aiba rata (i.e., media) erorii de 50%" → "sa aiba rata medie a erorii de 50%"
5. problem 7
p. 159, line 14 from the bottom: "indepenedente" → "independente"
p. 159, line 1 from the bottom (not counting the formulas in the last part of the page): "indepenedente" → "independente"
6. problem 8, p. 162
lines 15 and 16 from the top:
"cate N_NB = 1 + log2 d observatii pentru fiecare din variabilele Xi (i = 1, d)," → "N_NB = 1 + log2 d observatii,"
line 7 from the bottom: "sa ori 0 ori 1" → "sa fie ori 0 ori 1"
7. problem 9, p. 165
lines 6 and 7 from the top:
"testele X > 2.5 si X sunt X > 3.625 sunt sunt 0.419" → "testele X > 2.5 si X > 3.625 sunt 0.419"
lines 12 and 16 from the bottom (not counting the graphs in the last part of the page):
") fie" → "), fie"
8. problem 10, p. 167, line 2 from the bottom: "aceesi" → "aceeasi"
9. problem 13, p. 168, line 4 from the bottom: "Aadar" → "Asadar"
10. problem 15, p. 170, line 3 from the top (not counting the table in the first part of the page): "rata medie erorilor" → "rata medie a erorilor"
Ch. 4. Memory-based (instance-based) learning
1. problem 1, p. 177
line 8 from the bottom: "cei clasificari" → "cei trei clasificatori"
line 2 from the bottom: "rastrange" → "restrange"
2. problem 3, p. 180, line 5 from the top: "orecare" → "oarecare"
3. problem 4, p. 180, line 2 from the bottom: "memoreza" → "memoreaza"
4. problem 5, p. 184, line 10 from the top: 0 > 1/4 → 0 < 1/4
5. problem 6, p. 185, first line from the top: k ≠ j [k/j → k ≠ j [k/j]
6. problem 7, p. 187, lines 11 and 2 from the bottom:
"Pentru orice punct (x, y) situat in afara celor doua zone hasurate, vom avea d+ = |x − 1| si d− = |x + 1|."
This statement is not true for all the zones in question (that is, the zones other than the hatched ones). The corrections, impossible to reproduce here for lack of space, have been introduced in the new version (in preparation) of the exercise book.
7. (student Ionel Stefanuca, FII)
problem 8, p. 189: in the figure at the end of the solution, the sign must be added at position (6, 4.5), corresponding to the value learned by the algorithm for x = 6.
8. p. 192, problem 9, in the solution of point d, second line of the proof: correction to the displayed inequality (the terms n ln(1/(1 − p)) and (1/2) ln 10)
9. problem 11, p. 198, line 7 from the top: x > 3.5 → x > 4.5
10. problem 12
p. 199, line 3 from the bottom (not counting the table in the last part of the page): "urmaator" → "urmator"
p. 200, line 18 from the top: "testare" → "antrenare"
11. problem 14, p. 202, line 3 from the top: "cros-validare" → "cross-validare"
12. problem 18, p. 203, line 3 from the bottom: "euclideana" → "euclidiana"
Ch. 5. Clustering
1. p. 208, line 7 from the top: "culsterizare" → "clusterizare"
2. p. 208, line 14 from the bottom:
(2 / (|A| (|A| − 1))) Σ_{x,y∈A} d(x, y) → (1 / C²_{|A|}) Σ_{x,y∈A} d(x, y)
3. p. 209
line 1 from the top: "cluserizare" → "clusterizare"
line 14 from the top: "a functie" → "o functie"
4. p. 210
line 6 from the top must begin with: "pentru K > 0 fixat,"
line 16 from the top: "intr-clustere" → "intra-clustere"
5. p. 214, line 10 from the top: "detalii în referitoare" → "detalii referitoare"
6. p. 214, line 18 from the top: "evaluaze" → "evalueze"
7. problem 1, p. 216, line 5 from the bottom (not counting the footnotes):
h(C6) = 6/1.4 ≈ 4.285 → c(C6) = 6/1.4 ≈ 4.285
8. problem 3, p. 221, line 6 from the top: "urmatorul setul de date" → "urmatorul set de date"
9. problem 4, p. 223, line 8 from the top:
"distantele dintre punctele consecutive se gasesc in ordine strict crescatoare" → "distantele dintre punctele care se clusterizeaza succesiv sunt in ordine strict crescatoare"
10. problem 5, p. 224, lines 6-7 from the top: "clustere detop" → "clustere de top"
11. problem 6
p. 225, line 11 from the top:
"ale carui noduri sunt elementele din S" → "ale carui noduri frunza sunt elementele din S"
p. 227, line 2 from the bottom (in the footnote): "folosese" → "folosesc"
12. problem 7
p. 230, line 5 from the bottom: "procedeie" → "procedee"
p. 230, line 3 from the bottom: "procedeie" → "procedee"
13. problem 8, p. 231
line 2 from the top: "clustrele" → "clusterele"
line 13 from the top: "celalalt" → "celălalt"
line 5 from the bottom (not counting the figures): "clustre" → "clustere"
14. problem 11:
p. 237, line 8 from the bottom: "panul euclidian" → "planul euclidian"
p. 239, line 17 from the bottom: "aceasi motiv" → "acelasi motiv"
15. problem 13
p. 245, line 6 from the top: a → a²
p. 245, line 10 from the top: a, b, a si b → a, b, a², b², a si b
p. 248, lines 1-2 from the bottom: "probabiltatile" → "probabilitatile"
16. problem 14, p. 253
line 8 from the top: correction to the displayed update formula (the terms of the form 1 − Pm(t), Σ_i p_i1 (x_i − μ_1)² and Σ_i p_i1 + 3)
lines 11-14 from the top: the index 0 must be replaced everywhere with 1
17. problem 16, p. 257, line 7 from the bottom (not counting the figures at the bottom of the page):
"K-means:" → "K-means"

18. problem 18
p. 260, line 5 from the bottom (not counting the footnotes): "algorimtului" → "algoritmului"
p. 261, line 1 from the bottom (not counting the figures at the bottom of the page):
"20 iteratii" → "20 de iteratii"
19. problem 19, p. 262, line 5 from the bottom: "for fi" → "vor fi"
20. problem 20:
p. 267, line 13 from the bottom: "folosesste" → "foloseste"
p. 269, line 9 from the top: "variabilel" → "variabilele"
p. 269, line 14 from the top: "problema optimul local" → "problema optimului local"
21. problem 1, p. 269, lines 6 and 13 from the bottom:
"advantaje" → "avantaje"
22. problem 26, p. 273, lines 8-9 from the top:
"Justificati alegerea in mod riguros facuta." → "Justificati in mod riguros alegerea facuta."
23. problem 33, p. 276, line 19 from the top: "de de" → "de"
24. problem 34, p. 277, line 2 from the top: "modele urmatoare" → "modelele urmatoare"
25. problem 35:
p. 279, line 6 from the top: "set of instante" → "set de instante"
p. 279, line 9 from the top: "pina" → "pana"
26. problem 36:
p. 280, lines 4-5 from the top: "prin pentru modelare" → "prin modelare"
p. 280, line 10 from the top:
"Daca un anumit pas nu necesar" → "Daca un anumit pas nu este necesar"
27. problem 38, p. 282, line 3 from the bottom (see footnote 67): "clustrelor" → "clusterelor"
