CS426 Solution for Homework 1
2.3.1
0.5
(c) Map: for each integer i in the file, emit the key-value pair (i, 1).
Reduce: for each key i, collapse the value list into a single 1, so that (i, 1) is emitted once.
Note that the result, the set of distinct integers, is read from the keys of the output.
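The Map and Reduce steps above can be sketched in plain Python. This is only an in-memory illustration, not a real MapReduce job: the function names (`map_phase`, `shuffle`, `reduce_phase`, `distinct_integers`) and the sample input are assumptions made for the example.

```python
from collections import defaultdict


def map_phase(integers):
    """Emit the key-value pair (i, 1) for every integer i in the input."""
    for i in integers:
        yield (i, 1)


def shuffle(pairs):
    """Group values by key, as the framework would do between Map and Reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the whole value list for a key into a single 1."""
    return (key, 1)


def distinct_integers(integers):
    """Run Map, shuffle, and Reduce, then read the answer from the output keys."""
    groups = shuffle(map_phase(integers))
    output = [reduce_phase(key, values) for key, values in groups.items()]
    return sorted(key for key, _ in output)


print(distinct_integers([3, 1, 3, 2, 1]))  # [1, 2, 3]
```

Each distinct integer appears exactly once among the output keys, however many times it occurred in the input.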
6.1.1
0.5
(a) frequent items are: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
6.1.5
1
(a) {5, 7} → 2
The baskets containing both item 5 and item 7 are baskets 35 and 70, of which only basket 70 also contains item 2. Hence, the confidence of the association rule {5, 7} → 2 is 1/2.
(b) {2, 3, 4} → 5
The baskets whose numbers are multiples of 12 contain the itemset {2, 3, 4} as a subset; there are 8 such baskets. Only those whose numbers are multiples of 60 contain the itemset {2, 3, 4, 5} as a subset; there is 1 such basket. Hence, the confidence of the association rule {2, 3, 4} → 5 is 1/8.
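The two confidences can be checked by brute force. The sketch below assumes the setup that the divisibility argument implies: 100 baskets numbered 1 to 100, with item i in basket b exactly when i divides b. The names `BASKETS`, `support`, and `confidence` are illustrative.

```python
def basket(b):
    """Items in basket b: every i in 1..100 that divides b (assumed setup)."""
    return {i for i in range(1, 101) if b % i == 0}


BASKETS = [basket(b) for b in range(1, 101)]


def support(itemset):
    """Number of baskets that contain every item of the itemset."""
    wanted = set(itemset)
    return sum(1 for items in BASKETS if wanted <= items)


def confidence(lhs, rhs):
    """Confidence of the rule lhs -> rhs: support(lhs plus rhs) / support(lhs)."""
    return support(set(lhs) | {rhs}) / support(lhs)


print(support({5, 7}), confidence({5, 7}, 2))        # 2 0.5
print(support({2, 3, 4}), confidence({2, 3, 4}, 5))  # 8 0.125
```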
6.2.1
0.5
For any pair {i, j} with $1 \le i < j \le 20$ in the triangular matrix, the corresponding index is $k = (i - 1)(20 - i/2) + j - i$.
Solving $100 = (i - 1)(20 - i/2) + j - i$ subject to $1 \le i < j \le 20$, we get the pair {7, 8}.
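As a check on the index formula, a short search over all pairs finds the one stored at position 100. The helper name `triangular_index` is an assumption for this sketch. The formula is rewritten as (i - 1)(2n - i)/2 + j - i so that it stays in integer arithmetic.

```python
def triangular_index(i, j, n=20):
    """Index k of the pair {i, j}, 1 <= i < j <= n, in the triangular array.

    Equivalent to k = (i - 1)(n - i/2) + j - i; the product (i - 1)(2n - i)
    is always even, so the integer division is exact.
    """
    return (i - 1) * (2 * n - i) // 2 + j - i


matches = [
    (i, j)
    for i in range(1, 20)
    for j in range(i + 1, 21)
    if triangular_index(i, j) == 100
]
print(matches)  # [(7, 8)]
```

The first pair {1, 2} maps to index 1 and the last pair {19, 20} maps to index 190, the number of pairs among 20 items.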
6.2.6
1
(a)
Candidate itemsets of size 1 (C1):
Items with index 1~100
Truly frequent itemsets of size 1 (L1):
Items with index 1~20
Candidate itemsets of size 2 (C2):
All pairs {i, j} with 1 ≤ i < j ≤ 20
Truly frequent itemsets of size 2 (L2):
The same as 6.1.1 (b)
Candidate itemsets of size 3 (C3):
All sets c with |c| = 3 such that c = a ∪ b for some a, b ∈ L2
Truly frequent itemsets of size 3 (L3):
{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {1, 2, 6}, {1, 2, 7}, {1, 2, 8}, {1, 2, 9}, {1, 2, 10}, {1, 2, 12}, {1, 2, 14}, {1, 2, 16}, {1, 2, 18}, {1, 2, 20},
{1, 3, 4}, {1, 3, 5}, {1, 3, 6}, {1, 3, 9}, {1, 3, 12}, {1, 3, 15}, {1, 3, 18},
{1, 4, 5}, {1, 4, 6}, {1, 4, 8}, {1, 4, 10}, {1, 4, 12}, {1, 4, 16}, {1, 4, 20},
[Tables: only the headers and row labels survive, in this order. Table of item and support. Table of pair and support for the pairs {1, 2} through {5, 6}. Table of pair and bucket for the same pairs. (b) (c) Table of bucket and support. The only surviving value is the bucket number 10.]
0.5
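(a)
$$M^{T}M = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 2 & 4 \\ 3 & 9 \\ 4 & 16 \end{bmatrix} = \begin{bmatrix} 30 & 100 \\ 100 & 354 \end{bmatrix}$$
$$MM^{T} = \begin{bmatrix} 1 & 1 \\ 2 & 4 \\ 3 & 9 \\ 4 & 16 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 9 & 16 \end{bmatrix} = \begin{bmatrix} 2 & 6 & 12 & 20 \\ 6 & 20 & 42 & 72 \\ 12 & 42 & 90 & 156 \\ 20 & 72 & 156 & 272 \end{bmatrix}$$
(b) Eigenpairs of $M^{T}M$ (each eigenvector determined only up to sign) are:
$$\lambda_1 = 382.3786, \quad \lambda_2 = 1.6214, \quad e_1 = \begin{bmatrix} 0.273 \\ 0.962 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0.962 \\ -0.273 \end{bmatrix}$$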
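The eigenpairs of the 2 × 2 matrix $M^{T}M$ can be reproduced without a linear-algebra library, using the characteristic polynomial. This is a minimal sketch: the helper names `gram` and `eig_sym_2x2` are assumptions, and the closed form used here applies only to symmetric 2 × 2 matrices with a nonzero off-diagonal entry.

```python
import math

M = [[1, 1], [2, 4], [3, 9], [4, 16]]


def gram(matrix):
    """Return M^T M for a matrix given as a list of rows."""
    cols = list(zip(*matrix))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def eig_sym_2x2(A):
    """Eigenpairs of a symmetric 2x2 matrix [[a, b], [b, d]] with b != 0."""
    (a, b), (_, d) = A
    mean = (a + d) / 2
    det = a * d - b * b
    root = math.sqrt(mean * mean - det)
    pairs = []
    for lam in (mean + root, mean - root):
        # (A - lam*I) v = 0 is solved by v proportional to (b, lam - a).
        v = (b, lam - a)
        norm = math.hypot(v[0], v[1])
        pairs.append((lam, (v[0] / norm, v[1] / norm)))
    return pairs


G = gram(M)
print(G)  # [[30, 100], [100, 354]]
for lam, vec in eig_sym_2x2(G):
    print(round(lam, 4), [round(x, 3) for x in vec])
```

The nonzero eigenvalues of $MM^{T}$ are the same two values, since $MM^{T}$ and $M^{T}M$ share their nonzero eigenvalues.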
(c) Eigenvalues of $MM^{T}$ are:
$\lambda_1 = 382.3786$, $\lambda_2 = 1.6214$, $\lambda_3 = \lambda_4 = 0$
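(d) Eigenvectors of $MM^{T}$ (each determined only up to sign) are:
$$x_1 = \begin{bmatrix} 0.0632 \\ 0.2247 \\ 0.4847 \\ 0.8430 \end{bmatrix}, \quad x_2 = \begin{bmatrix} 0.5411 \\ 0.6534 \\ 0.3369 \\ -0.4084 \end{bmatrix}, \quad x_3 = \begin{bmatrix} 0.7613 \\ -0.1570 \\ -0.5519 \\ 0.3021 \end{bmatrix}, \quad x_4 = \begin{bmatrix} 0.3517 \\ -0.7056 \\ 0.5891 \\ -0.1769 \end{bmatrix}$$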
11.3.2 0.5
We can map [0, 3, 0, 0, 4] into concept space by multiplying it by V, which gives the representation of Leslie in concept space, [1.74, 2.84]. Multiplying [1.74, 2.84] by $V^{T}$, we get [1.0092, 1.0092, 1.0092, 2.0164, 2.0164], which predicts how well Leslie would like the other movies.
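The two multiplications can be reproduced with a few lines of Python. The entries of $V^{T}$ below (0.58 and 0.71) are an assumption: they are the rounded values consistent with the numbers in this answer. The function names are illustrative.

```python
# V^T with one row per concept; the values are assumed (rounded to two decimals).
VT = [
    [0.58, 0.58, 0.58, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.71, 0.71],
]


def to_concept_space(ratings, vt):
    """Compute ratings * V: one dot product per row of V^T."""
    return [sum(r * v for r, v in zip(ratings, row)) for row in vt]


def to_movie_space(concepts, vt):
    """Compute concepts * V^T: one score per movie (column of V^T)."""
    n_movies = len(vt[0])
    return [sum(c * row[j] for c, row in zip(concepts, vt)) for j in range(n_movies)]


leslie = [0, 3, 0, 0, 4]
concepts = to_concept_space(leslie, VT)
scores = to_movie_space(concepts, VT)
print([round(c, 2) for c in concepts])  # [1.74, 2.84]
print([round(s, 4) for s in scores])    # [1.0092, 1.0092, 1.0092, 2.0164, 2.0164]
```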
11.4.2 1
(a)
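The columns for both Matrix and Alien are $[1, 3, 4, 5, 0, 0, 0]^{T}$, the row for Jim is $[3, 3, 3, 0, 0]$, and the row for John is $[4, 4, 4, 0, 0]$. Each chosen column has squared norm 51, so it is divided by $\sqrt{2 \times 51/243} = 0.648$, that is, multiplied by 1.54:
$$C = \begin{bmatrix} 1.54 & 1.54 \\ 4.63 & 4.63 \\ 6.17 & 6.17 \\ 7.72 & 7.72 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$
Scale the row for Jim by dividing it by $\sqrt{2 \times 27/243} = 0.471$ and the row for John by dividing it by $\sqrt{2 \times 48/243} = 0.6285$:
$$R = \begin{bmatrix} 6.36 & 6.36 & 6.36 & 0 & 0 \\ 6.36 & 6.36 & 6.36 & 0 & 0 \end{bmatrix}$$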
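The scaling of the sampled columns and rows can be sketched as follows. The constant 243 is the squared Frobenius norm used in this answer. The names `FROBENIUS_SQ`, `NUM_PICKS`, and `scale` are assumptions for the example. With unrounded scale factors, every nonzero entry of R comes out as 6.364.

```python
import math

FROBENIUS_SQ = 243  # squared Frobenius norm of the ratings matrix, as used above
NUM_PICKS = 2       # number of columns (and rows) sampled


def scale(vector):
    """Divide a sampled row or column by sqrt(r * p), with p = |vector|^2 / 243."""
    p = sum(x * x for x in vector) / FROBENIUS_SQ
    factor = math.sqrt(NUM_PICKS * p)
    return [x / factor for x in vector]


column = [1, 3, 4, 5, 0, 0, 0]  # column for Matrix (Alien is identical)
jim = [3, 3, 3, 0, 0]
john = [4, 4, 4, 0, 0]

print([round(x, 2) for x in scale(column)])  # [1.54, 4.63, 6.17, 7.72, 0.0, 0.0, 0.0]
print([round(x, 2) for x in scale(jim)])     # [6.36, 6.36, 6.36, 0.0, 0.0]
print([round(x, 2) for x in scale(john)])    # [6.36, 6.36, 6.36, 0.0, 0.0]
```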
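The matrix W is
$$W = \begin{bmatrix} 3 & 3 \\ 4 & 4 \end{bmatrix}.$$
Applying SVD on W, we get
$$W = X \Sigma Y^{T} = \begin{bmatrix} 0.6 & 0.8 \\ 0.8 & -0.6 \end{bmatrix} \begin{bmatrix} \sqrt{50} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0.707 & 0.707 \\ 0.707 & -0.707 \end{bmatrix}$$
$$\Sigma^{+} = \begin{bmatrix} 1/\sqrt{50} & 0 \\ 0 & 0 \end{bmatrix}$$
$$U = Y (\Sigma^{+})^{2} X^{T} = \begin{bmatrix} 0.707 & 0.707 \\ 0.707 & -0.707 \end{bmatrix} \begin{bmatrix} 0.02 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0.6 & 0.8 \\ 0.8 & -0.6 \end{bmatrix} = \begin{bmatrix} 0.0085 & 0.0113 \\ 0.0085 & 0.0113 \end{bmatrix}$$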
9.2.1(a) 0.5
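$$\cos(A, B) = \frac{8.2008 + 160000\alpha^{2} + 24\beta^{2}}{\sqrt{9.3636 + 250000\alpha^{2} + 36\beta^{2}}\,\sqrt{7.1824 + 102400\alpha^{2} + 16\beta^{2}}}$$
$$\cos(B, C) = \frac{7.8256 + 204800\alpha^{2} + 24\beta^{2}}{\sqrt{7.1824 + 102400\alpha^{2} + 16\beta^{2}}\,\sqrt{8.5264 + 409600\alpha^{2} + 36\beta^{2}}}$$
$$\cos(A, C) = \frac{8.9352 + 320000\alpha^{2} + 36\beta^{2}}{\sqrt{9.3636 + 250000\alpha^{2} + 36\beta^{2}}\,\sqrt{8.5264 + 409600\alpha^{2} + 36\beta^{2}}}$$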
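To evaluate these expressions for particular scale factors, a small helper is enough. The sketch assumes each computer is the vector (processor speed, α × disk size, β × main-memory size). The names `COMPUTERS`, `scaled`, `cosine`, and `pairwise_cosines` are illustrative, and α = β = 1 is just an example setting.

```python
import math

# (processor speed, disk size, main-memory size) for each computer
COMPUTERS = {
    "A": (3.06, 500, 6),
    "B": (2.68, 320, 4),
    "C": (2.92, 640, 6),
}


def scaled(features, alpha, beta):
    """Apply the scale factors alpha (disk size) and beta (main memory)."""
    speed, disk, memory = features
    return (speed, alpha * disk, beta * memory)


def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pairwise_cosines(alpha, beta):
    """Cosines for the pairs (A, B), (B, C), and (A, C)."""
    vecs = {name: scaled(f, alpha, beta) for name, f in COMPUTERS.items()}
    pairs = [("A", "B"), ("B", "C"), ("A", "C")]
    return {(p, q): cosine(vecs[p], vecs[q]) for p, q in pairs}


for pair, value in pairwise_cosines(alpha=1.0, beta=1.0).items():
    print(pair, round(value, 6))
```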
9.2.3
0.5
(a) avg = (4 + 2 + 5)/3 = 11/3
A: 4 − 11/3 = 1/3
B: 2 − 11/3 = −5/3
C: 5 − 11/3 = 4/3
(b) Processor Speed: 3.06 × 1/3 − 2.68 × 5/3 + 2.92 × 4/3 = 0.4467
Disk Size: 500 × 1/3 − 320 × 5/3 + 640 × 4/3 = 486.6667
Main-Memory Size: 6 × 1/3 − 4 × 5/3 + 6 × 4/3 = 3.3333
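The normalization and the weighted sums can be checked with a short script. The dictionary layout and the function names `normalize` and `profile` are assumptions for this sketch. As in the answer above, the profile is the plain weighted sum of the feature values, with the normalized ratings as weights.

```python
from fractions import Fraction

ratings = {"A": 4, "B": 2, "C": 5}

# (processor speed, disk size, main-memory size) for each computer
features = {
    "A": (3.06, 500, 6),
    "B": (2.68, 320, 4),
    "C": (2.92, 640, 6),
}


def normalize(ratings):
    """Subtract the user's average rating from every rating (exact fractions)."""
    avg = Fraction(sum(ratings.values()), len(ratings))
    return {name: rating - avg for name, rating in ratings.items()}


def profile(ratings, features):
    """Weighted sum of each feature, using the normalized ratings as weights."""
    weights = normalize(ratings)
    n_features = len(next(iter(features.values())))
    return [
        sum(float(weights[name]) * features[name][i] for name in features)
        for i in range(n_features)
    ]


print(normalize(ratings))
print([round(x, 4) for x in profile(ratings, features)])  # [0.4467, 486.6667, 3.3333]
```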
9.3.1
1
(a) Jaccard(A, B) = 4/8 = 1/2
Jaccard(B, C) = 4/8 = 1/2
Jaccard(A, C) = 4/8 = 1/2
(b)
$\cos(A, B) = \frac{17}{20\sqrt{2}} = 0.601$
$\cos(B, C) = \frac{11}{8\sqrt{10}} = 0.435$
$\cos(A, C) = \frac{11}{8\sqrt{5}} = 0.615$
9.4.1
1
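(a) Start with the U and V in Fig. 9.10 and replace the entry in row 3, column 2 of U by a variable x:
$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & x \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 \\ 1+x & 1+x & 1+x & 1+x & 1+x \\ 2 & 2 & 2 & 2 & 2 \\ 2 & 2 & 2 & 2 & 2 \end{bmatrix}$$
The contribution to the sum of squares from the third row is the sum of $(m_{3j} - (1 + x))^{2}$ over the nonblank entries $m_{3j}$ in the third row of M.
Replacing the entry in row 1, column 4 of V by a variable y instead changes only the fourth column of the product:
$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & y & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 & 2 & y+1 & 2 \\ 2 & 2 & 2 & y+1 & 2 \\ 2 & 2 & 2 & y+1 & 2 \\ 2 & 2 & 2 & y+1 & 2 \\ 2 & 2 & 2 & y+1 & 2 \end{bmatrix}$$
The minimizing value is y = 2.2, which gives
$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 2.2 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 2 & 2 & 3.2 & 2 \\ 2 & 2 & 2 & 3.2 & 2 \\ 2 & 2 & 2 & 3.2 & 2 \\ 2 & 2 & 2 & 3.2 & 2 \\ 2 & 2 & 2 & 3.2 & 2 \end{bmatrix}$$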