NJ - Corrected Final

Uploaded by

Rochak
NJ

Dr. Shivandappa
R V College of Engineering, Bengaluru
Algorithm: NJ (Neighbour Joining)
Illustration of NJ using the sample distance matrix given below (only the upper triangle is shown; the matrix is symmetric):

      H    R    D    C
H     0    5    6    9
R          0    3    6
D               0    5
C                    0
Step 1: For each pair of taxa i, j compute L(i,j).
To compute L(i,j), we first need the net deviation r(i) of each species i from the other species in the taxa set T = {H, R, D, C}, i.e., the sum of the distances from i to every other taxon.
When i = H: r(H) = d(H,R) + d(H,D) + d(H,C) = 5 + 6 + 9 = 20
When i = R: r(R) = d(H,R) + d(R,D) + d(R,C) = 5 + 3 + 6 = 14
When i = D: r(D) = d(H,D) + d(R,D) + d(D,C) = 6 + 3 + 5 = 14
When i = C: r(C) = d(H,C) + d(R,C) + d(D,C) = 9 + 6 + 5 = 20

      H    R    D    C    r(i)
H     0    5    6    9    20
R          0    3    6    14
D               0    5    14
C                    0    20
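The net deviation sums can be checked with a few lines of Python. This is only a sketch: the full symmetric matrix, the taxa order H, R, D, C and the helper name `net_deviation` are choices made here, not part of the slides.

```python
# Full symmetric distance matrix for the example; taxa order is H, R, D, C.
taxa = ["H", "R", "D", "C"]
d = [
    [0, 5, 6, 9],
    [5, 0, 3, 6],
    [6, 3, 0, 5],
    [9, 6, 5, 0],
]

def net_deviation(dist):
    """Return r(i), the sum of distances from taxon i to all other taxa."""
    # The diagonal entry is 0, so summing the whole row is enough.
    return [sum(row) for row in dist]

r = dict(zip(taxa, net_deviation(d)))
print(r)  # {'H': 20, 'R': 14, 'D': 14, 'C': 20}
```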
Step 1 (continued): For each pair of taxa i, j compute
L(i,j) = d(i,j) - (r(i) + r(j))/(n-2)
with T = {H, R, D, C} and n = 4.
When i = R, j = H: L(R,H) = d(R,H) - (r(R) + r(H))/(n-2) = 5 - (14+20)/(4-2) = 5 - 17 = -12
When i = D, j = H: L(D,H) = d(D,H) - (r(D) + r(H))/(n-2) = 6 - (14+20)/(4-2) = 6 - 17 = -11
When i = D, j = R: L(D,R) = d(D,R) - (r(D) + r(R))/(n-2) = 3 - (14+14)/(4-2) = 3 - 14 = -11
When i = C, j = H: L(C,H) = d(C,H) - (r(C) + r(H))/(n-2) = 9 - (20+20)/(4-2) = 9 - 20 = -11
When i = C, j = R: L(C,R) = d(C,R) - (r(C) + r(R))/(n-2) = 6 - (20+14)/(4-2) = 6 - 17 = -11
When i = C, j = D: L(C,D) = d(C,D) - (r(C) + r(D))/(n-2) = 5 - (20+14)/(4-2) = 5 - 17 = -12

Upper triangle: distances d(i,j). Lower triangle: L(i,j).

      H    R    D    C    r(i)
H     0    5    6    9    20
R   -12    0    3    6    14
D   -11  -11    0    5    14
C   -11  -11  -12    0    20
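The L(i,j) values can be reproduced with the sketch below. The dictionary layout and the function name `l_values` are illustrative assumptions. Two pairs tie at -12 in this matrix; Python's `min` returns the first minimum it meets, which is a tie-breaking choice of this sketch rather than a rule of the algorithm.

```python
from itertools import combinations

taxa = ["H", "R", "D", "C"]
d = {
    ("H", "R"): 5, ("H", "D"): 6, ("H", "C"): 9,
    ("R", "D"): 3, ("R", "C"): 6, ("D", "C"): 5,
}
r = {"H": 20, "R": 14, "D": 14, "C": 20}  # net deviations from Step 1

def l_values(taxa, d, r):
    """Return L(i,j) = d(i,j) - (r(i) + r(j)) / (n - 2) for every pair."""
    n = len(taxa)
    return {
        (i, j): d[(i, j)] - (r[i] + r[j]) / (n - 2)
        for i, j in combinations(taxa, 2)
    }

L = l_values(taxa, d, r)
best = min(L, key=L.get)  # first pair with the smallest L(i,j)
print(L)
print(best, L[best])  # ('H', 'R') -12.0
```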
Step 2: Pick the pair of taxa with the smallest L(i,j) and join them to form cluster W1, i.e., W1 ← {i, j}. In this case the smallest L(i,j) is -12, where i = R and j = H, so W1 ← {R, H}. (L(C,D) is also -12; this example joins R and H.)
Remove the members of W1 from the taxa set T = {H, R, D, C}:
T - W1 = {H, R, D, C} - {R, H} = {D, C} = T'
Add cluster W1 to T', i.e., T' ∪ W1 = T.
So, T = {W1, D, C}.
[Figure: H and R joined at the new node W1.]
T = {W1, D, C}, K = {D, C}. Compute the distance from W1 to each K in {D, C}, i.e., the distance between W1 and D as well as between W1 and C, using the following equation:
d(K,W1) = d(K,{i,j}) = 1/2 (d(K,i) + d(K,j) - d(i,j))
where W1 = {H, R}, i = H, j = R.
When K = D: d(D,W1) = d(D,{H,R}) = 1/2 (d(H,D) + d(R,D) - d(H,R)) = 1/2 (6 + 3 - 5) = 2
When K = C: d(C,W1) = d(C,{H,R}) = 1/2 (d(H,C) + d(R,C) - d(H,R)) = 1/2 (9 + 6 - 5) = 5
The distance between D and C is unchanged: d(D,C) = 5 (it is available in the old matrix).

      W1   D    C
W1    0    2    5
D          0    5
C               0
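As a quick check of the distance update, the sketch below applies the equation to both remaining taxa. The function name `cluster_distance` and its argument names are hypothetical, chosen here for illustration.

```python
def cluster_distance(d_ki, d_kj, d_ij):
    """Distance from taxon K to the new cluster {i, j}:
    d(K, {i, j}) = 1/2 * (d(K, i) + d(K, j) - d(i, j))."""
    return 0.5 * (d_ki + d_kj - d_ij)

# W1 = {H, R}, so i = H, j = R and d(H, R) = 5.
d_D_W1 = cluster_distance(d_ki=6, d_kj=3, d_ij=5)  # K = D
d_C_W1 = cluster_distance(d_ki=9, d_kj=6, d_ij=5)  # K = C
print(d_D_W1, d_C_W1)  # 2.0 5.0
```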
Step 3: Finally, compute the branch lengths (b) from W1 to i and from W1 to j.
[Note: In this example the branch length is called 'b', and the cluster 'c' is simply W1.] The branch-length equations are:
b(i,W1) = 1/(2(n-2)) * ((n-2)*d(i,j) + r(i) - r(j))
and
b(j,W1) = 1/(2(n-2)) * ((n-2)*d(i,j) + r(j) - r(i))
When W1 = {H, R}, i = H, j = R, n = 4:
Branch length from W1 to i (W1 to H):
b(H,W1) = 1/(2(4-2)) * ((4-2)*d(H,R) + r(H) - r(R)) = 1/4 * (2*5 + 20 - 14) = 16/4 = 4
Branch length from W1 to j (W1 to R):
b(R,W1) = 1/(2(4-2)) * ((4-2)*d(H,R) + r(R) - r(H)) = 1/4 * (2*5 + 14 - 20) = 4/4 = 1
[Figure: cluster W1 with branch lengths H-W1 = 4 and R-W1 = 1.]
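The branch-length calculation for W1 = {H, R} can be sketched as follows. The function name `branch_lengths` is an assumption made for this example; the factor in front is read as 1/(2(n-2)). The two lengths should add up to d(H,R) = 5, which makes a handy sanity check.

```python
def branch_lengths(d_ij, r_i, r_j, n):
    """Return (b_i, b_j), the branch lengths from the new node to i and j:
    b_i = ((n - 2) * d_ij + r_i - r_j) / (2 * (n - 2))."""
    scale = 2 * (n - 2)
    b_i = ((n - 2) * d_ij + r_i - r_j) / scale
    b_j = ((n - 2) * d_ij + r_j - r_i) / scale
    return b_i, b_j

# W1 = {H, R}: d(H, R) = 5, r(H) = 20, r(R) = 14, n = 4.
b_H, b_R = branch_lengths(d_ij=5, r_i=20, r_j=14, n=4)
print(b_H, b_R)  # 4.0 1.0
```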
Now reduce n by 1, i.e., n = 4 - 1 = 3. If n is 2, stop; otherwise go to Step 1.
Step 1: For each pair of taxa i, j compute L(i,j).
To compute L(i,j), compute the net deviation r(i) of each species i from the other species in the taxa set T = {W1, D, C}.
When i = W1: r(W1) = d(W1,D) + d(W1,C) = 2 + 5 = 7
When i = D: r(D) = d(W1,D) + d(D,C) = 2 + 5 = 7
When i = C: r(C) = d(C,W1) + d(C,D) = 5 + 5 = 10

      W1   D    C    r(i)
W1    0    2    5    7
D          0    5    7
C               0    10
Step 1 (continued): For each pair of taxa i, j compute L(i,j), with T = {W1, D, C} and n = 3.
When i = D, j = W1: L(D,W1) = d(D,W1) - (r(D) + r(W1))/(3-2) = 2 - (7+7)/(3-2) = 2 - 14 = -12
When i = C, j = W1: L(C,W1) = d(C,W1) - (r(C) + r(W1))/(3-2) = 5 - (10+7)/(3-2) = 5 - 17 = -12
When i = C, j = D: L(C,D) = d(C,D) - (r(C) + r(D))/(3-2) = 5 - (10+7)/(3-2) = 5 - 17 = -12

Upper triangle: distances d(i,j). Lower triangle: L(i,j).

      W1   D    C    r(i)
W1    0    2    5    7
D   -12    0    5    7
C   -12  -12    0    10
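It is no coincidence that all three values agree at n = 3. The short derivation below is added as an aside, not taken from the slides; it uses the same symbols d, r and L, with k standing for the third taxon.

```latex
% For n = 3 the divisor (n - 2) equals 1, and each net deviation has two terms:
%   r_i = d_{ij} + d_{ik},   r_j = d_{ij} + d_{jk}.
\begin{align*}
L_{ij} &= d_{ij} - \frac{r_i + r_j}{n-2}
        = d_{ij} - (d_{ij} + d_{ik}) - (d_{ij} + d_{jk}) \\
       &= -\,(d_{ij} + d_{ik} + d_{jk}).
\end{align*}
% The result is symmetric in i, j, k, so every pair gets the same value.
% Here: -(2 + 5 + 5) = -12.
```

So with three taxa left, any pair can be joined, and the choice of pair is purely a tie-breaking convention.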
Step 2: Pick the pair of taxa with the smallest L(i,j) and join them to form cluster W2, i.e., W2 ← {i, j}. In this case the smallest L(i,j) is -12 (all three pairs tie); this example takes i = D and j = W1, so W2 ← {W1, D}.
Remove the members of W2 from the taxa set T = {W1, D, C}:
T - W2 = {W1, D, C} - {W1, D} = {C} = T'
Add cluster W2 to T', i.e., T' ∪ W2 = T.
So, T = {W2, C}.
[Figure: W1 and D joined at the new node W2.]
T = {W2, C}, K = {C}. Compute the distance from W2 to K = C using the following equation:
d(K,W2) = d(K,{i,j}) = 1/2 (d(K,i) + d(K,j) - d(i,j))
When K = C, W2 = {W1, D}, where i = W1, j = D:
d(C,W2) = d(C,{W1,D}) = 1/2 (d(C,W1) + d(C,D) - d(W1,D)) = 1/2 (5 + 5 - 2) = 4

      W2   C
W2    0    4
C          0
Step 3: Finally, compute the branch lengths (b) from W2 to i and from W2 to j using the following equations:
b(i,W2) = 1/(2(n-2)) * ((n-2)*d(i,j) + r(i) - r(j))
and
b(j,W2) = 1/(2(n-2)) * ((n-2)*d(i,j) + r(j) - r(i))
When W2 = {W1, D}, i = W1, j = D, n = 3, and d(W1,D) = 2:
Branch length from W2 to i (W2 to W1):
b(W1,W2) = 1/(2(3-2)) * ((3-2)*d(W1,D) + r(W1) - r(D)) = 1/2 * (1*2 + 7 - 7) = 1
Branch length from W2 to j (W2 to D):
b(D,W2) = 1/(2(3-2)) * ((3-2)*d(W1,D) + r(D) - r(W1)) = 1/2 * (1*2 + 7 - 7) = 1

Now reduce n by 1, i.e., n = 3 - 1 = 2. If n is 2, stop and combine the clusters; otherwise go to Step 1.
The value of n is now 2, so the algorithm exits. The remaining taxon C is attached to W2 at the distance d(C,W2) = 4.
[Figure: the W1 cluster (H-W1 = 4, R-W1 = 1) combined with the W2 cluster (W1-W2 = 1, D-W2 = 1) to give the final tree.]
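The whole procedure (net deviations, L values, joining, distance update, branch lengths, repeat until n = 2) can be put together in one short Python sketch. It is a minimal illustration under stated assumptions, not a reference implementation: the function name `neighbour_joining`, the edge-list output and the first-minimum tie-breaking are all choices made here.

```python
from itertools import combinations

def neighbour_joining(taxa, dist):
    """Minimal NJ sketch. `dist` maps frozenset({a, b}) to a distance.
    Returns a list of (node, node, branch_length) edges."""
    nodes, d, edges, k = list(taxa), dict(dist), [], 0
    while len(nodes) > 2:
        n = len(nodes)
        # Step 1: net deviation r(i), then L(i,j) for every pair.
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}
        # Step 2: pair with the smallest L(i,j); ties go to the first pair met.
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: d[frozenset(p)] - (r[p[0]] + r[p[1]]) / (n - 2),
        )
        d_ij = d[frozenset((i, j))]
        k += 1
        w = f"W{k}"  # name of the new cluster
        # Step 3: branch lengths from the new node to i and to j.
        b_i = ((n - 2) * d_ij + r[i] - r[j]) / (2 * (n - 2))
        b_j = ((n - 2) * d_ij + r[j] - r[i]) / (2 * (n - 2))
        edges += [(i, w, b_i), (j, w, b_j)]
        # Distances from the new cluster to each remaining taxon.
        rest = [x for x in nodes if x not in (i, j)]
        for x in rest:
            d[frozenset((x, w))] = 0.5 * (
                d[frozenset((x, i))] + d[frozenset((x, j))] - d_ij
            )
        nodes = [w] + rest
    # n = 2: join the last two nodes directly.
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

pairs = {("H", "R"): 5, ("H", "D"): 6, ("H", "C"): 9,
         ("R", "D"): 3, ("R", "C"): 6, ("D", "C"): 5}
dist = {frozenset(p): v for p, v in pairs.items()}
edges = neighbour_joining(["H", "R", "D", "C"], dist)
for edge in edges:
    print(edge)
```

One way to sanity-check the output is to add up branch lengths along a path and compare with the input matrix. For example, the path from H to D runs through W1 and W2, and its three branches should sum to d(H,D) = 6.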
