
Link Analysis 1
EE412: Foundation of Big Data Analytics
Jaemin Yoo

Announcements
• Homeworks:
  • HW2 (due: 11/08)
  • HW3 (will be posted on 11/06)
  • Note: Each homework has its own claim session.
• Textbook vs. slides:
  • Prioritize the slides over the textbook.

Recap
• UV Decomposition
• UV Decomposition: Computation
• UV Decomposition: Variants
[Figure: R (m × n) ≈ U (m × k) × Vᵀ (k × n); first-order approximation f(y) + ∇f(y) used in the computation]

Outline
1. Web Search as a Graph
2. PageRank
3. PageRank: Implementation
Graphs
• Data structure that represents connections and relationships.
• Consists of nodes and edges.
• Can be directed or undirected.
• Represented as a sparse adjacency matrix (sketch below).
[Figure: example graph]
Source: GeeksforGeeks

Graph Data: Social Networks
[Figure: social network graph]
Source: [Backstrom et al., 2011]
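Since the slides mention representing a graph as a sparse adjacency matrix, here is a minimal sketch of one way to do it. The use of scipy.sparse and the tiny example graph are my own assumptions, not something specified in the lecture.

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical directed edges (source, destination) of a tiny graph.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
n = 3  # number of nodes

rows = [dst for _, dst in edges]
cols = [src for src, _ in edges]
data = np.ones(len(edges))

# Sparse adjacency matrix with A[dst, src] = 1 iff there is an edge src -> dst;
# only the nonzero entries are stored.
A = csc_matrix((data, (rows, cols)), shape=(n, n))
print(A.toarray())
```

Storing only the nonzeros is what makes web-scale graphs tractable, which is also why the PageRank implementation later in these slides keeps M sparse.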

Graph Data: Communication
[Figure: communication network of routers connecting domain1, domain2, and domain3]
Source: Stanford CS246

Web as a Graph
• Web is represented as a directed graph.
• Each web page is a node.
• There is an edge if there is a hyperlink from page p_i to page p_j.
[Figure: example pages as nodes ("I teach a class on Networks.", CS224W, Computer Science Department, Stanford University, Gates building) connected by hyperlinks]
Source: Stanford CS246


Challenges in Web Search
Two challenges of web search:
1. Who to trust?
  • Web contains many sources of information.
  • Idea: Trustworthy pages may point to each other.
2. What is the best answer to each query?
  • No single right answer.
  • Idea: Pages that know about 𝑋 might be pointing to many pages about 𝑋.

Early Search Engines
• Many search engines before Google used an inverted index (sketch below).
  • Data structure that makes it easy to find all pages containing a term.
  • Given a search query, pages with those terms are extracted and ranked.
  • Page is more relevant if a term occurs frequently.
[Figure: inverted index mapping terms such as "cat" and "dog" through buckets to documents like "...the cat is fat...", "...raining cats and dogs...", "...the dog is eating..."]
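As a rough illustration of the inverted-index idea, the sketch below builds a term-to-documents mapping over the toy documents from the figure and answers a query by intersecting posting lists. The data structures and function names are illustrative assumptions, not the actual design of any early search engine.

```python
from collections import defaultdict

docs = {
    1: "the cat is fat",
    2: "raining cats and dogs",
    3: "the dog is eating",
}

# Inverted index: term -> set of document ids containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

def search(query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    result = index[terms[0]].copy() if terms else set()
    for term in terms[1:]:
        result &= index[term]
    return result

print(search("cat"))     # {1}
print(search("the is"))  # {1, 3}
```

Ranking by term frequency, as the slide describes, would then sort the retrieved documents by how often the query terms occur in them.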

Early Search Engines
• Unethical people started to fool search engines.
  • For example, add the term "cat" thousands of times.
  • Make the term the same color as the background.
  • Or search for "cat," copy a highly ranked page, and make the copied text invisible.
[Figure: a spam page stuffed with the term "cat" entering the inverted index alongside legitimate documents]
Source: Stanford CS246

Ranking Nodes on the Graph
• Observation: Not all web pages are equally important.
  • There is large diversity in node connectivity in web graphs.
• Google introduced PageRank: Let's rank the pages by their links.
Outline
1. Web Search as a Graph
2. PageRank
3. PageRank: Implementation

Intuition 1: Links as Votes
• Idea: Consider links as "votes" for importance.
• A page is more important if it has more incoming links.
  • www.stanford.edu has 23,400 in-links.
  • www.joe-schmoe.com has 1 in-link.
• Are all in-links equal?
  • Recursive question: Links from important pages count more.
  • PageRank is the converged state of page importance.

Intuition 2: Random Surfing
• Web pages are important if people visit them a lot.
• However, we can't watch everybody using the Web.
• A good surrogate is the random surfer model (simulated in the sketch below):
  • Start at a random page and follow random out-links repeatedly.
  • Assume that people follow links randomly.
  • PageRank is the probability of being at a page at any time.

Example: PageRank Scores
[Figure: example web graph with PageRank scores A = 3.3, B = 38.4, C = 34.3, D = 3.9, E = 8.1, F = 3.9, and five peripheral pages with score 1.6 each]
Source: Stanford CS246
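To make the random-surfer intuition concrete, here is a small simulation sketch on the y/a/m example graph that appears later in these slides; the step count and seed are arbitrary choices of mine. The fraction of time the surfer spends on each page approximates its PageRank.

```python
import random

# The y/a/m example used later in the slides: node -> list of out-links.
graph = {
    "y": ["y", "a"],
    "a": ["y", "m"],
    "m": ["a"],
}

def simulate_surfer(graph, steps=100_000, seed=0):
    """Estimate visit frequencies by repeatedly following random out-links."""
    rng = random.Random(seed)
    counts = {node: 0 for node in graph}
    current = rng.choice(list(graph))
    for _ in range(steps):
        counts[current] += 1
        current = rng.choice(graph[current])  # follow a random out-link
    return {node: c / steps for node, c in counts.items()}

print(simulate_surfer(graph))  # roughly {'y': 0.4, 'a': 0.4, 'm': 0.2}
```

The same distribution falls out of the flow equations and power iteration introduced in the next slides.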


Recursive Formulation
• Each link's vote is proportional to the importance of its source page.
• If page j with importance r_j has n out-links, each link gets r_j / n votes.
• Page j's own importance is the sum of the votes on its in-links.
[Figure: page j receives r_i / 3 from page i and r_k / 4 from page k, so r_j = r_i / 3 + r_k / 4; each of j's three out-links carries r_j / 3]
Source: Stanford CS246

Matrix Formulation
• Define a transition matrix M of size n × n from the graph (construction sketched below).
  • Let page i have d_i out-links.
  • If i → j, then M_ji = 1 / d_i; otherwise M_ji = 0.
• M is a column-stochastic matrix.
  • Every entry is nonnegative, and each column sums to 1.
• Define a rank vector r of size n.
  • r_i: Importance score of page i, where Σ_i r_i = 1.
• The recursive flow equation can be written as r = Mr.
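A minimal sketch of building the column-stochastic transition matrix M from an edge list. NumPy and the variable names are my own choices; the y/a/m node numbering matches the example on the next slide.

```python
import numpy as np

def transition_matrix(edges, n):
    """Build M with M[j, i] = 1 / d_i if i -> j, and 0 otherwise."""
    M = np.zeros((n, n))
    out_degree = np.zeros(n)
    for src, _ in edges:
        out_degree[src] += 1
    for src, dst in edges:
        M[dst, src] = 1.0 / out_degree[src]
    return M

# The y/a/m example: y = 0, a = 1, m = 2.
edges = [(0, 0), (0, 1), (1, 0), (1, 2), (2, 1)]
M = transition_matrix(edges, 3)
print(M)               # matches the matrix on the next slide
print(M.sum(axis=0))   # each column sums to 1
```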

Example: Matrix Formulation
• Graph: y links to y and a; a links to y and m; m links to a.

        y    a    m
   y    ½    ½    0
   a    ½    0    1
   m    0    ½    0

• Flow equations (r = Mr):
  r_y = r_y / 2 + r_a / 2
  r_a = r_y / 2 + r_m
  r_m = r_a / 2
Source: Stanford CS246

Eigenvector Formulation
• Observation: Rank vector r is an eigenvector of M (numerical check below).
  • r = Mr matches the definition of an eigenpair (Ax = λx) with λ = 1.
  • r is the dominant (or principal) eigenvector, since every eigenvalue of a stochastic matrix has absolute value at most 1.
• Can find the dominant eigenvector with power iteration!
  • Power iteration works only for diagonalizable matrices.
  • Stochastic matrices are diagonalizable in most cases (discussed later).
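A quick numerical check of the eigenvector view on the y/a/m example above. NumPy's dense eigendecomposition is used only because the example is tiny; this is not how PageRank is computed at scale.

```python
import numpy as np

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 1.0],
              [0.0, 0.5, 0.0]])

eigvals, eigvecs = np.linalg.eig(M)
k = np.argmax(eigvals.real)      # index of the dominant eigenvalue (λ = 1)
r = eigvecs[:, k].real
r = r / r.sum()                  # rescale so the entries sum to 1
print(eigvals.real[k])           # ~1.0
print(r)                         # ~[0.4, 0.4, 0.2] for pages y, a, m
```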


Power Iteration
• Power iteration finds the dominant eigenvector as follows (sketch below):
  • Initialize r^(0) = (1/N, 1/N, ⋯, 1/N).
  • Iterate r^(t+1) = M r^(t) for t = 0, ⋯, T.
  • Stop when ||r^(t+1) − r^(t)|| is small enough.
• Can be seen as modeling the movement of random surfers.
  • Start from any stochastic vector r^(0).
  • The limit M(M(⋯ M(M r^(0)))) is the long-term distribution of the surfers.
  • If r is the limit of MM⋯Mu, then r satisfies the equation r = Mr.

Why Power Iteration Works?
• Define a sequence r^(0), r^(1), ⋯, r^(k) of rank vectors as follows:
  r^(1) = M r^(0)
  r^(2) = M r^(1) = M² r^(0)
  ⋯
  r^(k) = M^k r^(0)
• Claim: The sequence approaches the dominant eigenvector of M.
• Proof: See the next page.
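A minimal power-iteration sketch matching the pseudocode above, run on the y/a/m example; the tolerance, iteration cap, and names are my own.

```python
import numpy as np

def power_iteration(M, tol=1e-10, max_iters=1000):
    """Iterate r <- M r from the uniform vector until r stops changing."""
    n = M.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(max_iters):
        r_next = M @ r
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 1.0],
              [0.0, 0.5, 0.0]])
print(power_iteration(M))  # ~[0.4, 0.4, 0.2]
```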

Why Power Iteration Works?
• Assume that M has n linearly independent eigenvectors.
  • x_1, ⋯, x_n with corresponding eigenvalues λ_1, ⋯, λ_n, where λ_t > λ_{t+1}.
  • This is true when M is a diagonalizable matrix.
• Vectors x_1, x_2, ⋯, x_n form a basis and thus we can write
  r^(0) = c_1 x_1 + c_2 x_2 + ⋯ + c_n x_n.
• Then,
  M r^(0) = M(c_1 x_1 + c_2 x_2 + ⋯ + c_n x_n)
          = c_1 M x_1 + c_2 M x_2 + ⋯ + c_n M x_n
          = c_1 λ_1 x_1 + c_2 λ_2 x_2 + ⋯ + c_n λ_n x_n.
• If we repeat: M^k r^(0) = c_1 λ_1^k x_1 + c_2 λ_2^k x_2 + ⋯ + c_n λ_n^k x_n.

Why Power Iteration Works?
• M^k r^(0) = c_1 λ_1^k x_1 + c_2 λ_2^k x_2 + ⋯ + c_n λ_n^k x_n
            = λ_1^k (c_1 x_1 + (λ_2/λ_1)^k c_2 x_2 + ⋯ + (λ_n/λ_1)^k c_n x_n)
• Since λ_1 > λ_2 > ⋯ > λ_n, all fractions λ_2/λ_1, ⋯, λ_n/λ_1 are in (−1, +1).
• Since (λ_i/λ_1)^k → 0 as k → ∞, we conclude M^k r^(0) → c_1 λ_1^k x_1.
• May not converge if λ_1 = λ_2 (discussed later).
PageRank for Undirected Graphs
• Given an undirected graph with n nodes and m edges.
  • Nodes are pages and edges are hyperlinks.
• Claim: For any node v, r_v = d_v / 2m is a solution (numerical check after the outline below).
• Proof: Substitute r_u with d_u / 2m in the equation r = Mr.

Outline
1. Web Search as a Graph
2. PageRank
3. PageRank: Implementation
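Returning to the claim above (r_v = d_v / 2m for undirected graphs), here is a small numerical check on a made-up graph, with each undirected edge treated as two directed links; NumPy and the specific edges are my own choices.

```python
import numpy as np

# Hypothetical undirected graph with n = 4 nodes and m = 4 edges.
undirected_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n, m = 4, len(undirected_edges)

# Each undirected edge becomes two directed links.
directed = undirected_edges + [(v, u) for u, v in undirected_edges]
deg = np.zeros(n)
for u, _ in directed:
    deg[u] += 1

M = np.zeros((n, n))
for u, v in directed:
    M[v, u] = 1.0 / deg[u]

r = deg / (2 * m)             # claimed solution: r_v = d_v / 2m
print(np.allclose(M @ r, r))  # True: r satisfies r = Mr
```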

Two Problems in PageRank
• Dead ends: Some pages have no out-links.
  • Random walker has nowhere to go.
  • Such pages cause importance to leak out.
• Spider traps: All out-links are within the group.
  • Random walker gets stuck in a trap.
  • Eventually, spider traps absorb all importance.

Problem: Dead Ends
• Power iteration (demonstrated in the sketch below):
  • Set r_j = 1/N.
  • Update r_j ← Σ_{i→j} r_i / d_i iteratively.
• Example: y, a, m, where m is a dead end (its column is all zeros).

        y    a    m
   y    ½    ½    0
   a    ½    0    0
   m    0    ½    0

  r_y = r_y / 2 + r_a / 2
  r_a = r_y / 2
  r_m = r_a / 2

  r_y   1/3   2/6   3/12   5/24       0
  r_a = 1/3   1/6   2/12   3/24   ⋯   0
  r_m   1/3   1/6   1/12   2/24       0

• Here the PageRank leaks out since the matrix is not stochastic.
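A quick sketch reproducing the leak on this dead-end example: because the column for m is all zeros, the total mass Σ_i r_i shrinks toward 0 instead of staying at 1.

```python
import numpy as np

# y, a, m; m is a dead end, so its column is all zeros (M is not stochastic).
M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0]])

r = np.full(3, 1 / 3)
for _ in range(5):
    r = M @ r
    print(r, "sum =", r.sum())  # r_y follows 2/6, 3/12, 5/24, ... and the total keeps shrinking
```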


Solution: Always Teleport
• Follow random teleport links with probability 1 from dead ends.
• Adjust the transition matrix accordingly.
• Example: the all-zero column of the dead end m is replaced by 1/3 in every entry.

        y    a    m              y    a    m
   y    ½    ½    0         y    ½    ½    ⅓
   a    ½    0    0    →    a    ½    0    ⅓
   m    0    ½    0         m    0    ½    ⅓

Problem: Spider Traps
• Power iteration:
  • Set r_j = 1/N.
  • Update r_j ← Σ_{i→j} r_i / d_i iteratively.
• Example: m is a spider trap (its only out-link points to itself).

        y    a    m
   y    ½    ½    0
   a    ½    0    0
   m    0    ½    1

  r_y = r_y / 2 + r_a / 2
  r_a = r_y / 2
  r_m = r_a / 2 + r_m

  r_y   1/3   2/6   3/12    5/24       0
  r_a = 1/3   1/6   2/12    3/24   ⋯   0
  r_m   1/3   3/6   7/12   16/24       1

• All PageRank scores are trapped in node m.

Solution: Probabilistically Teleport
• Teleports: At each step, the random surfer has two options:
  • With probability β, follow a link at random.
  • With probability 1 − β, jump to some random page.
  • β is typically in the range 0.8 to 0.9.
• Surfer will teleport out of a trap within a few time steps.
[Figure: the y/a/m example graph, shown before and after adding random teleport links]

Google's Solution to Both Problems
• Google's solution:
  • Fill in the "empty" columns of M with 1/N.
  • Random surfer has two options at each step:
    • With β, follow a link at random; with 1 − β, jump to some random page.
• PageRank equation [Brin and Page, 1998] (implementation sketch below):

  r_j = Σ_{i→j} ( β r_i / d_i ) + (1 − β) / N

  • d_i is the out-degree of node i.
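A minimal sketch of the teleported PageRank update r = βMr + (1 − β)/N, applied to the y/a/m spider-trap example with β = 0.8 so it can be compared against the numbers on the next slide; names and tolerances are mine.

```python
import numpy as np

def pagerank(M, beta=0.8, tol=1e-10, max_iters=1000):
    """Iterate r <- beta * M r + (1 - beta)/N until convergence."""
    n = M.shape[0]
    r = np.full(n, 1.0 / n)
    for _ in range(max_iters):
        r_next = beta * (M @ r) + (1.0 - beta) / n
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

# y/a/m example where m is a spider trap (m links only to itself).
M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 1.0]])
print(pagerank(M))  # ~[0.212, 0.152, 0.636] = (7/33, 5/33, 21/33)
```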


The Google Matrix
• We have the Google matrix A:

  A = βM + (1 − β) [1/N]_{N×N}

  where [1/N]_{N×N} is the N × N matrix with all entries equal to 1/N.
• M is preprocessed to be column-stochastic.
• In practice, β = 0.8 or 0.9 (surfer jumps every 5 to 10 steps).
• Note: A is stochastic, diagonalizable, and satisfies λ_1 > λ_2.
  • λ_1 and λ_2 are the two largest eigenvalues.

Random Teleports (β = 0.8)
• On the y/a/m spider-trap example:

              M                  [1/N]_{N×N}
        1/2  1/2   0            1/3  1/3  1/3
  0.8 × 1/2   0    0   + 0.2 ×  1/3  1/3  1/3
         0   1/2   1            1/3  1/3  1/3

                  A
    y   7/15   7/15    1/15
    a   7/15   1/15    1/15
    m   1/15   7/15   13/15

[Figure: the y/a/m graph with each transition of A labeled by its probability (7/15, 1/15, 13/15, ...)]

• Power iteration converges to r = (7/33, 5/33, 21/33):

    y     1/3   0.33   0.28   0.26        7/33
    a  =  1/3   0.20   0.20   0.18   ⋯    5/33
    m     1/3   0.46   0.52   0.56       21/33
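A sketch constructing the Google matrix for this example and reproducing the numbers above; NumPy and the variable names are my own.

```python
import numpy as np

beta = 0.8
M = np.array([[0.5, 0.5, 0.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.5, 1.0]])
N = M.shape[0]

# Google matrix: follow a link with probability beta, teleport with 1 - beta.
A = beta * M + (1 - beta) * np.ones((N, N)) / N
print(A * 15)  # entries (in fifteenths): [[7, 7, 1], [7, 1, 1], [1, 7, 13]]

r = np.full(N, 1.0 / N)
for _ in range(100):
    r = A @ r
print(r * 33)  # ~[7, 5, 21], i.e., r = (7/33, 5/33, 21/33)
```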

Computation of PageRank
• For computation, it is inefficient to explicitly create A from M.
  • A is a dense matrix, while M is a sparse matrix.
  • Creating a dense matrix for the entire web is almost impossible.
• Rearrange the PageRank equation into

  r = βMr + [(1 − β)/N]_N

  where the second term is a vector whose N entries all equal (1 − β)/N.
• Core operation is the sparse matrix-vector multiplication.

Sparse Matrix Encoding
• Encode the sparse matrix βM using only the nonzero entries.
• Space is roughly proportional to the number of links.
• Still won't fit in memory for a large graph, but will fit on disk.
[Figure: per-source encoding listing, for each source node, its out-degree and its destination nodes]
Source: Stanford CS246


Basic Algorithm
• Assume that we have enough memory to store r^new.
  • Store the previous rank vector r^old and the matrix M on disk.
• For each source page, update r^new of all destination pages.

  source   degree   destinations
     0        3      1, 5, 6
     1        4      17, 64, 113, 117
     2        2      13, 23

[Figure: entries of r^old (indexed 0–6) are read from disk and pushed into the corresponding entries of r^new]

Basic Algorithm (for r = βMr + (1 − β)/N)
• Each step of the power iteration is (runnable sketch below):

  Initialize all entries of r^new to (1 − β)/N
  For each page i (with out-degree d_i):
      Read into memory: i, d_i, dest_1, ⋯, dest_{d_i}, r^old(i)
      For j = 1, ⋯, d_i:
          r^new(dest_j) += β r^old(i) / d_i

• Only one full disk access is required for each iteration.
• Still slow? MapReduce was originally designed for PageRank.
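A minimal in-memory sketch of this update loop. The per-source adjacency data mirrors the encoding above, but the node ids and links are hypothetical so the example stays self-contained, and the disk reads and MapReduce variant are left out.

```python
import numpy as np

beta, N = 0.8, 7

# Per-source encoding: source node -> list of destination nodes.
# (The out-degree d_i is just the length of each list.)
links = {0: [1, 5, 6], 1: [2, 3, 4], 2: [0, 6],
         3: [0], 4: [2], 5: [0], 6: [1, 2]}

r_old = np.full(N, 1.0 / N)
for _ in range(50):
    r_new = np.full(N, (1.0 - beta) / N)      # initialize with the teleport term
    for i, dests in links.items():
        share = beta * r_old[i] / len(dests)  # beta * r_old(i) / d_i
        for j in dests:
            r_new[j] += share
    r_old = r_new

print(r_old, r_old.sum())  # PageRank scores; total stays 1 (no dead ends here)
```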

Summary
1. Web Search as a Graph
2. PageRank
• Recursive formulation
• Power iteration
3. PageRank: Implementation
• Spider traps
• Dead ends
• Random teleports
• Sparse matrix computation

