Algorithms For Web Indexing and Searching: Gerth Stølting Brodal and Rolf Fagerberg Fall 2002

The document outlines a course on algorithms for web indexing and searching. It discusses how search engines like Google work and the key components of search engines, including web crawlers to retrieve pages, indexes to store page content and metadata, different types of queries, and methods for ranking search results. The course will cover topics like clustering, categorization, evaluating search engines, models of the web graph, and data mining. Students will complete a programming project to implement a basic web search engine by developing components for crawling, indexing, ranking results, and a query interface.
Algorithms for

Web Indexing and Searching

Gerth Stølting Brodal and Rolf Fagerberg


Fall 2002

Course Motivation
How does Google work?

How do search engines work?

Algorithms for web indexing and searching

Course Outline
1. Introduction to Course
2. General Anatomy of Web Search Engines
3. Building blocks of Search Engines
(a) Web Crawlers
Anatomy of crawlers
Crawling strategy
(b) Index
Inverted files
Suffix trees
Signature files
Compression
Issues of efficient construction
Duplicate removal

(c) Types of Queries
(d) Ranking
Text-based methods
Vector-based methods
Latent semantic indexing
Link-based methods
PageRank
HITS
SALSA
Others

4. Further topics
(a) Clustering
(b) Automatic Categorization/Hierarchy Building
(c) Evaluation of search engines
(d) Structure of and Models for the Web Graph
(e) Data Mining

Formal Course Description

Prerequisites: dADS
Literature: Handouts
Course language: Danish or English
Credits: 2 points/10 ECTS
Evaluation: Programming project
Course page: http://www.daimi.au.dk/~gerth/webalg02/index.html

Programming Project
Implement a Web Search Engine

Distributed project
Groups (2-4 persons) doing:
Web crawling
Index building
Ranking
Query interface
Start: index Aarhus University website
Goal: index domain .dk
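
To make the index building, ranking, and query interface components more concrete, here is a minimal sketch in Python. It is not the course's reference design: the function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, the AND-only query semantics, the term-frequency ranking, and the example URLs are all assumptions chosen for illustration. A real project would take its pages from the crawler and would likely use one of the ranking methods from the outline.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def build_index(pages):
    """Build an inverted file: term -> {url: term frequency in that page}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][url] = count
    return index


def search(index, query):
    """Return URLs containing every query term, highest summed term frequency first."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Conjunctive (AND) query: intersect the URL sets of all terms.
    matches = set(postings[0])
    for posting in postings[1:]:
        matches &= set(posting)
    # Rank by summed term frequency; break ties by URL for a stable order.
    return sorted(matches, key=lambda url: (-sum(p[url] for p in postings), url))


# Hypothetical pages standing in for crawler output.
pages = {
    "http://example.org/a": "web search engines index the web",
    "http://example.org/b": "a crawler retrieves web pages",
    "http://example.org/c": "ranking orders search results",
}
index = build_index(pages)
results = search(index, "web search")
```

Keeping the term frequency in each posting lets the same structure serve both lookup and a simple text-based ranking. A link-based score could later replace the sort key without changing how the index is built.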
