Web Search Engine Building
Submitted By:
1. Introduction to the Project
The project is a web browser with a search engine implemented in Java. We had several
pedagogical goals in designing this project.
The project consists of four separate parts that incrementally build a web browser with a
search engine. Each part adds to or improves on functionality implemented by the previous part
by applying new algorithms and using data structures that are presented in class. The final part,
which completes the implementation, uses all of the advanced data structures (binary trees,
priority queues, hash tables, and graphs).
2. Project Parts
The individual parts include:
Part 1: Parsing a web page and building a word frequency tree of its contents (uses binary
search trees)
Part 2: The first implementation of the search engine: processing user queries to find the
most relevant web pages (uses priority queues and binary search trees)
Part 3: Adding caching to the search engine and implementing the GUI front-end to the
web browser
Part 4: Adding a hyperlink graph to the web browser that displays shortest-path and
reachability information between web page nodes, and incorporating "link-to" information
into the ordering of search results presented by the search engine (uses graphs, hash tables,
priority queues, and binary search trees)
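As an illustration of the word frequency tree in Part 1, the following is a minimal sketch, not the project's actual code. It assumes the page text has already been stripped of HTML tags, and the class and method names (WordFrequencyTree, addText, count, wordsInOrder) are hypothetical. The tree is a plain, unbalanced binary search tree keyed by word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical class: an unbalanced binary search tree that maps each word to its count.
class WordFrequencyTree {
    private static final class Node {
        final String word;
        int count = 1;
        Node left, right;
        Node(String word) { this.word = word; }
    }

    private Node root;

    // Assumption: input is plain text; words are runs of the letters a-z after lower-casing.
    void addText(String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                add(token);
            }
        }
    }

    // Inserts the word, or increments its count if it is already in the tree.
    void add(String word) {
        if (root == null) {
            root = new Node(word);
            return;
        }
        Node cur = root;
        while (true) {
            int cmp = word.compareTo(cur.word);
            if (cmp == 0) {
                cur.count++;
                return;
            }
            if (cmp < 0) {
                if (cur.left == null) { cur.left = new Node(word); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(word); return; }
                cur = cur.right;
            }
        }
    }

    // Returns how often the word was added, or 0 if it was never seen.
    int count(String word) {
        Node cur = root;
        while (cur != null) {
            int cmp = word.compareTo(cur.word);
            if (cmp == 0) {
                return cur.count;
            }
            cur = cmp < 0 ? cur.left : cur.right;
        }
        return 0;
    }

    // An in-order traversal visits the words in alphabetical order.
    List<String> wordsInOrder() {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node n, List<String> out) {
        if (n == null) {
            return;
        }
        collect(n.left, out);
        out.add(n.word);
        collect(n.right, out);
    }
}
```

Because the tree is not balanced, lookups take time proportional to its height, which can degrade to linear in the number of distinct words if they arrive in sorted order.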
For the GUI and functionality of the browser:
1-Adding Home, Back, and Forward buttons to the web browser (adds stacks and/or queues)
2-Enabling links in displayed web pages and in the list of URL results displayed by the search
engine.
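One common way to back the Home, Back, and Forward buttons with stacks is sketched below. This is an assumption about how such buttons could work, not a description of the project's code. The class name BrowserHistory and its methods are hypothetical, and URLs are treated as plain strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class: tracks navigation with one stack for Back and one for Forward.
class BrowserHistory {
    private final Deque<String> backStack = new ArrayDeque<>();
    private final Deque<String> forwardStack = new ArrayDeque<>();
    private final String homeUrl;
    private String current;

    BrowserHistory(String homeUrl) {
        this.homeUrl = homeUrl;
        this.current = homeUrl;
    }

    // Visiting a new page saves the current one for Back and discards the Forward history.
    void visit(String url) {
        backStack.push(current);
        current = url;
        forwardStack.clear();
    }

    // Moves one page back if possible; otherwise stays on the current page.
    String back() {
        if (!backStack.isEmpty()) {
            forwardStack.push(current);
            current = backStack.pop();
        }
        return current;
    }

    // Moves one page forward if possible; otherwise stays on the current page.
    String forward() {
        if (!forwardStack.isEmpty()) {
            backStack.push(current);
            current = forwardStack.pop();
        }
        return current;
    }

    // Assumption: Home behaves like an ordinary visit to the home URL.
    String home() {
        visit(homeUrl);
        return current;
    }

    String current() {
        return current;
    }
}
```

In this sketch a GUI button handler would call back(), forward(), or home() and then load the returned URL into the display pane.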
The final version of the web browser has the following capabilities:
1-It displays a web page given a URL.
2-Its search engine finds good matching web pages given a query string and displays the
resulting URLs in order of best match to worst (the criteria used to determine a "good match"
change as more features are added to the search engine).
3-It automatically displays the webpage of the best matching URL result of a search.
4-Its search engine caches previous search results that are used to improve the
performance of subsequent searches.
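The ranking and caching capabilities listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the class CachingSearchEngine and its methods are hypothetical, each page is represented by a precomputed map of word counts, and a page's score is assumed to be the sum of the counts of the query words. A priority queue orders the matches and a hash table caches the ranked result for each query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical class: ranks pages with a priority queue and caches results per query.
class CachingSearchEngine {
    private static final class Hit {
        final String url;
        final int score;
        Hit(String url, int score) { this.url = url; this.score = score; }
    }

    private final Map<String, Map<String, Integer>> index = new HashMap<>();
    private final Map<String, List<String>> cache = new HashMap<>();
    private int cacheHits;

    // Adding a page invalidates the cache, since cached rankings may now be stale.
    void addPage(String url, Map<String, Integer> wordCounts) {
        index.put(url, wordCounts);
        cache.clear();
    }

    // Returns matching URLs ordered from best match to worst.
    List<String> search(String query) {
        String key = query.trim().toLowerCase(Locale.ROOT);
        List<String> cached = cache.get(key);
        if (cached != null) {
            cacheHits++;
            return cached;
        }
        // Highest score first; ties are broken alphabetically by URL.
        PriorityQueue<Hit> queue = new PriorityQueue<>(11, new Comparator<Hit>() {
            @Override
            public int compare(Hit x, Hit y) {
                int byScore = Integer.compare(y.score, x.score);
                return byScore != 0 ? byScore : x.url.compareTo(y.url);
            }
        });
        for (Map.Entry<String, Map<String, Integer>> page : index.entrySet()) {
            int score = 0;
            for (String word : key.split("\\s+")) {
                Integer count = page.getValue().get(word);
                if (count != null) {
                    score += count;
                }
            }
            if (score > 0) {
                queue.add(new Hit(page.getKey(), score));
            }
        }
        List<String> ranked = new ArrayList<>();
        while (!queue.isEmpty()) {
            ranked.add(queue.poll().url);
        }
        List<String> result = Collections.unmodifiableList(ranked);
        cache.put(key, result);
        return result;
    }

    int cacheHits() {
        return cacheHits;
    }
}
```

Queries are normalized (trimmed and lower-cased) before the cache lookup, so "Java Tree " and "java tree" share one cache entry in this sketch.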
3. Software Used
NetBeans IDE 7.0
4. Snapshots
Searching a query