ABSTRACT
The good news about the Internet and its most visible component, the World Wide Web, is that there are
hundreds of millions of pages available, waiting to present information on an amazing variety of topics. The
bad news about the Internet is that there are hundreds of millions of pages available, most of them titled
according to the whim of their author, almost all of them sitting on servers with cryptic names. When you need
to know about a particular subject, how do you know which pages to read? If you're like most people, you visit
an Internet search engine.
Internet search engines are special sites on the Web that are designed to help people find information stored
on other sites. There are differences in the ways various search engines work, but they all perform three basic
tasks:
They search the Internet -- or select pieces of the Internet -- based on important words.
They keep an index of the words they find, and where they find them.
They allow users to look for words or combinations of words found in that index.
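As a rough illustration of the second and third tasks, the following minimal sketch builds a word index over a few pages and looks up words or combinations of words in it. The sample pages, their URLs, and the names build_index and lookup are assumptions made for this example; a real search engine's index is far more elaborate.

```python
from collections import defaultdict

# Hypothetical sample pages standing in for documents a search engine has found.
PAGES = {
    "http://example.edu/a": "search engines index the web",
    "http://example.edu/b": "early engines held a small index",
    "http://example.org/c": "the web has millions of pages",
}

def build_index(pages):
    """Map each word to the set of page URLs where it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def lookup(index, query):
    """Return the sorted URLs of pages containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # Start with the pages for the first word, then keep only pages
    # that also contain each remaining word.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)

index = build_index(PAGES)
print(lookup(index, "index"))      # pages containing "index"
print(lookup(index, "web index"))  # pages containing both words
```

Storing a set of pages per word keeps a multi-word lookup down to a simple set intersection, which is one plausible way to support "combinations of words".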
Early search engines held an index of a few hundred thousand pages and documents, and received
maybe one or two thousand inquiries each day. Today, a top search engine will index hundreds of
millions of pages, and respond to tens of millions of queries per day. In this article, we'll tell you how
these major tasks are performed, and how Internet search engines put the pieces together in order to
let you find the information you need on the Web.
HISTORY OF SEARCH ENGINES
While Ted Nelson was against complex markup code, broken links, and many other problems associated with
traditional HTML on the WWW, much of the inspiration to create the WWW was drawn from his work.
There is still disagreement about the exact reasons why his Project Xanadu failed to take off.
The first few hundred web sites appeared in 1993, most of them at colleges, but Archie came long before
most of them existed. Archie, the first search engine, was created in 1990 by Alan Emtage, a student at
McGill University in Montreal. The name was originally meant to be "archives," but it was shortened to
Archie.
Archie helped solve the problem of files scattered across many servers by combining a script-based data
gatherer with a regular expression matcher for retrieving file names matching a user query. Essentially,
Archie became a database of file names on public FTP sites, which it would match against users' queries.
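The matching half of that design can be sketched in a few lines. This is only an illustration, not Archie's actual code: the host names, the file paths, and the function match_files are hypothetical, and the list LISTING stands in for whatever the data gatherer would have collected.

```python
import re

# Hypothetical (host, path) pairs standing in for the gathered file listing.
LISTING = [
    ("ftp.example.edu", "/pub/gnu/emacs-18.59.tar.Z"),
    ("ftp.example.org", "/pub/tex/latex.tar.Z"),
    ("archive.example.net", "/mirrors/gnu/gcc-1.40.tar.Z"),
]

def match_files(listing, pattern):
    """Return the entries whose file name (last path component) matches the regex."""
    regex = re.compile(pattern)
    return [
        (host, path)
        for host, path in listing
        if regex.search(path.rsplit("/", 1)[-1])
    ]

# A user query expressed as a regular expression over file names.
for host, path in match_files(LISTING, r"^emacs.*\.tar"):
    print(host, path)
```

Matching only the final path component mirrors the idea of a database of file names; directory names such as "gnu" are deliberately ignored in this sketch.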