Database & Search Engine


DATABASE & SEARCH ENGINE

PRESENTED BY:
S.BHAVITHRA
Database
What is a database?
• A database is an organized collection of related
records that is stored digitally.
• It is arranged in a structured order for ease and
speed of search.
• An example is the Library Literature Database on the New York Public
Library website, which “indexes periodicals and books, reports, pamphlets,
and library school theses on all aspects of library and information
science” from 1984 to the present.
What Is a Search Engine?
• The term “search engine” usually refers to a web search engine, which searches
for information on the web.

• Search engines are huge databases of web page files that have been assembled
automatically by machines.

• By performing a search using a search engine, you're asking the engine to scan its
index of sites and match your keywords and phrases with those in the text of
documents within the engine's database.
• A search engine is a document retrieval system designed to help find
information stored on a computer system, such as the World Wide Web.

• A search engine lets you ask for content that meets specific criteria, typically
items containing a given word or phrase, and retrieves a list of the items that
match those criteria. This list is often sorted by some measure of the relevance
of the results.

• When you are using a search engine, you are NOT searching the entire web as it
exists at this moment. You are actually searching a portion of the web, captured
in a fixed index created at an earlier date.
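
The idea of matching keywords against a prebuilt index, rather than scanning the live web, can be sketched in a few lines. This is only an illustration: the three "pages", the `build_index` and `search` names, and the rule that every query word must appear are assumptions made for the example, not a description of any real engine.

```python
from collections import defaultdict

# Hypothetical mini-collection standing in for pages captured earlier.
PAGES = {
    "page1": "libraries index periodicals and books",
    "page2": "search engines index web pages",
    "page3": "web crawlers follow links between pages",
}

def build_index(pages):
    """Map each word to the set of page ids whose text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the sorted ids of pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)

# The index is built once; later searches only consult the index.
INDEX = build_index(PAGES)
print(search(INDEX, "index pages"))  # ['page2']
```

A page added to `PAGES` after `INDEX` was built would not be found until the index is rebuilt, which mirrors the point that a search covers a snapshot of the web, not the web as it exists at this moment.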
What Is a Search Engine?

• A client/server application
• A document retrieval system
• Uses regularly updated indexes to operate quickly and efficiently
• Designed to help find information stored:
  • On a computer system, such as the World Wide Web
  • Inside a corporate or proprietary network
  • On a personal computer
• Different selection and relevance criteria can apply in different
  environments, or for different uses
• Allows one to ask for content meeting specific criteria
  • Typically those containing a given word or phrase
• Retrieves a list of items that match those criteria
Search Engines Consist of Four Discrete Software Components

• Spider/Crawler: a software program that gathers information and puts it
into the search engine’s database. It visits web pages, often starting at the
main page of a site, reads them, and then follows the links to other pages.

• The database or index: the web pages are systematically stored and
updated here.

• Search engine result engine: the software that sifts through the pages
stored in the index to find matches to a search and ranks them in order of
what it believes is most relevant.

• The interface: what we use to query the database. It usually consists of a
search box in which you type your query and a button to launch the search.
Sometimes there are menus for choosing search functions that refine the query.
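
The spider's behaviour (start at a main page, read it, follow its links) can be sketched as a breadth-first traversal. The in-memory `WEB` dictionary, its URLs, and the `crawl` function are hypothetical stand-ins; a real crawler would fetch pages over the network and handle many cases this sketch ignores.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
WEB = {
    "site/home": ("welcome to the library", ["site/books", "site/about"]),
    "site/books": ("books and periodicals", ["site/home"]),
    "site/about": ("about the library staff", []),
    "site/orphan": ("no page links here", []),
}

def crawl(start):
    """Visit pages breadth-first from `start`, following links.

    Returns {url: text} for every page reached, which plays the role
    of the search engine's database in this sketch.
    """
    seen = {start}
    queue = deque([start])
    stored = {}
    while queue:
        url = queue.popleft()
        text, links = WEB[url]
        stored[url] = text
        for link in links:
            # Skip dead links and pages already scheduled for a visit.
            if link in WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return stored

DATABASE = crawl("site/home")
print(sorted(DATABASE))  # ['site/about', 'site/books', 'site/home']
```

Note that `site/orphan` never reaches the database because no crawled page links to it: a page the spider cannot reach by following links stays out of the index.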
• The spider retrieves pages from the World Wide Web.

• The data retrieved by the spider is systematically indexed and stored in
the search engine’s database.

• When a user types in a search query, the search engine result engine looks
up the index and provides a listing of best-matching web pages according to
its criteria, usually with a short summary containing the document's title
and sometimes parts of the text. Most search engines support the use of the
Boolean terms AND, OR and NOT to further specify the search query.
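
One common way to think about AND, OR and NOT is as set operations over the lists of pages stored for each word. The sketch below assumes a tiny hand-written index (`INDEX`, page ids 1 to 5, and the helper `pages` are all made up for illustration).

```python
# Hypothetical inverted index: word -> set of page ids containing it.
INDEX = {
    "library": {1, 2, 4},
    "science": {2, 3, 4},
    "fiction": {4, 5},
}

def pages(word):
    """Return the set of page ids for `word` (empty if the word is unknown)."""
    return INDEX.get(word, set())

both = pages("library") & pages("science")     # library AND science
either = pages("library") | pages("fiction")   # library OR fiction
without = pages("library") - pages("fiction")  # library NOT fiction

print(sorted(both))     # [2, 4]
print(sorted(either))   # [1, 2, 4, 5]
print(sorted(without))  # [1, 2]
```

AND narrows the result, OR widens it, and NOT removes pages that mention the unwanted term.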
Let's see how Google processes a query

1. The web server sends the query to the index servers. The content inside
the index servers is similar to the index in the back of a book: it tells
which pages contain the words that match any particular query term.
2. The query travels to the doc servers, which actually retrieve the stored
documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
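
The two-stage lookup described above (index first, stored documents second, then a snippet per result) can be mimicked with two dictionaries. Everything here is an assumption for illustration: `INDEX_SERVER`, `DOC_SERVER`, the sample texts, and the fixed-width snippet rule say nothing about how Google actually implements these steps.

```python
# Hypothetical stand-ins for the two server roles.
INDEX_SERVER = {"crawler": [2], "index": [1, 2]}  # word -> document ids
DOC_SERVER = {
    1: "An index in the back of a book lists the pages for each term.",
    2: "A crawler gathers pages so the engine can index them later.",
}

def snippet(text, term, radius=15):
    """Return a short excerpt of `text` around the first match of `term`."""
    pos = text.lower().find(term.lower())
    if pos < 0:
        return text[: 2 * radius]
    start = max(0, pos - radius)
    end = min(len(text), pos + len(term) + radius)
    return text[start:end]

def answer(term):
    """Step 1: ask the index which documents match.
    Step 2: fetch each stored document and build its snippet.
    Step 3: return the list of (document id, snippet) results."""
    doc_ids = INDEX_SERVER.get(term.lower(), [])
    return [(doc_id, snippet(DOC_SERVER[doc_id], term)) for doc_id in doc_ids]

for doc_id, text in answer("index"):
    print(doc_id, "...", text, "...")
```

The index only stores which documents contain a word; the document text is touched solely for the few results that are actually returned.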
Types of Search Engines
• Crawler-based search engines – These build their indexes automatically
using spiders. E.g. Google, AltaVista.

• Directory – These are created and maintained by human editors. The editors
review and select sites for inclusion in their directories on the basis of
previously determined selection criteria. Their databases are organised by
category or subject to permit browsing but are in general much smaller than
those of crawler-based engines. E.g. Yahoo, LookSmart.

• Regional – Regional search engines focus on one particular language or
region. E.g. Google.co.in, khoj.com.

• Metasearcher – Metasearchers use a uniform platform to search using several
engines simultaneously. E.g. Kartoo.com, ProFusion, Vivisimo.
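
A metasearcher has to combine the ranked lists it gets back from several engines. The sketch below uses two hypothetical engines (`engine_a`, `engine_b`) with made-up result URLs, and merges them by summing reciprocal ranks; that scoring rule is an assumption chosen for simplicity, not the method of any metasearcher named above. The engines are also called one after another here rather than simultaneously.

```python
def engine_a(query):
    """Hypothetical engine: returns URLs ranked best-first."""
    return ["a.example/1", "b.example/2", "c.example/3"]

def engine_b(query):
    """Second hypothetical engine with a different ranking."""
    return ["b.example/2", "d.example/4"]

def metasearch(query, engines):
    """Send `query` to every engine and merge the ranked lists.

    Each URL scores 1/rank per engine that returned it, so a URL that
    several engines rank highly rises to the top. Ties break by URL.
    """
    scores = {}
    for engine in engines:
        for rank, url in enumerate(engine(query), start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    return sorted(scores, key=lambda url: (-scores[url], url))

print(metasearch("library", [engine_a, engine_b]))
# ['b.example/2', 'a.example/1', 'd.example/4', 'c.example/3']
```

Duplicates are merged automatically because scores are keyed by URL, so each page appears once in the combined list.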
Invisible Web
• Search engines do not necessarily reach all parts of the
Web or necessarily index all pages at a site.

• The Invisible Web, as it is called, consists largely of databases not easily
indexed by the search engines, pages deep in a web site that don't get
crawled, file formats that the search engines ignore, and services for
subscribers only (often for a fee).

• No one has a reliable estimate of its size, but some have guessed at
500 billion pages.
SEARCH ENGINE APPLICATIONS
Search engines allow field searches, such as search in title, date last
updated, and search in the URL.

A search engine searches a huge database of web pages.

Results are ordered by how often the specified keywords occur, highest
first. They can also be reordered by date of posting.
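
Ordering by keyword occurrence and then reordering by date can be sketched with two sort calls. The three documents, their dates, and the function names are invented for this example, and counting raw word occurrences is a deliberately simple stand-in for whatever relevance measure a real engine uses.

```python
from datetime import date

# Hypothetical documents with a posting date each.
DOCS = [
    {"title": "A", "text": "search engine search tips", "posted": date(2005, 3, 1)},
    {"title": "B", "text": "database search basics", "posted": date(2007, 6, 15)},
    {"title": "C", "text": "library catalogue", "posted": date(2006, 1, 10)},
]

def count(doc, keyword):
    """Number of times `keyword` occurs as a whole word in the document text."""
    return doc["text"].lower().split().count(keyword.lower())

def rank_by_occurrence(docs, keyword):
    """Keep documents that contain the keyword, most occurrences first."""
    hits = [d for d in docs if count(d, keyword) > 0]
    return sorted(hits, key=lambda d: count(d, keyword), reverse=True)

def reorder_by_date(hits):
    """Reorder results so the most recently posted document comes first."""
    return sorted(hits, key=lambda d: d["posted"], reverse=True)

hits = rank_by_occurrence(DOCS, "search")
print([d["title"] for d in hits])                   # ['A', 'B']
print([d["title"] for d in reorder_by_date(hits)])  # ['B', 'A']
```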
THE METHOD TO CONDUCT SEARCH
When you conduct a search for a specific title or author what type of search are
you conducting?
Field searching allows the researcher to select a specific portion of the electronic
record to search, be that title, author, publication year, etc. If someone were looking
for articles by John Updike, the searcher could simply type “Updike, John” into the
author field to search for all articles contained in the database written by John
Updike.

What are basic search techniques?

The first basic principle of conducting a search is to choose appropriate keywords,
using a thesaurus if necessary. In choosing keywords, the researcher should
consider variant word forms, differing spellings, and related words.
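
Considering variant forms and spellings amounts to searching for a small set of terms instead of a single one. In the sketch below, the `VARIANTS` table is a hand-written assumption standing in for a thesaurus; the words in it are examples only.

```python
# Hypothetical variant table standing in for a thesaurus lookup.
VARIANTS = {
    "color": {"colour", "colors", "colours"},
    "library": {"libraries"},
}

def expand(keyword):
    """Return the keyword plus its known variants, sorted alphabetically."""
    key = keyword.lower()
    return sorted({key} | VARIANTS.get(key, set()))

def matches(text, keyword):
    """True if the text contains the keyword or any of its variants."""
    words = set(text.lower().split())
    return any(term in words for term in expand(keyword))

print(expand("Color"))  # ['color', 'colors', 'colour', 'colours']
print(matches("Public libraries in the city", "library"))  # True
```

Without the expansion step, a search for "library" would miss a record that only uses the plural form.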
List some advanced search techniques.
In order to conduct a more specific search, field searching is recommended.
This means searching particular fields such as Author, Title, Year of
Publication, Language, etc. for precise keywords. Thus a researcher could
input “1999” in the year of publication field to find documents published in
that year, “French” in the language field to find documents written in
French, or “small” in the title field to find books with the word small in
the title.
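
Field searching maps naturally onto a database table where each field is a column. The sketch below uses Python's built-in `sqlite3` module with an in-memory table; the table name, the `field_search` helper, and the three records (including their titles) are made-up placeholders, not entries from any real catalogue.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (author TEXT, title TEXT, year INTEGER, language TEXT)"
)
# Placeholder records invented for this example.
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?, ?)",
    [
        ("Updike, John", "A Small Sample", 1999, "English"),
        ("Updike, John", "Collected Notes", 1984, "English"),
        ("Dupont, Marie", "Un petit exemple", 1999, "French"),
    ],
)

ALLOWED_FIELDS = {"author", "title", "year", "language"}

def field_search(field, value):
    """Return titles of records whose `field` matches `value`.

    The title field matches any title containing the word; the other
    fields require an exact match. Field names are checked against a
    fixed list because they are placed directly into the SQL text.
    """
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field}")
    if field == "title":
        sql = "SELECT title FROM records WHERE title LIKE ? ORDER BY title"
        params = (f"%{value}%",)
    else:
        sql = f"SELECT title FROM records WHERE {field} = ? ORDER BY title"
        params = (value,)
    return [row[0] for row in conn.execute(sql, params)]

print(field_search("author", "Updike, John"))
print(field_search("year", 1999))
print(field_search("language", "French"))
print(field_search("title", "small"))
```

Each call restricts the search to one column, so “small” in the title field cannot accidentally match an author or a language.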
In addition to the basic search techniques, some interfaces offer a proximity
operator, such as “with,” “adjacent,” or “near,” which can be used to further
narrow or widen a search.
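
A proximity operator checks how far apart two words are in the text. The exact meaning of “near” and “adjacent” differs between interfaces, so the definitions below (within `max_gap` word positions, and within one position, in either order) are assumptions for illustration.

```python
def near(text, first, second, max_gap=3):
    """True if `first` and `second` occur within `max_gap` words of each other."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    pos_a = [i for i, w in enumerate(words) if w == first.lower()]
    pos_b = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(a - b) <= max_gap for a in pos_a for b in pos_b)

def adjacent(text, first, second):
    """True if the two words stand directly next to each other, in either order."""
    return near(text, first, second, max_gap=1)

TEXT = "The library science program covers information retrieval."
print(near(TEXT, "library", "program"))            # True
print(near(TEXT, "library", "retrieval"))          # False
print(adjacent(TEXT, "information", "retrieval"))  # True
```

Raising `max_gap` widens the search and lowering it narrows the search.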
THANK YOU
