23/2/2013
WEB SEARCHING
By: Dr. Noor Dayana Abd Halim
Introduction
What do you know about the web in
general and web searching in
particular?
Web
World Wide Web (or WWW). It is called a web because
the interconnections between documents resemble a
spider's web.
A software application that makes it easy and possible
for nearly anyone to publish and browse hypertext
documents on the Internet.
Web Searching
The act of looking for information in a computer database or
network (the web).
Web Searching
PROBLEMS THAT YOU MIGHT FACE
A lot of information is available online, but not all of
it is accurate.
Web-page addresses are constantly changing, so a page
may be available only for a short time.
You may not know how to locate the correct information
and have trouble finding what you want.
You need a mechanism to help you find information on
the web.
Categories of Web Searching
There are four categories:
Directories
Search engines
Meta-search engines
Yellow pages
Web Searching : Directory
1. Directory
Definition: A Web Directory or Web Guide is a
hierarchical representation of hyperlinks
A web directory is not a search engine and does not
display lists of web pages based on keywords; instead, it
lists web sites by category and subcategory.
The top level is typically a wide range of very general
topics.
Each topic contains hyperlinks of more specialized sub-
topics
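A hierarchy like this can be pictured as nested categories. The following is only a minimal sketch: the topic names, the example.com URLs and the helper functions `browse` and `list_topics` are all made up for illustration and do not describe any real directory.

```python
# Hypothetical directory: each key is a topic; each value is either a
# dict of sub-topics or a list of site URLs (all names are invented).
directory = {
    "Recreation": {
        "Cooking": ["http://example.com/recipes"],
        "Travel": {
            "Guides": ["http://example.com/travel-guides"],
        },
    },
    "Science": {
        "Biology": ["http://example.com/biology"],
    },
}


def browse(tree, path):
    """Follow a list of topic names from the top level downwards."""
    node = tree
    for topic in path:
        node = node[topic]
    return node


def list_topics(tree):
    """Return the topic names available at the current level."""
    return sorted(tree.keys())
```

Browsing starts from the general topics (`list_topics(directory)`) and narrows down one level at a time, for example `browse(directory, ["Recreation", "Cooking"])`.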
Web Searching : Directory
Examples
http://www.usa.gov/directory/federal/index.shtml
http://www.irs.gov/News-&-Events
Web Searching : Search Engine
2. Search engine
A computer program that does the following:
Allows the user to submit a query that consists of a word or
phrase.
Searches the database.
Returns a list of suitable URLs that match the query.
Allows the user to revise and resubmit the query.
Web Searching : Search Engine
Example:
www.google.com
www.yahoo.com
www.lycos.com
Web Searching : Metasearch Engine
3. Metasearch engine
A metasearch (or all-in-one) search engine performs a
search by passing the query to more than one other
search engine.
Duplicate retrievals are eliminated.
The results are ranked according to how well they match
the query.
Advantage: A single query can access a lot of search
engines.
Disadvantage: A lot of the matches will not be suitable for
you.
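The merge step of a metasearch engine can be sketched as follows. This is an illustration under stated assumptions, not how any particular metasearch engine works: each engine is modelled as a function returning (url, score) pairs, the two engines `engine_a` and `engine_b` return canned, invented results, and the rule "keep the best score seen for each URL" is an arbitrary choice.

```python
def metasearch(query, engines):
    """Send one query to several engines, drop duplicates, rank the rest.

    `engines` is a list of functions; each takes a query string and
    returns a list of (url, score) pairs. This interface is an
    assumption made for the example.
    """
    best = {}
    for engine in engines:
        for url, score in engine(query):
            # A URL returned by more than one engine is kept only once,
            # with the highest score any engine gave it.
            if url not in best or score > best[url]:
                best[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


# Two stand-in engines with canned results (hypothetical data).
def engine_a(query):
    return [("http://example.com/a", 80), ("http://example.com/b", 60)]


def engine_b(query):
    return [("http://example.com/b", 90), ("http://example.com/c", 40)]
```

Calling `metasearch("any query", [engine_a, engine_b])` returns each URL once, ordered from the best match to the worst.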
Web Searching : Metasearch Engine
Example :
Metasearch www.metasearch.com
Metacrawler www.metacrawler.com
Meta search engine www.metasearchengine.com
Web Searching : Yellow pages
4. Yellow pages
White pages allow users to look up information about
individuals.
We can use white pages to track down telephone
numbers and email addresses.
People can abuse white pages.
Some people think that white pages are an invasion of
their privacy.
Yellow pages contain information about businesses.
Web Searching : Yellow pages
Example
Bigfoot www.bigfoot.com
Yahoo! People Search people.yahoo.com
WhoWhere www.whowhere.com
Yellow Page Malaysia www.yellowpages.com.my
SuperPages www.superpages.com
Directory VS Search Engine
Directory | Search Engine
A directory allows you to explore and get what you want eventually. | A search engine brings you to the exact page containing the words or phrases you are looking for.
Use a directory to find cooking-related websites. | Use a search engine to find a specific recipe by providing the names of the ingredients.
Use a directory to find travel guides in a country. | Use a search engine to find the train schedule in South Africa.
Browser
Can be defined as a software application used to locate
and display web pages.
Popular browsers
IE (Internet Explorer)
Netscape
Mozilla Firefox
Safari
Opera
Mosaic
Google Chrome
Searching Techniques
Searching techniques:
do you have any idea about them?
Searching : terminology
Search tool: Any means of locating information on the
Internet.
Query: Information typed into the form on the search
engine.
Query syntax: Rules for constructing a valid query.
Query semantics: Rules for defining the meaning of a
query.
Hit/Match: A URL that the search engine returns for a
specific query.
Relevancy score: A value that indicates how closely the
URL matches the query (for example, from 1 to 100).
Searching : Pattern Matching Queries
It is also called Fuzzy Query.
You can enter ungrammatical sentences, incomplete
sentence fragments, disjoint phrases, nonsense
words.
The search engine gets a collection of keywords.
Required keyword: Mark with + before the keyword.
Prohibited keyword: Mark with - before the keyword.
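The + and - markers can be illustrated with a small sketch. The functions `parse_query` and `matches` are hypothetical names, and the matching rule (a page needs every required word, no prohibited word, and at least one ordinary keyword when nothing is required) is an assumption; real engines differ in the details.

```python
def parse_query(query):
    """Split a query into required (+), prohibited (-) and ordinary keywords."""
    required, prohibited, ordinary = set(), set(), set()
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.add(token[1:])
        elif token.startswith("-") and len(token) > 1:
            prohibited.add(token[1:])
        else:
            ordinary.add(token)
    return required, prohibited, ordinary


def matches(page_text, query):
    """Decide whether a page satisfies a pattern matching query."""
    words = set(page_text.lower().split())
    required, prohibited, ordinary = parse_query(query)
    if not required <= words:
        return False          # a required keyword is missing
    if prohibited & words:
        return False          # a prohibited keyword is present
    # With no required keyword, at least one ordinary keyword must appear.
    return bool(required) or bool(ordinary & words)
```

For example, the query `+python -monty` accepts a page containing "python is a snake" and rejects a page containing "monty python sketch".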
Searching : Boolean Query
A Boolean Query is a query that consists of keywords
combined with logical operators (AND, OR, NOT).
X AND Y will return URLs that contain both X and Y.
X OR Y will return URLs that contain either X or Y.
X AND NOT Y will return URLs that contain X and do
not contain Y.
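These three operators correspond to set operations on the URLs that contain each keyword. The sketch below assumes a tiny, made-up index (`index`) and a hypothetical helper `urls_with`; the example.com URLs are placeholders.

```python
# Hypothetical index: keyword -> set of URLs whose pages contain it.
index = {
    "cell": {"http://example.com/1", "http://example.com/2"},
    "phone": {"http://example.com/2", "http://example.com/3"},
}


def urls_with(keyword):
    """URLs containing the keyword (empty set if the keyword is unknown)."""
    return index.get(keyword, set())


x = urls_with("cell")
y = urls_with("phone")

both = x & y        # X AND Y: pages that contain both keywords
either = x | y      # X OR Y: pages that contain at least one keyword
x_not_y = x - y     # X AND NOT Y: pages with X that do not contain Y
```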
Searching : Search Strategies
You should find a search engine that meets the following
conditions:
A user-friendly interface
Easy-to-understand documentation
Convenient to access
A large indexed database
Good relevancy scores
Learn the syntax of this one search engine rather than
of several different engines.
Searching : Search Generalization
Too few hits?
You need to generalize your search query.
Pattern matching query: eliminate one of the more
specific keywords from the query.
Boolean query: remove a keyword joined with the AND
operator, or delete a NOT term.
Use a directory or metasearch engine if you still cannot
locate a matching URL.
Searching : Search Specialization
Too many hits?
You need to specialize your search query.
Pattern matching query: add more keywords.
Boolean query: use AND with another keyword, or add the
NOT operator to exclude some unwanted pages.
Try capitalizing proper nouns or names.
Use a directory to locate your information.
Searching: How does it work?
User interface: Allows you to type a query and displays the results.
Searcher: The component that searches the database for matches to your query.
Evaluator: The component that assigns scores to the retrieved information.
Gatherer: The component that travels the Web and collects information.
Indexer: The component that categorizes the data collected by the gatherer.
Searching: How does it work?
1. User Interface
Provides a mechanism for a user to submit queries to
the search engine.
Uses forms, very user friendly.
The user interface displays the search results in a
convenient way.
A summary of each matched page is shown.
2. Searcher
It is a program that uses the search engine's database to
locate the matches for a specific query.
The database of a search engine holds an extremely large
number of indexed pages.
Searching: How does it work?
3. Evaluator
The searcher returns a set of URLs that match your
query.
Not all hits match your query equally well.
The more references there are to a page, the higher the
ranking of the page will be.
How is the relevancy score calculated?
It varies from one engine to another.
How many times does the word appear?
Do the query words appear in the title?
Do the query words appear in the META tag?
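The questions above can be turned into a toy scoring function. This is purely illustrative: the function name `relevancy_score` and the weights (5 for the title, 3 for the META keywords, 1 per occurrence in the body) are arbitrary assumptions, and, as noted, every engine computes its score differently.

```python
def relevancy_score(query, title, body, meta_keywords):
    """Toy relevancy score; the weights 5, 3 and 1 are arbitrary assumptions."""
    score = 0
    title_words = title.lower().split()
    body_words = body.lower().split()
    meta_words = [word.lower() for word in meta_keywords]
    for word in query.lower().split():
        if word in title_words:
            score += 5                     # query word appears in the title
        if word in meta_words:
            score += 3                     # query word appears in the META tag
        score += body_words.count(word)    # one point per occurrence in the body
    return score
```

A page whose title and META keywords contain the query words scores higher than a page that only mentions them once in its body, so it would be listed first.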
Searching: How does it work?
4. Gatherer
It is a program that traverses the Web and gathers
information about the Web documents.
It runs at short, regular intervals.
It returns information, which is then indexed into the
database.
Alternate names: Bot, Crawler, Robot, Spider and Worm.
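The traversal a gatherer performs can be sketched without touching the network. Here the Web is replaced by a small dictionary (`web`) of invented pages and links, and `gather` is a hypothetical function that visits pages breadth first; a real crawler would download pages and extract their links instead.

```python
from collections import deque


def gather(start_url, web):
    """Visit every page reachable from start_url, breadth first.

    `web` is a dict of URL -> (text, list of linked URLs). It stands in
    for the real Web so that the sketch runs without network access.
    """
    seen = {start_url}
    queue = deque([start_url])
    collected = {}
    while queue:
        url = queue.popleft()
        if url not in web:
            continue  # dead link: nothing to collect
        text, links = web[url]
        collected[url] = text
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return collected


# A tiny made-up web with one dead link.
web = {
    "http://example.com/": ("home", ["http://example.com/a", "http://example.com/dead"]),
    "http://example.com/a": ("page a", ["http://example.com/"]),
}
collected = gather("http://example.com/", web)
```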
5. Indexer
It organizes the data by creating a set of keys or an
index.
E.g. libraries index books by author, title, ISBN, etc.
Indexes need to be rebuilt frequently
in order to ensure the returned URLs are not out of date.
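One common way to build such an index is an inverted index, which maps each word to the pages that contain it. The sketch below is an assumption about what an indexer might do, not a description of any specific engine; `build_index`, the `pages` dictionary and its example.com URLs are invented for illustration.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of URLs whose text contains it.

    `pages` is a dict of URL -> page text, standing in for the data
    returned by the gatherer (hypothetical format).
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


pages = {
    "http://example.com/a": "web searching tips",
    "http://example.com/b": "searching the library catalogue",
}
index = build_index(pages)
```

Rebuilding the index simply means calling `build_index` again on freshly gathered pages, which is why pages that have disappeared stop being returned.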
Searching Tips
Be natural
Is a cell phone harmful?
Ask the search engine: cell phone AND harmful
Capitalize
Normally use lowercase.
star will match Star, STAR, stAr, ...
Type Star only when you really want to search for Star.
Use uncommon keywords
More specific results will be returned to you.
Think of a valid and uncommon keyword.
Searching Tips
Require words
Add a + before the keyword.
It will be in every match.
Exclude words
Use - before the keyword.
In what situation should we use this?
Correct Spelling
Beware of the differences between British and
American spellings (colour, color); search for
color OR colour.
Searching Tips
Stop words
Search engines ignore the most common words (the, is, ...).
Search for "searching the web" and the search engine will ignore
"the web".
Add more relevant keywords.
Use wildcards
Use * in some search engines.
funk* matches funk, funky, funkiest, ...
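The effect of the * wildcard can be tried out with Python's standard `fnmatch` module. This is only an analogy: `wildcard_search` is a hypothetical helper, the word list is invented, and `fnmatch` also gives special meaning to `?` and `[...]`, which a search engine may not.

```python
from fnmatch import fnmatchcase


def wildcard_search(pattern, vocabulary):
    """Return the words in `vocabulary` that match a pattern containing *."""
    return [word for word in vocabulary if fnmatchcase(word, pattern)]


words = ["funk", "funky", "funkiest", "fun", "defunct"]
hits = wildcard_search("funk*", words)
```

Only words that begin with "funk" are returned; "fun" is too short and "defunct" does not start with the pattern.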
Solve dead links
If the search engine returns
http://www.hit.com/a/b/c.html and it is a dead link,
you can try http://www.hit.com/a/b/
or http://www.hit.com/a/
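Shortening a dead link one path segment at a time can be automated. The sketch below uses the standard `urllib.parse` module; `parent_urls` is a hypothetical helper name, and it only builds the candidate addresses without checking whether they actually exist.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """List the URLs obtained by removing one trailing path segment at a time."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]
    candidates = []
    while segments:
        segments.pop()  # drop the last segment (file or folder name)
        path = "/" + "/".join(segments) + ("/" if segments else "")
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates
```

For the dead link `http://www.hit.com/a/b/c.html` the candidates are tried in order, from the closest folder up to the site's home page.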
Searching Tips
Use different resources to find/search different kinds of
information.
Use successive query refinement to achieve effective
search queries.
Think carefully about the keywords you type into the search
engine.
Use Boolean queries when you need combinations of
keywords.
Let's Do It!
1. History, default page, favourite/bookmark
Clear history
Clear cache
Clear URL
Save and use a bookmark/favourite
Set the default page
2. Find any notes (Word, PowerPoint, .pdf, etc.) about the
communication model that you think provide all the
information that you want, and share them with your
friends in e-learning.
Thank You !!