Untangling The Web: Alena Kaltunevich

This document provides an overview of searching the internet and various search tools. It discusses how search engines work by having software "spiders" crawl the web and index pages in their databases. When a search is performed, the engine checks its index to find relevant results. Different engines use different algorithms to rank results. The document also describes various search engines like Google, Yahoo, and specialized tools for images, videos or other data types. It provides tips for effective searching using operators, quotation marks, and excluding terms.

Uploaded by

tezla76

UNTANGLING THE WEB

ALENA KALTUNEVICH

1
Contents

• Introduction to Searching

• Search Engines

• Specialized Search

• Research Tips & Techniques

2
Document declassified by NSA in 2013

3
Introduction to Searching

1. The spider/robot/crawler is software that "visits" sites on the Internet (each
search engine does this differently). The spider reads what is there, follows
links at the site, and ultimately brings all that data back to:

2. The search engine index, catalog, or database, where everything the spider
found is stored.

3. The search engine is software that actually sifts through everything in the
index to find matches and then ranks or sorts them into a list of results or hits.
When you use a search engine, you are searching the index or database, not
the web pages themselves. This is important to remember because no
search engine operates in "real time."
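The three parts above can be sketched in a few lines of Python. This is only a toy illustration: the three-page "web" in PAGES and the a.example, b.example, and c.example addresses are made up, and real spiders and indexes are far more elaborate.

```python
from collections import defaultdict

# Hypothetical three-page "web": URL -> (page text, outgoing links).
PAGES = {
    "a.example": ("roses and gardening tips", ["b.example"]),
    "b.example": ("gardening tools", ["c.example"]),
    "c.example": ("rose varieties", []),
}

def crawl(start):
    """Spider: visit a page, read it, and follow its links."""
    seen, queue, fetched = set(), [start], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in PAGES:
            continue
        seen.add(url)
        text, links = PAGES[url]
        fetched[url] = text
        queue.extend(links)
    return fetched

def build_index(fetched):
    """Index: map each word to the set of URLs that contain it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Search engine: consult the index only, never the live pages."""
    hits = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*hits)) if hits else []

index = build_index(crawl("a.example"))
print(search(index, "gardening"))
```

Because search() reads only the stored index, a page that changes after the crawl stays invisible until the spider visits again, which is the "no real time" point made above.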

4
Most search engines use statistical interfaces. The search engine assigns
relative weights to each search term, depending on:

• its rarity in their database
• how frequently the term occurs on the webpage
• whether or not the term appears in the URL
• how close to the top of the page the term appears
• (sometimes) whether or not the term appears in the meta tags.

When you query the database, the search engine adds up all the weights
that match your query terms and returns the documents with the highest
weight first. Each search engine has its own algorithm for assigning
weights, and they tweak these frequently. In general, rare, unusual terms
are easier to find than common ones because of the weighting system.
However, remember that "popularity" measured by various means often
trumps any statistical interface.
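As a rough sketch of how such weights might be added up, the function below combines the first four factors from the list. The formula itself is an assumption made for illustration (every engine uses its own, unpublished algorithm), and the page, URL, and document counts are invented.

```python
import math

def term_weight(term, page_text, page_url, doc_freq, total_docs):
    """Toy weight for one query term on one page (illustrative formula only)."""
    words = page_text.lower().split()
    if term not in words:
        return 0.0
    rarity = math.log(total_docs / doc_freq)            # rarer in the database -> larger
    frequency = words.count(term) / len(words)          # how often it occurs on the page
    in_url = 1.0 if term in page_url.lower() else 0.0   # term appears in the URL
    position = 1.0 - words.index(term) / len(words)     # closer to the top -> larger
    return rarity * (frequency + in_url + position)

def score(query, page_text, page_url, doc_freqs, total_docs):
    """Add up the weights of all query terms for one page."""
    return sum(
        term_weight(t, page_text, page_url, doc_freqs.get(t, 1), total_docs)
        for t in query.lower().split()
    )

# Hypothetical database statistics: how many of 1000 pages contain each term.
doc_freqs = {"cardinal": 10, "the": 900}
page = "cardinal birds nest in the garden"
url = "example.org/cardinal"
print(score("cardinal", page, url, doc_freqs, 1000))
print(score("the", page, url, doc_freqs, 1000))
```

The rare term "cardinal" scores far higher than the common term "the" on the same page, which matches the observation that unusual terms are easier to find.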

Search engines are not the only, and often not even
the best, way to access information on the Internet.
5
The growth in the number of search engines has led to the creation of "meta" search
sites. These services allow you to invoke several or even many search engines
simultaneously.

Metasearch engines do serve a purpose. If you are unsure whether a term will be
found anywhere on the web, try a metasearch engine first to "size" the problem:

http://clusty.com/
Clusty employs its own clustering engine, software that organizes unstructured
information into hierarchical folders.
Clusty is especially useful for searching ambiguous terms, such as cardinal,
because it clusters them by logical categories.
Example: a search on Iran (clusters appear on the left of the results page)

http://www.dogpile.com/
https://mamma.com/
6
Use the right tool for the job

The best starting places for general information on broad topics are web
directories/subject guides, virtual libraries, and reference desks:

http://www.about.com/
http://www.encyclopedia.com/
http://www.britannica.com/

While directories and virtual libraries contain information selected by people, search
engine databases are mostly unfiltered, that is, no human being is looking at the
data being indexed to determine its value, authenticity, and reliability.

However, no single search engine is best. Each has its own advantages and drawbacks.
Use more than one search engine.

7
Search engines
Google

Google first gained fame and widespread use because of its single-minded
focus on search, exemplified by its "clean" interface, and its PageRank
"weighted link popularity."

In simple terms, Google gives each webpage a rank based on the
number of other pages linking to it and the "importance" of those pages,
where importance is derived from an overall link count.

While PageRank is imperfect, it works better than most other approaches to
ranking search results and, indeed, is one of the primary reasons for
Google's success.
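The idea of ranking a page by who links to it, and by how important those linking pages are, can be sketched with the textbook power-iteration form of PageRank. This is a simplified illustration, not Google's production algorithm; the four-page link graph is made up, and the damping value of 0.85 is the commonly cited default.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.

    Every link target must itself be a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links shares its rank with all pages.
            targets = outgoing or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical site: three pages link to "home"; nothing links to "orphan".
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "orphan": ["home"],
}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))
```

The page with the most incoming links ends up with the highest rank, while "orphan", which nothing links to, keeps only the small baseline share.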

8
Google assumes as its default that multiple search terms are joined by the AND
operator, so that a search on the keywords [windows explorer] will find all the
webpages that contain both search terms. Furthermore, Google will first try to find
all the webpages that contain the phrase ["windows explorer"].

Google will search:

first, for phrases (keywords as one long phrase)
second, for webpages containing all the keywords with the greatest
adjacency (closest together),
third, for webpages containing all the keywords, regardless of where they appear on the page
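The three-step order above can be approximated as a sort key: exact phrase matches first, then pages ordered by how tightly the keywords cluster. This is a sketch of the idea only. The page texts are invented, punctuation is not handled, and steps two and three are merged into a single "smallest window" measure.

```python
def smallest_window(keywords, words):
    """Length of the shortest run of words containing every keyword, or None."""
    best = None
    for start in range(len(words)):
        needed = set(keywords)
        for end in range(start, len(words)):
            needed.discard(words[end])
            if not needed:
                length = end - start + 1
                if best is None or length < best:
                    best = length
                break
    return best

def rank_key(query, text):
    """(0, window) for an exact phrase, (1, window) otherwise, None if a keyword is missing."""
    keywords = query.lower().split()
    words = text.lower().split()
    window = smallest_window(keywords, words)
    if window is None:
        return None
    is_phrase = any(
        words[i:i + len(keywords)] == keywords
        for i in range(len(words) - len(keywords) + 1)
    )
    return (0 if is_phrase else 1, window)

def rank_pages(query, pages):
    """Return the URLs of matching pages, best match first."""
    keyed = {url: rank_key(query, text) for url, text in pages.items()}
    return sorted((u for u, k in keyed.items() if k is not None), key=keyed.get)

# Hypothetical pages.
pages = {
    "p1": "open windows explorer to browse files",
    "p2": "explorer for windows",
    "p3": "windows are glass and an explorer is a person who travels",
    "p4": "windows only",
}
print(rank_pages("windows explorer", pages))
```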

While Google assumes that multiple keywords are a phrase, searchers can
delimit phrases using double-quotes. For example, if I search on [the last king of france]
without double-quotes, Google will ignore the "the" and the "of" in its search. The
results I get include many irrelevant hits, such as music from a group called "The
Last King" and an article about Lance Armstrong. However, if I enclose the same
query in double-quotes, Google will search on exactly the phrase ["the last king of
france"], and return a result with the name of the last king of France. Enclosing
searches in double-quotes is much more effective for finding precise results than
relying on automatic phrase searching.
9
It is unnecessary to use the plus sign (+) with any terms except stop words because
by default Google searches for all keywords.

However, there are many times when searchers need to exclude certain terms
that are commonly associated with a keyword but irrelevant to their search.

That's where the minus sign (-) comes in.

Using the minus sign in front of a keyword ensures that Google excludes that term
from the search. For example, the results for the search ["pearl harbor" -movie] are
very different from the results for ["pearl harbor"].
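A minimal sketch of how a query with quoted phrases and minus-sign exclusions could be interpreted is shown below. It leans on Python's shlex module to split the quoted phrase, and it uses crude substring matching, so it is an illustration of the logic rather than a description of how Google parses queries. The sample texts are invented.

```python
import shlex

def parse_query(query):
    """Split a query into required terms/phrases and excluded terms."""
    required, excluded = [], []
    # shlex keeps a double-quoted phrase together as one token.
    # Note: an unbalanced quote or apostrophe in the query raises ValueError.
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            required.append(token.lower())
    return required, excluded

def matches(query, text):
    """True if the text has every required term and none of the excluded ones."""
    required, excluded = parse_query(query)
    text = text.lower()
    return all(r in text for r in required) and not any(e in text for e in excluded)

print(parse_query('"pearl harbor" -movie'))
print(matches('"pearl harbor" -movie', "The attack on Pearl Harbor in 1941"))
print(matches('"pearl harbor" -movie', "Pearl Harbor movie review"))
```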

To force Google to search only for the term with the diacritic, put a plus sign
in front of the term: [+façade].

10
Google Advanced

site: restricts search to a domain

[shuttle site:www.nasa.gov] finds pages about the space shuttle at the NASA website.

[cirrus -site:mastercard.com] finds pages about the keyword cirrus that are not at
the Mastercard.com site.

intitle: restricts the results to documents containing the keyword in the title.
[intitle:amazon "rain forest"] finds all pages that include the word amazon in their
title and mention the phrase "rain forest" anywhere in the document (title or text).

inurl: restricts the results to documents containing the keyword in the URL.
[inurl:nasa -site:gov] finds all pages that include nasa anywhere in the URL of sites
that are not in the .gov top-level domain.
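When many such queries need to be composed, a small helper can assemble the operators and encode the result for a browser address. The helper names build_query and search_url are hypothetical, and the https://www.google.com/search?q= address is assumed here as the usual search entry point.

```python
from urllib.parse import urlencode

def build_query(keywords="", site=None, exclude_site=None, intitle=None, inurl=None):
    """Assemble a query string from the operators described above."""
    parts = []
    if intitle:
        parts.append(f"intitle:{intitle}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if keywords:
        parts.append(keywords)
    if site:
        parts.append(f"site:{site}")
    if exclude_site:
        parts.append(f"-site:{exclude_site}")
    return " ".join(parts)

def search_url(query):
    """Percent-encode the query for use in a browser address (assumed endpoint)."""
    return "https://www.google.com/search?" + urlencode({"q": query})

print(build_query("shuttle", site="www.nasa.gov"))
print(build_query('"rain forest"', intitle="amazon"))
print(search_url(build_query(inurl="nasa", exclude_site="gov")))
```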

11
link: restricts the results to documents that have links to a specific webpage.
[link:www.noaa.gov] finds all pages linking to the NOAA homepage.

filetype: Google will search the content of many file types.

[filetype:doc bulletin]

Microsoft file types are potentially dangerous to open in their native formats.

define: for example, [define blog]

Video search: filter by genre or duration
[is:free sharknado]

Contrary to popular opinion, not everything is on the Internet. In fact, much of the kind
of information you are used to working with is not and never will be on the Internet.
Unrealistic expectations about the kinds of information you may find on the Internet can
lead to frustration and wasted time and effort. A general rule of thumb:
the more sensitive, rare, or expensive the information, the less likely it is to be on the
Internet. Also, much valuable data on the Internet requires payment.
12
Word Order Matters. Google gives more weight to the first term in a query, so put
the most important search term(s) first. Try these two queries and you'll see how
different the results are: [new york city] vs. [city york new]

YAHOO
https://fr.search.yahoo.com/
Boolean operator queries can give results that differ from those returned by Google:

[cardinals AND (bird OR catholic) AND NOT (baseball OR football)]
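The logic of this Boolean query maps directly onto set operations. The sketch below uses a made-up mini index (term mapped to the set of page numbers containing it) to show which pages survive each operator.

```python
# Hypothetical mini index: term -> set of page ids containing that term.
index = {
    "cardinals": {1, 2, 3, 4},
    "bird": {1},
    "catholic": {2},
    "baseball": {3},
    "football": {2, 4},
}

def boolean_query(index):
    """[cardinals AND (bird OR catholic) AND NOT (baseball OR football)]"""
    wanted = index["cardinals"] & (index["bird"] | index["catholic"])   # AND, OR
    unwanted = index["baseball"] | index["football"]                    # OR
    return wanted - unwanted                                            # AND NOT

result = boolean_query(index)
print(result)
```

Page 2 mentions both "catholic" and "football", so the AND NOT clause removes it even though it satisfied the first half of the query.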

Here is an interesting twist on link searching, that is, finding sites that link to a
specific address. This search, which works with Yahoo, finds pages that link to a specific
domain or domains but not to another specific domain or domains.

[linkdomain:mod.ir linkdomain:ieimil.com -linkdomain:cia.gov]

This technique has obvious applicability for search engine optimization
("who is linking to my competitors but not linking to me?").

13
Gigablast

http://www.gigablast.com/

Strengths
• simple interface
• cached copies with date indexed [archived copies]
• cached copies of webpages without images [stripped]
• links to Internet Archive [older copies]
• clusters results by default (can be turned off)
• no limit on number of search terms
Weaknesses
• most obviously, the Gigablast index is still smaller than those of Google or Yahoo
• no truncation
• is not case sensitive
• no wildcard
• limited file type searches
• limited language options
• poor documentation
14
Exalead
http://www.exalead.com/search/
The French search engine Exalead, which introduced a new look in 2006, has
features that make it worth special mention. Exalead offers both proximity searches
and truncation, two options no other major search engine offers anymore. In
addition, Exalead presents thumbnail images of websites in the results list (if you want them).

• Exalead refreshes its index continuously, not on a schedule (this is a good thing)
• default operator is AND; users may use OR.
• Exalead does not publish a search term limit
• as of now, Exalead has no sponsored links.
There are two other operators that can be used in a boolean query:
NEAR and OPT. NEAR finds search terms within 16 words of each other and OPT
makes a query term preferable but does not require it.
For example: [(football NEAR cardinals) OPT "st louis"]
This is nice to know because most search engines use AND as their default and will
not return results unless all terms are found.
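The behavior of NEAR and OPT can be sketched as follows. The reading of "within 16 words" as a position difference of at most 16 is an assumption, as is the simple 1-or-2 scoring used to show that the optional phrase raises a result without being required. The sample texts are invented.

```python
def near(text, term_a, term_b, max_distance=16):
    """True if term_a and term_b occur within max_distance words of each other."""
    words = text.lower().split()
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    return any(abs(a - b) <= max_distance for a in pos_a for b in pos_b)

def evaluate(text, near_a, near_b, optional_phrase):
    """None if the NEAR test fails; otherwise 1, or 2 when the OPT phrase is present."""
    if not near(text, near_a, near_b):
        return None
    return 2 if optional_phrase in text.lower() else 1

# [(football NEAR cardinals) OPT "st louis"]
close_with_opt = "the cardinals won the football game in st louis"
close_without_opt = "football fans love the cardinals"
too_far = "football " + "filler " * 20 + "cardinals"

print(evaluate(close_with_opt, "football", "cardinals", "st louis"))
print(evaluate(close_without_opt, "football", "cardinals", "st louis"))
print(evaluate(too_far, "football", "cardinals", "st louis"))
```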

Ask
http://fr.ask.com/
15
Specialized Search
The whole problem of keeping information on the Internet private dramatically
worsened almost overnight a couple of years ago when Google quietly started
indexing whole new types of data.

Originally, most of what got spidered and indexed was HTML webpages and documents,
with some plain text thrown in for good measure.

However, the ever-innovative Google decided this wasn't good enough
and started to index PDF, PostScript, and, most importantly, a whole range of
Microsoft file types: Word, Excel, PowerPoint, and Access.

The problem was that lots of folks had assumed these file types were "immune" to spidering,
not because it couldn't be done but because no one had yet done it.
As a result, many companies,
organizations, and even governments had quite a lot of egg on their faces when
sensitive documents began turning up in the Google database.
16
What kinds of sensitive information can routinely be found using search engines?

The types of data most commonly discovered by Google hackers usually fall into
one of these categories:

• personal and/or financial information

• userids, computer or account logins, passwords
• private, confidential, or proprietary company data
• sensitive government information
• vulnerabilities in websites and servers that could facilitate breaking into the site

Example: search by file type, site type, and keyword:

Many organizations store financial, inventory, personnel, and similar data in Excel spreadsheet
format and often mark the information "Confidential," so a Google hacker looking for
sensitive information about a company in South Africa might use a query such as:

[filetype:xls site:za confidential]

17
Other example keywords: not for distribution, login, password, etc.
Getting private information "back" is harder than preventing
its disclosure in the first place.

Even when Google removes your data, there are literally hundreds of other
search engines around the world, and who knows what they have indexed from your
site. It will not be an easy task finding out. And I'll hazard a guess that not all of them
will be quite so accommodating as Google in removing pages.

Wikipedia
http://www.wikiwax.com/
To search all Wikipedias:
[site:wikipedia.org]

http://a9.com/
Amazon search

18
Google book search
[inpublisher:o-reilly]
[inauthor:patrick-o-brian]
[intitle:"nutmeg of consolation"]
[isbn:0393030326]

Answers
http://www.answers.com/

Wayback Machine
http://archive.org/web/
Using the Wayback Machine, you may very well be able to retrieve a page or an entire site
even if it disappeared from the web years ago.
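For repeated lookups, the addresses can be built programmatically. The sketch below only constructs URLs and makes no network request. The helper names are hypothetical, and the two URL patterns (the availability lookup and the dated snapshot address) reflect the Internet Archive's public interface as I understand it, so check them against the current Archive documentation before relying on them.

```python
from urllib.parse import urlencode

def wayback_lookup_url(page_url, timestamp=None):
    """Address of the availability lookup for a page (assumed pattern).

    timestamp is an optional YYYYMMDD string asking for the closest capture.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)

def wayback_snapshot_url(page_url, timestamp):
    """Address of the archived copy closest to the given timestamp (assumed pattern)."""
    return f"https://web.archive.org/web/{timestamp}/{page_url}"

print(wayback_lookup_url("example.com", "20060101"))
print(wayback_snapshot_url("http://example.com/", "20060101"))
```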

European search engines

http://www.searchenginesoftheworld.com/search_engines_of_europe/

International directory of search engines

http://www.searchenginecolossus.com/
19
Research Tips & Techniques

Tip 1: Use the Right Tool

The single biggest mistake researchers make is using the wrong search tool.
For example, search engines are generally not useful for finding current news
(use a specialized news search service).

Wikis, custom search engines, and directories are generally better when researching a
broad topic.

Tip 2: Search for the Most Obscure Term

Tip 3: Put the Most Important Search Term First

While it's not always true, search engines usually give more weight to the first term
you list because the search software assumes it's the most important term
(otherwise, why would you list it first?). Try these two queries in Google one after the
other: [gardening roses] then [roses gardening]. The results are similar but not identical.
20
Tip 4: Search on the Singular Form First
While it is not always the case that search engines automatically search for plural
forms of search terms, many (including Yahoo and Google) do.
The converse, however, is not true, i.e., a search on [rose] will find roses, but a search on
[roses] will not find rose.
Therefore, it makes sense to search first on the singular form of a term.

Tip 5: Use Regional Search Services, Directories, and Databases

Tip 6: Search in the Native Language

Tip 7: Follow Those Links

Whenever you find a good website, always check its links. While in theory links at a
web page that is indexed by a search engine should also have been indexed, the
reality is often different. "Links" pages are often a gold mine of sites with similar
information.

21
Tip 8: Learn Two Words in Any Non-English Language in Which You Are Searching
Those two words are search and links. You need to be able to push the search or
find button on a non-English web page, and you need to be able to find the links page.

Tip 9: Search on the LINK Field

Tip 10: Look Beyond Search Engines and the Web

Search engines and directories index only a tiny portion of the Internet. With some notable
exceptions, they are basically designed to index web pages. A vast amount of data is stored,
for example, in online databases, many of which are free and open to the public.

Tip 11: Configure and Use Two Browsers

Tip 12: Try URL Guessing

It works more frequently than you would imagine. For example, I found the Iranian
Ministry of Foreign Affairs by guessing www.mfa.gov.ir.
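URL guessing amounts to combining likely abbreviations with likely domain patterns. The sketch below generates candidates only; the abbreviation and second-level lists are illustrative guesses, not a known naming standard, and each candidate still has to be tried in a browser.

```python
from itertools import product

def guess_urls(names, second_levels, country_code):
    """Combine likely organization abbreviations with likely domain patterns."""
    return [
        f"www.{name}.{second}.{country_code}"
        for name, second in product(names, second_levels)
    ]

# Hypothetical guesses for a foreign ministry in the .ir domain.
candidates = guess_urls(["mfa", "mofa"], ["gov", "gouv"], "ir")
for candidate in candidates:
    print(candidate)
```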

Tip 13: Change URLs to Find "Hidden" Webpages

22
Tip 14: Be on the Lookout for URL Errors
Not surprisingly, many URLs listed on webpages are incorrect. Among the most
common mistakes are misspellings, putting a backslash (\) where a slash (/) should
be, including or excluding the L in HTML, e.g.:
http://www.examlpe.com/pathname\bigmistake.html
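Two of these mistakes are mechanical enough to repair automatically. The sketch below produces candidate corrections for backslashes and for the .htm/.html ending; the function name is hypothetical, and misspelled domain names still have to be spotted by eye.

```python
def candidate_fixes(url):
    """Return candidate corrections for two common URL mistakes."""
    fixed = url.replace("\\", "/")           # backslash where a slash belongs
    candidates = [fixed]
    if fixed.endswith(".html"):
        candidates.append(fixed[:-1])        # try the same page as .htm
    elif fixed.endswith(".htm"):
        candidates.append(fixed + "l")       # try the same page as .html
    return candidates

broken = r"http://www.example.com/pathname\bigmistake.html"
for candidate in candidate_fixes(broken):
    print(candidate)
```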

Tip 15: Take a Look at the "Site Map"

Tip 16: Try Using the "Mouseover"

For non-English sites where you don't know the language, try the "mouseover" trick,
i.e., move your mouse over hyperlinks. Often, the link information is in English or, if it
isn't, quite often the URL that appears in the toolbar at the bottom of the browser is
revealing because it is likely to be written in English.

Tip 17: Try Alternative Spellings, Especially of Non-English Names or Terms

23
Tip 18: Always Look at a Website's Native Language Version
Usually, the native language version of a website will differ from the English version,
sometimes a little, sometimes a great deal.

Tip 19: Use Wildcards to Maximize Effectiveness

Tip 20: Examine Page Source Code

In addition to often revealing the webpage's language encoding, page source can
provide other helpful details, including names, dates, email addresses, type of
software used to create the page, etc.
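Pulling these details out of page source can be sketched with Python's built-in HTML parser. The class name SourceInspector and the sample page are made up for illustration; real pages vary widely, and many omit these tags altogether.

```python
from html.parser import HTMLParser

class SourceInspector(HTMLParser):
    """Collect the encoding, named meta tags, and mailto addresses from page source."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.meta = {}
        self.emails = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            if "charset" in attrs:
                self.charset = attrs["charset"]
            elif attrs.get("name") and "content" in attrs:
                self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and (attrs.get("href") or "").startswith("mailto:"):
            self.emails.append(attrs["href"][len("mailto:"):])

# Hypothetical page source.
SAMPLE = """<html><head><meta charset="windows-1251">
<meta name="generator" content="ExampleEditor 1.0">
<meta name="author" content="A. Author"></head>
<body><a href="mailto:webmaster@example.org">Contact</a></body></html>"""

inspector = SourceInspector()
inspector.feed(SAMPLE)
print(inspector.charset, inspector.meta, inspector.emails)
```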

Tip 21: Ask for Help

24
Questions?

