Search and Retrieval of Information

This document discusses information retrieval and information retrieval systems. It explains that information retrieval is carried out through database queries using interrogation languages. It also describes different tools for information retrieval such as databases, the Internet, indexes, keywords and thesauri. Finally, it explains the components of an information retrieval system such as structured documents and databases, as well as interrogation languages and search equations.

Uploaded by

ScribdTranslations

Search and Retrieval of Information

Information retrieval is the step that follows the identification of information needs.
Information can be retrieved through different tools: databases, the Internet, thesauri,
ontologies, maps, and so on. Knowing and using these tools contributes to high-quality retrieval.

Retrieval of information
Retrieval is carried out through queries to the database where the structured information
is stored, using an appropriate query language. The key elements that drive the search and
determine its degree of relevance and precision must be taken into account: indexes,
keywords and thesauri, as well as the phenomena that can occur in the process, namely
documentary noise and documentary silence. One of the problems that arises when searching
for information is whether what we retrieve is “a lot or a little”: depending on the type
of search, either a multitude of documents or only a very small number may be retrieved.
This phenomenon is called documentary silence or documentary noise.

 Documentary silence: documents that are stored in the database but have not been
retrieved, because the search strategy was too specific or because the keywords used
were not appropriate to define the search.
 Documentary noise: documents that are retrieved by the system but are not relevant.
This usually happens when the search strategy was defined too generically.
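
The two phenomena can be pictured as set differences between what a search returns and what is actually relevant. The following is only a minimal sketch: the function name and the document identifiers are invented for illustration, and in practice the full set of relevant documents is rarely known in advance.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise, silence) for one search.

    noise   = documents retrieved but not relevant
    silence = relevant documents that were not retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    return retrieved - relevant, relevant - retrieved


# Hypothetical identifiers, for illustration only.
relevant = {"doc1", "doc2", "doc3"}
retrieved = {"doc2", "doc3", "doc7", "doc9"}

noise, silence = noise_and_silence(retrieved, relevant)
print(sorted(noise))    # ['doc7', 'doc9']
print(sorted(silence))  # ['doc1']
```

A strategy that is too generic tends to enlarge the first set, and one that is too specific tends to enlarge the second.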

Information retrieval system concept

An information retrieval system is a process in which previously stored information is
accessed using computer tools that allow specific search equations to be formulated. This
information must have been structured before it was stored.

Essential components

 Structured documents. A process must be defined in which indexing and terminological
control tools are established.
 Databases where the documents are stored. The query languages and operators that the
database will support must be defined, together with the types of search equations that
will be allowed.

Tools
Databases

Internet

 Electronic journals
 Search engines. Search engines are tools that locate and retrieve information stored
on the Internet. They work much like databases: they store pages together with certain
characteristics (metadata) and then, given some keywords, return a list of the most
relevant ones.
o General search engines
 Google (http://www.google.com)
 Alltheweb (http://www.alltheweb.com)
 AltaVista (http://www.altavista.com)
 Excite (http://www.excite.com)
 Infoseek (http://www.infoseek.com)
 Lycos (http://www.lycos.com)
 Webcrawler (http://webcrawler.com)
 HotBot (http://www.hotbot.com)
 Directories. Directories are organized lists that give access to information in a
structured and hierarchical way. Entries are classified into categories, and the user
follows links from the most general to the most specific.
o Recommended for searches in which the user does not know much about the
specific topic
 The Google Directory (http://directory.google.com)
 Ozu (http://categorias.ozu.es)
 El Índice (http://www.elindice.com)
 Yahoo (http://www.yahoo.com)
o Specialized directories and search engines
 Humbul http://www.humbul.ac.uk
 Librarian Index to the Internet http://lii.org
 Internet Public Library http://www.ipl.org
 Scirus http://www.scirus.com
 Search4Science http://www.search4science.com
 Metasearch engines. These are search engines that do not query a single database:
when the search terms are entered, they scan several databases, so the breadth of
results is greater.
o Vivisimo (http://www.vivisimo.com)
o Dogpile (http://www.dogpile.com)
o Kartoo (http://www.kartoo.com)
o Qbsearch (http://www.qbsearch.com)
o Metacrawler (http://www.metacrawler.com)
 Selective search engines. They use a database specialized in a subject.
o Ask (http://www.ask.com)
o Teoma (http://www.teoma.com)
o Electric Library (http://www.elibrary.com)
o Hieros Gamos http://www.hg.org/index.html
 Search programs
o Copernic (http://www.copernic.com)
 Intelligent agents. Intelligent agents are tools that locate information
automatically. You only need to define a search profile and where it should be
launched (databases, websites, etc.), and they automatically present a report on the
new information as it emerges.
o BookWhere http://www.bookwhere.com
o BullsEye Pro http://www.intelliseek.com
o WebSeeker 5 http://www.bluesquirrel.com/
o WebFerret http://www.ferretsoft.com

Indexing and terminological control languages

Indexes

An index is a list of standardized terms that represent the content of a resource. Some types are:

 Subject index: terms ordered according to the subjects covered by the database, the
search engine, etc.
 Alphabetical index: terms listed in alphabetical order.
 KWIC index (keyword in context): a type of permuted index in which the thematic content
of a work is represented by keywords from its title or from another source of
information in the document.
 KWOC index (keyword out of context): a type of permuted index that differs in
presentation from the KWIC index, in that each keyword appears as a separate line
heading. Under each heading appear all the titles, complete or truncated, that contain
the keyword in question.
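
As a rough illustration of the two permuted indexes, the sketch below builds both from a list of titles. The stop list, the sample titles and the column width are assumptions chosen for the example, not part of any standard; real indexes use much larger stop lists and stricter formatting rules.

```python
from collections import defaultdict

# Assumed, deliberately tiny stop list (illustrative only).
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def keywords(title):
    """Return (position, word) pairs for every non-stopword in the title."""
    return [(i, w) for i, w in enumerate(title.split()) if w.lower() not in STOPWORDS]


def kwic(titles):
    """KWIC: one line per keyword, shown inside the context of its title."""
    lines = []
    for title in titles:
        words = title.split()
        for i, word in keywords(title):
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            lines.append((word.lower(), f"{left:>30} | {word} {right}".rstrip()))
    return [line for _, line in sorted(lines)]


def kwoc(titles):
    """KWOC: each keyword is a heading, with the full titles listed under it."""
    index = defaultdict(list)
    for title in titles:
        for _, word in keywords(title):
            index[word.lower()].append(title)
    return dict(sorted(index.items()))


titles = ["Retrieval of Information", "Information Systems"]  # hypothetical titles

for line in kwic(titles):
    print(line)
for heading, entries in kwoc(titles).items():
    print(heading.upper())
    for entry in entries:
        print("    " + entry)
```

In the KWIC output the keyword stays aligned in a central column with its surrounding words on either side, while the KWOC output pulls the keyword out as a heading and repeats the whole title beneath it.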

Keywords

A keyword is a meaningful term in natural language that represents the content of the document.

When searching for information, keywords are essential because they allow us to narrow
down and specify the information sought. The difficulty lies in choosing the exact word
that represents the content, which is why it is advisable to add qualifying terms. For
example, if we enter the word flower in any search engine, we may be looking for the
nearest florist, an image of flowers or a study about flowers in the different seasons
of the year.

 Meta keywords. Most search engines use the keywords of each web page to locate
resources. For this reason, it is essential that each page has a tag that lists the
keywords that define it. Describing each page precisely is also important, because
these descriptions determine whether or not search engines locate a resource.

Thesauri

A thesaurus is a controlled list of terms for an area or field of knowledge, in which the
terms maintain semantic and generic relationships with one another.

Its main characteristic is that the terms are arranged hierarchically, allowing terminological
precision in the search for information.

Components:

 Admitted or preferred descriptors: normalized terms (terms that have gone through a
clean-up process that eliminates plurals, avoids synonyms, etc.) that the thesaurus
considers suitable for assignment to a document and that subsequently facilitate
retrieval.
 Non-admitted descriptors: terms that, even though they are standardized, are not
considered appropriate for use (they are usually synonyms, terms not used in the field
in question, etc.).

Relations:

 Hierarchical: they indicate that one term is more specific than another.
 Associative: they indicate that the terms have some relationship.
 Synonyms: they indicate that two terms are synonyms and which of them is used as the
admitted term.
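
The components and relations above can be sketched as a small data structure. This is only an illustration: the terms, the `USE` mapping and the function names are hypothetical and do not come from any published thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """An admitted descriptor with its hierarchical and associative links."""
    term: str
    narrower: list = field(default_factory=list)  # hierarchical relation
    related: list = field(default_factory=list)   # associative relation


# Hypothetical mini-thesaurus (illustrative terms only).
THESAURUS = {
    "plants": Descriptor("plants", narrower=["flowers", "trees"]),
    "flowers": Descriptor("flowers", narrower=["roses"], related=["gardening"]),
    "trees": Descriptor("trees"),
    "roses": Descriptor("roses"),
    "gardening": Descriptor("gardening"),
}

# Synonym relation: non-admitted term -> admitted descriptor.
USE = {"blooms": "flowers", "blossoms": "flowers"}


def preferred(term):
    """Map any term to the admitted descriptor used for indexing and searching."""
    term = term.lower()
    return USE.get(term, term)


def expand(term):
    """Return the admitted descriptor plus every narrower descriptor below it."""
    start = preferred(term)
    if start not in THESAURUS:
        return []
    result, stack = [], [start]
    while stack:
        current = stack.pop()
        if current not in result:
            result.append(current)
            stack.extend(THESAURUS[current].narrower)
    return result


print(preferred("Blooms"))  # flowers
print(expand("blooms"))     # ['flowers', 'roses']
```

Replacing a non-admitted term with its admitted descriptor keeps indexing and searching on the same vocabulary, and following the hierarchical links lets a query on a general term also reach documents indexed under more specific ones.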

Query languages and search equations

Languages

Each retrieval system has its own query language, which allows the user to “speak” the
same language as the database. This language, like any other, has its own syntax, which
specifies the particular characteristics of the search and determines at all times the
relationship between the search elements. The grammatical rules of a query language are
its operators.

How to propose a search strategy

There are no guidelines that tell us exactly how to carry out every search, because each
query is different. That is why it is useful to define a basic working procedure:

 Approaching the issue from different points of view
 Determining what is already known about the topic
 Formulating the search by:
o Selecting keywords that represent what is being sought (using dictionaries,
synonyms, thesauri, ontologies, etc.)
o Translating important words into other languages (English)
 Selecting search tools (indexes, engines, metasearch engines). It is recommended to
use several tools at the same time.
 Applying the keywords in the selected search tools
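
The formulation step can be sketched as follows: the alternatives for each concept (synonyms, translations) are joined with OR, and the concepts are then joined with AND. The function name, the sample terms and the exact syntax are assumptions; each search tool defines its own operators and quoting rules.

```python
def build_equation(concepts):
    """Join the alternatives of each concept with OR, then join concepts with AND."""
    groups = []
    for terms in concepts:
        # Multi-word terms are quoted so that they are treated as phrases.
        quoted = [f'"{t}"' if " " in t else t for t in terms]
        group = " OR ".join(quoted)
        groups.append(f"({group})" if len(quoted) > 1 else group)
    return " AND ".join(groups)


# Hypothetical concepts: each inner list holds synonyms or translations of one idea.
concepts = [
    ["flower", "bloom", "flor"],
    ["season", "time of year"],
]

print(build_equation(concepts))
# (flower OR bloom OR flor) AND (season OR "time of year")
```

The same list of concepts can then be rendered in the syntax of each selected tool, which makes it easier to run one strategy across several tools.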

Simple equations

Composite equations

Operators

 Logical or Boolean: They convert the words of the query into mathematical sets and
operate on the words as if they were sets. The basic operations are addition or union
(OR), subtraction or difference (NOT) and product or intersection (AND).
o Logical AND (AND)
o Logical NOT (NOT)
o Logical OR (OR)
 Positional: They specify the position of the words within the document.
o NEAR
o Adjacency (ADJ)
o Phrases
 Existence: They indicate whether the presence or absence of a word is required in the
retrieved documents.
o Presence
o Absence
 Accuracy: This type of operator is used when the intended query is less specific,
since it allows a search word to be truncated to its root.
o Proximity
o By fields
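
To make the set view of these operators concrete, here is a minimal sketch over a toy positional index. The three documents, the function names and the whitespace tokenization are assumptions made for the example; real systems differ in syntax and in how they tokenize text.

```python
from collections import defaultdict

# Hypothetical mini-collection (illustrative only).
DOCS = {
    1: "flower shops near the city centre",
    2: "a study of flowers across the seasons",
    3: "city guide with seasonal flower markets",
}


def build_index(docs):
    """Map each word to {doc_id: [positions]} so positional operators can be evaluated."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index


INDEX = build_index(DOCS)


def docs_with(word):
    """Set of documents that contain the word."""
    return set(INDEX.get(word, {}))


def near(a, b, distance):
    """Documents in which a and b occur within `distance` words of each other."""
    hits = set()
    for doc_id in docs_with(a) & docs_with(b):
        if any(abs(i - j) <= distance
               for i in INDEX[a][doc_id] for j in INDEX[b][doc_id]):
            hits.add(doc_id)
    return hits


def truncate(prefix):
    """Documents containing any word that starts with the prefix (e.g. season*)."""
    return {d for word, postings in INDEX.items() if word.startswith(prefix)
            for d in postings}


print(sorted(docs_with("flower") & docs_with("city")))     # AND  -> [1, 3]
print(sorted(docs_with("flower") | docs_with("seasons")))  # OR   -> [1, 2, 3]
print(sorted(docs_with("flower") - docs_with("shops")))    # NOT  -> [3]
print(sorted(near("flower", "markets", 1)))                # NEAR -> [3]
print(sorted(truncate("season")))                          # season* -> [2, 3]
```

Because every operator returns a set of document identifiers, the results can be combined freely, which is what allows simple search equations to be composed into more complex ones.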

Navigation versus Information Retrieval

Concept

Navigation is carried out with a program that allows you to consult and obtain
information through hypertext systems.

Differences

The essential difference between the two concepts lies in the way information is
obtained: in information retrieval it is obtained linearly, whereas in navigation it is
obtained through hypertext. This means that knowledge is acquired gradually and,
depending on the user's interest, deepened in one subject or another by moving through
the information nodes.

Directories versus Search Engines

| Search engines | Directories |
| --- | --- |
| The information is updated automatically over the network. | The information is updated manually by the person who registers the website in the directory when creating it. |
| They collect all the information stored on the page. | They do not store all web content, only the most relevant fields such as the title, keywords, etc. |
| They store the information in their own database. | They store information in directories, classified into categories. |
| The search is performed in the database using the search equation. | The search is carried out hierarchically according to the established categories. |
| The results are presented in order of relevance according to criteria established in the search equation. | The results are presented as a list of all the corresponding documents in the category, without any presentation criteria. |
| Appropriate for locating specific information. | Appropriate for locating general information on a topic. |

Metadata

In navigation and information retrieval, metadata are used to detect relevant
information quickly and efficiently. Tags describe the content of the web resource, and
search tools then use those descriptions to locate and access the resource. It is mainly
the keyword and title tags that make it possible to locate the document.
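
As a rough sketch of how a tool might read these two tags, the example below uses the `html.parser` module from the Python standard library. The sample page and the class name are hypothetical, and real search tools weigh many other signals besides these tags.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect the <title> text and the content of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page, for illustration only.
PAGE = """<html><head>
<title>Flowers by season</title>
<meta name="keywords" content="flowers, seasons, botany">
</head><body>...</body></html>"""

parser = MetadataParser()
parser.feed(PAGE)
print(parser.title)     # Flowers by season
print(parser.keywords)  # ['flowers', 'seasons', 'botany']
```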

Retrieval quality
Below are some basic criteria for ensuring that retrieval is of good quality.

 Consistency: the ability of a search system to coordinate its classification system
with the search language, thus allowing search equations to be built on admitted terms.
 Exhaustiveness: the capacity of an information system to retrieve all the relevant
documents in a collection, in accordance with the requirements established in the
search strategy.
 Hit rate: coefficient obtained by dividing the number of relevant documents retrieved
by the total number of relevant documents in the collection.
 Relevance: characteristic of a retrieved document that meets the information needs.
 Relevance rate: coefficient obtained by dividing the number of relevant documents
retrieved by the total number of documents retrieved.
 Pertinence: the quality of a retrieved document of being suited to the information
needs.
 Pertinence rate: coefficient obtained by dividing the number of relevant documents
retrieved by the total number of documents retrieved.
 Precision: the ability of the search system to match the equation with the most
relevant documents; in other words, the relevant documents retrieved.
 Precision rate: coefficient obtained by dividing the number of relevant documents
retrieved by the total number of documents in the collection.
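
Two of the coefficients above can be computed directly from sets of document identifiers. This is a minimal sketch with invented data: the function names are assumptions, and the calculation presupposes that the relevant documents in the collection are known, which in practice usually requires manual assessment. In the wider information retrieval literature, the first ratio is commonly called recall and the second precision.

```python
def hit_rate(retrieved, relevant):
    """Relevant documents retrieved / all relevant documents in the collection."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


def relevance_rate(retrieved, relevant):
    """Relevant documents retrieved / all documents retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


# Hypothetical search result, for illustration only.
relevant = {"doc1", "doc2", "doc3", "doc4"}
retrieved = {"doc2", "doc3", "doc7", "doc8", "doc9"}

print(hit_rate(retrieved, relevant))        # 0.5
print(relevance_rate(retrieved, relevant))  # 0.4
```

Here the search found half of the relevant documents, and two of the five documents it returned were relevant; the other three are documentary noise.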

Skills and competencies
 Formulating a plan for searching for information: defining the subject or aspects
to be searched, using a list of appropriate keywords, and delimiting the search
according to chronological and language criteria.
 Knowledge of potential and actual sources of information
 Skills in locating relevant printed and electronic resources in the context of the
information need
 Ability to select the most appropriate search tool and formulate the most appropriate
strategy.
 Mastery of advanced techniques for retrieving information on the Internet, using
engines, search directories, and intelligent agents.
 Skills to evaluate the results of the search, reflecting on successes, failures and
alternative strategies.
 Determining the location of, and access to, information while respecting ethical and
legal principles.

Extracted from E-COMS (Electronic Content Management Skills) Available at:

http://www.mariapinto.es/e-coms/busqueda-y-recuperacion-de-informacion/
