Unit 5 IRS

Irs unit 5.cse3 rd year 1 sem

Uploaded by

anjuanjani769

TEXT SEARCH ALGORITHMS

 There are three classical text retrieval techniques that are defined for organizing items
in a textual database, for rapidly identifying the relevant items and for eliminating
items that do not satisfy the search.
 They are
o Full text scanning (streaming)
o Word inversion
o Multi attribute retrieval
 In addition to indexes, streaming of text was used for searching text in information
systems.

5.1 Introduction to Text Search Techniques

Text scanning system

 Basic concept of a text scanning system
o One or more users can enter queries.
o The text to be searched is accessed and compared to the query terms.
o When all of the text has been accessed, the query is complete.
 Advantage of text scanning system
o As soon as an item is identified as satisfying a query, the results can be
presented to the user for retrieval.
 Architecture

 database
o Contains the full text of the items.
 term detector
o Special hardware/software that contains all of the terms being searched for.
o It will input the text and detect the existence of the search terms.
o It will output to the query resolver the detected terms to allow for final logical
processing of a query against an item.
o In Hardware search machines
 Multiple parallel search machines (term detectors) may work against
the same data stream allowing for more queries or against different
data streams reducing the time to access the complete database.
o In software systems
 Multiple detectors may execute at the same time.
o Two approaches to the data stream.
 In the first approach, the complete database is sent to the
detector(s), which function as a search of the database.
 In the second approach, randomly retrieved items are passed to the
detectors.
 query resolver
o It performs two functions.
o It will accept search statements from the users, extract the logic and search
terms and pass the search terms to the detector.
o It also accepts results from the detector and determines which queries are
satisfied by the item and possibly the weight associated with the hit.
 user interface
o The Query Resolver will pass information to the user interface that will be
continually updating search status to the user.
o On request, retrieve any items that satisfy the user search statement.
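The division of work between the term detector and the query resolver can be sketched in a few lines of Python. This is only an illustrative model, not the design of any particular system: the function names (term_detector, query_resolver, scan), the whole-word matching, and the AND-only query logic are all assumptions made for the example. Query terms are assumed to be lowercase.

```python
from typing import Iterable, Iterator

def term_detector(text: str, terms: set) -> set:
    """Return the search terms that occur as whole words in the text."""
    words = set(text.lower().split())
    return {t for t in terms if t in words}

def query_resolver(detected: set, all_of: set) -> bool:
    """Apply the query logic (here a simple AND) to the detected terms."""
    return all_of <= detected

def scan(items: Iterable, all_of: set) -> Iterator:
    """Stream items past the detector and yield each hit as soon as it is found."""
    for item_id, text in items:
        if query_resolver(term_detector(text, all_of), all_of):
            yield item_id

# Hypothetical three-item database of (identifier, full text) pairs.
database = [
    ("d1", "hardware text search machines"),
    ("d2", "inverted index for text"),
    ("d3", "text search by streaming"),
]
hits = list(scan(database, {"text", "search"}))
```

Because scan is a generator, the first hit is available to the caller before the rest of the database has been read, which mirrors the advantage of text scanning noted above.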

 Inversions/indexes
o Indexes gain their speed by minimizing the amount of data to be retrieved and provide
the best ratio between the total number of items delivered to the user versus
the total number of items retrieved in response to a query.
o Inversion systems require storage overheads of 50% to 300%.
o In a text streaming system, hits may be returned to the user as soon as they are found.
o In an index system, the complete query must typically be processed before any hits
are determined or available.
o Indexes encounter problems with fuzzy searches and embedded string query terms,
because it is difficult to locate all the possible index values short of searching the complete
dictionary of possible terms.
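The difficulty with embedded string terms can be seen in a small inverted index sketch. The names build_index, exact_lookup, and embedded_lookup and the sample items are hypothetical; the point is only that an exact term is a single dictionary lookup, while an embedded string forces a pass over every dictionary entry.

```python
from collections import defaultdict

def build_index(items: dict) -> dict:
    """Map each word to the set of item identifiers that contain it."""
    index = defaultdict(set)
    for item_id, text in items.items():
        for word in text.lower().split():
            index[word].add(item_id)
    return dict(index)

def exact_lookup(index: dict, term: str) -> set:
    """An exact term needs one dictionary access."""
    return index.get(term, set())

def embedded_lookup(index: dict, fragment: str) -> set:
    """An embedded string requires scanning the complete dictionary."""
    hits = set()
    for word, postings in index.items():
        if fragment in word:
            hits |= postings
    return hits

items = {
    "d1": "computer search",
    "d2": "minicomputers and microcomputer design",
    "d3": "text retrieval",
}
index = build_index(items)
```

Here exact_lookup(index, "computer") finds only d1, while embedded_lookup(index, "computer") also finds d2, at the cost of visiting every term in the dictionary.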

 Finite state automata

 Many of the hardware and software text searchers use finite state automata.
 A finite state automaton is a logical machine that is composed of
o I - a set of input symbols from the alphabet supported by the automaton
o S - a set of possible states
o P - a set of productions that define the next state based upon the current state
and input symbol
o a special state called the initial state
o a set of one or more final states from the set S

 The productions can be represented by a table with the states as the rows and the
input symbols that cause state transitions as the columns.
 Each row represents the current state, and the values in the table are the next
state given the particular input symbol.
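As a minimal sketch of the table representation, the automaton below detects the string "ab" anywhere in its input. The three states, the "other" column that stands for every symbol except a and b, and the name accepts are assumptions chosen for the example.

```python
# Rows are current states, columns are input symbols, values are next states.
TABLE = {
    0: {"a": 1, "b": 0, "other": 0},  # nothing matched yet
    1: {"a": 1, "b": 2, "other": 0},  # an "a" has just been seen
    2: {"a": 2, "b": 2, "other": 2},  # "ab" found; stay in the final state
}
INITIAL = 0
FINAL = {2}

def accepts(text: str) -> bool:
    """Run the automaton over the text and report whether it ends in a final state."""
    state = INITIAL
    for ch in text:
        symbol = ch if ch in ("a", "b") else "other"
        state = TABLE[state][symbol]
    return state in FINAL
```

For instance, accepts("xxaab") is true because the last two symbols drive the automaton from state 1 to state 2, while accepts("axb") is false because the "x" returns it to state 0.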
5.3 Hardware Text Search Systems

 Issues in Software text search systems

 Software systems are restricted in their ability to handle many search terms
simultaneously against the same text and are limited by I/O speeds.
 Hardware Text Search Systems
 A specialized hardware machine performs the searches and passes the results to
the main computer, which supports the user interface and retrieval of hits.
 Advantages of hardware text search systems
 Scalability by increasing the number of hardware search devices.
 Elimination of the index that represents the document database.
 New items can be searched as soon as received by the system rather than
waiting for the index to be created
 Search speed is deterministic.
 It is slower than using an index, but provides the user with an exact search
time.
 Architecture
 Figure represents hardware text search solutions.
 The algorithmic part of the system is focused on the term detector.
 There are three approaches to implement term detectors:
 parallel comparators or associative memory
 cellular structure
 universal finite state automata
 When the term detector is implemented with parallel comparators, each term in the
query is assigned to an individual comparison element and input data are serially
streamed into the detector.
 When a match occurs, the term detector informs the external query resolver
(usually in the main computer) by setting status flags.
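The parallel comparator arrangement can be modeled in software as follows. This is a simplified sketch: the Comparator class, its shift-register window, and the detect function are hypothetical names, and the inner loop only imitates what the hardware elements would do simultaneously.

```python
from collections import deque

class Comparator:
    """One comparison element holding a single query term."""

    def __init__(self, term: str):
        self.term = term
        self.window = deque(maxlen=len(term))  # acts as a shift register
        self.flag = False                      # status flag read by the query resolver

    def feed(self, ch: str) -> None:
        """Shift in one character and set the flag if the window equals the term."""
        self.window.append(ch)
        if "".join(self.window) == self.term:
            self.flag = True

def detect(stream: str, terms: list) -> dict:
    """Stream the text serially past all comparators and return their status flags."""
    comparators = [Comparator(t) for t in terms]
    for ch in stream:              # data arrive one character at a time
        for c in comparators:      # in hardware these run in parallel
            c.feed(ch)
    return {c.term: c.flag for c in comparators}

flags = detect("fast data finder", ["data", "index", "find"])
```

Each term costs one comparison element, so the number of terms that can be searched at once is bounded by the number of comparators.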

 Examples of hardware text string search units

 Rapid Search Machine by General Electric.
 In this machine, a single query was passed against a magnetic tape containing the
documents.
 Associative File Processor (AFP) by Operating Systems Inc.
 It is capable of searching against multiple queries at the same time.
 High Speed Text Search (HSTS) machine by Operating Systems Inc.
 It uses a finite state machine algorithm that runs three parallel state machines.
 One state machine is dedicated to contiguous word phrases,
 another to embedded term matches,
 and the final one to exact word matches.

 The GESCAN system

 Uses a text array processor (TAP) that simultaneously matches many terms
and conditions against a given text stream.
 The TAP receives the query information from the user’s computer and directly
accesses the textual data from secondary storage.
 The TAP consists of a large cache memory and an array of four to 128 query
processors.
 Text is loaded into the cache and searched by the query processors.
 Each query processor is independent and can be loaded at any time.
 A complete query is handled by each query processor.
 Queries support exact term matches, fixed length don’t cares, variable length
don’t cares, restriction of terms to specified zones, Boolean logic, and
proximity.
 A query processor performs two operations in parallel: matching query terms to
input text and Boolean logic resolution.
 Term matching is performed by a series of character cells each containing one
character of the query.
 A string of character cells is implemented on the same LSI chip and the chips
can be connected in series for longer strings.
 When a word or phrase of the query is matched, a signal is sent to the
resolution sub-process on the LSI chip.
 The resolution chip is responsible for resolving the Boolean logic between
terms and proximity requirements.

 If the item satisfies the query, the information is transmitted to the user’s
computer.
 The text array processor uses these chips in a matrix arrangement as shown in
Figure 9.10.
 Each row of the matrix is a query processor in which the first chip performs
the query resolution while the remaining chips match query terms.
 The maximum number of characters in a query is restricted by the length of a
row while the number of rows limits the number of simultaneous queries that
can be processed.
 Another approach for hardware searchers is to augment disk storage.
 The augmentation is a generalized associative search element placed between
the read and write heads on the disk.
 Examples
 Content Addressable Segment Sequential Memory (CASSM) system
 Developed at the University of Florida.
 Uses search elements in parallel to obtain structured data from a database.
 Can perform string searching across the database.

 Relational Associative Processor (RAP)

 Another special search machine developed at the University of Toronto.
 Performs search across a secondary storage device using a series of cells
comparing data in parallel.

 Fast Data Finder (FDF)

 Most recent specialized hardware text search unit.
 It was developed to search text and has been used to search English and
foreign languages.
 The early Fast Data Finders consisted of an array of programmable text
processing cells connected in series forming a pipeline hardware search
processor.
 The cells are implemented using a VLSI chip.
 Each chip contained 24 processor cells, with a typical system containing 3600
cells.
 Each cell is a comparator for a single character, limiting the total number
of characters in a query to the number of cells.
 The cells are interconnected with an 8-bit data path and approximately 20-bit
control path.
 The text to be searched passes through each cell in a pipeline fashion until the
complete database has been searched.
 As data is analyzed at each cell, the states of the 20 control lines are modified
depending upon their current state and the results from the comparator.
 A cell is composed of both a register cell (Rs) and a comparator (Cs).
 The input from the document database is controlled and buffered by the
microprocessor/memory and fed through the comparators.
 The search characters are stored in the registers.
 The connection between the registers reflects the control lines that are also passing
state information.

 Groups of cells are used to detect query terms, along with logic between the terms,
by appropriate programming of the control lines.
 When a pattern match is detected, a hit is passed to the internal microprocessor,
which passes it back to the host processor, allowing immediate access by the user to
the hit item.
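The idea of a chain of single-character cells passing match state down a pipeline can be approximated in software. The sketch below is a simplified model and not the actual Fast Data Finder design: the name pipeline_search is hypothetical, each list position stands for one cell, and a single Boolean stands in for the control lines between neighbouring cells.

```python
def pipeline_search(text: str, pattern: str) -> list:
    """Return the end positions in text at which the pattern is matched."""
    n = len(pattern)
    # state[i] is True when cells 0..i have matched the most recent i + 1 characters.
    state = [False] * n
    hits = []
    for pos, ch in enumerate(text):
        new_state = [False] * n
        for i in range(n):
            # Cell 0 is always armed; later cells depend on their left neighbour.
            armed = True if i == 0 else state[i - 1]
            new_state[i] = armed and ch == pattern[i]
        state = new_state
        if state[-1]:              # the last cell signals a complete match
            hits.append(pos)
    return hits
```

As in the hardware, the longest pattern that can be searched is limited by the number of cells, here the length of the state list.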

 The functions supported by the Fast Data Finder are:

 Boolean logic, including negation
 Proximity on an arbitrary pattern
 Variable length “don’t cares”
 Term counting and thresholds
 Fuzzy matching
 Term weights
 Numeric ranges
Multimedia Information Retrieval

 Text elements that are used for indexing are

o Characters
o word stems
o words
o Phrases.
 Imagery, audio, and video elements are
o In audio: Phonemes (or basic units of sound) and their properties (e.g.,
loudness, pitch),
o In imagery: color, shape, texture, and location
o In video: imagery and audio elements, camera position and movement.
 The number of users demanding content-based access to materials is increasing, and
approximately 10 million sites are on the World Wide Web.

5.4 Spoken Language Audio Retrieval

 The ability to search the content of audio sources such as speeches, radio broadcasts,
and conversations would be valuable for a range of applications.
 Techniques developed are
o automated recognition of speech
 application areas are
 speaker verification
 transcription
 command and control
Evaluation, Issues, and findings
 Speech and text retrieval in the context of the Video Mail Retrieval (VMR) project by
Jones et al.
o speech transcription word error rates may be high
o Redundancy in the source material helps offset these error rates and still
support effective retrieval.
o Speaker-dependent techniques retain approximately 95% of the performance of
retrieval of text transcripts.
o Speaker-independent techniques retain about 75%.
o System scalability remains a significant challenge.
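Word error rate is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of reference words. A minimal sketch follows; the function name word_error_rate and the sample sentences are illustrative, and a non-empty reference is assumed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # reference word deleted
                cur[j - 1] + 1,          # extra word inserted
                prev[j - 1] + (r != h),  # substitution, or free if the words match
            ))
        prev = cur
    return prev[-1] / len(ref)

wer = word_error_rate("find the video mail", "find a video mail")
```

In this example one of the four reference words is substituted, giving a word error rate of 0.25.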

 BBN’s Rough ’n’ Ready prototype

o Provides information access to spoken language from audio and video sources.
o It creates a rough summarization of speech that is ready for browsing.
o Its transcription is created by the BYBLOS™ large vocabulary speech
recognition system.
o BYBLOS is a continuous-density Hidden Markov Model (HMM) system that has been
tested in annual formal evaluations for the past 12 years.
o BYBLOS runs at 3 times real-time, uses a 60,000 word dictionary, and
has reported word error rates of 18.8% for the broadcast news transcription task.
o It also addresses multilingual information access.

 Tokyo Institute of Technology and NHK broadcasting

o It addresses transcription and topic extraction from Japanese broadcast news.
o It improves processing by modeling filled pauses, performing on-line
incremental speaker adaptation, and using a context-dependent language
model.
o The language model includes Chinese characters and two kinds of Japanese
characters.
5.5 Non-Speech Audio Retrieval

 In addition to content-based access to speech audio, noise/sound retrieval is also
important in such fields as music and movie/video production.

 SoundFisher
o It is a user-extensible sound classification and retrieval system.
o It draws on several disciplines, including signal processing,
psychoacoustics, speech recognition, computer music, and multimedia
databases.
o Just as image indexing algorithms use visual feature vectors to index and match
images, a vector of directly measurable acoustic features such as duration,
loudness, pitch, and brightness is used to index sounds.
o This enables users to search for sounds within specified feature ranges.
o Content-based retrieval application enables a user to browse and/or query a
sound database by acoustic (e.g., pitch, duration) and/or perceptual properties
(e.g., “scratchy”) and/or query by example.
o For example, SoundFisher supports complex content queries such as “Find all
AIFF encoded files with animal or human vocal sounds that are similar to
barking sounds without regard to duration or amplitude.”
o The user can also perform a weighted query-by-value
o For example, foreground and transition with >.8 metallic and >.7 plucked
aural properties and 2000 hz < average pitch < 300 hz and duration.
o The system can also be trained by example, so that perceptual properties (e.g.,
“scratchiness” or “buzziness”) that are more indirectly related to acoustic
features can be specified and retrieved.

o Additional requirements identified by research are

 need for sound displays
 sound synthesis (a kind of query formulation/refinement tool)
 sound separation
 matching of trajectories of features over time
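Searching for sounds within specified feature ranges can be illustrated with a small sketch. It is not SoundFisher's implementation: the Sound record, its three features, the sample library, and the functions in_ranges and search are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Sound:
    name: str
    duration: float  # seconds
    loudness: float  # normalized to 0..1 (an assumption for this example)
    pitch: float     # average pitch in Hz

def in_ranges(sound: Sound, ranges: dict) -> bool:
    """True if every constrained feature lies inside its (low, high) range."""
    return all(low <= getattr(sound, feature) <= high
               for feature, (low, high) in ranges.items())

def search(sounds: list, ranges: dict) -> list:
    """Return the names of the sounds that satisfy all feature ranges."""
    return [s.name for s in sounds if in_ranges(s, ranges)]

library = [
    Sound("bark", 0.4, 0.8, 450.0),
    Sound("hum", 3.0, 0.2, 120.0),
    Sound("chirp", 0.2, 0.5, 2500.0),
]
result = search(library, {"pitch": (300.0, 2000.0), "duration": (0.0, 1.0)})
```

Features left out of the ranges dictionary are unconstrained, which corresponds to a query made "without regard to" those properties.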
5.6 Graph Retrieval

 Another important media class is graphics, including tables and charts (e.g., column,
bar, line, pie, scatter).
 Graphs are constructed from more primitive data elements such as points, lines, and
labels.

Sagebook
 An example of a graph retrieval system created at Carnegie Mellon University.
 Enables both search and customization of stored data graphics.
 Supports data graphic query, representation (content description), indexing, search,
and adaptation capabilities.
o Queries are formulated via a graphical direct-manipulation interface by
 selecting and arranging spaces (e.g., charts, tables),
 objects contained within those spaces (e.g., marks, bars)
 object properties (e.g., color, size, shape, position).
o Relevant graphics stored in a library are retrieved by matching the content and/or
properties of the graphical query.
o Both exact matching and similarity-based matching are performed.
 Maintains an internal representation of the syntax and semantics of data-graphics
(spatial relationships between objects, relationships between data domains, and the
various graphic and data attributes)
 Search is performed both on graphical and data properties to enable varying degrees
of match relaxation.
 Provides automatic adaptation techniques that can modify retrieved graphics that
do not match the specified query.
 The ability to retrieve graphics by content enables new capabilities.
5.7 Imagery Retrieval

 Increasing volumes of images have raised the need for more effective and efficient
imagery access.
 There are needs for indexing and search of not only the metadata (e.g., captions,
annotations) associated with the imagery but also retrieval directly on the content of
the imagery.
 The automatic indexing of visual features of imagery (e.g., color, texture, shape) can be used
to retrieve similar images without the burden of manual indexing.
 However, the ultimate objective is semantic based access to imagery.

Query By Image Content (QBIC) system

 UltimediaManager, the commercial version of QBIC, represents the imagery attribute
indexing approach.
 It provides access to imagery collections on the basis of visual properties such as color, shape,
texture, and sketches.
 Query facilities for specifying color parameters, drawing desired shapes, or selecting
textures replace the traditional keyword query found in text retrieval.
 As robust, domain independent object identification remains difficult and manual
image annotation is tedious, automated and semi automated object outlining tools
(e.g., foreground/background models to extract objects) were developed to facilitate
database population.
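One common way to compare images by color is histogram intersection over quantized color values. The sketch below illustrates that general technique and is not a description of how QBIC itself works; the function names, the four bins per channel, and the tiny pixel lists are assumptions, and images are assumed to be non-empty.

```python
def color_histogram(pixels: list, bins: int = 4) -> dict:
    """Normalized histogram over quantized (r, g, b) values in the range 0..255."""
    step = 256 // bins
    hist = {}
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        hist[key] = hist.get(key, 0) + 1
    total = len(pixels)
    return {k: v / total for k, v in hist.items()}

def intersection(h1: dict, h2: dict) -> float:
    """Histogram intersection: 1.0 for identical color distributions, 0.0 for disjoint ones."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())

# Tiny hypothetical "images" given as lists of RGB pixels.
red = [(250, 10, 10)] * 4
mostly_red = [(240, 20, 5)] * 3 + [(10, 10, 250)]
blue = [(5, 5, 240)] * 4

query = color_histogram(red)
candidates = [("blue", blue), ("mostly_red", mostly_red)]
ranked = sorted(candidates,
                key=lambda item: intersection(query, color_histogram(item[1])),
                reverse=True)
```

Ranking the candidates by intersection with the query histogram places mostly_red ahead of blue, which is the behaviour a color query would be expected to give.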

Content based imagery access to video retrieval.

 More recently researchers have investigated the application of content based imagery
access to video retrieval.
 For example, shots are detected, a representative frame (r-frame or
keyframe) is extracted for each shot, and a layered representation of moving
objects is derived.
 This enables queries such as “find me all shots panning left to right” which yield a list
of relevancy ranked r-frames (which act as thumbnails), selection of which retrieves
the associated video shot.
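A very simple form of shot detection compares consecutive frames and declares a cut when they differ sharply. The sketch below assumes each frame is a list of grey-level intensities, uses a mean absolute difference with an arbitrary threshold, and picks the middle frame of each shot as its r-frame; all of these choices and names are assumptions for illustration, and practical systems use more robust measures.

```python
def detect_shots(frames: list, threshold: float = 50.0) -> list:
    """Return (start, end) frame index pairs, one per shot, with end inclusive."""
    shots, start = [], 0
    for i in range(1, len(frames)):
        # Mean absolute intensity difference between consecutive frames.
        diff = sum(abs(a - b) for a, b in zip(frames[i - 1], frames[i])) / len(frames[i])
        if diff > threshold:       # large change: treat it as a cut
            shots.append((start, i - 1))
            start = i
    if frames:
        shots.append((start, len(frames) - 1))
    return shots

def keyframes(shots: list) -> list:
    """Choose the middle frame of each shot as its representative frame."""
    return [(start + end) // 2 for start, end in shots]

frames = [
    [10, 10, 10], [12, 11, 10],          # dark shot
    [200, 210, 205], [198, 209, 204],    # bright shot
    [20, 15, 10],                        # dark again
]
shots = detect_shots(frames)
```

The resulting keyframe indices could then be indexed with image features so that selecting a thumbnail retrieves the corresponding range of frames.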
 Additional research in image processing
o has addressed specific kinds of content-based retrieval problems.
o One example is face processing, which distinguishes
 face detection (identifying a face or faces in a scene),
 face recognition (authenticating that a given face is of a particular
person),
 face retrieval (finding the closest matching face in a repository).

 Systems have also been developed

o to track human movement (e.g., heads, hands, feet)
o to differentiate human expressions such as a smile, surprise, anger, or disgust.
o to support research in emotion recognition in the context of human-computer
interaction.

 Informedia Digital Video Library system

o Face recognition is also important in video retrieval.
o This system extracts information from audio and video and supports full
content search over digitized video sources.
o It provides a facility called named face which automatically associates a name
with a face and enables the user to search for a face given a name and vice
versa.
5.8 Video Retrieval

 Content-based access to video supports applications such as

o access to video mail
o videotaped meetings
o surveillance video
o broadcast television
 Broadcast News Navigator (BNN) system
o It is a web-based tool that automatically captures, annotates, segments,
summarizes and visualizes stories from broadcast news video.
o Integrates text, speech, and image processing technologies to perform
multistream analysis of video to support content-based search and retrieval.
o Addresses the problem of time-consuming, manual video acquisition and
annotation techniques that frequently result in inconsistent, error-prone, or
incomplete video catalogues.
o From BNN’s video query page, the user can
 search among thirty national or local news sources,
 specify an absolute or relative date range,
 search closed captions or speech transcriptions,
 run a pre-specified profile,
 search on text keywords or concepts that express topics or named
entities such as people, organizations, and locations.
o BNN automatically generates a custom query web page which includes menus
of people and location names from content extracted over the relevant time
period.
o It incorporates the Alembic natural language information extraction system.
o Supports simple browsing of stories during particular time intervals or from
particular sources.
o Ability to display a graph of named entity frequency over time.
o User can automatically data mine the named entities in the analyzed stories
using the “search for correlations” link shown on the left panel.
o Users are able to find video content about six times as fast.
o Automated segmentation of news programs into individual stories using cross
media cues such as visual changes, speaker changes, and topic changes
enhanced the performance.

 Topic detection and tracking (TDT)

o Topic detection and tracking initiative for broadcast news and newswire
sources aims to investigate algorithms that perform
 story segmentation (detection of story boundaries)
 topic tracking (detection of stories that discuss a topic, for each given
target topic)
 topic detection (detection of stories that discuss an arbitrary topic, for
all topics).
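Topic tracking can be illustrated with a simple vector-space sketch: the stories already known to discuss a topic are summed into a term-frequency centroid, and an incoming story is assigned to the topic when its cosine similarity to the centroid is high enough. This is one basic approach and not the method of any particular TDT system; the function names, the 0.3 threshold, and the sample stories are assumptions.

```python
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    """Term-frequency vector of a story."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def track(topic_stories: list, incoming: list, threshold: float = 0.3) -> list:
    """Return the incoming stories that are similar enough to the topic centroid."""
    centroid = Counter()
    for story in topic_stories:
        centroid.update(vectorize(story))
    return [s for s in incoming if cosine(centroid, vectorize(s)) >= threshold]

topic = ["earthquake hits coastal city", "rescue teams search earthquake rubble"]
incoming = ["earthquake rescue continues in city", "stock market closes higher"]
on_topic = track(topic, incoming)
```

Topic detection differs in that no target topic is given in advance, so stories would instead be clustered, with a new cluster started whenever a story is not close to any existing centroid.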

 Geospatial News on Demand Environment (GeoNODE)

o Whereas BNN focuses on story segmentation, GeoNODE addresses topic
detection and tracking.
o It presents news in a geospatial and temporal context.
o An analyst can navigate the information space through indexed access into
multiple types of information sources (from broadcast video, on-line
newspapers to specialist archives)
o It incorporates information extraction, data mining/correlation and
visualization components.
o GeoNODE can automatically nominate and animate topics from
sources, thereby directing analysis to the relevant documents having the right
topic, the right time, and the right place.
