Search and Resource Discovery Paradigms

Two goals of e-commerce are to increase the availability and accessibility of information through search tools. Search tools utilize computer processing power to improve decision making without increasing time/effort. Effective ways of navigating, searching and retrieving online information are important for both customers and organizations. Common methods of searching include information retrieval, electronic directories/catalogs, and information filtering.

Uploaded by

rachana sai
Copyright
© All Rights Reserved

E-COMMERCE

MODULE 4

CONSUMER SEARCH AND RESOURCE DISCOVERY

 The two fundamental goals of electronic commerce are to increase the availability and
accessibility of useful information.
 Availability is accomplished through improved publishing tools that provide ready
access to large amounts of product information.
 Accessibility is enhanced through search and retrieval tools.
 The goal of search tools is to utilize the information processing power of the computer
to improve decision making without increasing the time and effort in making choices.
 Hence, designing flexible ways of navigating, searching and retrieving information
from on-line databases is important. This need affects both individual customers and
organizations.
 In consumer-oriented E-commerce, consumers search on-line stores for the best
product in terms of price or functional characteristics.
 In organizations, search is a process through which an organization adapts to changes
in its external environment such as new suppliers, new products and new services.

SEARCH AND RESOURCE DISCOVERY PARADIGMS

Three information search and resource discovery paradigms are in use:

 Information search and retrieval.
 Electronic directories and catalogs.
 Information filtering.

Information search and retrieval:

 Search and retrieval begins when a user provides a description of the information
being sought to an automated discovery system.
 Using the knowledge of the environment, the system attempts to locate the
information that matches the given description.
 An information retrieval method depends on the libraries.
 The challenge is to develop user-friendly search methods in domains such as electronic shopping.

DEPT OF CSE, GITAM 1



 Search and retrieval methods can refine queries through various computing
techniques, such as nearest neighbors, that generate variants of the original query.

Electronic catalogs and directories:

 Information organizing and browsing is accomplished using directories or catalogs.


 Organizing refers to the human-guided process of deciding how to interrelate
information, usually by placing it into some sort of a hierarchy.
 Browsing refers to the corresponding human-guided activity of exploring the
organization and contents of a resource space.
 Maintaining a well-organized database when large amounts of data are continuously
changing is difficult.
 Information browsing depends heavily on the quality and relevance of the
organization.

Information filtering:

 The goal of information filtering is to select data that is relevant, manageable and
understandable.
 Filters are of two types:
1. Local filters
2. Remote filters
 Local filters work on incoming data to a PC, such as news feeds.
 Remote filters are often software agents that work on behalf of the user and roam
around the network from one database to another.

INFORMATION SEARCH AND RETRIEVAL

 Information search is sifting through large volumes of information to find some
target information.
 Search and retrieval systems are designed for dealing with unstructured and semi-
structured data, in contrast to database applications involving only very structured
data, such as employee records.
 E-mail messages are an example of semi-structured data in that they have well-
defined header fields and an unstructured text body.


 The process of searching for text strings in a large collection of documents can be
divided into two phases: the end-user retrieval phase and the publisher indexing phase.
 The end-user retrieval phase consists of three steps that the user performs during the
text search.
1. First, the user formulates a query, specifying in some way the material for which
the text database is to be searched.
2. Second, the server interprets the user’s query, performs the search, and returns to
the user a list of documents meeting the search criteria. The list of matching
documents returned to the user is generally called a hit list.
3. Third, the user selects documents from the hit list and browses them, reading and
perhaps printing selected portions of retrieved data.
To illustrate, if the user specifies a query to find all documents containing the
string “Electronic commerce”, the system would apply a string-matching
algorithm to all the documents to search for the string. The result might be the
retrieval of a large number of documents. To reduce the number of documents retrieved,
some systems allow users to specify the number of documents that they would
like to see in any one search, typically based on the location of the data, with
limited per-item or per-location searching facilities. In short, the goal for the user
is to obtain a limited set of information from an on-line source to meet some need
or solve some problem.
 The publisher indexing phase consists of entering documents into the system and
creating indexes and pointers to facilitate subsequent searches.
 The process of loading documents into the system and updating indexes is normally
not a concern to the user.
 These two phases are highly interdependent.
 Searching can be comprehensive throughout the archive (for example, WAIS servers
provide full-text indexes) or limited to certain keywords.
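
The end-user retrieval phase described above can be sketched in Python. This is a minimal illustration that assumes a tiny in-memory collection and plain substring matching. The names `DOCUMENTS`, `search` and `max_hits` are hypothetical and do not come from any system mentioned in these notes.

```python
# Hypothetical in-memory document collection (document id -> text).
DOCUMENTS = {
    "doc1": "Electronic commerce changes how consumers search for products.",
    "doc2": "Directories organize information into a hierarchy.",
    "doc3": "Search tools support electronic commerce on the Internet.",
}


def search(query, documents, max_hits=None):
    """Return a hit list: ids of documents whose text contains the query string."""
    needle = query.lower()
    hit_list = [doc_id for doc_id, text in sorted(documents.items())
                if needle in text.lower()]
    # Optionally cap the number of documents returned to the user.
    return hit_list if max_hits is None else hit_list[:max_hits]


hits = search("Electronic commerce", DOCUMENTS)
limited = search("Electronic commerce", DOCUMENTS, max_hits=1)
```

Here `hits` plays the role of the hit list, and `max_hits` stands in for the option that lets a user limit how many documents come back from one search.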

Wide Area Information Server (WAIS) Engine

 It enables users to search the contents of the files for any string of text that they
supply.


 It provides an English-language query front end to a large assortment of databases that
contain text-based documents.
 It allows users to search the full text of all the documents on the server.
 Users on different platforms can access personal, company, and published information
from one interface, whether that information is text, pictures, voice, or formatted documents.
 Anyone can use this system because it uses natural language questions to find relevant
documents.
 Relevant documents can be fed back to a server to refine the search.
 The servers take a user’s question and do their best to find relevant documents.
 The WAIS server returns a list of documents that contain the specified phrases and
keywords.
 Today, the Netscape or NCSA Mosaic browser with the forms capability is often used
as a front end to talk to WAIS servers.
 WAIS has three elements: a client, a server and an indexer.
 First, the indexer takes a list of files the publisher wants to index and generates from it
several index files.
 These indexes include a directory of all words appearing in the database and a list of
documents and files that constitute the database.
 With the index created, the publisher must tell the rest of the world about it. The publisher
does this by running WAIS with a register option, which places this
index next to the hundreds of WAIS indexes already available on the internet.
 WAIS solves a number of problems from the user’s perspective.
1. It allows users to identify and select information from large databases.
2. It provides heterogeneous database access, as published databases may be on a
variety of different systems and the user need not know how to use each system.
3. It provides ways to download and organize the retrieved data so that users are not
overwhelmed.
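
The division of work between an indexer and a server can be illustrated with a short Python sketch. It is not the actual WAIS software or protocol. The file names, the functions `build_index` and `answer`, and the rule of ranking documents by the number of question words they contain are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(files):
    """Indexer: map each word to the set of documents that contain it."""
    index = {}
    for name, text in files.items():
        for word in tokenize(text):
            index.setdefault(word, set()).add(name)
    return index


def answer(index, question):
    """Server: rank documents by how many distinct question words they contain."""
    scores = Counter()
    for word in set(tokenize(question)):
        for name in index.get(word, ()):
            scores[name] += 1
    # Highest score first; ties broken by document name for a stable order.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Hypothetical files a publisher wants to index.
FILES = {
    "prices.txt": "laptop prices and printer prices",
    "support.txt": "printer support and repair",
    "news.txt": "company news",
}

index = build_index(FILES)
ranked = answer(index, "Where can I find printer repair?")
```

Words in the question that appear in no document (such as "where") simply contribute nothing, which is one simple way a system can accept a natural-language question without parsing it.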

Search Engines

 WAIS is a sophisticated search engine.


 The purpose of a search engine in any indexing system is simple: to find every item
that matches a query, no matter where it is located in the file system.
 Search engines are now being designed to go beyond the simple, broad-based searches for
which WAIS is so popular.
 One of the more popular approaches is used by Topic, a search engine used in Lotus
Notes, Adobe Acrobat and a variety of other products.
 It uses both keywords and information searching to rank the relevance of each
document.
 A different approach is offered by context-based searching. These tools let the user
enter a query and then come up with the relevant data based on the context of the
documents themselves.
 Other approaches to data searching on the Web or on other wide-area networks are
available.
 The most compelling is Oracle’s Context, which can go through a variety of
documents and create its own summary, pulling about three key sentences from each
document it selects.

Indexing methods:

 To accomplish accuracy and conserve disk space, two types of indexing methods are
used by search engines. They are:
1. File-level indexing
2. Word-level indexing
1. File-level indexing:
 It associates each indexed word with a list of all files in which that word appears at
least once.
 It does not carry any information about the location of words within the file.
2. Word-level indexing:
 It is more sophisticated and stores the location of every instance of a word.
 These indexes enable users to search for complete phrases or words that are in close
proximity.


 The disadvantage of word-level indexes is that all the extra information they
contain gobbles up a lot of disk space, anywhere from 35 to 100 percent of the
size of the original text.
 The process of indexing data is a simple one, and a large number of indexing
packages are available.
 These indexing packages fall into three types:
1. The client-server method: It is based on the distributed approach in which the
document database and the text search and retrieval software reside on a central
server and the data representation and user-interface software reside on the
user’s workstation. In this approach, the index file can be split into pieces
corresponding to work groups and maintained on separate servers.
2. The mainframe-based approach: It is generally more expensive and less flexible
than the previous architecture, but it provides for large amounts of storage, fast
response time, and standard data management and configuration control. The
mainframe may also handle query and display formatting, enabling searches to
be conducted from non-intelligent character-based terminals.
3. The parallel-processing approach: It allows many processing units to conduct
searches simultaneously. The file to be searched is broken up into many pieces,
and each processor searches its segment of the index file. The processors may or
may not share memory and storage. The results are merged before being
presented to the user.
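
The difference between the file-level and word-level indexes described in this section can be sketched in Python. This is a simplified illustration, assuming whitespace-separated words and a small in-memory set of files. The function names and the sample files are hypothetical.

```python
def build_file_level_index(files):
    """Map each word to the set of files in which it appears at least once."""
    index = {}
    for name, text in files.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index


def build_word_level_index(files):
    """Map each word to {file: [positions]}, recording every occurrence."""
    index = {}
    for name, text in files.items():
        for position, word in enumerate(text.lower().split()):
            index.setdefault(word, {}).setdefault(name, []).append(position)
    return index


def phrase_search(word_index, phrase):
    """Return files in which the words of `phrase` occur consecutively."""
    words = phrase.lower().split()
    if not words:
        return []
    matches = []
    for name, starts in word_index.get(words[0], {}).items():
        for start in starts:
            # Each following word must sit exactly one position further on.
            if all(start + offset in word_index.get(word, {}).get(name, [])
                   for offset, word in enumerate(words[1:], start=1)):
                matches.append(name)
                break
    return sorted(matches)


FILES = {
    "a.txt": "electronic commerce needs search tools",
    "b.txt": "commerce became electronic",
}

file_index = build_file_level_index(FILES)
word_index = build_word_level_index(FILES)
```

Both files contain the words "electronic" and "commerce", so the file-level index cannot tell them apart. Only the word-level index, which stores positions, lets `phrase_search` find the exact phrase, at the cost of storing far more data.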

Search and new data types

 Over the past few years, new technologies have become incorporated into systems
that provide additional possibilities for, but also challenges to, effective search.
 These technologies include the following:
 Hypertext: Richly interwoven links among items in displays allow users to
move in relatively ad hoc sequences from display to display within multimedia
database applications.
 Sound: Speech input and output, music and a wide variety of acoustic cues,
including realistic sounds, supplement and replace visual communication.


 Video: Analog or digital video input from multiple media, including video
tapes, CD-ROMs, built-in broadcast video tuners, cable and satellite,
provides video imagery that supplements and replaces computer-generated
graphics.
 3D-images: Virtual reality displays offer a 3D environment in which all
portions of the user interface are 3D.

WWW Robots, Wanderers and Spiders

 Robots, wanderers and spiders are all programs that traverse the WWW
automatically, gathering information.
 For E-commerce, agent-based resource discovery is becoming increasingly important
as the number of sellers increases.
 A resource discovery program might fill out a form, or supply a user name and
password, to access the data of interest.
 A software agent views the World Wide Web as a graph.
 It starts at a set of nodes (HTML pages) and traverses the hypertext links in these nodes to a
certain depth, beginning at a URL passed as an argument.
 Only URLs having “.html” suffixes, or tagged as “http:” and ending in a slash, are
probed.
 This method results in a limited-depth breadth-first traversal of only HTML portions
of the web.
 However, because of time constraints and the heterogeneity of the information and of the
repositories, multiple software agents are required to perform exhaustive searches.
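
A limited-depth breadth-first traversal of this kind can be sketched in Python. To keep the example self-contained, it walks a small hypothetical link graph (`WEB`) held in memory rather than fetching real pages. The URLs, the function names and the exact suffix test in `looks_like_html` are assumptions made for illustration.

```python
from collections import deque

# Hypothetical link graph: page URL -> URLs it links to.
WEB = {
    "http://shop.example/": ["http://shop.example/a.html",
                             "http://shop.example/logo.gif"],
    "http://shop.example/a.html": ["http://shop.example/b.html"],
    "http://shop.example/b.html": ["http://shop.example/c.html"],
    "http://shop.example/c.html": [],
}


def looks_like_html(url):
    """Probe only URLs that end in '.html' or in a slash."""
    return url.endswith(".html") or url.endswith("/")


def crawl(start_url, max_depth, links=WEB):
    """Breadth-first traversal from start_url, following links up to max_depth."""
    visited = [start_url]
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth == max_depth:
            continue  # Do not follow links beyond the depth limit.
        for target in links.get(url, []):
            if looks_like_html(target) and target not in visited:
                visited.append(target)
                queue.append((target, depth + 1))
    return visited


pages = crawl("http://shop.example/", max_depth=2)
```

With `max_depth=2` the agent reaches `a.html` and `b.html` but stops before `c.html`, and it never probes the image link because that link does not look like an HTML page.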

ELECTRONIC COMMERCE CATALOGS OR DIRECTORIES

 A directory performs an essential support function that guides customers through a maze of
options by enabling the organization of the information space.
 Directories are of two types:
1. White pages
2. Yellow pages


 The white pages are used to locate people or institutions, and the yellow pages are used by
consumers and organizations to locate businesses by product or service.

Electronic white pages:

 Analogous to the telephone white pages, the electronic white pages provide services
ranging from a static listing of e-mail addresses to directory assistance.
 White pages directories, also found within organizations, are integral to work
efficiency.
 The problems facing organizations are similar to the problems facing individuals.
 A white pages schema is a data model, specifically a logical schema, for organizing
the data contained in entries in a directory service, database, or application, such as an
address book.
 A white pages schema typically defines, for each real-world object being represented:
 What attributes of that object are to be represented in the entry for that object?
 What relationships of that object to other objects are to be represented?
 One of the earliest attempts to standardize a white pages schema for electronic mail
use was in X.520 and X.521, part of the X.500 specification, which was derived from
the addressing requirements of X.400.
 In a white pages directory, each entry typically represents an individual person who
makes use of network resources, such as by receiving email or having an account
to log into a system.
 In some environments, the schema may also include the representation of
organizational divisions, roles, groups, and devices.
 The term is derived from the white pages, the listing of individuals in a telephone
directory, typically sorted by the individual's home location (e.g. city) and then by
their name.
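
The two questions a white pages schema answers (which attributes an entry has, and how it relates to other objects) can be illustrated with a small Python sketch. The class and attribute names below are hypothetical and are not taken from X.520 or X.521.

```python
from dataclasses import dataclass, field


@dataclass
class Organization:
    """An organizational division that people can belong to."""
    name: str
    locality: str


@dataclass
class Person:
    """A white pages entry for one individual."""
    common_name: str
    surname: str
    mail: str
    telephone_number: str = ""
    # Relationship to other objects: the divisions this person belongs to.
    member_of: list = field(default_factory=list)


sales = Organization(name="Sales", locality="Chicago")
entry = Person(common_name="Ada Lopez", surname="Lopez",
               mail="ada.lopez@example.com", member_of=[sales])
```

The plain fields of `Person` are its attributes, while `member_of` records a relationship to another kind of object in the same directory.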

White pages through X.500:

 One of the first goals of the X.500 project has been to create a directory for keeping
track of individual electronic mail addresses on the internet.
 X.500 offers the following features:
 Decentralized maintenance: Each site running X.500 is responsible only for its local
part of the directory.
 Searching capabilities: X.500 provides powerful searching capabilities. For example, in the white
pages you can search solely for users in one country. From there you can view a list
of organizations, then departments, then individual names. This drill-down reflects the
tree structure of the directory.
 Single global name space: X.500 provides a single name space to users.
 Structured information framework: X.500 defines the information framework used in
the directory, allowing local extensions.
 Standards-based directory: X.500 can be used to build directory applications that
require distributed information.
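
The country, organization, department and name drill-down can be pictured as a walk down a tree. The sketch below uses a plain nested Python dictionary to stand in for the directory. It does not implement the X.500 protocols, and the entries and function names are hypothetical.

```python
# Hypothetical directory tree: country -> organization -> department -> people.
DIRECTORY = {
    "US": {
        "Example Corp": {
            "Sales": {"Ada Lopez": "ada.lopez@example.com"},
            "Support": {"Ben Ito": "ben.ito@example.com"},
        },
    },
    "IN": {
        "Example University": {
            "CSE": {"Chitra Rao": "chitra.rao@example.edu"},
        },
    },
}


def list_children(tree, *path):
    """List the names found directly below the node reached by `path`."""
    node = tree
    for name in path:
        node = node[name]
    return sorted(node)


def find_in_country(tree, country, person):
    """Search for a person by name, restricted to one country's subtree."""
    results = []
    for org, departments in tree.get(country, {}).items():
        for dept, people in departments.items():
            if person in people:
                results.append((country, org, dept, person, people[person]))
    return results
```

For example, `list_children(DIRECTORY, "US", "Example Corp")` lists the departments of one organization, and `find_in_country(DIRECTORY, "IN", "Chitra Rao")` searches only the subtree for one country.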

ELECTRONIC YELLOW PAGES:

 The term Yellow Pages refers to a telephone directory of businesses, categorized
according to the product or service provided.
 The traditional term Yellow Pages is now also applied to online directories of
businesses.
 To avoid the increasing cost of yellow paper, the yellow background of the pages is
currently printed on white paper using ink. Yellow paper is no longer used.
 The name and concept of "Yellow Pages" came about in 1883, when a printer in
Cheyenne, Wyoming working on a regular telephone directory ran out of white paper
and used yellow paper instead.
 In 1886 Reuben H. Donnelley created the first official yellow pages directory,
inventing an industry.
 Today, the expression Yellow Pages is used globally, in both English-speaking and
non-English speaking countries.
 In the US, it refers to the category, while in some other countries it is a registered
name and therefore a proper noun.
 Third-party directories can be categorized variously:
 Basic yellow pages: These are organized by human-oriented products and
services.
 Business directories: These provide extended information about companies,
such as financial health and news clippings.
 State business directories: This type of directory is useful for businesses that
operate on a state or geographic basis.
 Directories by SIC (Standard Industrial Classification): These directories are compiled by the
government.
 Manufacturers' directories: This type of directory is used if your goal is to sell your
product or service to manufacturers.
 Big-business directory: This directory lists companies with 100 or more employees.
 Metropolitan area business directory: This directory is used to develop sales and marketing
tools for specific cities.
 Credit reference directory: This directory provides credit rating codes for millions of
US companies.
 World Wide Web directory: This lists hyperlinks to the various servers
scattered around the internet.
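
The idea of a directory categorized by product or service, optionally narrowed to one state, can be sketched in a few lines of Python. The listings and function names are hypothetical and only illustrate the grouping.

```python
from collections import defaultdict

# Hypothetical listings: (business name, category, state).
LISTINGS = [
    ("Quick Print", "Printing", "WY"),
    ("Frontier Books", "Bookstores", "WY"),
    ("Lakeside Print Shop", "Printing", "IL"),
]


def build_yellow_pages(listings):
    """Group business names by the product or service category they fall under."""
    pages = defaultdict(list)
    for name, category, _state in listings:
        pages[category].append(name)
    return {category: sorted(names) for category, names in pages.items()}


def state_directory(listings, state):
    """Return a yellow pages view limited to businesses in one state."""
    return build_yellow_pages([row for row in listings if row[2] == state])


yellow_pages = build_yellow_pages(LISTINGS)
```

A basic yellow pages lookup is then `yellow_pages["Printing"]`, while `state_directory(LISTINGS, "WY")` corresponds to a state business directory built from the same listings.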

INFORMATION FILTERING

 An information filtering system is a system that removes redundant or unwanted
information from an information stream using (semi)automated or computerized
methods prior to presentation to a human user.
 Its main goal is the management of information overload and an increase in the
semantic signal-to-noise ratio. To do this, the user's profile is compared to some
reference characteristics.
 A notable application can be found in the field of email spam filters.
 Thus, it is not only the information explosion that necessitates some form of filters,
but also inadvertently or maliciously introduced pseudo-information.
 On the presentation level, information filtering takes the form of user-preferences-
based newsfeeds, etc.
 Recommender systems are active information filtering systems that attempt to present
to the user information items (movies, music, books, news, web pages) that the user is
likely to be interested in.
 Information filtering describes a variety of processes involving the delivery of
information to people who need it.


 This technology is needed because of the rapid accumulation of information in electronic
databases.
 Information filtering is needed in e-mail, multimedia distributed systems and
electronic office documents.

The features of the information filtering are:

 Filtering systems involve large amounts of data (gigabytes of text).
 Filtering typically involves streams of incoming data, either broadcast by
remote sources or sent directly by other sources, such as e-mail.
 Filtering has also been used to describe the process of accessing and retrieving
information from remote databases.
 Filtering is based on descriptions of individual or group information preferences,
often called profiles.
 Filtering systems deal primarily with textual information.
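
Filtering an incoming stream against a profile can be sketched as follows. This is a minimal Python illustration, assuming that a profile is just a set of keywords and that each incoming item is a short piece of text. Real filtering systems use richer profiles, and all names here are hypothetical.

```python
def matches_profile(item, profile):
    """Return True if the item's text mentions any keyword in the profile."""
    words = set(item.lower().split())
    return any(keyword in words for keyword in profile)


def filter_stream(stream, profile):
    """Yield only the incoming items that match the user's profile."""
    for item in stream:
        if matches_profile(item, profile):
            yield item


# Hypothetical profile and incoming stream of short text items.
PROFILE = {"commerce", "catalog"}
STREAM = [
    "new electronic commerce standard announced",
    "weather update for the weekend",
    "online catalog adds product search",
]

relevant = list(filter_stream(STREAM, PROFILE))
```

Because `filter_stream` is a generator, it handles items one at a time as they arrive, which suits a stream of incoming data better than a search over a stored collection.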

Email filtering:

 It is the processing of e-mail to organize it according to specified criteria.


 Most often this refers to the automatic processing of incoming messages, but the
term also applies to the intervention of human intelligence in addition to anti-spam
techniques, and to outgoing emails as well as those being received.
 Email filtering software takes e-mail as its input.
 For its output, it might pass the message through unchanged for delivery to the
user's mailbox, redirect the message for delivery elsewhere, or even throw the
message away.
 Some mail filters are able to edit messages during processing.
 Common uses for mail filters include removal of spam and of computer viruses.
 A less common use is inspecting outgoing e-mail at some companies to ensure
that employees comply with appropriate laws.
 Users might also employ a mail filter to prioritize messages, and to sort them into
folders based on subject matter or other criteria.
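
The possible outputs of a mail filter (deliver unchanged, file into a folder, redirect elsewhere, or discard) can be sketched as an ordered list of rules. This is an illustrative Python example rather than the behavior of any particular mail program. The rules, addresses and the function `filter_message` are hypothetical.

```python
def filter_message(message, rules, default=("deliver", "inbox")):
    """Return the (action, target) of the first rule the message matches."""
    for predicate, action, target in rules:
        if predicate(message):
            return action, target
    return default


# Hypothetical rules, checked in order: (test, action, target).
RULES = [
    (lambda m: "win a prize" in m["subject"].lower(), "discard", None),
    (lambda m: m["sender"].endswith("@lists.example.org"), "deliver", "mailing-lists"),
    (lambda m: "invoice" in m["subject"].lower(), "redirect", "accounts@example.com"),
]

spam = {"sender": "promo@example.net", "subject": "WIN A PRIZE today"}
bill = {"sender": "vendor@example.com", "subject": "Invoice 1042"}
note = {"sender": "friend@example.com", "subject": "Lunch?"}

results = [filter_message(m, RULES) for m in (spam, bill, note)]
```

Rule order matters in this sketch: the first matching rule wins, so the spam check is placed ahead of the sorting rules, and a message that matches nothing is passed through to the inbox.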

Mail-filtering agents:


 Users of mail-filtering agents can instruct them to watch for items of interest in
e-mail in-boxes, on-line news services, electronic discussion forums, and the like.
 The mail agent will pull the relevant information and put it in the user's
personalized newspaper at predetermined intervals.
 An example is Apple’s AppleSearch software.
 Mail filters can be installed by the user, either as separate programs or as part of their e-mail
program (e-mail client).
 In e-mail programs, users can make personal, "manual" filters that then
automatically filter mail according to the chosen criteria.
 Most e-mail programs now also have an automatic spam filtering function.
 Internet service providers can also install mail filters in their mail transfer agents
as a service to all of their customers. Corporations often use them to protect their
employees and their information technology assets.

News-filtering agents:

 These deliver real-time on-line news.


 Users can indicate topics of interest, and the agent will alert them to news stories
on those topics as they appear on the newswire.
 Users can also create personalized news clipping reports by selecting from news
services.
 Consumers can retrieve their news through the delivery channel of their
choice, such as fax, e-mail, a WWW page, or the Lotus Notes platform.

******************
