Artificial Intelligence in Information Retrieval
2022 4th International Conference on Advances in Computing, Communication Control and Networking (ICAC3N) | 978-1-6654-7436-8/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICAC3N56670.2022.10074291
Lalita Shukla, Dr. J. N. Singh
Galgotias University, Greater Noida, U.P., India
[email protected], [email protected],
Abstract: This paper discusses the relationship between information retrieval (IR) and artificial intelligence (AI). It examines text retrieval, summarizes its key features, and demonstrates the state of the art by introducing one model in detail, together with test results that show its value. The paper then analyzes this model and related effective methods, focusing on and justifying their use of weak, redundant representation and reasoning. The paper also describes some of the most effective ways in which AI is used in information retrieval. Information retrieval is an important information management technology. It combines searching for information with indexing, storing, and categorizing it.

Keywords: Information retrieval, NLP, Inverted Index, Stemming, Lemmatization, Standardization, Parameterization.

I. INTRODUCTION
"Information retrieval" is an all-encompassing term. This paper is based on the well-established concept of text retrieval. I will limit myself, at first, to document, or text, retrieval, and will not consider retrieval of other types of documents, for example images. This paper addresses the question: what does information retrieval (in the sense of document retrieval) have to do with artificial intelligence? The answer may seem obvious: everything. If IR is, importantly and challengingly, the automatic retrieval of information by content, then a common view in AI is that AI researchers will show IR practitioners how this is done. Information retrieval is a process that involves activities related to human understanding and information management; therefore, the design of information access systems can benefit from strategies that account for the inherent uncertainties reflecting the subjectivity of the task.

Artificial intelligence methods in information retrieval
Techniques of artificial intelligence are used to support information retrieval throughout the standard process and to acquire new resources of added value. The first section provides a brief overview of information retrieval. The following sections are organized according to the steps of the retrieval process and provide examples of applications.

AI AND IR
“This is the use of computers to carry out tasks requiring reasoning on world knowledge, as exemplified by giving responses to questions in situation where one is dealing with only partial knowledge and with indirect connectivity” [1]

An IR system is a software system that provides access to books, magazines, and other documents, and that stores and maintains those documents.

1.1. Basic Terms
Corpus: A large repository of documents stored on computers.
Information Need: A topic about which we want to get information.
Relevance: The property of those documents in the corpus that may contain what the user is searching for.

1.2. Types of data
1. STRUCTURED DATA:
It refers to information in the form of tables and has a clear, overt semantic structure.
Example: Data stored in a relational database.
2. UNSTRUCTURED DATA:
It lacks a clear, overt semantic structure that is easy for a computer to process.
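The difference between the two types of data can be illustrated with a minimal sketch. The table name, column names, and sample values below are hypothetical and chosen only for illustration; the whitespace tokenization is a deliberately naive assumption, not a method taken from this paper.

```python
import sqlite3

# Structured data: a relational table with named, typed columns.
# The "books" table and its rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO books (id, title, year) VALUES (?, ?, ?)",
    [(1, "Data Compression", 2005), (2, "Search Engines", 2010)],
)
# The schema tells the computer exactly which field to compare.
recent_titles = [
    row[0]
    for row in conn.execute("SELECT title FROM books WHERE year > 2008 ORDER BY id")
]

# Unstructured data: free text with no schema. The system has to impose
# structure itself, here by lowercasing and splitting on whitespace.
text = "Search engines index documents so that users can find relevant text"
tokens = text.lower().split()
contains_index = "index" in tokens

print(recent_titles, contains_index)
```

In the structured case the query names a field directly; in the unstructured case the text must first be turned into terms before anything can be matched.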
Authorized licensed use limited to: Panimalar Engineering College - Chennai. Downloaded on January 04,2025 at 03:26:35 UTC from IEEE Xplore. Restrictions apply.
1.3. Areas of Artificial Intelligence for Information Retrieval
Natural language processing
Representation of knowledge
Expert systems
    Ex: Logical formalisms, conceptual graphs, etc.
Machine learning
    Short term: over a single session
    Long term: over multiple searches by multiple users
Computer vision
    Ex: OCR
Reasoning under uncertainty
    Ex: probability theory

II. WORKING OF IR
A closer look at the models reveals that they are very similar to the traditional vector space retrieval model. [2]

Step 3: Query Matching
The main function of query matching in IR is first to locate the documents that match the query (retrieval phase) and then to rank the matching documents (ranking stage). Matching occurs between the query and each document in the collection; because the collection can be very large (billions of documents), the matching procedure must be efficient. The translated query is matched against an inverted file, using a knowledge base if one is available. Traditional online services match each query term that the searcher specifies for inclusion in the search. A full NLP system could instead accept a natural-language question such as "Is slippery a common condition on stoves?" or "I love the place of all the smooth toffees in New England." The NLP system would expand "New England" and add similar terms from its knowledge base, possibly "location." [3]
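The two stages of query matching described above (first locating matching documents, then ordering them) can be sketched as follows. This is a minimal illustration, assuming raw term-frequency vectors and cosine similarity in the spirit of the vector space model; the function names, the whitespace tokenizer, and the sample documents are hypothetical and not taken from this paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs):
    q_vec = Counter(tokenize(query))
    # Stage 1 (locating): keep only documents sharing at least one query term.
    candidates = {}
    for doc_id, text in docs.items():
        d_vec = Counter(tokenize(text))
        if q_vec.keys() & d_vec.keys():
            candidates[doc_id] = d_vec
    # Stage 2 (ordering): sort candidates by similarity to the query.
    scored = [(doc_id, cosine(q_vec, vec)) for doc_id, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical three-document collection.
docs = {
    1: "huffman code tree",
    2: "static dictionary coding",
    3: "adaptive huffman tree huffman",
}
results = search("huffman tree", docs)
ranked_ids = [doc_id for doc_id, _ in results]
print(ranked_ids)
```

A production system would not scan every document in the first stage; it would consult an index, which is why efficient matching matters for very large collections.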
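The expansion of a query phrase with related terms from a knowledge base, as in the "New England" example, might look like the sketch below. The `KNOWLEDGE_BASE` dictionary and the `expand_query` function are hypothetical names introduced only for illustration; the single entry mirrors the example in the text and is not a real resource.

```python
# Hypothetical knowledge base mapping a phrase to related terms.
# The only entry mirrors the example in the text ("New England" -> "location").
KNOWLEDGE_BASE = {
    "new england": ["location"],
}


def expand_query(query, knowledge_base):
    """Return the lowercased query followed by any related terms found for it."""
    lowered = query.lower()
    expanded = [lowered]
    for phrase, related in knowledge_base.items():
        # Simple substring match (an assumption); a real system would parse the query.
        if phrase in lowered:
            expanded.extend(related)
    return expanded


expanded = expand_query("toffees in New England", KNOWLEDGE_BASE)
print(expanded)
```

The expanded list can then be passed to the matching step in place of the original query terms.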
Table 1: Term-document incidence matrix

Term              Non-Binary      Uniquely          Adaptive       LZ77         Static
                  Huffman Codes   Decodable Codes   Huffman Tree   Dictionary   Dictionary
Dangling suffix   0               1                 0              0            0
Huffman code      1               0                 1              0            0
Dictionary        0               0                 0              1            1
Tree              1               0                 1              0            0

Our Answer: Non-Binary Huffman Codes and Adaptive Huffman Tree.

Boolean Retrieval Model
The Boolean retrieval model, sometimes also called Incidence Vectors [10][11][12][13], is a model for information retrieval in which any query can be posed as a Boolean expression of terms, in which the terms are combined with the operators AND, OR, and NOT.

An index always maps back from terms to the parts of a document where they occur. Inverted index, also known as inverted file, is the standard term in IR.

Basic Idea of Inverted Index:
The basic premise of the inverted index is that we maintain a dictionary of terms (sometimes called a vocabulary or lexicon). For each term t, we store a list of all documents that contain t. Each document is identified by a document number or document ID (docID). Each such list is called a postings list (or inverted list), and all the postings lists taken together are referred to as the postings.

Figure 3.1: A term-document incidence matrix.

SORTING:-
The main indexing step is to sort this list so that the terms are in alphabetical order, which makes searching faster (Figure 3.2).
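The inverted index idea and the sorting step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the sample documents, the whitespace tokenizer, and the names `build_inverted_index` and `lookup` are hypothetical and not taken from this paper.

```python
import bisect
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of docIDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive tokenizer (an assumption)
            postings[term].add(doc_id)
    # Sorting step: terms in alphabetical order, docIDs ascending per term.
    return {term: sorted(postings[term]) for term in sorted(postings)}


# Hypothetical collection keyed by docID.
docs = {
    1: "huffman code tree",
    2: "static dictionary",
    3: "adaptive huffman tree",
}
index = build_inverted_index(docs)
terms = list(index)  # alphabetical dictionary of terms


def lookup(term):
    """Binary search over the sorted dictionary, then return the postings list."""
    i = bisect.bisect_left(terms, term)
    if i < len(terms) and terms[i] == term:
        return index[term]
    return []


print(terms)
print(lookup("huffman"))
```

Keeping the dictionary sorted is what allows binary search here; with an unsorted term list every lookup would need a linear scan.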
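A Boolean query over the incidence matrix of Table 1 can be evaluated by combining the 0/1 row vectors bit by bit. The sketch below encodes the table as read from the extracted text. The query itself is not stated in the text, so `Huffman code AND Tree` is an assumption: it is one query consistent with the stated answer.

```python
# Documents (columns of Table 1), in table order.
DOCS = [
    "Non-Binary Huffman Codes",
    "Uniquely Decodable Codes",
    "Adaptive Huffman Tree",
    "LZ77 Dictionary",
    "Static Dictionary",
]

# Incidence vectors (rows of Table 1): 1 if the term occurs in the document.
INCIDENCE = {
    "dangling suffix": [0, 1, 0, 0, 0],
    "huffman code": [1, 0, 1, 0, 0],
    "dictionary": [0, 0, 0, 1, 1],
    "tree": [1, 0, 1, 0, 0],
}


def vec_and(a, b):
    return [x & y for x, y in zip(a, b)]


def vec_or(a, b):
    return [x | y for x, y in zip(a, b)]


def vec_not(a):
    return [1 - x for x in a]


# Assumed query (not given in the text): "Huffman code AND Tree".
result = vec_and(INCIDENCE["huffman code"], INCIDENCE["tree"])
answer = [doc for doc, bit in zip(DOCS, result) if bit]
print(answer)
```

Adding `AND NOT Dictionary` to the assumed query leaves the result unchanged, since neither matching document contains that term.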
IV. CONCLUSION
REFERENCES