Permuted Title Indexing
KWIC, KWAC, KWOC
-the representation of terms in headings by making every possible
combination of terms
-created by rotating the keywords in the titles as entry points into the
index
KWIC
• Keyword in context (KWIC) was originally introduced by Andrea Crestadoro
as far back as 1864, under the name Keyword in Titles, for a catalogue of
Manchester public libraries.
• Nearly a century later, around 1958, it was developed by Hans P. Luhn of
IBM for computer manipulation and applied to the American Chemical
Society’s current awareness publication Chemical Titles. The acronym KWIC
was also coined by him.
• As the name itself suggests, in this system, the keywords derived for indexing
purposes are shown in a particular context. That means, they not only serve
as approach terms to the users, but also specify the particular context of the
document.
Indexing Process of KWIC
• Keyword selection: The first important task is the selection of significant terms or
keywords. This is done from the title of the document and/or title-like phrases. The
selection may be done in two ways: either an editor marks the significant terms, or
a “stop list” or “word exclusion list” is fed into the computer beforehand so that the
computer itself can exclude the insignificant words from the title. Usually initial articles,
prepositions, etc. are eliminated as insignificant terms. The keywords, thus selected,
serve as approach terms. For a document entitled “Treatment of heart diseases in India”
the significant terms or keywords will be (1) Treatment, (2) Heart, (3) Diseases, and (4)
India.
• Entry generation: Index entries are then generated with each keyword serving as an
approach term. The title is manipulated so that the keyword comes at the beginning (or
in the middle), followed by the rest of the title. Thus, in the above example, there will be
four index entries for the four keywords, each of them brought to the beginning by
rotation. The format of the entries is described separately.
• Filing: The entries are filed alphabetically by keywords.
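The three steps above (keyword selection with a stop list, entry generation by rotation, and alphabetical filing) can be sketched in a few lines of Python. This is only an illustrative sketch: the stop list is a tiny hypothetical one, the function names are invented for this example, and keywords are kept in the case used in the title rather than capitalised.

```python
# Hypothetical stop list; a real index would use a much longer one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to"}

def select_keywords(title, stop_words=STOP_WORDS):
    """Return the significant words of a title, in title order."""
    return [w for w in title.split() if w.lower() not in stop_words]

def generate_entries(title, ref, stop_words=STOP_WORDS):
    """Build one (keyword, rotated title, reference) entry per keyword."""
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        if word.lower() in stop_words:
            continue
        lead = " ".join(words[i:])   # keyword and the words after it
        tail = " ".join(words[:i])   # words that preceded the keyword
        # The stroke marks where the original title ends and wraps around.
        rotated = f"{lead}/{tail}" if tail else lead
        entries.append((word.lower(), rotated, ref))
    return entries

def file_entries(entries):
    """File entries alphabetically by keyword."""
    return sorted(entries, key=lambda e: e[0])

entries = file_entries(
    generate_entries("Treatment of heart diseases in India", 25)
)
for keyword, rotated, ref in entries:
    print(f"{rotated} -{ref}")
```

The filed entries come out under diseases, heart, india and treatment, in that order, each carrying the reference number 25.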
Format and Structure of KWIC
Each entry, according to the KWIC system, consists of the following three parts arranged in
linear order:
Keyword – This is written either in the beginning or in the middle, often in bold letters or capital letters
or is underlined for easy filing and searching.
Context – The rest of the title, besides the keyword, is used as the context. A stroke (/) separates the
last word and first word of the title. The context helps in efficient retrieval.
Reference – A code number or symbol identifying the document is added at the extreme right end.
Thus the index entries of the title mentioned above will be:
Treatment of heart diseases in India -25
Heart diseases in India/Treatment of -25
Diseases in India/Treatment of heart -25
India/Treatment of heart diseases in -25
If keywords are brought in the middle, the entries will be (keyword shown in capitals):
in India/ TREATMENT of heart diseases -25
Treatment of HEART diseases in India -25
of heart DISEASES in India/Treatment -25
Diseases in INDIA/Treatment of heart -25
A title written in this manner is called a wrap-around or recirculated title.
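The keyword-in-the-middle layout can be sketched as follows. This is a minimal illustration, not a description of any particular KWIC program: the function name `middle_entry` and the parameter `before` (how many words are wrapped around in front of the keyword) are assumptions made for this example, and `before` is assumed to be smaller than the number of words in the title.

```python
def middle_entry(words, i, before=2):
    """Rotate the title so that `before` words precede the keyword at index i.

    Returns (left, right): the text before the keyword and the text that
    starts with the keyword. The keyword is shown in capitals.
    """
    n = len(words)
    start = (i - before) % n
    order = [(start + k) % n for k in range(n)]
    parts = []
    for pos, idx in enumerate(order):
        token = words[idx].upper() if idx == i else words[idx]
        # Mark the wrap point after the last word of the title,
        # unless the rotation happens to end there.
        if idx == n - 1 and pos != n - 1:
            token += "/"
        parts.append(token)
    left = " ".join(parts[:before]).replace("/ ", "/")
    right = " ".join(parts[before:]).replace("/ ", "/")
    return left, right

words = "Treatment of heart diseases in India".split()
for i in (0, 2, 3, 5):  # positions of the four keywords in the title
    left, right = middle_entry(words, i)
    # Right-justify the left part so that the keywords line up in a column.
    print(f"{left:>16} {right} -25")
```

Right-justifying the left-hand context is what makes the keywords form a single column that the eye can scan down.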
Advantages of KWIC
• It hardly requires any intellectual effort and hence the indexer is not required to be a
subject specialist.
• Indexing can be done mechanically and speedily.
• The keywords are mainly selected from the titles of documents and hence it is not often
necessary to go through the contents of the document to be indexed.
• The terminology used is always current, as keywords represent the actual terms used by
specialist authors.
• It is based on natural language and hence no controlled vocabulary is required.
Disadvantages of KWIC
• Titles may not always be coextensive with the contents of the documents, and when they
are not, it becomes necessary to formulate expressive title-like phrases.
• Since controlled vocabulary is not used, entries relating to the same subject get
scattered and consequently the searches yield low recall.
• The rendering of context with stroke (/) as a separator between the last word and first
word may not be liked by the users.
• The words used as keywords and the words used by the searcher may not always
tally, and in such cases it is necessary for a searcher to know the synonyms and the
terms related to his or her subject of search.
• As related topics are scattered, a user needs to search by several keywords.
• It fails to meet the exhaustive approach to information from a large collection.
Variants of KWIC
• Some variants of KWIC have also been developed to overcome the
shortcomings of the system. The main features of a few important variants are
described below:
KWOC: According to the Keyword out of Context (KWOC) system, the whole title of the
document is used as context along with the keyword. The keyword and the context
are written either in the same line or in two successive lines as shown below:
Treatment    Treatment of heart diseases in India -25

Treatment
    Treatment of heart diseases in India -25
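The two KWOC layouts can be produced with a short sketch like the one below. The function name, the stop list and the column width of 12 characters are assumptions made for this illustration only.

```python
# Hypothetical stop list, for illustration only.
KWOC_STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to"}

def kwoc_entries(title, ref, stop_words=KWOC_STOP_WORDS, two_lines=False):
    """Return KWOC lines: each keyword as a heading, with the full title as context."""
    keywords = [w for w in title.split() if w.lower() not in stop_words]
    lines = []
    for word in sorted(keywords, key=str.lower):
        heading = word.capitalize()
        if two_lines:
            # Heading on its own line, whole title indented beneath it.
            lines.append(heading)
            lines.append(f"    {title} -{ref}")
        else:
            # Heading and whole title on the same line.
            lines.append(f"{heading:<12}{title} -{ref}")
    return lines

title = "Treatment of heart diseases in India"
print("\n".join(kwoc_entries(title, 25)))
print("\n".join(kwoc_entries(title, 25, two_lines=True)))
```

Unlike KWIC, the title is never rotated here; only the heading changes from entry to entry.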
KWAC: In Keyword Augmented in Context or Keyword-and-Context (KWAC), the
keywords are enriched with additional words taken from the contents or abstract of the
document. Thus the dependence of the indexing system on titles is reduced.
KWWC: The Keyword with Context (KWWC) system prescribes using as context only the part of the
title that is relevant to a particular keyword.
KEYTALPHA (Key-Term Alphabetical) Index: The KEYTALPHA index is a modified form of KWIC with key
terms arranged alphabetically. It is a permuted subject index that lists only the keywords assigned to each
abstract. The KEYTALPHA index is used in ‘Oceanic Abstracts’.
WADEX (Word and Author Index): It is an improved version of the KWIC index in which, along with
the keywords, the names of authors are also treated as keywords and indexed accordingly.
Thus, WADEX satisfies both the author and the subject approach. WADEX is used in ‘Applied
Mechanics Reviews’. The AKWIC (Author and Keyword in Context) index is another version of WADEX.
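The idea behind WADEX, that author names are filed in the same sequence as title keywords, can be sketched as follows. The author name used here is hypothetical, as are the function name and the stop list; the sketch does not reproduce the actual layout used by any published WADEX index.

```python
# Hypothetical stop list, for illustration only.
WADEX_STOP_WORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "to"}

def wadex_entries(title, authors, ref, stop_words=WADEX_STOP_WORDS):
    """Return (access term, title, reference) entries for keywords and authors."""
    entries = []
    for word in title.split():
        if word.lower() not in stop_words:
            entries.append((word.lower(), title, ref))
    # Author names become access points in the same alphabetical sequence.
    for author in authors:
        entries.append((author.lower(), title, ref))
    return sorted(entries)

# "Sharma, A." is an invented author name used only for this example.
index = wadex_entries("Treatment of heart diseases in India", ["Sharma, A."], 25)
for term, title, ref in index:
    print(f"{term:<12}{title} -{ref}")
```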
KLIC (Key-Letter-In-Context) Index: This type of index takes only a fragment of a word (i.e.
key letters), instead of the full word, either at the beginning or at the end of the entry. In this
system, the key letters forming part of the word are specified and the computer retrieves any
term containing those letters either at the beginning or at the end of the word. KLIC indexes are
almost unknown today; the Chemical Society (London) published a KLIC index as a guide to
truncation.
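Retrieval by key letters amounts to matching a fragment against the beginning or the end of each term. A minimal sketch, assuming a hypothetical function name and an invented list of terms:

```python
def klic_lookup(terms, key_letters, position="start"):
    """Return the terms that begin or end with the given key letters."""
    key = key_letters.lower()
    if position == "start":
        return sorted(t for t in terms if t.lower().startswith(key))
    if position == "end":
        return sorted(t for t in terms if t.lower().endswith(key))
    raise ValueError("position must be 'start' or 'end'")

# Invented sample vocabulary, for illustration only.
terms = ["chloride", "chlorine", "fluoride", "sulphate"]
print(klic_lookup(terms, "chlor"))               # terms beginning with "chlor"
print(klic_lookup(terms, "ide", position="end"))  # terms ending with "ide"
```

Matching at the start of a word corresponds to right-hand truncation in a search, and matching at the end corresponds to left-hand truncation.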