Drim 1201 Classification (Theory and Practice)
PREPARED BY
MWAHULHWA BALUKU MORIS
TEL:+256777670282
EMAIL:[email protected]
Skype: morrisonex
Course content
COURSE LEVEL: 1
CREDIT UNITS: 3
CONTACT HOURS: 45
COURSE DESCRIPTION
The course unit introduces students to the basic skills in classification as a basis for
organizing records, archives and information for retrieval purposes. In addition, the unit
gives students the necessary skills in abstracting, indexing and subject analysis, which are
essential technical skills for professionals in the field.
COURSE OBJECTIVES
LEARNING OUTCOMES
COURSE CONTENT
Information access and retrieval systems
Classification systems
Subject analysis and treatment
History of indexing and abstracting
Abstracts and abstracting
Controlled vocabularies; Thesauri
Indexing
Citing and referencing
Bibliographic control
MODE OF DELIVERY
Lecture method
Practical work
MODE OF ASSESSMENT
Total 100%
RECORDS CLASSIFICATION
Classification
Classification is the process of assigning records to their appropriate place within a logical
arrangement, enabling them to be identified. Classification implies giving records a unique
identifier or reference number, assigned according to predetermined rules. Classification
also refers to the process of identifying and arranging records in categories according to
logically structured conventions, methods and procedural rules represented in a
classification system.
The classification of records needs to take into account the existing structure, functions and
activities of the organization and its divisions and branches. Thus, records may be arranged
in a structure that corresponds to the work being documented, making it easier to decide
where documents should be filed and where they may be found. Functions and activities are
the primary criterion by which records are classified. Classification therefore requires a
thorough understanding of the functions of the organization and how they are carried out.
For example, common functional areas of a business organization are:
Administration
Human resource management
Marketing
Production
Classification usually involves organizing records into mutually exclusive categories so that
there can be no doubt about the appropriate place for an individual item. The ‘top-level’
category will be the series, but in a classification scheme of any complexity, there will be
further divisions into sub-series.
Classification schemes are likely to be hierarchical. That is, they will form a tree-like
structure, with multiple levels if necessary. To a large extent, classification schemes may be
pre-determined on the basis of business systems analysis, but they must be flexible enough
to accommodate new and changing structures, functions and activities. At the same time,
they must be kept under review to determine whether they continue to meet requirements.
Classification schemes based on business systems analysis should be designed in
consultation with users. A hierarchical classification attempts to arrange subjects
according to a “natural” order, proceeding from classes to divisions to subdivisions.
If the retrievable term is simply a name (such as the name of a person, organization or
geographical area) it may not be necessary to control the vocabulary but there will need to
be rules relating to the order of names (last name first), the use or non-use of abbreviations
and the treatment of variant spellings. It is possible to include proper names in an ‘authority
list’, which includes all the names used in their standard form.
If an ad hoc, ‘no-rules’ index is used to index policy, operational and administrative files,
problems are likely to occur, such as the following.
The thesaurofacet is a specialized kind of retrieval language that combines a thesaurus and a
classification scheme, each part containing some information unique to itself and not found in
the other. The obvious advantage of a thesaurofacet is that it can be used for arranging books
on the shelves of a special library as well as for indexing the terms in a database.
Some of these tools, especially the subject heading lists used in catalogues, have also been
used to organize internet resources such as INFOMINE, BIOME and SOSIG.
With the development of technology, automatic indexing has become possible: the assignment
of content identifiers is done with the aid of computers. In an automatic indexing
environment, the lack of human expertise can be overcome by intelligent use of the
vocabulary that occurs frequently in stored records and information requests.
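As a rough illustration of how frequently occurring vocabulary can stand in for a human indexer, the sketch below picks the most frequent non-trivial words in a record as candidate index terms. The stop-word list, the function name candidate_index_terms and the sample record are assumptions made for this example; practical automatic indexing systems use more refined weighting than a simple count.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much fuller one.
STOP_WORDS = {"the", "of", "and", "a", "an", "in", "on", "for", "to",
              "is", "are", "was", "by"}

def candidate_index_terms(text, max_terms=3):
    """Return the most frequent non-stop words in text as candidate index terms."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically, so results are predictable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:max_terms]]

# Invented sample record, used only to exercise the function.
record = ("The budget speech set out the budget for training. "
          "Training of records staff is funded by the budget.")
print(candidate_index_terms(record))
```

A human indexer would still need to review such candidates, since frequency alone cannot tell a significant term from an incidental one.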
Other advantages of automatic indexing are that consistency in indexing is maintained,
indexing time is saved, index entries are produced at a lower cost and better retrieval
effectiveness is achieved (Onwuchekwa & Jegede 2011).
Indexing
Xin Lu (1990) writes that in the ideal document retrieval environment, a document or query
statement is represented by a group of distinct index terms as well as the semantic
relationships between these terms, so that retrieval can be based on a structure of
semantic relationships.
Macleod (1990) also adds that documents are retrieved on the basis of the correspondence
between search terms expressed in the query and the index terms in the document.
Indexing systems designed to assist in the retrieval of documents operate by assigning
index terms to the analyzed subject of each document either manually or automatically.
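The idea that retrieval rests on the correspondence between search terms and index terms can be sketched with a small inverted index. The document identifiers, the assigned terms and the function names build_index and search are hypothetical, and requiring every query term to match is only one possible matching rule.

```python
from collections import defaultdict

def build_index(assigned_terms):
    """Map each index term to the set of documents it was assigned to."""
    index = defaultdict(set)
    for doc_id, terms in assigned_terms.items():
        for term in terms:
            index[term.lower()].add(doc_id)
    return index

def search(index, query_terms):
    """Return the documents whose index terms include every query term."""
    sets = [index.get(term.lower(), set()) for term in query_terms]
    if not sets:
        return []
    return sorted(set.intersection(*sets))

# Hypothetical documents with index terms assigned by an indexer.
assigned = {
    "doc1": ["Budget", "Speeches"],
    "doc2": ["Training", "Belgium"],
    "doc3": ["Budget", "Training"],
}
index = build_index(assigned)
print(search(index, ["budget"]))
print(search(index, ["budget", "training"]))
```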
The most difficult part of indexing is that phase where two different indexers analyze the
content of a given document in two different ways resulting in two different index entries.
On the other hand, it has been observed by information retrieval experts that indexing tends
to be more consistent when the vocabulary used is controlled, because indexers are more
likely to agree on the terms needed to describe a particular topic if they are selected from a
pre-established list than if a free hand is given. It is also easier for users and searchers
to identify the terms appropriate to an information need if the terms must be selected
from a definitive list.
Both subject heading lists and thesauri contain alphabetically arranged terms with the
necessary cross references and notes, and can be used for indexing or searching in an
information retrieval environment.
The different kinds of vocabulary control tools include subject heading lists, thesauri and
the thesaurofacet.
A keyword is a term or combination of terms taken from the title or text of a document or file,
characterizing its content and establishing an access point for its retrieval.
Keyword list: A controlled vocabulary that limits the choice of keywords when classifying or
indexing files. It is an alphabetical listing (sometimes called an authority list) of all the
standard terms from which index entries should be selected.
A keyword list is a control mechanism. It limits the way individual records are classified and
indexed by imposing accuracy or exactness and consistency on the indexing process. Thus,
it should tell its users and operators where to place records on particular subjects or where
to look for them. The list can also provide a standard vocabulary to be used when giving file
titles. By limiting the choice of words to be used when assigning titles to files, a controlled
vocabulary or keyword list assists the indexing process and removes uncertainty about
where to file documents.
An example of an alphabetical keyword list:
365 Access
186 Accommodation
331 Adjumani District
145 Administration
095 Allowances
110 Belgium
102 Budget
104 Census
190 Elections
153 Pensions
273 Procedures
246 Records Management
195 Speeches
024 Training
194 Transport
229 UNESCO
155 United Kingdom
For example, a file on Budget speeches could be described by the keywords 1. Budget and
2. Speeches.
Similarly, a file on Training in Belgium could be described by the keywords 1. Training and
2. Belgium.
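The two examples above can be sketched as a lookup against the keyword list. The dictionary holds only an extract of the list, the numbers are kept as text so that leading zeros such as 024 survive, and the function name describe_file is an assumption made for this illustration.

```python
# Extract of the keyword list above: keyword -> number (kept as text).
KEYWORD_NUMBERS = {
    "Belgium": "110",
    "Budget": "102",
    "Speeches": "195",
    "Training": "024",
}

def describe_file(*keywords):
    """Return (keyword, number) pairs, rejecting terms that are not on the list."""
    unknown = [k for k in keywords if k not in KEYWORD_NUMBERS]
    if unknown:
        raise ValueError("Not in the keyword list: " + ", ".join(unknown))
    return [(k, KEYWORD_NUMBERS[k]) for k in keywords]

print(describe_file("Budget", "Speeches"))
print(describe_file("Training", "Belgium"))
```

Rejecting a term that is not on the list is what makes the vocabulary controlled: the indexer has to choose an existing keyword or have a new one formally added.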
To facilitate file retrieval, a list of keywords which explain briefly the work being carried out
in an organization is to be drawn up in consultation with staff and senior management. It
should reflect the functions and activities of the organization and must be tailor-made to
suit the users’ requirements. This list of words is numbered consecutively as it is created.
Once a number has been allocated it remains associated with the same keyword for the life
of the classification scheme and must never be re-used for a different keyword, even if the
word to which it relates is later eliminated from the list.
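The numbering rule described above can be sketched as a small register that hands out consecutive numbers and keeps a number reserved after its keyword is withdrawn. The class name KeywordRegister and its methods are hypothetical, and letting a withdrawn keyword come back under its old number is a design choice of this sketch rather than something stated in these notes.

```python
class KeywordRegister:
    """Allocates consecutive numbers to keywords and never re-uses a number."""

    def __init__(self):
        self._numbers = {}       # keyword -> number, including withdrawn keywords
        self._withdrawn = set()  # keywords no longer in use
        self._next = 1

    def add(self, keyword):
        """Return the keyword's number, allocating the next one if it is new."""
        if keyword in self._numbers:
            # A returning keyword keeps the number it always had.
            self._withdrawn.discard(keyword)
            return self._numbers[keyword]
        number = self._next
        self._numbers[keyword] = number
        self._next += 1
        return number

    def withdraw(self, keyword):
        """Take a keyword out of use while keeping its number reserved."""
        if keyword not in self._numbers:
            raise KeyError(keyword)
        self._withdrawn.add(keyword)

    def active(self):
        """Return the keywords currently in use with their numbers."""
        return {k: n for k, n in self._numbers.items() if k not in self._withdrawn}

register = KeywordRegister()
register.add("Training")
register.add("Budget")
register.withdraw("Training")
print(register.add("Census"))
print(register.active())
```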
When drawing up the keyword list, remember:
A keyword must not be so broad that it could be used for more than 10% of the work
covered by the scheme.
A keyword should not be so narrow that its use is limited to only one or two files.
Words such as general, miscellaneous or correspondence are not permitted as they
are too vague to be helpful.
Abbreviations are not normally used as keywords since they are sometimes confusing and
their meanings may change with the passage of time.
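The first three rules above lend themselves to a simple check when a keyword is proposed. In the sketch below the share of files using a keyword stands in for "the work covered by the scheme", which is an assumption, as are the function name check_keyword and the wording of the messages.

```python
# Words the notes rule out as too vague.
BANNED = {"general", "miscellaneous", "correspondence"}

def check_keyword(keyword, files_using, total_files):
    """Return a list of problems with a proposed keyword (empty if none found)."""
    problems = []
    if keyword.lower() in BANNED:
        problems.append("too vague")
    # Assumption: breadth is measured as the share of files using the keyword.
    if total_files and files_using / total_files > 0.10:
        problems.append("too broad")
    if files_using <= 2:
        problems.append("too narrow")
    return problems

print(check_keyword("Training", 30, 1000))
print(check_keyword("General", 300, 1000))
print(check_keyword("Belgium", 2, 1000))
```

The rule on abbreviations is left out because it calls for judgment rather than a count.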
Keyword lists control vocabulary and so help users find information more easily.
Increasingly, organizations are tending to share information across divisions and
departments, both in paper and electronic form. As a result, a number of staff in different
locations are involved in the processes of naming and retrieving files and documents. In
these circumstances, a corporate-wide thesaurus or controlled vocabulary will be required.
This document will need to include ‘specialist’ terms relevant to individual departments as
well as terms that relate to the organization as a whole. Though ready-made thesauri can
be purchased, it is normally necessary for the organization to construct its own thesaurus
for records purposes so that it matches the requirements of the organization.
The main steps in constructing a controlled vocabulary are as follows. Some of these may
take place concurrently.
1. Understand the functions and activities of the organization.
2. Develop retrieval terms by analyzing functions and activities, discussing them with
action officers, and studying work programmes, existing file lists and other available
documentation.
3. Define the scope of the controlled vocabulary, for example the level or depth of
indexing and whether proper names and very general terms will be included.
Originally purely a manual task, indexing can now be done using computers. Various
computer programs are available but should be selected with care to ensure that they can
meet operational requirements. In an electronic system, file-naming conventions and
standardized directory structures should relate as closely as possible to the classification
and indexing system adopted by the organization.
File series
A series is the level of arrangement of the files and other records of an organization or
individual that brings together those relating to the same function or activity or having a
common form or some other relationship arising from their creation, receipt or use. It is
also known as a file series or records series.
Records in a series are linked together because:
The records relate to the same functions and activities.
The records are the products of those functions and activities.
When control systems based on series are well devised and consistently applied, they help
facilitate retrieval beyond the current into the semi-current and archival phases of the life
cycle. Good management of current files in the records office ensures that disposal
decisions are reduced as much as possible to a routine. It also facilitates archival
arrangement in accordance with the principles of provenance and original order. Control of
records based on series, called ‘series control’, also makes it possible to transfer a whole
file series with the function it serves when there is an administrative reorganization.
When creating file series, it is helpful to distinguish between the different categories
of files. Most agencies create a wide range of files, but some common broad categories
may be identified:
Policy files relate to the formulation of policy and procedures by the organization.
Subject files deal with the implementation of the organization’s policies and
procedures.
Administrative files (common to all organizations or agencies) deal with subjects such as
buildings, equipment and supplies, finance and personnel, as well as with general
internal administration.
Administrative records: Records relating to those general administrative activities common to
all organizations, such as maintenance of resources, care of the physical plant or other
routine office matters.
Case files contain similar information on a wide range of, for example, individuals or
organizations, usually reflecting the particular functions and activities of the agency.
Case files may be operational (such as school inspection files) or administrative
(such as personnel files).
Case files: Files relating to a specific action, event, person, place, project, or other subject.
Also known as dossiers, dockets, particular instance papers, project files or transactional
files. Case files require further explanation. They relate to the actual conduct of
business or the execution of policy or legislation as it concerns individual cases. Each
individual file within the series concerns a separate person, institution or place but otherwise
is similar in form and content to other files in that series. If case files are generated in large
quantity (more than, say, 25 files), they will need to be arranged and classified separately
from policy and administrative files. Case files relate to individual people, organizations and
places or some other common characteristic. Recognizing these distinctions between
policy, administrative and case files helps to give greater specificity to file series and
sub-series and to file titles.
Coding system:
A representation of a classification scheme, in letters and/or numbers and in accordance
with a pre-established set of rules.
Some classification and coding schemes may need to take into account other factors, such
as the department that originated the records or the subject matter dealt with in the records.
To a large extent, the classification and coding scheme may be predetermined on the basis
of business systems analysis. However, some flexibility must be built into the system so that
new and changing structures, functions, activities and responsibilities can be
accommodated.
It is of no value to have a scheme that perfectly maps the work of the organization but that
does not allow the insertion of new files when new activities arise. Therefore, business
systems analysis is an ongoing process; it must be repeated from time to time in order to
keep the classification and coding schemes up to date. The classification scheme also
normally provides rules by which each file or document is given a unique reference number.
This is known as coding.
For example, assume an organization has seven clearly defined functions, each of which is
handled by a separate department. The prefix codes for these are as follows:
Administration                        ADM
Human resource development            HRD
Human resource management             HRM
Management services                   MSD
Records and information management    RIM
Compensation                          COM
Inspection                            INS
If, for example, a document for which a file is to be created concerns human resource
development, the file prefix will be HRD; if it concerns budget speeches, the prefix will be
ADM; and if it concerns records and information management, the prefix will be RIM. It is
important to note that file series relate to functions rather than to organizational structures. If
one division handles more than one function, it will have more than one series.
Neither of the two keywords chosen to describe a file need be regarded as more important
than the other. If each is considered to have equal value, the lower of the two numbers is
written first in the file reference. This simplifies the task of the registry staff: they do not
have to decide which of the two words is more important but simply write the two numbers in
numerical order. For example:
ADM 102/195
HRD 024/110
If it is helpful to organize the files to reflect a more hierarchical arrangement, where one of
the keywords relates to a broader function and one to a narrower function, the number
relating to the broader function should be given first. This will bring the files relating to this
function together on the shelf or in the file cabinet.
Example
Budget speeches ADM 102/195/01
Training in Belgium HRD 024/110/01
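The way such references are put together can be sketched as follows. The function name file_reference and its parameters are hypothetical. Reading the final element (01) as a running file number is an assumption, since the notes do not say what it stands for. The layout (prefix, space, numbers separated by slashes) follows the example above.

```python
def file_reference(prefix, number_a, number_b, file_number=None, broader_first=False):
    """Build a reference such as 'ADM 102/195/01'.

    number_a and number_b are keyword numbers given as text so that leading
    zeros (for example '024') are kept. If broader_first is True, number_a is
    taken to be the broader function and is written first; otherwise the lower
    of the two numbers is written first.
    """
    numbers = [number_a, number_b]
    if not broader_first:
        # Keywords of equal value: simply write them in numerical order.
        numbers.sort(key=int)
    if file_number is not None:
        # Assumption: the last element is a two-digit running file number.
        numbers.append(f"{file_number:02d}")
    return f"{prefix} " + "/".join(numbers)

print(file_reference("ADM", "195", "102", file_number=1))
print(file_reference("HRD", "024", "110", file_number=1))
```

In both examples the broader keyword also happens to carry the lower number, so the two ordering rules give the same result; broader_first=True only matters when that is not the case.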
It can be difficult to predict what subjects may arise in the future, particularly at the file level.
A classification scheme wherein file codes or reference numbers are assigned as needed
will prove more flexible than a rigid, predetermined classification and coding scheme.
As a general rule, the more a coding system reflects hierarchical relationships between
records, the more difficult it becomes to insert new subjects and codes. This point must be
kept in mind when considering the rapidity of change in today’s administrative environment.
The main features of a coding or reference number system are as follows:
It must generate unique reference numbers for each item to be classified.
It should be as simple as possible.
It should provide a self-evident order: that is, the arrangement of items within the
system should be logical and predictable.
It should be unambiguous in form or format: for example, there should be no choice
about upper or lower case letters or the presence or absence of an element.
Its elements should be clearly distinguishable from each other: for example,
AB/45/89/01.
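Two of these features, uniqueness and an unambiguous format, can be enforced mechanically when a reference is registered. The pattern below is an assumption modelled on the example AB/45/89/01 (an upper-case letter prefix followed by slash-separated numeric elements); an organization would substitute its own rules, and the function name register_reference is hypothetical.

```python
import re

# Assumed format, modelled on AB/45/89/01: 2 to 4 upper-case letters, then
# two or three numeric elements of 2 or 3 digits, each preceded by a slash.
REFERENCE_PATTERN = re.compile(r"[A-Z]{2,4}(/\d{2,3}){2,3}")

def register_reference(reference, existing):
    """Add reference to the existing set if it is well formed and unique."""
    if not REFERENCE_PATTERN.fullmatch(reference):
        raise ValueError(f"Badly formed reference: {reference!r}")
    if reference in existing:
        raise ValueError(f"Reference already in use: {reference!r}")
    existing.add(reference)
    return reference

in_use = set()
print(register_reference("AB/45/89/01", in_use))
```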
Selecting Classification and Coding Systems
There are many different file classification and coding systems, and there are no hard and
fast rules for choosing a system. Choosing the right system will depend on a number of
factors, such as
The size and complexity of the organization
The range of its business
The quantity of files and other records
The presence of case files
The rate of creation of new files and records
The cost of installing and maintaining the system
The ease or difficulty with which the files and records can be organized into mutually
exclusive categories reflecting specific functions and activities
The training required to operate and sustain the system
The skills level of the records staff.
Coverage of the File classification system
File classification and coding systems must be designed to match the requirements of the
organization they will serve.
A file classification system should support business or organizational
requirements.
It should suit the organization it serves and support decision making and the
activities of the organization.
It should match users’ needs.
It should provide the best, easiest and simplest solution.
It should be cost effective.
It should match resources, with adequate equipment, funds or staff.
It should not be dependent on outside resources for operational requirements.
A file classification system should be easy to understand, use and maintain.
It should be based on logic or common sense.
It should be understood by records staff and users.
It should be independent of human memory.
It should use simple processes.
It should inspire confidence in operators and users.
A file classification system should be precise.
It should minimize doubt about where to file papers.
It should allow the quick identification and retrieval of files.
A file classification system should be complete and comprehensive.
It should cover all the files that need to be included.
It should be capable of including files that may be created in future.
It should be flexible and allow for expansion, contraction or reorganization.
A file classification system should be backed up by a procedures manual and
training materials.
It should be clearly and comprehensively documented.
All procedures should be explained in easy-to-follow steps.
It should provide master copies of all forms, with completed examples.
It should be supported by training programmes.
It should be supported by professional advice or guidance.
A file classification system should be easily automated.
It should be capable of some form of useful automation, regardless of whether automation is
planned, such as for word processing, computerized indexing, database management or a
computerized record-keeping system.
Describe the factors that must be considered when selecting a classification scheme for an
organization.
Information access (see the attached detailed PDF document on information access)
Understand what range of information search and retrieval facilities are available
currently.
Information retrieval is assumed to also include database systems and question answering
systems, and information is construed to mean documents, references, text passages, or
facts.
In the public library environment, anyone can be a user: members of the general public,
children, students, housewives, the literate, the neo-literate, etc.
Reference
Onwuchekwa, E. & Jegede, O. R., 2011. Information Retrieval Methods in Libraries and
Information Centers. African Research Review, 5(6), pp.108–120.
Student, I. & Briefing, C., Information Retrieval systems.
Abstracting
Many authors have defined the abstract from different points of view.
Lancaster (2003) defines an abstract as a brief but accurate representation of the contents
of a document, and he holds that an abstract is different from an extract, an annotation or
a summary.
Rowley (1996) defines an abstract as a concise and accurate representation of the content
of a document in a style similar to that of the original document. She adds that an abstract
covers all the main points made in the original document and usually follows the style and
the arrangement of the parent document.
Abstracts as documentary products always take the form of short texts either accompanying
the original document or included in its surrogate.
Information scientists have used different criteria to categorize the different
kinds of abstracts:
• Abstracts by writer are written by authors, subject experts or
professional abstractors.
• Abstracts by purpose are written to serve different purposes, for example
informative abstracts, indicative abstracts, critical abstracts and special
purpose abstracts.
• Abstracts by form can be differentiated as structured abstracts, mini
abstracts and telegraphic abstracts.
With the rapid increase in the availability of full text and multimedia information in digital
form, the need for automatic abstracts or summaries as filtering tools is becoming extremely
important. Craven (2000) proposes a hybrid abstracting system in which some
tasks are performed by human abstractors and others by abstractor’s assistant software.