Digital Library
Bharat Kumar Kunjam*1, Mr. Rahul Kumar Chawda2
*1 Pursuing MCA Final Year, CSE & IT Department, Kalinga University, Atal Nagar, Raipur, CG, India
2 Assistant Professor, HOD, CSE & IT Department, Kalinga University, Atal Nagar, Raipur, CG, India
Abstract:- In recent years there has been tremendous development regarding the concept of digital libraries: knowledge bases
that can be stored and retrieved through online networks. Digital libraries are among the most complex forms of information
systems; they support digital document preservation, distributed database management, hypertext, filtering, information
retrieval and selective dissemination of information. They have overcome geographical barriers by offering a wide range of
academic, research and cultural resources, including multimedia, that can be accessed around the world over distributed
networks. The paper examines the concept of the digital library, the technology that has enabled its emergence, and the
architecture of a digital library system. It also highlights digital library projects undertaken in the USA, UK and India. The
authors explore the unique features of digital libraries and the possible challenges ahead for library and information
professionals in the digital environment.
I. INTRODUCTION
1.1 OVERVIEW
A digital library (also referred to as an electronic library or digital repository) is a focused collection of digital objects that
can include text, visual material, audio material and video material, stored in electronic media formats (as opposed to print,
microform, or other media), along with means for organizing, storing, and retrieving the files and media contained in the
library collection. Digital libraries can vary immensely in size and scope, and can be maintained by individuals or
organizations, or be affiliated with established physical libraries or with academic institutions. The electronic content may
be stored locally, or accessed remotely via computer networks. An electronic library is a type of information retrieval system.
The term digital libraries was first popularized by the NSF/DARPA/NASA Digital Libraries Initiative in 1994 (Wikipedia,
2014).
In the digital library, information is stored as "digital objects". A primitive idea of a digital object is that it is just a set of bits,
but this idea is too simple. The content of even the most basic digital object has some structure, and information, such as
intellectual property rights, must be associated with the digital object. Figure 2 shows that a digital object in a repository has
two parts: content and associated data, sometimes called "metadata" (Arms, 1997).
To enable the content to represent useful information, its type must be known. Thus part of the content may be of type text
(perhaps encoded in a mark-up language), while another part may be of type audio. A single digital object may contain many
types of content. It turns out that arbitrarily complex data types can be constructed from a few basic types, notably bit
sequences, handles and other digital objects. By combining these in various combinations, any digital content can be
represented.
To manage valuable intellectual property, certain metadata is required. This is shown in the figure. It always includes a unique
identifier (the handle). It may also include properties such as rights and access methods. One property states whether a digital
object is mutable, in that it may be altered after being placed in a repository. Another is a digital signature or other method of
validating that an object has not been changed. Frequently, it is useful to keep a log of all transactions associated with each
digital object.
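The structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the data model of any particular repository: the class and field names (ContentPart, DigitalObject, rights, log) and the handle value are hypothetical, and a SHA-256 hash stands in for the "digital signature or other method" of checking that an object is unchanged.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import hashlib


@dataclass
class ContentPart:
    """One typed piece of content, e.g. text or audio."""
    content_type: str  # illustrative label such as "text/xml" or "audio/wav"
    data: bytes        # the underlying bit sequence


@dataclass
class DigitalObject:
    """Content plus associated metadata (all field names are hypothetical)."""
    handle: str                  # unique identifier, always present
    parts: list                  # ContentPart items; one object may mix types
    rights: str = "unspecified"  # free-text rights statement
    mutable: bool = False        # may the object change once deposited?
    log: list = field(default_factory=list)  # transaction log entries

    def digest(self) -> str:
        """Hash of all content parts, used to detect later changes."""
        h = hashlib.sha256()
        for part in self.parts:
            h.update(part.content_type.encode("utf-8"))
            h.update(part.data)
        return h.hexdigest()

    def record(self, action: str) -> None:
        """Append a timestamped entry to the transaction log."""
        self.log.append((datetime.now(timezone.utc).isoformat(), action))


obj = DigitalObject(
    handle="hdl:example/0001",  # made-up handle for illustration
    parts=[ContentPart("text/xml", b"<p>Hello</p>"),
           ContentPart("audio/wav", b"\x00\x01")],
    rights="public domain",
)
sealed = obj.digest()   # fingerprint taken at deposit time
obj.record("deposited")
```

Comparing a fresh obj.digest() with the stored value later shows whether the content has been altered since deposit.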
1.1.3 Repository
Repositories store and manage digital objects and other information. A large digital library may have many repositories of
various types, including modern repositories, legacy databases, and Web servers. A pilot repository has been implemented,
and enhancements are planned for the prototype. The interface to this repository is called the repository access protocol
(RAP). Features of RAP are explicit recognition of rights and permissions that need to be satisfied before a client can access
a digital object, support for a very general range of disseminations of digital objects, and an open architecture with
well-defined interfaces. Repositories must look after the information they hold. A repository stores digital objects, both the
content and the metadata.
The internal organization of a repository and the way that digital objects are stored are hidden from the user. A simple protocol
is provided for interactions with the repository. This protocol is called the "repository access protocol." The basic commands
in this protocol are those to access a digital object and its metadata, and the service request to disseminate a digital object. In
addition there are commands to add and delete digital objects.
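The command set just described (access the metadata, access the object, request a dissemination, add, delete) can be illustrated with a small in-memory sketch. It is not the actual RAP specification: the method names, the "allowed_users" metadata key and the "raw" and "preview" dissemination forms are assumptions made up for this example.

```python
class AccessDenied(Exception):
    """Raised when a client lacks the rights to access an object."""


class Repository:
    """Toy repository; its internal storage is hidden behind a few commands."""

    def __init__(self):
        self._store = {}  # handle -> {"content": bytes, "metadata": dict}

    def add(self, handle, content, metadata):
        if handle in self._store:
            raise KeyError(f"handle already in use: {handle}")
        self._store[handle] = {"content": content,
                               "metadata": dict(metadata, handle=handle)}

    def delete(self, handle):
        del self._store[handle]

    def access_metadata(self, handle):
        # Return a copy so callers cannot modify the stored metadata.
        return dict(self._store[handle]["metadata"])

    def access_object(self, handle, user):
        self._check_rights(handle, user)
        return self._store[handle]["content"]

    def disseminate(self, handle, user, form="raw"):
        """Return the object in a requested form (forms are hypothetical)."""
        content = self.access_object(handle, user)
        if form == "raw":
            return content
        if form == "preview":
            return content[:20]
        raise ValueError(f"unknown dissemination form: {form}")

    def _check_rights(self, handle, user):
        # Assumption: a missing "allowed_users" entry means open access.
        allowed = self._store[handle]["metadata"].get("allowed_users")
        if allowed is not None and user not in allowed:
            raise AccessDenied(f"{user} may not access {handle}")


repo = Repository()
repo.add("hdl:example/0001", b"An openly readable report.", {"title": "Report"})
repo.add("hdl:example/0002", b"A restricted thesis text.",
         {"title": "Thesis", "allowed_users": ["alice"]})
```

A client sees only these commands; whether the objects live in files, a database or a legacy system stays hidden behind them.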
1) To provide cutting-edge facilities and services to support research, teaching, learning, and scholarly communication
across disciplines.
2) To collect, organize and collate print and digital information and disseminate it at the point of care and for future use.
3) To provide seamless access to information.
6) To develop and conduct tutorials for users, enabling them to use effectively the facilities and resources made
available by the library.
The relevance of this research lies in understanding the importance and benefits of digital libraries to individuals, to our
environment and to society. The significance of the study is as follows:
1) To bring readers up to date on the progress, nature and impact of digital libraries, bridging the gap since the publication
of the best-known digital library texts.
2) To provide a global perspective and integrate material from many sources in one place.
The scope of this topic covers the historical background of digital libraries, advantages and disadvantages of digital libraries,
components of a digital library, how to use a digital library, importance of digital libraries to society, the internal diagram
of a digital library, characteristics of a digital library, types of digital libraries with examples, function of a digital library,
purpose of a digital library, how to create a digital library, how to add and remove an article in a digital library, software
used for developing a digital library, and the hardware involved (if any).
2) There is a lack of preservation of a fixed copy (for the record and for duplicating scientific research).
3) There is difficulty in knowing and locating everything that is available, and differentiating valuable from useless
information.
4) There is job loss for traditional publishers and librarians.
This research is divided into four chapters, which are further subdivided into sections.
Chapter one is the introduction, which is divided into subsections: overview, objectives of the study, significance of the
study, limitation of the study, scope of the study, and organization of work. Chapter two is the literature review, which
consists of one section, the historical background of the study.
Chapter three presents the findings. It is divided into subsections that include: definition of a digital library, internal
architecture of the topic, components of the study, the features and characteristics of digital libraries, advantages and
disadvantages of digital libraries, how to create a digital library, how to add and remove an article from a digital library, types
of digital libraries existing, and purpose and function of a digital library. Chapter four is the summary, and it is subdivided
into two sections: the conclusion and the references.
1.7.1 Library: A library is a collection of sources of information and similar resources, made accessible to a defined
community for reference or borrowing.
1.7.2 Digital library: A digital library is a focused collection of digital objects that can include text, visual material, audio
material and video material, stored in electronic media formats (as opposed to print, microform, or other media), along with
means for organizing, storing, and retrieving the files and media contained in the library collection.
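1.7.3 Digital objects: A digital object is a unit of stored information made up of content and associated metadata, including
a unique identifier (the handle) that distinguishes it from every other object, such as an electronic document.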
1.7.4 Repository: Repositories store and manage digital objects and other information.
1.7.5 Metadata: Metadata is “data about data”. It is defined as data providing information about one or more aspects of
the data, such as the means of creation of the data, the purpose of the data, the time and date of creation, the creator or
author of the data, and the location on a computer network where the data were created.
1.7.6 Memex machine: Memex is a device in which an individual stores all his books, records and communications which
is mechanized so that it may be consulted with exceeding speed and flexibility.
In 1945, Vannevar Bush had a vision. In his article, "As We May Think," he describes a technical fix for the information
explosion that began after World War II. Bush named this technical fix the Memex. The Memex was described as "a
device in which an individual stores all his books, records and communications which is mechanized so that it may be
consulted with exceeding speed and flexibility" (Bush, 1945).
In the 1980s, library card catalogs were being replaced by Online Public Access Catalogs (OPACs). These were usually
closed systems that could contain little more than bibliographic data. Most OPAC records were encoded in the Machine
Readable Cataloging (MARC) format. A MARC record generally represents an individually published item or "information
product," and describes the physical characteristics of the item itself (Brenner et al, 2006).
The archival community, however, no longer employs the MARC format. Archivists use the Encoded Archival Description
(EAD) format instead. The EAD format is better suited for encoding the hierarchical relationships between the different parts
of a collection and displaying them to the user (Brenner et al, 2006). Recent trends have been capitalizing on the strengths
of both formats to improve access to digital collections (Alderman, 1998).
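To make the hierarchical nature of EAD concrete, the following sketch parses a small, made-up finding aid and prints its collection, series and file levels as an indented outline. The element names (archdesc, did, unittitle, dsc, c01, c02) and the level attribute follow EAD conventions, but the fragment is heavily simplified and the collection and titles are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, invented EAD-style fragment: a collection with two series.
EAD_SAMPLE = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Smith Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters, 1901-1905</unittitle></did>
        </c02>
        <c02 level="file">
          <did><unittitle>Letters, 1906-1910</unittitle></did>
        </c02>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def outline(node, depth=0):
    """Return (depth, level, title) rows for a component and its children."""
    rows = []
    title = node.findtext("did/unittitle")
    if title is not None:
        rows.append((depth, node.get("level"), title))
        depth += 1  # children of a titled component sit one level deeper
    for child in node:
        # Descend into the component list (dsc) and numbered components.
        if child.tag == "dsc" or child.tag.startswith("c0"):
            rows.extend(outline(child, depth))
    return rows


root = ET.fromstring(EAD_SAMPLE)
rows = outline(root.find("archdesc"))
for depth, level, title in rows:
    print("  " * depth + f"[{level}] {title}")
```

The nesting of components inside one another is what lets a single finding aid describe a collection, its series and its individual files together, something a flat item-level record does not express directly.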
In the 1990s and beyond, digital libraries changed the way we have thought about how we retrieve information. What exactly
is a digital library? According to Donald Waters, digital libraries are "organizations that provide the resources, including the
specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the
persistence over time of collections of digital works so that they are readily and economically available for use by a defined
community or set of communities" (Waters, 1998). This definition allows for a great degree of interpretation. The concept of
digital library has multiple senses that one might invoke in various contexts. For example, the concept may refer simply to
the notion of collection without reference to organization, intellectual accessibility or service attributes. This extended sense
seems to be in play, for example, when we hear the World Wide Web described as a digital library. The concept might also
refer to the organization underlying the collection, or even more specifically to the computer-based system in which the
collection resides (DLF, 1995). Digital libraries represent the meeting point of a large number of disciplines and fields,
including data management, information retrieval, library sciences, document management, information systems, the Web,
image processing, artificial intelligence, human-computer interaction, and others (Ioannidis, 2005). The Alex Catalogue of
Electronic Texts is one example of a digital library.
The Digital Library Federation (DLF) was founded in 1995 by some of the most prominent institutions in the United States,
including Harvard University, Columbia University, Princeton, Yale, and the Library of Congress. In the Digital Library
Federation's charter, signed May 1, 1995, the federation lists seven main goals (DLF, 1995). These goals are a good
example of the thinking required before, during, and after creating a digital library. They include: the implementation of an
open digital library, accessible across the global Internet, filled with printed documents converted to digital form and
incorporating holdings already in electronic form; the establishment of a collaborative management structure for ongoing
maintenance of the digital library; the development of a coordinated funding strategy from both public and private sources;
the formation of selection guidelines that ensure conformance to a theme and a large corpus of significant materials; the
involvement of leaders in government, education, and the private sector to address network issues and policy; and the
establishment of a comprehensive evaluation of how clients make use of the digital library for research, how that usage
compares to traditional library research, and how digital libraries affect the mission, economy, and staffing of
organizations and library institutions (DLF, 1995).
The Center for the Study of Digital Libraries (CSDL) was established in 1995 at Texas A&M. The center provides experience
and expertise to help transfer all types of collections, from books to biological specimens, into digital libraries. The center
also takes a leadership role in the online development and application of world-wide access to digital library services
(CSDL, 1995). According to the center's mission statement, digital libraries will be ubiquitous in the future and will provide
the basis for a very broad set of distributed living activities including computer-supported cooperative work, distance
learning, electronic commerce and entertainment. The statement adds that the transition to an electronic information
workplace has already begun in full force, and that digital libraries will significantly impact the quality of education and,
indeed, the quality of life over the next decade (CSDL, 1995). The CSDL has created a few notable digital library projects
including the George Bush Digital Library, the Cervantes Project 2001, and the TAMU Herbaria Project. One of its most
interesting projects is called Walden's Paths. Walden's Paths is a K-12 education project intended to help educators
organize the web for their students (Alderman, 1998).
These digital libraries do completely different things, have completely different interfaces, and use different technology to
display information for their users. The Alex Catalogue of Electronic Texts has a very simple, clean layout and a simple
search function; no advanced search is available. American Memory, by the Library of Congress, is a little more advanced
than the Alex Catalogue and has a somewhat more complicated setup because it offers a wider variety in its collections.
The Alexandria Digital Library has a much more sophisticated search function. Using its National Geospatial Digital
Archive, one can search anywhere throughout the United States and retrieve satellite images, multiple varieties of air
photos, and maps.
The University of California, Berkeley started the Cheshire II system to move its OPAC toward full-text online resources. It
was developed to provide ranked retrieval based on probabilistic methods while still supporting Boolean retrieval. It has
been continually redesigned to meet the information retrieval needs of a broader world. Its primary use is with full-text and
structured metadata collections based on SGML (Standard Generalized Markup Language) and XML (Extensible Markup
Language) (Larson, 1999). The Cheshire II search engine functions as an information retrieval protocol server providing
access to a set of databases (Larson, 1999). What sets the Cheshire II search engine apart from other search engines is that
a natural language query can be used to retrieve the records that have the highest estimated probability of being relevant
given the user's query (Larson, 1999). Any items found in a search can be selected and used as queries in a relevance
feedback search.
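The difference between Boolean retrieval, ranked retrieval and relevance feedback can be shown on a toy collection. The sketch below does not reproduce Cheshire II's probabilistic model; a plain tf-idf score stands in for the relevance estimate, and the three documents and all function names are invented for this example.

```python
import math
from collections import Counter

# Tiny invented collection; ids and texts are illustrative only.
DOCS = {
    "d1": "digital library metadata and retrieval",
    "d2": "library card catalog history",
    "d3": "probabilistic retrieval for digital collections",
}


def tokenize(text):
    return text.lower().split()


# Term frequencies per document.
TOKENS = {doc_id: Counter(tokenize(text)) for doc_id, text in DOCS.items()}


def boolean_and(query):
    """Boolean retrieval: ids of documents containing every query term."""
    terms = tokenize(query)
    return sorted(d for d, tf in TOKENS.items() if all(t in tf for t in terms))


def idf(term):
    """Smoothed inverse document frequency; rarer terms weigh more."""
    df = sum(1 for tf in TOKENS.values() if term in tf)
    return math.log((len(TOKENS) + 1) / (df + 1)) + 1.0


def ranked(query):
    """Ranked retrieval: (doc_id, score) pairs, best first, zero scores dropped."""
    terms = tokenize(query)
    scores = {}
    for doc_id, tf in TOKENS.items():
        score = sum(tf[t] * idf(t) for t in terms)
        if score > 0:
            scores[doc_id] = score
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def feedback(doc_id):
    """Relevance feedback: reuse a selected document's text as the next query."""
    return [(d, s) for d, s in ranked(DOCS[doc_id]) if d != doc_id]


print(boolean_and("digital retrieval"))
print([d for d, _ in ranked("digital library retrieval")])
print([d for d, _ in feedback("d2")])
```

Boolean retrieval returns an unordered set of exact matches, whereas the ranked query also returns partial matches, ordered by score; feeding a chosen record back in as a query is the simplest form of the relevance feedback search mentioned above.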
One of the most important issues within the digital library community is the conversion of paper text to a digital format.
There are compelling preservation, access, and economic reasons for creating a digital master in which all significant
information contained in the source document is fully represented. The most obvious argument for full-informational capture
can be made in the name of preservation. Under some circumstances, the digital image can serve as a replacement for the
original. In these cases, the digital image must fully represent all significant information contained in the original, as the
image itself becomes the source document and must satisfy all research, legal, and fiscal requirements. If the digital image is
to serve as a surrogate for the original (which can then be stored away under proper environmental controls), the image must
be rich enough to reduce or eliminate users' needs to view the original (Chapman, 1996). Digital conversions should remain
faithful to their respective originals, both because users' needs and computing capabilities vary tremendously and because
printing, display, and image processing requirements are not all served by delivering the same image. Completeness, detail,
and speed of output are often conflicting requirements within a digital conversion.
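The tension between detail and speed of output can be made concrete with a little arithmetic: the uncompressed size of a scan is its pixel count multiplied by the bits stored per pixel. The page size, resolutions and bit depths below are illustrative assumptions, not figures from the source.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Size in bytes of an uncompressed raster scan of a page."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed example: an 8.5 x 11 inch page captured three ways.
master = uncompressed_bytes(8.5, 11, 600, 24)   # 600 dpi, 24-bit colour master
screen = uncompressed_bytes(8.5, 11, 100, 24)   # 100 dpi screen derivative
bitonal = uncompressed_bytes(8.5, 11, 600, 1)   # 600 dpi, 1-bit black and white

print(master, screen, bitonal, master // screen)
```

Under these assumptions the colour master is 36 times larger than the screen copy, which is one reason a rich master is typically kept for preservation while smaller derivatives are delivered for everyday display.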
Digital libraries have distinct advantages and disadvantages compared with traditional libraries. Digital libraries have the
potential to contain an unlimited amount of information. Traditional libraries, on the other hand, have to be contained in a
physical space, which limits the amount of information that can be held within the library. Similarly, a traditional library's
physical space makes it difficult for people far away to access the library's collection. A digital library can be accessed from
all around
the world. Traditional libraries have hours of operation. When the library is closed, you cannot access the information. A
digital library can be accessed at any time. When someone takes out a piece of a traditional library's collection, no one else
can use it. Any piece of a digital library can be accessed by numerous people at the exact same time.
Digital libraries have had their share of critics. Digital libraries can run into issues with copyright law. These concerns can
be avoided by using public domain content only. Another approach is to license content and distribute it on a commercial
basis. A major criticism of digital libraries is that many can only be accessed by certain audiences. For example, many digital
libraries require payment to access their collections, so the people who usually get access are those in an academic setting.
Many people who could gain a lot from access to such information will never get the chance. Many argue that digital
libraries could mean the end of the printed book. Digital libraries will continue to grow but will not completely replace the
printed book. According to Walt Crawford,
books are comfortable, portable, reliable, and economical means of providing large amounts of information in compact form
(Crawford, 1998). Crawford gives a very strong argument on why traditional libraries will continue to be viable even with the
proliferation of digital libraries.
Figure 3.1: Mode of operation of a digital library (Eskinder Asmelash PPT, 2010)
4. Client services for the browser, including repository querying and workflow
5. Content delivery via file transfer or streaming media
6. Patron access through a browser or dedicated client
7. A private or public network.
3.4.1 Alfresco (software): - Alfresco is a free/libre enterprise content management system for Microsoft Windows and
Unix-like operating systems. It is used for enterprise content management of documents, web content, records and
images, and for collaborative content development.
3.4.2 Cambridge Imaging Systems: - Cambridge Imaging Systems, founded in 1996, is a software company based near
Cambridge, UK, that specializes in enterprise video platforms. It has one subsidiary company, Screenocean, based in
London, UK, which runs an online digital library containing program material and related metadata from the Channel 4
archive.
3.4.3 Digital Commons: - Digital Commons is a hosted open access institutional repository and publishing solution,
combining traditional institutional repository functionality with tools for peer-reviewed journal publishing, conference
management, and multimedia.
3.4.4 DSpace: - DSpace is an open source repository software package typically used for creating open access repositories
for scholarly and/or published digital content. While DSpace shares some feature overlap with content management
systems and document management systems, the DSpace repository software serves a specific need as a digital
archives system, focused on the long-term storage, access and preservation of digital content.
3.4.5 eXo Platform: - eXo Platform is an open source, standards-based, Enterprise Social Platform written in Java and
distributed under the GNU Lesser General Public License. The platform is sold and distributed by eXo Inc., a global
company with U.S. headquarters in San Francisco, California, global headquarters in France, and offices in Tunisia
and Vietnam.
3.4.7 Fedora Commons: - Fedora (or Flexible Extensible Digital Object Repository Architecture) is a digital asset
management (DAM) architecture upon which institutional repositories, digital archives, and digital library systems
might be built. Fedora is the underlying architecture for a digital repository, and is not a complete management,
indexing, discovery, and delivery application. It is a modular architecture built on the principle that interoperability
and extensibility are best achieved by the integration of data, interfaces, and mechanisms (i.e., executable programs)
as clearly defined modules.
3.4.8 Greenstone (Software): - Greenstone is a suite of software tools for building and distributing digital library
collections on the Internet or CD-ROM. It is open-source, multilingual software, issued under the terms of the GNU
General Public License. Greenstone is produced by the New Zealand Digital Library Project at the University of
Waikato, and has been developed and distributed in cooperation with UNESCO and the Human Info NGO in Belgium.
3.4.9 IntraText: - IntraText is a digital library that offers an interface while meeting formal requirements. Texts are
displayed in a hypertextual way, based on a Tablet PC interface. By linking words in the text, it provides
concordances, word lists, statistics and links to cited works. Most content is available under a Creative Commons
license. It also offers publishing services that enable similar advantages. The IntraText interface applies a cognitive
ergonomics model based on lexical hypertext and on the Tablet PC or touch screen interface. It uses a set of tools and
methods based on HLT (Human Language Technologies). IntraText is a reading, reference and search tool. It can be
used to read a work, to browse a text as hypertext, and to search for words and phrases through a simple click of a
pen or mouse.
3.4.10 Invenio: - Invenio is an open source software package that provides the tools for management of digital assets in an
institutional repository. The software is typically used for open access repositories for scholarly and/or published
digital content and as a digital library. Invenio is developed by the CERN Document Server Software
Consortium, and is freely available for download. Free and paid support models are available. The service provider TIND
Technologies was established in 2013 to accommodate the growing demand for support of Invenio.
3.4.11 Islandora: - Islandora is an open source digital repository system based on Fedora Commons, Drupal and a host of
additional applications. It is open source software (released under the GNU General Public License) and was
developed at the University of Prince Edward Island by the Robertson Library. Islandora may be used to create large,
searchable collections of digital assets of any type and is domain-agnostic in terms of the type of content it can steward.
It has a highly modular architecture with a number of key features.
3.4.12 Knowledge Tree: - Knowledge Tree, Inc. provides online software that helps sales and marketing teams discover,
manage, and refine the collateral they use in sales engagements. The technology is tuned for sales, sales operations,
and marketing teams. Based in Raleigh, North Carolina, the company also has an office in Cape Town, South Africa.
The company's product, also called Knowledge Tree, makes use of the Amazon EC2 cloud computing platform and
Salesforce.com's Force.com platform. Knowledge Tree’s features — including content discovery, reporting, and
editing — are designed to support B2B sales situations that depend on collateral and documents. The service is
available on a subscription basis.
3.4.13 Pleade: - Pleade is an open source search engine and browser for archival finding aids encoded in EAD (an XML
standard for encoding archival finding aids). Based on the SDX platform, it is a very flexible web application.
3.4.14 SABDA: - SABDA, or SABDA Bible Software, is an Indonesian integrated Bible study platform based on the Online
Bible (OLB) engine, with multilingual Bibles available in the program (including Indonesian, Malay, English, Greek and
Hebrew, and many local languages of Indonesia). The word sabda is the Indonesian word for Logos (via Sanskrit: shabda),
and is also an abbreviation of "Software Alkitab, Biblika Dan Alat-alat" (Bible Software, Biblical Resources, And Tools).
It is produced and managed by Yayasan Lembaga SABDA (SABDA Foundation), which since 1994 has translated and made
freely available more than 100 Biblical modules in Indonesian, besides the default OLB modules.
1) Computer, mobile phone or any device that can access the network.
2) Storage device or database where data and information are kept and stored.
3) Scanner that will be used to convert traditional objects into digitized objects.
4) Printer that will be used to print out digitized objects.
5) Internet modem which will be used to access the network.
6) Traditional materials such as books, magazines, etc.
1) No physical boundary. The user of a digital library need not go to the library physically; people from all over the
world can gain access to the same information, as long as an Internet connection is available.
2) Round-the-clock availability. A major advantage of digital libraries is that people can gain access to the information 24/7.
3) Multiple access. The same resources can be used simultaneously by a number of institutions and patrons. This may not
be the case for copyrighted material: a library may have a license for "lending out" only one copy at a time; this is
achieved with a system of digital rights management where a resource can become inaccessible after expiration of the
lending period or after the lender chooses to make it inaccessible (equivalent to returning the resource).
4) Information retrieval. The user is able to use any search term (word, phrase, title, name, or subject) to search the
entire collection. Digital libraries can provide very user-friendly interfaces, giving clickable access to their resources.
5) Preservation and conservation. Digitization is not a long-term preservation solution for physical collections, but does
succeed in providing access copies for materials that would otherwise fall to degradation from repeated use.
6) Space. Whereas traditional libraries are limited by storage space, digital libraries have the potential to store much more
information, simply because digital information requires very little physical space and media storage technologies are
more affordable than ever before.
7) Added value. Certain characteristics of objects, primarily the quality of images, may be improved. Digitization can
enhance legibility and remove visible flaws such as stains and discoloration.
8) Easily accessible.
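The one-copy-at-a-time lending mentioned under "Multiple access" above can be sketched as a small state machine: a licensed title is either free or held by one patron until the loan expires or the patron returns it. This is a simplified illustration, not a real digital rights management system; the class name, method names and the 14-day loan period are assumptions.

```python
from datetime import datetime, timedelta


class SingleCopyLicense:
    """A title licensed for lending to one patron at a time (illustrative)."""

    def __init__(self, title, loan_days=14):
        self.title = title
        self.loan_period = timedelta(days=loan_days)
        self.borrower = None  # patron currently holding the copy, if any
        self.due = None       # moment at which the loan expires

    def _expire(self, now):
        # An expired loan frees the copy without any action by the patron.
        if self.due is not None and now >= self.due:
            self.borrower, self.due = None, None

    def borrow(self, patron, now):
        """Lend the copy if it is free; return True on success."""
        self._expire(now)
        if self.borrower is not None:
            return False
        self.borrower, self.due = patron, now + self.loan_period
        return True

    def give_back(self, patron):
        """Early return, equivalent to returning a physical book."""
        if self.borrower == patron:
            self.borrower, self.due = None, None

    def can_read(self, patron, now):
        """The resource is accessible only to the current borrower."""
        self._expire(now)
        return self.borrower == patron


start = datetime(2024, 1, 1)
book = SingleCopyLicense("Example E-book")
first = book.borrow("alice", start)                     # copy is free
second = book.borrow("bob", start + timedelta(days=1))  # copy is on loan
later = book.borrow("bob", start + timedelta(days=15))  # alice's loan expired
```

Here the second request fails while the first loan is active, and succeeds once the lending period has run out.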
1) Schools.
2) Banks.
3) Business organizations.
4) Economics.
5) Hospitals.
IV. CONCLUSION
There will be continuing expansion of digital library activities. Digital libraries will build upon work being done in the
information and data management area. Digital libraries provide an effective means to distribute learning resources to students
and other users. Planning a digital library requires thoughtful analysis of the organization and its users, and an
acknowledgement of the cost and the need for infrastructure and ongoing maintenance (Adams, Jansen, and Smith 1999).
V. REFERENCES
[1] Academic Info. (n.d.). Retrieved from Academic Info: Digital Library: http://www.academicinfo.net/digital.html
[2] Adams, W. J. (1999). Planning, building, and using a distributed digital library. Third International Conference on Concepts in Library and
Information Science. Dubrovnik, Croatia.
[3] Alderman, J. (1998). Digital Library Project. Retrieved from http://www.unf.edu/~alderman/DigitalLibraries/DLProjects.html
[4] Besser, H. (n.d.). Historical Background of Digital library. Retrieved from Digital Humanities: http://www.digitalhumanities.org/companion/
[5] CSDL. (2007). The Center for the Study of Digital Libraries. Retrieved from http://www.csdl.tamu.edu/csdl/center/center.htm
[6] Different types of Digital Library. (2005). Retrieved from Computer Technology File Storage: www.helpme.com
[7] Digital Libraries. (n.d.). Retrieved from Digital Libraries: A selected Resource Guide: http://www.lita.org/ital/1603_klemperer.html
[8] Digital library scope. (n.d.). Retrieved from Digital library scope: http://www.dlib.org/metrics/public/papers/dig-lib-scope.html
[9] Gopal, K. (2000). Digital libraries in electronic information era. New Delhi: Author Press.