Metadata Catalogues in Spatial Information Systems
UDK 528:004.6:004.822:81’37:004.738.5
Review scientific article
ABSTRACT. This paper gives a short review of the Open Geospatial Consortium
(OGC) metadata catalogue services, which play a key role in geospatial resource
discovery in Spatial Data Infrastructures (SDI). A Spatial Data Infrastructure
comprises a collection of technologies, policies and institutional agreements
that provide easier access to geospatial data. An SDI supports geospatial data
discovery and evaluation, as well as various applications within government,
the commercial and non-profit sectors, academic institutions, etc. Metadata
catalogue services are specified in the OGC Catalogue Service Implementation
Specification. The part of the specification that defines a web interface
supporting the storage, retrieval, and management of data related to web
services is called Catalogue Service for the Web (CSW). Metadata catalogues are
service brokers, a key component of a service-oriented architecture that manages
shared resources and facilitates the discovery of resources within an open,
distributed system. OGC services have gained significant popularity in recent
years and the number of organizations using them has increased. However, the
full potential of metadata catalogues has not yet been reached, not only because
data are often not documented in the form of standardized metadata, but also
because the semantics of the data are missing. The paper analyses the usage of
metadata catalogue services in geodetic information systems and proposes a
possible solution for improvement.
1
Prof. dr. Miro Govedarica, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6,
RS-21000 Novi Sad, Serbia, e-mail: [email protected],
Dubravka Bošković, MSc, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6,
RS-21000 Novi Sad, Serbia, e-mail: [email protected],
Prof. emer. dr. Dušan Petrovački, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6,
RS-21000 Novi Sad, Serbia, e-mail: [email protected],
Prof. dr. Toša Ninkov, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6,
RS-21000 Novi Sad, Serbia, e-mail: [email protected],
Doc. dr. Aleksandar Ristić, Faculty of Technical Sciences, University of Novi Sad, Trg Dositeja Obradovića 6,
RS-21000 Novi Sad, Serbia, e-mail: [email protected].
314 Govedarica, M. i dr.: Metadata Catalogues in Spatial Information …, Geod. list 2010, 4, 313–334
1. Introduction
A service-oriented architecture (SOA) (Erl 2005) is a distributed computing
architecture based on loosely coupled interactions of geo-services. Its service
interaction model describes how different agents publish, discover, and invoke
geo-services, and is known as the “publish-find-bind” model (Fig. 1). This
model involves: publishing resource descriptions so that they are accessible to
prospective users (publish), discovering resources of interest according to
some set of search criteria (find) and interacting with the resource provider
to access the desired resources (bind). Within such an architecture a registry
service plays the essential role of matchmaker by providing publication and
search functionality, thereby enabling a requester to dynamically discover and
communicate with a suitable resource provider without requiring the requester
to have advance knowledge about the provider. The benefit of using SOA is that
monolithic software applications are replaced by a set of loosely coupled
services which can be reused and combined in various application domains. Those
services comply with standards, so their users are not vendor-dependent.
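The publish-find-bind interaction described above can be illustrated with a minimal in-memory sketch. It is not an OGC interface: the class names `ServiceDescription` and `Registry`, the keyword matching and the `parcel_lookup` function are assumptions introduced only for illustration, and a plain Python callable stands in for a network endpoint.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceDescription:
    """Metadata a provider publishes about one service (hypothetical fields)."""
    name: str
    keywords: List[str]
    endpoint: Callable[[str], str]  # stands in for a network address


@dataclass
class Registry:
    """In-memory stand-in for a catalogue acting as matchmaker."""
    entries: Dict[str, ServiceDescription] = field(default_factory=dict)

    def publish(self, description: ServiceDescription) -> None:
        # publish: the provider registers a description of its resource
        self.entries[description.name] = description

    def find(self, keyword: str) -> List[ServiceDescription]:
        # find: the requester searches the descriptions by a criterion
        return [d for d in self.entries.values() if keyword in d.keywords]


def parcel_lookup(parcel_id: str) -> str:
    """Hypothetical provider operation."""
    return f"geometry of parcel {parcel_id}"


registry = Registry()
registry.publish(ServiceDescription("parcels", ["cadastre", "parcel"], parcel_lookup))

# bind: the requester calls the provider it discovered, without prior knowledge of it
matches = registry.find("cadastre")
result = matches[0].endpoint("1234/5")
print(result)
```

The requester only knows the registry and a search keyword; the provider is resolved at run time, which is the loose coupling the model aims for.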
Resource descriptions are called metadata, which means data about data.
Metadata are registered in catalogues and are used to describe geospatial
resources (data and/or services). Their purpose is to enable the discovery and
evaluation of a geospatial resource and to provide information on how to access
and use that resource. Metadata can therefore be divided into three categories:
discovery metadata, exploration metadata and exploitation metadata. Discovery
metadata is the minimum amount of information that needs to be provided to
reveal the content of the resource to the user. This kind of metadata answers
the “what, why, when, who, where and how” questions about a geospatial
resource. Exploration metadata provides sufficient information to determine
that a resource fit for a given purpose exists, to evaluate its properties, and
to reference some point of contact for more information. It includes the
information required to let the user know whether the data will meet the
general requirements of a given problem. Exploitation metadata includes the
information required to access, transfer, load, interpret, and apply the data
in the end application where it is exploited.
This paper is organized as follows: the next Section presents the basic concepts
of the OGC Catalogue Service Specification, including information models,
catalogue interfaces together with examples, application profiles and catalogue
implementations. Section 3 describes the usage of metadata catalogues in
geodetic-cadastral information systems and lists some of the problems and
possible solutions. Finally, conclusions are given in Section 4.
Dublin Core
The Dublin Core is a metadata element set intended to enable discovery of
electronic resources. It is primarily used for author-generated description of
Web resources, but it is also used in communities such as museums, libraries,
government agencies, and commercial organizations. Dublin Core metadata is
specifically intended to support general-purpose resource discovery. Its
elements represent core concepts that are likely to be useful for resource
discovery. It uses only fifteen base text fields, which are usually inadequate
for even basic geospatial resource description and discovery, because there is
no means to declare what type of content is present in a text element
(coordinates, date or time, place name, etc.). Therefore, a more detailed
metadata model is needed to support the discovery of geospatial resources.
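The limitation of untyped text fields can be made concrete with a small sketch. The fifteen element names are those of the Dublin Core element set; the two records and the helper `try_bounding_box` are hypothetical and only illustrate that a consumer has to guess how a `coverage` value is to be read.

```python
from typing import Optional, Tuple

# The fifteen elements of the Dublin Core metadata element set
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Two hypothetical records: both are valid, both store coverage as plain text
record_a = {"title": "Digital geodetic plan", "coverage": "19.80 45.20 19.90 45.30"}
record_b = {"title": "Orthophoto", "coverage": "Novi Sad"}


def try_bounding_box(text: str) -> Optional[Tuple[float, ...]]:
    """Guess whether a text value holds four coordinates (an assumed convention)."""
    parts = text.split()
    if len(parts) != 4:
        return None
    try:
        return tuple(float(p) for p in parts)
    except ValueError:
        return None


for record in (record_a, record_b):
    print(record["title"], "->", try_bounding_box(record["coverage"]))
```

Nothing in the record states whether `coverage` is a bounding box, a place name or a time period, so a spatial query against such records depends on conventions agreed outside the metadata model.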
OWL (URL 9) is a standard ontology language for the Semantic Web from the World
Wide Web Consortium (W3C) (URL 10). It is built on top of RDF (Resource
Description Framework) (URL 11) and RDF Schema (URL 12), a family of
specifications for the description of web resources. OWL ontologies may be
categorized into three species or sub-languages: OWL-Lite, OWL-DL and OWL-Full.
A defining feature of each sub-language is its expressiveness. OWL-Lite is the
least expressive sub-language. OWL-Full is the most expressive sub-language,
but does not guarantee computational completeness or decidability. The
expressiveness of OWL-DL falls between that of OWL-Lite and OWL-Full. OWL-DL is
much more expressive than OWL-Lite and is based on Description Logics (Baader et
al. 2002), hence the suffix DL; it allows expressivity without losing
computational completeness and decidability for reasoning.
The information model has two components: one component models RDF and RDF
Schema and the other models OWL-DL. OWL-DL is used to ensure computational
decidability. The main benefit of this information model over others is that
the meaning of the concepts is made explicit, so semantic interoperability may
be achieved in addition to the syntactic interoperability provided by OGC
standards. An ontology may be used to describe a domain and to reason
automatically about the properties of that domain. Its role is to provide a
shared vocabulary within a certain domain and thereby avoid semantic
ambiguities.
catalogue service. The interface to the resource service can be private or an
OGC interface, such as WMS or WFS. The interface between the catalogue services
is the OGC catalogue interface. In this case, the catalogue service acts as both
a client and a server. Data returned from the remote OGC catalogue service are
handled by the catalogue service that sent the request, which returns them to
the original requester. In this way distributed search is accomplished.
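The dual client and server role can be sketched as follows. This is an in-memory illustration under stated assumptions, not the OGC catalogue interface: the class `Catalogue`, its `peers` list and the `visited` set that guards against cyclic forwarding are all hypothetical.

```python
from typing import List, Optional, Set


class Catalogue:
    """A catalogue that answers queries (server) and forwards them (client)."""

    def __init__(self, name: str, records: List[str]) -> None:
        self.name = name
        self.records = records
        self.peers: List["Catalogue"] = []  # other catalogues it may query

    def search(self, keyword: str, visited: Optional[Set[str]] = None) -> List[str]:
        visited = set() if visited is None else visited
        if self.name in visited:
            return []  # already asked: avoid endless forwarding in a cycle
        visited.add(self.name)
        # Server role: answer from the local records
        hits = [r for r in self.records if keyword.lower() in r.lower()]
        # Client role: forward the query and merge what the peers return
        for peer in self.peers:
            hits.extend(peer.search(keyword, visited))
        return hits


national = Catalogue("national", ["Cadastral parcels"])
regional = Catalogue("regional", ["Parcels of Vojvodina", "Road network"])
national.peers.append(regional)
regional.peers.append(national)  # deliberate cycle

print(national.search("parcels"))
```

The original requester contacts only one catalogue and still receives records held by the others.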
General interfaces can be bound to several application protocols. The OGC
Catalogue Service specification provides the possibility of implementing a
catalogue service using one of the following application protocols:
• HTTP protocol (URL 13) binding – involves mapping the operations of the
General model to the message requests and responses that are common to all
web-based catalogue services. A catalogue service that implements the HTTP
protocol binding is called Catalogue Services for the Web (CSW); it is described
in more detail in the following section.
• Z39.50 protocol binding – uses a client-server architecture based on the
messages implemented using the ANSI / NISO Z39.50 Application Service Definition
and Protocol Specification (URL 14) for searching and retrieving information
from remote computer databases.
• CORBA / IIOP protocol binding – CORBA (Common Object Request Broker
Architecture) (URL 15) is a standard defined by the OMG (Object Management
Group) (URL 16) that enables software components written in different
programming languages and running on different computers to work together.
2.2.1. HTTP protocol binding (Catalogue Services for the Web, CSW)
In the HTTP protocol binding, the interaction between the client and the server
is accomplished using the standard request-response model of the HTTP protocol.
This means that the client, such as a web browser, submits an HTTP request
message to the server and expects to receive a response message from the
server. The basic message exchange pattern is illustrated in Fig. 3. The server
stores content or provides resources, which it delivers to the client.
There are two ways of encoding the request and response messages of the CSW.
First, they can be encoded as pairs of keywords (request parameters) and values
within the URL address of the server. This method is called Keyword-Value Pairs
(KVP). Secondly, they can be encoded using XML, an industry standard for the
exchange of data on the Internet. CSW client requests can also be included in a
message-based framework such as the Simple Object Access Protocol (SOAP),
The mandatory GetRecords operation is used for searching and presenting
information from the catalogue. The searching part of the GetRecords operation
is encoded using the Query element. The Query element includes the parameters
that specify which entities from the information model of the catalogue are
queried, and may also specify which query constraints shall be applied to
identify the request set. The constraint is specified using the OGC Filter
specification (URL 17). The presenting part of GetRecords indicates which
schema, i.e. information model (ISO 19115, ebRIM…), is used to generate the
response to the GetRecords operation and which properties of that schema should
be included in each record in the GetRecords response.
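A GetRecords request in XML encoding can be assembled as in the following sketch, which uses only the Python standard library. The element and attribute names follow the CSW 2.0.2 and Filter 1.1.0 schemas as understood here and should be checked against the specifications; the choice of the `csw:AnyText` queryable, the `brief` element set and the function name `build_get_records` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

CSW = "http://www.opengis.net/cat/csw/2.0.2"
OGC = "http://www.opengis.net/ogc"
ET.register_namespace("csw", CSW)
ET.register_namespace("ogc", OGC)


def build_get_records(search_term: str, max_records: int = 10) -> bytes:
    """Build a GetRecords request that searches all text for search_term."""
    root = ET.Element(f"{{{CSW}}}GetRecords", {
        "service": "CSW",
        "version": "2.0.2",
        "resultType": "results",
        "outputSchema": CSW,  # presenting part: schema of the returned records
        "maxRecords": str(max_records),
    })
    # Searching part: which entities are queried ...
    query = ET.SubElement(root, f"{{{CSW}}}Query", {"typeNames": "csw:Record"})
    # ... and which properties are returned for each record
    ET.SubElement(query, f"{{{CSW}}}ElementSetName").text = "brief"
    # Query constraint expressed with the OGC Filter encoding
    constraint = ET.SubElement(query, f"{{{CSW}}}Constraint", {"version": "1.1.0"})
    flt = ET.SubElement(constraint, f"{{{OGC}}}Filter")
    like = ET.SubElement(flt, f"{{{OGC}}}PropertyIsLike", {
        "wildCard": "%", "singleChar": "_", "escapeChar": "\\",
    })
    ET.SubElement(like, f"{{{OGC}}}PropertyName").text = "csw:AnyText"
    ET.SubElement(like, f"{{{OGC}}}Literal").text = f"%{search_term}%"
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


print(build_get_records("parcel").decode("utf-8"))
```

The resulting document would be sent in the body of an HTTP POST request to the catalogue endpoint.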
The mandatory GetRecordById request retrieves catalogue records using their
identifiers. A previous query has to be performed to obtain the identifiers
that may be used with this operation. For example, records returned by a
GetRecords operation may contain references to other records (their
identifiers) in the catalogue that may be retrieved using the GetRecordById
operation. This operation is a subset of the GetRecords operation and is
suitable for retrieving and linking to records in a catalogue.
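The two-step use of GetRecords and GetRecordById can be sketched as follows. The response document `SAMPLE_RESPONSE`, its identifiers and the endpoint `http://example.org/csw` are invented for illustration; the KVP parameter names (`Id`, `ElementSetName`) follow CSW 2.0.2 as understood here.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlencode

NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical GetRecords response holding two brief records
SAMPLE_RESPONSE = """<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <csw:SearchResults numberOfRecordsMatched="2" numberOfRecordsReturned="2">
    <csw:BriefRecord>
      <dc:identifier>id-001</dc:identifier>
      <dc:title>Cadastral parcels</dc:title>
    </csw:BriefRecord>
    <csw:BriefRecord>
      <dc:identifier>id-002</dc:identifier>
      <dc:title>Buildings</dc:title>
    </csw:BriefRecord>
  </csw:SearchResults>
</csw:GetRecordsResponse>"""


def extract_identifiers(xml_text: str) -> List[str]:
    """Step 1: collect the identifiers returned by a previous query."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.findall(".//dc:identifier", NS)]


def get_record_by_id_url(base_url: str, identifiers: List[str]) -> str:
    """Step 2: build a KVP GetRecordById request for those identifiers."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecordById",
        "Id": ",".join(identifiers),
        "ElementSetName": "full",
    }
    return f"{base_url}?{urlencode(params)}"


ids = extract_identifiers(SAMPLE_RESPONSE)
url = get_record_by_id_url("http://example.org/csw", ids)
print(url)
```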
There are two optional operations that may be used to insert, delete or update
records in the catalogue: Transaction and Harvest. The Transaction operation
is used to “push” data into the catalogue, whereas the Harvest operation “pulls”
data into the catalogue. That is, the Harvest operation only references the
data to be inserted or updated in the catalogue, and the catalogue service
should resolve the reference, fetch that data, and process it into the
catalogue, immediately or later depending on the mode of operation (synchronous
or asynchronous).
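The “pull” character of Harvest shows in the request itself, which carries only a reference to the metadata and not the metadata document. The sketch below builds such a request with the Python standard library. Element names follow the CSW 2.0.2 schema as understood here and should be verified against the specification; the source URL, the e-mail address and the function name `build_harvest` are hypothetical. Including a ResponseHandler is taken here as the way to ask for asynchronous processing.

```python
import xml.etree.ElementTree as ET
from typing import Optional

CSW = "http://www.opengis.net/cat/csw/2.0.2"
ET.register_namespace("csw", CSW)


def build_harvest(source_url: str, response_handler: Optional[str] = None) -> bytes:
    """Build a Harvest request that references a metadata document by URL."""
    root = ET.Element(f"{{{CSW}}}Harvest", {"service": "CSW", "version": "2.0.2"})
    # Only a reference is sent; the catalogue fetches the resource itself
    ET.SubElement(root, f"{{{CSW}}}Source").text = source_url
    # Assumed resource type: ISO 19139 metadata (gmd namespace)
    ET.SubElement(root, f"{{{CSW}}}ResourceType").text = "http://www.isotc211.org/2005/gmd"
    ET.SubElement(root, f"{{{CSW}}}ResourceFormat").text = "application/xml"
    if response_handler is not None:
        # Asynchronous mode: the outcome is reported to this address later
        ET.SubElement(root, f"{{{CSW}}}ResponseHandler").text = response_handler
    return ET.tostring(root, encoding="utf-8")


synchronous = build_harvest("http://example.org/metadata/parcels.xml")
asynchronous = build_harvest(
    "http://example.org/metadata/parcels.xml",
    response_handler="mailto:admin@example.org",
)
print(asynchronous.decode("utf-8"))
```

A Transaction request, by contrast, would embed the full metadata record in the request body.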
Fig. 4 shows a conceptual architecture that illustrates the relationship of CSW
interfaces to metadata consumers and producers. The arrows show the CSW
requests that producers and consumers of metadata can generate. For example, in
order to create metadata, a metadata producer may invoke a Transaction or
Harvest request. Similarly, a user of metadata may invoke a GetRecords request
to perform queries on the catalogue.
the address of the service, whereas the other part represents parameters of
the service expressed as keyword-value pairs. The order of parameters is
arbitrary.
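The KVP encoding can be sketched with the Python standard library. The endpoint `http://example.org/csw` and the helper names are hypothetical; the parameters shown are those commonly used with a GetCapabilities request. The example also shows that two URLs listing the same parameters in a different order decode to the same set of keyword-value pairs.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://example.org/csw"  # hypothetical service address


def kvp_url(base_url: str, **params: str) -> str:
    """Join the service address and the keyword-value pairs into one URL."""
    return f"{base_url}?{urlencode(params)}"


def parameters(url: str) -> Dict[str, List[str]]:
    """Decode the keyword-value part of a URL, as a server would."""
    return parse_qs(urlsplit(url).query)


url_a = kvp_url(BASE, service="CSW", request="GetCapabilities", AcceptVersions="2.0.2")
url_b = kvp_url(BASE, AcceptVersions="2.0.2", request="GetCapabilities", service="CSW")

print(url_a)
print(parameters(url_a) == parameters(url_b))
```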
Parameters of this request are the same as the parameters of the GetCapabilities
request, with one additional parameter:
• TypeName – a list of type names that are to be described by the catalogue. It
is a mandatory parameter. In this example, the request asks for the whole
metadata schema according to the ISO 19115 / ISO 19139 standard.
Parameters of this request are the same as the parameters of the GetCapabilities
request, with one additional parameter:
• typeNames – a list of one or more names of entities, from the information
model of the catalogue, that will be queried. It is a mandatory parameter. In
this example, the request asks for all metadata records from the catalogue.
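The two additional parameters can be contrasted in a short sketch. The parameter spellings `TypeName` and `typeNames` are taken from the text; the values `gmd:MD_Metadata` and `csw:Record`, the remaining parameters and the endpoint are assumptions chosen for illustration and may differ between catalogue implementations.

```python
from typing import Dict
from urllib.parse import urlencode

BASE = "http://example.org/csw"  # hypothetical service address


def describe_record_params(type_name: str) -> Dict[str, str]:
    """Ask the catalogue to describe the schema of one record type."""
    return {
        "service": "CSW",
        "version": "2.0.2",
        "request": "DescribeRecord",
        "TypeName": type_name,
    }


def get_records_params(type_names: str) -> Dict[str, str]:
    """Ask for all records of the given entity types (no constraint given)."""
    return {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": type_names,
        "resultType": "results",
        "ElementSetName": "full",
    }


describe_url = f"{BASE}?{urlencode(describe_record_params('gmd:MD_Metadata'))}"
records_url = f"{BASE}?{urlencode(get_records_params('csw:Record'))}"
print(describe_url)
print(records_url)
```

Because no constraint parameter is supplied in the second request, every record of the named type qualifies, which corresponds to asking for all metadata records.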
version, profile, binding and uniform resource name. In this way the problem
of various application profiles may be solved and increased interoperability
achieved.
There are several proprietary and open source solutions that implement some of
the OGC catalogue application profiles. They also provide the ability to create
metadata for geospatial resources according to various information models:
• GeoNetwork opensource (URL 24) is a web-based geographic metadata catalogue
application. It implements the ISO 19115/19139 Geographic Metadata, Z39.50,
CSW 2.0.2 and OGC WMS standards, among others.
• Deegree Web Catalogue Service (CSW) (URL 25) is a software package that
implements the OGC Catalogue Service Implementation Specification 2.0.2 and
the ISO 19115/19119 Application Profile 1.0.0.
• ESRI ArcCatalog (URL 26) is part of the ArcGIS development environment for
geographic information systems. ArcCatalog is used for cataloguing all GIS
resources within an organization and provides basic information about each of
them. It also allows the creation and update of metadata for each GIS resource
according to ISO 19115.
• ERDAS APOLLO Catalog (URL 27) offers a CSW-compliant view of the content
of the ERDAS APOLLO Catalog. The preferred OGC registry information model
is based on the ebXML registry information model, the ebRIM Application Profile
for CSW.
A similar metadata description may be given for raster data. The main
difference is the spatial representation information, which describes a grid
spatial representation that can be either a georectified or a georeferenceable
grid. A georectified grid is a grid whose cells are regularly spaced in a
geographic or map coordinate system defined in the spatial referencing system,
so that any cell in the grid can be geolocated given its grid coordinate and
the grid origin, cell spacing, and orientation. A georeferenceable grid is a
grid with cells irregularly spaced in any given geographic or map projection
coordinate system, whose individual cells can be geolocated using geolocation
information supplied with the data but cannot be geolocated from the grid
properties alone. Spatial representation information includes properties of the
grid such as the number of dimensions, axis properties, cell geometry,
availability of check points and transformation parameters, etc. The following
listing shows an extract of metadata for raster data in ISO 19115 format.
Information about the reference system, bounding box and distribution is
similar to that in the previous example. The difference lies in the spatial
representation information, which gives details of the grid.
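The statement that a cell of a georectified grid can be geolocated from the grid origin, cell spacing and orientation can be written out as a small function. The conventions are assumptions made for this sketch: the origin is the map position of cell (0, 0), columns run along an axis rotated counter-clockwise by `rotation_deg` from the map x axis, and rows run perpendicular to it, downwards when the rotation is zero.

```python
import math
from typing import Tuple


def cell_to_map(col: int, row: int,
                origin_x: float, origin_y: float,
                spacing_x: float, spacing_y: float,
                rotation_deg: float = 0.0) -> Tuple[float, float]:
    """Map coordinates of a grid cell in a georectified grid (assumed conventions)."""
    theta = math.radians(rotation_deg)
    # Column axis direction: (cos, sin); row axis direction: (sin, -cos)
    x = origin_x + col * spacing_x * math.cos(theta) + row * spacing_y * math.sin(theta)
    y = origin_y + col * spacing_x * math.sin(theta) - row * spacing_y * math.cos(theta)
    return x, y


# North-up grid with 10 m cells: two columns to the right, three rows down
print(cell_to_map(2, 3, 100.0, 200.0, 10.0, 10.0))
```

For a georeferenceable grid no such closed formula exists; the position of each cell has to be derived from the geolocation information delivered with the data.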
This sort of information enables discovery and retrieval of data according to title,
abstract, keywords, spatial and temporal extent, categories, themes, etc. It
answers the “what, where, when, why, who, and how” questions about geospatial
resources. ISO 19115 also specifies a Core metadata set, the minimum number of
metadata elements that should be maintained for a dataset in order to identify
it for catalogue purposes. It includes mandatory metadata elements as well as
recommended optional elements that increase interoperability, allowing users to
understand the geographic data and the related metadata provided by either the
producer or the distributor.
Metadata catalogues may facilitate retrieval of the themes and features of the
Digital geodetic plan. However, retrieval is based only on keyword search; the
semantics of the data are still missing and the user is not able to see the
details of the underlying data model. Retrieval of the data should consider
feature attributes, which can be spatial, thematic, qualitative and temporal.
Although an application schema may be referenced in a metadata set, the
heterogeneity of formats for its representation, as well as the unclear meaning
of schema elements, persists, and therefore the schema is not suitable for any
kind of automatic processing.
Record orientation of catalogues, as in ISO 19115, is a clear user/client
paradigm, but it is hard to maintain and limited for complex metadata
relationships. A registry model makes catalogues easier and more flexible to
maintain, but it is rather complex when exposed to the clients. ebRIM allows the
classification of data and services into categories, which only partially
solves the problem of
General ontology is the domain-independent core upper-level vocabulary
representing common human consensus reality that all other ontologies must
reference. Geospatial feature ontology provides the core geospatial vocabulary
and
Examples of the OWL classes from the LADM and the national cadastre ontologies
are given in the following listing. Listing 3.a shows the description of the
OWL class LA_SubParcel in N3 notation. This class is related to the class
LA_Parcel via an existential and a universal restriction on the property
isPartOfParcel, which specify that a subparcel must be related to a particular
parcel, i.e. if a subparcel exists it must be part of a particular parcel.
Listing 3.b shows the OWL class Building. This class is a subclass of the OWL
class LA_Building and is related to PartOfParcel. This relation indicates that
a building must be placed on exactly one part of a parcel. OWL classes with the
prefix LA_ belong to the LADM ontology, whereas the other classes belong to the
national cadastre ontology.
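The restrictions described above can also be read as description logic axioms. The following is only one possible reading of the prose, not a transcription of the listings; the property name isPlacedOn is hypothetical, since the text does not name the property that relates Building to PartOfParcel.

```latex
\begin{align*}
\mathit{LA\_SubParcel} &\sqsubseteq
  \exists\, \mathit{isPartOfParcel}.\mathit{LA\_Parcel}
  \;\sqcap\; \forall\, \mathit{isPartOfParcel}.\mathit{LA\_Parcel} \\
\mathit{Building} &\sqsubseteq
  \mathit{LA\_Building}
  \;\sqcap\; (=1\ \mathit{isPlacedOn})
  \;\sqcap\; \forall\, \mathit{isPlacedOn}.\mathit{PartOfParcel}
\end{align*}
```

The existential restriction requires at least one related parcel, while the universal restriction requires that everything related via isPartOfParcel is a parcel. The second axiom combines an unqualified cardinality of exactly one with a universal restriction, a form that stays within OWL-DL.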
4. Conclusions
One of the essential components for the construction of a spatial data
infrastructure at a regional, national or global level is the geospatial
catalogue service. For the catalogue to be a useful component, however, it must
enable access to geospatial metadata independently of the nature of the search
client applications. Client applications do not need to be developed by the
same company, or with the same technology, as the server. This is achieved by
the OGC Catalogue specification, with which vendors must comply in order to
achieve interoperability and make this organizational and technological
independence possible. This paper reviews the General catalogue model, the
various information models that can be implemented, the different protocol
bindings, among which the most common is the HTTP protocol binding that enables
a web interface, and the application profiles that combine various information
models and protocol bindings. The usage of metadata catalogue services in
geodetic information systems has been analysed. The problem of the semantics of
data has been discussed and a possible solution for improvement based on
ontologies has been proposed. It is necessary to develop these ontologies in
accordance with the OGC and ISO 19100 series of standards and data models.
References
Baader, F., McGuinness, D., Nardi, D., Patel-Schneider, P. F. (2002): Description Logic
Handbook: Theory, Implementation and Applications, Cambridge University
Press, Cambridge, UK.
Bai, Y., Di, L., Wei, Y. (2009): A taxonomy of geospatial services for global service
discovery and interoperability, Computers and Geosciences, 35, 783–790.
Bošković, D., Ristić, A., Govedarica, M., Pržulj, Đ. (2010): Ontology Development for
Land Administration, Proceedings of IEEE International Symposium on
Intelligent Systems and Informatics (SISY), Subotica, 437–442.
Bulatović, V., Ninkov, T., Sušić, Z. (2010): Open Geospatial Consortium Web Services in
Complex Distribution Systems, Geodetski list, 1, 13–29.
Erl, T. (2005): Service-Oriented Architecture: Concepts, Technology, and Design,
Prentice Hall, New Jersey, USA.
Guarino, N. (1998): Formal ontology and information systems, Proceedings of the First
International Conference on Formal Ontologies in Information Systems, FOIS’98,
Trento, Italy, 3–15.
Knublauch, H., Fergerson, R. W., Noy, N. F., Musen, M. A. (2004): The Protégé OWL
Plugin: an open development environment for semantic web applications, Lecture
Notes in Computer Science 3298, Springer, 229–243.
Lutz, M., Klien, E. (2006): Ontology-based retrieval of geographic information,
International Journal of Geographical Information Science, 20, 233–260.
Maguire, D. J., Longley, P. A. (2005): The emergence of geoportals and their role in
spatial data infrastructures, Computers, Environment and Urban Systems, 29, 3–14.
Nebert, D. (2004): Developing Spatial Data Infrastructures: The SDI Cookbook, available
at: http://www.gsdi.org/docs2004/Cookbook/cookbookV2.0.pdf, (20.09.2010.).
Nogueras-Iso, J., Zarazaga-Soria, F. J., Bejar, R., Alvarez, P. J., Muro-Medrano, P. R.
(2005): OGC Catalog Services: a key element for the development of Spatial Data
Infrastructures, Computers and Geosciences, 31, 199–209.
Staab, S., Studer, R. (2009): Handbook on Ontologies, Springer, Berlin, Germany.
Tait, M. G. (2005): Implementing geoportals: applications of distributed GIS,
Computers, Environment and Urban Systems, 29, 33–47.
Zhao, P., Di, L., Yang, W., Yu, G., Yue, P. (2009): Geospatial Semantic Web: Critical
Issues, in: Karimi, H.A., Handbook of Research on Geoinformatics, Information
Science Reference, 178–189.
URL 1: Open Geospatial Consortium,
http://www.opengeospatial.org, (21.09.2010.).
URL 2: OpenGIS Catalogue Service Implementation Specification 2.0.2, 2007.,
http://www.opengeospatial.org/standards/cat, (21.09.2010.).
URL 3: ISO 15836:2003 – Information and documentation – The Dublin Core metadata
element set,
http://www.iso.org/iso/catalogue_detail.htm?csnumber=37629, (21.09.2010.).
URL 4: ISO 19115:2003 – Geographic information – Metadata,
http://www.iso.org/iso/catalogue_detail.htm?csnumber=26020, (21.09.2010.).
URL 5: OASIS ebXML Registry Information Model Version 3.0,
http://www.oasis-open.org/committees/download.php/13591/
/docs.oasis-open.orgregrepv3.0specsregrep-rim-3.0-os.pdf, (21.09.2010.).
Keywords: OGC catalogue, CSW, spatial information systems, metadata, semantics.
Accepted: 2010-11-05