GeoNetwork User Manual
Release 2.8.0
GeoNetwork
Contents
1 Preface
1.1 About this Project
1.2 License Information
1.3 Author Information
2 Quick Start Guide
2.1 Geographic Information Management for all
2.2 Getting Started
2.3 Viewing and Analysing the Data
2.4 Adding a metadata record
2.5 Uploading a New Record using the XML Metadata Insert Tool
2.6 Metadata in Spatial Data Management
2.7 New Features
2.8 Installing the software
2.9 Upgrading to a new Version
3 Administration
3.1 Basic configuration
3.2 OGC CSW server configuration
3.3 Advanced configuration
3.4 User and Group Administration
3.5 Localization
3.6 System Monitoring
4 Managing Metadata
4.1 Templates
4.2 Ownership and Privileges
4.3 Import facilities
4.4 Export facilities
4.5 Status
4.6 Versioning
4.7 Harvesting
4.8 Formatter
4.9 Processing
4.10 Fragments
4.11 Schemas
5 Features
5.1 Multilingual search
5.2 Search Statistics
5.3 Thesaurus
5.4 User Self-Registration Functions
6 Glossary of Metadata Fields Description
7 ISO Topic Categories
8 Free and Open Source Software for Geospatial Information Systems
8.1 Web Map Server software
8.2 GIS Desktop software
8.3 Web Map Viewer and Map Server Management
9 Frequently Asked Questions
9.1 HTTP Status 400 Bad request
9.2 Metadata insert fails
9.3 Thumbnail insert fails
9.4 The data/tmp directory
9.5 What/Where is the GeoNetwork data directory?
9.6 The base maps are not visible
10 Glossary
Index
Welcome to the GeoNetwork User Manual v2.8.0. The manual is a guide describing how to use the metadata catalog.
Other documents:
GeoNetwork Developer Manual
GeoNetwork User Manual (PDF)
CHAPTER 1
Preface
1.2.2 Documentation
Documentation is released under a Creative Commons license with the following conditions. You are free to Share (to copy, distribute and transmit) and to Remix (to adapt) the documentation under the following conditions:
Attribution. You must attribute GeoNetwork opensource documentation to GeoNetwork opensource developers.
Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
With the understanding that:
Any of the above conditions can be waived if you get permission from the copyright holder.
Public Domain. Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
Other Rights. In no way are any of the following rights affected by the license:
Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
The author's moral rights;
Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.
Notice: For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to this web page.
You may obtain a copy of the License at Creative Commons Attribution-ShareAlike 3.0 Unported License.
The document is written in reStructuredText format for consistency and portability.
CHAPTER 2
Background and evolution
The prototype of the GeoNetwork catalogue was developed by the Food and Agriculture Organization of the United Nations (FAO) in 2001 to systematically archive and publish the geographic datasets produced within the organisation. The prototype was built on experiences within and outside the organisation. It used metadata content available from legacy systems that was transformed into what was then only a draft metadata standard, ISO 19115.
Later on, another UN agency, the World Food Programme (WFP), joined the project, and with its contribution the first version of the software was released in 2003 and operational catalogues were established in FAO and WFP. The system was based on the ISO19115:DIS metadata standard and embedded the Web Map Client InterMap that supported Open Geospatial Consortium (OGC) compliant Web Map Services. Distributed searches were possible using the standard Z39.50 catalogue protocol. At that moment it was decided to develop the program as Free and Open Source Software to allow the whole geospatial user community to benefit from the development results and to contribute to the further advancement of the software.
Jointly with the UN Environmental Programme (UNEP), FAO developed a second version in 2004. The new release allowed users to work with multiple metadata standards (ISO 19115, FGDC and Dublin Core) in a transparent manner. It also allowed metadata to be shared between catalogues through a caching mechanism, improving reliability when searching in multiple catalogues.
In 2006, the GeoNetwork team dedicated efforts to develop a DVD containing GeoNetwork version 2.0.3 and the best free and open source software in the field of Geoinformatics. The DVD was produced and distributed in hard copy to over three thousand people.
More recently, the OSGeo Live project has been developed with GeoNetwork and all the best Open Source Geospatial software available on a self-contained bootable DVD, USB thumb drive or Virtual Machine based on Xubuntu. The GeoNetwork community has been a part of this project and will continue to make sure the latest stable version of GeoNetwork is included. You can download the OSGeo-Live images from the OSGeo Live website.
GeoNetwork opensource is the result of the collaborative development of many contributors. These include, among others, the Food and Agriculture Organization (FAO), the UN Office for the Coordination of Humanitarian Affairs (UNOCHA), the Consultative Group on International Agricultural Research (CSI-CGIAR), the UN Environmental Programme (UNEP) and the European Space Agency (ESA).
Support for the metadata standard ISO19115:2003 has been added by using the ISO19139:2007 implementation specification schema published in May 2007. The release also serves as the open source reference implementation of the OGC Catalogue Service for the Web (CSW 2.0.2) specification. Improvements to give users a more responsive and interactive experience have been substantial and include a new Web map viewer and a complete revision of the search interface.
The use of International Standards
GeoNetwork has been developed following the principles of Free and Open Source Software (FOSS) and is based on International and Open Standards for services and protocols, like the ISO-TC211 and the Open Geospatial Consortium (OGC) specifications. The architecture is largely compatible with the OGC Portal Reference Architecture, i.e. the OGC guide for implementing standardised geospatial portals. Indeed, the structure relies on the same three main modules identified by the OGC Portal Reference Architecture, which focus on spatial data, metadata and interactive map visualisation. The system is also fully compliant with the OGC specifications for querying and retrieving information from Web catalogues (CSW). It supports the most common standards to specifically describe geographic data (ISO19139 and FGDC) and the international standard for general documents (Dublin Core). It also uses standards (OGC WMS) for visualising maps through the Internet.
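As a rough illustration of what querying a catalogue through CSW involves, the sketch below builds a key-value-pair GetCapabilities request for a CSW 2.0.2 service. The endpoint URL is a placeholder chosen for this example and is not taken from this manual; substitute the CSW address of your own catalogue.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption): replace with the CSW URL of your catalogue.
CSW_ENDPOINT = "http://example.org/geonetwork/csw"


def csw_capabilities_url(endpoint: str = CSW_ENDPOINT) -> str:
    """Return a GetCapabilities URL for a CSW 2.0.2 service."""
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetCapabilities",
    }
    return endpoint + "?" + urlencode(params)


print(csw_capabilities_url())
```

Opening the resulting URL in a browser should return an XML document describing the operations the catalogue supports, provided the endpoint is correct.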
Harvesting geospatial data in a shared environment
Within the geographic information environment, increased collaboration between data providers and their efforts to reduce duplication have stimulated the development of tools and systems that significantly improve information sharing and guarantee easier and quicker access to data from a variety of sources without undermining the ownership of the information. The harvesting functionality in GeoNetwork is a data collection mechanism that respects both rights of data access and protection of data ownership. Through harvesting it is possible to collect public information from the different GeoNetwork nodes installed around the world and to periodically copy and store this information locally. In this way a user can, from a single entry point, also get information from distributed catalogues. The logo posted on top of each harvested record informs the user about the data source.
Geographic search. For the geographic search, two options are available for selecting a particular region to limit the search:
You can select a region from a predefined list;
Figure 2.2: The region field
You can select your own area of interest in a more interactive way. A small global map is shown on the screen, on which you can drag and drop the frame of your area of interest. Just click on the button on the upper right of the map screen.
Perform search. Both types of search, free text search and geographic search, can be combined to restrict the query further. Click the Search button to proceed and show the results.
Figure 2.9: Where section in the Advanced search
If you choose Spatial search type is Country, only maps for the selected country will be displayed. In other words, a city map within that country will not show in the output results.
If you choose Spatial search type overlaps Country, all maps with a bounding box overlapping that country will be displayed in the results, i.e. the neighbouring countries, the continent of which that country is part and the global maps.
If you choose Spatial search type encloses Country, you will get, in the output results, maps of that country first and then all maps within its bounding box.
Similarly, if you choose Spatial search type is fully outside of a selected region, only maps that meet that exact criterion will show in the output results.
The WHEN? section lets you restrict your search in terms of temporal extent, indicating a specific range of time referring to the data creation or publication date. To specify a range of time, click on the date selector button next to the From and To fields. Use the symbols > and >> on top of the calendar to select the month and the year first and then click on the exact day; a complete date will be filled in using the following standard order: YY-MM-DD. To clear the time fields, simply click on the white cross on their right; the box Any will be automatically selected and the search will be performed without any restriction on the time period.
Finally, the advanced search allows you to apply further restrictions on the basis of additional parameters such as data source, data categories and data format.
To limit your queries to only one Catalogue out of those made available by the installation through the harvesting process, highlight the catalogue of preference or just keep Any selected to search all sites.
To search for data organised by Category, such as Applications, Datasets, etc., simply highlight the category you wish to search in from the related drop-down list; otherwise we suggest leaving this field set to Any Category.
Figure 2.10: When section in the Advanced search
You can search for Digital or Hard Copy maps. To search in one or the other, simply check the box next to the one you wish to search. If no box is checked, all content will be searched.
Finally, you can customise the number of output results per page in the Hits Per Page field. Simply highlight the number of records to be displayed or leave the field set to the default number (10). Click the Search button.
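The four spatial search types of the WHERE? section can be read as simple relations between the bounding box of a record and the bounding box of the selected region. The sketch below is only an interpretation of the descriptions in this section, not GeoNetwork's actual implementation; the function names and the sample coordinates are invented for illustration, and boxes crossing the 180 degree meridian are not handled.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float


def is_equal(record: BBox, region: BBox) -> bool:
    """'is': the record covers exactly the selected region."""
    return record == region


def overlaps(record: BBox, region: BBox) -> bool:
    """'overlaps': the two boxes share some area (touching edges count)."""
    return (record.west <= region.east and record.east >= region.west
            and record.south <= region.north and record.north >= region.south)


def within(record: BBox, region: BBox) -> bool:
    """'encloses': the region's box fully contains the record's box."""
    return (record.west >= region.west and record.east <= region.east
            and record.south >= region.south and record.north <= region.north)


def fully_outside(record: BBox, region: BBox) -> bool:
    """'is fully outside of': the boxes have nothing in common."""
    return not overlaps(record, region)


# Invented sample boxes: a country, a city inside it, its continent, a far-away map.
country = BBox(0.0, 0.0, 10.0, 10.0)
city = BBox(2.0, 2.0, 3.0, 3.0)
continent = BBox(-20.0, -20.0, 40.0, 40.0)
far_away = BBox(50.0, 50.0, 60.0, 60.0)

print(is_equal(city, country), overlaps(continent, country),
      within(city, country), fully_outside(far_away, country))
```

With these sample boxes the city map is excluded by "is" but returned by "encloses", which matches the behaviour described above.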
Inspire
If the INSPIRE Search panel is enabled in the Administration > System configuration page, an additional section is displayed to allow searching INSPIRE metadata in the catalog.
Annex: Searches for metadata related to a specific Inspire annex. The Inspire annexes for a metadata record are based on the Inspire theme keywords assigned to it.
Source type: Searches for dataset or service metadata.
Service type: Searches for service metadata using the service type values defined in the INSPIRE metadata regulation (section 1.3.1).
Classification of data services: Searches for metadata that have the selected keyword from the Inspire service taxonomy thesaurus.
Inspire themes: Searches for metadata that have the selected keywords from the Inspire themes thesaurus.
Figure 2.13: Search results
1. Metadata: The metadata section describes the dataset (e.g. citation, data owner, temporal/spatial/methodological information) and could contain links to other web sites that could provide further information about the dataset.
2. Download: Depending on the privileges that have been set for each record, when this button is present, the dataset is available and downloadable. Retrieving the data is simple and quick: click the download button or use the proper link in the specific metadata section for distribution info in the full metadata view.
3. Interactive Map: The map service is also optional. When this button is shown, an interactive map for this layer is available and, by default, it will be displayed on the map screen of the simple search. To better visualise the map through the map viewer, click on Show Map on the top of the search results panel.
Figure 2.15: Available services related to the resource
4. Graphic Overviews: There are small and large overviews of the map used to properly evaluate the usefulness of the data, especially if the interactive map is not available. Simply click on the small image to enlarge it.
Roles. Users with an Editor role can create, import and edit metadata records. They can also upload data and configure links to interactive map services.
User groups. Every authenticated user is assigned to a particular work group and is able to view data within that work group.
Figure 2.24: Temporal extent
The spatial extent of the area of interest is defined through geographic coordinates or through the selection of a country or region from a predefined list. Free text supplemental information can be added to complete the data identification section.
Distribution Section
This section provides metadata elements for accessing other useful on-line resources available through the web. The distribution elements allow for on-line access using a URL address or similar addressing scheme and provide the protocol for the proper connection for accessing geographic data or any other types of digital documents using the download function. Furthermore, it is possible to link a metadata record with a predefined map service through the online resource and see the map interactively.
Figure 2.26: Distribution information
Reference System Section
The Spatial Reference System section defines metadata required to describe the spatial reference system of a dataset. It contains one element to identify the name of the reference system used. Using elements from the advanced form, this section may be modified to provide more details on data projection, ellipsoid and datum. Note that if this information is provided, a reference system identifier is not mandatory.
Data Quality Section
The Data Quality section provides a general assessment of the quality of the data. It describes the different hierarchical levels of data quality, namely a dataset series, dataset, features, attributes, etc. This section also contains information about sources of the input data, and a general explanation of the production processes (lineage) used for creating the data.
Figure 2.28: Data quality
Metadata Information Section
This section contains information about the metadata itself: the Universally Unique Identifier (UUID) assigned to the record (this is the File identifier), language and character set used, date of last edit (Date stamp) and the metadata standard and version name of the record. It also contains information on the metadata author responsible for the metadata record; this person can also be a point of contact for the resource described. Information on the Metadata author is mandatory.
Type - Geographic Location - Reference System Info - Temporal Extent - Data Quality Info - Access and Use Constraints - Point of Contact - Distribution Info: Online Resources. You should also prepare an image of your data to be displayed in search results as a thumbnail. The next section will guide you through the process of metadata creation using the online editor.
2. Open the Administration page by clicking the Administration button in the banner and then click on the New metadata link.
3. From the metadata creation page, select the metadata standard to use from the dropdown list (Figure 4.3, Template selection)
4. After selecting the correct template, you should identify which group of users the metadata will belong to and finally click on Create.
5. A new metadata form based on the selected template will be displayed for you to fill out.
Figure 2.34: Metadata view options
In the previous chapter you analyzed the metadata structure as it is presented in the Default View. A selection of the main fields from different categories of information is shown in one single view. The minimum set of metadata required to serve the full range of metadata applications (data discovery, determination of data fitness for use, data access, data transfer and use of digital data) is defined here, along with optional metadata elements to allow for a more extensive standard description of geographic data, if required. However, if there is a need to add more metadata elements, you can switch to the advanced view at any time while editing.
In the Advanced View, the ISO profile offers the possibility to visualize and edit the entire metadata structure, organized in sections accessible through tabs in the left column. You can use this view to write more advanced metadata descriptions or templates to fit specialized needs.
The XML View shows the entire content of the metadata in the original hierarchical structure; different colors make it possible to distinguish between an element's name and its value. The XML structure is composed of tags, and every opening tag must have a corresponding closing tag. The content is entirely contained within the two, e.g.:
<gmd:language>
  <gco:CharacterString>eng</gco:CharacterString>
</gmd:language>
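To make the tag structure concrete, the following sketch reads the value out of the fragment above with Python's standard XML parser. The namespace declarations on the root element are an addition needed to parse the fragment on its own (in a full ISO 19139 record they appear on the document root), and the function name is invented for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs bound to the gmd and gco prefixes in ISO 19139 documents.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# The fragment from the text; the xmlns declarations are added for this example.
FRAGMENT = """
<gmd:language xmlns:gmd="http://www.isotc211.org/2005/gmd"
              xmlns:gco="http://www.isotc211.org/2005/gco">
  <gco:CharacterString>eng</gco:CharacterString>
</gmd:language>
"""


def read_language(xml_text: str) -> str:
    """Return the text held between the opening and closing CharacterString tags."""
    root = ET.fromstring(xml_text.strip())
    value = root.find("gco:CharacterString", NS)
    if value is None:
        return ""
    return value.text or ""


print(read_language(FRAGMENT))  # prints "eng"
```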
Note that using the XML view requires some knowledge of the XML language.
Both the Default and the Advanced Views are composed of mandatory, conditional and optional metadata fields. The meaning of mandatory and optional is fairly intuitive: the mandatory fields are required, like Title and Abstract for instance, whereas the optional fields can be provided but are not fundamental, depending on the metadata author. The conditional fields may be considered mandatory under certain circumstances: essentially, a conditional requirement indicates that the presence of a specified data element depends on the value or presence of other data elements in the same section. For instance, the Individual name metadata element of the Point of Contact, which is a conditional element of the Identification section, becomes mandatory if neither of the other elements of the same section, Organization name or Position name, is already defined.
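The conditional rule for the Point of Contact can be expressed as a small check. This is a sketch of the rule as described above, not code from GeoNetwork; the dictionary keys are hypothetical names chosen for the example.

```python
def _filled(contact: dict, key: str) -> bool:
    """True when the field exists and contains more than whitespace."""
    return bool(str(contact.get(key, "")).strip())


def individual_name_required(contact: dict) -> bool:
    """Individual name is mandatory only when neither alternative is defined."""
    return not (_filled(contact, "organisation_name")
                or _filled(contact, "position_name"))


def contact_is_valid(contact: dict) -> bool:
    """Apply the conditional requirement to one Point of Contact."""
    if individual_name_required(contact):
        return _filled(contact, "individual_name")
    return True


print(contact_is_valid({"organisation_name": "FAO"}))   # organisation alone is enough
print(contact_is_valid({"individual_name": "  "}))       # nothing usable was entered
```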
The mandatory fields, as well as those highly recommended, are flagged with a red asterisk [*]. The standard definition for each field can be read by hovering the mouse over the element name.
The Default View is the preferred view as it provides a selection of the available metadata elements, making it easier for both the user and the editor to read and edit a metadata record, and at the same time it ensures that geospatial data can be properly described, through:
the minimum set of metadata required to serve the full range of metadata applications (data discovery, determination of data fitness for use, data access, data transfer, and use of digital data);
optional metadata elements, to allow for a more extensive standard description of geographic data, if required;
a method for extending metadata to fit specialized needs.
gather as much information as possible to identify and understand the map's resource and characteristics you want to describe. Use the default view to start. If necessary, you can always switch to advanced view or come back later and edit the record with the additional information collected.
Please follow these steps to enter your map's metadata. Note that we will only go through the fields that have been identified as compulsory (i.e. those fields marked with the asterisk [*], mandatory or highly recommended).
Title *: Under the Identification Info field, give your map a name. There will be a default name of your data. Use free text to describe your map here.
Date *: Indicate the exact date of creation, publication or revision of your map.
Presentation Form: Specify the type of presentation, i.e. digital, hard copy, table, etc.
Abstract *: Enter some description of the map.
Purpose: Enter a short summary of the purposes for which your map was developed.
Status: Specify the status of your map within the following options: completed, historical archive, obsolete, ongoing, planned, required, under development.
Point of Contact: Enter all mandatory information and any other information you have at hand for the contact person(s) associated with the resources of the map. Note that some fields are only conditionally mandatory, such as Organization Name if Individual Name and Position are not entered.
Maintenance and update frequency *: Specify the frequency with which you expect to make changes and additions to your map after the initial version is completed. If no changes are scheduled you can leave As Needed selected from the drop-down list.
Descriptive Keywords: Enter keywords that describe your map. Also specify the type of keyword you are entering, i.e. place, theme, etc. Remember that you can add another keyword field if you need to add different types of keywords.
Access Constraints: Enter an access constraint here, such as a copyright, trademark, etc., to assure the protection of privacy and intellectual property.
Use Constraints: Enter a use constraint here to assure the protection of privacy and intellectual property.
Other Constraints *: Enter any other constraint here to assure the protection of privacy and intellectual property. Note that this field is conditionally mandatory if Access and Use constraints are not entered.
Spatial representation type: Select from the drop-down list the method used to spatially represent your data. The options are: vector, grid, text table, stereo model, video.
Scale Denominator *: Enter the denominator for an equivalent scale of a hard copy of the map.
Language *: Select the language used within your map.
Topic category *: Specify the main ISO category/ies through which your map could be classified (see Annex for the complete list of ISO topic categories).
Temporal Extent *: Enter the starting and ending date of the validity period.
Geographic Bounding Box *: Enter the longitude and latitude for the map or select a region from the predefined drop-down list. Make sure you use degrees for the unit of the geographic coordinates as they are the basis for the geographic searches.
Supplemental Information: Enter any other descriptive information about your map that can help the user to better understand its content.
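Because geographic searches rely on bounding boxes expressed in degrees, it can help to sanity-check the coordinates before entering them. The function below is a minimal sketch with an invented name and message wording; it only checks value ranges, and it deliberately accepts a west value greater than east, since that can legitimately occur for areas crossing the 180 degree meridian.

```python
from typing import List


def validate_bounding_box(west: float, east: float,
                          south: float, north: float) -> List[str]:
    """Return a list of problems; an empty list means the box looks plausible."""
    problems = []
    for name, value in (("west", west), ("east", east)):
        if not -180.0 <= value <= 180.0:
            problems.append(f"{name} longitude {value} is outside -180..180 degrees")
    for name, value in (("south", south), ("north", north)):
        if not -90.0 <= value <= 90.0:
            problems.append(f"{name} latitude {value} is outside -90..90 degrees")
    if south > north:
        problems.append(f"south {south} is greater than north {north}")
    return problems


# Coordinates given in metres (a projected system) are caught by the range check.
for problem in validate_bounding_box(500000.0, 600000.0, 35.0, 60.0):
    print(problem)
```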
Distribution Info: Enter information about the distributor and about options for obtaining your map.
Online Resource: Enter information about online resources for the map, such as where a user may download it, etc. This information should include a link, the link type (protocol) and a description of the resource.
Reference System Info: Enter information about the spatial reference system of your map. The default view contains one element to provide the alphanumeric value identifying the reference system used. GeoNetwork opensource (GNos) uses the EPSG codes, which are numeric codes associated with coordinate system definitions. For instance, EPSG:4326 is Geographic lat-long WGS84, and EPSG:32611 is UTM zone 11 North, WGS84. Using elements from the advanced view, you may add more details on data projection, ellipsoid and datum. Note that if this information is provided, a reference system identifier is not mandatory.
Data Quality: Specify the hierarchical level of the data (dataset series, dataset, features, attributes, etc.) and provide a general explanation of the production processes (lineage) used for creating the data. The statement element is mandatory if the hierarchical level element is equal to dataset or series. Detailed information on completeness, logical consistency and positional, thematic and temporal accuracy can be directly added into the advanced form.
Metadata Author *: Provide information about the author of the map, including the person's name, organization, position, role and any other contact information available.
After completion of this section, you may select the Type of document that you are going to save in the catalogue. You have three options: Metadata, Template, Sub-template. By default Metadata is set up. When done, you may click Save or Save and Close to close the editing session.
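The two EPSG examples above follow a regular pattern: WGS84 UTM zones in the northern hemisphere are numbered 32601 to 32660, and those in the southern hemisphere 32701 to 32760. The helper functions below are illustrative only (their names are invented here) and show how such an identifier can be built and split.

```python
from typing import Tuple


def wgs84_utm_epsg(zone: int, northern: bool = True) -> str:
    """Build the EPSG identifier of a WGS84 UTM zone."""
    if not 1 <= zone <= 60:
        raise ValueError("UTM zones are numbered 1 to 60")
    base = 32600 if northern else 32700
    return f"EPSG:{base + zone}"


def split_code(identifier: str) -> Tuple[str, int]:
    """Split an identifier such as 'EPSG:4326' into authority and numeric code."""
    authority, _, code = identifier.partition(":")
    return authority, int(code)


print(wgs84_utm_epsg(11))        # EPSG:32611, the UTM zone 11 North example
print(split_code("EPSG:4326"))   # ('EPSG', 4326)
```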
To create a thumbnail, go to the editing menu. If you are no longer in editing mode, retrieve the metadata record using one of the search options, then click on Edit. Then follow these simple steps:
From the editing menu, click on the Thumbnails button on the top or bottom of the page.
Figure 2.39: The thumbnail wizard button
You will be taken to the Thumbnail Management wizard.
To create a small or large thumbnail, click on the Browse button next to either one. It is recommended that you use 180 pixels for small thumbnails and 800x600 for large thumbnails. Using the Large thumbnail option allows you to create both a small and large thumbnail in one go.
You can use GIF, PNG and JPEG images as input for the thumbnails.
A pop-up window will appear allowing you to browse the files on your computer. Select the file you wish to create a thumbnail with by double-clicking on it.
Click on Add.
Your thumbnail will be added and displayed on the following page.
You can then click on Back to Editing and save your record.
Figure 2.43: Window to select WMS layer/s referenced in online resource to load in map viewer
Selecting protocols OGC-WMS Web Map Service, OGC Web Map Service 1.1.1 or OGC Web Map Service 1.3.0:
1. URL: URL of the WMS service
2. Name of the resource: WMS layer name (optional)
Figure 2.44: WMS online resource
The behaviour of the Interactive Map button depends on whether the user entered a layer name in the field Name of the resource: if not, a window is shown to select the layer/s to load in the map viewer; if so, the layer is loaded directly.
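The decision made by the Interactive Map button can be sketched as follows. This is an illustration of the behaviour described above, not GeoNetwork code: the function names and the returned action labels are invented, and the assumption that the layer list is obtained through a WMS GetCapabilities request is ours.

```python
from typing import Tuple
from urllib.parse import urlencode


def capabilities_url(service_url: str) -> str:
    """Append a WMS GetCapabilities query to the service URL."""
    query = urlencode({"SERVICE": "WMS", "REQUEST": "GetCapabilities"})
    if service_url.endswith(("?", "&")):
        return service_url + query
    separator = "&" if "?" in service_url else "?"
    return service_url + separator + query


def interactive_map_action(service_url: str, layer_name: str = "") -> Tuple[str, str]:
    """Load the named layer directly, or ask the user to choose among the layers."""
    name = layer_name.strip()
    if name:
        return ("load_layer", name)
    return ("choose_layers", capabilities_url(service_url))


print(interactive_map_action("http://example.org/wms", "rivers"))
print(interactive_map_action("http://example.org/wms"))
```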
100 Mbytes unless your system administrator has configured a larger limit in the GeoNetwork config.xml file;
5. Click Upload and then Save the metadata record.
This mechanism allows users to upload a GeoTIFF file or a zipped Shapefile to a metadata record and deploy that dataset as a Web Map Service on one or more GeoServer nodes. After linking the data for download, the user will see a button that allows her/him to trigger this deployment. The metadata online source section is then updated.
Configuration
If, after uploading data, you cannot see the geopublisher button, ask the catalogue administrator to check the configuration. This feature is disabled by default. It can be activated in the config-gui.xml configuration file.
If you cannot see your GeoServer node, ask the catalogue administrator to add the new node in the geoserver-nodes.xml configuration file.
Publish your data
Edit a metadata record.
Upload a file as explained in the linking data section.
In edit mode, an online source section with a file for download attached will provide the geopublisher panel:
Select a node to publish the dataset in (see Configuration for details on adding a node).
GeoNetwork checks if:
the file provided is correct (e.g. the ZIP contains one Shapefile or a TIFF);
the layer has already been published to that node. If yes, the layer is added to the map preview.
Publish button: Publishes the current dataset to the remote node. If the dataset is already published in that node, it will be updated.
Unpublish button: Removes the current dataset from the remote node.
Add online source button: Adds an online source section to the current metadata record pointing to the WMS and layer name in order to display the layer in the map viewer of the search interface.
Style button: Only available if the GeoServer styler has been installed and declared in the configuration.
Users are not asked for layer names. The layer name is computed from the file name. In case of ZIP compression, the ZIP file base name must be equal to the Shapefile or GeoTIFF base name (i.e. if the shapefile is rivers.shp, the ZIP file name must be rivers.zip).
One Datastore, FeatureType, Layer and Style are created for a vector dataset (one-to-one relation). One CoverageStore, Coverage and Layer are created for a raster dataset (one-to-one relation).
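The naming rule for zipped datasets can be checked before uploading. The sketch below is not the check GeoNetwork itself performs; it is one possible reading of the rule, with invented function names, that looks for a single .shp, .tif or .tiff entry in the archive and compares its base name with that of the ZIP file.

```python
import io
import os
import zipfile

DATA_SUFFIXES = (".shp", ".tif", ".tiff")


def layer_name_from_zip(zip_name: str, zip_bytes: bytes) -> str:
    """Return the layer name implied by the ZIP, or raise ValueError."""
    zip_base = os.path.splitext(os.path.basename(zip_name))[0]
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        datasets = [name for name in archive.namelist()
                    if name.lower().endswith(DATA_SUFFIXES)]
    if len(datasets) != 1:
        raise ValueError(f"expected exactly one dataset, found {len(datasets)}")
    data_base = os.path.splitext(os.path.basename(datasets[0]))[0]
    if data_base != zip_base:
        raise ValueError(f"{datasets[0]} does not match {zip_name}")
    return zip_base


def make_zip(names) -> bytes:
    """Build an in-memory ZIP with empty members, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    return buffer.getvalue()


archive_bytes = make_zip(["rivers.shp", "rivers.shx", "rivers.dbf"])
print(layer_name_from_zip("rivers.zip", archive_bytes))  # rivers
```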
Using this option, parent identier will be automatically set up when duplicating the record. Editors could also link an existing metadata record using the parent identier displayed in the advanced view, metadata section. Clicking on the Add or update parent metadata section on the metadata relation list will move to this view. Then editors should use the (+) to expand the parent identier and click on the eld to open the metadata selection panel. Once the parent selected, it will appear in the metadata relation list on the top right corner of the editor. If a metadata record has children attached, the editor suggest the children update mechanism which propagate changes from a parent to all its children. The following interface dene the conguration of the propagation: 46 Chapter 2. Quick Start Guide
Metadata on dataset / metadata on service relation

Linking a dataset to a service, or a service to a dataset, is done using the following panel:
Editors can define a layer name using the combo box (which tries to retrieve layers from the WMS GetCapabilities document) or by typing the layer name in the text field. This information is required to display the layer in the map viewer. The relation is stored in:
<srv:operatesOn uuidref="" xlink:href=""/>
Only relations between records in the same catalogue are handled. Using XLink attributes to create relations between datasets and services is not supported.

Feature catalogue relation

Feature catalogues are records stored in the ISO 19110 standard. The relation between the two records is created using the link feature catalogue menu.
Then, in the other actions menu, the compute bounding box menus are available:
The metadata is saved during the process and one extent is added for each keyword.
If you add keywords manually just before computing the bounding box, it is recommended to save your metadata record before launching the action so that the latest keywords are taken into account.
Below is a brief description of each privilege to help you identify which ones you should assign to which group(s).
Publish: Users in the specified group/s are able to view the metadata eg. if it matches search criteria entered by such a user.
Download: Users in the specified group/s are able to download the data.
Interactive Map: Users in the specified group/s are able to get an interactive map. The interactive map has to be created separately using a Web Map Server such as GeoServer, which is distributed with GeoNetwork.
Featured: When randomly selected by GeoNetwork, the metadata record can appear in the Featured section of the GeoNetwork home page.
Notify: Users in the specified group receive notification if data attached to the metadata record is downloaded.
2.5 Uploading a New Record using the XML Metadata Insert Tool
A more advanced way to upload a new metadata record to GeoNetwork is to use an XML document. This procedure is particularly useful for users who already have metadata in XML format, for instance created by a GIS application. Note that the metadata must be in one of the standards used by GeoNetwork: ISO19115, FGDC or Dublin Core. To start the upload through the XML Metadata Insert tool, log in and select the appropriate option from the Administration page.
Figure 2.49: Administration panel

The main part of the Import XML Formatted Metadata page is the Metadata text area, where the user can paste the XML metadata to import. Below this is the Type choice, which lets you select the type of record that you are going to create (Metadata, Template or Subtemplate). You can then apply a stylesheet to convert your metadata input from ArcCatalog8 to ISO19115 or from ISO19115 to ISO19139, if required. Otherwise leave none selected. The Destination schema list provides four options for the final standard layout of your metadata (ISO19115, ISO19139, FGDC and Dublin Core). Finally, select the Group that will be the main group in charge of the metadata and the Category that you want to assign to your metadata. Clicking the Insert button imports the metadata into the system. Please note that all links to external files, for instance to thumbnails or data for download, have to be removed from the metadata input to avoid any conflict within the data repository.

If your metadata is already in ISO19115 format, the main actions to be performed are the following:
1. Paste the XML file that contains the metadata information in the Metadata text area;
2. Select Metadata as the type of record that you are going to create;
3. Select the metadata schema ISO19139 that will be the final destination schema;
4. Select the validate check box if you want your metadata to be validated according to the related schema;
5. Select the group in charge of the metadata from the drop down list;
6. Select Maps and Graphics from the list of categories;
7. Click the Insert button and the metadata will be imported into the system.
The standards provide a documented, common set of terms and definitions that are presented in a structured format.
Figure 2.52: New home page of GeoNetwork opensource using JavaScript Widgets - tab layout
2.7.2 Administration
Search Statistics: Captures and displays statistics on searches carried out in GeoNetwork. The statistics can be summarized in tables or in charts using JFreeChart. There is an extensible interface that you can use to display your own statistics. See Search Statistics.
Figure 2.53: New home page of GeoNetwork opensource using JavaScript Widgets - sidebar layout
New Harvesters: OGC Harvesting: Sensor Observation Service, Z39.50 harvesting, Web Accessible Folder (WAF), GeoPortal 9.3.x via REST API. See Harvesting.
Harvest History and Scheduling: Harvesting events are now recorded in the database for review at any time. See Harvest History. Harvester scheduling is now much more flexible: you can start a harvest at any time of the day and at almost any interval (weekly etc).
Extended Metadata Exchange Format (MEF): More than one metadata file can be present in a MEF Zip archive. This is MEF version 2. See Export facilities.
System Monitoring: Automatically monitors the health of a GeoNetwork web application. See System Monitoring.
2.7.3 Metadata
Metadata Status: Allows finer control of the metadata workflow. Records can be assigned a status that reflects where they are in the metadata workflow: draft, approved, retired, submitted, rejected. When the status changes, the relevant user is informed via email, eg. when an editor changes the status to submitted, the content reviewer receives an email requesting review. See Status.
Metadata Versioning: Captures changes to metadata records and metadata properties (status, privileges, categories) and records them as versions in a subversion repository. See Versioning.
Publishing data to GeoServer from GeoNetwork: You can now publish geospatial information in the form of a GeoTIFF, shapefile or spatial table in a database to GeoServer from GeoNetwork. See Publish uploaded data as WMS, WFS in GeoServer.
Custom Metadata Formatters: You can now create your own XSLT to format metadata to suit
your needs, zip it up and plug it in to GeoNetwork. See Formatter.
Assembling Metadata Records from Reusable Components: Metadata records can now be assembled from reusable components (eg. contact information). The components can be present in the local catalog or brought in from a remote catalog (with caching to speed up access). A component directory interface is available for editing and viewing the components. See Fragments.
Editor Improvements: Picking terms from a thesaurus using a search widget, selecting reusable metadata components for inclusion in the record, user defined suggestions or picklists to control content, context sensitive help, creating relationships between records.
Plug in metadata schemas: You can define your own metadata schema and plug it into GeoNetwork on demand. Documentation to help you do this and example plug in schemas can be found in the Developers Manual. Some of the most common community plug in schemas can be downloaded from the GeoNetwork source code repository. See Schemas.
Multilingual Indexing: If you have to cope with metadata in different languages, GeoNetwork can now index each language and search across all language indexes by translating your search terms. See Multilingual search.
Enhanced Thesaurus support: Thesauri can be loaded from ISO19135 register records and SKOS files. Keywords in ISO records are anchored to the definition of the concept in the thesaurus. See Thesaurus.
2.7.6 Other
Improved Database Connection Handling and Pooling: Replacement of the Jeeves based database connection pool with the widely used and more robust Apache Database Connection Pool (DBCP). Addition of JNDI or container based database connection support. See Database configuration.
Configuration Overrides: You can now add your own configuration options to GeoNetwork, keep them in one file and maintain them independently from GeoNetwork. See Configuration override.
Many other improvements: charset detection and conversion on import, batch application of an XSLT to a selected set of metadata records (see Processing), remote notification of metadata changes, automatic integration tests to improve development and reduce regression and, of course, many bug fixes.
Use the platform independent installer (.jar) for all platforms except Windows. Windows has a .exe file installer.
On Windows

If you use Windows, the following steps will guide you to complete the installation (other FOSS will follow):
1. Double click on geonetwork-install-2.8.0.exe to start the GeoNetwork opensource desktop installer.
2. Follow the instructions on screen. You can choose to install the embedded map server (based on GeoServer), GAST and the European Union INSPIRE Directive configuration pack. Developers may be interested in installing the source code and installer building tools. Full source code can be found in the GeoNetwork github code repository at http://github.com/geonetwork.
3. After completion of the installation process, a GeoNetwork desktop menu will be added to your Windows Start menu under Programs.
4. Click Start>Programs>GeoNetwork desktop>Start server to start the GeoNetwork opensource Web server. The first time you do this, the system will require about 1 minute to complete startup.
5. Click Start>Programs>GeoNetwork desktop>Open GeoNetwork opensource to start using GeoNetwork opensource, or connect your Web browser to http://localhost:8080/geonetwork/

Figure 2.54: Installer

The installer can install these additional packages:
1. GeoNetwork User Interface: Experimental UI for GeoNetwork using JavaScript components based on the ExtJs library.
2. GeoServer: Web Map Server that provides default base layers for the GeoNetwork map viewer.
3. European Union INSPIRE Directive configuration pack: Enables INSPIRE support in GeoNetwork: INSPIRE validation rules, Thesaurus files (GEMET, INSPIRE themes), INSPIRE search panel, INSPIRE metadata view.
4. GAST: Installs GeoNetwork's Administrator Survival Tool. See gast.

Installation using the platform independent installer

If you downloaded the platform independent installer (a .jar file), you can in most cases start the installer by simply double clicking on it. Follow the instructions on screen (see also the section called On Windows). At the end of the installation process you can choose to save the installation script.
Commandline installation

If you downloaded the platform independent installer (a .jar file), you can perform commandline installations on computers without a graphical interface. You first need to generate an install script (see Figure Save the installation script for commandline installations). This install script can be edited in a text editor to change some installation parameters. To run the installation from the commandline, issue the following command in a terminal window and hit enter to start:

java -jar geonetwork-install-2.8.0.jar install.xml
[ Starting automated installation ]
Read pack list from xml definition.
Try to add to selection [Name: Core and Index: 0]
Try to add to selection [Name: GeoServer and Index: 1]
Try to add to selection [Name: European Union INSPIRE Directive configuration pack and I
Try to add to selection [Name: GAST and Index: 3]
Modify pack selection.
Pack [Name: European Union INSPIRE Directive configuration pack and Index: 2] added to s
Pack [Name: GAST and Index: 3] added to selection.
[ Starting to unpack ]
[ Processing package: Core (1/4) ]
[ Processing package: GeoServer (2/4) ]
[ Processing package: European Union INSPIRE Directive configuration pack (3/4) ]
[ Processing package: GAST (4/4) ]
[ Unpacking finished ]
[ Creating shortcuts ....... done. ]
[ Add shortcuts to uninstaller done. ]
[ Writing the uninstaller data ... ]
[ Automated installation done ]

You can also run the installation with detailed debug output. To do so, run the installer with the flag -DTRACE=true:

java -DTRACE=true -jar geonetwork-install-2.8.0.jar
stateId="" createParameter=""/>
Configuring the Javascript Widgets user interface

Widgets can be used to build custom interfaces. GeoNetwork provides a Javascript Widgets interface for searching, viewing and editing metadata records. This interface can be configured using the following attributes:
parameter is used to define custom application properties, for example the default map extent or the default language to be loaded.
createParameter is appended to the URL when the application is called from the administration > New metadata menu (usually #create).
stateId is the identifier of the search form (usually s) in the application. It is used to build the quick links section in the administration and permalinks.

Sample configuration:

<!-- Widget client application with a tab based layout -->
<client type="redirect" widget="true" url="../../apps/tabsearch/" createParameter="#create" stateId="s"/>

Configuring the user interface with configuration overrides

Instead of changing the config-gui.xml file, the catalog administrator can use the configuration overrides mechanism to create a custom configuration (see Configuration override). By default, no overrides are set and the Default user interface is loaded. To load the Widgets based user interface, add the following line to WEB-INF/config-overrides.xml:
<override>/WEB-INF/config-overrides-widgettab.xml</override>
The file INSTALL_DIR/web/geonetwork/WEB-INF/classes/META-INF/javax.xml.transform.Tra defines the XSLT processor to use in GeoNetwork. The allowed values are:
1. de.fzi.dbs.xml.transform.CachingTransformerFactory: This is the Saxon XSLT processor with caching (recommended value for production use). However, when caching is on, any updates you make to stylesheets may be ignored in favour of the cached stylesheets.
2. net.sf.saxon.TransformerFactoryImpl: This is the Saxon XSLT processor without caching. If you plan to make changes to any XSLT stylesheets you should use this setting until you are ready to move to production.
GeoNetwork sets the XSLT processor configuration using Java system properties for an instant in order to obtain its TransformerFactory implementation, then resets it to the original value. This minimizes the effect on the XSLT processor configuration of other applications that may be running in the same container.
that GeoNetwork supports. Only one of these resource elements can be enabled. The following is an example for the default H2 database used by GeoNetwork:
<resource enabled="true">
  <name>main-db</name>
  <provider>jeeves.resources.dbms.ApacheDBCPool</provider>
  <config>
    <user>admin</user>
    <password>gnos</password>
    <driver>org.h2.Driver</driver>
    <url>jdbc:h2:geonetwork;MVCC=TRUE</url>
    <poolSize>33</poolSize>
    <validationQuery>SELECT 1</validationQuery>
  </config>
</resource>

If you want to use a different database, set the enabled attribute on your choice to true and set the enabled attribute on the H2 database to false. NOTE: If two resources are enabled, GeoNetwork will not start. As a minimum, the <user>, <password> and <url> for your database need to be changed. Here is an example for the DB2 database:

<resource enabled="true">
  <name>main-db</name>
  <provider>jeeves.resources.dbms.ApacheDBCPool</provider>
  <config>
    <user>db2inst1</user>
    <password>mypassword</password>
    <driver>com.ibm.db2.jcc.DB2Driver</driver>
    <url>jdbc:db2:geonet</url>
    <poolSize>10</poolSize>
    <validationQuery>SELECT 1 FROM SYSIBM.SYSDUMMY1</validationQuery>
  </config>
</resource>
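Because GeoNetwork will not start when two resources are enabled, it can be worth checking the configuration after editing it. The following Python sketch is one way to do that, under stated assumptions: it expects the resource elements to sit under a single root element (called resources in the usage example, which is a hypothetical name), and the function names are illustrative only.

```python
import xml.etree.ElementTree as ET


def enabled_resources(config_xml):
    """Return the <name> of every <resource> whose enabled attribute is "true"."""
    root = ET.fromstring(config_xml)
    return [
        resource.findtext("name", default="")
        for resource in root.iter("resource")
        if resource.get("enabled", "").strip().lower() == "true"
    ]


def check_single_enabled(config_xml):
    """Raise ValueError unless exactly one resource is enabled."""
    names = enabled_resources(config_xml)
    if len(names) != 1:
        raise ValueError(
            "expected exactly one enabled resource, found %d" % len(names)
        )
    return names[0]
```

For example, passing a document such as <resources><resource enabled="true"><name>main-db</name></resource><resource enabled="false"><name>main-db</name></resource></resources> to check_single_enabled returns main-db, while a document with two enabled resources raises ValueError.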
CHAPTER 3
Administration
Name: The name of the GeoNetwork node. Information that helps identify the catalogue to a human user.
Organization: The organization the node belongs to. Again, this is information that helps identify the catalogue to a human user.

Server parameters

Here you have to enter the details of the web address of your GeoNetwork node. This address is important because it will be used to build addresses that access services and data on the GeoNetwork node. In particular:
1. building links to data files uploaded with a metadata record in the editor.
2. when the OGC CSW server is asked to describe its capabilities. The GetCapabilities operation returns an XML document with HTTP links to the CSW services provided by the server. These links are dynamically built using the host and port values.
Protocol: The HTTP protocol used to access the server. Choosing http means that all communication with GeoNetwork will be visible to anyone listening to the protocol. Since this includes usernames and passwords, this is not secure. Choosing https means that all communication with GeoNetwork will be encrypted and thus much harder for a listener to decode.
Host: The node's address or IP number. If your node is publicly accessible from the Internet, you have to use the domain name. If your node is hidden inside your private network and you have a firewall or web server that redirects incoming requests to the node, you have to enter the public address of the firewall or web server. A typical configuration is to have an Apache web server on address A that is publicly accessible and redirects the requests to a Tomcat server on a private address B. In this case you have to enter A in the host parameter.
Port: The node's port (usually 80 or 8080). If the node is hidden, you have to enter the port on the public firewall or web server.
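To illustrate how the protocol, host and port values combine into the links described above, here is a small Python sketch. It is not GeoNetwork's implementation: the function name build_node_url, the default /geonetwork path and the choice to omit default ports (80 for http, 443 for https) are assumptions made for this example.

```python
def build_node_url(protocol, host, port, path="/geonetwork"):
    """Assemble a public base URL from the Server parameters.

    Hypothetical helper: default ports are left out of the URL, which is a
    common convention but not necessarily what GeoNetwork does.
    """
    default_ports = {"http": 80, "https": 443}
    if protocol not in default_ports:
        raise ValueError("protocol must be 'http' or 'https'")
    port = int(port)
    # Only include the port when it differs from the protocol default.
    authority = host if port == default_ports[protocol] else "%s:%d" % (host, port)
    return "%s://%s%s" % (protocol, authority, path)
```

With a public Apache server on address A redirecting to Tomcat on private address B, the values passed in would be those of A, since A is what remote clients must be able to reach.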
Intranet Parameters

A common need for an organisation is to automatically discriminate between anonymous internal users that access the node from within an organisation (Intranet) and anonymous external users from the Internet. GeoNetwork defines anonymous users from inside the organisation as belonging to the group Intranet, while anonymous users from outside the organisation are defined by the group All. To automatically distinguish users that belong to the Intranet group you need to tell GeoNetwork the intranet IP address and netmask.
Network: The intranet address in IP form (eg. 147.109.100.0).
Netmask: The intranet netmask (eg. 255.255.255.0).

Metadata Search Results

Configuration settings in this group determine what the limits are on user interaction with the search results.
Maximum Selected Records: The maximum number of search results that a user can select and process with the batch operations eg. Set Privileges, Categories etc.
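The Network and Netmask values of the Intranet Parameters describe an address range, and a client is treated as an Intranet user when its IP address falls inside that range. The following Python sketch shows the membership test using the standard ipaddress module. The function name is_intranet is hypothetical, and this illustrates the principle only, not GeoNetwork's code.

```python
import ipaddress


def is_intranet(client_ip, network, netmask):
    """Return True if client_ip lies inside the network/netmask range."""
    # strict=False tolerates a network value that has host bits set.
    intranet = ipaddress.ip_network("%s/%s" % (network, netmask), strict=False)
    return ipaddress.ip_address(client_ip) in intranet
```

Using the example values from the text, a client at 147.109.100.25 is inside 147.109.100.0 with netmask 255.255.255.0, while a client at 147.109.101.25 is not.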
Multi-Threaded Indexing

Configuration settings in this group determine how many processor threads are allocated to indexing tasks in GeoNetwork. If your machine has many processor cores, you can now determine how many to allocate to GeoNetwork indexing tasks. This can bring dramatic speed improvements on large indexing tasks (eg. changing the privileges on 20,000 records) because GeoNetwork can split the indexing task into a number of pieces and assign them to different processor cores.
Number of processing threads: The maximum number of processing threads that can be allocated to an indexing task.
Note: this option is only available for databases that have been tested. Those databases are PostGIS and Oracle. You should also carefully consider how many connections to the database you allocate in the database configuration, as each thread could tie up one database connection for the duration of a long indexing session (for example). See Advanced configuration for more details of how to configure the number of connections in the database connection pool.

Lucene Index Optimizer

Configuration settings in this group determine when the Lucene Index Optimizer is run. By default, this takes place at midnight each day. With recent upgrades to Lucene, particularly Lucene 3.6.1, the optimizer is becoming less useful, so this configuration group will very likely be removed in future versions.

Z39.50 configuration

GeoNetwork can act as a Z39.50 server. Z39.50 is the name of an older communication protocol used for distributed searching across metadata catalogs.
Enable: Check this option to enable the Z39.50 server, uncheck it to disable the Z39.50 server.
Port: This is the port on which GeoNetwork will be listening for incoming Z39.50 requests. Z39.50 servers can run on any port, but 210 (not recommended), 2100 and 6668 are common choices. If you have multiple GeoNetwork nodes running on the same machine then you need to make sure each one has a different port number.
GeoNetwork must be restarted to put any changes to these values into use.

OAI Provider

Options in this group control the way in which the OAI Server in GeoNetwork responds to OAI-PMH harvest requests from remote sites.
Datesearch: OAI Harvesters may request records from GeoNetwork in a date range. GeoNetwork can use one of two date fields from the metadata to check for a match with this date range. The default choice is Temporal extent, which is the temporal extent from the metadata record. The other option, Modification date, uses the modification date of the metadata record in the GeoNetwork database. The modification date is the last time the metadata record was updated in or harvested by GeoNetwork.
Resumption Token Timeout: Metadata records that match an OAI harvest search request are usually returned to the harvester in groups with a fixed size (eg. in groups of 10 records). With each group a resumption token is included so that the harvester can request the next group of records. The resumption token timeout is the time (in seconds) that GeoNetwork OAI server will wait for a resumption token to
be used. If the timeout is exceeded GeoNetwork OAI server will drop the search results and refuse to recognize the resumption token. The aim of this feature is to ensure that resources in the GeoNetwork OAI server are released.
Cache size: The maximum number of concurrent OAI harvests that the GeoNetwork OAI server can support.
GeoNetwork must be restarted to put any changes to the resumption token timeout and the Datesearch options into use.

XLink resolver

The XLink resolver replaces the content of elements with an attribute @xlink:href (except for the srv:operatesOn element) with the content obtained from the URL in @xlink:href. The XLink resolver should be enabled if you want to harvest metadata fragments or reuse fragments of metadata in your metadata records.
Enable: Enables/disables the XLink resolver. Note: to improve performance GeoNetwork will cache content that is not in the local catalog.

Search Statistics

Enables/disables search statistics capture. Search statistics are stored in the database and can be queried using the Search Statistics interface on the Administration page. There is very little compute overhead involved in storing search statistics as they are written to the database in a background thread. However, database storage for a very busy site must be carefully planned.

Multilingual Settings

Options in this group determine how GeoNetwork will search metadata in multiple languages.
Enable auto-detecting search request language: If this option is selected, GeoNetwork will analyse the search query and attempt to detect the language that is used before defaulting to the GUI language.
Search results in requested language sorted on top: If this option is selected, a sort clause will be added to each query to ensure that results in the current language are always sorted on top. This is different from increasing priority of the language in that it overrides the relevance of the result.
For example, if a German result has very high relevance but the search language is French, then the French results will all come before the German result.
Search only in requested language: The options in this section determine how documents are sorted/prioritised relative to the language in the document compared to the search language.
All documents in all languages (No preferences) - The search language is ignored and will have no effect on the ordering of the results.
Prefer documents with translations requested language - Documents with a translation in the search language (anywhere in the document) will be prioritized over documents without any elements in the search language.
Prefer documents whose language is the requested language - Documents that are in the same language as the search language (ie. the documents that are specified as being in the same language as the search language) are prioritized over documents that are not.
Translations in requested language - The search results will only contain documents that have some translations in the search language.
Document language is the requested language - The search results will contain documents whose metadata language is specified as being in the search language.

Data-For-Download Service

The GeoNetwork editor supports uploading one or more files that can be stored with the metadata record. When such a record is displayed in the search results, a Download button is provided which allows the user to select which file they want to download. This option group determines how that download will occur.
Use GeoNetwork simple file download service: Clicking on any file stored with the metadata record will deliver that file directly to the user via the browser.
Use GeoNetwork disclaimer and constraints service: Clicking on any file stored with the metadata record will deliver a zip archive to the user (via the browser) that contains the data file, the metadata record itself and a summary of the resource constraint metadata as an html document. In addition, the user will need to provide some details (name, organisation, email and optional comment) and view the resource constraints before they can download the zip archive.

Clickable hyperlinks

Enables/disables hyperlinks in metadata content. If a URL is present in the metadata content, GeoNetwork will detect this and make it into a clickable hyperlink when it displays the metadata content.

Local rating

Enables/disables local rating of metadata records.

Automatic fixes

For each metadata schema, GeoNetwork has an XSLT that it can apply to a metadata record belonging to that schema. This XSLT is called update-fixed-info.xsl and its aim is to allow fixed schema, site and GeoNetwork information to be applied to a metadata record every time the metadata record is saved in the editor.
As an example, GeoNetwork uses this XSLT to build and store the URL of any files uploaded and stored with the metadata record in the editor.
Enable: Enabled by default. It is recommended you do not use the GeoNetwork default or advanced editor when auto-fixing is disabled. See http://trac.osgeo.org/geonetwork/ticket/368 for more details.

INSPIRE

Enables/disables the INSPIRE search options in the advanced search panel.
Metadata Views

Options in this section enable/disable metadata element groups in the metadata editor/viewer.
Enable simple view: The simple view in the metadata editor/viewer:
- removes much of the hierarchy from nested metadata records (such as ISO19115/19139)
- will not let the user add metadata elements that are not already in the metadata record
It is intended to provide a flat, simple view of the metadata record. A disadvantage of the simple view is that some of the context information supplied by the nesting in the metadata record is lost.
Enable ISO view: The ISO19115/19139 metadata standard defines three groups of elements:
- Minimum: those elements that are mandatory
- Core: the elements that should be present in any metadata record describing a geographic dataset
- All: all the elements
Enable INSPIRE view: Enables the metadata element groups defined in the EU INSPIRE directive.
Enable XML view: This is a raw text edit view of the XML record. You can disable this if (for example) you don't want inexperienced users to be confused by the XML presentation provided by this view.

Metadata Privileges

Only set privileges to users groups: If enabled then only the groups that the user belongs to will be displayed in the metadata privileges page (unless the user is an Administrator). At the moment this option cannot be disabled and is likely to be deprecated in the next version of GeoNetwork.

Harvesting

Allow editing on harvested records: Enables/disables editing of harvested records in the catalogue. By default, harvested records cannot be edited.

Proxy

For some functions (eg. harvesting) GeoNetwork must be able to connect to remote sites. This may not be possible if an organisation uses proxy servers. If your organisation uses a proxy server then GeoNetwork must be configured to use the proxy server in order to correctly route outgoing requests to remote sites.
Use: Checking this box will display the proxy configuration options panel.
Figure 3.5: The proxy configuration options
Host: The proxy server name or address to use (usually an IP address).
Port: The proxy server port to use.
Username (optional): A username should be provided if the proxy server requires authentication.
Password (optional): A password should be provided if the proxy server requires authentication.

Feedback

GeoNetwork needs to send email if:
- you are using the User Self-registration system or the Metadata Status workflow
- a file uploaded with a metadata record is downloaded
- a user provides feedback using the online form.
You have to configure the mail server GeoNetwork should use in order to enable it to send these emails.
Email: This is the email address that will be used to send the email (the From address).
SMTP host: The mail server address to use when sending email.
SMTP port: The mail server SMTP port (usually 25).

Removed metadata

Defines the directory used to store a backup of metadata and data after a delete action. This directory is used as a backup directory to allow system administrators to recover metadata and possibly related data after erroneous deletion. By default the removed directory is created in the GeoNetwork data folder.

Authentication

In this section you define the source against which GeoNetwork will authenticate users and passwords.
Figure 3.6: Authentication configuration options

By default, users are authenticated against info held in the GeoNetwork database. When the GeoNetwork database is used as the authentication source, the user self-registration function can be enabled. A later section discusses user self-registration and the configuration options it requires. You may choose to authenticate logins against either the GeoNetwork database tables or LDAP (the lightweight directory access protocol) but not both. The next section describes how to authenticate against LDAP. In addition to either of these options, you may also configure other authentication sources. At present, Shibboleth is the only additional authentication source that can be configured. Shibboleth is typically
used for national access federations such as the Australian Access Federation. Configuring Shibboleth authentication in GeoNetwork to use such a federation would allow not only users from a local database or LDAP directory to use your installation, but any user from such a federation.
LDAP Authentication
This section defines how to connect to an LDAP authentication system.
Figure 3.7: The LDAP configuration options

Typically all users must have their details in the LDAP directory to log in to GeoNetwork. However, if a user is added to the GeoNetwork database with the Administrator profile then they will be able to log in without their details being present in the LDAP directory.
Shibboleth Authentication
When using either the GeoNetwork database or LDAP for authentication, you can also configure Shibboleth to allow authentication against access federations. Shibboleth authentication requires interaction with the Apache web server. In particular, the Apache web server must be configured to require Shibboleth authentication to access the path entered in the configuration. The Apache web server configuration will contain the details of the Shibboleth server that works out where a user is located (sometimes called a where are you from server). The remainder of the Shibboleth login configuration describes how Shibboleth authentication attributes are mapped to GeoNetwork user database fields, because once a user is authenticated against Shibboleth their details are copied to the local GeoNetwork database.
80
Chapter 3. Administration
Configure the feedback email address, SMTP host and SMTP port. The feedback email address will be sent an email when a new user registers and requests a profile other than Registered User. An example of how to configure these fields in the system configuration form is:
Check the box Enable user self-registration in the Authentication section of the system configuration form as follows:
After you save the system configuration form, return to the home page and log out as admin, your banner menu should include two new options, Forgot your password? and Register (or their translations into your selected language), as follows:
You should also configure the XML file that includes contact details to be displayed when an error occurs in the registration process. This file is localized - the English version is located in INSTALL_DIR/web/geonetwork/loc/en/xml/registration-sent.xml. Finally, if you want to change the content of the email that contains registration details for new users, you should modify INSTALL_DIR/web/geonetwork/xsl/registration-pwd-email.xsl.
Abstract: The abstract of the CSW service. The abstract can contain a brief description of what the service provides and who runs it.
Fees: If there are any fees for usage of the service then they should be detailed here.
Access constraints: If there are any constraints on access to the service then they should be detailed here.
The last function on this page is the CSW ISO Profile test. Clicking on this link brings up a JavaScript-based interface that allows you to submit requests to the CSW server. The requests used by this interface are XML files in INSTALL_DIR/web/geonetwork/xml/csw/test.
<service name="csw-with-my-filter-environment">
    <class name=".services.main.CswDispatcher">
        <param name="filter" value="+inspirerelated:on +themekey:environment"/>
    </class>
</service>
<service name="csw-with-my-filter-climate">
    <class name=".services.main.CswDispatcher">
        <param name="filter" value="+inspirerelated:on +themekey:climate"/>
    </class>
</service>
The filter parameter value should use the Lucene query parser syntax (see http://lucene.apache.org/java/2_9_1/queryparsersyntax.html) and is used in these CSW operations:
GetRecords: the filter is applied with the CSW query as an extra query criterion.
GetRecordById: the filter is applied with the requested metadata id as an extra query criterion.
GetDomain: the filter is applied as a query criterion to retrieve the requested metadata properties.
GetCapabilities: the filter is applied as a query criterion to fill the metadata keywords list in the GetCapabilities document.
The list of available Lucene index fields to use in the filter parameter can be obtained from the index-fields.xsl files in the schema folders located in WEB-INF/xml/schemas.
As the Harvest and Transaction operations are not affected by the filter parameter, it is better to use this feature only for read-only CSW endpoints, to avoid confusion.
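The two service definitions shown earlier differ only in their name and their Lucene filter. As a minimal sketch, assuming you want one filtered endpoint per theme keyword, such definitions could be generated with Python's standard library. The helper names csw_service and services_for_themes are hypothetical and are not part of GeoNetwork; the element and attribute names are copied from the example above.

```python
import xml.etree.ElementTree as ET


def csw_service(name, lucene_filter):
    """Build one <service> element shaped like the examples in this section."""
    service = ET.Element("service", {"name": name})
    dispatcher = ET.SubElement(
        service, "class", {"name": ".services.main.CswDispatcher"}
    )
    # The filter value uses Lucene query parser syntax.
    ET.SubElement(dispatcher, "param", {"name": "filter", "value": lucene_filter})
    return service


def services_for_themes(themes):
    """Hypothetical helper: one filtered CSW entry point per theme keyword."""
    return [
        csw_service(f"csw-with-my-filter-{theme}",
                    f"+inspirerelated:on +themekey:{theme}")
        for theme in themes
    ]


for service in services_for_themes(["environment", "climate"]):
    print(ET.tostring(service, encoding="unicode"))
```

The printed elements would still have to be copied into WEB-INF/config-csw-servers.xml by hand, and the application restarted, as described in the configuration steps of this section.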
Configuration
Adding a new CSW entry point to GeoNetwork opensource requires these steps (suppose the new CSW entry point is called csw-with-my-filter-environment):
Create the service definition in the configuration file WEB-INF/config-csw-servers.xml with the custom filter criteria as described before:
<service name="csw-with-my-filter-environment">
    <class name=".services.main.CswDispatcher">
        <param name="filter" value="+inspirerelated:on +themekey:environment"/>
    </class>
</service>
Restart the application. The new CSW entry point is accessible in http://localhost:8080/srv/en/csw-with-my-filter-environment
Configuration using GeoNetwork overrides
This section describes how to use the GeoNetwork overrides feature to configure a new CSW entry point. This feature allows different configurations to be used to handle multiple deployment platforms. See additional documentation of this feature in Configuration override.
Add the following override to a configuration override file, for example WEB-INF/config-overrides-csw.xml:
<overrides xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <!-- Add custom CSW service -->
    <file name=".*/WEB-INF/config-csw-servers.xml">
        <addXML xpath="services">
            <service name="csw-with-my-filter-environment">
                <class name=".services.main.CswDispatcher">
                    <param name="filter" value="+inspirerelated:on +themekey:environment"/>
                </class>
            </service>
        </addXML>
    </file>
    <file name=".*/WEB-INF/user-profiles.xml">
        <addXML xpath="profile[@name='Guest']">
            <allow service="csw-with-my-filter-environment"/>
        </addXML>
    </file>
</overrides>
For more information about configuration overrides see Configuration override.
Restart the application. The new CSW entry point is accessible in http://localhost:8080/srv/en/csw-with-my-filter-environment
The parameters that can be specified to control the Apache Database Connection Pool are described at http://commons.apache.org/dbcp/configuration.html. You can configure a subset of these parameters in your resource element. The parameters that can be specified are:
Parameter                       Description                                                              Default
maxActive                       pool size/maximum number of active connections                           10
maxIdle                         maximum number of idle connections                                       maxActive
minIdle                         minimum number of idle connections                                       0
maxWait                         number of milliseconds to wait for a connection to become available      200
validationQuery                 SQL statement for verifying a connection, must return at least one row   no default
timeBetweenEvictionRunsMillis   time between eviction runs (-1 means next three params are ignored)      -1
testWhileIdle                   test connections when idle                                               false
minEvictableIdleTimeMillis      idle time before connection can be evicted                               30 x 60 x 1000 msecs
numTestsPerEvictionRun          number of connections tested per eviction run                            3
maxOpenPreparedStatements       number of SQL statements that can be cached for reuse                    -1
                                (-1 none, 0 unlimited)
defaultTransactionIsolation     see http://en.wikipedia.org/wiki/Isolation_%28database_systems%29        READ_COMMITTED
For performance reasons you should set the following parameter after GeoNetwork has created and filled the database tables it has been configured to use:
maxOpenPreparedStatements=300 (at least)
The following parameters are set by GeoNetwork and cannot be configured by the user:
removeAbandoned - true
removeAbandonedTimeout - 60 x 60 seconds = 1 hour
logAbandoned - true
testOnBorrow - true
defaultReadOnly - false
defaultAutoCommit - false
initialSize - maxActive
poolPreparedStatements - true if maxOpenPreparedStatements >= 0, otherwise false
Note: Some firewalls kill idle connections to databases after, say, 1 hour (= 3600 secs). To keep idle connections alive by testing them with the validationQuery, set minEvictableIdleTimeMillis to something less than the timeout interval (eg. 2 mins = 120 secs = 120000 millisecs), set testWhileIdle to true, and set timeBetweenEvictionRunsMillis and numTestsPerEvictionRun high enough to visit connections frequently (eg. 15 mins = 900 secs = 900000 millisecs and 4 connections per test). For example:
<testWhileIdle>true</testWhileIdle>
<minEvictableIdleTimeMillis>120000</minEvictableIdleTimeMillis>
<timeBetweenEvictionRunsMillis>900000</timeBetweenEvictionRunsMillis>
<numTestsPerEvictionRun>4</numTestsPerEvictionRun>
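Whether these values visit connections "frequently enough" can be checked with a little arithmetic. The sketch below assumes that each eviction run examines numTestsPerEvictionRun idle connections in turn, so that a full sweep over all idle connections takes several runs; that round-robin behaviour is an assumption about the pool, and the function name is hypothetical. It uses the values from the example above, the default maxIdle of 10 and the 1 hour firewall timeout from the note.

```python
import math


def worst_case_sweep_ms(idle_connections, tests_per_run, run_interval_ms):
    """Upper bound on the time between two visits to the same idle connection.

    Assumes each eviction run examines `tests_per_run` idle connections and
    the next run continues where the previous one stopped.
    """
    runs_per_sweep = math.ceil(idle_connections / tests_per_run)
    return runs_per_sweep * run_interval_ms


firewall_timeout_ms = 60 * 60 * 1000  # firewall drops connections idle for 1 hour
sweep_ms = worst_case_sweep_ms(
    idle_connections=10,      # default maxIdle (= maxActive = 10)
    tests_per_run=4,          # numTestsPerEvictionRun
    run_interval_ms=900_000,  # timeBetweenEvictionRunsMillis (15 mins)
)
print(f"full sweep takes {sweep_ms / 60_000:.0f} mins")
print("safe for this firewall:", sweep_ms < firewall_timeout_ms)
```

If the pool is enlarged (for example maxIdle raised to 50) without changing the other two values, the same calculation gives a sweep longer than the firewall timeout, which suggests raising numTestsPerEvictionRun or shortening the interval together with the pool size.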
Note: When GeoNetwork manages the database connection pool, PostGIS is the only database that can hold the spatial index in the database. All other database choices hold the spatial index as a shapefile. If using PostGIS, two pools of database connections are created. The first is managed and configured using parameters in this section; the second is created by GeoTools and cannot be configured. This approach is now deprecated: if you want to use the database to hold the spatial index you should use the JNDI configuration described in the next section because it uses a single, configurable database pool through GeoTools as well as the more modern NG (Next Generation) GeoTools datastore factories.
For more on transaction isolation see http://en.wikipedia.org/wiki/Isolation_%28database_systems%29.
Database connection pool managed by the container
A typical configuration in the resources element of INSTALL_DIR/web/geonetwork/WEB-INF/config.xml uses the jeeves.resources.dbms.JNDIPool class and looks something like:
<resource enabled="true">
    <name>main-db</name>
    <provider>jeeves.resources.dbms.JNDIPool</provider>
    <config>
        <context>java:/comp/env</context>
        <resourceName>jdbc/geonetwork</resourceName>
        <url>jdbc:oracle:thin:@localhost:1521:XE</url>
The configuration parameters and their meanings are as follows:
Config Parameter    Description
context             The name of the context from which to obtain the resource - almost always this is java:/comp/env
resourceName        The name of the resource in the context to use
url                 The URL of the database - this is needed to let GeoTools know the database type
provideDataStore    If set to true then the database will be used for the spatial index, otherwise a shapefile will be used
The remainder of the configuration is done in the container context. For example, for Tomcat this configuration is in conf/context.xml in the resource called jdbc/geonetwork. Here is an example for the Oracle database:
<Resource name="jdbc/geonetwork"
    auth="Container"
    type="javax.sql.DataSource"
    username="system"
    password="oracle"
    factory="org.apache.commons.dbcp.BasicDataSourceFactory"
    driverClassName="oracle.jdbc.OracleDriver"
    url="jdbc:oracle:thin:@localhost:1521:XE"
    maxActive="10"
    maxIdle="10"
    removeAbandoned="true"
    removeAbandonedTimeout="3600"
    logAbandoned="true"
    testOnBorrow="true"
    defaultAutoCommit="false"
    validationQuery="SELECT 1 FROM DUAL"
    accessToUnderlyingConnectionAllowed="true"
/>
For Jetty, this configuration is in INSTALL_DIR/web/geonetwork/WEB-INF/jetty-env.xml. Here is an example for the PostGIS database:
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
    <New id="gnresources" class="org.eclipse.jetty.plus.jndi.Resource">
        <Arg></Arg>
        <Arg>jdbc/geonetwork</Arg>
        <Arg>
            <New class="org.apache.commons.dbcp.BasicDataSource">
                <Set name="driverClassName">org.postgis.DriverWrapper</Set>
                <Set name="url">jdbc:postgresql_postGIS://localhost:5432/gndb</Set>
                <Set name="username">geonetwork</Set>
                <Set name="password">geonetworkgn</Set>
                <Set name="validationQuery">SELECT 1</Set>
                <Set name="maxActive">10</Set>
                <Set name="maxIdle">10</Set>
                <Set name="removeAbandoned">true</Set>
                <Set name="removeAbandonedTimeout">3600</Set>
                <Set name="logAbandoned">true</Set>
                <Set name="testOnBorrow">true</Set>
                <Set name="defaultAutoCommit">false</Set>
                <!-- 2=READ_COMMITTED, 8=SERIALIZABLE -->
                <Set name="defaultTransactionIsolation">2</Set>
                <Set name="accessToUnderlyingConnectionAllowed">true</Set>
            </New>
        </Arg>
        <Call name="bindToENC">
            <Arg>jdbc/geonetwork</Arg>
        </Call>
    </New>
</Configure>
The parameters that can be specified to control the Apache Database Connection Pool used by the container are described at http://commons.apache.org/dbcp/configuration.html. The following parameters must be set to ensure GeoNetwork operates correctly:
Tomcat Syntax                                  Jetty Syntax
defaultAutoCommit="false"                      <Set name="defaultAutoCommit">false</Set>
accessToUnderlyingConnectionAllowed="true"     <Set name="accessToUnderlyingConnectionAllowed">true</Set>
For performance reasons you should set the following parameters after GeoNetwork has created and filled the database it has been configured to use:
Tomcat Syntax                                  Jetty Syntax
poolPreparedStatements="true"                  <Set name="poolPreparedStatements">true</Set>
maxOpenPreparedStatements="300" (at least)     <Set name="maxOpenPreparedStatements">300</Set>
Notes:
Both PostGIS and Oracle will build and use a table in the database for the spatialindex if provideDataStore is set to true. Other databases could be made to do the same if a spatialindex table is created - see the definition of the spatialIndex table in INSTALL_DIR/web/geonetwork/WEB-INF/classes/setup/sql/create/create-db-postgis.sql for an example.
You should install commons-dbcp-1.3.jar and commons-pool-1.5.5.jar in the container class path (eg. common/lib for tomcat5 or jetty/lib/ext for Jetty) as the only supported DataSourceFactory in GeoTools is Apache Commons DBCP.
The default tomcat-dbcp.jar version of Apache Commons DBCP for Tomcat appears to work correctly for GeoTools and PostGIS but does not work for those databases that need to unwrap the connection in order to do spatial operations (eg. Oracle).
Oracle ojdbc-14.jar or ojdbc5.jar or ojdbc6.jar (depending on the version of Java being used) and sdoapi.jar should also be installed in the container class path (for Tomcat: common/lib or lib, and for Jetty: jetty/lib/ext).
Advanced: you should check the default transaction isolation level for your database driver. READ_COMMITTED appears to be a safe level of isolation to use with GeoNetwork for commonly used databases. Also note that McKoi can only support SERIALIZABLE (does anyone still use McKoi?). For more on transaction isolation see http://en.wikipedia.org/wiki/Isolation_%28database_systems%29.
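Because GeoNetwork depends on defaultAutoCommit and accessToUnderlyingConnectionAllowed being set, it can be worth checking a Tomcat context file before starting the container. The following is a minimal sketch using only the Python standard library; the function name and the sample XML are illustrative assumptions, and only the attribute names and required values come from the table above.

```python
import xml.etree.ElementTree as ET

# Settings this section says must be present for GeoNetwork to operate correctly.
REQUIRED = {
    "defaultAutoCommit": "false",
    "accessToUnderlyingConnectionAllowed": "true",
}


def missing_required_settings(context_xml, resource_name="jdbc/geonetwork"):
    """Return {attribute: current value} for every required setting that is wrong."""
    root = ET.fromstring(context_xml)
    for resource in root.iter("Resource"):
        if resource.get("name") == resource_name:
            return {
                key: resource.get(key)
                for key, wanted in REQUIRED.items()
                if resource.get(key) != wanted
            }
    raise LookupError(f"no <Resource name={resource_name!r}> found")


# Illustrative context file that forgets accessToUnderlyingConnectionAllowed.
SAMPLE = """
<Context>
  <Resource name="jdbc/geonetwork" auth="Container" type="javax.sql.DataSource"
            defaultAutoCommit="false" validationQuery="SELECT 1 FROM DUAL"/>
</Context>
"""

print(missing_required_settings(SAMPLE))
```

An empty dictionary means both required attributes are set as documented; the Jetty file uses <Set> elements instead of attributes, so it would need a different lookup.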
Oracle
ORACLE on Linux (x86_64): if your connection with the database takes a long time to establish or frequently times out, then adding -Djava.security.egd=file:/dev/../dev/urandom to your JAVA_OPTS environment variable (for Tomcat) or the start-geonetwork.sh script may help. For more information on this see https://kr.forums.oracle.com/forums/thread.jspa?messageID=3699989.
ORACLE returns ORA-01000: maximum open cursors exceeded whilst filling the tables in a newly created GeoNetwork database. This occurs because you have enabled the prepared statement pool in either the container database configuration or the GeoNetwork database configuration in WEB-INF/config.xml. Until the database fill statements used by GeoNetwork are refactored, you will not be able to use a prepared statement cache with ORACLE if you are creating and filling a new GeoNetwork database, so you should set the DBCP maxOpenPreparedStatements parameter to -1. However, after the database has been created and filled, you can use a prepared statement cache, so you should stop GeoNetwork and configure the prepared statement cache as described above before restarting.
DB2
DB2 may produce an exception when GeoNetwork is started for the first time:
DB2 SQL error: SQLCODE: -805, SQLSTATE: 51002, SQLERRMC: NULLID.SYSLH203
There are two possible solutions to this problem:
Set up the database manually using a procedure like the following:
db2 create db geonet
db2 connect to geonet user db2inst1 using mypassword
db2 -tf INSTALL_DIR/WEB-INF/classes/setup/sql/create/create-db-db2.sql > res1.txt
db2 -tf INSTALL_DIR/WEB-INF/classes/setup/sql/data/data-db-default.sql > res2.txt
db2 connect reset
After execution, check res1.txt and res2.txt to see whether errors have occurred.
Drop the database, re-create it, locate the file db2cli.lst in the db2 installation folder and execute the following command:
db2 bind @db2cli.lst CLIPKG 30
boxes and polygons, in order to support spatial queries for the Catalog Services Web (CSW) interface, eg. select all metadata records that intersect a search polygon. By default GeoNetwork uses a shapefile, but the shapefile quickly becomes costly to maintain during reindexing, usually after the number of records in the catalog exceeds 20,000. If you select PostGIS or Oracle as your database via JNDI (see previous section), GeoNetwork will build the spatial index in a table (called spatialindex). The spatialindex table in the database is much faster to reindex. More importantly, if appropriate database hardware and configuration steps are taken, it should also be faster to query than the shapefile when the number of records in the catalog becomes very large.
3. Consider the Java heap space
Typically as much memory as you can give GeoNetwork is the answer here. If you have a 32bit machine then you are stuck below 2Gb (or maybe a little higher with some hacks). A 64bit machine is best for large catalogs. Jetty users can set the Java heap space in INSTALL_DIR/bin/start-geonetwork.sh (see the -Xmx option: eg. -Xmx4g will set the heap space to 4Gb on a 64bit machine). Tomcat users can set the environment variable JAVA_OPTS, eg. export JAVA_OPTS=-Xmx4g
4. Consider the number of processors you wish to allocate to GeoNetwork
GeoNetwork 2.8 allows you to use more than one system processor (or core) to speed up reindexing and batch operations on large numbers of metadata records. The records to be processed are split into groups, with each group assigned to an execution thread. You can specify how many threads can be used in the system configuration menu. A reasonable value for the number of threads is the number of processors or cores you have allocated to the GeoNetwork Java Virtual Machine (JVM), or just the number of processors on the machine that you have dedicated to GeoNetwork.
5. Consider the number of database connections to be allocated to GeoNetwork
GeoNetwork uses and reuses a pool of database connections. This is configured in INSTALL_DIR/web/geonetwork/WEB-INF/config.xml or in the container via JNDI. Arriving at a reasonable number for the pool size is not straightforward. You need to consider the number of concurrent harvesters you will run, the number of concurrent batch import and batch operations you expect to run, and the number of concurrent users you are expecting to arrive. The default value of 10 is really only for small sites. The more connections you can allocate, the less time your users and other tasks will spend waiting for a free connection.
6. Consider the maximum number of files your system will allow any process to have open
Most operating systems will only allow a process to open a limited number of files. If you are expecting a large number of records to be in your catalog then you should change the default value to something larger (eg. 4096), as the Lucene index in GeoNetwork will occasionally require large numbers of open files during reindexing. In Linux this value can be changed using the ulimit command (ulimit -a typically shows you the current setting). Find a value that suits your needs and add the appropriate ulimit command (eg. ulimit -n 4096) to the GeoNetwork startup script to make sure that the new limit is used when GeoNetwork is started.
7. Raise the stack size limit for the postgres database
Each process has some memory allocated as a stack. The stack is used to store process arguments and variables as well as state when functions are called. Most operating systems limit the size that the stack can grow to. With large catalogs and spatial searches, very large SQL queries can be generated on the PostGIS spatial index table. This can cause postgres to exceed the process stack size limit (typically 8192k on smaller machines).
You will know when this happens because a very long SQL query will be output to the GeoNetwork log file, prefixed with a cryptic message along the lines of:
java.util.NoSuchElementException: Could not acquire feature:org.geotools.data.DataSourceException: Error Performing SQL query: SELECT
In Linux the stack size can be changed using the ulimit command (ulimit -a typically shows you the current setting). You will need to choose a value and set it (eg. ulimit -s 262140) in the shell startup script of the postgres user (eg. .bashrc if using the bash shell). The setting may also need to be added to the postgres configuration - see max_stack_depth in the postgresql.conf file for your system. You may also have to enable the postgres user to change the stack size in /etc/security/limits.conf. After this has been done, restart postgres.
8. If you need to support a catalog with more than 1 million records
GeoNetwork creates a directory for each record that in turn contains a public and a private directory for holding attached data and thumbnails. These directories are in the GeoNetwork data directory - typically INSTALL_DIR/web/geonetwork/WEB-INF/data - see GeoNetwork data directory. This can exhaust the number of inodes available in a Linux file system (you will often see misleading error reports saying that the filesystem is out of space, even though the filesystem may have lots of free space). Check this using df -i. Since inodes are allocated statically when the filesystem is created for most common filesystems (including ext4), it is rather inconvenient to have to back up all your data and recreate the filesystem! So if you are planning a large catalog with over 1 million records, make sure that you create a filesystem on your machine with the number of inodes set to at least 5x (and to be safe 10x) the number of records you are expecting to hold, and let GeoNetwork create its data directory on that filesystem.
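Several of the checks in the list above (number of processors, open file limit, available inodes) can be gathered in one place before installing a large catalog. The following is a minimal sketch for Linux or another Unix-like system using only the Python standard library. The function names and the report layout are hypothetical; the 4096 open files and the 5x inode factor are the example values from the text.

```python
import os
import resource


def inodes_needed(expected_records, safety_factor=5):
    """Inodes to provision: at least 5x the record count, 10x to be safe."""
    return expected_records * safety_factor


def sizing_report(data_dir, expected_records, min_open_files=4096):
    """Compare this host with the sizing advice for large catalogs."""
    soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = soft_limit == resource.RLIM_INFINITY
    filesystem = os.statvfs(data_dir)
    return {
        # Item 4: one thread per available processor is a reasonable start.
        "suggested_threads": os.cpu_count() or 1,
        # Item 6: same value that `ulimit -n` reports for this process.
        "open_files_limit": soft_limit,
        "open_files_ok": unlimited or soft_limit >= min_open_files,
        # Item 8: same information as the IFree column of `df -i`.
        # Some filesystems do not report inode counts and show 0 here.
        "free_inodes": filesystem.f_favail,
        "inodes_ok": filesystem.f_favail >= inodes_needed(expected_records),
    }


if __name__ == "__main__":
    for key, value in sizing_report("/", expected_records=1_000_000).items():
        print(f"{key}: {value}")
```

Run it as the user that starts GeoNetwork and pass the filesystem that will hold the data directory, because both the open file limit and the inode count depend on where and how the process runs.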
System environment variable
For a Java environment variable or servlet context parameter use <webappName>.dir and, if that is not set, geonetwork.dir.
For a system environment variable use <webappName>_dir and, if that is not set, geonetwork_dir.
Java System Property
Depending on the servlet container used, it is also possible to specify the data directory location with a Java System Property. For Tomcat, the configuration is:
CATALINA_OPTS="-Dgeonetwork.dir=/var/lib/geonetwork_data"
Run the web application in read-only mode
In order to run GeoNetwork with the webapp folder in read-only mode, the user needs to set two variables:
<webappName>.dir or geonetwork.dir for the data folder.
(optional) configuration overrides if configuration files need to be changed (see Configuration override).
For Tomcat, the configuration could be:
CATALINA_OPTS="-Dgeonetwork.dir=/var/lib/geonetwork_data -Dgeonetwork.jeeves.configurati
Structure of the data directory
The structure of the data directory is:
data_directory/
 |--data
 |    |--metadata_data: The data related to metadata records
 |    |--resources:
 |    |     |--htmlcache
 |    |     |--images
 |    |     |     |--harvesting
 |    |     |     |--logos
 |    |     |     |--statTmp
 |    |
 |    |--metadata_subversion: The subversion repository
 |
 |--config: Extra configuration (eg. could contain overrides)
 |    |--schemaplugin-uri-catalog.xml
 |    |--codelist: The thesauri in SKOS format
 |    |--schemaPlugins: The directory used to store new metadata standards
 |
 |--index: All indexes used for search
 |    |--nonspatial: Lucene index
 |    |--spatialindex.*: ESRI Shapefile for the index (if not using spatial db)
 |
 |--removed: Folder with removed metadata.
Advanced data directory configuration
All sub-directories can be configured separately using Java system properties. For example, to put the index directory in a custom location use <webappName>.lucene.dir and, if that is not set, geonetwork.lucene.dir.
Example: Add the following Java properties to the start-geonetwork.sh script:
System information
All catalogue configuration directories can be found using System Information in the Administration page.
Other system properties
In GeoNetwork there are several system properties that can be used to configure different aspects of GeoNetwork. The properties can be set when the web container is started. For example, in Tomcat one can set either JAVA_OPTS or CATALINA_OPTS with -D<propertyname>=<value>.
<webappname>.jeeves.configuration.overrides.file - See Configuration override
jeeves.configuration.overrides.file - See Configuration override
mime-mappings - MIME mappings used by jeeves for generating the response content type
http.proxyHost - The internal GeoNetwork HTTP proxy uses this for configuring how it can access the external network (note that for harvesters there is also a setting in the Settings page of the administration page)
http.proxyPort - The internal GeoNetwork HTTP proxy uses this for configuring how it can access the external network (note that for harvesters there is also a setting in the Settings page of the administration page)
geonetwork.sequential.execution - (true, false) Force indexing to occur in the current thread rather than being queued in the ThreadPool. Good for debugging issues.
There is a use case where multiple GeoNetwork instances might be run in the same web container. Because of this, many of the system properties listed above have a <webappname>. prefix. When declaring the property, this should be replaced with the name of the webapp the setting applies to. Typically this will be geonetwork.
System property with key: {servlet.getServletContext().getServletContextName()}.jeeves.configuration.overr
Servlet init parameter with key: jeeves.configuration.overrides.file
System property with key: jeeves.configuration.overrides.file
Servlet context init parameters with key: jeeves.configuration.overrides.file
The property should be a path or a URL. The method used to find an overrides file is as follows:
1. It is attempted to be used as a URL. If an exception occurs the next option is tried.
2. It is assumed to be a path and the servlet context is used to look up the resource. If it cannot be found the next option is tried.
3. It is assumed to be a file. If the file is not found then an exception is thrown.
An example of an overrides file is as follows:
<overrides>
    <!-- import values. The imported values are put at top of sections -->
    <import file="./imported-config-overrides.xml" />
    <!-- properties allow some properties to be defined that will be substituted -->
    <!-- into text or attributes where ${property} is the substitution pattern -->
    <!-- The properties can reference other properties -->
    <properties>
        <enabled>true</enabled>
        <dir>xml</dir>
        <aparam>overridden</aparam>
    </properties>
    <!-- A regular expression for matching the file affected. -->
    <file name=".*WEB-INF/config\.xml">
        <!-- This example will update the file attribute of the xml element with the na
        <replaceAtt xpath="default/gui/xml[@name = 'countries']" attName="file" value="
        <!-- if there is no value then the attribute is removed -->
        <replaceAtt xpath="default/gui" attName="removeAtt"/>
        <!-- If the attribute does not exist it is added -->
        <replaceAtt xpath="default/gui" attName="newAtt" value="newValue"/>
        <!-- This example will replace all the xml in resources with the contained xml
        <replaceXML xpath="resources">
            <resource enabled="${enabled}">
                <name>main-db</name>
                <provider>jeeves.resources.dbms.DbmsPool</provider>
                <config>
                    <user>admin</user>
                    <password>admin</password>
                    <driver>oracle.jdbc.driver.OracleDriver</driver>
                    <!-- ${host} will be updated to be local host -->
                    <url>jdbc:oracle:thin:@${host}:1521:fs</url>
                    <poolSize>10</poolSize>
                </config>
            </resource>
        </replaceXML>
        <!-- This example simply replaces the text of an element -->
        <replaceText xpath="default/language">${lang}</replaceText>
        <!-- This example shows how only the text is replaced not the nodes -->
        <replaceText xpath="default/gui">ExtraText</replaceText>
        <!-- append xml as a child to a section (If xpath == "" then that indicates the
             this case adds nodes to the root document -->
        <addXML xpath=""><newNode/></addXML>
        <!-- append xml as a child to a section, this case adds nodes to the root docum
        <addXML xpath="default/gui"><newNode2/></addXML>
        <!-- remove a single node -->
        <removeXML xpath="default/gui/xml[@name = 'countries2']"/>
        <!-- The logging files can also be overridden, although not as easily as other
             The files are assumed to be property files and all the properties are load
             The later properties overriding the previously defined parameters. Since t
             log file is not automatically located, the base must be also defined. It
             shipped with geonetwork or another. -->
        <logging>
            <logFile>/WEB-INF/log4j.cfg</logFile>
            <logFile>/WEB-INF/log4j-jeichar.cfg</logFile>
        </logging>
    </file>
    <file name=".*WEB-INF/config2\.xml">
        <replaceText xpath="default/language">de</replaceText>
    </file>
    <!-- a normal file tag is for updating XML configuration files -->
    <!-- textFile tags are for updating normal text files like sql files -->
    <textFile name="test-sql.sql">
        <!-- each line in the text file is matched against the linePattern attribute an
        <update linePattern="(.*) Relations">$1 NewRelations</update>
        <update linePattern="(.*)relatedId(.*)">$1${aparam}$2</update>
    </textFile>
</overrides>
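The textFile updates at the end of the overrides example combine two substitutions: ${property} references are replaced from the properties section, and $1, $2 refer to the groups captured by linePattern. The sketch below illustrates that idea in Python. It is not GeoNetwork's implementation: the function names are hypothetical, and it assumes that the pattern must match the whole line and that lines which do not match are left unchanged.

```python
import re

PROPERTY_REF = re.compile(r"\$\{(\w+)\}")  # ${name}
GROUP_REF = re.compile(r"\$(\d+)")         # $1, $2, ...


def expand_properties(text, properties, max_depth=10):
    """Replace ${name} by its value; values may reference other properties."""
    for _ in range(max_depth):
        expanded = PROPERTY_REF.sub(
            lambda ref: properties.get(ref.group(1), ref.group(0)), text
        )
        if expanded == text:  # nothing left to substitute
            break
        text = expanded
    return text


def update_line(line, line_pattern, replacement, properties):
    """Apply one <update linePattern="...">replacement</update> rule to a line."""
    match = re.fullmatch(line_pattern, line)
    if match is None:
        return line  # assumption: non-matching lines are kept as they are
    template = expand_properties(replacement, properties)
    return GROUP_REF.sub(lambda ref: match.group(int(ref.group(1))), template)


properties = {"aparam": "overridden"}
print(update_line("CREATE TABLE Relations", r"(.*) Relations",
                  "$1 NewRelations", properties))
print(update_line("    relatedId  int,", r"(.*)relatedId(.*)",
                  "$1${aparam}$2", properties))
```

With the properties from the example file, the second rule turns every line containing relatedId into the same line with relatedId replaced by the value of aparam.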
Usually, if the field is only for searching and should not be displayed in search results, the store attribute can be set to false. Once the field has been added to the index, users can use it as a search criterion in the different kinds of search services. For example:
http://localhost:8080/geonetwork/srv/en/q?mytitle=africa
If the user wants this field to be tokenized, it should be added to the tokenized section of config-lucene.xml:
<tokenized>
    <Field name="mytitle"/>
If the user wants this field to be returned in search results for the search service, then the field should be added to the Lucene configuration in the dumpFields section:
<dumpFields>
    <field name="mytitle" tagName="mytitle"/>
Boosting documents and fields
Document and field boosting allows the catalogue administrator to customize the default Lucene scoring in order to promote certain types of records. A common use case is when the catalogue contains a lot of series for aggregating datasets. Not promoting the series could make the series useless even if those records contain important content. Boosting this type of document promotes series and guides the end-user from series to related records (through the relation navigation). In that case, the following configuration boosts series and lowers the importance of records that are part of a series:
<boostDocument name="org.fao.geonet.kernel.search.function.ImportantDocument">
    <Param name="fields" type="java.lang.String" value="type,parentUuid"/>
    <Param name="values" type="java.lang.String" value="series,NOTNULL"/>
    <Param name="boosts" type="java.lang.String" value=".2F,-.3F"/>
</boostDocument>
The boost is a positive or negative float value. This feature is intended for expert users who want to alter the default search scoring according to catalogue content. It needs tuning and experimentation so that some records are not promoted too much. During testing, if search results look different depending on whether the user is logged in or not, it could be relevant to ignore some internal fields in the boost computation, as they may alter scoring according to the current user. Example configuration:
<fieldBoosting>
    <Field name="_op0" boost="0.0F"/>
    <Field name="_op1" boost="0.0F"/>
    <Field name="_op2" boost="0.0F"/>
    <Field name="_dummy" boost="0.0F"/>
    <Field name="_isTemplate" boost="0.0F"/>
    <Field name="_owner" boost="0.0F"/>
</fieldBoosting>
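In the boostDocument configuration shown earlier in this section, the fields, values and boosts parameters are comma-separated lists. The sketch below assumes they are read positionally, so that the first field is paired with the first value and the first boost; this matches the description (series boosted, members of a series lowered) but is an assumption rather than a statement about the ImportantDocument class. The function name boost_rules is hypothetical, and the trailing F of the Java float literals is stripped because Python's float() does not accept it.

```python
def boost_rules(fields, values, boosts):
    """Pair up the three comma-separated boostDocument parameters by position."""
    names = fields.split(",")
    expected = values.split(",")
    weights = [float(boost.rstrip("Ff")) for boost in boosts.split(",")]
    if not (len(names) == len(expected) == len(weights)):
        raise ValueError("fields, values and boosts must have the same length")
    return list(zip(names, expected, weights))


rules = boost_rules("type,parentUuid", "series,NOTNULL", ".2F,-.3F")
for field, value, weight in rules:
    print(f"{field} = {value}: boost {weight:+.1f}")
```

Reading the configuration this way makes it easier to spot a list that is one entry short before restarting the catalogue to test the scoring.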
Boosting search results
By default Lucene computes the score according to the search criteria, the corresponding result set and the index content. In the case of a search with no criteria, Lucene returns the top documents in index order (because none is more relevant than the others). In order to change the score computation, a boost function can be defined. The boosting query class needs to be loaded in the classpath. A sample boosting class is available: RecencyBoostingQuery promotes recently modified documents:
<boostQuery name="org.fao.geonet.kernel.search.function.RecencyBoostingQuery">
    <Param name="multiplier" type="double" value="2.0"/>
    <Param name="maxDaysAgo" type="int" value="365"/>
    <Param name="dayField" type="java.lang.String" value="_changeDate"/>
</boostQuery>
2. Select Add a new group. You may want to remove the Sample group;
3. Fill out the details. The email address will be used to send feedback on data downloads when they occur for resources that are part of the Group. Warning: The Name should NOT contain spaces! You can use the Localization panel to provide localized names for groups.
Figure 3.11: Group edit form 4. Click on Save Access privileges can be set per metadata record. You can dene privileges on a per Group basis. Privileges that can be set relate to visibility of the Metadata (Publish), data Download, Interactive Map access and display of the record in the Featured section of the home page. Editing denes the groups for which editors can edit the metadata record. Notify denes what Groups are notied when a le managed by GeoNetwork is downloaded. Below is an example of the privileges management table related to a dataset.
3. Click on Save.
3.5 Localization
3.5.1 Localization of dynamic user interface elements
The user interface of GeoNetwork can be localized into several languages through XML language files. Besides static text, there is also more dynamic text that can be added and changed interactively. This text is stored in the database and is translated using the Localization form that is part of the administrative functions.
Figure 3.16: How to open the Localization form

The form allows you to localize the following entities: Groups, Categories, Operations and Regions. The localization form is subdivided into a left and a right panel. The left panel allows you to choose which elements you want to edit. At the top, a dropdown lets you choose which entity to edit. All elements of the selected type are shown in a list. When you select an element from the list, the right panel will show the text as it will be displayed in the user interface. The text in the source language is read-only, while you can update the text in the target language field.
Note: You can change the source and target languages to best suit your needs. Some users may for instance prefer to translate from French to Spanish, others prefer to work with English as the source language. Use the Save button to store the updated label and move to the next element. Warning: If the user changes a label and chooses another target language without saving, the label change is lost.
/criticalhealthcheck - runs only the critical (fast) health checks and returns 200 if all checks pass or 500 Internal Server Error if one fails
/warninghealthcheck - runs only the non-critical health checks and returns 200 if all checks pass or 500 Internal Server Error if one fails
/expensivehealthcheck - runs only the expensive critical health checks and returns 200 if all checks pass or 500 Internal Server Error if one fails
/monitor - provides links to the pages listed above.

Links to this data are also available in the geonetwork/srv/eng/config.info administration user interface. By default the /monitor/* URLs are protected and may only be accessed by an administrator or monitor. However, it is possible in the web.xml to provide a whitelist of URLs or IP addresses of monitoring servers that are permitted to access the monitoring data without needing an administration account.

The monitors available are:
- Database Health Monitor - checks that the database is accessible
- Index Health Monitor - checks that the Lucene index is searchable
- Index Error Health Monitor - checks that there are no index errors in the index (documents with _indexError field == 1)
- CSW GetRecords Health Monitor - checks that GetRecords does not return an error for a basic hits search
- CSW GetCapabilities Health Monitor - checks that the GetCapabilities document is returned and is not an error document
- Database Access timer - time taken to access a DBMS instance. This gives an idea of the level of contention over the database connections
- Database Open Timer - tracks the length of time a Database access is kept open
- Database Connection Counter - counts the number of open Database connections
- Harvester Error Counter - tracks errors that are raised during harvesting
- Service timer - tracks the time of service execution
- Gui Services timer - tracks the time spent executing Gui services
- XSL output timer - tracks the time of the output XSL transform
- Log4j integration - monitors how frequently messages are logged at each log level so that (for example) the rate at which errors are logged can be monitored. See http://metrics.codahale.com/manual/log4j

The monitors that are enabled are listed in the config-monitoring.xml file and, if desired, certain monitors can be disabled. In the source code repository there are configuration files for collectd (and perhaps other monitoring software in the future).
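The health check endpoints report their result through the HTTP status code, so a monitoring script only needs to look at that code. The following Python sketch shows one way to do this using the standard library. The base URL passed to check() is a placeholder you must supply for your own installation, and the labels "healthy", "failing" and "unknown" are illustrative names, not GeoNetwork terms.

```python
import urllib.error
import urllib.request


def interpret_status(code: int) -> str:
    """Map the HTTP status of a health check to an illustrative label."""
    if code == 200:
        return "healthy"   # all checks passed
    if code == 500:
        return "failing"   # at least one check failed
    return "unknown"       # e.g. 401/403 when the URL is protected


def check(base_url: str, endpoint: str = "criticalhealthcheck", timeout: float = 10.0) -> str:
    """Call one health check endpoint below `base_url` (an assumed, site-specific URL)."""
    url = f"{base_url.rstrip('/')}/{endpoint}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return interpret_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still the answer we want.
        return interpret_status(err.code)
```

A result of "unknown" usually means the monitoring host is not whitelisted and no administrator or monitor credentials were supplied.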
CHAPTER 4
Managing Metadata
4.1 Templates
The Metadata and Templates options in the Administration page allow you to manage the metadata templates in the catalog. You have to be logged in as an administrator to access this page and function.
Figure 4.1: The listing as shown to Editors

Use drag and drop to re-order the templates.
4.2.1 Viewing
An administrator can view any metadata.

A content reviewer can view a metadata if:
1. The metadata owner is member of one of the groups assigned to the reviewer.
2. She/he is the metadata owner.

A user administrator or an editor can view:
1. All metadata that has the view privilege selected for one of the groups she/he is member of.
2. All metadata created by her/him.

A registered user can view:
1. All metadata that has the view privilege selected for one of the groups she/he is member of.

Public metadata can be viewed by any user (logged in or not).
4.2.2 Editing
An administrator can edit any metadata.

A reviewer can edit a metadata if:
1. The metadata owner is member of one of the groups assigned to the reviewer.
2. She/he is the metadata owner.

A User Administrator or an Editor can only edit metadata she/he created.
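The viewing and editing rules above can be summarised as two small predicates. The following Python sketch is only a reading aid: the User and Record classes, the profile names and the public flag are hypothetical simplifications and do not correspond to GeoNetwork's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    profile: str  # "Administrator", "Reviewer", "UserAdmin", "Editor", "RegisteredUser" or "Guest"
    groups: set = field(default_factory=set)


@dataclass
class Record:
    owner: User
    view_groups: set = field(default_factory=set)  # groups holding the view privilege
    public: bool = False                           # simplification of "public metadata"


def can_view(user: User, record: Record) -> bool:
    if record.public or user.profile == "Administrator":
        return True
    if user.profile == "Reviewer":
        return user.name == record.owner.name or bool(record.owner.groups & user.groups)
    if user.profile in ("UserAdmin", "Editor"):
        return user.name == record.owner.name or bool(record.view_groups & user.groups)
    if user.profile == "RegisteredUser":
        return bool(record.view_groups & user.groups)
    return False


def can_edit(user: User, record: Record) -> bool:
    if user.profile == "Administrator":
        return True
    if user.profile == "Reviewer":
        return user.name == record.owner.name or bool(record.owner.groups & user.groups)
    if user.profile in ("UserAdmin", "Editor"):
        return user.name == record.owner.name
    return False
```

Note how the reviewer rules depend on the groups of the record owner, while the rules for editors and registered users depend on the groups that hold the view privilege on the record.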
The following rules apply:
- the groups that will appear in the Privileges page will be those that the user belongs to
- the Privileges specified will only be applied to records that the user has ownership or administration rights on - any other records will be skipped.
Figure 4.3: How to open the Transfer Ownership page

an Editor will select all metadata that is managed by that Editor. An empty dropdown means that there are no Editors with metadata associated and hence no transfer is possible.

Note: The drop down will be filled with all Editors visible to you. If you are not an Administrator, you will view only a subset of all Editors.
Figure 4.4: The Transfer Ownership page

Once a Source Editor has been selected, a set of rows is displayed. Each row refers to a group of the Editor for which there are privileges. The meaning of each column is the following:
1. Source group: This is a group that has privileges in the metadata that belong to the source editor. Put another way, if one of the editor's metadata records has privileges for a group, that group is listed here.
2. Target group: This is the destination group of the transfer process. All privileges relative to the source group are transferred to the target group. The target group drop down is filled with all groups visible to the logged-in user (typically an administrator or a user administrator). By default, the Source group is selected in the target dropdown. Privileges to groups All and Intranet are not transferable.
3. Target editor: Once a Target group is selected, this drop down is filled with all editors that belong to that Target group.
4. Operation: Currently only the Transfer operation is possible.

By selecting the Transfer operation, if the Source group is different from the Target group, the system performs the Transfer of Ownership, shows a brief summary and removes the current row because there are no more privileges to transfer.
The following rules apply:
- Only administrators or user administrators can set ownership on a selected set of records
- administrators can set ownership to any user
- user administrators can set ownership to any user in the same group(s) as them
- Ownership will only be transferred on those records that the user has ownership or administration rights on - any others will be skipped.
1. XML file from the filesystem on your machine.
2. MEF file from the filesystem on your machine.
3. Copy/Paste XML

In order to use this facility, you have to be logged in as an editor. After the login step, go to the administration page and select the Metadata insert link.
Clicking the link will open the metadata import page. You will then have to specify a set of parameters. The following screenshot shows the parameters for importing an XML file. We'll describe the options you see on this page because they are common to the ways you can import metadata records in this interface.

File Type - The first option is to choose the type of metadata record you are loading. The two choices are:
- Metadata - use when loading a normal metadata record
- Template - use when loading a metadata record that will be used as a template to build new records in the editor.
Figure 4.5: The XML file import options

Import Action - This option group determines how to handle potential clashes between the UUID of the metadata record you are loading and the UUIDs of metadata records already present in the catalog. There are three actions and you can select one:
- No action on import - the UUID of the metadata record you are loading is left unchanged. If a metadata record with the same UUID is already present in the catalog, you will receive an error message.
- Overwrite metadata with same UUID - any existing metadata record in the catalog with the same UUID as the record you are loading will be replaced with the metadata record you are loading.
- Generate UUID for inserted metadata - create a new UUID for the metadata records you are loading.

Stylesheet - Allows you to transform the metadata record using an XSLT stylesheet before loading the record. The drop down control is filled with the names of files taken from the INSTALL_DIR/web/geonetwork/xsl/conversion/import folder. (Files can be added to this folder without restarting GeoNetwork). As an example, you could use this option to convert a metadata record into a schema that is supported by GeoNetwork.

Validate - The metadata is validated against its schema before loading. If it is not valid it will not be loaded.

Group - Use this option to select a user group to assign to the imported metadata.

Category - Use this option to select a local category to assign to the imported metadata. Categories are local to the catalogue you are using and are intended to provide a simple way of searching groups of metadata records.

MEF file import

If you select MEF file in the File type option, only the Import actions option group is shown. See above for more details.

Note: a MEF file can contain more than one metadata record.
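The three Import Action choices described above differ only in how a UUID clash is handled. The following Python sketch models the catalog as a plain dictionary keyed by UUID. The function name resolve_import and the action labels "nothing", "overwrite" and "generate" are hypothetical and chosen for this illustration; they are not necessarily the values used by the GeoNetwork interface.

```python
import uuid


def resolve_import(catalog: dict, record_uuid: str, xml: str, action: str) -> str:
    """Store `xml` in `catalog` according to the chosen import action.

    Returns the UUID under which the record was stored.
    """
    if action == "generate":
        # Generate UUID for inserted metadata: never clashes.
        record_uuid = str(uuid.uuid4())
    elif action == "nothing":
        # No action on import: a clash is reported as an error.
        if record_uuid in catalog:
            raise ValueError(f"A record with UUID {record_uuid} already exists")
    elif action != "overwrite":
        raise ValueError(f"Unknown import action: {action}")
    # For "overwrite", an existing record with the same UUID is simply replaced.
    catalog[record_uuid] = xml
    return record_uuid
```

Generating a new UUID is the safest choice when loading copies of existing records, whereas overwriting is the natural choice when re-loading corrected versions of records that are already in the catalog.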
Figure 4.6: The MEF file import options

Copy/Paste XML

If you select Copy/Paste in the Insert mode option, then a text box appears. You can copy the XML from another window and paste it into that text box. The options for loading that XML are the same as those for loading an XML file - see above.
Figure 4.9: The batch import options

Stylesheet - Allows you to transform the metadata record using an XSLT stylesheet before loading the record. The drop down control is filled with the names of files taken from the INSTALL_DIR/web/geonetwork/xsl/conversion/import folder. (Files can be added to this folder without restarting GeoNetwork). As an example, you could use this option to convert a metadata record into a schema that is supported by GeoNetwork.

Validate - The metadata is validated against its schema before loading. If it is not valid it will not be loaded.

Group - Use this option to select a user group to assign to the imported metadata.

Category - Use this option to select a local category to assign to the imported metadata. Categories are local to the catalogue you are using and are intended to provide a simple way of searching groups of metadata records.

At the bottom of the page there are two buttons:
- Back - Goes back to the administration form.
- Upload - Starts the import process.

Notes on the batch import process
- When the import process ends, the total count of imported metadata will be shown
- The import is transactional: the metadata set will be fully imported or fully discarded (there are no partial imports)
- Files that start with . or that do not end with .xml or .mef are ignored

Structured batch import using import-config.xml

Finer control of the batch import process can be obtained by structuring the metadata files into directories mapped to categories and metadata schemas and describing the mapping in a file called import-config.xml. The import-config.xml should be placed in the directory from which you will batch import (see Directory parameter above). It has a config root element with the following children:
1. categoryMapping [1]: this element specifies the mapping of directories to categories.
(a) mapping [0..n]: This element can appear 0 or more times and maps one directory name to a category name. It must have a dir attribute that indicates the directory and a to attribute that indicates the category name.
(b) default [1]: This element specifies a default mapping of categories for all directories that do not match the other mapping elements. The default element can only have one attribute called to.
2. schemaMapping [1]: this element specifies the mapping of directories to metadata schemas.
(a) mapping [0..n]: This element can appear 0 or more times and maps one directory to the schema name that must be used when importing. The provided schema must match the one used by the metadata contained in the specified directory, which must all have the same schema. It must have a dir attribute that indicates the directory and a to attribute that indicates the schema name.
(b) default [1]: default behaviour to use when all other mapping elements do not match. The default element can only have one attribute called to.

Here is an example of the import-config.xml file:
<config>
  <categoryMapping>
    <mapping dir="1" to="maps" />
    <mapping dir="3" to="datasets" />
    <mapping dir="6" to="interactiveResources" />
    <mapping dir="30" to="photo" />
    <default to="maps" />
  </categoryMapping>
  <schemaMapping>
    <mapping dir="3" to="fgdc-std" />
    <default to="dublin-core" />
  </schemaMapping>
</config>
As described above, the import procedure starts by scanning the specified Directory. Apart from the import-config.xml file, this directory should only contain subdirectories - these are the category directories referred to in the categoryMapping section of the import-config.xml file described above. Each of the category directories should only contain subdirectories - these are the schema directories referred to in the schemaMapping section of the import-config.xml file described above.
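To illustrate how the mapping and default elements work together, the following Python sketch parses the example import-config.xml shown earlier and resolves a category directory and a schema directory to their category and schema names. It is a reading aid only: the function names load_mapping and resolve are hypothetical, and the sketch does not reproduce GeoNetwork's own import code.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<config>
  <categoryMapping>
    <mapping dir="1" to="maps" />
    <mapping dir="3" to="datasets" />
    <mapping dir="6" to="interactiveResources" />
    <mapping dir="30" to="photo" />
    <default to="maps" />
  </categoryMapping>
  <schemaMapping>
    <mapping dir="3" to="fgdc-std" />
    <default to="dublin-core" />
  </schemaMapping>
</config>
"""


def load_mapping(section):
    """Return (explicit directory mappings, default target) for one mapping section."""
    explicit = {m.get("dir"): m.get("to") for m in section.findall("mapping")}
    default = section.find("default").get("to")
    return explicit, default


def resolve(config_xml: str, category_dir: str, schema_dir: str):
    """Resolve a category directory and a schema directory name to (category, schema)."""
    root = ET.fromstring(config_xml)
    categories, default_category = load_mapping(root.find("categoryMapping"))
    schemas, default_schema = load_mapping(root.find("schemaMapping"))
    return (
        categories.get(category_dir, default_category),
        schemas.get(schema_dir, default_schema),
    )


if __name__ == "__main__":
    print(resolve(SAMPLE, "3", "3"))
    print(resolve(SAMPLE, "99", "1"))
```

With the example file, a directory named 3 resolves to the datasets category and the fgdc-std schema, while any directory name without an explicit mapping falls back to the default elements.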
- a brief summary of some of the elements from each selected metadata record is generated by applying the brief template from the metadata schema eg. for an iso19139 metadata record the brief template from GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/present/metadata-iso19139 would be applied to the metadata record.
- the elements common to the brief summary elements for all metadata records are extracted (as they may differ according to the metadata schema)
- a title record with comma separated element names is created
- the content of each element is laid out in comma separated form. Where there is more than one child element in the brief element (eg. for geoBox), the content from each child element is separated using ###.

An example of an ISO metadata record in CSV format is shown as follows:
It is possible to override the brief summary of metadata elements by creating a special template in the presentation XSLT of the metadata schema. As an example of how to do this, we will override the brief summary for the iso19139 schema and replace it with just one element: gmd:title. To do this we create an XSLT template as follows:
<xsl:template match="gmd:MD_Metadata" mode="csv">
  <xsl:param name="internalSep"/>
  <metadata>
    <!-- add in our field -->
    <xsl:copy-of select="gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/g
    <!-- copy geonet:info element in - has special metadata eg schema name -->
    <xsl:copy-of select="geonet:info"/>
  </metadata>
</xsl:template>
This template, when added to GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/present/me will replace the brief summary (produced by the brief template) with just one element, gmd:title.
4.5 Status
Metadata records have a lifecycle that typically goes through one or more states. For example, when a record is created and edited by an Editor user it is in the Draft state. Whilst it is reviewed by a Content Reviewer user it would typically be in a Submitted state. If the record is found to be complete and correct by the Content Reviewer it would be in the Approved state and may be made available for casual search and harvest by assigning privileges to the GeoNetwork All group. Eventually, the record may be superseded or replaced and the state would be Retired.

GeoNetwork has an (extensible) set of states that a metadata record can have:
- Unknown - this is the default state - nothing is known about the status of the metadata record
- Draft - the record is under construction or being edited
- Submitted - the record has been submitted for approval to a content reviewer
- Approved - the content reviewer has reviewed and approved the metadata record
- Rejected - the content reviewer has reviewed and rejected the metadata record
- Retired - the record has been retired

Status can be assigned to metadata records individually or as a selected set.
Figure: Initiating status change for a single metadata record

Figure: Initiating status change for a set of metadata records

The interface for setting the status looks like the following:

Figure: Changing the status of a set of metadata records
It is also possible to search for metadata records with a particular status using a search restriction in the Advanced Search menu.
Date: Tue, 13 Dec 2011 12:58:58 +1100 (EST)
From: Metadata Workflow <[email protected]>
Subject: Metadata records SUBMITTED by [email protected] (User One
To: "[email protected]" <[email protected]>
Reply-to: User One <[email protected]>
Message-id: <1968852534.01323741538713.JavaMail.geonetwork@localgeonetwork.org.

These records are complete. Please review.
When a Content Reviewer changes the state of one or more metadata records from Submitted to Approved or Rejected, the owner of the metadata record is informed of the status change via email. Again, the user can log in and use the link supplied in the email to access the approved/rejected records. Here is an example email sent by this action:
Date: Wed, 14 Dec 2011 12:28:01 +1100 (EST)
From: Metadata Workflow <[email protected]>
Subject: Metadata records APPROVED by [email protected] (Reviewer
To: "User One" <[email protected]>
Message-ID: <1064170697.31323826081004.JavaMail.geonetwork@localgeonetwork.org.
Reply-To: Reviewer <[email protected]>

Records approved - please resubmit for approval when online resources attached
onEdit: This action is called when a record is edited and saved by a user. If the user did not indicate that the edit changes were a Minor edit and the current status of the record is Approved, then the default action is to set the status to Draft and remove the privileges for the GeoNetwork group All.
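The default onEdit behaviour described above amounts to a small rule on the record status and its privileges. The following Python sketch expresses that rule; the function name on_edit and the representation of privileges as a set of group names are hypothetical simplifications for illustration.

```python
def on_edit(status: str, privileged_groups: set, minor_edit: bool):
    """Return the (status, privileged groups) of a record after it is edited and saved.

    Default rule: a non-minor edit of an Approved record sends it back to Draft
    and removes the privileges of the All group.
    """
    if status == "Approved" and not minor_edit:
        return "Draft", privileged_groups - {"All"}
    return status, set(privileged_groups)


if __name__ == "__main__":
    print(on_edit("Approved", {"All", "sample"}, minor_edit=False))
    print(on_edit("Approved", {"All", "sample"}, minor_edit=True))
```

In other words, ticking Minor edit is what keeps an approved record published while small corrections are made.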
4.6 Versioning
There are many use cases where it is important to be able to track (over time):
- changes to the metadata record
- changes to properties of the metadata record eg. privileges, categories, status

GeoNetwork uses a subversion repository to capture these changes and allows the user to examine the changes through the various visual interfaces to subversion repositories that already exist eg. viewvc. Apart from the advantage of ready to use tools for examining the changes, the subversion approach is efficient for XML files and simple to maintain. The database remains the point of truth for GeoNetwork. That is, changes will be tracked in subversion, but all services will continue to extract the latest version of the metadata record from the database.
from the database and passed as a commit to the subversion repository, creating a new version in the repository. This process is automatic - at the moment the user cannot force a new version to be created, unless they change the metadata record or its properties. Due to recent changes in the way in which GeoNetwork database sessions are committed (forced by the adoption of background threads for work tasks) and the implementation dependent way in which database transaction isolation is handled by different vendors, there is a small chance that database sessions may overlap. This may mean that the ordering of the changes committed to the subversion repository may not be correct in a small number of cases. After some discussion amongst the developers, the implementation may change to remove this possibility in the next version of GeoNetwork.
  <record>
    <group_name>sample</group_name>
    <operation_id>3</operation_id>
    <operation_name>notify</operation_name>
  </record>
  <record>
    <group_name>intranet</group_name>
    <operation_id>5</operation_id>
    <operation_name>dynamic</operation_name>
  </record>
  <record>
    <group_name>all</group_name>
    <operation_id>5</operation_id>
    <operation_name>dynamic</operation_name>
  </record>
  <record>
    <group_name>intranet</group_name>
    <operation_id>6</operation_id>
    <operation_name>featured</operation_name>
  </record>
  <record>
    <group_name>all</group_name>
    <operation_id>6</operation_id>
    <operation_name>featured</operation_name>
  </record>
</response>
Difference between revisions 3 and 4 for the privileges.xml file for metadata record 10:
svn diff -r 3:4
Index: 10/privileges.xml
===================================================================
--- 10/privileges.xml (revision 3)
+++ 10/privileges.xml (revision 4)
@@ -1,12 +1,52 @@
 <response>
   <record>
+    <group_name>intranet</group_name>
+    <operation_id>0</operation_id>
+    <operation_name>view</operation_name>
+  </record>
+  <record>
     <group_name>sample</group_name>
     <operation_id>0</operation_id>
     <operation_name>view</operation_name>
   </record>
   <record>
+    <group_name>all</group_name>
+    <operation_id>0</operation_id>
+    <operation_name>view</operation_name>
+  </record>
+  <record>
+    <group_name>intranet</group_name>
+    <operation_id>1</operation_id>
+    <operation_name>download</operation_name>
+  </record>
+  <record>
+    <group_name>all</group_name>
+    <operation_id>1</operation_id>
+    <operation_name>download</operation_name>
+  </record>
+  <record>
     <group_name>sample</group_name>
     <operation_id>3</operation_id>
     <operation_name>notify</operation_name>
   </record>
+  <record>
+    <group_name>intranet</group_name>
+    <operation_id>5</operation_id>
+    <operation_name>dynamic</operation_name>
+  </record>
+  <record>
+    <group_name>all</group_name>
+    <operation_id>5</operation_id>
+    <operation_name>dynamic</operation_name>
+  </record>
+  <record>
+    <group_name>intranet</group_name>
+    <operation_id>6</operation_id>
+    <operation_name>featured</operation_name>
+  </record>
+  <record>
+    <group_name>all</group_name>
+    <operation_id>6</operation_id>
+    <operation_name>featured</operation_name>
+  </record>
 </response>
Examination of this diff file shows that privileges for the All and Intranet groups have been added between revision 3 and 4 - in short, the record has been published. Here is an example of a change that has been made to a metadata record:
svn diff -r 2:3
Index: 10/metadata.xml
===================================================================
--- 10/metadata.xml (revision 2)
+++ 10/metadata.xml (revision 3)
@@ -61,7 +61,7 @@
       </gmd:CI_ResponsibleParty>
     </gmd:contact>
     <gmd:dateStamp>
-      <gco:DateTime>2012-01-10T01:47:51</gco:DateTime>
+      <gco:DateTime>2012-01-10T01:48:06</gco:DateTime>
     </gmd:dateStamp>
     <gmd:metadataStandardName>
       <gco:CharacterString>ISO 19115:2003/19139</gco:CharacterString>
@@ -85,7 +85,7 @@
       <gmd:citation>
         <gmd:CI_Citation>
           <gmd:title>
-            <gco:CharacterString>Template for Vector data in ISO19139 (preferred!)</gco:CharacterString>
+            <gco:CharacterString>fobblers foibblers</gco:CharacterString>
           </gmd:title>
           <gmd:date>
             <gmd:CI_Date>
This example shows that the editor has made a change to the title and the dateStamp.
4.6.4 Looking at the revision history using viewvc - a graphical user interface
The viewvc subversion repository tool has a graphical interface that allows side-by-side comparison of changes/differences between files:
Looking at the changes in the privileges set on a metadata record using browser to query viewvc
4.7 Harvesting
There has always been a need to share metadata between GeoNetwork nodes and bring metadata into GeoNetwork from other sources eg. self-describing web services that deliver data and metadata or databases with organisational metadata etc.

Harvesting is the process of collecting metadata from a remote source and storing it locally in GeoNetwork for fast searching via Lucene. This is a periodic process, run for example once a week. Harvesting is not a simple import: local and remote metadata are kept aligned.

GeoNetwork is able to harvest from the following sources (for more details see below):
1. Another GeoNetwork node (version 2.1 or above). See GeoNetwork Harvesting
2. An old GeoNetwork 2.0 node (deprecated). See GeoNetwork 2.0 Harvester
3. A WebDAV server. See WEBDAV Harvesting
4. A CSW 2.0.1 or 2.0.2 catalogue server. See CSW Harvesting
5. A GeoPortal 9.3.x server. See GeoPortal REST Harvesting
6. A File system accessible by GeoNetwork. See Local File System Harvesting
7. An OAI-PMH server. See OAIPMH Harvesting
8. An OGC service using its GetCapabilities document. These include WMS, WFS, WPS and WCS services. See Harvesting OGC Services
9. An ArcSDE server. See Harvesting an ARCSDE Node
10. A THREDDS catalog. See THREDDS Harvesting
11. An OGC WFS using a GetFeature query. See WFS GetFeature Harvesting
12. One or more Z3950 servers. See Z3950 Harvesting
4. Node (D) harvests from all of (A), (B) and (C)

In this scenario, Node (D) will get the same metadata (a) from all 3 nodes (A), (B), (C). The metadata will flow to (D) following 3 different paths but, thanks to its UUID, only one copy will be stored. When (a) is changed in (A), a new version will flow to (D) and, thanks to the change date, the copy in (D) will be updated with the most recent version.
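The role of the UUID and the change date in this scenario can be illustrated with a few lines of Python. This is a sketch of the idea only, not of the harvester implementation: the dictionary keys uuid, changeDate and via are hypothetical, and change dates are assumed to be ISO 8601 strings in the same format so that they compare correctly as text.

```python
def merge_harvested(local: dict, incoming: list) -> dict:
    """Merge harvested records into `local`, keyed by UUID.

    A record is stored if its UUID is new, or replaced if the incoming
    copy has a more recent change date than the stored one.
    """
    for record in incoming:
        current = local.get(record["uuid"])
        if current is None or record["changeDate"] > current["changeDate"]:
            local[record["uuid"]] = record
    return local


if __name__ == "__main__":
    node_d = {}
    # The same record (a) arrives via three different paths.
    merge_harvested(node_d, [
        {"uuid": "a", "changeDate": "2012-01-10T01:47:51", "via": "A"},
        {"uuid": "a", "changeDate": "2012-01-10T01:47:51", "via": "B"},
        {"uuid": "a", "changeDate": "2012-01-10T01:47:51", "via": "C"},
    ])
    print(len(node_d), node_d["a"]["via"])
```

Because the key is the UUID, the three copies collapse into one record, and a later harvest only replaces it when the change date has moved forward.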
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException unable to find valid certification path to requested target
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
The server certificate for the GeoNetwork server being harvested needs to be added to the JVM keystore with keytool in order to be trusted. An alternative way to add the certificate is to use a script like:
##
## JAVA SSL Certificate import script
## Based on original MacOs script by [email protected] : http://louise.hu
##
## Usage: ./ssl_key_import.sh <sitename> <port>
##

## Compile and start
javac InstallCert.java
java InstallCert $1:$2

## Copy new cert into local JAVA keystore
echo "Please, enter administrator password:"
sudo cp jssecacerts $JAVA_HOME/jre/lib/security/jssecacerts

# Comment previous line and uncomment next one for MacOs
#sudo cp jssecacerts /Library/Java/Home/lib/security/
To use the script, the Java compiler must be installed and the file InstallCert.java must be downloaded and placed in the same directory as the script.
The script will add the certificate to the JVM keystore if you run it as follows:
$ ./ssl_key_import.sh https_server_name 443
Note: Use this script at your own risk! Before installing a certificate in the JVM keystore as trusted, make sure you understand the security implications.

Note: After adding the certificate you will need to restart GeoNetwork.
Figure 4.11: How to access the harvesting main page

The harvesting main page will then be displayed. The page shows a list of the currently defined harvesters and a set of buttons for management functions. The meaning of each column in the list of harvesters is as follows:
1. Select - Check box to select one or more harvesters. The selected harvesters will be affected by the first row of buttons (activate, deactivate, run, remove). For example, if you select three harvesters and press the Remove button, they will all be removed.
2. Name - This is the harvester name provided by the administrator.
3. Type - The harvester type (eg. GeoNetwork, WebDAV etc...).
Figure 4.12: The harvesting main page

4. Status - An icon showing current status. See Harvesting Status and Error Icons for the different icons and status descriptions.
5. Errors - An icon showing the result of the last harvesting run, which could have succeeded or not. See Harvesting Status and Error Icons for the different icons and error descriptions. Hovering the cursor over the icon will show detailed information about the last harvesting run.
6. Run at and Every - Scheduling of harvester runs. Essentially the time of the day + how many hours between repeats and on which days the harvester will run.
7. Last run - The date, in ISO 8601 format, of the most recent harvesting run.
8. Operation - A list of buttons/links to operations on a harvester. Selecting Edit will allow you to change the parameters for a harvester. Selecting Clone will allow you to create a clone of this harvester and start editing the details of the clone. Selecting History will allow you to view/change the harvesting history for a harvester - see Harvest History.

At the bottom of the list of harvesters are two rows of buttons. The first row contains buttons that can operate on a selected set of harvesters. You can select the harvesters you want to operate on using the check box in the Select column and then press one of these buttons. When the button finishes its action, the check boxes are cleared. Here is the meaning of each button:
1. Activate - When a new harvester is created, the status is inactive. Use this button to make it active and start the harvester(s) according to the schedule it has/they have been configured to use.
2. Deactivate - Stops the harvester(s). Note: this does not mean that currently running harvest(s) will be stopped. Instead, it means that the harvester(s) will not be scheduled to run again.
3. Run - Start the selected harvesters immediately. This is useful for testing harvester setups.
4. Remove - Remove all currently selected harvesters. A dialogue will ask the user to confirm the action.

The second row contains general purpose buttons. Here is the meaning of each button:
1. Back - Simply returns to the main administration page.
2. Add - This button creates a new harvester.
3. Refresh - Refreshes the current list of harvesters from the server. This can be useful to see if the harvesting list has been altered by someone else or to get the status of any running harvesters.
4. History - Show the harvesting history of all harvesters. See Harvest History for more details.
The harvesting engine is waiting for the next scheduled run time of the harvester.
Running - The harvesting engine is currently running, fetching metadata. When the process is finished, the result of the harvest will be available as an icon in the Errors column.

Table: Possible status icons

Icon / Description
- The harvesting was OK, no errors were found. In this case, a tool tip will show some harvesting results (like the number of harvested metadata etc...).
- The harvesting was aborted due to an unexpected condition. In this case, a tool tip will show some information about the error.

Table: Possible error icons
an option to force validation: if you want to harvest these metadata anyway, simply turn/leave it off.
- Thumbnails/Thumbnails failed - Number of metadata thumbnail images added/that could not be added due to some failure.
- Metadata URL attribute used - Number of layers/featuretypes/coverages that had a metadata URL that could be used to link to a metadata record (OGC Service Harvester only).
- Services added - Number of ISO19119 service records created and added to the catalogue (for THREDDS catalog harvesting only).
- Collections added - Number of collection dataset records added to the catalogue (for THREDDS catalog harvesting only).
- Atomics added - Number of atomic dataset records added to the catalogue (for THREDDS catalog harvesting only).
- Subtemplates added - Number of subtemplates (= fragment visible in the catalog) added to the metadata catalog.
- Subtemplates removed - Number of subtemplates (= fragment visible in the catalog) removed from the metadata catalog.
- Fragments w/Unknown schema - Number of fragments which have an unknown metadata schema.
- Fragments returned - Number of fragments returned by the harvester.
- Fragments matched - Number of fragments that had identifiers matching those in the template used by the harvester.
- Existing datasets - Number of metadata records for datasets that existed when the THREDDS harvester was run.
- Records built - Number of records built by the harvester from the template and fragments.
- Could not insert - Number of records that the harvester could not insert into the catalog (usually because the record was already present eg. in the Z3950 harvester this can occur if the same record is harvested from different servers).
[Table: Result vs harvesting type. Rows: Total, Added, Removed, Updated, Unchanged, Unknown schema, Unretrievable, Bad Format, Does Not Validate, Thumbnails / Thumbnails failed, Metadata URL attribute used, Services Added, Collections Added, Atomics Added, Subtemplates Added, Subtemplates removed, Fragments w/Unknown Schema, Fragments Returned, Fragments Matched, Existing datasets, Records Built. Column shown: OGC Service.]
Figure 4.13: Adding a new harvester

You can choose the type of harvest you intend to perform and press Add to begin the process of adding the harvester. The supported harvesters and details of what to do next are in the following sections:

GeoNetwork Harvesting

This is the standard and most powerful harvesting protocol used in GeoNetwork. It is able to log in to the remote site, perform a standard search using the common query fields and import all matching metadata. Furthermore, the protocol will try to keep both remote privileges and categories of the harvested metadata if they exist locally.
Search criteria - In this section you can specify search parameters to select metadata records for harvesting. The parameters are the same as or similar to those found on the GeoNetwork search form.
  source - A GeoNetwork site can contain both its own metadata and metadata harvested from other sources. Use the Retrieve sources button to retrieve the sources from the remote site. You can then choose a source name to constrain the search to a particular source. For example, you could constrain the search to the source representing metadata that has not been harvested from other sites. Leaving source blank will retrieve all metadata from the remote site.
  You can add multiple search criteria through the Add button: multiple searches will be performed and the results merged. Search criteria sets can be removed using the small cross button at the top left of the criteria set. If no search criteria are added, a global unconstrained search will be performed.
Options - Scheduling options.
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Harvested Content
  Validate - If checked, harvested metadata records will be validated against the relevant metadata schema. Invalid records will be rejected.
Privileges - Use this section to handle remote group privileges. Press the Retrieve groups button and the list of groups on the remote site will be returned. You can then assign a copy policy to each group. The All group has a different policy to the other groups:
  1. Copy: Privileges are copied.
  2. Copy to Intranet: Privileges are copied but to the Intranet group. This allows public metadata to be protected.
  3. Don't copy: Privileges are not copied and harvested metadata will not be publicly visible.
For all other groups the policies are these:
  1. Copy: Privileges are copied only if there is a local group with the same (not localised) name as the remote group.
  2. Create and copy: Privileges are copied. If there is no local group with the same name as the remote group then it is created.
  3. Don't copy: Privileges are not copied.
Note: The Intranet group is not considered because it does not make sense to copy its privileges.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories (except where the Set categories if exist locally option described above causes the metadata to be assigned to a matching local category).
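The copy policies for remote group privileges can be read as a small decision table. The following Python sketch is only an illustration of those rules; the names Policy and resolve_target_group are hypothetical and are not part of GeoNetwork.

```python
from enum import Enum


class Policy(Enum):
    # Hypothetical names for the policies listed in the Privileges section.
    COPY = "copy"
    COPY_TO_INTRANET = "copy to intranet"
    CREATE_AND_COPY = "create and copy"
    DONT_COPY = "don't copy"


def resolve_target_group(remote_group, policy, local_groups):
    """Return the local group that receives the privileges, or None.

    local_groups is a set of local group names; it is updated in place
    when the Create and copy policy has to create a missing group.
    """
    if policy is Policy.DONT_COPY:
        return None
    if remote_group == "all":
        # The All group has its own set of policies.
        if policy is Policy.COPY:
            return "all"
        if policy is Policy.COPY_TO_INTRANET:
            return "intranet"
        raise ValueError("policy not available for the All group")
    if policy is Policy.COPY:
        # Copied only when a local group with the same name exists.
        return remote_group if remote_group in local_groups else None
    if policy is Policy.CREATE_AND_COPY:
        local_groups.add(remote_group)
        return remote_group
    raise ValueError("policy not available for this group")
```

For example, resolve_target_group("all", Policy.COPY_TO_INTRANET, set()) yields "intranet", which is how public remote metadata ends up protected locally.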
Notes
* This harvester will not work if the remote site has a version prior to GeoNetwork 2.1, eg. GeoNetwork 2.0.2.
* During harvesting, site icons are harvested and local copies are updated. Icons are propagated to new sites as soon as those sites harvest from this one.
* The metadata record uuid is taken from the info.xml file of the MEF bundle.
* In order to be successfully harvested, metadata records retrieved from the remote site must match a metadata schema in the local GeoNetwork instance.

WEBDAV Harvesting

This harvesting type uses the WebDAV (Distributed Authoring and Versioning) protocol or the WAF (web accessible folder) protocol to harvest metadata from a web server. It can be useful to users that want to publish their metadata through a web server that offers a DAV interface. The protocol permits retrieval of the contents of a web page (a list of files) along with the change date.
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.
Notes
* The same metadata could be harvested several times by different instances of the WebDAV harvester. This is not good practice because copies of the same metadata record will have a different UUID.
* In order to be successfully harvested, metadata records retrieved from the remote site must match a metadata schema in the local GeoNetwork instance.

CSW Harvesting

This harvester will connect to a remote CSW server and retrieve metadata records that match the query parameters specified.
Figure 4.15: Adding a Catalogue Services for the Web harvesting node
Options - Specific harvesting options for this harvester.
  Validate - If checked, the metadata will be validated after retrieval. If the validation does not pass, the metadata will be skipped.
Privileges - Assign privileges to harvested metadata.
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.
Notes
* In order to be successfully harvested, metadata records retrieved from the remote site must match a metadata schema in the local GeoNetwork instance.

GeoPortal REST Harvesting

This harvester will connect to a remote GeoPortal version 9.3.x server and retrieve metadata records that match the query parameters specified using the GeoPortal REST API.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Harvested Content - Options that are applied to harvested content.
  Apply this XSLT to harvested records - Choose an XSLT here that will convert harvested records to a different format. See notes section below for typical usage.
  Validate - If checked, the metadata will be validated after retrieval. If the validation does not pass, the metadata will be skipped.
Privileges - Assign privileges to harvested metadata.
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.
Notes
* In order to be successfully harvested, metadata records retrieved from the remote site must match a metadata schema in the local GeoNetwork instance.
* This harvester uses two REST services from the GeoPortal API:
    rest/find/document with searchText parameter to return an RSS listing of metadata records that meet the search criteria (maximum 100000)
    rest/document with id parameter from each result returned in the RSS listing
* This harvester has been tested with GeoPortal 9.3.x. It should be used for that version of GeoPortal in preference to the CSW harvester.
* Typically ISO19115 metadata produced by the Geoportal software will not have a gmd prefix for the namespace http://www.isotc211.org/2005/gmd. GeoNetwork XSLTs will not have any trouble understanding this metadata but will not be able to map titles and codelists in the viewer/editor. To fix this problem, please select the Add-gmd-prefix XSLT for the Apply this XSLT to harvested records option in the Harvested Content set of options described earlier.

Local File System Harvesting

This harvester will harvest metadata as XML files from a filesystem available on the machine running the GeoNetwork server.
Name - This is a short description of the filesystem harvester. It will be shown in the harvesting main page as the name for this instance of the Local Filesystem harvester.
Directory - The path name of the directory containing the metadata (as XML files) to be harvested.
Recurse - If checked and the Directory path contains other directories, then the harvester will traverse the entire file system tree in that directory and add all metadata files found.
Keep local if deleted at source - If checked then metadata records that have already been harvested will be kept even if they have been deleted from the Directory specified.
Icon - An icon to assign to harvested metadata. The icon will be used when showing harvested metadata records in the search results.
Options - Scheduling options.
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Harvested Content - Options that are applied to harvested content.
  Apply this XSLT to harvested records - Choose an XSLT here that will convert harvested records to a different format.
  Validate - If checked, the metadata will be validated after retrieval. If the validation does not pass, the metadata will be skipped.
Privileges - Assign privileges to harvested metadata.
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.
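To make the Directory, Recurse and Keep local if deleted at source options concrete, here is a minimal Python sketch of the file selection they describe. It is an illustration only: the function names are hypothetical, and the assumption that metadata files are recognised by a .xml extension is ours, not a statement about how GeoNetwork identifies them.

```python
from pathlib import Path


def find_metadata_files(directory, recurse=False):
    """List candidate metadata files (assumed here to end in .xml)."""
    root = Path(directory)
    # "**/*.xml" also matches files directly inside the root directory.
    pattern = "**/*.xml" if recurse else "*.xml"
    return sorted(p for p in root.glob(pattern) if p.is_file())


def records_to_remove(previously_harvested, found_now, keep_local_if_deleted):
    """Return the previously harvested files whose records would be dropped."""
    if keep_local_if_deleted:
        # Records are kept even if their source file has disappeared.
        return set()
    return set(previously_harvested) - set(found_now)
```

With Recurse unchecked only the top-level files are returned; with it checked, files in every subdirectory are included as well.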
Notes
* In order to be successfully harvested, metadata records retrieved from the file system must match a metadata schema in the local GeoNetwork instance.

GeoNetwork 2.0 Harvester

GeoNetwork 2.1 introduced a new powerful harvesting engine which is not compatible with GeoNetwork version 2.0 based catalogues. Old 2.0 servers can still harvest from 2.1 servers but harvesting metadata from a v2.0 server requires this harvesting type. Because GeoNetwork 2.0 was released more than 5 years ago, this harvesting type is deprecated.
OAIPMH Harvesting

This is a harvesting protocol that is widely used among libraries. GeoNetwork implements version 2.0 of the protocol.
Figure 4.18: Adding an OAI-PMH harvester

Configuration options:
Site - Options describing the remote site.
  Name - This is a short description of the remote site. It will be shown in the harvesting main page as the name for this instance of the OAIPMH harvester.
  URL - The URL of the OAI-PMH server from which metadata will be harvested.
  Icon - An icon to assign to harvested metadata. The icon will be used when showing search results.
  Use account - Account credentials for basic HTTP authentication on the OAIPMH server.
Search criteria - This allows you to select metadata records for harvest based on certain criteria:
  From - You can provide a start date here. Any metadata whose last change date is equal to or greater than this date will be harvested. To add or edit a value for this field you need to use the icon alongside the text box. This field is optional, so if you don't provide a start date the constraint is dropped. Use the icon to clear the field.
  Until - Functions in the same way as the From parameter but adds an end constraint to the last change date search. Any metadata whose last change date is less than or equal to this date will be harvested.
  Set - An OAI-PMH server classifies metadata into sets (like categories in GeoNetwork). You can request all metadata records that belong to a set (and any of its subsets) by specifying the name of that set here.
  Prefix - Prefix means metadata format. The oai_dc prefix must be supported by all OAIPMH compliant servers.
  You can use the Add button to add more than one Search Criteria set. Search Criteria sets can be removed by clicking on the small cross at the top left of the set.
Note: the OAI provider sets drop down next to the Set text box and the OAI provider prefixes drop down next to the Prefix textbox are initially blank. After specifying the connection URL, you can press the Retrieve Info button, which will connect to the remote OAI-PMH server, retrieve all supported sets and prefixes and fill the drop downs with these values.
Selecting a value from either of these drop downs will fill the appropriate text box with the selected value.
Options - Scheduling options.
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Privileges
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.
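The From, Until, Set and Prefix search criteria correspond to the from, until, set and metadataPrefix arguments of an OAI-PMH version 2.0 ListRecords request. The Python sketch below shows how such a request URL could be assembled; the function name and the example server URL are hypothetical, and the sketch illustrates the protocol rather than GeoNetwork's own code.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, prefix="oai_dc", from_date=None,
                           until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords URL from one Search Criteria set."""
    params = {"verb": "ListRecords", "metadataPrefix": prefix}
    # Optional constraints are simply left out when not provided.
    if from_date:
        params["from"] = from_date
    if until_date:
        params["until"] = until_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Hypothetical server URL, used for illustration only.
print(build_list_records_url("http://example.org/oai",
                             from_date="2012-01-01", set_spec="maps"))
```

Leaving From and Until empty drops both date constraints, matching the behaviour described for the optional From field.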
Notes
* If you request the oai_dc output format, GeoNetwork will convert it to Dublin Core format.
* When you edit a previously created OAIPMH harvester instance, both the set and prefix drop down lists will be empty. You have to press the Retrieve Info button again to connect to the remote server and retrieve set and prefix information.
* The id of the remote server must be a UUID. If not, metadata can be harvested but during hierarchical propagation id clashes could corrupt harvested metadata.
* In order to be successfully harvested, metadata records retrieved from the remote site must match a metadata schema in the local GeoNetwork instance.

Harvesting OGC Services

An OGC service implements a GetCapabilities operation that GeoNetwork, acting as a client, can use to produce metadata for the service (ISO19119) and resources delivered by the service (ISO19115/19139). This harvester supports the following OGC services and versions:
* Web Map Service (WMS) - versions 1.0.0, 1.1.1, 1.3.0
* Web Feature Service (WFS) - versions 1.0.0 and 1.1.0
* Web Coverage Service (WCS) - version 1.0.0
* Web Processing Service (WPS) - versions 0.4.0 and 1.0.0
* Catalogue Services for the Web (CSW) - version 2.0.2
* Sensor Observation Service (SOS) - version 1.0.0
* Create metadata for layer elements using GetCapabilities information: Checking this option means that the harvester will loop over datasets served by the service as described in the GetCapabilities document.
* Create metadata for layer elements using MetadataURL attributes: Checking this option means that the harvester will generate metadata from an XML document referenced in the MetadataUrl attribute of the dataset in the GetCapabilities document. If the document referred to by this attribute is not valid (eg. unknown schema, bad XML format), the GetCapabilities document is used as per the previous option.
* Create thumbnails for WMS layers: If harvesting from an OGC WMS, then checking this option means that thumbnails will be created during harvesting.
Target schema - The metadata schema of the dataset metadata records that will be created by this harvester.
Icon - The default icon displayed as attribution logo for metadata created by this harvester.
Options - Scheduling options.
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Privileges
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Category for service - Metadata for the harvested service is assigned to the category selected in this option (eg. interactive resources).
Category for datasets - Metadata for the harvested datasets is assigned to the category selected in this option (eg. datasets).
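The two Create metadata for layer elements options differ in where the dataset metadata comes from. The Python sketch below walks a small WMS 1.3.0 capabilities document and reports, for each named layer, whether a MetadataURL link is available or whether the GetCapabilities information would be used. The sample document, the example.org link and the function name are made up for illustration, and the fallback for an invalid linked document is not modelled.

```python
import xml.etree.ElementTree as ET

WMS_NS = "http://www.opengis.net/wms"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Invented sample document, trimmed to the elements used below.
SAMPLE_CAPABILITIES = """<WMS_Capabilities xmlns="http://www.opengis.net/wms"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.3.0">
  <Capability>
    <Layer>
      <Title>Root</Title>
      <Layer>
        <Name>rivers</Name>
        <MetadataURL type="ISO19115:2003">
          <Format>text/xml</Format>
          <OnlineResource xlink:href="http://example.org/md/rivers.xml"/>
        </MetadataURL>
      </Layer>
      <Layer>
        <Name>roads</Name>
      </Layer>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def plan_layer_metadata(capabilities_xml, use_metadata_url):
    """Map each named layer to its metadata source."""
    root = ET.fromstring(capabilities_xml)
    plan = {}
    for layer in root.iter(f"{{{WMS_NS}}}Layer"):
        name = layer.findtext(f"{{{WMS_NS}}}Name")
        if name is None:
            continue  # container layer without a Name of its own
        resource = layer.find(
            f"{{{WMS_NS}}}MetadataURL/{{{WMS_NS}}}OnlineResource")
        href = resource.get(f"{{{XLINK_NS}}}href") if resource is not None else None
        # Use the linked document when requested and present, else fall back.
        plan[name] = href if (use_metadata_url and href) else "GetCapabilities"
    return plan
```

In this sample only the rivers layer carries a MetadataURL, so with the option enabled it maps to the linked document while roads falls back to the GetCapabilities information.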
Notes
* Every time the harvester runs, it will remove previously harvested records and create new records. GeoNetwork will generate the uuid for all metadata (both service and datasets). The exception to this rule is dataset metadata created using the MetadataUrl tag in the GetCapabilities document: in that case, the uuid of the remote XML document is used instead.
* Thumbnails can only be generated when harvesting an OGC Web Map Service (WMS). The WMS should support the WGS84 projection.
* The chosen Target schema must have the supporting XSLTs which are used by the harvester to convert the GetCapabilities statement to metadata records from that schema. If in doubt, use iso19139.
Harvesting an ARCSDE Node

This is a harvesting protocol for metadata stored in an ArcSDE installation.
Configuration options:
Site
  Name - This is a short description of the node. It will be shown in the harvesting main page.
  Server - ArcSDE server IP address or name
  Port - ArcSDE service port (typically 5151)
  Username - Username to connect to ArcSDE server
  Password - Password of the ArcSDE user
  Database name - ArcSDE instance name (typically esri_sde)
Options
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Harvested Content
  Validate - If checked, harvested metadata records will be validated against the relevant metadata schema. Invalid records will be rejected.
Privileges
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Categories - Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.

THREDDS Harvesting

THREDDS catalogs describe inventories of datasets. They are organised in a hierarchical manner, listing descriptive information and access methods for each dataset. They typically catalog netCDF datasets but are not restricted to these types of files. This harvesting type crawls through a THREDDS catalog harvesting metadata for datasets and services described in it or in referenced netCDF datasets. This harvesting type can extract fragments of metadata from the THREDDS catalog, allowing the user to link or copy these fragments into a template to create metadata records.
Name - This is a short description of the THREDDS catalog. It will be shown in the harvesting main page as the name of this THREDDS harvester instance.
Catalog URL - The remote URL of the THREDDS Catalog from which metadata will be harvested. This must be the xml version of the catalog (i.e. ending with .xml). The harvester will crawl through all datasets and services defined in this catalog creating metadata for them as specified by the options described further below.
Metadata language - Use this option to specify the language of the metadata to be harvested.
ISO topic category - Use this option to specify the ISO topic category of service metadata.
Create ISO19119 metadata for all services in catalog - Select this option to generate iso19119 metadata for services defined in the THREDDS catalog (eg. OpenDAP, OGC WCS, ftp) and for the THREDDS catalog itself.
Create metadata for Collection datasets - Select this option to generate metadata for each collection dataset (THREDDS dataset containing other datasets). Creation of metadata can be customised using options that are displayed when this option is selected as described further below.
Create metadata for Atomic datasets - Select this option to generate metadata for each atomic dataset (THREDDS dataset not containing other datasets, for example one cataloguing a netCDF dataset). Creation of metadata can be customised using options that are displayed when this option is selected as described further below.
* Ignore harvesting attribute - Select this option to harvest metadata for selected datasets regardless of the harvest attribute for the dataset in the THREDDS catalog. If this option is not selected, metadata will only be created for datasets that have a harvest attribute set to true.
* Extract DIF metadata elements and create ISO metadata - Select this option to generate ISO metadata for datasets in the THREDDS catalog that have DIF metadata elements. When this option is selected a list of schemas is shown that have a DIFToISO.xsl stylesheet available (see for example GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/DIFToISO.x). Metadata is generated by reading the DIF metadata items in the THREDDS catalog into a DIF format metadata record and then converting that DIF record to ISO using the DIFToISO stylesheet.
* Extract Unidata dataset discovery metadata using fragments - Select this option when the metadata in your THREDDS or netCDF/ncml datasets follows Unidata dataset discovery conventions (see http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html). You will need to write your own stylesheets to extract this metadata as fragments and define a template to combine with the fragments. When this option is selected the following additional options will be shown:
    Select schema for output metadata records - Choose the ISO metadata schema or profile for the harvested metadata records. Note: only the schemas that have THREDDS fragment stylesheets will be displayed in the list (see the next option for the location of these stylesheets).
    Stylesheet to create metadata fragments - Select a stylesheet to use to convert metadata for the dataset (THREDDS metadata and netCDF ncml where applicable) into metadata fragments. These stylesheets can be found in the directory convert/ThreddsToFragments in the schema directory, eg. for iso19139 this would be GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/ThreddsT
    Create subtemplates for fragments and XLink them into template - Select this option to create a subtemplate (= metadata fragment stored in GeoNetwork catalog) for each metadata fragment generated.
    Template to combine with fragments - Select a template that will be filled in with the metadata fragments generated for each dataset. The generated metadata fragments are used to replace referenced elements in the templates with an xlink to a subtemplate if the Create subtemplates option is checked. If Create subtemplates is not checked, then the fragments are simply copied into the template metadata record.
    For Atomic Datasets, one additional option is provided: Harvest new or modified datasets only. If this option is checked only datasets that have been modified or didn't exist when the harvester was last run will be harvested.
Create Thumbnails - Select this option to create thumbnails for WMS layers in referenced WMS services.
Icon - An icon to assign to harvested metadata. The icon will be used when showing search results.
Options - Scheduling options.
  Run at - The time when the harvester will run.
  Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
  One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.
Privileges
  Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
  Remove - To remove a row click on the Remove button on the right of the row.
Category for Service - Select the category to assign to the ISO19119 service records for the THREDDS services.
Category for Datasets - Select the category to assign the generated metadata records (and any subtemplates) to.

At the bottom of the page there are the following buttons:
Back - Go back to the main harvesting page. The harvesting definition is not added.
Save - Saves this harvester definition, creating a new harvesting instance. After the save operation has completed, the main harvesting page will be displayed.
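The interaction between the Collection datasets, Atomic datasets and Ignore harvesting attribute options can be illustrated with a short Python sketch. It treats a dataset element that contains other dataset elements as a collection and any other dataset as atomic, as described above. The sample catalog and the function name are made up, the namespace shown is assumed to be the THREDDS InvCatalog 1.0 namespace, and datasets reached through catalog references are not followed.

```python
import xml.etree.ElementTree as ET

# Assumed THREDDS catalog namespace (InvCatalog 1.0).
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"

# Invented sample catalog for illustration.
SAMPLE_CATALOG = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="demo">
  <dataset name="collection" harvest="true">
    <dataset name="file-1" urlPath="a.nc"/>
    <dataset name="file-2" urlPath="b.nc" harvest="true"/>
  </dataset>
</catalog>"""


def select_datasets(catalog_xml, collections, atomics, ignore_harvest_attribute):
    """Return (name, kind) pairs for datasets that would get metadata."""
    tag = f"{{{THREDDS_NS}}}dataset"
    root = ET.fromstring(catalog_xml)
    selected = []
    for ds in root.iter(tag):
        # A dataset containing other datasets is a collection.
        is_collection = ds.find(tag) is not None
        wanted = collections if is_collection else atomics
        harvestable = ignore_harvest_attribute or ds.get("harvest") == "true"
        if wanted and harvestable:
            selected.append((ds.get("name"),
                             "collection" if is_collection else "atomic"))
    return selected
```

With only collections requested and the harvest attribute respected, the sample yields the single collection dataset; checking Ignore harvesting attribute and requesting atomic datasets as well adds both file-level datasets.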
More about harvesting THREDDS DIF metadata elements with the THREDDS Harvester
THREDDS catalogs can include elements from the DIF metadata standard. The Unidata netcdf-java library provides a DIFWriter process that can create a DIF metadata record from these elements. GeoNetwork has a DIFToISO stylesheet to transform these DIF records to ISO. An example of a THREDDS Catalog with DIF-compliant metadata elements is shown below.
More about harvesting Unidata dataset discovery metadata with the THREDDS Harvester
The options described above for the Extract Unidata dataset discovery metadata using fragments (see http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html for more details of these conventions) invoke the following process for each collection dataset or atomic dataset in the THREDDS catalog:
1. The harvester bundles up the catalog URI, a generated uuid, the THREDDS metadata for the dataset (generated using the catalog subset web service) and the ncml for netCDF datasets into a single xml document. An example is shown below.
2. This document is then transformed using the specified stylesheet (see Stylesheet option above) to obtain a metadata fragments document.
3. The metadata fragment harvester is then called to create subtemplates and/or metadata for each dataset as requested.
Figure 4.23: An example THREDDS dataset document created by the THREDDS fragment harvester
Example
DIF Metadata elements on datasets in THREDDS catalogs are not as widely used as metadata elements that follow the Unidata dataset discovery metadata conventions. This example will show how to harvest metadata elements that follow the Unidata data discovery conventions (see http://www.unidata.ucar.edu/software/netcdf-java/formats/DataDiscoveryAttConvention.html).

Two reference stylesheets are provided as examples of how to harvest metadata fragments from a THREDDS catalog. One of these stylesheets, thredds-metadata.xsl, is for generating iso19139 metadata fragments from THREDDS metadata following Unidata dataset discovery conventions. The other stylesheet, netcdf-attributes.xsl, is for generating iso19139 fragments from netCDF datasets following Unidata dataset discovery conventions. These stylesheets are designed for use with the HARVESTING TEMPLATE THREDDS DATA DISCOVERY template and can be found in the schema convert directory, eg. for ISO19139 this is GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/ThreddsToFragments.
A sample template HARVESTING TEMPLATE THREDDS DATA DISCOVERY has been provided for use with the stylesheets described above for the iso19139 metadata schema. This template is in the schema templates directory, eg. for ISO19139 this is GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/templates/thredds-harvester-u. Before attempting to run this example, you should make sure that this template and others from the iso19139 schema have been loaded into GeoNetwork using the Add templates function in the Administration menu.

We'll now give an example of how to set up a harvester and harvest THREDDS metadata from one of the public unidata motherlode catalogs at http://motherlode.ucar.edu:8080/thredds/catalog/satellite/3.9/WEST-CONUS_4km/catalog.xml. If you were to paste this URL into your browser, you would see the XML representation of this THREDDS catalog. This is the document that is read and converted into metadata by the THREDDS harvester. A snippet of this catalog is shown below.
In GeoNetwork, go into the Administration menu and choose Harvesting Management as described earlier. Add a THREDDS Catalog harvester. Fill out the harvesting management form as shown in the form below. The first thing to notice is that the Service URL should be http://motherlode.ucar.edu:8080/thredds/catalog/satellite/3.9/WEST-CONUS_4km/catalog.xml. Make sure that you use the xml version of the catalog. If you use an html version, you will not be able to harvest any metadata.

Now, because this unidata motherlode THREDDS catalog has lots of file level datasets (many thousands in fact), we will only harvest collection metadata. To do this you should check Create metadata for Collection Datasets and ignore the atomic datasets.

Next, because the metadata in this catalog follows Unidata data discovery conventions, we will choose Extract Unidata dataset discovery metadata using fragments.

Next, we will check Ignore harvesting attribute. We do this because datasets in the THREDDS catalog can have an attribute indicating whether the dataset should be harvested or not. Since none of the datasets in this catalog have the harvesting attribute, we will ignore it. If we didn't check this box, all the datasets would be skipped.

Next we will select the metadata schema that the harvested metadata will be written out in. We will choose iso19139 here because this is the schema for which we have stylesheets that will convert THREDDS metadata to fragments of iso19139 metadata and a template into which these fragments of metadata can be copied or linked. After choosing iso19139, choices will appear that show these stylesheets and templates. The first choice is the stylesheet that will create iso19139 metadata fragments.
Because we are interested in the thredds metadata elements in the THREDDS catalog, we will choose the (iso19139) thredds-metadata (located in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/ThreddsToFragments) to convert these elements to iso19139 metadata fragments. For the purposes of this demonstration, we will not check Create subtemplates for fragments (xlinks...). This means that the fragments of metadata created by the stylesheet will be copied directly into the metadata template. They will not be able to be reused (eg. shared between different metadata records). See the earlier section on metadata fragments if you are not sure what this means.
Finally, we will choose HARVESTING TEMPLATE - THREDDS - UNIDATA DISCOVERY as the template metadata record that will be combined with the metadata fragments to create the output records. This template will have been loaded into GeoNetwork from GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/templates/thredds-harvester-u through the Add Templates function in the Administration interface. This template could be filled out with metadata common to all records before the harvester is run. The process by which the template is used to create metadata records is as follows:

1. For each dataset in the THREDDS catalog, the template is copied to create a new iso19139 metadata record.
2. Each fragment of metadata harvested from a THREDDS dataset is copied into the new iso19139 metadata record by matching an identifier in the template with an identifier in the fragment (this match is created by the developer of the template and the stylesheet).
3. The new record is then inserted into the GeoNetwork metadata catalog and the record content is indexed in Lucene for searching.

You can then fill out the remainder of the form according to how often you want the harvested metadata to be updated, what categories will be assigned to the created metadata records, which icon will be displayed with the metadata records in the search results and what the privileges on the created metadata records will be.

Figure 4.25: THREDDS harvester form for motherlode THREDDS catalog example

Save the harvester screen. Then from the harvesting management screen, check the box beside the newly created harvester, Activate it and then Run it. After a few moments (depending on your internet connection and machine), click on Refresh. If your harvest has been successful you should see a results panel appear something like the one shown in the following screenshot.
Figure 4.26: Results of harvesting collection records from a motherlode THREDDS catalog

Notice that there were 48 metadata records created for the 48 collection level datasets in this THREDDS catalog. Each metadata record was formed by duplicating the metadata template and then copying 13 fragments of metadata into it, hence the total of 48 x 13 = 624 fragments harvested. An example of one of the collection level metadata records created by the harvester in this example and rendered by GeoNetwork is shown below.

Figure 4.27: ISO Metadata record harvested from a motherlode THREDDS catalog

WFS GetFeature Harvesting

Metadata can be present in the tables of a relational database, and relational databases are commonly used by many organisations. Putting an OGC Web Feature Service (WFS) over a relational database allows metadata to be extracted via standard query mechanisms. This harvesting type allows the user to specify a GetFeature query and map information from the features to fragments of metadata that can be linked or copied into a template to create metadata records.
Stylesheet to create fragments - User-supplied stylesheet that transforms the GetFeature response to a metadata fragments document (see below for the format of that document). Stylesheets exist in the WFSToFragments directory which is in the convert directory of the selected output schema. eg. for the iso19139 schema, this directory is GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/WFSToFragments
Save large response to disk - Check this box if you expect the WFS GetFeature response to be large (eg. greater than 10MB). If checked, the GetFeature response will be saved to disk in a temporary file. Each feature will then be extracted from the temporary file and used to create the fragments and metadata records. If not checked, the response will be held in RAM.
Create subtemplates - Check this box if you want the harvested metadata fragments to be saved as subtemplates in the metadata catalog and xlink'd into the metadata template (see next option). If not checked, the fragments will be copied into the metadata template.
Template to use to build metadata using fragments - Choose the metadata template that will be combined with the harvested metadata fragments to create metadata records. This is a standard GeoNetwork metadata template record.
Category for records built with linked fragments - Choose the category that will be assigned to the metadata records built by the harvester.

Options
Run at - The time when the harvester will run.
Will run again every - Choose an interval from the drop down list and then select the days for which this scheduling will take place.
One run only - Checking this box will cause the harvester to run only when manually started using the Run button on the Harvesting Management page.

Privileges
Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
Remove - To remove a row click on the Remove button on the right of the row.

Category for subtemplates - When fragments are saved to GeoNetwork as subtemplates they will be assigned to the category selected here.
Finally, two examples show how to harvest metadata from the features of an OGC WFS using stylesheets and templates supplied with GeoNetwork.
Choose an output schema - we'll choose iso19139 as this schema has the example stylesheets and templates we need for this example. Notice that after this option is chosen the following options become visible and we'll take the following actions:
Choose the supplied geoserver_boundary_fragments stylesheet to extract fragments from the GetFeature response in the Stylesheet to use to create fragments pull-down list. This stylesheet can be found in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/WFSToFragments
Select the supplied Geoserver WFS Fragments Country Boundaries Test Template template from the Template to use to build metadata using fragments pull-down list. This template can be found in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/templates/geoserver_fr

Choose a category for the records created by the harvester, check the One run only box, add some privileges (simplest is to let All users have View rights). At this stage your harvester entry form should look like the following screenshot.

Save the harvester entry form. You will be returned to the harvester operations menu where you can Activate the harvester and then Run it. After the harvester has been run you should see a results screen that looks something like the following screenshot.

WFS GetFeature Harvesting - Results for geoserver boundaries example

The results page shows that there were 1506 fragments of metadata harvested from the WFS GetFeature response. They were saved to the GeoNetwork database as subtemplates and linked into the metadata template to form 251 new metadata records.
Choose an output schema - we'll choose iso19139 as this schema has the example stylesheets and templates we need for this example. Notice that after this option is chosen the following options become visible and we'll take the following actions:
Choose the supplied deegree2_philosopher_fragments stylesheet to extract fragments from the GetFeature response in the Stylesheet to use to create fragments pull-down list. This stylesheet can be found in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/convert/WFSToFragments
Select the supplied Deegree 22 WFS Fragments Philosopher Database Test Template template from the Template to use to build metadata using fragments pull-down list. This template can be found in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/templates/deegree_frag

Choose a category for the records created by the harvester, check the One run only box, add some privileges (simplest is to let All users have View rights). At this stage your harvester entry form should look like the following screenshot.

Save the harvester entry form. You will be returned to the harvester operations menu where you can Activate the harvester and then Run it. After the harvester has been run you should see a results screen that looks something like the following screenshot.

WFS GetFeature Harvesting - Results for deegree philosopher database example

The results page shows that there were 42 fragments of metadata harvested from the WFS GetFeature response. They were saved to the GeoNetwork database as subtemplates and linked into the metadata template to form 7 new metadata records.

Z3950 Harvesting

Z3950 is a remote search and harvesting protocol that is commonly used to permit search and harvest of metadata. Although the protocol is often used for library catalogs, significant geospatial metadata catalogs can also be searched using Z3950 (eg. the metadata collections of the Australian Government agencies that participate in the Australian Spatial Data Directory - ASDD). This harvester allows the user to specify a Z3950 query and retrieve metadata records from one or more Z3950 servers.
Harvested Content
Apply this XSLT to harvested records - Choose an XSLT here that will convert harvested records to a different format.
Validate - If checked, records that do not/cannot be validated will be rejected.

Privileges
Groups - Groups can be selected from the scrolling list. When the Add button is pushed, a row of privileges will be created below the scrolling list for each group. Privileges can then be checked/unchecked for each group as required.
Remove - To remove a row click on the Remove button on the right of the row.

Categories
Select one or more categories from the scrolling list. The harvested metadata will be assigned to the selected categories.

Note: this harvester automatically creates a new Category named after each of the Z3950 servers that return records. Records that are returned by a server are assigned to the category named after that server.
In GeoNetwork the numeric values that can be specified for @attr 1 map to the lucene index field names as follows:

@attr 1=              Lucene index field             ISO19139 element
--------------------  -----------------------------  ----------------------------------------------------------------
1016                  any                            All text from all metadata elements
4                     title, altTitle                gmd:identificationInfo//gmd:citation//gmd:title/gco:CharacterString
62                    abstract                       gmd:identificationInfo//gmd:abstract/gco:CharacterString
1012                  _changeDate                    Not a metadata element (maintained by GeoNetwork)
30                    createDate                     gmd:MD_Metadata/gmd:dateStamp/gco:Date
31                    publicationDate                gmd:identificationInfo//gmd:citation//gmd:date/gmd:CI_DateCode/@codeListValu
2072                  tempExtentBegin                gmd:identificationInfo//gmd:extent//gmd:temporalElement//gml:begin(Position)
2073                  tempExtentEnd                  gmd:identificationInfo//gmd:extent//gmd:temporalElement//gml:end(Position)
2012                  fileId                         gmd:MD_Metadata/gmd:fileIdentifier/*
12                    identifier                     gmd:identificationInfo//gmd:citation//gmd:identifier//gmd:code/*
21,29,2002,3121,3122  keyword                        gmd:identificationInfo//gmd:keyword/*
2060                  northBL,eastBL,southBL,westBL  gmd:identificationInfo//gmd:extent//gmd:EX_GeographicBoundingBox/gmd:westB (etc)

Note that this is not a complete set of the mappings between the Z3950 GEO attribute set and the GeoNetwork lucene index field names for ISO19139. Check out INSTALL_DIR/web/geonetwork/xml/search/z3950Server.xsl and INSTALL_DIR/web/geonetwork/xml/schemas/iso19139/index-fields.xsl for more details, and see annexe A of the GEO attribute set for Z3950 at http://www.fgdc.gov/standards/projects/GeoProfile/annex_a.html.

Common values for the relation attribute (@attr 2):

@attr 2=  Description
--------  ------------------------
1         Less than
2         Less than or equal to
3         Equals
4         Greater than or equal to
5         Greater than
6         Not equal to
7         Overlaps
8         Fully enclosed within
9         Encloses
10        Fully outside of

So a simple query to get all metadata records that have the word the in any field would be:

    @attrset geo @attr 1=1016 the

@attr 1=1016 means that we are doing a search on any field in the metadata record.

A more sophisticated search on a bounding box might be formulated as:

    @attrset geo @attr 1=2060 @attr 4=201 @attr 2=7 "-36.8262 142.6465 -44.3848 151.2598"

@attr 1=2060 means that we are doing a bounding box search
@attr 4=201 means that the query contains coordinate strings
@attr 2=7 means that we are searching for records whose bounding box overlaps the query box specified at the end of the query
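The two queries above follow a regular pattern, so they can be assembled from the attribute tables. The following is a minimal Python sketch, not part of GeoNetwork: the function names and dictionaries are hypothetical, only a few of the use attributes from the table are included, and the coordinate order (north, west, south, east) is an assumption inferred from the example query.

```python
# Hypothetical helpers for composing Z3950 prefix queries like the examples above.
# Only a subset of the @attr 1 and @attr 2 values from the tables is included.
USE_ATTRIBUTES = {"any": 1016, "title": 4, "abstract": 62, "bounding_box": 2060}
RELATIONS = {"equals": 3, "overlaps": 7, "within": 8, "encloses": 9, "outside": 10}


def field_query(field, term):
    """Search for a term in one of the mapped index fields."""
    return f"@attrset geo @attr 1={USE_ATTRIBUTES[field]} {term}"


def bounding_box_query(north, west, south, east, relation="overlaps"):
    """Bounding box search; @attr 4=201 marks the term as a coordinate string.

    The north/west/south/east order is an assumption based on the example.
    """
    coords = f"{north} {west} {south} {east}"
    return (
        f'@attrset geo @attr 1={USE_ATTRIBUTES["bounding_box"]} '
        f'@attr 4=201 @attr 2={RELATIONS[relation]} "{coords}"'
    )


if __name__ == "__main__":
    print(field_query("any", "the"))
    print(bounding_box_query(-36.8262, 142.6465, -44.3848, 151.2598))
```

Keeping the numeric codes in one place makes it harder to mistype an attribute value when a query is pasted into the harvester form.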
Notes
Z3950 servers must be configured for GeoNetwork in INSTALL_DIR/web/geonetwork/WEB-INF/classes/JZKitConfig.xml.tem
Every time the harvester runs, it will remove previously harvested records and create new ones.
Figure 4.33: An example of the Harvesting Management Page with History functions
Figure 4.35: An example of the Harvesting History for a harvester

Once the harvest history has been displayed it is possible to:

expand the detail of any exceptions
sort the history by harvest date (or in the case of the history of all harvesters, by harvester name)
delete any history entry or the entire history
4.8 Formatter
4.8.1 Introduction
The metadata.show service (the metadata viewer) displays a metadata document using the default metadata display stylesheets. However it can be useful to provide alternate stylesheets for displaying the metadata. Consider a central catalog that is used by several partners. Each partner might have special branding and wish to emphasize particular components of the metadata document. To this end the metadata.formatter.html and metadata.formatter.xml services allow an alternate stylesheet to be used for displaying the metadata. The URLs of interest to an end-user are:

/geonetwork/srv/<langCode>/metadata.formatter.html?xsl=<formatterId>&id=<metadataId> - Applies the stylesheet identified by the xsl parameter to the metadata identified by the id parameter and returns the document with the html contentType
/geonetwork/srv/<langCode>/metadata.formatter.xml?xsl=<formatterId>&id=<metadataId> - Applies the stylesheet identified by the xsl parameter to the metadata identified by the id parameter and returns the document with the xml contentType
/geonetwork/srv/<langCode>/metadata.formatter.list - Lists all of the metadata formatter ids

Another use-case for metadata formatters is to embed the metadata in other websites. Often a metadata document contains a very large amount of data and perhaps only a subset is interesting for a particular website, or perhaps the branding/stylesheets need to be customized to match the website.
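When embedding formatted metadata in another website, the formatter URLs described above are usually built from a handful of values. The following Python sketch shows one way to do that; the function name and the base address http://localhost:8080 are assumptions for illustration, while the path and the xsl and id parameters are those listed above.

```python
from urllib.parse import urlencode


def formatter_url(base, lang_code, formatter_id, metadata_id, output="html"):
    """Build a metadata.formatter.html or metadata.formatter.xml URL.

    base is the server address without the /geonetwork path (an assumption
    of this sketch), e.g. http://localhost:8080.
    """
    if output not in ("html", "xml"):
        raise ValueError("output must be 'html' or 'xml'")
    query = urlencode({"xsl": formatter_id, "id": metadata_id})
    return f"{base}/geonetwork/srv/{lang_code}/metadata.formatter.{output}?{query}"


if __name__ == "__main__":
    print(formatter_url("http://localhost:8080", "eng", "default", 32))
    print(formatter_url("http://localhost:8080", "fre", "default", 32, output="xml"))
```

Using urlencode rather than string concatenation keeps formatter ids that contain spaces or other reserved characters correctly escaped.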
4.8.2 Administration
A metadata formatter is a bundle of files that can be uploaded to GeoNetwork as a zip file (or, in the simplest case, as a single xsl file). An administration user interface exists for managing these bundles. The starting page of the ui contains a list of the available bundles and has a field for uploading new bundles. There are three upload options:

Single xsl file - A new bundle will be created for the xsl file and the name of the bundle will be based on the xsl file name
Zip file (flat) - A zip file which contains a view.xsl file and other required resources at the root of the zip file so that when unzipped the files will be unzipped into the current directory
Zip file (folder) - A zip file with a single folder that contains a view.xsl file and the other required resources so that when unzipped a single directory will be created that contains the formatter resources.

If a bundle is uploaded any existing bundles with the same name will be replaced with the new version. See the Bundle format section below for more details about what files can be contained in the format bundle.

When a format in the formatter list is selected the following options become enabled:

Delete - Delete the format bundle from GeoNetwork
Download - Download the bundle. This allows the administrator to download the bundle, edit the contents and upload it again at a later date
Edit - This provides some online edit capabilities for the bundle. At the moment it allows editing of existing text files. Adding new files etc. may be added in the future but is not possible at the moment. When edit is clicked a dialog with a list of all editable files is displayed in a tree, and double clicking on a file will open a new window/tab with a text area containing the contents of the file. The webpage has buttons for saving the file or viewing a metadata record with the style. The view options do NOT save the document before execution, so the file must be saved before pressing the view buttons.
<metadata> - the root of the metadata document
loc
    lang - the text of this tag is the lang code of the localization files that are loaded in this section
    <bundle loc file> - the contents of the bundle's loc/<locale>/*.xml files
strings - the contents of geonetwork/loc/<locale>/xml/strings.xml
schemas
    <schema> - the name of the schema of the labels and codelists strings to come
        labels - the localised labels for the schema as defined in schema_plugins/<schema>/loc/<locale>/labels.xml
        codelists - the localised codelists labels for the schema as defined in schema_plugins/<schema>/loc/<locale>/codelists.xml
        strings - the localised strings for the schema as defined in schema_plugins/<schema>/loc/<locale>/strings.xml

If the view.xsl output needs to access resources in the formatter bundle (like css files or javascript files) the xml document contains a tag: resourceUrl that contains the url for obtaining that resource. An example of an image tag is:
<img src="{/root/resourceURL}/img.png"/>
By default the strings, labels, etc. will be localized based on the language provided in the URL. For example if the url is /geonetwork/srv/eng/metadata.formatter.html?xsl=default&id=32 then the language code that is used to look up the localization will be eng. However, if no localization exists for that language code, it will fall back to the GeoNetwork platform default and then finally just load the first locale it finds. Schemas and GeoNetwork strings all have several different translations, but extra strings, etc. can be added to the formatter bundle under the loc directory. The structure would be:
loc/<langCode>/strings.xml
The name of the file does not have to be strings.xml. All xml files in the loc/<langCode>/ directory will be loaded and added to the xml document. The format of the formatter bundle is as follows:

    config.properties
    view.xsl
    loc/<langCode>/

Only the view.xsl is required. If a single xsl file is uploaded then the rest of the directory structure will be created and some files will be added with default values. So a quick way to get started on a bundle is to upload an empty xsl file and then download it again. The downloaded zip file will have the correct layout and contain any other optional files.
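A bundle with the layout described above can also be assembled by script before it is uploaded. The following Python sketch builds a flat zip (view.xsl at the root of the archive) in memory. The function name, the placeholder stylesheet and the key=value form used for config.properties are assumptions of this sketch; only the file names and directory layout come from the description above.

```python
import io
import zipfile

# Placeholder stylesheet for illustration only; a real view.xsl would
# render elements of the metadata document.
VIEW_XSL = """<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html><body><p>Formatter placeholder</p></body></html>
  </xsl:template>
</xsl:stylesheet>
"""


def build_flat_bundle(view_xsl, config=None, localized_strings=None):
    """Return the bytes of a flat formatter bundle zip.

    config is an optional dict written as key=value lines (assumed format);
    localized_strings maps a language code to the XML text that is stored
    as loc/<langCode>/strings.xml.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bundle:
        bundle.writestr("view.xsl", view_xsl)  # the only required file
        if config:
            lines = "".join(f"{key}={value}\n" for key, value in config.items())
            bundle.writestr("config.properties", lines)
        for lang_code, xml_text in (localized_strings or {}).items():
            bundle.writestr(f"loc/{lang_code}/strings.xml", xml_text)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_flat_bundle(VIEW_XSL, {"fixedLang": "eng"}, {"eng": "<strings/>"})
    with open("my-formatter.zip", "wb") as handle:
        handle.write(data)
```

The resulting file corresponds to the flat zip upload option; wrapping the same entries in a single top-level folder would give the folder variant.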
4.8.4 Config.properties

The config.properties file contains some configuration options used when creating the xml document. Some of the properties include:

fixedLang - sets the language of the strings to the fixed language. This ensures that the formatter will always use the same language for its labels, strings, etc. no matter what language code is in the url.
loadGeonetworkStrings - if true or non-existent then GeoNetwork strings will be added to the xml document before view.xsl is applied. The default is true, so if this parameter is not present then the strings will be loaded.
schemasToLoad - defines which schema localization files should be loaded and added to the xml document before view.xsl is applied
    if a comma separated list then only those schemas will be loaded
    if all then all will be loaded
    if none then no schemas will be loaded
applicableSchemas - declares which schemas the bundle can format
    A comma separated list indicates specifically which schemas the bundle applies to
    If the value is all (or the value is empty) then all schemas are considered supported
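The rules for these properties can be restated as a small Python sketch. This is an illustration of the behaviour described above, not GeoNetwork's implementation: the function names are hypothetical, the file is assumed to consist of simple key=value lines, and the schema names in the example are only sample values.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def load_geonetwork_strings(props):
    """True when the property is 'true' or absent, as described above."""
    return props.get("loadGeonetworkStrings", "true").lower() == "true"


def schemas_to_load(value, available):
    """Interpret a schemasToLoad value: 'all', 'none' or a comma separated list."""
    if value == "all":
        return list(available)
    if value == "none":
        return []
    requested = {name.strip() for name in value.split(",") if name.strip()}
    return [name for name in available if name in requested]


def applies_to(props, schema):
    """An empty or 'all' applicableSchemas value means every schema is supported."""
    value = props.get("applicableSchemas", "").strip()
    if value in ("", "all"):
        return True
    return schema in {name.strip() for name in value.split(",")}


if __name__ == "__main__":
    example = "fixedLang=eng\nschemasToLoad=iso19139,dublin-core\napplicableSchemas=iso19139\n"
    props = parse_properties(example)
    print(schemas_to_load(props["schemasToLoad"], ["iso19139", "fgdc-std", "dublin-core"]))
    print(applies_to(props, "fgdc-std"))
```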
4.9 Processing
GeoNetwork can batch process metadata records by applying an XSLT. The processing XSLTs are schema dependent and must be stored in the process folder of each metadata schema. For example, the process folder for the iso19139 metadata schema can be found in GEONETWORK_DATA_DIR/config/schema_plugins/iso19139/process. Some examples of batch processing are:

Filtering harvested records from another GeoNetwork node (see GeoNetwork Harvesting in the Harvesting section of this manual)
Suggesting content for metadata elements (editor suggestion mechanism)
Applying an XSLT to a selected set of metadata records by using the xml.batch.processing service (this service does not have a user interface; it is intended to be used with an http submitter such as curl).
keywords and remove internal online resources. These options are controlled by the following parameters:

protocol: Protocol of the online resources that must be removed
email: Generic email to use for all email addresses in a particular domain (ie. after @domain.org).
thesaurus: Portion of thesaurus name for which keywords should be removed

It could be used in the GeoNetwork harvesting XSL filter configuration using:
anonymizer?protocol=DB:&[email protected]&thesaurus=MYINTERNALTHESAUR
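A filter string of this shape can be generated from the three parameters rather than typed by hand. The sketch below is a hedged illustration in Python: the function name is hypothetical, and the email address and thesaurus name used in the example are placeholders, not values from this manual.

```python
from urllib.parse import urlencode


def process_filter(process_name, **parameters):
    """Join a process name and its parameters as name?key=value&key=value.

    The characters ':' and '@' are left unescaped so the result has the
    same appearance as the filter string shown above.
    """
    return f"{process_name}?{urlencode(parameters, safe=':@')}"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(process_filter(
        "anonymizer",
        protocol="DB:",
        email="gis@example.org",
        thesaurus="INTERNAL",
    ))
```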
Scale denominator formatter
schema: ISO19139
usage: Suggestion
Format a scale which contains "," "/" or ":" characters.

Add extent from geographic keywords
schema: ISO19139
usage: Suggestion
Compute the extent based on keywords of type place using an installed thesaurus.

WMS synchronizer
schema: ISO19139
usage: Suggestion
If an OGC WMS server is defined in the distribution section, suggest that the user add extent, CRS and graphic overview based on that WMS.

Add INSPIRE conformity
schema: ISO19139
usage: Suggestion
If INSPIRE themes are found, suggest that the user add an INSPIRE conformity section.

Add INSPIRE data quality report
schema: ISO19139
usage: Suggestion
Suggest the creation of a default topological consistency report when the INSPIRE theme is set to Hydrography, Transport Networks or Utility and governmental services.

Keywords comma exploder
schema: ISO19139
usage: Suggestion
Suggest that comma separated keywords be expanded to remove the commas (which is better for indexing and searching).

Keywords mapper
schema: ISO19139
usage: Batch process
Process records and map keywords defined in a mapping table (to be defined manually in the process).

Linked data checker
schema: ISO19139
usage: Suggestion
Check URL status and suggest removing the link on error.

Thumbnail linker
schema: ISO19139
usage: Batch process
This batch process creates a browse graphic or thumbnail for all metadata records. Process parameters:

prefix: thumbnail URL prefix (mandatory)
thumbnail_name: Name of the element to use in the metadata for the thumbnail file name (without extension). This element should be unique in a record. Default is gmd:fileIdentifier (optional).
thumbnail_desc: Thumbnail description (optional).
thumbnail_type: Thumbnail type (optional).
suffix: Thumbnail file extension. Default is .png (optional).

Inserted fragment is:
    <gmd:graphicOverview>
      <gmd:MD_BrowseGraphic>
        <gmd:fileName>
          <gco:CharacterString>$prefix + $thumbnail_name + $suffix</gco:CharacterString>
        </gmd:fileName>
        <gmd:fileDescription>
          <gco:CharacterString>$thumbnail_desc</gco:CharacterString>
        </gmd:fileDescription>
        <gmd:fileType>
          <gco:CharacterString>$thumbnail_type</gco:CharacterString>
        </gmd:fileType>
      </gmd:MD_BrowseGraphic>
    </gmd:graphicOverview>
4.10 Fragments
GeoNetwork supports metadata records that are composed from fragments of metadata. The idea is that the fragments of metadata can be used in more than one metadata record. Here is a typical example of a fragment. This is a responsible party and it could be used in the same metadata record more than once or in more than one metadata record if applicable.
    <gmd:CI_ResponsibleParty xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gco="http://
      <gmd:individualName>
        <gco:CharacterString>John Dath</gco:CharacterString>
      </gmd:individualName>
      <gmd:organisationName>
        <gco:CharacterString>Mulligan &amp; Sons, Funeral Directors</gco:CharacterString>
      </gmd:organisationName>
      <gmd:positionName>
        <gco:CharacterString>Undertaker</gco:CharacterString>
      </gmd:positionName>
      <gmd:role>
        <gmd:CI_RoleCode codeList="./resources/codeList.xml#CI_RoleCode" codeListValue="poin
      </gmd:role>
    </gmd:CI_ResponsibleParty>
Metadata fragments that are saved in the GeoNetwork database are called subtemplates. This is mainly for historical reasons as a subtemplate is like a template metadata record in that it can be used as a template for constructing a new metadata record. Fragments are not handled by GeoNetwork unless xlink support is enabled. See XLink resolver in the System Configuration section of this manual. The reason for this is that XLinks are the main mechanism by which fragments of metadata can be included in metadata records. Fragments may be created by harvesting (see Harvesting Fragments of Metadata to support re-use), used in register thesauri (see Preparing to edit an ISO19135 register record) and linked into metadata records using the GeoNetwork editor in the javascript widget interface. This section of the manual will describe:

how to manage directories of subtemplates
how to extract fragments from an existing set of metadata records and store them as subtemplates
how to manage the fragment cache that GeoNetwork uses to speed up access to fragments that are not in the local catalogue
Because of these differences, a separate interface has been built to search, display and edit subtemplates in directories based upon their root element. The interface is accessed from the GeoNetwork Administration page. To access this page you need to be logged in as a GeoNetwork Administrator. The relevant part of the GeoNetwork Administration page is shown in the following screenshot with the directory interface highlighted.

The subtemplate directory function on the Administration page

Clicking on this link will bring up the directory interface. The directory interface allows you to browse the available subtemplates according to their root element or search for any subtemplate with content containing the search term. A typical directory for a site is shown in the following screenshot. Notice in the screenshot that we have selected the directory of subtemplates with the root element gmd:CI_Contact. The other directories for this particular site are also shown. The edit interface shown in the right hand panel is self-explanatory.
Identify fragments of metadata that they would like to manage as reusable subtemplates in the metadata record. This can be done using an XPath. eg. the XPath /gmd:MD_Metadata/gmd:contact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_ identifies metadata contact information in iso19139 metadata records. An example of such a fragment (taken from one of the GeoNetwork sample records) is shown in the following example:

    <gmd:CI_Contact>
      <gmd:phone>
        <gmd:CI_Telephone>
          <gmd:voice>
            <gco:CharacterString/>
          </gmd:voice>
          <gmd:facsimile>
            <gco:CharacterString/>
          </gmd:facsimile>
        </gmd:CI_Telephone>
      </gmd:phone>
      <gmd:address>
        <gmd:CI_Address>
          <gmd:deliveryPoint>
            <gco:CharacterString>Viale delle Terme di Caracalla</gco:CharacterString>
          </gmd:deliveryPoint>
          <gmd:city>
            <gco:CharacterString>Rome</gco:CharacterString>
          </gmd:city>
          <gmd:administrativeArea>
            <gco:CharacterString/>
          </gmd:administrativeArea>
          <gmd:postalCode>
            <gco:CharacterString>00153</gco:CharacterString>
          </gmd:postalCode>
          <gmd:country>
            <gco:CharacterString>Italy</gco:CharacterString>
          </gmd:country>
          <gmd:electronicMailAddress>
            <gco:CharacterString>[email protected]</gco:CharacterString>
          </gmd:electronicMailAddress>
        </gmd:CI_Address>
      </gmd:address>
    </gmd:CI_Contact>
Identify and record the XPath of a field or fields within the fragment whose text content will be used as the title of the subtemplate. It is important to choose a set of fields that will allow a human to identify the subtemplate when they choose to either reuse the subtemplate in a new record or edit it in the subtemplate directories interface. This XPath should be relative to the root element of the fragment identified in the previous step. So for example, in the fragment above we could choose gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString as the title for the fragments to be created.

On the GeoNetwork home page, search for and then select the records from which the subtemplates will be extracted. Choose Extract subtemplates from the actions on selection menu as shown in the following screenshot:

Fill in the form with the information collected in the previous steps. It should look something like the following:

Run the extract subtemplate function in test mode (ie. without checking the I really want to do this box). This will test whether your XPaths are correct by extracting one subtemplate from the selected set of records and displaying the results.

If you are happy with the test results, go ahead with the actual extraction by checking the I really want to do this checkbox. After the extraction completes you should see some results.
Finally, go to the subtemplate directory management interface and you should be able to select the root element of your subtemplates to examine the extracted subtemplates.
The metadata records from which the subtemplates were extracted now have xlinks to the subtemplates.
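Before running the extraction it can help to check the two XPaths against a sample record outside GeoNetwork. The following Python sketch does this with the standard library. It is only an illustration under stated assumptions: the sample record is a cut-down, made-up iso19139 document, the fragment XPath is assumed to end at gmd:CI_Contact (the root element of the fragment shown earlier), the gco namespace URI is supplied by this sketch, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",  # supplied by this sketch
}

# Cut-down, made-up record used only to exercise the XPaths.
RECORD = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
                 xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:contact>
    <gmd:CI_ResponsibleParty>
      <gmd:contactInfo>
        <gmd:CI_Contact>
          <gmd:address>
            <gmd:CI_Address>
              <gmd:city><gco:CharacterString>Rome</gco:CharacterString></gmd:city>
              <gmd:electronicMailAddress>
                <gco:CharacterString>contact@example.org</gco:CharacterString>
              </gmd:electronicMailAddress>
            </gmd:CI_Address>
          </gmd:address>
        </gmd:CI_Contact>
      </gmd:contactInfo>
    </gmd:CI_ResponsibleParty>
  </gmd:contact>
</gmd:MD_Metadata>"""

# Relative to the gmd:MD_Metadata root element (assumed to end at CI_Contact).
FRAGMENT_XPATH = "gmd:contact/gmd:CI_ResponsibleParty/gmd:contactInfo/gmd:CI_Contact"
# Relative to the root element of each fragment.
TITLE_XPATH = "gmd:address/gmd:CI_Address/gmd:electronicMailAddress/gco:CharacterString"


def preview_subtemplates(record_xml):
    """Return (title, fragment element) pairs found in one record."""
    root = ET.fromstring(record_xml)
    pairs = []
    for fragment in root.findall(FRAGMENT_XPATH, NS):
        title = fragment.findtext(TITLE_XPATH, default="", namespaces=NS).strip()
        pairs.append((title, fragment))
    return pairs


if __name__ == "__main__":
    for title, fragment in preview_subtemplates(RECORD):
        print(title, fragment.tag)
```

An empty title in the output would suggest that the title XPath is not relative to the fragment root, which is the most common mistake the test mode is meant to catch.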
Note: finer control of the XLink cache will be implemented in a future version of GeoNetwork.
4.11 Schemas
Metadata records in GeoNetwork are described by a schema. The schema sets out the structuring of the metadata record and provides all the ancillary data and functions to use the schema in GeoNetwork. A metadata schema plugin capability has been introduced in GeoNetwork 2.8.0. This allows the administrator to add, update and delete metadata schemas in GeoNetwork without the need to stop and restart GeoNetwork.

Note: Adding a metadata schema to GeoNetwork that is incorrect or invalid can thoroughly break your GeoNetwork instance. This section is for catalogue administrators who are confident about metadata schemas and understand the different technologies involved with a GeoNetwork metadata schema.

A detailed description of what constitutes a metadata schema for GeoNetwork can be found in the GeoNetwork Developers Manual. This section will describe how to access the schema add, update and delete functions and how those functions should be used. To access these functions you need to be logged in to GeoNetwork as an Administrator. The schema functions are on the Administration page as shown below.

The Administration page with the metadata schema functions highlighted

Note: Metadata schemas should be thoroughly tested in a development instance of GeoNetwork before they are added to a production instance. Errors in a schema plugin (particularly in the presentation XSLTs) may make your GeoNetwork instance unusable.
There are three possible locations for the ZIP archive:

1. On the server filesystem - you specify the path of the ZIP archive on the server filesystem.
2. On a web server accessible via a http link - you specify the URL of the ZIP archive on the web server.
3. Attached to a metadata record describing the schema which is present in the local GeoNetwork catalog - you specify the UUID of that metadata record, which must be an iso19139 metadata record.
Note: You cannot delete a metadata schema if there are records that belong to that schema in the catalog. You must delete all the records that belong to that schema first before you can delete the schema itself.

Note: You cannot delete a metadata schema if another schema depends upon that schema eg. you cannot delete the iso19139 schema if the iso19139.mcp schema is present because the iso19139.mcp schema is a profile that depends on iso19139. Schema dependencies can be found/specified in the schema-ident.xml file.
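The two deletion rules in the notes above amount to a simple pre-check. The Python sketch below restates them; it is not how GeoNetwork implements the check. The function name and the shape of the inputs (a record count per schema and a mapping from each schema to the schemas it depends on) are assumptions made for illustration.

```python
def can_delete_schema(schema, record_counts, dependencies):
    """Return (allowed, reasons) for deleting a schema.

    record_counts maps a schema name to the number of records in the catalog;
    dependencies maps a schema name to the set of schemas it depends on.
    """
    reasons = []
    if record_counts.get(schema, 0) > 0:
        reasons.append(f"{record_counts[schema]} record(s) still belong to {schema}")
    dependents = sorted(name for name, needs in dependencies.items() if schema in needs)
    if dependents:
        reasons.append(f"{schema} is required by: {', '.join(dependents)}")
    return (not reasons, reasons)


if __name__ == "__main__":
    dependencies = {"iso19139.mcp": {"iso19139"}}
    print(can_delete_schema("iso19139", {"iso19139": 0}, dependencies))
    print(can_delete_schema("iso19139.mcp", {"iso19139.mcp": 0}, dependencies))
```

In the example the iso19139 schema cannot be deleted while iso19139.mcp is present, whereas the profile itself can be deleted once it has no records.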
CHAPTER 5
Features
where language-code is one of the ISO 639-2 (three-character) language codes; see http://www.loc.gov/standards/iso639-2/php/code_list.php.
If the request parameter is not sent (the user selected any language, or it is not in the XML request), the requested language may be automatically detected, if an Administrator user has enabled this in the System Configuration:
The auto-detection feature uses the Language Detection Library for Java, see https://code.google.com/p/language-detection/. This library tries to detect the language of search terms in parameter any. This may not work very well, depending on the language, if there is only one or very few search terms. This is why this feature is disabled by default.

At the time of writing the auto-detection supports these languages: Afrikaans, Arabic, Bulgarian, Bengali, Czech, Danish, German, Greek (modern), English, Spanish, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Lithuanian, Latvian, Macedonian, Malayalam, Marathi, Nepali, Dutch, Norwegian, Punjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Albanian, Swedish, Swahili, Tamil, Telugu, Thai, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Chinese (traditional), Chinese (simplified).

If autodetecting the language is disabled (the default), the current language of the user's GUI is used as the requested language.

If there is no GUI, the requested language is hardcoded to be English.
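The order in which the requested language is resolved can be summarised in a short sketch. It is not GeoNetwork's implementation: the function and parameter names are hypothetical, the detector is passed in as a plain callable, and falling back to the GUI language when detection returns nothing is an assumption the text above does not confirm.

```python
def resolve_requested_language(request_lang=None, autodetect_enabled=False,
                               detect=None, search_terms="", gui_lang=None):
    """Pick the requested language using the fallback order described above."""
    # 1. An explicit language parameter in the request always wins.
    if request_lang:
        return request_lang
    # 2. Optional auto-detection on the search terms (disabled by default).
    if autodetect_enabled and detect is not None:
        detected = detect(search_terms)
        if detected:
            return detected
        # Assumption: if nothing is detected, continue with the GUI language.
    # 3. The current language of the user's GUI.
    if gui_lang:
        return gui_lang
    # 4. No GUI: English, as an ISO 639-2 code.
    return "eng"


print(resolve_requested_language(request_lang="fre", gui_lang="eng"))
print(resolve_requested_language(gui_lang="ger"))
print(resolve_requested_language())
```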
5.1.3 Stopwords
Stopwords are words that are considered to carry little or no meaning relevant to search. To improve relevance ranking of search results, stopwords are often removed from search terms. In GeoNetwork stopwords are automatically used if a stopwords list for the requested language is available; if not, no stopwords are used.

At the time of writing there are stopword lists for: Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Greek (modern), English, Spanish, Persian, Finnish, French, Hindi, Hungarian, Italian, Japanese, Korean, Marathi, Malay, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Swedish, Turkish, Chinese.
System administrators may add stopword lists for additional languages by placing them in the directory <geonetwork>/web/resources/stopwords. The filenames should be <ISO 639-2 code>.txt. If you do add a stopwords list for another language, please consider contributing it for inclusion in GeoNetwork. Likewise, to disable stopwords usage for one or more languages, the stopword list files should be removed or renamed.
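The behaviour described above (use a list if a file named after the language code exists, otherwise use no stopwords) can be sketched as follows. This is a minimal illustration and not the code GeoNetwork runs: the helper names are hypothetical, and the one-word-per-line file layout is an assumption, so check an existing list in the stopwords directory before writing your own.

```python
import os
import tempfile


def load_stopwords(stopwords_dir, lang_code):
    """Load <lang_code>.txt from the directory; no file means no stopwords."""
    path = os.path.join(stopwords_dir, lang_code + ".txt")
    if not os.path.isfile(path):
        return set()
    with open(path, encoding="utf-8") as handle:
        # Assumption: one stopword per line.
        return {line.strip().lower() for line in handle if line.strip()}


def remove_stopwords(terms, stopwords):
    """Drop search terms that appear in the stopword set."""
    return [term for term in terms if term.lower() not in stopwords]


# Demonstration in a temporary directory standing in for the stopwords directory.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "eng.txt"), "w", encoding="utf-8") as out:
        out.write("the\nof\nand\n")
    english = load_stopwords(demo_dir, "eng")
    missing = load_stopwords(demo_dir, "fre")  # no fre.txt, so an empty set
    print(remove_stopwords(["map", "of", "the", "rivers"], english))
    print(remove_stopwords(["carte", "des", "rivieres"], missing))
```

Renaming or removing eng.txt in this sketch has the same effect as in the text: the language falls back to having no stopwords.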
5.3 Thesaurus
5.3.1 Introduction
A thesaurus is a list of words (or concepts) from a specialized field of knowledge. In a metadata catalog, words from a thesaurus can be assigned to a metadata record (as keywords) as a way of associating it with one or more concepts from a field of knowledge. For example, a record may be assigned a keyword AGRICULTURE - Crops meaning that the record describes a resource or activity relating to crops in the field of Agriculture.

In GeoNetwork, the process of assigning keywords to a metadata record takes place in the metadata editor. The user can choose words from one or more thesauri to associate the record with the concepts described by those words. This process is supported for both ISO19115/19139 and Dublin Core metadata records using an ExtJS based thesaurus picker.

Concepts within a field of knowledge or in different fields of knowledge may be related or even be equivalent. For example, in a thesaurus describing geographic regions, the Australian state of Tasmania is a specialization of the country of Australia. As an example of overlapping concepts in different fields, a thesaurus describing science activities in the field of global change may have concepts relating to agricultural activities that will be equivalent to terms from a thesaurus that describes the themes used in a map series.

In GeoNetwork, thesauri are represented as SKOS (Simple Knowledge Organisation System). SKOS (more on this below) captures concepts and relationships between concepts. SKOS thesauri can be imported from standalone files or they can be generated from ISO19135 register records in the local GeoNetwork catalog. ISO19135 (more on this below) not only captures the concepts and relationships between the concepts, but (amongst other things) how the concepts have evolved and, most importantly, who has contributed to and managed the evolution of the concepts and the thesaurus itself.
Description
- Thesaurus has concepts identifying a location
- Thesaurus has concepts identifying layers of any deposited substance
- Thesaurus has concepts identifying a time period
- Thesaurus has concepts identifying a particular subject or topic
- Thesaurus has concepts identifying a branch of instruction or specialized learning
GeoNetwork supports multilingual thesauri (e.g. Agrovoc). Search and editing take place in the current user interface language (i.e. if the interface is in English, when editing metadata, GeoNetwork will only search for concepts in English).

We use SKOS to represent thesauri in GeoNetwork because:
- it provides a simple and compact method of describing concepts and relationships between concepts from a field of knowledge
- SKOS concepts can be queried and managed by the sesame/openRDF software used by GeoNetwork
- the name and a description of the register
- version and language information
- contact information of those that have a role in the register (eg. manager, contributor, custodian, publisher etc)
- the elements used to describe an item in the register
- the items

The standard information used to describe a register item includes:
- identifier
- name and description of item
- field of application
- lineage and references to related register items

An example of a register item from the register of the NASA GCMD (Global Change Master Directory) science keywords is shown below.
<grg:RE_RegisterItem uuid="d1e7">
  <grg:itemIdentifier>
    <gco:Integer>7</gco:Integer>
  </grg:itemIdentifier>
  <grg:name>
    <gco:CharacterString>Aquaculture</gco:CharacterString>
  </grg:name>
  <grg:status>
    <grg:RE_ItemStatus>valid</grg:RE_ItemStatus>
  </grg:status>
  <grg:dateAccepted>
    <gco:Date>2006</gco:Date>
  </grg:dateAccepted>
  <grg:definition gco:nilReason="missing"/>
  <grg:itemClass xlink:href="#Item_Class"/>
  <grg:specificationLineage>
    <grg:RE_Reference>
      <grg:itemIdentifierAtSource>
        <gco:CharacterString>5</gco:CharacterString>
      </grg:itemIdentifierAtSource>
      <grg:similarity>
        <grg:RE_SimilarityToSource codeListValue="generalization"
            codeList="http://ww.../lists.xml#RE_SimilarityToSource"/>
      </grg:similarity>
    </grg:RE_Reference>
  </grg:specificationLineage>
</grg:RE_RegisterItem>
As mentioned earlier, to use a thesaurus described by an ISO19135 register record, GeoNetwork uses an XSLT called xml_iso19135ToSKOS.xsl (from the convert subdirectory in the iso19135 plugin schema) to extract the following from the ISO19135 register record:
- valid concepts (grg:itemIdentifier, grg:name, grg:status)
- relationships to other concepts (grg:specificationLineage)
- title, version and other management info
This information is used to build a SKOS file. The SKOS file is then available for query and management by the sesame/openRDF software used in GeoNetwork.
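To make the "valid concepts" step concrete, the sketch below pulls the identifier and name of every register item whose status is valid. It only mirrors the idea; GeoNetwork itself does this with the XSLT named above. The wrapping items element, the second (retired) item and the function name are made up for the example, and the grg/gco namespace URIs are the ones commonly used with ISO 19139 encodings, so verify them against your own records.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; check them against a real register record.
NS = {
    "grg": "http://www.isotc211.org/2005/grg",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical, cut-down input: two register items in a made-up wrapper.
SAMPLE = """
<items xmlns:grg="http://www.isotc211.org/2005/grg"
       xmlns:gco="http://www.isotc211.org/2005/gco">
  <grg:RE_RegisterItem uuid="d1e7">
    <grg:itemIdentifier><gco:Integer>7</gco:Integer></grg:itemIdentifier>
    <grg:name><gco:CharacterString>Aquaculture</gco:CharacterString></grg:name>
    <grg:status><grg:RE_ItemStatus>valid</grg:RE_ItemStatus></grg:status>
  </grg:RE_RegisterItem>
  <grg:RE_RegisterItem uuid="d1e8">
    <grg:itemIdentifier><gco:Integer>8</gco:Integer></grg:itemIdentifier>
    <grg:name><gco:CharacterString>Old term</gco:CharacterString></grg:name>
    <grg:status><grg:RE_ItemStatus>retired</grg:RE_ItemStatus></grg:status>
  </grg:RE_RegisterItem>
</items>
"""


def valid_concepts(xml_text):
    """Return (identifier, name) pairs for register items with status 'valid'."""
    root = ET.fromstring(xml_text)
    concepts = []
    for item in root.iter("{%s}RE_RegisterItem" % NS["grg"]):
        status = item.findtext("grg:status/grg:RE_ItemStatus", default="", namespaces=NS)
        if status.strip() != "valid":
            continue  # only valid items become thesaurus concepts
        identifier = item.findtext("grg:itemIdentifier/gco:Integer", default="", namespaces=NS)
        name = item.findtext("grg:name/gco:CharacterString", default="", namespaces=NS)
        concepts.append((identifier.strip(), name.strip()))
    return concepts


print(valid_concepts(SAMPLE))
```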
Figure 5.2: Administration interface for thesaurus

For each thesaurus the following buttons are available:
Download - Link to the SKOS RDF file.
5.3. Thesaurus
Delete - Remove thesaurus from the current node.
View - If type is external, the view button lets you search and view concepts.
Edit - If type is local, the edit button lets you search, add, remove and view concepts.

Import an external thesaurus
GeoNetwork allows thesaurus import in SKOS format. Once uploaded, an external thesaurus cannot be updated. Select the category, browse for the thesaurus file and click upload. The SKOS file will be in GEONETWORK_DATA_DIR/config/codelist/external/thesauri/<category>.
Figure 5.3: Upload interface for thesaurus

At the bottom of the page there are the following buttons:
1. Back: Go back to the main administration page.
2. Upload: Upload the selected RDF file to the node. It will then list all thesauri available on the node.

Creating a register thesaurus
An ISO19135 record in the local GeoNetwork catalog can be turned into a SKOS file and used as a thesaurus in GeoNetwork. ISO19135 records not in the local catalog can be harvested from other catalogs (eg. the catalog of the organisation that manages the register). Once the ISO19135 register record is in the local catalog, the process of turning it into a thesaurus for use in the keywords selector begins with a search for the record. Having located the record in the search results, one of the actions on the record is to Create/Update Thesaurus. After selecting this action, you can choose the ISO thesaurus category appropriate for this thesaurus.

After selecting the ISO thesaurus category, the ISO19135 register record is converted to a SKOS file and installed as a thesaurus ready for use in the metadata editor. As described above in the section on ISO19135, only the valid register items are included in the thesaurus. This behaviour and any of the mappings between ISO19135 register items and the SKOS thesaurus file can be inspected or changed in the XSLT xml_iso19135ToSKOS.xsl in the convert subdirectory of the iso19135 schema plugin.
Figure 5.4: Search results showing ISO19135 record with thesaurus creation action
Figure 5.5: Selecting the ISO thesaurus category when creating a thesaurus
keywords search add/remove keywords for local thesaurus. Use the textbox and the type of search in order to search for keywords.
Preparing to edit an ISO19135 register record
Register records can be very large. For example, a register record describing the ANZLIC Geographic Extent Names register has approx 1800 register items. Each register item holds not only the name of the geographic extent, but also its geographic extent and details of the lineage, relationships to other terms and, potentially, the evolution of the extent (changes to name, geographic extent) including the details of changes and why those changes occurred. Editing such a large record in the GeoNetwork editor can cause performance problems for both the browser and the server because the editor constructs an HTML form describing the entire record. Fortunately a much more scalable approach exists, based on extracting the register items from the ISO19135 register record and storing them as subtemplates (essentially small metadata records with just the content of the register item).

The process for extracting register items from an ISO19135 register record is as follows:
- search for and select the register record
- choose Extract register items from the Actions on selected set menu
Figure 5.8: Extracting subtemplates from a register record After the register items have been extracted, you should see a results summary like the following.
The figure for Subtemplates extracted is the number of register items extracted from the ISO19135 register record.

Editing a register item
To edit/change any of the register items that have been extracted as subtemplates, you can use the Directory management interface. This interface is accessed from the Administration menu, under Manage Directories. In this interface:
- select Register Item (GeoNetwork) as the type of subtemplate to edit as follows.
Figure 5.9: Managing a Directory of subtemplates, selecting Register Item subtemplates

- enter a search term or just select the search option to return the first 50 register items.
- register items will appear in the left hand side bar; selecting one will open an editing interface in the right hand panel.
Figure 5.10: Managing a Directory of subtemplates, opening a Register Item for editing Editing global register information To edit/change any of the global register information (eg. register owner, manager, version, languages), edit the register record in the normal GeoNetwork metadata editing interface.
Figure 5.15: Auto-complete function in thesaurus search interface

If an XML element named keyword-selection-panel is present as a child of the search element in the config-gui.xml file (in the WEB-INF directory), then searching for keywords using the keyword selection panel is available, as in the metadata editor:
<search>
  <!-- Display or not keyword selection panel in advanced search panel
  <keyword-selection-panel/>
  -->
</search>
If Register is chosen the user will be asked to fill out a form as follows:

The fields in this form are self-explanatory except for the following:
Email: The user's email address. This is mandatory and will be used as the username.

5.4. User Self-Registration Functions
Profile: By default, self-registered users are given the Registered User profile (see previous section). If any other profile is selected:
- the user will still be given the Registered User profile
- an email will be sent to the Email address nominated in the Feedback section of the System Administration menu, informing them of the request for a more privileged profile
You've told us that you want to be "Editor", you will be contacted by our office soon.
To log in and access your account, please click on the link below.
http://greenhouse.gov/geonetwork
Thanks for your registration.
Yours sincerely,
The team at The Greenhouse GeoNetwork Site
Notice that the user has been added to the built-in user group GUEST. This is a security restriction. An administrator/user-administrator can add the user to other groups if that is required later. If you want to change the content of this email, you should modify INSTALL_DIR/web/geonetwork/xsl/registration-pwd-email.xsl.

Notice also that the user has requested an Editor profile. As a result an email will be sent to the Email address nominated in the Feedback section of the System Administration menu, which looks something like the following:

Dear Admin,

Newly registered user [email protected] has requested "Editor" access f

Instance: The Greenhouse GeoNetwork Site
Url: http://greenhouse.gov/geonetwork

User registration details:
Name: Dubya
Surname: Shrub
Email: [email protected]
Organisation: The Greenhouse
Type: gov
Address: 146 Main Avenue, Creationville
State: Clerical
Post Code: 92373
Country: Mythical

Please action.
The Greenhouse GeoNetwork Site
http://localhost:8080/geonetwork/srv/en/password.change.form?username=dubya.shrub@greenh
This link is valid for today only.
Greenhouse GeoNetwork Site
GeoNetwork has generated a changeKey from the forgotten password and the current date and emailed that to the user as part of a link to a change password form. If you want to change the content of this email, you should modify INSTALL_DIR/web/geonetwork/xsl/password-forgotten-email.xsl.
When the user clicks on the link, a change password form is displayed in their browser and a new password can be entered. When that form is submitted to GeoNetwork, the changeKey is regenerated and checked against the changeKey supplied in the link. If they match, the password is changed to the new password supplied by the user. The final step in this process is a verification email sent to the email address of the user confirming that a change of password has taken place:
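The regenerate-and-compare idea behind the changeKey can be sketched as below. This is not GeoNetwork's algorithm: the text only says the key is derived from the forgotten password and the current date, so the use of SHA-256, the way the two values are joined and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac
from datetime import date


def make_change_key(stored_password, day):
    """Derive a key from the stored password value and a calendar day."""
    # Assumption: SHA-256 over password + ISO date; the real scheme may differ.
    material = (stored_password + day.isoformat()).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


def check_change_key(stored_password, supplied_key, today):
    """Regenerate the key for today and compare it with the one from the link."""
    expected = make_change_key(stored_password, today)
    return hmac.compare_digest(expected, supplied_key)


issued = date(2013, 1, 19)
key = make_change_key("old-password-hash", issued)
print(check_change_key("old-password-hash", key, issued))            # same day
print(check_change_key("old-password-hash", key, date(2013, 1, 20))) # next day
```

Because the date is part of the key material, a key regenerated on a later day no longer matches, which is one way to obtain the "valid for today only" behaviour mentioned in the email.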
Your Greenhouse GeoNetwork Site password has been changed. If you did not change this password contact the Greenhouse GeoNetwork Site helpdesk The Greenhouse GeoNetwork Site team
CHAPTER 6
This glossary provides you with brief descriptions of the minimum set of metadata fields required to properly describe geographical data, as well as some optional elements highly suggested for a more extensive standard description.

Access constraints: Access constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations on obtaining the resource
Abstract: Brief narrative summary of the content of the resource(s)
Administrative area: State, province of the location
Temporal extent - Begin date: Formatted as 2007-09-12T15:00:00 (YYYY-MM-DDTHH:mm:ss)
Character set: Full name of the character coding standard used for the metadata set
Grid spatial representation - Cell geometry: Identification of grid data as point or cell
City: City of the location
Reference System Info - Code: Alphanumeric value identifying an instance in the namespace
Country: Country of the physical address
Data quality info: Provides overall assessment of quality of a resource(s)
Date: Reference date and event used to describe it (YYYY-MM-DD)
Date stamp: Date that the metadata was created (YYYY-MM-DDThh:mm:ss)
Date type: Event used for reference date
Delivery point: Address line for the location (as described in ISO 11180, annex A)
Equivalent scale - Denominator: The number below the line in a vulgar fraction
Data Quality - Description: Description of the event, including related parameters or tolerances
OnLine resource - Description: Detailed text description of what the online resource is/does
Descriptive keywords: Provides category keywords, their type, and reference source
Grid spatial representation - Dimension name: Name of the axis i.e. row, column
Grid spatial representation - Dimension size: Number of elements along the axis
Dimension size Resolution: Number of elements along the axis
Distribution info: Provides information about the distributor of and options for obtaining the resource(s)
Geographic bounding box - East bound longitude: Eastern-most coordinate of the limit of the dataset extent, expressed in longitude in decimal degrees (positive east)
Edition: Version of the cited resource
Electronic mail address: Address of the electronic mailbox of the responsible organisation or individual
Temporal extent - End date: Formatted as 2007-09-12T15:00:00 (YYYY-MM-DDTHH:mm:ss)
Equivalent scale: Level of detail expressed as the scale of a comparable hardcopy map or chart
Extent: Information about spatial, vertical, and temporal extent
Facsimile: Telephone number of a facsimile machine for the responsible organisation or individual
File identifier: Unique identifier for this metadata file
Vector spatial representation - Geometric object type: Name of point and vector spatial objects used to locate zero-, one- and two-dimensional spatial locations in the dataset
Vector spatial representation - Geometric object count: Total number of the point or vector object type occurring in the dataset
Geographic bounding box: Geographic position of the dataset
Grid spatial representation: Information about grid spatial objects in the dataset
Grid spatial representation - Resolution value: Degree of detail in the grid dataset
Grid spatial representation - Transformation parameter availability: Indication of whether or not parameters for transformation exist
Data Quality - Hierarchy level: Hierarchical level of the data specified by the scope
Identification info: Basic information about the resource(s) to which the metadata applies
Point of Contact - Individual name: Name of the responsible person - surname, given name, title separated by a delimiter
Keyword: Commonly used word(s) or formalised word(s) or phrase(s) used to describe the subject
Data Language: Language used for documenting data
Metadata - Language: Language used for documenting metadata
Data Quality - Lineage: Non-quantitative quality information about the lineage of the data specified by the scope. Mandatory if report not provided
OnLine resource - Linkage: Location (address) for on-line access using a Uniform Resource Locator address or similar addressing scheme such as http://www.statkart.no/isotc211
Maintenance and update frequency: Frequency with which changes and additions are made to the resource after the initial resource is completed
Metadata author: Party responsible for the metadata information
Metadata standard name: Name of the metadata standard (including profile name) used
Metadata standard version: Version (profile) of the metadata standard used
OnLine resource - Name: Name of the online resource
Geographic bounding box - North bound latitude: Northern-most coordinate of the limit of the dataset extent, expressed in latitude in decimal degrees (positive north)
Grid spatial representation - Number of dimensions: Number of independent spatial-temporal axes
Distribution Info - OnLine resource: Information about online sources from which the resource can be obtained
Point of Contact - Organisation name: Name of the responsible organisation
Other constraints: Other restrictions and legal prerequisites for accessing and using the resource
Point of contact: Identification of, and means of communication with, person(s) and organisation(s) associated with the resource(s)
Point of contact - Position name: Role or position of the responsible person
Postal code: ZIP or other postal code
Presentation form: Mode in which the resource is represented
OnLine resource - Protocol: Connection protocol to be used
Purpose: Summary of the intentions with which the resource(s) was developed
Reference system info: Description of the spatial and temporal reference systems used in the dataset
Data Quality - Report: Quantitative quality information for the data specified by the scope. Mandatory if lineage not provided
Grid spatial representation - Resolution value: Degree of detail in the grid dataset
Point of contact - Role: Function performed by the responsible party
Geographic bounding box - South bound latitude: Southern-most coordinate of the limit of the dataset extent, expressed in latitude in decimal degrees (positive north)
Spatial representation info: Digital representation of spatial information in the dataset
Spatial representation type: Method used to spatially represent geographic information
Data Quality - Statement: General explanation of the data producer's knowledge about the lineage of a dataset
Status: Status of the resource(s)
Supplemental Information: Any other descriptive information about the dataset
Temporal Extent: Time period covered by the content of the dataset
Title: Name by which the cited resource is known
Topic category code: High-level geographic data thematic classification to assist in the grouping and search of available geographic data sets. Can be used to group keywords as well. Listed examples are not exhaustive. NOTE It is understood there are overlaps between general categories and the user is encouraged to select the one most appropriate.
Grid spatial representation - Transformation parameter availability: Indication of whether or not parameters for transformation exist
Vector spatial representation - Topology level: Code which identifies the degree of complexity of the spatial relationships
Type: Subject matter used to group similar keywords
URL: Uniform Resource Locator
Use constraints: Constraints applied to assure the protection of privacy or intellectual property, and any special restrictions or limitations or warnings on using the resource
Vector spatial representation: Information about the vector spatial objects in the dataset
Voice: Telephone number by which individuals can speak to the responsible organisation or individual
Geographic bounding box - West bound longitude: Western-most coordinate of the limit of the dataset extent, expressed in longitude in decimal degrees (positive east)
CHAPTER 7
CHAPTER 8
A range of related software packages can be used in addition to GeoNetwork opensource to deploy a full Spatial Data Infrastructure. These include Web Map Server software, GIS desktop applications and Web Map Viewers. Below you will find some examples of open source software available for each category.
Chapter 8. Free and Open Source Software for Geospatial Information Systems
CHAPTER 9
This is a list of frequently encountered problems, suggestions that help to find the cause of the problem and possible solutions. The list is by no means exhaustive. Feel free to contribute by submitting new problems and their solutions to the developer mailing list.

Note: <install directory> is a placeholder for the GeoNetwork web application directory (eg. <your_tomcat>/webapps/geonetwork or <your_jetty>/web/geonetwork). <some file> should be read as some random file name.

Warning: Be very careful when issuing commands on the terminal! You can easily damage your operating system with no way back. If you are not familiar with using the terminal: don't do it, contact an expert instead! Make a backup of your data before you make any of the suggested changes below!
Then check The data/tmp directory and What/Where is the GeoNetwork data directory?.
The above example shows that only the user tomcat has write access on the directories listed. All other users have read (and execute) rights only. See http://en.wikipedia.org/wiki/Filesystem_permissions for more details on file permissions.

Make sure your web server is running as user tomcat. Check this with the command:

ps aux | grep tomcat

You should see the processes that have tomcat in their description. Something like this:
bash-3.2# ps aux | grep tomcat
tomcat 22253 0,7 0,0 2435120 532 s000 S+ 5:03pm
tomcat 22251 0,0 1,9 2861960 80596 s000 S 5:03pm
If all is well, the user referred to at the start of this string (in this case tomcat) is the same user that has write permissions on the data and tmp directories. You now have two possible solutions: Make the data and temporary directories writable to all users. You can change this using the command: chmod -R a+w <install directory>/data Your permissions should now look like this:
drwxrwxrwx 6 tomcat tomcat 204 19 jan 15:34 .
etc..
Note: the w refers to write access The second solution is to ensure the user running the webserver is the same user that holds write access to the data directory (in this case tomcat). For this, you can (a) change the user running the process, or (b) change ownership of the directory using the chown command: chown -R tomcat:tomcat <install directory>/data
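If you prefer to check ownership and write access from a script rather than by reading ls output, a minimal sketch along these lines may help. It assumes a POSIX system (the pwd module is not available on Windows), the function name is hypothetical, and it reports write access for the user running the script, so run it as the same user as the web server. The demonstration uses a temporary directory in place of <install directory>/data.

```python
import os
import pwd  # POSIX only
import stat
import tempfile


def describe_dir(path):
    """Report owner, permission string and whether the current user can write."""
    info = os.stat(path)
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid without a passwd entry
    return {
        "owner": owner,
        "mode": stat.filemode(info.st_mode),       # e.g. drwxr-xr-x
        "writable_by_me": os.access(path, os.W_OK),
    }


# Stand-in for the data directory; replace with your own path.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o755)
    print(describe_dir(demo_dir))
```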
If all is well, then the tomcat user will have write permissions on all sub directories. If not then you should ensure that the user running the webserver is the same user that holds write access to the GeoNetwork data directory (in this case tomcat). For this, you can (a) change the user running the process, or (b) change ownership of the directory using the chown command: chown -R tomcat:tomcat <install directory>/WEB-INF/data
Jetty by default ships with a classloader that does not conform to the Java classloading model: you'll notice because GeoServer will fail every JAI usage attempt with a sealing violation exception. Standard behaviour can be restored by locating the etc/jetty-webapps.xml configuration file and changing the web app context configuration to look like the following:
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Ref id="DeploymentManager">
    <Call id="webappprovider" name="addAppProvider">
      <Arg>
        <New class="org.eclipse.jetty.deploy.providers.WebAppProvider">
          <Set name="monitoredDir"><Property name="jetty.home" default="." />/../web
          <Set name="defaultsDescriptor"><Property name="jetty.home" default="."/>/e
          <Set name="scanInterval">1</Set>
          <Set name="contextXmlDir"><Property name="jetty.home" default="." />/conte
          <Set name="extractWars">true</Set>

          <!-- uncomment in case of a JAI usage attempt with a "sealing violation" e
          <Set name="parentLoaderPriority">true</Set>
        </New>
      </Arg>
    </Call>
  </Ref>
</Configure>
Note: The important line is the one where the parentLoaderPriority property is set to true
CHAPTER 10
Glossary
ebRIM: Enterprise Business Registry Information Model.
CSW: Catalog Service for the Web. The OGC Catalog Service defines common interfaces to discover, browse, and query metadata about data, services, and other potential resources.
ISO: The International Organization for Standardization is an international standard-setting body composed of representatives from various national standards organizations. http://www.iso.org
ISO TC211: ISO/TC 211 is a standard technical committee formed within ISO, tasked with covering the areas of digital geographic information (such as used by geographic information systems) and geomatics. It is responsible for preparation of a series of International Standards and Technical Specifications numbered in the range starting at 19101.
GeoNetwork: GeoNetwork opensource is a standards based, Free and Open Source catalog application to manage spatially referenced resources through the web. http://geonetwork-opensource.org
GeoServer: GeoServer is an open source software server written in Java that allows users to share and edit geospatial data. Designed for interoperability, it publishes data from any major spatial data source using open standards.
GPL: The GNU General Public License is a free, copyleft license for software and other kinds of works. GeoNetwork opensource is released under the GPL 2 license. http://www.gnu.org/licenses/old-licenses/gpl-2.0.html
Creative Commons: GeoNetwork documentation is released under the Creative Commons Attribution-ShareAlike 3.0 Unported License. Find more information at http://creativecommons.org/licenses/by-sa/3.0/
XML: Extensible Markup Language is a general-purpose specification for creating custom markup languages.
XSD: XML Schema, published as a W3C recommendation in May 2001, is one of several XML schema languages. http://en.wikipedia.org/wiki/XSD
ebXML: Enterprise Business XML.
DAO: Data Access Object.
CRUD: Create Read Update and Delete.
DB (or DBMS): A database management system (DBMS) is computer software that manages databases. DBMSes may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way.
SOA: Service Oriented Architecture provides methods for systems development and integration where systems package functionality as interoperable services. A SOA infrastructure allows different applications to exchange data with one another.
FGDC: The Federal Geographic Data Committee (FGDC) is an interagency committee that promotes the coordinated development, use, sharing, and dissemination of geospatial data on a national basis in the USA. See http://www.fgdc.gov
JMS: Java Messaging Service.
TDD: Test Driven Development.
JIBX: Binding XML to Java Code.
HQL: Hibernate Query Language.
OO: Object Oriented.
EJB: Enterprise Java Beans.
SOAP: Simple Object Access Protocol is a protocol specification for exchanging structured information in the implementation of Web Services in computer networks.
OGC: Open Geospatial Consortium. A standards organization for geospatial information systems. http://www.opengeospatial.org
OSGeo: The Open Source Geospatial Foundation (OSGeo) is a non-profit non-governmental organization whose mission is to support and promote the collaborative development of open geospatial technologies and data. http://www.osgeo.org
FAO: Food and Agriculture Organisation of the United Nations is a specialised agency of the United Nations that leads international efforts to defeat hunger. http://www.fao.org
WFP: World Food Programme of the United Nations is the food aid branch of the United Nations, and the world's largest humanitarian organization. http://www.wfp.org
UNEP: The UN Environment Programme (UNEP) coordinates United Nations environmental activities, assists developing countries in implementing environmentally sound policies and encourages sustainable development through sound environmental practices. http://www.unep.org
OCHA: United Nations Office for the Coordination of Humanitarian Affairs is designed to strengthen the UN's response to complex emergencies and natural disasters. http://ochaonline.un.org/
URL: A Uniform Resource Locator specifies where an identified resource is available and the mechanism for retrieving it.
GAST: GeoNetwork Administrator Survival Tool. A desktop application that allows administrators of a GeoNetwork catalog to perform simple database configuration using a GUI.
WebDAV: Web-based Distributed Authoring and Versioning. WebDAV is a set of extensions to the Hypertext Transfer Protocol (HTTP) that allows users to edit and manage files collaboratively on remote World Wide Web servers.
OAI-PMH: Open Archive Initiative Protocol for Metadata Harvesting. It is a protocol developed by the Open Archives Initiative. It is used to harvest (or collect) the metadata descriptions of the records in an archive so that services can be built using metadata from many archives.
WMS Web Map Service is a standard protocol for serving georeferenced map images over the Internet that are generated by a map server using data from a GIS database. The specification was developed and first published by the Open Geospatial Consortium in 1999.
WFS Web Feature Service provides an interface allowing requests for geographical features across the web using platform-independent calls. One can think of geographical features as the source code behind a map.
WCS Web Coverage Service provides an interface allowing requests for geographical coverages across the web using platform-independent calls. The coverages are objects (or images) in a geographical area.
WPS Web Processing Service is designed to standardize the way that GIS calculations are made available to the Internet. WPS can describe any calculation (i.e. process) including all of its inputs and outputs, and trigger its execution as a Web Service.
UUID A Universally Unique Identifier (UUID) is an identifier standard used in software construction, standardized by the Open Software Foundation (OSF) as part of the Distributed Computing Environment (DCE).
MAC address Media Access Control address (MAC address) is a unique identifier assigned to most network adapters or network interface cards (NICs) by the manufacturer for identification, and used in the Media Access Control protocol sublayer. See also http://en.wikipedia.org/wiki/MAC_address on Wikipedia.
MEF Metadata Exchange Format. An export format developed by the GeoNetwork community. More details can be found in this manual in Chapter Metadata Exchange Format.
SKOS The Simple Knowledge Organisation Systems (SKOS) is an area of work developing specifications and standards to support the use of knowledge organisation systems (KOS) such as thesauri and classification schemes. http://www.w3.org/2004/02/skos/
Z39.50 protocol Z39.50 is a client-server protocol for searching and retrieving information from remote computer databases. It is covered by ANSI/NISO standard Z39.50, and ISO standard 23950. The standards maintenance agency is the Library of Congress.
SMTP Simple Mail Transfer Protocol is an Internet standard for electronic mail (e-mail) transmission across Internet Protocol (IP) networks.
LDAP Lightweight Directory Access Protocol is an application protocol for querying and modifying directory services running over TCP/IP.
Shibboleth The Shibboleth System is a standards based, open source software package for web single sign-on across or within organisational boundaries. It allows sites to make informed authorisation decisions for individual access of protected online resources in a privacy-preserving manner.
DC The Dublin Core metadata element set is a standard for cross-domain information resource description. It provides a simple and standardised set of conventions for describing things online in ways that make them easier to find.
ESA European Space Agency is an intergovernmental organisation dedicated to the exploration of space. http://www.esa.int
FOSS Free and Open Source Software, also F/OSS, FOSS, or FLOSS (free/libre/open source software) is software which is liberally licensed to grant the right of users to study, change, and improve its design through the availability of its source code. http://en.wikipedia.org/wiki/FOSS
JDBC The Java Database Connectivity (JDBC) API is the industry standard for database-independent connectivity between the Java programming language and a wide range of databases: SQL databases and other tabular data sources, such as spreadsheets or flat files. The JDBC API provides a call-level API for SQL-based database access. JDBC technology allows you to use the Java programming language to exploit "Write Once, Run Anywhere" capabilities for applications that require access to enterprise data. With a JDBC technology-enabled driver, you can connect all corporate data even in a heterogeneous environment.
JAI Java Advanced Imaging (JAI) is a Java platform extension API that provides a set of object-oriented interfaces that support a simple, high-level programming model which allows developers to create their own image manipulation routines without the additional cost or licensing restrictions associated with commercial image processing software.
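As an illustration of the OAI-PMH entry in this glossary, the following minimal Python sketch shows how a harvesting request might be assembled. The `verb` parameter, the `ListRecords` verb and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself; the helper name `build_oai_request` and the endpoint `http://example.org/oai` are assumptions made for this example only and do not refer to any GeoNetwork installation.

```python
from urllib.parse import urlencode


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments.

    Hypothetical helper for illustration; it only builds the URL and
    does not contact any server.
    """
    query = {"verb": verb}
    # Skip arguments left as None so optional parameters can be omitted.
    query.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(query)


# Hypothetical endpoint; replace with the base URL of a real repository.
BASE_URL = "http://example.org/oai"

# Ask the repository for all records in simple Dublin Core (oai_dc).
url = build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(url)
```

A harvester would send an HTTP GET to the resulting URL and parse the XML response; that step is left out here so the sketch does not depend on a live server.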
Index
C
Creative Commons, 237
CRUD, 237
CSW, 237

D
DAO, 237
DB (or DBMS), 237
DC, 239

E
ebRIM, 237
ebXML, 237
EJB, 238
ESA, 239

F
FAO, 238
FGDC, 238
FOSS, 239

G
GAST, 238
GeoNetwork, 237
GEONETWORK_DATA_DIR, 93
GeoServer, 237
GPL, 237

H
HQL, 238

I
import
    MEF, 114
    XML, 114
ISO, 237
ISO TC211, 237

J
JAI, 240
JDBC, 239
JIBX, 238
JMS, 238

L
LDAP, 239

M
MAC address, 239
MEF, 239
    import, 114

O
OAI-PMH, 238
OCHA, 238
OGC, 238
OO, 238
OSGeo, 238

S
Shibboleth, 239
SKOS, 239
SMTP, 239
SOA, 238
SOAP, 238

T
TDD, 238

U
UNEP, 238
URL, 238
UUID, 239

W
WCS, 239
WebDAV, 238
WFP, 238
WFS, 239
WMS, 238
WPS, 239

X
XML, 237
    import, 114
XSD, 237

Z
Z39.50 protocol, 239