XDB 10.5.0 Manual
xDB
Version 10.5
Manual
EMC Corporation
Corporate Headquarters:
Hopkinton, MA 01748-9103
1-508-435-1000
www.EMC.com
Copyright © 2000-2013 EMC Corporation. All rights reserved.
EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without
notice.
THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO
REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION,
AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE.
Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.
For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Adobe and Adobe PDF
Library are trademarks or registered trademarks of Adobe Systems Inc. in the U.S. and other countries. All other trademarks used
herein are the property of their respective owners.
Documentation Feedback
Your opinion matters. We want to hear from you regarding our product documentation. If you have feedback about how we can
make our documentation better or easier to use, please send us your feedback directly at [email protected].
Table of Contents
xDB Documentation
This xDB Manual provides a technical introduction to the EMC Documentum xDB product. It
covers basic xDB concepts and discusses installing, configuring, administering, and using xDB
for software development. It also provides detailed information on advanced subjects, on
specific aspects of xDB, and on using xDB in combination with other tools.
The xDB distribution includes further information for developers, including API documentation
and sample code.
Intended audience
This manual is for xDB developers and administrators. It assumes that the reader is familiar with:
• Java and XML
• the operating system that is used with xDB
• basic database principles such as transactions, locking, and access rights
• general principles of client/server architecture and networking
Some knowledge of DOM, XQuery, and XSLT is helpful but not required.
Resources
The XML Technologies section of the EMC Developer Network offers resources related to xDB,
information about specific aspects of xDB and its use, case studies, tools and sample code that you
can download.
Support information
EMC Documentum technical support services are designed to make deployment and management
of Documentum products as effective as possible.
For the latest product documentation and support materials, including White Papers and Technical
Advisories, refer to EMC Online Support (https://support.emc.com). Check regularly for new and
updated documentation to ensure that you have the latest system information.
Note: Documentation installed, or packaged with the product on the download center, is current at
the time of release. Documentation updates made after a release are available for download from
EMC Online Support (https://support.emc.com).
Typographic conventions
The following table describes the typographic conventions used in this guide.
Table 1 Typographic conventions
Appearance Meaning
Fixed terms Terminology, such as names of features, standards, etc.
User interface control User interface controls, such as menu entries, buttons, etc.
API names Application Programming Interface elements, such as classes and methods
in Java APIs.
Monospaced text Example code, parameter values.
Commands Commands and their arguments, to be entered in a command prompt.
File paths Paths to files in the file system, usually relative to the installation directory.
Variable Names Variables in command strings and user input variables.
Revision History
• Fixed a concurrency issue with the hot backup functionality which prevented users from
connecting to the server when the backup was in progress.
• Fixed a bug that caused the backup LSN returned by XhiveBackupInfo.getBackupLSN() to be 0.
• Made the API specification of XhiveBackupInfoIf.getBackupLSNs() clearer about the return
value of XhiveBackupInfoIf.getBackupLSN().
• Fixed a problem where the bootstrap log record generated for
XhiveLibraryIf.removeBinding(String) was wrong.
• Fixed a recovery issue which might happen if the system crashes at the moment of Lucene
segment shrinking.
• Fixed a recovery issue which might cause bootstrap file inconsistency if the system crashes.
• Fixed an issue with read-only federations that caused a NPE when obtaining segment information
for detachable libraries.
• Fixed an issue with read-only federations where an attempt would be made to modify temporary
segments.
• Fixed an issue with read-only federations where an attempt would be made to update the
bootstrap file.
• Fixed a bug which in rare cases could cause a federation backup to hang.
• Fixed a deadlock which occurred when a user thread and a system thread both tried to kill the server.
• Made XhiveLibraryIf.attach(String, String, String, XhiveFederationFactoryIf.SegmentIdMapper)
official.
• The startup logic will now try to fix violations of data file naming convention.
• Indexes: Deprecated com.xhive.index.interfaces.TokenMetadata and introduced a custom xDB
Lucene attribute com.xhive.index.interfaces.XhiveWeightAttribute to be used instead.
• Indexes: Fixed an issue where non-Administrator users could not create/delete multipath indexes.
• Indexes: Fixed an issue where multipath indexes ignored date-times with timezone.
• Indexes: Support for unique keys option in the concurrent B-tree indexes.
• Indexes: Support for concurrent multipath indexes.
• Indexes: Support for intra-collection parallelism in multipath indexes.
• Indexes: Support by IndexInConstructionList and IndexInConstruction of all of the index types,
except for LibraryID and LibraryName.
• Indexes: Support for concurrent indexing session of IndexInConstruction.
• Indexes: Added moveToIndexList(threshold) method to IndexInConstructionListIf, which
attempts to index a number of nodes before moving the index.
• XQuery: Updated to the XPath and XQuery Functions and Operators 3.0 spec of 21 May
2013. Note: All functions that operate on function items as their arguments have changed.
Functions fn:map and fn:map-pairs are renamed to respectively fn:for-each and fn:for-each-pair.
In addition, the signatures of fn:filter, fn:fold-left and fn:fold-right have changed. As a result,
the old functions fn:map, fn:map-pairs, fn:filter, fn:fold-left and fn:fold-right as defined in the
previous spec version (08 January 2013) are no longer supported.
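The renaming above maps the old higher-order functions onto the now-standard names. As a rough analogy (in Java, since the xDB API is Java-based; the helper names here are illustrative, not part of any API), fn:for-each behaves like a map over a sequence, and fn:for-each-pair like a pairwise combination that stops at the shorter sequence:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

public class ForEachAnalogy {
    // Analogue of fn:for-each(seq, f): apply f to every item (formerly fn:map).
    public static <T, R> List<R> forEach(List<T> seq, Function<T, R> f) {
        List<R> out = new ArrayList<>();
        for (T item : seq) out.add(f.apply(item));
        return out;
    }

    // Analogue of fn:for-each-pair(a, b, f): pairwise application, stopping at
    // the end of the shorter sequence (formerly fn:map-pairs).
    public static <A, B, R> List<R> forEachPair(List<A> a, List<B> b, BiFunction<A, B, R> f) {
        List<R> out = new ArrayList<>();
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) out.add(f.apply(a.get(i), b.get(i)));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(forEach(List.of(1, 2, 3), x -> x * 10));   // [10, 20, 30]
        System.out.println(forEachPair(List.of(1, 2), List.of("a", "b", "c"),
                (x, s) -> x + s));                                    // [1a, 2b]
    }
}
```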
• XQuery: Support for XQuery 3.0 Decimal Format Declarations and fn:format-number.
• XQuery: Support for XQuery Copy-Namespaces declaration.
• XQuery: Support for parallel execution of fn:for-each.
• XQuery: Added extension function xhive:get-metadata-keys to get the metadata keys of one
or more documents.
• XQuery: Changed the default implicit timezone from local time to PT0H. Warning:
this change may lead to different results for XQuery functions that depend on the implicit
timezone, like fn:current-dateTime, fn:adjust-dateTime-to-timezone and fn:implicit-timezone.
Refer to the manual sections on option xhive:implicit-timezone and ’Indexes and timezones’
for more information on the subject.
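To see why the implicit-timezone change can alter query results, note that a dateTime without a timezone gets the implicit timezone attached before it is compared against a dateTime that has one, so the outcome depends on which zone is implicit. A minimal java.time sketch (illustrative only, not xDB API code; the +04:00 "local" offset is an arbitrary example):

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

public class ImplicitTimezoneDemo {
    // A dateTime literal without a timezone gets the implicit timezone
    // attached before comparison; which zone is implicit changes the outcome.
    public static OffsetDateTime withImplicitZone(LocalDateTime local, ZoneOffset implicit) {
        return local.atOffset(implicit);
    }

    public static void main(String[] args) {
        LocalDateTime noTz = LocalDateTime.parse("2013-06-01T12:00:00");
        OffsetDateTime explicit = OffsetDateTime.parse("2013-06-01T10:00:00+00:00");

        // Old default: implicit zone = local time (say, +04:00 here).
        boolean oldResult = withImplicitZone(noTz, ZoneOffset.ofHours(4)).isAfter(explicit);
        // New default: implicit zone = PT0H (+00:00).
        boolean newResult = withImplicitZone(noTz, ZoneOffset.UTC).isAfter(explicit);

        System.out.println(oldResult); // false: 12:00+04:00 is 08:00Z, before 10:00Z
        System.out.println(newResult); // true:  12:00+00:00 is after 10:00Z
    }
}
```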
• XQuery: Fixed an issue where comparison of two dateTimes was incorrect if only one had a
timezone.
• XQuery: Fixed a multi-path index bug causing wrong score to be calculated in XQueries that
used score but did not order by score.
• XQuery: Fixed a bug where usage of a for clause with ’allowing empty’ could throw an
ArrayIndexOutOfBoundsException.
• XQuery: Fixed five XQuery pretty printer issues, some of which could make an XQuery
unparsable.
• XQuery: Added documentation of the xhive:return-blobs XQuery option to the manual.
• XQuery: Fixed an issue where a user-defined function throwing an fn:error caused another
exception without the original fn:error message.
• XQuery: Fixed namespace support in the xhive:index-paths-values XQuery option.
• XQuery: Fixed a bug where fn:doc-available on an empty library caused a NPE.
• XQuery: Added interface XhiveXQueryParallelJobIf to access sub-query info during parallel
query execution.
• XQuery: Added extension function xhive:version-id($doc) to be used in conjunction with
xhive:collection-*-date() functions.
• Admin client, command line client and ant: added support to access a federation through a
federation set (description file or server). The path to such a federation contains a hash. The part
before the hash should point to the set, the part after the hash should navigate to the federation.
When creating a federation bootstrap file, its path should not contain a hash.
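The hash convention described above can be read as a simple split: everything before the first '#' addresses the federation set (description file or server), everything after it navigates to the federation. A minimal sketch (the path values are examples only):

```java
public class FederationSetPath {
    // Splits a federation-set path at the first '#': the left part addresses
    // the set, the right part navigates to a federation inside it. A path with
    // no '#' is treated as a plain bootstrap path.
    public static String[] splitSetPath(String path) {
        int hash = path.indexOf('#');
        if (hash < 0) {
            return new String[] { path, null }; // plain bootstrap path, no set
        }
        return new String[] { path.substring(0, hash), path.substring(hash + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitSetPath("xhive://localhost:1235#myFederation");
        System.out.println(parts[0]); // xhive://localhost:1235
        System.out.println(parts[1]); // myFederation
    }
}
```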
• Command line client and Ant: Added new commands (add-file, set-file-maxsize) and new Ant
tasks (<addsegmentfile/>, <setmaxfilesize>). Added a check so that the maxsize of a datafile (if
specified) has to be at least 10 pages.
• Admin client: Now able to parse IPv6 addresses.
• Admin client: No longer lists reserved segments, as it cannot display information for
non-existent data files.
• Admin client: Using Check out/ Edit/ Checkin now preserves existing metadata entries.
• Admin client: It is now possible to edit metadata entries on versions.
• Admin client: Now has functionality for creating/updating indexes with the version info option,
allowing the creation of versioned documents with this structure.
• Command line client: Extended the check-federation, check-database, check-node, and
check-library command-line tools to support consistency checking of federation backups.
• Added support to keep duplicated transaction log files, per server node, in multiple locations.
• Added interface XhiveLogConfigurationIf to manage multiple transaction log file locations.
• Changed the file extension from .log to .wal for transaction log files, and for the
xhive_checkpoint.log and xhive_id.log files as well.
• Command line client: Extended the create-federation and add-node command-line tools to
support multiple log directories.
• Command line client: Added global options --stdout , --stdout-append, --stderr , and
--stderr-append, to allow redirection of standard output and standard error output to a file.
• Command line client: Added options to the backup command to specify included or excluded
segments via a file, by using --include-segments-file or --skip-segments-file
respectively.
• Admin client: Added new functionality: When executing an XPath/XUpdate/XQuery, if the
results remain idle, the underlying session will time out. The user will then be asked whether to
re-execute the action; if not, the results tree will be disabled. The default timeout is
5 minutes and can be changed through the options dialog.
• Admin client: You can now run/kill merge tasks on Multipath indexes using the context menu
of the Index tab.
• Fixed a concurrency issue with the SegmentAccessInfo cache which could cause an NPE while a
front end is retrieving the information for a connection switch.
• Fixed several issues that could result in "incorrect magic number" exceptions.
• Fixed several concurrency issues with the segment cleaner.
• Fixed a data corruption issue related to not removing keys properly from deserialized extended
full-text indexes. (Extended full-text indexes originate from X-Hive/DB 8 and older.)
• Added new functionality which allows consumers to subscribe to monitored statistics of the
cache buffer pool.
• Published an MBean which displays cache buffer pool statistics.
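As a sketch of how such published statistics can be consumed, the following uses only the standard JMX API. The MBean object name, interface, and attributes here are hypothetical stand-ins, not the actual xDB MBean:

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class CacheStatsMBeanDemo {
    // Hypothetical stand-in for a cache buffer pool statistics bean; the real
    // xDB MBean name and attributes may differ.
    public interface CacheStatsMBean {
        long getHits();
        long getMisses();
        double getHitRatio();
    }

    public static class CacheStats implements CacheStatsMBean {
        private final long hits, misses;
        public CacheStats(long hits, long misses) { this.hits = hits; this.misses = misses; }
        public long getHits() { return hits; }
        public long getMisses() { return misses; }
        public double getHitRatio() { return (double) hits / (hits + misses); }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("demo.xdb:type=CacheBufferPool");
        server.registerMBean(new CacheStats(900, 100), name);

        // A monitoring consumer reads the published attributes by name.
        System.out.println(server.getAttribute(name, "HitRatio"));
    }
}
```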
• Added a new method in the XhiveLibraryChildIf interface for storing versioned documents,
which enables historical searches on new versioned documents and libraries,
using new XQuery functions and new indexing options. To make library children
searchable with the new xhive:collection-*-date XQuery functions, they MUST
be created with the new makeVersionable() method's 'queryable' parameter set to
true. Indexing is provided by the new options XhiveIndexIf.VERSION_INFO and
XhiveExternalIndexConfigurationIf.setStoreVersionInfo(true). The admin client currently only
has functionality for creating/updating indexes with the version info option; it does not yet
allow the creation of versioned documents with this structure.
• XQuery: XQuery 3.0 support.
• XQuery: Added xhive:version-date-property function to retrieve date properties from document
versions. The function returns xs:dateTime values.
• XQuery: Fixed an XQuery optimizer bug: an 'order by' query was optimized by an index join,
where the optimizer wrongly assumed that the result of the indexes was already ordered.
• XQuery: Each xquery module of a specific namespace can now consist of multiple files.
For this purpose, a new resolveModuleImports function is added to the XQueryResolverIf
and the existing resolveModuleImport has become deprecated. When using abstract class
AbstractXQueryResolver, the old resolveModuleImport is now called for each single import
location. Depending on the existing customer implementation of function resolveModuleImport,
this may result in different behavior of module resolving.
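The migration pattern described above, a new multi-location method whose default behavior delegates to the deprecated single-location one, can be sketched as follows. The interface and signatures are illustrative only, not the actual XQueryResolverIf API:

```java
import java.util.ArrayList;
import java.util.List;

public class ModuleResolverPattern {
    // Hypothetical resolver shape: signatures are illustrative, not the actual
    // XQueryResolverIf API.
    public interface Resolver {
        /** @deprecated resolves one import location at a time. */
        @Deprecated
        String resolveModuleImport(String namespace, String location);

        // New-style method: resolves all locations of one module namespace at
        // once. The default delegates per location, mirroring how an abstract
        // base class can keep old single-location implementations working.
        default List<String> resolveModuleImports(String namespace, List<String> locations) {
            List<String> resolved = new ArrayList<>();
            for (String loc : locations) {
                resolved.add(resolveModuleImport(namespace, loc)); // called per location
            }
            return resolved;
        }
    }

    public static void main(String[] args) {
        Resolver legacy = (ns, loc) -> "resolved:" + loc; // old-style implementation
        System.out.println(legacy.resolveModuleImports("urn:example",
                List.of("a.xq", "b.xq"))); // [resolved:a.xq, resolved:b.xq]
    }
}
```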
• XQuery: Fixed a multi-path index bug causing wrong score to be calculated in XQueries that
used score but did not order by score.
• Indexes: Implemented a cache of Lucene IndexReader objects, which can significantly
improve the performance of queries that use a Lucene multi-path index.
• Indexes: Fixed an issue with the INCLUDE_DESCENDANTS multi-path
index option that did not distinguish among siblings with the same path even if
ENUMERATE_REPEATING_ELEMENTS was set, causing incorrect content to be indexed.
• Indexes: Fixed the support for full-text logical queries ("LINE contains text ’assemble’ ftand
’draw’") in the multi-path index.
• Indexes: Fixed an issue with index adoption when creating multiple new indexes at once (via
XhiveIndexAdderIf), which could lead to unusable indexes.
• Indexes: Fixed an issue where non-Administrator users could not create/delete multi-path indexes.
• Samples: Added a Spring Web MVC sample.
• Samples: Made it clearer how to run the J2EE/Spring samples.
• XQuery: Added XQuery 3.0 support, including function expressions, external variable
declaration default values, and annotations (no supported annotation implementations).
• XQuery: added XQuery 3.0 functions: fn:function-name, fn:function-arity, fn:function-lookup.
• XQuery: added collation support for order by clauses
• XQuery: The XQuery optimizer now supports ordering by multipath indexes. When using
multipath indexes, order by clauses support ascending/descending, empty least/greatest,
and collations.
• XQuery: added parallel and non-parallel query evaluation support for queries with order by
clauses addressing multiple roots or library sequences
• XQuery: xhive:evaluate supports specifying values for external query variables.
• XQuery: Fixed multiple XQuery optimizer bugs with negated conditions using not().
• XQuery: Fixed a bug where a wildcard search on a filtered term caused a NullPointerException.
• XQuery: Fixed an XQuery optimizer bug where the result of or-ing two ordered index results
was considered ordered.
• XQuery: Fixed a number of concurrency issues in the parallel query implementation.
• XQuery: The call function of XhiveXQueryExtensionFunctionIf used as a highlighter has one
additional argument containing position information. Depending on the custom implementation of
the highlighter, this change may cause backward compatibility issues.
• XQuery: Fixed an XQuery optimizer bug that caused the xhive:ignore-indexes option to be
ignored in certain cases.
• Samples: New sample DeleteDatabase.java shows how to delete a database.
• Samples: New sample MultithreadedOperations.java shows how to perform multithreaded
read/write operations and how to handle exceptions thrown in different threads, specifically
how to handle LockNotGrantedExceptions.
• Samples: New sample CreateMultinodeDatabase.java shows how to create and handle
multi-node databases.
• Samples: Replaced occurrences of ’xhive:fts’ and ’ftcontains’ with ’contains text’ within
XQueries.
• Samples: Implemented correct session handling in samples, ensuring that transactions are
committed at the end of each sample, or rolled back if an exception occurs.
• Samples: Removed parser.setParameter("namespace-declarations", Boolean.FALSE); since we
encourage users to use a namespace when parsing a document.
• Samples: Replaced old document parsing API with new DOM API. When parsing documents,
use LSParser.parseURI instead of XhiveLibraryIf.parseDocument.
• Samples: When using an LSParser set an error handler to display errors which occur during
parsing.
• Samples: Replaced occurrences of getAttribute with getAttributeNS and of setAttribute with
setAttributeNS.
• Samples: Replaced old syntax for use of wildcards in XQueries with new one.
• Samples: Replaced old syntax for use of ’And’, ’Or’ ... in XQueries with new one.
• Fixed an issue where the segment may remain in the segment cleaner’s queue after being marked
as unusable.
• Fixed an issue where the segment may remain in the segment cleaner’s queue after being
removed with forceDelete().
• Added retry logic for RPC requests sent to libraries that are bound to more than one node. The
request is now automatically retried on the next binding node if the previous one fails.
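The retry policy can be sketched generically: try each binding node in turn, falling through to the next one when a request fails, and rethrow the last failure only if every node fails. This is an illustrative sketch, not the xDB implementation; the node names and request function are made up:

```java
import java.util.List;
import java.util.function.Function;

public class BindingRetryDemo {
    // Try the request against each binding node in order; on failure, remember
    // the exception and move on to the next node.
    public static <R> R sendWithRetry(List<String> bindingNodes, Function<String, R> request) {
        RuntimeException last = null;
        for (String node : bindingNodes) {
            try {
                return request.apply(node);
            } catch (RuntimeException e) {
                last = e; // fall through to the next binding node
            }
        }
        throw last != null ? last : new IllegalStateException("no binding nodes");
    }

    public static void main(String[] args) {
        // node-1 is simulated as down; the request succeeds on node-2.
        String reply = sendWithRetry(List.of("node-1", "node-2"), node -> {
            if (node.equals("node-1")) throw new RuntimeException("connection refused");
            return "ok from " + node;
        });
        System.out.println(reply); // ok from node-2
    }
}
```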
• Fixed a potential deadlock in XhiveLibraryIf.changeBinding(String).
• Fixed a repair tool bug where it may modify read-only indexes.
• Added a new attribute named 'reserved' for segments. Reserved segments are those whose data
files are not present but whose segment records are kept in the bootstrap file, so that those
records are available when users restore a library which occupies them. Currently, only MultiPath
Index segments can be 'reserved'.
• Replaced the JDK writeUTF()/readUTF() calls in socket communication with an implementation
that can send/receive more than 64K bytes.
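The limitation worked around here is that DataOutputStream.writeUTF stores the encoded length in an unsigned 16-bit field, so it cannot send strings whose modified UTF-8 encoding exceeds 64K bytes. One simple replacement, shown as a sketch rather than the xDB implementation, prefixes the UTF-8 bytes with a 4-byte length instead:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class LongUtfDemo {
    // Writes the string as a 4-byte length followed by its UTF-8 bytes,
    // avoiding writeUTF's 16-bit length field.
    public static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes("UTF-8");
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    public static String readLongString(DataInputStream in) throws IOException {
        byte[] bytes = new byte[in.readInt()];
        in.readFully(bytes);
        return new String(bytes, "UTF-8");
    }

    public static void main(String[] args) throws IOException {
        String big = "x".repeat(100_000); // too long for writeUTF

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeLongString(new DataOutputStream(buf), big);
        String back = readLongString(new DataInputStream(
                new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(back.equals(big)); // true
    }
}
```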
• Writes on temporary segments are allowed even if all writes are suspended.
• Added support for (optional) FIPS 140-2 encryption.
• The "format-pretty-print" LS Serializer option honors xml:space="preserve".
• Admin client: Fixed an issue where the "Open in browser" option could cause a NullPointerException.
• Admin client: Changed "Select active federation" shortcut to ’Ctrl+o’.
• Admin client: New look and feel, including new icons.
• Admin client: Added progress bar functionality.
• Admin client: Added Lucene segment cleaning functionality. Segments can be cleaned at the
database level or the library level.
• Admin client: Index, Metadata, Properties and Multipath index subpath tables can be sorted.
• Admin client: Consistency check on blob no longer offers check index or check dom node
options.
• Admin client: Removed "Cancel" button from error messages.
• Admin client: Adding more toolbar commands.
• Admin client: Changed administrator user icon.
• Admin client: Errors in database backup dialog no longer cause backup file corruption.
• Admin client: Added check box to "Create segment" dialog in order to apply unlimited storage.
• Admin client: Added check box to "Superuser login" dialog in order to remember previously
entered password.
• Admin client: Connecting to a chained remote client no longer causes a freeze.
• Admin client: When connecting locally to a multi-node configuration, server node management
is no longer accessible.
• Admin client: When creating a segment only active nodes are available for binding.
• Admin client: Added configurable path mapping functionality to restoration.
• Command line client: Added ’webserver-disable’ option to ’run-server’ command, which when
used stops the webserver from being deployed on server startup.
• Command line client: When running a non-primary node and providing a bootstrap file, that file
will be used instead of the one in the properties file.
• Command line client: Statistics commands can now be run while the server is running in regular
mode.
• Command line client: When running consistency check commands, if no check options are
specified then all options will be checked.
• Command line client: When running consistency check commands with options which are not
supported, we no longer fail the check, but print out which options are not supported.
• Command line client: Added the statistics-ls command, allowing printout of information on a
requested index and all of its subindexes.
• Command line client: Added the repair-shrinksegments command. This allows shrinking index
segments which are not referenced by the record but are still not empty.
• Command line client: Added the repair-set-usable command. This allows you to set whether a
multipath index is usable, that is, whether the index will be used in queries.
• Command line client: When segment creation fails, a message is now printed stating that the
segment was not created.
• Command line client: Added clean-library command. This allows you to clean lucene segments
on a library level.
• Command line client: Added clean-database command. This allows you to clean lucene
segments on a database level.
• Command line client: You can now recall previous command lines using the backspace
or back key.
• Command line client: Each time you run the run-server command, it re-reads the xdb.properties
file. Any changes made to xdb.properties will apply to the new server.
• Command line client: When running the command line client on Windows, only some properties
are cached as system variables.
• Command line client: Added tab completion for xDB commands.
• Command line client: Added relative and configurable path mapping functionality to restoration.
• Command line client: To enjoy full functionality when running Windows 2008R1
or Windows 7, you must install the Microsoft Visual C++ Redistributable packages. Refer to the
Installing section of the manual for specifics.
• Implemented XhiveExternalIndexMBean, which is deployed when a primary server is deployed.
This MBean supplies its consumers with notifications on external index merge operations,
such as whether the merge is final or not, which index is being merged, and the state of the merge.
• Consistency Checker: Extended the ConsistencyCheckerResult API with boolean
isCheckApplicable(DatabaseCheckOptions option) and boolean areChecksApplicable().
• Consistency Checker: When performing a check for
CHECK_SEGMENTS_ADMINISTRATION_STRUCTURES options, now checks that size tree
extents are contained within block tree.
• Consistency Checker: When performing a check for
CHECK_SEGMENTS_ADMINISTRATION_STRUCTURES options, extra files in a
segment no longer produce a false positive.
• Federation backup with the BACKUP_KEEP_LOG_FILES option is only applicable to standalone
backups.
• New documentation: xDB Administration Guide (xDB_admin_guide.pdf). This Guide is
intended as a convenience for readers who do not need specifics on java software development
with xDB, for example system administrators and end users of xDB-based applications. Omitting
developer-specific information, this Guide has about half as many pages as the xDB manual.
• Fixed a force-recovery issue where it may be unable to identify the correct affected data file(s) if
the redo of a LOGTYPE_NAMEBASE_SPLIT record fails.
• Fixed a concurrency issue with the suspension of disk writes that caused XQuery queries to block
because of waiting Lucene asynchronous tasks.
• Fixed Final Merge issue where log truncation was prevented, causing the disk with the logs
to fill up.
• Increased installation defaults for minimum and maximum memory to 128 MB and 256 MB, respectively.
• Changed request type from byte to short to accommodate more RPC commands.
• No longer begin transactions on all secondary nodes at the start of a read-only transaction.
• Fixed a false alarm of SEGMENT_BEING_DETACHED in multi-node environment.
• Fixed a restore issue where existing data files of the segments that are excluded from the backup
get overwritten by the empty files created by the restore procedure.
• Ensured that temporary pages are written in a temporary segment whenever possible, to avoid
issues.
• Fixed restore issue which could lead to an unsupported operation exception.
• Fixed bug in creation of sorted indexes which could corrupt the database.
• Fixed a force-recovery issue where log records are mistakenly generated when segments are
marked as unusable.
• Fixed NPE associated with checking the consistency of a server node.
• Fixed xDB log record issue which added zeroes to the end of the log and could, in case of a crash
at that time, stop xDB server from starting up.
• Fixed bug in xDB installer concerning server and webserver listen addresses.
• Fixed bug which prevented some value index updates if a path value index was also in scope.
• Removed two public APIs: XhiveFederationIf.registerReplicator(String, String) and
XhiveFederationIf.unregisterReplicator(String, String) which were never usable.
• Added support for custom loggers in xdb.properties.
• Added support for separately setting the web administration address (webserver-address).
• Fixed a NPE in xhive:evaluate() when the context item was undefined.
• The transaction is no longer automatically rolled back if the connection breaks during an RPC
call. Instead, the transaction will remain open and the user will have to decide whether to abort
it when receiving an IO_ERROR exception in such a case.
• Fixed a potential deadlock (involving the control session and error channels) which might cause
the remote server to hang in extremely rare cases.
• XQuery: added support for the XQuery Full Text weight option (works with multi-path indexes
only).
• XQuery: added thesaurus support for full text xqueries.
• XQuery: added XhiveXQueryFilterIf profile information to Query Plan Profile.
• XQuery: xhive:metadata function can now be called on any context node. If the context node is
a descendant of a document node, the function is executed on the metadata of the document node.
• XQuery: Path value indexes with metadata conditions are used by the xquery optimizer.
• XQuery: Fixed a regression where XQueries with a [..[last()]] predicate caused a
NullPointerException.
• XQuery: Fixed a regression where XQueries with a conditional expression that contains a
context-independent subexpression caused a NullPointerException.
• XQuery: Fixed multiple cardinality matching bugs.
• XQuery: A for, let, where, or order by clause containing an updating expression now throws an
exception.
• XQuery: Fixed a bug where a wildcard search on a filtered term caused a NullPointerException.
• No longer begin transactions on all secondary nodes at the start of a read-only transaction.
• Fixed a false alarm of SEGMENT_BEING_DETACHED in multi-node environment.
• Fixed a restore issue where existing data files of the segments that are excluded from the backup
get overwritten by the empty files created by the restore procedure.
• Fixed a force-recovery issue where log records are mistakenly generated when segments are
marked as unusable.
• Removed two public APIs: XhiveFederationIf.registerReplicator(String, String) and
XhiveFederationIf.unregisterReplicator(String, String) which were never usable.
• Administration: A web-based xDB Administrator Client has been introduced. It supplements
the existing administration tools by also allowing various tasks to be performed from within
a web browser. This new web-based client is started on port 1280 by default when running
the xDB server.
• Added new XhiveIndexInConstructionIf that allows creating a multi-path index without locking
the library.
• Indexes: Added XhiveIndexListIf.definitionsToXml() and
XhiveIndexListIf.addFromXml(XhiveDocumentIf) to export all index definitions from an
index list, and to create indexes based on such files. Admin client, and Ant task support (see
<batchindexadder/> and <listindexes/>) also added.
• Ant: New task <metadata/>; <librarydelete/> now accepts a "path" attribute for greater
convenience; <listindexes/> and <batchindexadder/> were updated (see note about
XhiveIndexListIf.definitionsToXml()).
• An XhiveSessionIf can now detect a federation replacement. This may affect existing application
test suites that replace federations and reuse the sessions.
• Increased stability of multi-path indexes by fixing several small issues.
• Indexes: Significant library children traversal speedup (when using a library id or library name
index). Libraries created before 10.1 need to rebuild their library indexes to take advantage of this.
• Increased speed of iterating over an XhiveNodeListIf.
• XQuery: simple value indexes will now be used for path expressions without comparisons.
• XQuery: index supported order by expressions will now be correctly optimized when used
multiple times. Also available in the Admin client.
• XQuery: new utility com.xhive.query.interfaces.XQueryPrettyPrinter to indent and highlight
XQuery statements.
• XQuery: Update queries on versioned documents now throw an exception.
• XQuery: added extension function xhive:document-id to retrieve the id of a document.
• When restoring backups that contain multi-path indexes, if a PathMapper is used, it will also be
used to remap the location of the multi-path indexes.
• Known issue: When indexing sub-paths (XhiveSubPathIf) with type DATE_TIME, timezones
will be ignored in range searches. For best results, we suggest normalizing the date-time
data to a single time zone.
• Known issue: Both XhiveAttrIf and XhiveElementIf do not currently implement
ItemPSVI.getSchemaValue().
• Scoring in multi-path indexes is optimized for very large indexes: to save memory, text length
is not stored for score normalization. There is currently no option to re-enable length
normalization directly, but you can effectively re-enable it by setting the score-boost of all
sub-paths to the same value (the value must differ from the default). For example, setting the
boost of all sub-paths to 1.01 gives better normalization in all your scores.
• Tools: many small fixes to XhiveFtpServer for commands such as store, rmdir, rename and list.
• Tools: the xdb command line "xdb consistency-check" has a new option "deadobjects".
• Tools: the xdb command line "xdb backup" has a new option "include-segments".
• Documentation: improved xDB manual structure, and minor content changes in the introductory
chapters.
• Fixed a multi-path index recovery issue which might result in a DEAD_OBJECT exception.
• XQuery: Parallel query evaluation support for queries addressing multiple roots.
• Added support for a new weighted score boosting model through the
XhiveWeightedFreshnessBoostIf interface.
• Added support for the full-text fuzzy search option.
• Fixed a segment cleaner thread starvation problem in concurrent environments.
• Fixed a SERVER_TERMINATED error caused by the segment cleaner thread writing to
READ_ONLY segments.
• Fixed a data file handling issue which could cause the server to hang while being shut down if
some data files are missing.
• Fixed an NPE associated with layered import of modules.
• Increased stability of running xDB on multiple nodes.
• Increased stability of xDB's client-server protocol.
• Upgraded the included Xerces-J library to version 2.11.0. Changes in this version affect the
XMLResourceResolver: the value of XMLResourceResolver.getLiteralSystemId() is now equal
to XMLResourceResolver.getExpandedSystemId().
• Hot backups taken on xDB 10.0 (or earlier) are not compatible with xDB 10.1. To restore a
database or federation from such a backup, restore it using xDB 10.0, cleanly shut down the
server, and then upgrade to xDB 10.1. Alternatively, create new backups on xDB 10.1.
• Fixed a concurrency issue in the algorithm for determining the database log lower bound for the
active transaction set, which was causing fragmentation of concurrent indexes under certain
conditions.
• Support for IPv6-style bootstrap URLs added.
• Tools: Ant tasks for creating new indexes (including <batchindexadder/>) have a new attribute
"exists" to specify what to do in case of a name collision. Notice that batchindexadder used to fail
by default if a name collision happened; now it will "skip" by default.
• XQuery: update queries on versioned documents now throw an exception.
• XQuery: fixed a bug where a full-text XQuery with scoring could throw an exception when
using an index.
• XQuery: Fixed regression where full-text queries against path-value indexes would search for
tokens as a prefix and not as an exact match.
• XQuery: Known Issue: Path value indexes using multiple keys where one key is full-text, such
as "a[b + c<FULL_TEXT::>]", will not use the index to resolve queries if the query does not
specify the value of "b". For example, "a[b and c contains text ’foo’]". This is a regression from
xDB 9.X. In xDB 10.0.1 this query would trigger a NullPointerException.
• Fixed a performance issue in changing the state of, or backing up, a library with a huge number
of documents beneath it.
• Fixed a bug where a secondary node might attempt to connect to itself when it should connect to
the primary to report corrupt segments.
• An xDB federation can now be distributed over multiple server machines. See manual chapter
"Configuring Multiple Backend Servers".
• There is an interface to find information about an existing backup file. See
XhiveFederationFactoryIf.getBackupInfo.
• Added possibility of compressing Text and CDATASection of imported nodes. See
XhiveNodeCallBackIf for details.
• Added a new method XhiveLSParserIf.parseIntoDocument to parse into an existing document. This
method does not have all the features of LSParser.parseWithContext, but is more efficient
in time and memory use.
• Databases can now be created with a concurrent root library. See
XhiveFederationIf.DatabaseOption.CONCURRENT_ROOT.
• An xDB instance can now be enlisted as a resource with an XA transaction manager. See
XhiveDriverIf.getXAResource.
• All new namebases are now concurrent. The options
XhiveLibraryIf.CONCURRENT_NAMEBASE and
XhiveLibraryIf.DOCUMENTS_DO_NOT_LOCK_WITH_PARENT are now ignored and
have been deprecated.
• Added RPC tracing support when running in client/server mode. See
XhiveSessionIf.enableRPCTracing.
• For debugging purposes, a driver can now be given a name using XhiveDriverIf.setName.
• Restore operations can now optionally overwrite existing files. See XhiveRestoreCallbackIf.
• Changed hash code definition for XhiveXQueryValueIf. Atomic number values will now map to
the same hash code even if they are not of the same type (xs:double, xs:decimal, xs:integer etc.).
This means that they will be considered equal in hashmaps and similar containers.
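As an illustration of this change, plain Java gives the Integer, BigDecimal, and Double representations of the same number different hash codes; normalizing to a single numeric representation before hashing yields the behavior described. The sketch below is one possible approach, offered as an assumption, and is not xDB's actual implementation.

```java
import java.math.BigDecimal;

// Illustrative sketch only (not the xDB implementation): normalizing every
// atomic numeric value to double before hashing makes xs:integer 3,
// xs:decimal 3.0 and xs:double 3.0 produce the same hash code, so they are
// treated as equal keys in hash-based containers.
public class NumericHashSketch {
    static int numericHash(Number value) {
        // Double.hashCode gives a stable hash for the normalized value
        return Double.hashCode(value.doubleValue());
    }

    public static void main(String[] args) {
        System.out.println(numericHash(3));                     // xs:integer-like
        System.out.println(numericHash(new BigDecimal("3.0"))); // xs:decimal-like
        System.out.println(numericHash(3.0d));                  // xs:double-like
        // all three lines print the same value
    }
}
```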
• Index lookups will now be done with values in sorted order to enhance cache use. Predicate
evaluation with equality comparisons on very large sets is now faster.
• Added a cache of search string tokens. It avoids retokenizing the same string if it is used twice
within a single query.
• For performance reasons, all index lookups from XQuery will be performed ascending unless
an order by expression explicitly sorts the result in the other order. This may cause different
result orders for queries using "unordered {}".
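The effect of issuing lookups in ascending key order can be sketched with a tree-backed index: sorting the probe keys first means successive probes touch nearby parts of the structure. The types below are illustrative assumptions, not the xDB API.

```java
import java.util.*;

// Illustrative sketch (hypothetical types, not the xDB API): sorting lookup
// keys before probing a tree-backed index means successive probes land on
// adjacent parts of the tree, which improves cache reuse for large key sets.
public class SortedLookupSketch {
    static List<String> lookupAll(NavigableMap<Integer, String> index, List<Integer> keys) {
        List<Integer> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);                 // ascending probe order
        List<String> hits = new ArrayList<>();
        for (int k : sorted) {
            String v = index.get(k);
            if (v != null) hits.add(v);
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableMap<Integer, String> index = new TreeMap<>();
        for (int i = 0; i < 100; i++) index.put(i, "doc-" + i);
        // results come back in ascending key order
        System.out.println(lookupAll(index, List.of(42, 7, 99, 13)));
    }
}
```

Note the consequence mentioned above: because probes are ascending, results arrive in key order unless an explicit order by requests otherwise.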
• New Ant tasks: <metadata/>, <checkdatabase/>, <checkfederation/>, <checknode/>,
<checklibrary/>, <multipathindex/>, and <xquery/>.
• Added new check-federation, check-database, check-node and check-library command-line tools.
• The admin client now shows a paged view for libraries with many children. This avoids
OutOfMemoryErrors in the client.
• Added database/federation consistency checker interfaces
XhiveConsistencyCheckerIf/XhiveFederationConsistencyCheckerIf. They allow you to check
the consistency of administrative and data pages.
• Added consistency checker support to the Admin Client.
• Integrated the J2EE module into xDB. Added samples showing how to use the module from EJB
and Spring frameworks.
• Added support for the ’times’ filter of the XQFT specification, available at
http://www.w3.org/TR/xpath-full-text-10/#doc-xquery-FTTimes
• Added support for the ’at start/at end/entire content’ positional filters of the XQFT specification,
available at http://www.w3.org/TR/xpath-full-text-10/#ftcontent
• Implemented XQFT feature parity for all full-text indexes.
• If a full-text index analyzer returns multiple tokens with the same position, the tokens will be
joined in an OR query.
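To make this behavior concrete, the sketch below groups analyzer tokens by position and joins same-position tokens (for example, synonyms) with OR. The types and query syntax are illustrative assumptions, not the xDB API.

```java
import java.util.*;

// Illustrative sketch (hypothetical types and query syntax, not the xDB API):
// tokens that share a position -- e.g. an analyzer emitting "car" and
// "automobile" as synonyms at the same position -- are joined with OR, while
// distinct positions are combined with AND (sequence semantics simplified).
public class SamePositionTokens {
    static final class Token {
        final int position;
        final String text;
        Token(int position, String text) { this.position = position; this.text = text; }
    }

    static String toQuery(List<Token> tokens) {
        // group token texts by position, keeping positions in ascending order
        SortedMap<Integer, List<String>> byPos = new TreeMap<>();
        for (Token t : tokens)
            byPos.computeIfAbsent(t.position, p -> new ArrayList<>()).add(t.text);
        StringJoiner and = new StringJoiner(" AND ");
        for (List<String> group : byPos.values())
            and.add(group.size() == 1 ? group.get(0)
                                      : "(" + String.join(" OR ", group) + ")");
        return and.toString();
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
            new Token(0, "fast"),
            new Token(1, "car"), new Token(1, "automobile"));
        System.out.println(toQuery(tokens)); // fast AND (car OR automobile)
    }
}
```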
• Marked XhiveCCIndexIf as deprecated to clearly discourage its usage.
• New index type: multi-path indexes. See the manual for more information.
• XQuery: Added an API XhiveScoreBoostFactorIf to boost library child score.
• XQuery: fixed an ArrayIndexOutOfBoundsException with order by expressions and multiple
for clauses
• XQuery: added optimization for conditions that check multiple children using starts-with in a
some ... satisfies loop, e.g. for $x in ...
where some $y in $x/author/last satisfies starts-with($y, ’Smi’)
return $x
Such conditions can now use index scans.
• XQuery: extended support for libraries. XQueries can now use libraries in sequences ("(doc(’a’),
doc(’b’))//foo") or in variables bound by declare variable or let statements, and path expressions
will still pick up indexes at the library level.
• XQuery: order by expressions are now supported across multiple libraries or documents. E.g.
for $x in (doc(’a’), doc(’b’))//test[@price]
order by $x/@price
return $x
will run as an index supported order by if there is an index on @price on ’a’ and ’b’.
• XQuery: Access to variables using positional predicates (e.g. $variable[5]) is now handled
more efficiently.
• XQuery: Added an API to get the query plan for an XQuery, either as a static
description or including profiling information after execution. See JavaDocs for
XhiveXQueryQueryIf.getQueryPlan() and XhiveXQueryResultIf.getQueryPlan().
• XQuery: relaxed the Update Statement Placement rules by default; strict checking can be enabled
through XhiveXQueryCompilerIf.setStrictUpdateExpressions.
• XQuery: replace value of $node with $value will now always convert $value to a single
string instead of appending nodes; this reflects a change in the specification. This also applies to
xhive:replace-value-of. WARNING: this might break existing XQuery code!
• XQuery: Java module functions can now also declare parameters as XhiveElementIf, XhiveAttrIf
or XhiveDocumentIf, booleans, and java.util.Calendar.
• The toString() and toXml() methods of DOM nodes will now serialize the XML fragment with
namespace support, i.e. missing declarations will be inserted where necessary.
• XQuery: errors triggered using the function fn:error($name as xs:QName?,
$message as xs:string, $values as item()*) will now cause the exception
com.xhive.error.xquery.XhiveXQueryUserException. This exception class provides accessors for
the QName and values list.
• XQuery: fixed a large number of issues with the regular expression engine. Changed
exception subtype for illegal replacement strings from XhiveXQueryParseException to
XhiveXQueryErrorException.
• Moved configuration file xdb.properties to the subdirectory conf, adjusted it to be a proper Java
properties file. This will be done automatically by the Windows installer.
• Server log output will be written to log/server-out.log and log/server-err.log instead of the bin
directory.
• New implementation of parallel queries using lazy evaluation strategy.
• New samples: CreateMultiPathIndex, LibrariesAsVariable and BoostLibraryScore.
• Fix: updates of indexes indexing attributes of nodes without children could fail in some cases.
• Fix: problem with wildcard queries evaluation that could crash the query, or that could cause a
given node id to be returned more than once from the index.
• Fix: memory leak when using XQFT’s window and distance operators.
• Tools: The xdb xquery command can now read the XQuery from a file, using the --file option.
See xdb help xquery for details.
• XQuery: adjusted the function signature of fn:put() to always require the second argument $uri.
Use the empty string to store documents without a name.
• Known issue: Multi-path indexes are implemented as external indexes, and the backup and restore
operations of these indexes do not make use of the PathMapper.
• Command line: added the cd and mv commands to the command line client
• XQuery: fixed an issue where tail-callable functions might produce values in the wrong order if
the same call site was evaluated in parallel (e.g. through mutual recursion).
• Implemented an interactive shell for the database. The shell can be started using the executable
’xhive’ or ’xhive.bat’ respectively. It replaces the older command line tools (like XHBackup),
which are still in place for backwards compatibility. All previous commands are supported, plus
new commands to manage database contents (list, remove, import, XQuery, ...). See ’xhive help’
or the Administering xDB section in the manual for more details.
• XhiveXQueryCompilerIf has been extended to support setting default values for certain XQuery
prolog settings, including external functions, imported modules, variables, options, etc. See the
JavaDocs for more details.
• The default revalidation mode has been set to ’skip’ (previously ’lax’) for performance reasons. The
setting can be overridden within the query using ’declare revalidation (skip|lax|strict)’ or using
XhiveXQueryCompilerIf.setRevalidationMode(...).
• Fixed a bug where specifying the Unicode codepoint collation would use a collator for the
current system locale instead of the correct codepoint comparison method.
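The distinction matters because codepoint order and locale-sensitive order genuinely differ. A minimal demonstration using standard Java classes (this is background context, not xDB code):

```java
import java.text.Collator;
import java.util.Locale;

// Demonstrates why the fix matters (standard Java, not xDB code): under
// Unicode codepoint order 'a' (U+0061) sorts after 'B' (U+0042), but an
// English-locale Collator ranks "a" before "B", because letter identity
// outweighs case at the primary comparison strength.
public class CollationSketch {
    public static void main(String[] args) {
        System.out.println("a".compareTo("B") > 0);  // true: codepoint order
        System.out.println(
            Collator.getInstance(Locale.ENGLISH).compare("a", "B") < 0); // true: locale order
    }
}
```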
• Implemented XQuery Update copy/modify expressions
• Changed xhive:insert-document() and fn:put() to overwrite documents instead of giving an error
• Deprecated XhiveXQueryValueException, catch XhiveException instead. This exception
used to signal the error that a full text conjunction setting had an erroneous value in
’xhive:fts-implicit-conjunction’.
• The XHServer tool now registers an instance of XhiveFederationMBean with the platform
MBean server.
• Libraries (XhiveLibraryIf) can now be used as external values within XQuery, as external
variables (via setVariable(...)), when returned from extension functions, Java modules, etc.
• Implemented compression of text and CDATA nodes. Added an API that allows specifying which
nodes to compress.
• Implemented recovery of XA transactions after database crash.
• Partially implemented the XQuery Full Text specification, available at
http://www.w3.org/TR/xpath-full-text-10/. The list of supported features includes
logical full-text operators, the wildcard option, any/all options, positional filters, and score variables.
• Support for scoring has been added to our full-text search engine. The user may influence the
score by changing the Similarity measure used to compute the score. See the manual for details.
• Added optimization of the ’order by $score’ clause: when the result comes from an index
pre-ordered by score, the order by clause does not perform a sort operation. The optimization is
also implemented for parallel execution of full-text queries.
• Added a new debug option, optimizer-debug, to help understand why the optimizer chose (or did
not choose) a particular query plan. Added a new tab to the Admin Client for the optimizer-debug
output; because the output can be very verbose, it is kept separate from the output of any
other debug option.
• Introduced detachable libraries, which can be detached from and attached to the database. See
the manual for details.
• Detachable libraries can be backed up and restored individually. See the manual for details.
• Fixed a potential security issue with Java module imports and XhiveXQuerySecurityPolicyIf. If
the security policy was only set after parsing the query, Java access would not be prevented.
• Fixed a bug in the matching of Phrase queries using wildcards in one of the terms.
• Disallowed rebuilding the "Library ID index" of concurrent libraries, as this would lead to losing
track of the library children.
The list below presents additional known issues and limitations that may affect your use of the
product, and are not discussed elsewhere in the xDB manual.
For developers who want to get up and running quickly with this version of xDB, this section briefly
describes the minimal necessary steps: installation, creating your first database, and running a sample
command to verify the installation. For more detailed information, see Installing xDB, page 49, and
the readme.txt file of the distribution.
Installing and running xDB requires the Sun JDK 7 or a fully compatible Java Virtual Machine (JVM).
1. Install xDB.
• On Windows, you must be logged on with Windows administrator access privileges. The
Windows installer can upgrade an existing xDB version 9.0 or later (see Upgrading xDB
on Windows, page 57). If you have a previous version of xDB running, either use different
directories and port numbers for the new installation, or uninstall the previous version.
Run the xdb_setup.exe file and follow the instructions on the screen.
• On UNIX, you can install xDB under any account. If you have a previous version of xDB
running, either use different directories and port numbers for the new installation, or uninstall
the previous version.
Extract the distribution, run sh setup.sh, and follow the instructions on the screen.
UNIX installation requires some post-installation steps (see Installing xDB on a UNIX platform).
2. To use the code samples that the xDB manual refers to, create a demo database as follows:
a. Start the Admin Client.
You can start it with the xdb admin command in the bin subdirectory of the xDB installation.
On Windows, you can also launch it from the Windows start menu.
b. Select menu option Database > Create database to create a database.
c. Enter united_nations as the database name, the superuser password entered during xDB
installation, and northsea as the administrator password. Leave the other fields unchanged.
d. After creating the database, you can close the Admin Client, unless you would like to use it
later to view the results of running code samples, or to perform other actions.
3. Run a sample.
a. Open a command prompt and navigate to the bin subdirectory of the xDB installation.
b. Use the xhive-ant command to insert two documents into the database:
If the command runs successfully, a message appears stating the number of documents stored
in the database.
Related links
Pre-installation requirements
Installing xDB on a Windows platform
Installing xDB on a UNIX platform
xDB Overview
General features
A backend server for an application is called a page server, because its purpose is to transfer data pages
to front-end applications (also called client applications), which locally perform operations on the data.
In environments where all database accesses are done from a single application server, performance is
usually best when the page server runs in the same JVM as the server application.
A backend server that combines being a page server with other tasks is called an internal server.
A page server that has no other purpose than being a server is called a dedicated server. A dedicated
server, with client/server communication over TCP/IP, can offer better scalability than an internal
server. In a simple internal server setup, one single web application can access the data on the page
server directly, and if other web applications also need access to the data, the first web application
can run an internal server for them. The larger the scale, complexity, and number of different
frontend web applications that share a single page server, the more likely it becomes that a
dedicated server is preferable.
The optimum numbers of clients and page servers will depend on the characteristics of the solution,
including how data intensive it is. As the numbers of data retrievals or ingestions increase, more
front-end application servers and/or more backend server nodes may become advisable. XQuery
operations retrieve data pages from server to client side, and then process the query on the client side.
So, query-intensive applications may benefit from additional servers both at client side and at server
side. Ingestion operations parse XML documents on the client side and then pass newly created
data pages to server side. So, in case of a high ingestion workload, client and server sides may both
benefit from additional servers.
If multiple backend servers are required, there is a choice between two different features:
• Replication dynamically maintains one or more complete copies of an entire ’master’ data set on
one or more separate page servers. These copies are called replicas. Read-only transactions and
online backups can be offloaded to the replicas to distribute query load. For more information, see
Replicating Federations, page 289.
• Multi-node distributes a data set over multiple node servers, using Detachable libraries, page 46. It
allows the application workload to be spread over multiple backend servers. For more information,
see Configuring Multiple Backend Servers, page 297.
Supported standards
Implemented and extended recommendations of the World Wide Web Consortium (W3C) for querying,
retrieving, manipulating, and writing XML documents include:
• Document Object Model (DOM)
– Level 1
– Level 2 (Core and Traversal)
– Level 3 (Core and Load/Save)
• Extensible Stylesheet Language Transformations (XSLT)
• XQuery
• XPath
• XLink
• XPointer
Implementation of an extended DOM level 3 interface provides for manipulating content, structure,
and style of documents. All DOM level 3 functionality is supported, including functions for retrieval,
modification and navigation within XML documents.
Since DOM level 3 does not support XML collections for handling more than one document, extended
API functions provide support for processing multiple documents simultaneously. Documents are
collected in libraries, which are implemented as DOM nodes. You can store libraries within other
libraries in the same way that you store documents within libraries. All operations on documents
(including XQuery queries) can also be performed on libraries.
A transformation engine that uses XSLT, a language for transforming XML documents into other
XML documents, makes it possible to transform XML into such formats as HTML or WML. You
can also publish to the PDF format.
XML Query Language (XQuery) has a string syntax that can address any type of information in an
XML document. XQuery can make selections based on conditions and construct new structures based
on queried information. The XQuery query engine implementation supports XPath and XPointer
queries. For more information, see the chapter about XQuery, page 173.
XLink is a W3C recommendation that enables links between XML documents. For more information,
see Linking documents with XLink, page 40.
Indexing
Various different types of indexes enable faster data access and increase the performance and
scalability of applications. For more information, see the chapter about Indexes, page 150.
Data can be imported and exported. The included SQL Loader uses Java Database Connectivity
(JDBC) to import data from relational databases, sequential files and other non-XML sources. An
integrated version of the Xerces parser imports XML documents.
In addition to XML documents, a database can store image files, sound files, Microsoft Office files,
and other file formats as Binary Large Objects (BLOBs). Storing BLOBs and XML documents in the
same database allows managing all resources for a specific project or product in one uniform way.
A transaction mechanism ensures that changes and updates to the database are applied consistently
across the system. If a transaction conflicts with other transactions, all transactions that take place
within one session can be committed or rolled back.
The transaction mechanism complies with the ACID database properties:
• Atomicity: either all actions in a transaction succeed and are made persistent in the database, or
none of the actions succeed.
• Consistency: the view of a database within a transaction is coherent: all read actions on a particular
part of the database return the same value.
• Isolation: changes that are made in one transaction are not visible in concurrent transactions
until the transaction is committed.
• Durability: when data is modified and the transaction succeeds, the data is certain to be written
to the disk.
Access control
User access control is provided in the form of authorization and security support for managing users
and groups, page 47.
Versioning
For applications that need to store multiple versions of documents and BLOBs, xDB offers linear
versioning with branches. Storing multiple document versions allows users to track the changes
in a document and restore older versions. The latest version of a versioned document is used for
retrieval, traversal, querying and indexing. Earlier versions are only searchable if they were created
with a ’queryable’ option.
Branching
A version branch is a sequence of versions which has been separated from the main sequence of
versions. Branches are typically used when multiple users are working in parallel on the same
documents. If a document is checked in and the version has one or more successors, xDB automatically
creates another version branch.
In the example shown below, two users check out document version 1.3 from the head branch. User A
modifies the document, then checks in the document and xDB creates version 1.4. When user B checks
in, the head branch already contains version 1.4 of the document and xDB automatically creates
another branch under the 1.3 branch.
Administration tools
The xDB distribution includes a number of helpful tools, primarily intended for administrative use in
a development environment.
The Admin Client (also known as administration client or administration tool) offers a database
explorer that displays database structure and content, and provides access to a wide range of
management functions, such as backup and restore, user authorization, and checking the physical and
logical consistency of a database or federation. For more information, refer to Admin Client, page 227.
Some key administrative database functions are provided by the new web client, page 246.
Administrative tasks, including backup and restore, can be performed from the operating system
command line, terminals or scripting languages using Command-line tools, page 246.
Some administrative features are also accessible through xDB Ant tasks.
Logical architecture
A page server works with a federation, which contains one or more databases. Databases can hold data
of various kinds, including XML documents, user accounts and other database objects, page 43.
A federation is a container for related databases, to which it provides a single server connection. A
federation is associated with a superuser account, page 42, and a specific location for transaction log
files, page 43. Applications can connect to the federation’s database driver either directly or remotely.
When a calling application creates a driver, it specifies the required federation or page server by
means of a bootstrap, page 69.
If multiple applications need access to the same federation, one application will access the federation
directly, and act as a server for the other applications.
A federation can contain as many databases as needed.
Federations and databases can be created and managed manually with the Admin Client, page 227, or
with the Command-line client, page 246.
Creating a new database requires the password of the superuser account, the password of the
administrator account and the name of the database you want to create.
Note: xDB database names, user names, and passwords are case sensitive. For example, Xhive
and XHIVE are treated as different database names. Passwords must be alphanumeric and from
3 through 8 characters long.
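The constraint in the note can be expressed as a simple check. The regex below is an interpretation of the note's wording (alphanumeric, 3 through 8 characters), offered as a sketch and not code taken from xDB.

```java
// Sketch of the documented password constraint: alphanumeric, 3 through 8
// characters. The regex is an interpretation of the note above, not xDB code.
public class PasswordRuleSketch {
    static boolean isValidXdbPassword(String pw) {
        return pw != null && pw.matches("[A-Za-z0-9]{3,8}");
    }

    public static void main(String[] args) {
        System.out.println(isValidXdbPassword("northsea"));  // true: 8 alphanumeric chars
        System.out.println(isValidXdbPassword("ab"));        // false: too short
        System.out.println(isValidXdbPassword("north-sea")); // false: non-alphanumeric
    }
}
```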
Superuser
A federation has one superuser account with the user name superuser. The superuser account for the
default federation is created during the installation process, and enables initial database configuration.
The superuser can create and delete databases, and perform administrative operations such as setting
the license key and performing backups.
The superuser cannot access regular data such as libraries and documents.
Database objects
A federation can contain one or more databases. A database can contain various types of objects,
including users, user groups, libraries, documents, indexes, catalogs and BLOBs.
When creating a database, the superuser must create a user account for the database administrator. The
database administrator can create user accounts for all other database users. Every user has a database
user account with a unique user name, with a password for access control. Each user account is
assigned a set of permissions that specify the level of access the user has to the database.
A group is an object in the database that contains one or more users. Groups can be used to assign the
same access rights and privileges to multiple users.
Each database has a user list and a group list.
Libraries
A library stores documents and other libraries in a hierarchical structure. Libraries can be stored
within other libraries; the nested structure of libraries in a database is similar to the nested structure
of directories or folders within a file system. The topmost library in the hierarchy is called the Root
library. A database has exactly one root library, which is created automatically.
From a data modeling perspective, you can think of a library as a folder that can contain XML
documents, indexes, catalogs and BLOBs, as well as sub-libraries. Libraries are implemented as DOM
nodes, and all operations on documents, including XQuery queries, can also be performed on libraries.
Documents
A document is an object that stores XML data. The system can handle both valid (that is, conforming
to a structure defined in a DTD or XML Schema) and well-formed XML documents.
XML documents are validated against document type definitions (DTD) or XML Schemas, collectively
called models. Documents using an XML schema are validated differently than documents using a
DTD. The XML validation process can store DTD or XML schema information as an abstract schema
model (ASModel) in a catalog, which is linked to a library. By default, only the root library has a
catalog. Each model in a catalog has a unique ID.
Abstract schemas contain interfaces for handling schema information, such as the structure of
element declarations, and interfaces for applying schema information to DOM validation. There is
full abstract schema support for DTDs and more limited support for XML Schema, with some
product-specific modifications.
BLOBs
Binary large objects (BLOBs) are binary non-XML files, such as image files (GIF, JPEG, PNG,
BMP), sound files (MP3, WAV), and Microsoft Office files (DOC, XLS, PPT). Storing BLOBs and
XML documents in the same database allows managing all resources for a specific project or product
in one uniform way.
Physically, a database consists of one or more segments. Each segment has one or more files and each
file occupies one or more pages. The physical and logical structures relate as follows:
A segment is a logical storage location within a database. Each database always has at least one
segment, the default segment. Segments can be added and empty segments can be deleted. The default
segment can never be deleted without deleting the database.
New libraries and data are stored in the default segment, unless specified otherwise. Data is never
automatically stored in another segment, even if the current segment is full. If you want automatic
overflow, use a single segment with multiple files.
Note: Transactions, even read-only ones, can require temporary data, for example new nodes in
XQuery queries, or old versions of documents. Unless specified otherwise, such data is held in
a temporary segment.
Methods exist for attaching a library to an existing segment, for assigning a segment for temporary
data and for storing all children of a specific library on a specific segment.
All segments of one library must have the same state at any time.
Database files
A segment can be spread physically over multiple files. A segment always has at least one file, the
default file. The default file can never be deleted without deleting its owner segment.
Files can be added to a database segment using the Admin Client.
The maximum size limit of a file can be set when the file is added to the segment, or later, provided the
limit is not less than the current size of the file. If a file exceeds the size limit, the overflow data is
allocated to the next file of the segment. If all the files in the segment have reached their limits, any
further allocation in the segment fails. Allocation is random, and different pages of a single document
can be stored in different files. Use different segments if you want to control where the data is stored.
Note: To keep data consistent, xDB sometimes allocates pages for internal page allocation
administration. In such cases, xDB ignores the maximum file size limit. As a result, the file may
slightly exceed the size limit set by the administrator.
Overflow to another file can be prevented by setting the maximum file size limit to 0 (zero), which
effectively means there is no maximum file size. In practice, setting an unlimited file size is useful
only for the last file.
Note: A file that has no maximum size may outgrow the available space. Preventing this can affect
performance, because the only reliable way to check for a full disk is by actually writing the (still
empty) page to the file while allocating the page.
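The allocation rules above can be modeled roughly as follows. This is a simplified first-fit model for illustration only; as noted, xDB's actual allocation is random across the segment's files, and internal administration pages may exceed the limit.

```java
import java.util.List;

// Simplified illustrative model (not xDB internals): a page goes to a file
// that still has room under its limit; a limit of 0 means unlimited; when
// every file has reached its limit, allocation in the segment fails. Real
// xDB allocation is random across files rather than first-fit.
public class SegmentAllocationSketch {
    static class DataFile {
        final long maxPages;   // 0 = no maximum size
        long usedPages;
        DataFile(long maxPages) { this.maxPages = maxPages; }
        boolean hasRoom() { return maxPages == 0 || usedPages < maxPages; }
    }

    static int allocatePage(List<DataFile> files) {
        for (int i = 0; i < files.size(); i++) {
            if (files.get(i).hasRoom()) {
                files.get(i).usedPages++;
                return i;          // index of the file that received the page
            }
        }
        throw new IllegalStateException("segment full: all files at their limit");
    }

    public static void main(String[] args) {
        // file 0 limited to 2 pages, file 1 unlimited (limit 0)
        List<DataFile> segment = List.of(new DataFile(2), new DataFile(0));
        System.out.println(allocatePage(segment)); // 0
        System.out.println(allocatePage(segment)); // 0
        System.out.println(allocatePage(segment)); // 1: overflow to the next file
    }
}
```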
Database configurations
Initial database configuration is specified by a database configuration file, page 230. The default
database configuration has a single segment with a single file and all data clustered in the default
segment. The Admin Client can be used to change the default configuration, to create or delete
segments and files, and to modify default cluster rules.
Detachable libraries
Detachable libraries can be detached from and attached to the database. Once a library is detached, it
is not accessible from the database.
A library can be detached if the library and its descendants are stored on a set of segments that are
not shared with any other libraries, and the ancestors of the library do not have other indexes than
id and name indexes.
A detachable library can have its own MultiPath indexes. The MultiPath index of a detachable library
can be merged with descendant library levels, but not with parent or ancestor levels.
A detachable library can have the following mutually exclusive states:
• read-write - Both read and write operations are allowed.
• read-only - Only read operations are allowed.
• detached - The library is logically or physically removed from the database and is not accessible
from the database.
• detach-point - Similar state as the detached state, except only a detachable library in a detach-point
state can be attached to the database. A detachable library in detached state cannot be directly
attached to the database.
By default, a detachable library has a read-write status.
Detachable libraries can be marked as non-searchable. A non-searchable detachable library is not
visible to search queries.
Detachable libraries can be backed up and restored individually.
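The state rules above can be summarized in a tiny model. The enum names here are assumptions based on the list of states, not the actual xDB API.

```java
// Tiny illustrative state model (enum names are assumptions based on the
// states listed above, not the xDB API): only a library in the detach-point
// state can be attached back to the database; a library in the detached
// state cannot be attached directly.
public class DetachableLibrarySketch {
    enum State { READ_WRITE, READ_ONLY, DETACHED, DETACH_POINT }

    static boolean canAttach(State state) {
        return state == State.DETACH_POINT;
    }

    public static void main(String[] args) {
        System.out.println(canAttach(State.DETACH_POINT)); // true
        System.out.println(canAttach(State.DETACHED));     // false
    }
}
```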
Pre-installation requirements
xDB installers are available for Windows 2000/XP/Vista/7 and for Unix. xDB can be used with
virtualized versions of these operating systems running in any version of VMware. Unix installers have
been tested on Linux, Solaris, AIX, and Mac OS X, including 64-bit versions of Red Hat Enterprise
Linux edition 6.3, AIX V6.1 TL1, Oracle Solaris 10, and HP-UX 11i v3 Itanium.
Before installing xDB, verify that your system meets the hardware and software requirements for xDB.
We highly recommend that you read the readme.txt file that is provided with the xDB distribution.
If your new xDB installation is intended for a non-standard environment or a specific software
application, like EMC Documentum Content Server with enabled XML Store or EMC Documentum
xPlore, consult the appropriate documentation for possible differences in requirements, installation
procedure and configuration.
If you have previous versions of xDB running, you should consider upgrading or uninstalling them. If
you want to have multiple xDB installations on one system, for example xDB 9.0 and 10.0 side by
side, you must use different installation locations and port numbers for each one.
Note: On Windows, the installer cannot install multiple Windows Services for xDB. For example, it
cannot configure both an xDB 10 and an xDB 9 background service.
Note: The xDB installation layout has changed slightly since version 9: third party jar files are now in
subdirectories within subdirectory lib, xdb.properties is in subdirectory conf, and the page server will
write logs to subdirectory log.
Note: xDB is a Java application, and requires a Sun Java Development Kit 7 or higher, or an SDK
with a fully compatible Java Virtual Machine. Ensure that this is present on the target machine before
you install xDB.
Using the Java command line requires configuring the CLASSPATH variable to include the required
JAR Files, page 68.
Installation parameters
The xDB installer will ask you to provide some parameters, including:
• The location where you want to install xDB.
• The base directory path to the JDK.
• A valid xDB license key. Note: xDB software licenses are valid for a limited time. If in doubt about
your license, contact xDB customer support.
• An xDB superuser password. This password is required for creating databases and some other
administrative tasks.
• Optionally, set advanced parameters, page 54.
Related links
Installing xDB on a Windows platform
Installing xDB on a UNIX platform
1. Double-click the xdb_setup.exe file, located in the root directory of the xDB distribution.
If you want to obtain debug output from the installer, hold down the Ctrl key while running
xdb_setup.exe, until a console window opens for display of debug output. You can preserve the
debug output by copy/pasting it from the console window to a text file before you exit the installer.
The installer starts the xDB installation, and displays the Introduction. The left panel of the
installation window shows the sequence and progress of the installation steps. Use the buttons at
bottom right for navigation.
3. Select the I accept option for the terms of the software license agreement, then click Next.
The Enter license key window appears.
The Choose Install Folder window appears. The default folder is C:\Program Files\xDB or similar,
depending on the locale of your Microsoft Windows installation.
Note: The database files and transaction log files will also be stored here, unless you change their
locations from the installation defaults in Advanced Settings.
5. Choose a folder or type a folder path for the target installation folder, then click Next.
The Choose Shortcut Folder window appears.
6. Choose where shortcuts for xDB should be created by the installer, then click Next.
The Choose JDK Location window displays a list of Java executables found on your system. You
can pick one from the list, or use the Choose Java Executable button or the Search Another
Location button to find the applicable Java Development Kit and set the path to the java.exe file.
9. Specify whether or not to set additional advanced parameters, then click Next.
• To complete the xDB installation with default values for the advanced parameters, select
Proceed with installation. Note: The default storage locations for database files and transaction
journal files are in subfolders of the installation folder.
• To manually change xDB Advanced settings, including paths for database files and transaction
journal files, and JVM and xDB server parameters, select Set some advanced parameters.
If you selected Set some advanced parameters, the xDB Advanced Settings - Database-directory
window appears.
b. Click Next.
The xDB Advanced Settings - Default JVM settings window appears (showing related java
command line parameters in parenthesis).
d. Click Next.
The xDB Advanced Settings - Server settings window appears.
e. Specify the server port number, and the server and client cache sizes. Optionally, you can
allow access by other hosts, and run the internal web server that is required for the xDB web
client, page 246.
The Windows installer creates and starts a Windows Service that acts as an xDB page server.
The following settings apply to this service:
Server settings Description
Server port number The port number of the page server. The page server can run on
any available port. A check is made whether the port is available.
Note: The port number determines the URL which must be used
to access the database.
10. Verify your installation settings and make sure that the available disk space is sufficient.
• If not satisfied, use the Previous button to go back and change your installation choices.
• If satisfied, click the Install button to proceed with the xDB installation.
The installer shows installation progress. After successful installation, the Install Complete window
displays the location of the installation. Note: Complete installation requires a Windows restart.
11. Set options to display documentation or to Restart Windows, and click Done.
Note: On Windows XP or Windows 2008R1, if you wish to use the complete Windows command line
functionality, install the appropriate Microsoft Visual C++ redistribution package, downloadable from:
• http://www.microsoft.com/en-us/download/details.aspx?id=5582 (Windows 32 bit)
• http://www.microsoft.com/en-us/download/details.aspx?id=2092 (Windows 64 bit)
superuser password of the existing installation. The installer needs those for removing the old binaries,
background process, documentation, and so on.
Note: Before upgrading, ensure that you have a backup of your existing installation.
1. Execute xdb_setup.exe from the distribution, and proceed as for a new installation, until the
installer asks for the install location.
2. When asked for the installation location, choose the location of the existing xDB installation
that you want to upgrade.
Note: If you do not specify a valid existing xDB location, the installer will create a new
installation, instead of upgrading an old one.
The installer displays an upgrade notice for the existing installation, for example:
On UNIX platforms, xDB can be installed as any user. You need write permissions in the relevant
installation directories: the directory to which you will untar the distribution and the directory where
you will create your initial federation.
You must have a working Java executable in your PATH. You can verify the Java version by running
java -version from the command line. See also the Pre-installation requirements, page 49.
The xDB installer copies files to the proper directories, configures the xDB installation, and augments
the PATH environment variable.
Note: If you are upgrading a previous xDB version, see Upgrading xDB on UNIX, page 61.
To install xDB on a UNIX platform:
1. Extract the distribution .tar.gz file to the directory where you want to install xDB.
2. Run the sh setup.sh setup script, and enter the parameters that are required during installation.
To obtain debug output from the installer, set environment variable LAX_DEBUG=true or
LAX_DEBUG=file before you launch the installer. The file option redirects debug output to a file
jx.log in the install directory.
For most parameters, the xDB installer displays a default value between brackets. For questions
that require a Yes or No answer, the default choice is shown in upper case. For example, if the
choice is [y/N], N is the default value.
To accept a default, press the Enter key. To override a default, type a new value and press
Enter. If the value is incorrect, a message appears and the question is repeated.
Uninstalling xDB
Depending on the platform on which xDB is installed, the uninstall process is as follows:
• On Windows:
Use the Uninstall xDB option in the xDB section of the Windows Start Programs menu to uninstall
xDB. The Windows uninstall process also unregisters the dedicated xDB page server service. If you
have multiple xDB installations on the same machine, the Add/Remove Programs option in the
Windows Control Panel applies only to the latest xDB installation.
The uninstall process may not be able to delete all data directories, compiled samples, and other
files that were created after xDB was installed. Remaining directories and files can be removed
manually afterwards.
• On UNIX:
If the page server is running, use the xdb stop-server command to stop it. Then remove the
installation and data directories.
Note: The uninstalling.txt file contains plain text instructions for uninstalling xDB.
Command Description
xdb admin Starts the Admin Client, a tool for maintenance of
federations, databases, users and content.
xdb create-database, page 251 Creates a new database.
xdb delete-database, page 246 Removes an existing database.
xdb create-federation, page 250 Creates a new empty federation.
xdb configure-federation, page 246 Sets the superuser password and license key on a
federation.
xdb backup, page 263 Saves a federation to a backup file.
xdb restore, page 264 Restores a federation from a backup file.
xdb info, page 251 Displays debug information on currently open
transactions and their locks.
xdb run-server, page 252 Starts the dedicated server process for the default
federation.
xdb stop-server, page 252 Stops the dedicated server process for the default
federation.
xdb suspend-diskwrites, page 246 Ensures the federation files are flushed to disk, and
suspends or resumes writing.
3. Enter database name united_nations with the superuser password as entered during the xDB
installation. Enter northsea as the administrator password.
Note: This database name and these passwords are used in all code samples. If you want to use
different ones, you must make appropriate changes to the SampleProperties.java file.
4. Click OK.
Note: Databases can also be created using the xdb create-database command. For more information
about using the command line, see Using the command line client, page 246.
The xdb.properties file contains properties (key/value pairs) for command-line settings and page server
settings. The tables below describe available options.
Command-line settings
The command-line settings apply to tools like the xdb command and the xhive-ant command, as well
as to the page server. They can be overridden by setting environment variables with the same names.
Alternatively, the corresponding command-line switch(es) can be passed to the tool.
By default, the command-line settings are in the file xdb/conf/xdb.properties.
Optionally, user-specific properties can be set in the user’s home directory in a file .xdb.properties
(note the leading "." dot).
Note: Changes to server-related settings, such as the XHIVE_SERVER_MAX_MEMORY property,
require restarting the server to take effect. Other settings are applied the next time a tool is run.
Property Description
XHIVE_SERVER_CACHEPAGES Cache pages for the page server process. If set to 0, half of the
JVM memory is used.
XHIVE_HOME The installation location. It can be changed to use a different
software version or an installation in a different location. If left
empty, the tools try to infer a location.
XHIVE_JAVA_HOME The JDK installation to be used with the tools to which the
properties apply. This must be a proper Java Development Kit,
not a Java Runtime Environment (JRE).
If left empty, the tools use the JAVA_HOME path or any java
executable on the path.
XHIVE_STATISTICS_MONITORING_ENABLED If set to true, enables statistics monitoring.
XHIVE_STATS_MONITOR_INTERVAL The statistics monitoring interval.
XHIVE_WEBSERVER_ADDRESS The web server listens at this address.
If set to "*", the server accepts all connections. If set to
"localhost", only local connections are accepted.
XHIVE_WEBSERVER_PORT The port used by the web server. If left empty, the web server
will not run.
XHIVE_WEBSERVER_NO_WEBADMIN If set to true, the web server will not serve the web admin
application.
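For illustration, a minimal xdb.properties fragment using some of these properties might look as follows. The values shown are examples, not recommended settings:

```properties
# Illustrative xdb/conf/xdb.properties fragment; values are examples.
# 0 means: use half of the JVM memory for the page server cache.
XHIVE_SERVER_CACHEPAGES=0
# Accept web server connections from the local machine only.
XHIVE_WEBSERVER_ADDRESS=localhost
# If left empty, the web server will not run.
XHIVE_WEBSERVER_PORT=8080
```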
These property settings apply to the page server (both internal and dedicated).
The page server looks for the xdb.properties file in the Java classpath.
Note: The xdb command adds the directory XHIVE_HOME/conf to the Java classpath automatically.
Property Description
XHIVE_FIPS_ENABLED Enables (true) or disables (false/not set) FIPS 140-2 Level 1
encryption of user passwords. For more information, see Enabling
FIPS 140-2 Level 1 encryption, page 73.
xdb.lucene.* Various properties related to the multi-path indexes - see
Multi-path index properties, page 157.
• The file $XHIVE_HOME\bin\xDB Admin Client.lax contains options for the Admin Client
executable xDB Admin Client.exe, that only apply when the Admin Client is run from the Windows
Start Menu shortcut or from the executable. Note: These options do not apply to the xdb admin
command and the xhive-ant run-admin command.
• The file $XHIVE_HOME\bin\xDB Server.lax contains the following JVM options for the Windows
Service executable xDB Server.exe:
– lax.nl.java.option.java.heap.size.initial (equivalent to -Xms).
– lax.nl.java.option.java.heap.size.max (equivalent to -Xmx).
– lax.nl.java.option.additional (this option changes JVM parameters).
– lax.nl.current.vm (this option changes the default JVM).
The following server configuration values are determined during xDB installation:
• Port number for accepting connection. The default port number is 1235.
• Page-cache size. The default value is 0, allowing the server to use half of the available JVM memory.
• Addresses that are allowed to connect. The default value is ’localhost’, which only allows
connections from the same machine. Using ’*’ allows connections from every machine.
These configuration parameters are set in the xdb.properties file in the conf subdirectory underneath
the xDB installation home. The server processes must be stopped and restarted after changing
anything in the configuration. Enlarging the cache size requires allocating more memory to the
process by configuring the XHIVE_SERVER_MAX_MEMORY parameter in the properties file.
The selected page size is stored in the XhiveDatabase.bootstrap file, but cannot be changed after
creating a federation.
By default, the dedicated process is not configured for SSL connections. For more information about
SSL, see Using the Secure Socket Layer, page 283.
Due to differences in system layouts on UNIX systems, the xDB installer does not automatically
configure a background service for the page server. Users can set up their own startup item, typically a
shell script in /etc/init.d. The page server can be started as a background process using the
typical shell syntax.
#!/bin/sh
xdb run-server \
--debug --non-interactive \
>>/var/log/xdb-server.log \
2>>/var/log/xdb-server-error.log &
echo $! > /var/run/xdb-server.pid
The xdb run-server command starts a server process for the default federation, to which other
processes can connect. The log output of any errors is sent to the console.
The xdb stop-server command connects to the server JVM running at the configured bootstrap
location and tells the server process to terminate.
For more information about these commands, see Server-related commands, page 252 and
Command-line client global options, page 251.
The page server process can be run inside a Java application, instead of using a dedicated page server.
Typically, running the server process inside an application offers better performance, and if necessary
the embedding application can accept connections from other applications.
To configure use without a dedicated server:
• Use the /path/to/XhiveDatabase.bootstrap path as bootstrap in your application, instead of
xhive://hostname:portname.
• Ensure that no dedicated server is running, because only one process at a time is allowed access to
the federation. A running dedicated server is more likely on Windows than on UNIX, because on
Windows the installer can configure a dedicated page server by default. On UNIX, the dedicated
server is not started automatically by the installer.
Managing DTDs
Some external XML editors must have access to a DTD to edit an XML document. In xDB, the DTDs
are located in a catalog and are associated with an XML document using a doctype declaration.
Example
The following example associates a DTD with an XML document.
The default doctype declaration for a new XML document is
<!DOCTYPE rootElem PUBLIC "publicId" "systemId">
There are two ways to associate a DTD with a document:
• Setting the publicId string to match a public ID of a DTD stored in the catalog of the library
where the document is stored.
• Setting the systemId to the full URL to the DTD as it can be accessed remotely, for example
http://localhost:8080/xhive/xhive-catalog/play.dtd.
When the file is opened again from the server, the document type declaration is changed to the
following format:
<!DOCTYPE rootElem PUBLIC "xDB public id" "xhive-catalog/filename.dtd">
Troubleshooting DTDs
Every library has access to a catalog that holds a list of stored DTDs. Documents can refer to the
DTDs in this catalog using the public ID in the document type declaration. The catalog is available
as a folder in each library, regardless of whether that library has a local catalog or not. The server
adds the folder to the list of file and directory names. The folder name is arbitrary and can be changed
in the source code. When a document is loaded from the database to an FTP client, the client uses
the system ID to look up the DTD. The client can only find the DTD in the database if the system
ID is a relative path, using a format like xhive-catalog/fileName.dtd. The client then requests the
fileName.dtd file in the xhive-catalog folder relative to the library where the document is stored. The
server sends the DTD to the client.
The relative file path can be set when the document is parsed for the first time.
The following information can be helpful when troubleshooting DTDs:
• The DTD is not stored when FTP is used to store a new document.
• To store a new document with a DTD, the public ID must point to an existing file in the catalog, or
a full system ID must be passed. If the public ID matches a DTD file in the catalog, the document is
linked to that DTD.
• Documents with a linked DTD must have a system ID with a format like xhive-catalog/fileName.dtd.
• When documents are parsed into the database, they are parsed or created with certain default
parameters, which can be changed in the source code of the FTP server.
• When storing documents on the FTP server, new documents can be created during parsing and
existing documents can be replaced.
• When documents or DTDs are stored in the database over FTP, the client sends the document
contents as a stream to the server. If the document contains relative paths to files on the client
system, such as references to DTDs, the references cannot be resolved and parsing can fail.
• The FTP server can store both XML documents and BLOBs based on the file extension. The file
extensions that are associated with XML files can be modified accordingly.
• On systems that already have an FTP server running, the xDB FTP server cannot run on the default
port number 21. The port constant must be changed to run the FTP server on another port.
XHIVE_FIPS_ENABLED=true
When FIPS 140-2 Level 1 encryption is enabled, the Crypto-J FIPS JCE security provider can be used
to encrypt user passwords. Note: Enabling FIPS 140-2 encryption on an existing federation does not
affect existing users. FIPS 140-2 encryption applies only to passwords of newly created users and to
password changes of existing users. This means that to encrypt the passwords of existing users in a
FIPS 140-2 compliant manner, those users must reset their passwords. Passwords encrypted using a
FIPS 140-2 compliant algorithm can co-exist with passwords encrypted using other algorithms.
Using the server compiler can improve performance for CPU bound processes significantly. The
trade-off is a slower application startup that is irrelevant for server applications.
Upgrading to a new major version of the Sun JDK often causes a 10 percent performance improvement
for CPU bound applications due to optimizations. The new versions are more stable as well, and it is
recommended to use the latest release.
Depending on the application, the amount of memory available to the JVM and the page server cache
can also be important. Generally, the more memory is available, the better the performance. The
default installation configuration is adequate for a typical developer desktop, but is
usually insufficient for real server applications. It is impossible to recommend specific numbers,
because the optimal settings will depend on circumstances, including the application and the data, the
hardware and any other tasks that the hardware has to perform.
As a first estimation of the required cache pages for a page server, take common operations in the
application, such as frequently run XQueries. Make sure the queries are supported by indexes, then
add up the size of the cache pages occupied by the indexes. The Admin Client shows the index
page sizes on its Indexes tab.
Measure the performance of these common operations and see if they meet performance requirements.
Gradually increase the size of the page server cache until further increases no longer improve performance. You can
obtain performance statistics in the Admin Client via main menu option Settings -> Performance
Statistics.
If statistics show that during a certain amount of work the number of cache pages loaded increases,
not all operations were serviced from the page cache. If the number of cache pages loaded does not
increase for a set of operations, then increasing the cache size will not increase performance.
Cache size does not have an impact on parsing new documents into the database.
To calculate the amount of memory used by the page server cache, multiply the number of cache
pages by the page size (in addition, there is a small amount of overhead per page, which should be
insignificant). The default value 0 for cache pages will automatically take half of the available JVM
heap.
# cache pages * page size = total memory
memory for cache / page size = # cache pages
For example, if you have a page server running with a 16 GB heap, and you’d like to allocate
75% of that as a cache on a federation with 4096 byte pages, the calculation would be roughly:
0.75 × 16 GB = 12 GB of cache memory, and 12 GB / 4096 bytes per page ≈ 3,145,728 cache pages.
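The sizing formulas above can be sketched as a small self-contained helper. The heap size, cache fraction, and page size below are illustrative values, not xDB defaults:

```java
public class CachePages {
    // Sketch of the sizing arithmetic above; not part of the xDB API.
    static long cachePages(long heapBytes, double cacheFraction, long pageSize) {
        // memory for cache / page size = number of cache pages
        return (long) (heapBytes * cacheFraction) / pageSize;
    }

    public static void main(String[] args) {
        long heap = 16L * 1024 * 1024 * 1024;      // 16 GB JVM heap
        long pages = cachePages(heap, 0.75, 4096); // 75% as cache, 4096-byte pages
        System.out.println(pages);                 // prints 3145728
    }
}
```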
Note: Do not allocate the complete heap as a database cache. The server needs some room to maintain
additional data structures outside the cache in order to function.
A larger page cache leaves a smaller Java heap for regular program operation, which can
cause excessive Garbage Collection. Increasing the page server cache might thus lead to more
frequent/longer pauses and/or decrease application throughput. Tuning JVM Garbage Collection is
outside of the scope of this article, refer to your JVM supplier’s documentation.
Creating a federation requires specifying a page size for its databases. This value should be a power of
2 in the range from 512 to 65536. In most cases, it is best to use the same value for the database page
size as the file system uses.
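As a sketch, the power-of-2 constraint on the page size can be checked like this (the method name is illustrative, not part of the xDB API):

```java
public class PageSizeCheck {
    // A valid page size is a power of 2 between 512 and 65536;
    // the bit trick (n & (n - 1)) == 0 tests for a power of 2.
    static boolean isValidPageSize(int size) {
        return size >= 512 && size <= 65536 && (size & (size - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isValidPageSize(4096)); // true
        System.out.println(isValidPageSize(5000)); // false: not a power of 2
        System.out.println(isValidPageSize(256));  // false: below 512
    }
}
```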
Each document and BLOB occupies an integer number of database pages. A smaller database page
size than the file system page size may save disk space, but at a cost in performance: when writing a
database page, the operating system must retrieve the old file system page, which involves copying the
database page into the file system page and writing back the whole file system page. When database
page size equals file system page size, retrieving the old file system page is not necessary.
Note: The database page size should never exceed the file system page size, because then file-writes
are not atomic. If the operating system crashes while a database page is only partly written to the disk,
that page becomes inaccessible, and it may even corrupt the entire database.
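Because each document and BLOB occupies a whole number of database pages, its on-disk footprint rounds up to the next page boundary. A minimal sketch of that estimate (names are illustrative, not xDB API):

```java
public class PageUsage {
    // A document or BLOB occupies an integer number of database pages,
    // so its footprint rounds up to the next page boundary.
    static long pagesUsed(long documentBytes, long pageSize) {
        return (documentBytes + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(pagesUsed(10_000, 4096)); // 3 pages (12288 bytes on disk)
        System.out.println(pagesUsed(4_096, 4096));  // exactly 1 page
    }
}
```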
File systems
On Solaris operating systems, the default file system block size is 8192 bytes. On Windows with a
default NTFS file system and on Linux operating systems, the file system block size is 4096. These
default block sizes can be modified. Below is a list of commands that report block sizes on various
operating systems:
• On Windows 2000, chkdsk displays the size of an allocation unit (this command also checks the file
system, which may take some time to complete).
• On Windows XP and later, fsutil fsinfo ntfsinfo displays the number of bytes per cluster.
• On Linux with the ext2/ext3 filesystem, tune2fs -l /dev/device displays the block size of the file
system. On Linux with xfs, use the xfs_info /mountpoint command. The mount command displays
the mapping between devices and logical mount points.
• On Solaris and HP-UX, mkfs -m /dev/device displays the block size (bsize) of the file system. This
command may also work on other Unix-variants. Use with care, as mkfs is also used to create
new filesystems.
If multiple disks are used to store the data, the easiest and most effective way is a RAID 0 or RAID
1+0 configuration.
RPC tracing
When RPC tracing is enabled, all Remote Procedure Calls to the backend server are logged. RPC
traces are useful in performance tuning and troubleshooting, especially in a multi-node configuration.
Note: The section about RPC tracing assumes that the java.util.logging SLF4J binding is used. For
more information refer to the section about message logging, page 286.
An RPC call trace message can contain the following items:
1. Begin time
2. Logger name
3. Logging level
4. Thread ID
5. Session ID
6. Transaction ID
7. RPC request type
8. RPC call duration (in milliseconds)
9. Number of bytes sent
10. Number of bytes received
11. Host address of front-end machine
12. Host address of backend server
13. Node name
14. Method name
15. Parameters
16. Return value or exception
The logging mode and message format for RPC call trace messages can be configured in system
properties, as described in the table below.
Table 10 RPC tracing mode and format
System property Description
XHIVE_RPC_TRACING_MODE The logging mode. Valid values are:
standard - includes all 16 items.
compact - includes all items except Parameters and
Return value.
The default mode is standard.
XHIVE_RPC_TRACING_FORMAT The trace message format. Valid values are:
plain - plain text format.
xml - XML format.
The default format is plain. In plain text format, items
within the same trace message are delimited by space.
By default, RPC tracing is disabled. RPC tracing can be turned on or off using the
XHIVE_RPC_TRACING_ON system property. To enable RPC tracing use:
java.exe -DXHIVE_RPC_TRACING_ON=true
The com.xhive.trace.rpc Java logger and its associated logging handler record RPC trace messages
when RPC tracing is initialized.
You can use the %JAVA_HOME%/jre/lib/logging.properties file to set the logging level and to
direct xDB trace message output to console and/or to a file, as discussed below. Note: The xDB
JAVA_HOME can be set by XHIVE_JAVA_HOME in the file xdb.properties.
java.util.logging.FileHandler.level = FINEST
java.util.logging.ConsoleHandler.level = FINEST
If you send trace output to file, you can configure the trace file name, max size, and other parameters
in the logging.properties filehandler keys, as described in the table FileHandler configuration
keys, page 81.
Note: To apply the changes to the Java logging properties file, you must restart the application.
Changes to system properties are dynamic.
Key Description
java.util.logging.FileHandler.pattern Specifies a pattern for generating the output file name. For
more information, refer to the java.util.logging.FileHandler
API Java Doc.
java.util.logging.FileHandler.limit Specifies an approximate maximum amount in bytes to
write to any one file. If the value is 0, there is no limit. The
default value is 0.
java.util.logging.FileHandler.count Specifies how many output files to cycle through. The
default value is 1.
java.util.logging.FileHandler.level Specifies the default level for the Handler. This key must
be set to FINEST.
java.util.logging.FileHandler.formatter Specifies the name of a Formatter class to use.
Set this key to com.xhive.trace.log.RPCSimpleFormatter
when the value of XHIVE_RPC_TRACING_FORMAT is
plain.
Set this key to com.xhive.trace.log.RPCXMLFormatter
when the value of XHIVE_RPC_TRACING_FORMAT is xml.
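For example, a logging.properties fragment that sends plain-format RPC traces to rotating files could look like this. The pattern, limit, and count values are illustrative, not required settings:

```properties
# Illustrative fragment; %h and %g are standard FileHandler pattern tokens
# (user home directory and generation number).
handlers = java.util.logging.FileHandler
java.util.logging.FileHandler.level = FINEST
java.util.logging.FileHandler.pattern = %h/xdb-rpc-trace-%g.log
java.util.logging.FileHandler.limit = 10000000
java.util.logging.FileHandler.count = 5
java.util.logging.FileHandler.formatter = com.xhive.trace.log.RPCSimpleFormatter
```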
<xs:complexType name="rpc-trace-type">
<xs:sequence>
<xs:element name="param" minOccurs="0" maxOccurs="unbounded">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="name" type="xs:string" use="required"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name="returnValue" type="xs:string" minOccurs="0"/>
<xs:element name="exception" type="exception-type" minOccurs="0"/>
</xs:sequence>
<xs:attribute name="beginTime" type="xs:string" use="required"/>
<xs:attribute name="loggerName" type="xs:string" use="required"/>
<xs:attribute name="loggingLevel" type="xs:string" use="required"/>
<xs:attribute name="threadId" type="xs:string" use="required"/>
<xs:attribute name="sessionId" type="xs:string" use="required"/>
<xs:attribute name="transactionId" type="xs:long" use="required"/>
<xs:attribute name="requestType" type="xs:string" use="required"/>
<xs:attribute name="duration" type="xs:long" use="required"/>
<xs:attribute name="bytesSent" type="xs:long" use="required"/>
<xs:attribute name="bytesReceived" type="xs:long" use="required"/>
<xs:attribute name="frontEndAddress" type="xs:string" use="required"/>
<xs:attribute name="frontEndPort" type="xs:unsignedShort" use="required"/>
<xs:attribute name="backEndAddress" type="xs:string" use="required"/>
<xs:attribute name="backEndPort" type="xs:unsignedShort" use="required"/>
<xs:attribute name="serverNodeName" type="xs:string" use="required"/>
<xs:attribute name="methodName" type="xs:string" use="required"/>
</xs:complexType>
<xs:complexType name="exception-type">
<xs:sequence>
<xs:element name="message" type="xs:string"/>
<xs:element name="frame" type="frame-type" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
</xs:complexType>
<xs:complexType name="frame-type">
<xs:sequence>
<xs:element name="class" type="xs:string"/>
<xs:element name="method" type="xs:string"/>
<xs:element name="line" type="xs:int" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:schema>
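Based on this schema, an XML-format trace message would look roughly as follows. The root element name rpc-trace and all attribute values are illustrative assumptions, not output copied from xDB:

```xml
<!-- Illustrative message conforming to rpc-trace-type; names and values
     are examples only. -->
<rpc-trace beginTime="2013-05-01 12:00:00.123" loggerName="com.xhive.trace.rpc"
    loggingLevel="FINEST" threadId="17" sessionId="42" transactionId="1001"
    requestType="EXECUTE" duration="12" bytesSent="256" bytesReceived="1024"
    frontEndAddress="10.0.0.5" frontEndPort="50123"
    backEndAddress="10.0.0.9" backEndPort="1235"
    serverNodeName="node1" methodName="executeQuery">
  <param name="query">doc("united_nations")</param>
  <returnValue>OK</returnValue>
</rpc-trace>
```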
For information on compiling and running code samples and other programs, see Running a sample,
page 84.
Running a sample
A standard xDB installation includes a /src/ directory, which contains a number of Java code samples.
Most of the general examples that the manual refers to are in /src/samples/manual. These samples
can be run using the xhive-ant tool:
xhive-ant run-sample -Dname=manual.[sample]
Note: The settings in the file SampleProperties.java must match your current setup. These settings
include the superuser and administrator passwords and the database name.
Note: The /src/ directory also includes some separate, special samples that need to be set up and
run differently from the “manual” Java code samples. For more information, refer to the file
/bin/build.xml and the inline Javadoc documentation of the sample’s source code.
To run a “manual” sample:
The xhive-ant command sets the proper CLASSPATH and other parameters. The example
command above runs a sample that inserts two documents into the database. If the command runs
successfully, a message appears stating the number of documents stored in the database.
Creating a database
To create a database:
1. Start a session and open a connection as superuser. The databaseName parameter of the connect()
call should be null.
XhiveDriverIf driver = XhiveDriverFactory.getDriver();
if (!driver.isInitialized()) {
driver.init(1024);
}
XhiveSessionIf session = driver.createSession();
session.connect(superUserName, superUserPassword, null);
2. Obtain the federation object, for example through the getFederation() method of the session.
3. Call the createDatabase() method with the name of the new database and its administrator
password.
federation.createDatabase(newDbName, administratorPassword, null, System.out);
This creates a default configuration. If you want a custom configuration, you can specify a
configuration file, page 86. For more information about the physical file structure of a database,
see Database files, page 46.
Note: The superuser can create and delete databases, but cannot administer them. To perform
administrative actions on the new database, you need to disconnect and then reconnect as database
administrator.
Note: The superuser is not represented by an object in the xDB API.
Note: If you changed the default superuser password of the xDB installation, change the source
code of the CreateDatabase sample accordingly.
Samples
CreateDatabase.java
API documentation
com.xhive.core.interfaces.XhiveFederationIf
Connecting to a database
Applications can explicitly specify a bootstrap by calling XhiveDriverFactory.getDriver(String
bootstrap). When called without a parameter, XhiveDriverFactory.getDriver() tries to find a
federation via (in this order):
• the Java system property xhive.bootstrap
• the first line of a text file called xhive.bootstrap in the current working directory of the Java process
• the environment variable XHIVE_BOOTSTRAP (The xDB command line tool will also use this
environment variable, if run without an explicit federation argument.)
Note: To use the same internal xDB server from different applications, those applications must use
the same Java class loader to load the xDB classes. Otherwise, the xDB code loaded by each class
loader attempts to start its own xDB server and all but the first one fail.
To connect to an xDB database:
1. Obtain an xDB driver:
XhiveDriverIf xhiveDriver = XhiveDriverFactory.getDriver();
If you did not specify the bootstrap in the JVM environment, pass the location as an argument to
the getDriver() method, for example:
XhiveDriverIf xhiveDriver = XhiveDriverFactory.getDriver("xhive://localhost:1235");
If you connect to the database without a server, use a path to the XhiveDatabase.bootstrap file.
2. Initialize the local page cache shared by the sessions for this driver:
xhiveDriver.init();
You need to initialize a specific driver only once in your application. You can use the isInitialized()
method to verify whether the driver has been initialized.
3. Create a new XhiveSessionIf session using the createSession() method from the
com.xhive.core.interfaces.XhiveSessionIf interface:
XhiveSessionIf session = xhiveDriver.createSession();
4. Connect to the database, supplying a user name, password, and database name:
session.connect(UserName, UserPassword, DatabaseName);
Samples
ConnectDatabase.java
API documentation
com.xhive.XhiveDriverFactory
com.xhive.core.interfaces.XhiveDriverIf
com.xhive.core.interfaces.XhiveSessionIf
API documentation
com.xhive.core.interfaces.XhiveDatabaseIf.html#getConfigurationFile()
To perform transactions, first connect to a database and create a session, as described in Connecting
to a database, page 85, then follow the steps below. For more information, see the chapter on Session
and Transaction Management, page 133.
To use transactions within xDB:
1. Start the transaction with the begin() method of the com.xhive.core.interfaces.XhiveSessionIf
interface.
2. Enter the instructions that should be executed during the transaction.
3. End the transaction with either the commit() or the rollback() method.
The rollback() method reverses all instructions in the transaction. To preserve database
consistency, always call it when a failure or unexpected exception occurs during the transaction.
For example, if disk space runs out while loading a document, partial document modifications
can leave the data in an inconsistent state.
Note: After a call to commit() or rollback(), all references to database objects (such as nodes,
libraries, etc.) become invalid. If you want to continue using the objects after a commit(), use
checkpoint() instead. This method commits all persistent operations executed since the previous
begin() or checkpoint() method call. The transaction remains active after the checkpoint() call
and references to database variables remain usable.
Example
The example transaction below parses external XML documents and appends them to a library. If
an error occurs during parsing or appending, the entire transaction is rolled back and none of the
documents are appended.
session.begin();
try {
XhiveLSParserIf parser = rootLibrary.createLSParser();
for ( int i=1; i<=numFiles; i++ ) {
XhiveDocumentIf newDocument =
parser.parseURI( new File(baseFileName + i + ".xml").toURL().toString());
rootLibrary.appendChild(newDocument);
}
session.commit();
} catch (Exception e) {
// in case of an error: do a rollback
session.rollback();
e.printStackTrace();
} finally {
// always ensure that the session is cleaned up in a finally block
if (session.isOpen()) session.rollback();
// remove the session
session.disconnect();
session.terminate();
}
Samples
UseSessions.java
API documentation
com.xhive.core.interfaces.XhiveSessionIf
com.xhive.core.interfaces.XhiveDriverIf
Creating libraries
The nested structure of libraries within a database is like the nested structure of directories or folders
within a file system. There is only one root library, which is created automatically with a new
database. You can add libraries and documents as needed to build a suitable document hierarchy or
storage architecture.
To create a library:
1. Obtain a handle to the parent library. If that parent is the root library, use the getRoot() method to
get a handle. Otherwise, use a previously instantiated variable.
2. Create the library using the createLibrary() method.
3. Select a unique name for the new library using the setName() method.
Note: Naming a library is optional, but strongly recommended because several access and indexing
methods only work with named libraries.
4. Append the new library to its parent using the appendChild() method.
Example
The sample code below creates a library called Publications in the root, with one nested library
called General Info.
// get a handle to the root library
XhiveLibraryIf rootLibrary = united_nations_db.getRoot();
// create the "Publications" library and append it to the root library
XhiveLibraryIf newLibA = rootLibrary.createLibrary();
newLibA.setName("Publications");
rootLibrary.appendChild(newLibA);
// create the nested "General Info" library
XhiveLibraryIf newLibB = newLibA.createLibrary();
newLibB.setName("General Info");
newLibA.appendChild(newLibB);
Related topics
Managing detachable libraries
Samples
CreateLibrary.java
API documentation
com.xhive.core.interfaces.XhiveDatabaseIf
com.xhive.dom.interfaces.XhiveLibraryIf
com.xhive.dom.interfaces.XhiveLibraryChildIf
Storing BLOBs
xDB stores BLOBs as a special type of node, the BLOB node. The method createBlob() creates
a BLOB node. After creating a BLOB node, the content of the node must be filled through the
setContents() method. To add the BLOB node, the normal methods for adding nodes can be
used: appendChild() or insertBefore():
String imgFileName = "un_flags.gif";
String imgName = "Flags of UN members";
FileInputStream imgFile = new FileInputStream(SampleProperties.baseDir + imgFileName);
// create a BLOB node, fill it with the image data, and store it
XhiveBlobNodeIf imgBlob = rootLibrary.createBlob();
imgBlob.setContents(imgFile);
rootLibrary.appendChild(imgBlob);
Samples
StoreBLOBs.java
API documentation
com.xhive.dom.interfaces.XhiveBlobNodeIf
com.xhive.dom.interfaces.XhiveNodeIf
Interface Methods
XhiveUserListIf addUser(), removeUser(), getUser(), hasUser()
XhiveUserIf setPassword(), isAdministrator()
Interface Methods
addGroup(), removeGroup(), getGroup(), hasGroup()
isMember(), users()
Example
The following example uses the XhiveUserListIf and XhiveGroupListIf interfaces to create a user
and a group, and then add the user to the group.
XhiveUserListIf userList = united_nations_db.getUserList();
XhiveGroupListIf groupList = united_nations_db.getGroupList();
Samples
ManageUsers.java
API documentation
com.xhive.core.interfaces.XhiveAuthorityIf
com.xhive.core.interfaces.XhiveGroupIf
com.xhive.core.interfaces.XhiveGroupListIf
com.xhive.core.interfaces.XhiveUserIf
com.xhive.core.interfaces.XhiveUserListIf
</ns2:federation-set>
XhivePageCacheIf pageCache =
XhiveDriverFactory.getFederationFactory().createPageCache(4096);
XhiveFederationSetIf fs =
XhiveFederationSetFactory.getFederationSet("xhive://hostname:1235/",
pageCache);
XhiveDriverIf driver = fs.getFederation("fdname");
// Use driver normally, e.g., create session on it
To use a federation set internally, you do not need to start a server. Only the URI of the federation set
changes in the example code below:
XhivePageCacheIf pageCache =
XhiveDriverFactory.getFederationFactory().createPageCache(4096);
XhiveFederationSetIf fs =
XhiveFederationSetFactory.getFederationSet("/path/to/federationset",
pageCache);
XhiveDriverIf driver = fs.getFederation("fdname");
// Use driver normally, e.g., create session on it
All federations in the federation set share the single page cache.
There is a convenience method as well:
XhiveFederationSetFactory.getFederation()
To set up xDB in your local Maven repository, follow the steps below.
Using xDB with Spring
This section assumes knowledge of the Spring framework, and in particular Spring data sources. This
text does not discuss xDB’s XA transaction support, which can also be used in conjunction with Spring.
xDB provides three classes to support Spring transactions:
• XhiveDataSource is comparable to a JDBC session pool. This class holds the XhiveDriverIf
reference and pools XhiveSessionIf objects for your application. First read the connection settings
from a properties file, connection.properties in the WEB-INF folder of your war file, and then pass
the bootstrap, database, username, and password values to the XhiveDataSource constructor.
• XhiveTransactionManager is a Spring TransactionManager that will handle transactions according
to Spring’s rules. In particular, it will open transactions, make sure to commit them or roll them back
after web requests, and handle configuration such as the readOnly = true flag in the example below.
• XhiveSessionAccess provides access to the xDB transaction. You need to explicitly tell Spring
about its presence, so that Spring can automatically inject it into your code.
Note: The Maven 2 POM file included in the xDB distribution ($XHIVE_HOME/lib/pom.xml)
specifies all necessary dependencies for developing Spring-based xDB applications. See Using xDB
with Maven 2, page 94 for more information.
The following example shows a Spring application context configuration, using these classes to
provide transactional support.
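The application context example referred to above is not included in this excerpt. The sketch below shows the general shape such a configuration could take; the package names (com.xhive.spring.*), constructor arguments, and property placeholders are assumptions made for illustration, not taken from the xDB distribution.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch only: class packages and wiring are assumptions. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/tx
           http://www.springframework.org/schema/tx/spring-tx.xsd">

  <!-- pools XhiveSessionIf objects; values come from connection.properties -->
  <bean id="dataSource" class="com.xhive.spring.XhiveDataSource">
    <constructor-arg value="${bootstrap}"/>
    <constructor-arg value="${database}"/>
    <constructor-arg value="${username}"/>
    <constructor-arg value="${password}"/>
  </bean>

  <!-- opens, commits, and rolls back xDB transactions per Spring's rules -->
  <bean id="transactionManager" class="com.xhive.spring.XhiveTransactionManager">
    <constructor-arg ref="dataSource"/>
  </bean>

  <!-- registered explicitly so Spring can inject it into @Autowired fields -->
  <bean id="sessionAccess" class="com.xhive.spring.XhiveSessionAccess"/>

  <!-- enables @Transactional annotations such as readOnly = true -->
  <tx:annotation-driven transaction-manager="transactionManager"/>
</beans>
```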
After configuring the transactional support, it can be used in Java code like this:
@Controller
public class TestController {
@Autowired
private XhiveSessionAccess acc;
@Transactional(readOnly = true)
@RequestMapping("/test-{name}.html")
public ModelAndView test(@PathVariable String name) {
// ... perform read-only work in the transaction and return a ModelAndView ...
}
}
API documentation
XhiveDataSource
XhiveTransactionManager
XhiveSessionAccess
OSGi (Open Service Gateway initiative) is a framework for developing and deploying modular
software programs and libraries in Java. OSGi applications are built of bundles which are dynamically
loadable collections of Java classes, jars, and configuration files that explicitly declare their external
dependencies.
xDB supports deployment in an OSGi environment in two ways. The first approach is to simply use
the main library $XHIVE_HOME/lib/xhive.jar. Because this library is at the same time an OSGi
bundle, it can be deployed in an OSGi container easily.
The second approach is to use the libraries $XHIVE_HOME/lib/osgi/xhive-api.jar and
$XHIVE_HOME/lib/xhive-impl.jar. These libraries are OSGi bundles that decouple the xDB interfaces
(xhive-api.jar) from their actual implementation (xhive-impl.jar). The implementation bundle uses
OSGi Declarative Services to register itself as the implementation of the following API services:
• com.xhive.core.interfaces.XhiveDriverFactoryIf - a service for obtaining xDB XhiveDriverIf
implementations; and
• com.xhive.federationset.interfaces.XhiveFederationSetFactoryIf - a service for creating and
retrieving xDB federation sets.
Depending on the requirements, one of the above deployment options may be preferable over the other.
Note: The third-party libraries that are included in the xDB distribution are not OSGi-compatible. In
order to use xDB with OSGi, OSGi-enabled versions of these libraries must be obtained and deployed
in the OSGi container. The Ant build script $XHIVE_HOME/bin/build.xml contains simple
functionality for converting the core xDB dependencies into OSGi bundles - however, its use for
anything beyond executing the xDB sample applications is discouraged. Refer to the Import-Package
OSGi manifest headers in xhive.jar (or xhive-api.jar and xhive-impl.jar, respectively) for the exact
packages (and versions thereof) that the xDB bundles depend on.
The xDB distribution comes with a simple OSGi sample application. The application source code
and configuration files can be found in the directories src/samples/osgi/ and src/samples/etc/osgi/,
respectively. Follow the instructions in $XHIVE_HOME/bin/build.xml on how to build and deploy
the sample application.
The LDAP sample code included with xDB shows the interfaces that must be implemented and
the API calls that enable JAAS authentication.
Samples
../ldap/SampleClient.java
../ldap/XhiveServerWithLDAP.java
API documentation
com.xhive.core.interfaces.XhiveDriverIf
com.xhive.core.interfaces.XhiveSessionIf
been created with any required options. The API can also be used for more complex features, like
using your own key manager, specifying cipher suites, and other more specialized SSL configuration.
Configuring SSL on the client includes specifying a URL of the format xhives://host:port as the
argument in the XhiveDriverFactory.getDriver() method.
To do anything special, like using a custom trust manager, pass the same URL to the
XhiveDriverFactory.getDriver() method, and also pass a SocketFactory object to the
XhiveDriverIf.init(int cachePages, SocketFactory socketFactory) call.
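The SocketFactory passed to init() is the standard javax.net type. As a self-contained illustration, independent of the xDB API, the following sketch builds an SSLSocketFactory backed by the JVM's default trust store; a custom trust manager would be supplied to SSLContext.init() instead.

```java
import javax.net.SocketFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;
import java.security.KeyStore;

public class ClientSocketFactory {
    public static SocketFactory create() throws Exception {
        // use the JVM's default trust store; load a specific KeyStore here
        // instead to trust a particular server certificate
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(
                TrustManagerFactory.getDefaultAlgorithm());
        tmf.init((KeyStore) null);
        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(null, tmf.getTrustManagers(), null);
        return ctx.getSocketFactory();
    }

    public static void main(String[] args) throws Exception {
        // the resulting factory would be passed to
        // XhiveDriverIf.init(cachePages, socketFactory)
        System.out.println(create() != null);
    }
}
```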
DOMs. For more information about the specification, see the W3C Document Object Model (DOM)
Level 3 Load and Save Specification.
The XhiveLibraryIf interface extends the DOMImplementationLS interface, which can be used
to create LSParser and LSSerializer objects. You can use the parseURI method of the LSParser
interface for parsing documents. LSParser objects must be created on the library where the parsed
documents are stored.
Example
The example below uses the parseURI method of the DOM Load/Save LSParser interface to parse a
document. When the document is parsed successfully, a DOM Document object is returned.
LSParser builder = rootLibrary.createLSParser();
Document firstDocument = builder.parseURI( new File(fileName).toURL().toString());
Note: An explicit appendChild is required to store a parsed document in the database; otherwise the
parsed document is not stored.
Related topics
Storing XML documents
Validating XML documents
Samples
ParseDocuments.java
DOMLoadSave.java
API documentation
org.w3c.dom.as
com.xhive.dom.interfaces.XhiveLibraryIf
org.w3c.dom.ls.LSParser
org.w3c.dom.ls.LSSerializer
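Since xDB implements the standard DOM LS interfaces, the parse/serialize pattern can be tried outside the database with the JDK's built-in DOM implementation. A minimal, self-contained sketch (inside xDB, the XhiveLibraryIf plays the DOMImplementationLS role):

```java
import org.w3c.dom.Document;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSSerializer;

public class LSRoundTrip {
    public static void main(String[] args) throws Exception {
        // obtain a DOMImplementationLS from the JDK registry
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls = (DOMImplementationLS) registry.getDOMImplementation("LS");

        // parse a small document from a string instead of parseURI
        LSParser parser = ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        LSInput input = ls.createLSInput();
        input.setStringData("<chapter><title>Preamble</title></chapter>");
        Document doc = parser.parse(input);

        // serialize the document back to a string
        LSSerializer serializer = ls.createLSSerializer();
        serializer.getDomConfig().setParameter("xml-declaration", Boolean.FALSE);
        System.out.println(serializer.writeToString(doc));
    }
}
```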
Samples
ParseDocumentsWithContext.java
API documentation
org.w3c.dom.ls.LSParser
org.w3c.dom.ls.LSSerializer
com.xhive.dom.interfaces.XhiveNodeIf
Examples
By default the validate configuration parameter is disabled and must be enabled to parse a file with
validation. The following example describes how to enable parsing with validation.
LSParser parser = charterLib.createLSParser();
parser.getDomConfig().setParameter("validate", Boolean.TRUE);
Document firstDocument = parser.parseURI( new File(fileName).toURL().toString());
If the parsed document contains a reference to a DTD, the DTD is stored as an ASModel within the
library catalog.
The XhiveCatalogIf interface in the com.xhive.dom.interfaces package contains several methods for
updating and querying abstract schema models. The following example retrieves a catalog and the
abstract schema from the root library.
// retrieve the catalog of the "UN Charter" library
XhiveCatalogIf unCharterCatalog = charterLib.getCatalog();
// get the abstract schema models that exist in the root library catalog
// do not include models from locations higher in the tree
Iterator<ASModel> iter = unCharterCatalog.getASModels(false);
while (iter.hasNext()) {
ASModel asModel = iter.next();
System.out.println(" asModel = " + asModel.getLocation());
}
Samples
ParseDocumentsWithValidation.java
ParseDocumentsWithContext.java
API documentation
com.xhive.dom.interfaces.XhiveLibraryIf
org.w3c.dom.ls.LSParser
Examples
The following code example describes how to normalize a document using the DOMConfiguration
interface.
DOMConfiguration config = ((XhiveDocumentIf)document).getDomConfig();
config.setParameter("validate", Boolean.TRUE);
config.setParameter("error-handler", new SimpleDOMErrorPrinter());
document.normalizeDocument();
Normalizing a document with validation requires enabling the "validate" parameter. The
normalizeDocument method does not throw any exceptions. An error handler can be set during
normalization.
The "schema-location" parameter can be used to validate against a different schema. Setting a
schema location also requires setting the schema type. The following code example describes how to
set a schema-location.
config.setParameter("schema-type", "http://www.w3.org/2001/XMLSchema");
config.setParameter("schema-location", "personal.xsd");
If the schema-location is set, the validation process first searches for a corresponding XML schema in
the catalog. If no schema is found, the validation process searches for a schema in the file system.
During document parsing, the schema-location is resolved relative to the document URI. The
document URI is not available during validation. A full path must be set if a document is validated
against a schema located in the file system.
Related topics
Post Validation Schema Infoset (PSVI)
Accessing PSVI information
Samples
ValidateDocumentWithXMLSchema.java
API documentation
com.xhive.dom.interfaces.XhiveDocumentIf
org.w3c.dom.DOMConfiguration
Where
Samples
StoreDocuments.java
API documentation
org.w3c.dom.Node
The xDB default configuration settings conform to the default settings defined by the Document Object
Model (DOM) Level 3 Core Specification and the Document Object Model (DOM) Level 3 Load and
Save Specification. A supported parameter can be set to another value.
The following code example shows how to test whether a configuration supports a Boolean parameter
value:
config.canSetParameter("validate", Boolean.TRUE);
According to the LSParser object specification, the default value for the "element-content-whitespace"
Boolean parameter is "Boolean.TRUE"; in xDB, the default is "Boolean.FALSE". Documents that are
parsed and stored with "element-content-whitespace" enabled can have many text nodes containing
only spaces. These additional nodes need more space on the disk and can adversely affect
query and validation performance.
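The canSetParameter() check can be tried against the JDK's DOM implementation as well. A small sketch; note that parameter support can vary between DOM implementations, so the second result is implementation-dependent:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMConfiguration;
import org.w3c.dom.Document;

public class CheckParameters {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        DOMConfiguration config = doc.getDomConfig();
        // "comments" = TRUE is required by the DOM Level 3 specification
        System.out.println("comments=" + config.canSetParameter("comments", Boolean.TRUE));
        // "canonical-form" = TRUE is optional and often unsupported
        System.out.println("canonical-form="
                + config.canSetParameter("canonical-form", Boolean.TRUE));
    }
}
```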
Additional parameters
xDB adds some parameters for use by LSParser and/or Document objects.
Samples
DOMLoadSave.java
TextCompression.java
API documentation
org.w3c.dom.DOMConfiguration
org.w3c.dom.ls.LSParser
org.w3c.dom.ls.LSSerializer
com.xhive.dom.interfaces.XhiveDocumentIf
loader.setEscape('\\');
loader.setEnclose('"');
loader.setHeadersIncl(true);
loader.setRowName("member");
loader.setColumnNames(new String[]{"name", "admission_date", "additional_note"});
loader.setColumn2Attribute(new boolean[] { false, false, false });
Samples
StoreRelationalData.java
API documentation
com.xhive.util.interfaces.XhiveSQLDataLoaderIf
com.xhive.util.interfaces.XhiveCSVFileLoaderIf
Creating a document
To create a document:
1. Obtain a handle to a DOM implementation with rootLibrary, because XhiveLibraryIf extends
DOMImplementation, as follows:
DOMImplementation domImplementation = rootLibrary;
2. Create a DocumentType object and a Document object using the createDocument() method in
org.w3c.dom.DOMImplementation, as follows:
DocumentType docType = domImplementation.createDocumentType("events", null, null);
Document eventsDocument = domImplementation.createDocument(null, "events", docType);
Because no namespaceURI is used, the first parameter can be left empty. The second parameter
of the createDocument() method, events, is the tag name of the root element. The third
parameter sets the docType of the new document.
3. Obtain a handle to the root element of the newly created document, as follows:
Element rootElement = eventsDocument.getDocumentElement();
eventElement.appendChild(eventText);
Add a date element with the value "4-8 June, 2001" to the event element, as follows:
// add a new element to event
Element dateElement = eventsDocument.createElement("date");
eventElement.appendChild( dateElement );
Use the appendChild() or insertBefore() method to store the document in the database. The
following code uses appendChild() to store the document in the root library:
rootLibrary.appendChild(eventsDocument);
Samples
CreateDocument.java
API documentation
org.w3c.dom.Document
org.w3c.dom.DOMImplementation
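The createDocument() steps above use only standard org.w3c.dom interfaces, so the same pattern can be exercised with the JDK's DOM implementation. A self-contained sketch (outside the database, so no rootLibrary is involved):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CreateEventsDocument {
    public static void main(String[] args) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
        // no namespaceURI; "events" is the tag name of the root element; no docType
        Document eventsDocument = impl.createDocument(null, "events", null);
        Element rootElement = eventsDocument.getDocumentElement();
        // add a date element with a text value to the root element
        Element dateElement = eventsDocument.createElement("date");
        dateElement.appendChild(eventsDocument.createTextNode("4-8 June, 2001"));
        rootElement.appendChild(dateElement);
        System.out.println(rootElement.getFirstChild().getTextContent());
    }
}
```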
Samples
RetrieveDocuments.java
RetrieveDocumentParts.java
API documentation
org.w3c.dom.Node
org.w3c.dom.Element
com.xhive.dom.interfaces.XhiveLibraryChildIf
com.xhive.dom.interfaces.XhiveLibraryIf
com.xhive.dom.interfaces.XhiveNodeIf
Note: Retrieving all children of a node using the getChildNodes() method can be slow. The
getNextSibling() method is a faster way to iterate across child nodes.
Examples
The following example code checks whether a library has any children and counts the number
of children.
int nrChildren = 0;
Node n = charterLib.getFirstChild();
while(n != null) {
nrChildren++;
n = n.getNextSibling();
}
The following example code displays all elements within an XML document.
Node n = theNode.getFirstChild();
int j = 1;
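The fragment above is cut short. The following standalone sketch (JDK DOM, not the xDB-specific classes) completes the idea, using getFirstChild() and getNextSibling() to display all elements of a document:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import java.io.ByteArrayInputStream;

public class ShowElements {
    static void show(Node node, int depth) {
        // iterate with getNextSibling() rather than getChildNodes()
        for (Node n = node.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                System.out.println("  ".repeat(depth) + n.getNodeName());
                show(n, depth + 1);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        "<chapter><title>t</title><para>p</para></chapter>".getBytes("UTF-8")));
        show(doc, 0);
    }
}
```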
Using document ID
xDB automatically assigns an identifier to a new document. This identifier is unique within the context
of a library. The get() method in the com.xhive.dom.interfaces.XhiveLibraryIf interface retrieves
documents by identifier.
Example
The following code example retrieves a document by identifier using the get() method.
int anId = 10;
Node child = charterLib.get(anId);
System.out.println("document with ID = " + anId + " in \"UN Charter\"
has name: " + ((XhiveLibraryChildIf)child).getName());
Example
The following code example retrieves a document by name using the get() method.
Using XQuery
XQueries can be run on libraries and documents using the executeXQuery(String query) method in
the XhiveLibraryChildIf interface. The method returns a result sequence and each result element is
an instance of the XhiveXQueryValueIf object.
Example
The following example code executes a query that retrieves all chapter titles of a document.
Iterator result = charterLib.executeXQuery("//chapter/title");
while (result.hasNext()) {
XhiveXQueryValueIf value = (XhiveXQueryValueIf) result.next();
// We know this query will only return nodes.
Node node = value.asNode();
// Do something with the node ...
}
Samples
XQuery.java
API documentation
com.xhive.dom.interfaces.XhiveLibraryChildIf
com.xhive.query.interfaces
Using indexes
Using indexes can dramatically improve query performance. For more information about indexes,
see Indexes, page 150.
Examples
The following example code retrieves and displays the document titled UN Charter - Chapter 2
from the UN Charter library.
Iterator docsFound = rootLibrary.executeFullPathXPointerQuery(
"/UN Charter/UN Charter - Chapter 2");
Document docRetrievedByFPXPQ = (Document)docsFound.next();
System.out.println(docRetrievedByFPXPQ.toString());
The following example code specifies a relative path to retrieve the document.
// newLib is a sub library of "UN Charter"
Examples
The following example retrieves all title elements from the UN Charter - Chapter 5 sample
document in the UN Charter library.
String sampleLibName = "/UN Charter";
String sampleDocName = "UN Charter - Chapter 5";
String sampleDocPath = sampleLibName + "/" + sampleDocName;
String queryXPointer = "#xpointer(/descendant::title)";
// note that we only specify the library path and not the document name:
resultNodes = rootLibrary.executeFullPathXPointerQuery(sampleLibName + queryXPointer);
while ( resultNodes.hasNext() ) {
Node resultNode = (Node)resultNodes.next();
System.out.println( resultNode.getFirstChild().getNodeValue() );
}
Using XPath
For XPath queries, you can use the executeXPathQuery(...) methods in the XhiveNodeIf interface.
Optionally, you can supply a query context that contains namespace declarations, variable and function
bindings, and an absolute root.
The results of an XPath query are represented by the XhiveQueryResultIf interface. For example, the
following code executes a query that retrieves all chapter titles of a document.
XhiveQueryResultIf result = charterLib.executeXPathQuery("descendant::chapter/title");
The XhiveQueryResultIf interface includes methods for extracting different types of information
from a query result. The result can be of one of several types: a string, a Boolean, a number or a
location set. A location set is a collection of nodes, points, and ranges. For information about the rules
that determine the outcome of conversion, refer to the XPath specifications.
To convert a query result, you can use the following methods to extract different result types:
• getStringValue() - Retrieves the string value of a result.
• getBooleanValue() - Retrieves the Boolean value of a result.
• getNumberValue() - Retrieves the numeric value of a result.
• getLocationSetValue() - Retrieves the location set value of a result.
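These four extraction methods mirror the result types defined by XPath 1.0. The same distinction can be illustrated with the JDK's javax.xml.xpath API (standard JAXP, not the xDB interfaces):

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import java.io.ByteArrayInputStream;

public class XPathResultTypes {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        "<chapter><title>One</title><title>Two</title></chapter>"
                                .getBytes("UTF-8")));
        XPath xp = XPathFactory.newInstance().newXPath();
        // string, number, boolean, and node-set results from the same document
        String s = (String) xp.evaluate("string(/chapter/title[1])", doc, XPathConstants.STRING);
        double n = (Double) xp.evaluate("count(//title)", doc, XPathConstants.NUMBER);
        boolean b = (Boolean) xp.evaluate("//title = 'One'", doc, XPathConstants.BOOLEAN);
        NodeList nodes = (NodeList) xp.evaluate("//title", doc, XPathConstants.NODESET);
        System.out.println(s + " " + (int) n + " " + b + " " + nodes.getLength());
    }
}
```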
The following code processes the results returned as a location set.
if (result != null){
if ( resultType == XhiveQueryResultIf.LOCATIONSET ){
while ( resultNodeSet.hasNext() ) {
resultNode = resultNodeSet.next();
if ( resultNode.getLocationType() == Node.ELEMENT_NODE ) {
System.out.println(" " + ((Node)resultNode).getFirstChild().getNodeValue());
}
}
}
}
Note: xDB throws an XhiveException if the getLocationSetValue() method is called on a result that
is not a location set.
Using XPointer
The XPointer XML Pointer Language is based on the XPath XML Path Language. XPointer supports
addressing the internal structures of XML documents to traverse a hierarchical document structure and
select parts of the hierarchy based on various properties. For a complete and up-to-date description
of XPointer, refer to the XML Pointer Language (XPointer) Version 1.0 documentation at the W3C
website.
XPointer queries can be executed in a similar way as XPath queries:
XhiveQueryResultIf result = charterLib.executeXPointerQuery(
"xpointer(/chapter/article/para/list/item[1]/para)");
The result of an XPointer query has to be a non-empty location set or an exception is thrown.
Samples
XPath.java
XPathXPointerNamespaces.java
API documentation
com.xhive.query.interfaces
com.xhive.xpath.interfaces.XhiveXPathContextIf
com.xhive.dom.interfaces.XhiveNodeIf
Samples
DomTraversal.java
MyNumberFinder.java
FunctionObjects.java
API documentation
org.w3c.dom.traversal
com.xhive.dom.interfaces.XhiveNodeIf
com.xhive.dom.interfaces.XhiveFunctionIf
The interfaces and methods used for traversal are described in the Traversal interfaces and methods,
page 117 table.
Table 15 Traversal interfaces and methods
Examples
The following sampleFilter() code example uses the NodeFilter interface. The implementation
skips all title elements and rejects all list elements:
public class SampleFilter implements NodeFilter {
public short acceptNode(Node node) {
if ( node.getNodeType() == Node.ELEMENT_NODE ) {
Element elem = (Element) node;
if ( elem.getNodeName().equals("title")) {
return FILTER_SKIP;
}
if ( elem.getNodeName().equals("list")) {
return FILTER_REJECT;
}
}
return FILTER_ACCEPT;
}
}
Creating a NodeIterator object that uses the filter from the sampleFilter() example requires
getting a handle to the DocumentTraversal implementation, as follows:
DocumentTraversal docTraversal = XhiveDriverFactory.getDriver().getDocumentTraversal();
The createNodeIterator() method is used to create a NodeIterator object and can use the
following parameters:
• root - The node at which to start the traversal.
• whatToShow - The flag specifying which node types to include.
• filter - The filter to use. The value can be set to null if no filter is used.
• entityReferenceExpansion - A flag specifying whether to expand entity reference nodes.
The following example code traverses the first chapter of a document using a NodeIterator object
and without a NodeFilter object.
System.out.println("\n#NodeIterator without a NodeFilter:");
NodeIterator iter = docTraversal.createNodeIterator(resultGetDocument,
NodeFilter.SHOW_ALL,
null,
false);
Node node;
To restrict the traversal and not include title or list elements, change the second line of the
previous example to:
NodeIterator iter = docTraversal.createNodeIterator(resultGetDocument,
NodeFilter.SHOW_ALL,
sampleFilter,
false);
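Because DocumentTraversal, NodeIterator, and NodeFilter are standard org.w3c.dom.traversal interfaces, the filter logic can be tried standalone with the JDK's DOM implementation. Note that per the DOM specification a NodeIterator treats FILTER_REJECT like FILTER_SKIP; only a TreeWalker prunes rejected subtrees.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import java.io.ByteArrayInputStream;

public class FilteredTraversal {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                        "<chapter><title>t</title><para>p</para></chapter>".getBytes("UTF-8")));
        // skip title elements, accept everything else
        NodeFilter filter = node -> node.getNodeName().equals("title")
                ? NodeFilter.FILTER_SKIP : NodeFilter.FILTER_ACCEPT;
        // the JDK Document also implements DocumentTraversal
        DocumentTraversal traversal = (DocumentTraversal) doc;
        NodeIterator iter = traversal.createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ELEMENT, filter, false);
        for (Node n; (n = iter.nextNode()) != null; ) {
            System.out.println(n.getNodeName());
        }
    }
}
```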
Related topics
Retrieving documents using DOM operations
public boolean isDone (Node node){
return false;
}
You could, for example, use the isDone() method to limit the number of processed nodes to a
specified number:
public boolean isDone (Node node){
return nrResults == maxNrResult;
}
Samples
DOMLoadSave.java
API documentation
com.xhive.dom.interfaces.XhiveNodeIf
org.w3c.dom.ls.LSSerializer
The com.xhive.util.interfaces package provides the following interfaces for publishing XML
documents from xDB:
XSLT
xDB contains an XSL Transformation (XSLT) engine. XSLT can transform an XML source tree into
any required result tree and publish XML content in (X)HTML, PDF and other formats. For more
information about XSLT, see the W3C website.
Publishing from xDB with XSLT requires an XML and XSL document within a Java application that
uses the transformToString(), transformToStream(), or transformToDocument() method. Both
the XML and XSL documents are stored in the database. The output can be another XML document,
or a document of any other format.
The XSL document, which is actually an XML file, specifies the transformations that produce the
desired output.
Note: When parsing XSL documents, the XhiveLibraryIf.PARSER_NAMESPACES_ENABLED
option value must be TRUE. Otherwise an exception is thrown during transformation of the XML
document.
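The transformToString() and related xDB methods are database-specific, but the transformation itself is standard XSLT. As an illustration, the following self-contained sketch runs a small stylesheet with the JDK's javax.xml.transform API:

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class TransformToString {
    public static void main(String[] args) throws Exception {
        // a tiny XSL stylesheet that extracts the title as plain text
        String xsl = "<xsl:stylesheet version='1.0'"
                + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:output method='text'/>"
                + "<xsl:template match='/'>Title: <xsl:value-of select='//title'/>"
                + "</xsl:template></xsl:stylesheet>";
        String xml = "<chapter><title>Purposes and Principles</title></chapter>";

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        System.out.println(out);
    }
}
```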
XML documents can be published to PDF format with the formatAsPDFToStream() method in the
com.xhive.util.interfaces.XhiveFormatterIf interface. This method can format either an XSL-FO
document or an XML document as a PDF string. Transforming an XML document also requires an
XSL document.
Samples
Publish2HTML.java
Publish2PDF.java
API documentation
com.xhive.util.interfaces.XhiveTransformerIf
com.xhive.util.interfaces.XhiveFormatterIf
XLink interfaces
The XLink information is accessible through the DOM API because it is stored as attributes. The
com.xhive.dom.xlink.interfaces package provides methods and interfaces that are more convenient
and easier to use. A selection of these methods is listed in table XLink methods, page 122. The
DomLinkBase.java sample uses several of the XLink methods available in xDB.
Methods Description
getTitle(), getHRef(), getRole() Retrieve specific attributes.
getLinks(), getLinksByTitle(), getLinksByRole() Retrieve all available links, by title, or by role.
getNodesLinkingTo() Retrieve all nodes linking to a specific resource.
getArcs(), getFrom(), getTo(), getStartingResources(), getEndingResources() Retrieve all information from arcs.
getLocators(), getRole(), getLabel() Retrieve all information from locators.
expandDocument(), getResourcesLinkedBy() Retrieve the content of the targeted resources.
Samples
XLink.java
DomLinkBase.java
API documentation
com.xhive.dom.xlink.interfaces
Using versioning
xDB offers versioning for documents and BLOBs. The makeVersionable() method of the
XhiveLibraryChildIf interface controls versioning.
The example below uses the makeVersionable() method to enable versioning for the
"briefing.xml" document.
XhiveLibraryChildIf doc = briefingLib.get("briefing.xml");
doc.makeVersionable();
The getXhiveVersion() method in the XhiveLibraryChildIf interface can be used to get the
last version of a document:
// get the last version of doc
XhiveVersionIf lastVersion = doc.getXhiveVersion();
The XhiveVersionIf interface contains several methods for obtaining version information, including
getDate(), getLabel() and getCreator():
System.out.println("id: " + lastVersion.getId());
System.out.println("creation date: " + lastVersion.getDate().toString());
System.out.println("label: " + lastVersion.getLabel());
System.out.println("created by: " + lastVersion.getCreator().getName());
Samples
Versioning.java
Branching.java
API documentation
com.xhive.dom.interfaces.XhiveLibraryChildIf
com.xhive.versioning.interfaces.XhiveBranchIf
com.xhive.versioning.interfaces.XhiveCheckoutIf
com.xhive.versioning.interfaces.XhiveVersionIf
com.xhive.versioning.interfaces.XhiveVersionSpaceIf
Checked out documents can be checked in using the checkIn() method, as described in the following
example.
// do some updates on "checkedOutDoc"
// ...
// check the document in
lastVersion.checkIn(lastVersionDoc);
The checkIn() method creates a version and releases the checkout lock. The checkout lock can also
be released by using the abort() method, which discards all document changes and reverts to the
current version. It is not necessary to check in the specific library child that the checkout
operation created. Any document or BLOB that is checked in creates a new version.
Note: Versioned documents remain accessible for regular document retrieval. The last stored version
of a versioned document is used for retrieval, traversal, queries, and indexing. Earlier versions are
only searchable if they were created with a ’queryable’ option.
Example
The following example uses the getVersionSpace() method in XhiveVersionIf to access the version
space of a document.
XhiveVersionSpaceIf versionSpace = doc.getXhiveVersion().getVersionSpace();
To retrieve a version by version space, use either the getVersionById() or the getVersionByLabel()
method, as described in the following example.
// example of accessing a version via the getVersionById() method
XhiveVersionIf version1_1 = versionSpace.getVersionById("1.1");
Document doc1_1 = version1_1.getAsDocument();
Branching methods
xDB offers various methods for creating branches and for retrieving branch and version information,
including:
Table 18 Branch and version methods
Interface Method Description
XhiveVersionIf createBranch() Creates a new branch.
Node-level versioning
Instead of checking out the entire document, users can check out individual document nodes and
all their descendants.
The following example code checks out a document node.
XhiveDocumentIf doc = ...; // Some versioned document
Node introChapter = doc.executeXQuery("/root/chapter[@id=’intro’]").next().asNode();
XhiveVersionIf version = doc.getXhiveVersion();
XhiveBranchIf branch = version.getBranch();
// Create an owner document for the copy of the chapter to be edited
XhiveDocumentIf temporaryDoc = session.createTemporaryDocument();
XhiveCheckoutIf checkout = branch.checkOutNode(introChapter, temporaryDoc);
Node chapterCopy = checkout.getNodeCopy();
// Edit the copy of the chapter contained in chapterCopy
// ...
Map<Node, Node> nodes = Collections.singletonMap(introChapter, chapterCopy);
branch.checkInNodesAndMetadata(nodes, null, 0);
The following example code checks out metadata fields. Checking out a particular key name allows
checking in a value for that key.
XhiveLibraryChildIf lc = ...; // Some versioned document or blob
XhiveVersionIf version = lc.getXhiveVersion();
XhiveBranchIf branch = version.getBranch();
branch.checkOutMetadataField("key");
String oldValue = lc.getMetadata().get("key"); // If required
Map<String, String> metadata = Collections.singletonMap("key", "newValue");
branch.checkInNodesAndMetadata(null, metadata, 0);
Any number of nodes and metadata fields can be checked in at once to create a single new version
of the document in that branch. Because nodes are checked out from a branch, the check-in creates
a new head version of that branch, regardless of which version is currently the latest.
Note: The XhiveVersionIf.createBranch() method creates another branch.
To determine whether an existing version is searchable, you can use the isQueryable() method of the
XhiveVersionIf interface. This is accessible through XhiveLibraryChildIf.getXhiveVersion().
Querying
Indexing
Default indexes store information only about the current version of the document. On documents that
were created with the ‘queryable’ option set to true, you can use XhiveIndexIf.VERSION_INFO to
enable indexing of old versions. On multipath indexes, which do not use the XhiveIndexIf options,
you can use XhiveExternalIndexConfigurationIf.setStoreVersionInfo(true).
API documentation
com.xhive.dom.interfaces.XhiveLibraryChildIf
Examples
The following code example gets all required attributes of an element.
ASModel model = ((DocumentAS) document).getActiveASModel();
ASElementDecl eltDeclaration = model.getElementDecl(element.getNamespaceURI(),
element.getTagName());
ASNamedObjectMap attributeDeclarations = eltDeclaration.getASAttributeDecls();
System.out.println("The following attributes are all required:");
for (int i = 0; i < attributeDeclarations.getLength(); i++) {
ASAttributeDecl attDecl = (ASAttributeDecl) attributeDeclarations.item(i);
// Check whether this attribute is required
if (attDecl.getDefaultType() == ASAttributeDecl.REQUIRED) {
String attName = attDecl.getObjectName();
System.out.println(attName);
}
}
The following example code shows how to parse a DTD or XML Schema directly into the catalog
of a library.
ASDOMBuilder builder = (ASDOMBuilder) charterLib.createLSParser();
model = builder.parseASURI(url, ASDOMBuilder.DTD_SCHEMA_TYPE);
unCharterCatalog.setPublicId(publicId, model);
The following example code serializes a schema model in the catalog.
DOMASWriter writer = (DOMASWriter) charterLib.createLSSerializer();
ByteArrayOutputStream output = new ByteArrayOutputStream();
writer.writeASModel(output, model);
String schemaString = output.toString();
API documentation
org.w3c.dom.as
com.xhive.dom.interfaces.XhiveCatalogIf
XhiveFormatterIf.formatAsPDF
import org.apache.avalon.framework.logger.ConsoleLogger;
import org.apache.avalon.framework.logger.Logger;
import org.apache.fop.apps.Driver;
import org.apache.fop.messaging.MessageHandler;
// FOP part
try {
Driver driver = new Driver();
Logger logger = new ConsoleLogger(ConsoleLogger.LEVEL_ERROR);
MessageHandler.setScreenLogger(logger);
driver.setLogger(logger);
driver.setRenderer(Driver.RENDER_PDF);
driver.setOutputStream(os);
driver.render(foDocument);
} catch (Exception e) {
//e.printStackTrace();
throw new XhiveException(XhiveException.FORMAT_EXCEPTION, e);
}
}
XhiveTransformerIf.transform
import com.xhive.core.interfaces.XhiveSessionIf;
import com.xhive.error.XhiveException;
import com.xhive.dom.*;
import org.w3c.dom.*;
import javax.xml.transform.*;
import java.io.*;
import java.util.Iterator;
/**
* URI resolver that translates
* xhive:path#query
* into an xquery run on path, e.g.
* xhive:/plays#//TITLE
*/
private class XhiveURIResolver implements URIResolver {
private static final String XHIVE_PREFIX = "xhive:";
private static final String SEPARATOR = "#";
} catch (XhiveException e) {
throw new TransformerException("XhiveXalanTransformer: " +
    "Problem with query " + href + ": " + e.getMessage(), e);
}
// We will only use the first result here (otherwise we would
// have to include a new top-element)
if (queryResult.hasNext()) {
XhiveXQueryValueIf queryValue = (XhiveXQueryValueIf)
queryResult.next();
// Is it a node?
try {
return new DOMSource(queryValue.asNode());
} catch (XhiveException e) {
// must be XQUERY_ERROR_VALUE, so interpret it as a string
return new StreamSource(new StringReader(queryValue.asString()));
}
} else {
throw new TransformerException("XhiveXalanTransformer: " +
    "Query " + href + ": has no results");
}
}
}
For improved Xalan XSLT performance, it is best to create Templates objects using the
newTemplates call on TransformerFactory. The example below shows how to use the interface.
For more information, see the Xalan documentation.
TransformerFactory tFactory = TransformerFactory.newInstance();
Templates translet = tFactory.newTemplates(new DOMSource(xslSource));
// Now keep this translet cached somewhere, and for transforming do:
Transformer transformer = translet.newTransformer();
transformer.transform(new DOMSource(xmlSource), new StreamResult(writer));
One advantage of a compiled stylesheet is that it no longer has references to any xDB persistent data,
so you can use the compiled version with any session.
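A self-contained sketch of this caching pattern using only the JDK (class and variable names are illustrative). The Templates object is compiled once and, being immutable and thread-safe, can be shared; each transformation gets its own cheap Transformer:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TemplatesCache {
    // Compiled once; a Templates object is immutable and thread-safe,
    // so it can be cached and shared across threads and sessions.
    private final Templates translet;

    public TemplatesCache(String xsl) throws Exception {
        TransformerFactory tFactory = TransformerFactory.newInstance();
        translet = tFactory.newTemplates(new StreamSource(new StringReader(xsl)));
    }

    public String transform(String xml) throws Exception {
        // A Transformer is cheap to create from Templates but is not
        // thread-safe, so create one per transformation.
        Transformer transformer = translet.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }
}
```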
Action Locked
Add/remove a document (or library) The library to which the document is added.
Modify a document The document.
Add/remove an index to/from a library The library to which the index is added.
Add/remove a user/group The database object (which means that only one
thread at a time can make these changes).
Update a user/group The database object.
Update a context conditioned index The context conditioned index.
Documents stored in xDB use an internal data structure called namebase. This structure is relevant
to the locking behavior but cannot be accessed directly using the API. The namebase maps element
and attribute names to small integers which are processed faster. The namebase is locked when it is
modified.
When a library is created, the following option influences the namebase locking behavior:
• XhiveLibraryIf.LOCK_WITH_PARENT
By default, each library is created with its own namebase. Using more namebases improves
concurrency, but adds some space and processing overhead. In some cases, for example if there
are many libraries with little content, it may be better for a new library to share the namebase
of the parent library.
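Conceptually, a namebase is an interning table that maps each distinct element or attribute name to a small integer on first use; a toy sketch of the idea (this illustrates the concept only, not the actual xDB implementation):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NamebaseSketch {
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the integer for a name, assigning the next free id on
    // first use. In xDB, such an assignment is a modification of the
    // namebase, which is the case that requires the namebase lock.
    public int idFor(String name) {
        return ids.computeIfAbsent(name, n -> ids.size());
    }
}
```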
API documentation
com.xhive.dom.interfaces.XhiveLibraryIf
Session lifecycle
In xDB, the data is accessed using transactions within sessions. A transaction starts with a call to
session.begin() and ends with a call to session.commit() or session.rollback(). When a
session is in a transaction, the data in the database can be viewed or changed.
You can use a call to session.isOpen() to check if a session is in a transaction (an open or active
session is a session in a transaction).
At a minimum, a full session lifecycle consists of the following operations:
• createSession(), page 141
After a call to session.disconnect() you can do a new call to session.connect() and then
start a new transaction.
After a call to session.terminate() you can not use the session anymore.
The code below is a complete example how to use sessions following this model:
package com.emc.xhivesupport.demo;
import com.xhive.XhiveDriverFactory;
import com.xhive.core.interfaces.XhiveDriverIf;
import com.xhive.core.interfaces.XhiveSessionIf;
import com.xhive.dom.interfaces.XhiveLibraryIf;
Whenever an exception occurs in a transaction, the session must always roll back, whereas whenever
a transaction succeeds, the session must always commit. This is also true when the session is in
read-only mode. Even though there are no changes and no locks to release, a read-only transaction
still consumes resources; a commit or rollback is needed to free them.
The finally block is always executed. If there are exceptions in the try block, the
session.commit() in the try is not executed and the session.rollback() in the finally
block will rollback the session. If there are no exceptions, the execution of the session.rollback()
immediately follows the session.commit(). In this case the rollback simply does nothing.
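The lifecycle described above can be sketched with a minimal stand-in for XhiveSessionIf (the Session interface below is invented for the example, so the sketch compiles without the xDB libraries; the control flow is the point):

```java
public class TxPattern {
    /** Minimal stand-in for the parts of XhiveSessionIf used here. */
    interface Session {
        void begin();
        void commit();
        void rollback();   // a no-op when the session is not in a transaction
        void disconnect();
        void terminate();
    }

    /** Full session lifecycle around one unit of work. */
    static void runInTransaction(Session session, Runnable work) {
        try {
            session.begin();
            work.run();
            session.commit();      // not reached if work.run() throws
        } finally {
            session.rollback();    // does nothing after a successful commit
            session.disconnect();
            session.terminate();
        }
    }
}
```

After a successful commit, the rollback() in the finally block finds no open transaction and simply does nothing; after an exception, it rolls the transaction back before the session is disconnected and terminated.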
A session.terminate() call is allowed when a session is not in a transaction, so there is no need to
replace the above finally block with the following:
} finally {
if (session.isOpen()) {
session.rollback();
}
if (session.isConnected()) {
session.disconnect();
}
session.terminate();
}
Following this model, you can always create your sessions in method scope, and usually you can do
all the session management in the same method. It is not recommended to ’remember’ sessions as
instance members or class members: doing so almost certainly leads to errors caused by not
respecting the session lifecycle.
Samples
UseSessions.java
API documentation
com.xhive.core.interfaces.XhiveSessionIf
Joined sessions
You can have concurrent transactions: different threads can each create their own session and execute
transactions concurrently. Locks ensure that the data cannot be changed in conflicting ways. Within a
single transaction, however, you cannot perform different operations concurrently, for the same
reason: the data must not be changed in conflicting ways.
The requirement that a session is used by one thread at a time is ensured by the xDB session API
as follows:
• With session.join() you can register the current thread to the session (the session joins the
current thread).
• With session.leave() you can unregister the current thread from the session (the session
leaves the current thread).
• With session.isJoined() you can check whether the current thread is registered to the session
(the session is joined to the current thread).
In early versions of xDB, creating and terminating a (remote) session was expensive, because
connections were created and closed under the hood, so it was recommended to avoid these operations
as much as possible by using a session pool. Starting with xDB 8, xDB has a connection manager that
keeps its own pool of connections. Unless performance tests or special conditions suggest otherwise,
session pools are usually not needed in production code.
If you still need a session pool, a session returned to the pool by one thread will in general be reused
by another thread. The session should therefore leave the current thread when it is returned to the
pool. Leaving and joining sessions correctly is tricky enough that it should be coded only once, as
part of the implementation of the pool, rather than at each place the pool is used.
The example below shows how you could implement a session pool:
package com.emc.xhivesupport.demo;
import com.xhive.core.interfaces.XhiveDriverIf;
import com.xhive.core.interfaces.XhiveSessionIf;
import java.util.concurrent.ConcurrentLinkedQueue;
// singleton
private static SessionPool instance = new SessionPool();
private SessionPool() {
}
// core functionality
public synchronized void init(XhiveDriverIf driver) { // thread safe
this.driver = driver;
}
public XhiveSessionIf getSession() {
XhiveSessionIf session = freeSessions.poll(); // thread safe
if (session == null) {
session = driver.createSession();
} else {
session.join();
}
return session;
}
public void returnSession(XhiveSessionIf session) {
session.rollback();
session.disconnect();
session.leave();
freeSessions.add(session); // thread safe
}
public synchronized void close() { // thread safe
XhiveSessionIf session = freeSessions.poll();
while (session != null) {
session.terminate();
session = freeSessions.poll();
}
instance = null; // encourage garbage collection
}
}
package com.emc.xhivesupport.demo;
import com.xhive.XhiveDriverFactory;
import com.xhive.core.interfaces.XhiveDriverIf;
import com.xhive.core.interfaces.XhiveSessionIf;
Other requirements for a pool are conceivable. For instance, pools that keep connected sessions, or
keep sessions that come from different drivers.
API documentation
com.xhive.core.interfaces.XhiveSessionIf
Objects in the database can only be accessed in an open transaction. Furthermore, objects retrieved
from a database in one transaction cannot be used in a subsequent transaction. For example, the
following code snippet will throw an XhiveException with error code OBJECT_CLOSED:
session.begin();
XhiveLibraryIf library = session.getDatabase().getRoot();
System.out.println(library.getNumChildren());
session.commit();
session.begin();
System.out.println(library.getNumChildren()); // ERROR: OBJECT_CLOSED
session.commit();
The statement
System.out.println(library.getNumChildren());
appears two times in the code. The first occurrence is in the same transaction in which the library
reference is obtained. The second occurrence is in another transaction. To avoid the error, retrieve
the object again in the new transaction:
session.begin();
XhiveLibraryIf library = session.getDatabase().getRoot();
System.out.println(library.getNumChildren());
session.commit();
session.begin();
library = session.getDatabase().getRoot(); // get again
System.out.println(library.getNumChildren()); // OK
session.commit();
Use of checkpoint
session.begin();
XhiveLibraryIf library = session.getDatabase().getRoot();
System.out.println(library.getNumChildren());
session.checkpoint(false); // or: session.checkpoint(true)
System.out.println(library.getNumChildren()); // ref library still valid
session.commit();
You can replace the sequence:
session.commit();
session.begin();
by a single call:
session.checkpoint();
API documentation
com.xhive.core.interfaces.XhiveSessionIf
XhiveDriverIf.createSession()
The createSession() method creates a new database session. The session is implicitly joined to the
current thread.
Example
The following example creates a session.
XhiveDriverIf driver = ...; // e.g. from XhiveDriverFactory.getDriver()
API documentation
com.xhive.core.interfaces.XhiveDriverIf
connect()
The connect() method connects a session to a database. This call’s overhead is relatively small, so in a
multi-user setting you may choose to connect every time you start using a session for an individual
user.
begin()
The begin() method starts a transaction. All database changes are part of the transaction and only
become visible in the database after a commit() call or a checkpoint() call. All data read from the
database is in the same state as at the time of the begin() call.
checkpoint()
The checkpoint() method commits database changes within a transaction to the database, including all
changes that were made since the begin() call or the last checkpoint() call. The changes are visible to
other sessions. The transaction remains open after a checkpoint() call.
A checkpoint() call keeps the locks on the database, unless the true option is passed to downgrade the
locks. A checkpoint() call is faster than a commit() call followed by a begin() call.
Like a commit() call, a checkpoint() call deletes all temporary objects, such as XQuery constructed
result nodes or elements created but not appended to their document. Therefore, even though
references to existing database objects remain valid, these temporary objects can no longer be used
after a checkpoint() call.
For optimal concurrency, it is best to keep the number of read locks as low as possible. For this reason,
it may be better to use a commit() call followed by a begin() call instead of a checkpoint() call.
commit()
The commit() method commits all data changes to the database, cleans up data temporarily stored in
the database, and releases all locks.
The time to process a commit() method depends on the changes that were made in the transaction.
rollback()
The rollback() method revokes all changes in the database that were made since calling the begin()
method or the checkpoint() method. If changes were already written to the disk, the rollback() method
can have considerable overhead.
disconnect()
The disconnect() method returns a session back to the initial state it was in when the session was
created. The disconnect() method has no overhead.
Note: Although a disconnect() call marks the end of a session scope, it does not free all the resources
allocated by the session. To improve performance, use the terminate() method to terminate a session
when it is no longer needed. If the session is a remote session, the TCP connection to the page server is
returned to the connection manager. If the terminate() method is not called, the connection is closed
when the session is finalized, after it has been garbage collected.
terminate()
The terminate() method closes the connection to the page server. Terminating a local session has
no effect.
After calling the terminate() method, the session object can no longer be used.
If a session is not terminated, it continues to use resources until it is garbage collected. Open sessions
in transaction can have internal references pointing to them that are otherwise never garbage collected.
Sessions that are not in transaction are garbage collected when all references are released.
join()
The join() method joins a session to the current thread. Only the current thread can use database
objects, for example documents that belong to this session.
When a session is created, it is automatically joined to the thread that creates it.
You should always call the join() method before you start using a session in a given thread. When
working with servlets or EJBs, each request is executed in an ’unknown’ thread, but the thread does
not change during execution of the request, so join() is only called once. The time it takes to execute
the join() method is almost negligible. Still, you should call it only as needed, because doing so can
help you detect unexpected or unwanted thread changes.
Note: It is not possible to use a single session concurrently in multiple threads. Do not try to serialize
the use of a session in multiple threads by synchronizing on the session object. This conflicts with
internal use of the session synchronization and may lead to deadlocks. If it is necessary to serialize the
use of a session, the synchronization should be done on an application object.
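The note above can be illustrated with plain Java (no xDB types involved): serialize access through an application-owned lock object, never by synchronizing on the session itself. The names below are invented for the example:

```java
public class SerializedSessionAccess {
    // Application-owned lock. Never synchronize on the session object:
    // that conflicts with the session's internal synchronization and
    // may lead to deadlocks.
    private final Object sessionLock = new Object();
    private int completedOperations = 0; // stands in for work done via the session

    /** Runs one unit of session work; only one thread enters at a time. */
    public void runSerialized(Runnable work) {
        synchronized (sessionLock) {
            // join the session, do the work, leave the session
            work.run();
            completedOperations++;
        }
    }

    public int completedOperations() {
        synchronized (sessionLock) {
            return completedOperations;
        }
    }
}
```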
leave()
The leave() method unbinds a session from a thread.
It is important to call the leave() method on sessions that are no longer used in a thread. Otherwise, it
may not be possible to terminate the session after that thread has exited, because the terminate()
method would then be called from another thread while the session is still joined to the exited thread.
After calling the commit() and begin() methods, the library could have been removed in
another session. An attempt to use an object in a different transaction usually generates an
XhiveException.OBJECT_DEAD error. The error indicates that the object can no longer be used in
the Java application.
Instead of calling the commit() and begin() methods, call the checkpoint() method. The
checkpoint() method applies permanent changes, does not refresh the database view, and keeps all
locks. All references to database objects that were retrieved can still be used.
Operation execution
A good way to perform different operations consistently over multiple threads is to first create a base
class for performing an operation, which would include the session management code, the exception
handling code and so forth. Then create various derived classes whose only purpose is to run the
desired operation.
In general, if you know in advance that the operation is read-only, it is best to set the session state
to read-only. This may improve concurrency, because no locks are taken by read-only transactions,
page 148.
Exception handling
Where operations happen concurrently, proper session exception handling is of the utmost
importance, to help avoid data corruption, deadlocks, and unexpected behavior.
First and foremost: if an exception occurs, make sure you roll back the transaction:
try {
session.begin();
// Some sort of operation
session.commit();
}
catch (Exception e) {
e.printStackTrace();
session.rollback();
}
When operations are being performed concurrently, locking issues may arise (for example: transaction
A has a read lock on the database, while transaction B tries to retrieve a write lock on the database). In
such cases an XhiveLockNotGrantedException is raised. Since the issue can be temporary (for
example, if you retry the offending operation later, the lock may have been released), a good way to
handle such an exception is to roll back the session and try again.
boolean completed = false;
int unsuccessfulAttempts = 0;
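A fuller sketch of the retry loop that this fragment begins. XhiveLockNotGrantedException and the session are replaced by stand-ins invented for the example, so the sketch is self-contained; the rollback-and-retry control flow is what matters:

```java
public class RetrySketch {
    /** Stand-in for com.xhive.error.XhiveLockNotGrantedException. */
    static class LockNotGrantedException extends RuntimeException {}

    /** Stand-in for the transaction calls of XhiveSessionIf. */
    interface Tx {
        void begin();
        void commit();
        void rollback();
    }

    static final int MAX_ATTEMPTS = 3;

    /** Retries the operation when a lock cannot be granted. */
    static void runWithRetry(Tx session, Runnable operation) {
        boolean completed = false;
        int unsuccessfulAttempts = 0;
        while (!completed) {
            session.begin();
            try {
                operation.run();
                session.commit();
                completed = true;
            } catch (LockNotGrantedException e) {
                // The lock may be released later, so roll back and retry.
                session.rollback();
                unsuccessfulAttempts++;
                if (unsuccessfulAttempts >= MAX_ATTEMPTS) {
                    throw e; // give up after a few attempts
                }
            }
        }
    }
}
```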
Samples
MultithreadedOperations.java
The document ’doc’ added in the transaction within session A remains invisible in any other
transaction until the transaction within session A is committed. Even then, the open transactions of
sessions B and C do not see the added document until their next begin().
A wait option specifies the maximum time in milliseconds for a transaction to wait for a lock grant. If
the wait time elapses and the other transaction still holds a lock on the database object, an
XhiveLockNotGrantedException error is thrown.
Lock exceptions can occur regardless of the wait option setting. For example:
Transaction A reads document X
Transaction A writes document X
Transaction B reads document Y
Transaction A wants to write document Y
-> blocks because transaction B already has a readlock
Transaction B wants to read X
-> blocks because transaction A already has a writelock
At this point, both transactions are unable to continue, because each one is waiting for the other to
finish. In this case, xDB picks one transaction and throws an XhiveDeadlockException error, which is
a subclass of the XhiveLockNotGrantedException error. If a rollback is performed on that transaction,
the locks are released so that the other transaction can continue.
It is possible to get a deadlock even when only one database object is involved. For example:
Transaction A reads document X
Transaction B reads document X
Transaction A wants to write document X
-> blocks because transaction B already has a readlock
Transaction B wants to write document X
-> blocks because transaction A already has a readlock
Application code should always take into account that locking exceptions can occur. Usually the
course of action is to roll back the transaction and retry it a number of times.
Note: Read-only transactions do not need locks, so they do not need such retry logic. Use read-only
transactions whenever you can.
Read-only transactions
By default, transactions can modify database objects. Transactions are made read-only by calling the
setReadOnlyMode() method. Read-only transactions cannot modify database objects. The advantage
of read-only transactions is that they do not take any locks, which improves concurrency with
transactions that modify data.
To get a consistent view of the database without using locks, read-only transactions view a logical
snapshot of the data at the time the transaction begins. Read-only transactions do not see any
modifications made to the data after that point.
Data pages that have been deleted are not reallocated to new documents as long as there are still open
transactions that could use the old data. Therefore, transactions should not be kept open indefinitely.
The checkpoint() method does not affect read-only transactions.
API documentation
com.xhive.core.interfaces.XhiveDriverIf
• Indexes
• Index APIs and samples
• Path indexes
• Multipath indexes
• Multipath index merge
• Multipath indexing methods
• Value indexes
• Value indexing methods
• Full-text indexes
• Full-text indexing methods
• Metadata value indexes
• Metadata full text indexes
• Library indexes
• Library indexing methods
• ID attribute indexes
• ID attribute indexing methods
• Element name indexes
• Element name indexing methods
• Concurrent indexes
• Concurrent indexing methods
• Non-blocking incremental indexes
• Context conditioned indexes
• Context conditioned indexing methods
• Optimizing index performance
• Indexes and timezones
Indexes
Various types of indexes can be used to speed up queries. Especially with large data sets, indexes
are essential to query performance.
Indexes that are ’live’ are updated automatically when the indexed data is updated. Since updating the
data means updating the indexes, the number of indexes directly impacts update performance. The
only non-live indexes are the context conditioned indexes, page 168, which are deprecated and
supported only for compatibility with previous xDB versions.
The following live index types are currently supported:
• path indexes, page 151
• full text indexes, page 163
• multipath indexes, page 153
• value indexes, page 161
• metadata indexes, page 164
• library ID indexes and library name indexes, page 165
• ID attribute indexes, page 166
• element name indexes, page 167
Only libraries can have a library name index and/or a library ID index. The other index types can be
defined for a library or for a document. Indexes are maintained automatically for all descendants and
children of the library or document. The index is not locked with the library or document, to improve
concurrent access to the index.
Most of the index types can be defined as either concurrent or compressed.
An index stores key-value pairs. The index key is a string or, in the case of value indexes, a number
type; the value is a node set. The index keys are always sorted. Indexes are scalable: the number of
index keys and values can grow without limitation. The node sets can become large, especially with
ID attribute, element name, and value indexes.
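Conceptually, such an index is a sorted map from keys to node sets; a toy illustration (using string labels in place of real node sets, with invented names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ValueIndexSketch {
    // Sorted map from index key to the "node set" (here: node labels).
    private final TreeMap<String, List<String>> entries = new TreeMap<>();

    public void add(String key, String node) {
        entries.computeIfAbsent(key, k -> new ArrayList<>()).add(node);
    }

    /** Index lookup: all nodes whose key equals the given value. */
    public List<String> lookup(String key) {
        return entries.getOrDefault(key, List.of());
    }

    /** Keys come back in sorted order, which supports range scans. */
    public Iterable<String> keys() {
        return entries.keySet();
    }
}
```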
For information about index use with XQuery, refer to Using indexes in XQuery, page 188.
Samples
FTI.java
CreateMultiPathIndex.java
LibraryIndexes.java
IdAttributeIndex.java
ValueIndex.java
ElementNameIndex.java
IndexAdder.java
IndexInConstruction.java
API documentation
com.xhive.index.interfaces.XhiveIndexIf
com.xhive.index.interfaces.XhiveIndexListIf
com.xhive.index.interfaces.XhiveIndexAdderIf
com.xhive.index.interfaces.XhiveSubPathIf
com.xhive.index.interfaces.XhiveExternalIndexConfigurationIf
com.xhive.index.interfaces.XhiveAnalyzer
com.xhive.index.interfaces.XhiveXqftOptionsAnalyzer
com.xhive.index.interfaces.XhiveIndexWithPathsAnalyzer
com.xhive.index.interfaces.XhiveIndexInConstructionIf
com.xhive.index.interfaces.XhiveIndexInConstructionListIf
com.xhive.index.interfaces
Path indexes
Path indexes index the values of elements and attributes. Compared to value indexes, they provide a
more general way of specifying the indexed element and allow multiple element and attribute values
to be used as index keys. For example, a full-text field index key can be used as one of the multiple
values to accelerate xhive:fts queries.
Path indexes are specified using an XPath-like syntax. The syntax consists of a path to the indexed
element and an optional specification of the values that are used as index keys. The element associated
with the index key value can only contain a single text or CDATA child node.
//elem indexes all elements named elem. This is similar to an element value index. Such an index
can speed up queries such as //elem[. = "value"], /foo/bar/elem[. = "value"], or
/foo[bar/elem = "value"].
//elem[@attr] indexes all elem elements that contain attr attribute values. This index option is
similar to an attribute value index and speeds up queries such as //elem[@attr = "value"].
//{http://www.example.com}elem[@{http://www.example.com}attr] indexes all elem elements that
contain attr attribute values that are in the http://www.example.com namespace.
//*[@attr] indexes all attr attribute values independent of the element name. This index option
is similar to an attribute value index and speeds up queries such as //*[@attr = "value"] and
//elem[@attr = "value"].
//chapter/title indexes all title elements that are nested in chapter elements. This index option
can be used for queries such as //chapter/title[. = "Intro"].
//chapter[title] indexes all chapter elements that contain the specified title element value. This
index option can be used for queries such as //chapter[title="Intro"].
/root[node/@id] indexes the id attribute values of node elements nested in a root element. Using
this type of path can speed up index updates because only the nested elements are searched and not
the entire document.
//elem[@attr1 + @attr2] indexes all elem elements that contain attr1 and attr2 attribute values.
This index option can be used for queries that require only a single lookup, such as //elem[@attr1
= "value1" and @attr2 = "value2"], or a query such as //elem[@attr1 = "value1"
and @attr2].
//elem<INT> indexes the elem element values as integers, similar to a value index using the
TYPE_INT option. This index option can be used for queries such as //elem[. = 10]. Specifying
the type as an option is not possible in path indexes, because in an index with multiple values each
value used as the index key can have a different type.
//items/item[@id<INT> + price<FLOAT> + description/name<STRING>] indexes values
that have different types. This kind of index option can be used for queries such as //items/item[@id
= 10 and price = xs:float(4.53) and description/name = "keyboard"].
//article[body<FULL_TEXT>] indexes articles by the full-text content in the article body. This
index option can be used for queries such as //article[body contains text "apples"].
//article[body<FULL_TEXT:my.package.CustomAnalyzer:>] indexes articles by the
full-text content in the article bodies using a custom text analyzer.
//article[author<STRING> + body<FULL_TEXT::GET_ALL_TEXT,
SA_ADJUST_TO_LOWERCASE, SA_FILTER_ENGLISH_STOP_WORDS>] indexes
articles on both author and content. This index option can be used for queries such as
//article[author="John" and body contains text "apples"].
//elem[%{key1} + %{key2}] indexes all elem elements whose owner document contains
metadata fields with names key1 and key2. This index option can be used for queries that require
only a single lookup, such as //elem[xhive:metadata(., "key1") = "value1" and
xhive:metadata(., "key2") = "value2"].
In these specifications metadatakey is the metadata field name, localname is an XML name,
reference is an XML character or predefined entity reference, and analyzername is a Java class
name.
Some of the full-text options are only valid in combination with the XhiveStandardAnalyzer. For more
information about the analyzer, see Full-text indexes, page 163.
Multipath indexes
A multipath index allows you to index multiple elements without requiring explicit configuration of
every single index path. It can index the contents of elements as specific value types or as full-text.
To specify which XML elements should be indexed, you specify the index's main path and one or
more sub-paths. All sub-paths are resolved relative to the main path, and each is specified through an
XPath-like path expression and a set of options. All elements matched by the path expression are
indexed using the given options.
The two most important configuration options for a sub-path are VALUE_COMPARISON and
FULL_TEXT_SEARCH. They can be used together, allowing you to search the contents of an
element either through value or through full-text search. The default type used for value comparison of
a sub-path is String.
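For example, assuming a multipath index with main path /main/path and a sub-path title configured with both options (the paths and comparison values here are illustrative), both value and full-text predicates can be answered from the same index:

```xquery
(: served by the VALUE_COMPARISON part of the sub-path :)
/main/path[title = "Intro"],
(: served by the FULL_TEXT_SEARCH part of the sub-path :)
/main/path[title contains text "intro"]
```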
Multipath indexes are implemented using Apache Lucene.
Assuming that the main path of the multipath index is /main/path, the table below shows various
possibilities for specifying sub-path expressions.
Sub-path specification
Indexing elements both by value and as tokenized full-text: multipath indexes allow a sub-path
to specify SubPathOptions.VALUE_COMPARISON together with
SubPathOptions.FULL_TEXT_SEARCH, in which case all elements matching the sub-path are
indexed both by value and as full-text. Path indexes also allow you to specify a node twice, once
indexed by value and once as full-text, but because you have to search for these elements
together, this is ineffective.
The merging policy is configurable. This requires care, because merging has direct impact on the
overall performance of the system. While it is desirable to keep the number of sub-indexes small,
sub-index merging can be time-consuming and CPU-intensive, and if done too often it can affect the
performance of the page server. The following index merging tasks can be fine-tuned to achieve an
optimal and balanced level of indexing and query performance:
Lucene internal During indexing, Lucene segments are merged into larger ones as new
segment merge data is being indexed.
Final merge An asynchronous process merges sub-indexes into a single, optimized
“final” index. A final merge is very I/O and CPU-intensive, and can take
a long time to finish. Especially for large indexes, it should not be run
during performance-critical periods.
Non-final merge An asynchronous process merges smaller sub-indexes into larger ones. A
non-final merge is more lightweight than a full final merge, so non-final
merges can be executed more frequently.
The asynchronous merging tasks produce new sub-indexes as a result of merging smaller sub-indexes.
The original sub-indexes will eventually be deleted by a periodic index cleaning task.
For information on configuration settings for merging and cleaning tasks, refer to Multipath index
properties, page 157.
Various aspects of multipath indexing can be configured using settings in the file xdb.properties,
page 66.
Note: For the settings to take effect, the file xdb.properties must be present in the page server’s
Java classpath.
The settings are global for the page server, and apply to all multipath indexes in the federation. It is
possible to override certain settings (such as finalMergingInterval and finalMergeNoLogging)
for specific indexes using the API.
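For example, a sketch of an xdb.properties fragment that runs non-final merges every ten minutes and keeps final merges outside business hours (the values are illustrative; the property names are described in Table 25):

```properties
# run non-final merges every 10 minutes instead of the default 5
xhive.lucene.cleanMergeInterval=600
# run final merges every 8 hours ...
xhive.lucene.finalMergingInterval=28800
# ... but never between 8AM and 8PM
xhive.lucene.finalMergingBlackout=8-20
```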
Table 25 Multipath index settings
Property Description
xhive.lucene.cleanMergeInterval The interval (in seconds) between the start of non-final
merges. Default is 300.
xhive.lucene.finalMergingInterval The interval (in seconds) between the start of final
merges. Default is 14400.
xhive.lucene.finalMergingBlackout The final merges blackout time window.
This defines a daily period during which final merges are
forbidden, using the form StartHour-EndHour (in
24h format). Default value is 0 (no blackout period).
Example: 8-20 (blackout from 8AM to 8PM)
xhive.lucene.cleaningInterval The interval (in seconds) between the start of periodic
index cleaning tasks. Default is 120.
xhive.lucene.blacklistsKeep Indicates whether or not to keep blacklists. Default is
false.
xhive.lucene.cleaningBlacklistInterval The interval (in seconds) between the start of
periodic cleaning of blacklists. Only used if
xhive.lucene.blacklistsKeep is set to false.
Default is 300.
xhive.lucene.refreshBlacklistCacheDuringMerge Determines whether the blacklist cache will be refreshed
during non-final merges. This may improve query
performance under heavy ingest. Default is false.
xhive.lucene.finalindex.size The maximum number of sub-indexes in final-merged
indexes. Default is 1.
xhive.lucene.nonFinalMaxMergeSize The maximum size (in bytes) of a sub-index to be
included in a non-final merge. Default is 300000000.
xhive.lucene.mergeFactor The number of documents in a segment that triggers the
internal Lucene merge. Default is 10.
xhive.lucene.maxMergeDoc The maximum number of documents in one segment.
Default is 1000000.
xhive.lucene.maxSegmentsForOptimization The maximum number of segments in a sub-index.
Default is 5.
xhive.lucene.parallelExecutionFinalMergeCrossNodes Indicates whether multiple final merges
can be executed in parallel if there are multiple multipath
indexes. Default is true.
xhive.lucene.mergingPolicyThreadPoolSize The maximum number of threads to use for the
asynchronous merging tasks. Default is 8.
xhive.lucene.finalMergeNoLogging Indicates whether transaction logging is disabled for final
merge operations. Default is true.
Note: Disabling of transaction logging for final merges
significantly improves their performance, but it means
that incremental backups will not include corresponding
transaction log records for the multipath index. During
incremental backup of a federation that includes a
multipath index, either enable final merge transaction
logging, or else exclude from the incremental backup
all multipath indexes that have final merge transaction
logging disabled. To back up such indexes, consider
using standalone backup or library backup.
xhive.lucene.ramBufferSizeMB The size (in MB) for the Lucene index writer
RAMDirectory. Default is 3.
xhive.lucene.useCompoundFile Indicates whether to use the Lucene compound file
format. Default is false.
xhive.lucene.temp.path The temporary directory for index entries.
Default is blank (the system default temporary directory).
xhive.lucene.queryResultsWindowSize The maximum number of documents returned by one
query. Default is 12000.
xhive.lucene.ratioOfMaxDoc The maximum term frequency (0-1) for terms to be
accepted by wildcard queries. (Term frequency is
calculated during indexing and stored with the Lucene
index.)
For example, if the frequency is 0.5 or higher, the
queries will not accept terms that occur in more than
half of the documents in the index, so that searching for
an.* will be unlikely to return any hits for common
words such as “and”.
Default is 1.
xhive.lucene.termsExpandedNumber The cutoff number of terms returned by wildcard queries.
Default is 65536.
xhive.lucene.fuzzyQueryPrefixLength The number of leading characters to ignore when
assessing similar terms in fuzzy queries. Default is 0.
xhive.lucene.fuzzyTermsExpandedNumber The maximum number of similar terms to return by fuzzy
queries. Default is 2147483647.
xhive.lucene.facetPathPatternMatch For facet search, indicates whether to use the matching
sub-path (false) or the XML node path (true) as the
facet key. Default is false.
xhive.lucene.searchValueForFtcontains Indicates whether sub-paths that specify the
VALUE_COMPARISON option, but not the
FULL_TEXT_SEARCH option, can be used when
evaluating full-text search queries. Default is false.
xhive.lucene.ignoreValueComparisonScore Indicates whether to ignore the score for value
comparison queries on a multi-path index. Default is
true.
xhive.lucene.strictIndexTypeCheck Determines whether to perform strict type checking on
indexed sub-paths. Default is true.
If set to false, sub-paths with inconsistent value data
types can still be indexed (as string).
Note: This may cause non-conformant behavior with
respect to the XQuery specification.
xhive.lucene.fieldCacheSize The maximum number of entries in the query cache for
Lucene fields and their corresponding sub-paths. Default
is 4096.
xhive.lucene.maxBooleanClause The maximum allowed number of Lucene
BooleanClauses in a BooleanQuery. Default is 65536.
Samples
CreateMultiPathIndex.java
API documentation
com.xhive.index.interfaces.XhiveIndexIf
com.xhive.index.interfaces.XhiveSubPathIf
com.xhive.index.interfaces.XhiveMultiPathIndexConfigurationIf
com.xhive.index.interfaces.XhiveExternalIndexConfigurationIf
com.xhive.index.interfaces.XhiveIndexWithPathsAnalyzer
Samples
CreateMultiPathIndex.java
API documentation
com.xhive.index.interfaces.XhiveScoreCustomizerIf
com.xhive.index.interfaces.XhiveMultiPathIndexConfigurationIf
Value indexes
A value index stores elements by an element value or attribute value. Value indexes can be created for
a library or a document. An index list can contain multiple value indexes.
Value indexes support namespaces. Use of namespaces requires the indexed documents to be parsed
with the namespaces option enabled, and the value index to be created with Element URI or Attribute
URI parameters.
xDB supports value indexing of:
• elements by element value.
• elements by attribute value.
• named elements by attribute value.
The following example code creates a value index, using the addValueIndex() method. The parameters
that are supplied to the addValueIndex() method determine the exact type of the value index.
// create a value index that stores elements by element value
XhiveIndexIf nameIndex =
indexList.addValueIndex(nameIndexName, null, "NAME", null, null);
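Once this index exists, queries that compare NAME element values can be answered from it; a sketch (the comparison value is illustrative):

```xquery
//NAME[. = "John Doe"]
```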
Samples
ValueIndexIndex.java
API documentation
com.xhive.index.interfaces.XhiveIndexListIf
Full-text indexes
A full-text index is a special form of value index that stores elements by element text values or an
attribute value. Full-text indexes are more versatile but slower than value indexes, especially during
updates. Full-text indexes support namespaces. Use of namespaces requires the indexed documents to
be parsed with the namespaces option enabled, and the full-text index to be created with Element URI
or Attribute URI parameters.
An index list can contain multiple full text indexes. Full-text indexes can be used to:
• Search for individual words of an element value.
• Perform complex Boolean and wildcard queries.
• Index all the underlying text of an element and subelements.
There are some index options specifically for full-text indexes:
Table 26 Full-text index options
Example
The following example code uses the addFullTextIndex() method to create a full-text index. The
parameters that are supplied to the addFullTextIndex() method determine the exact type of the
full-text index.
// create a full text index on the text-contents of the name elements
XhiveIndexIf nameIndex = indexList.addFullTextIndex(nameIndexName,
null, "NAME", null, null, null,
XhiveIndexIf.FTI_SUPPORT_PHRASES | XhiveIndexIf.FTI_GET_ALL_TEXT);
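The resulting index can be searched through the xhive:fts extension function (see Full-text queries, page 194); a sketch, with an illustrative search term:

```xquery
//NAME[xhive:fts(., "john")]
```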
Samples
IndexAdder.java
API documentation
com.xhive.index.interfaces.XhiveIndexIf
Library indexes
A library index indexes the content of a library. Library indexes must have a unique name.
There are two types of library indexes:
• Library ID index: uses the IDs of the library content objects, such as documents, libraries, and
BLOBs. Library ID indexing is efficient when many content objects are stored in the library.
• Library name index: uses the names of the library content objects, and improves performance of
XLink operations and full path XPointer queries.
By default, a library has a library name index. The root library also has a library ID index. A library
can only have one library ID index and one library name index. Adding a second library ID or library
name index generates an exception.
A library ID index improves the performance and scalability of the get(long Id) method in the
XhiveLibraryIf interface.
The following code sample shows how to add a library ID index:
//get the index list of the library
XhiveIndexListIf indexList = library.getIndexList();
//add a library ID index to the library
//(addLibraryIdIndex() parallels the addLibraryNameIndex() call shown below)
XhiveIndexIf idIndex = indexList.addLibraryIdIndex("Library ID Index");
A library name index improves the performance and scalability of the get(String name) method in
the XhiveLibraryIf interface, and improves performance of XLink operations and full path XPointer
queries. Names are not mandatory for library content and some content is not included in the index.
The following code sample shows how to add a library name index:
//add a library name index to the library
String nameIndexName = "Library Name Index";
XhiveIndexIf nameIndex = indexList.addLibraryNameIndex(nameIndexName);
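Both library index types are used implicitly by the doc() XQuery function, so path and ID lookups such as the following benefit automatically (the paths are illustrative):

```xquery
doc("/mylib/mydoc"),
doc("/mylib/id:1234")
```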
ID attribute indexes
An ID attribute index stores elements by their unique element ID. Users typically do not access
ID attribute indexes directly; they are used implicitly to improve the performance of element ID
operations and of XQuery/XPath/XPointer queries.
ID attributes can be specified in the DTD or XML-schema associated with a document.
ID attribute indexes can be created for a library or a document. Element IDs are only unique within the
context of a document. Since a library can contain more than one document, each document could use
the same element IDs. The index does not limit the number of entries for a given key.
Examples
The following code sample shows how to add an ID attribute index to a document.
//Get the indexlist
XhiveIndexListIf indexList = document.getIndexList();
//add the id attribute index to the indexlist of the document if the index is not found
String indexName = "ID Attribute Index";
XhiveIndexIf index = indexList.getIndex(indexName);
if (index == null){
index = indexList.addIdAttributeIndex(indexName);
}
XQuery, XPath, and XPointer queries, and the getElementsById(String elementId) method use ID
attribute indexes. Users typically do not access this type of index directly. However, the following
code sample shows how to look up the elements stored under a given key:
//Print the elements with key "p3"
String key = "p3";
XhiveNodeIteratorIf nodesFound = index.getNodesByKey(key);
while (nodesFound.hasNext()) {
System.out.println(((XhiveNodeIf) nodesFound.next()).toXml());
}
Example
The following code sample shows how to create a selected element name index.
//Add a selected element name index
String[] names = {"NAME", "BORN", "WIFE"};
XhiveIndexIf selectedElementNameIndex = indexList.getIndex(selectedElementIndexName);
if (selectedElementNameIndex == null){
indexList.addElementNameIndex(selectedElementIndexName, names);
}
For namespaces, the selected element names can be defined as follows:
//Define the names of an element name index with namespaces
String[] names = {"{http://www.x-hive.com}chapter", "{http://www.x-hive.com}owner"};
Concurrent indexes
Concurrent indexes are not locked for the duration of the transaction when accessed or modified.
Only the used pages are latched, and only for the time that they are read or modified. This process
improves concurrency at the expense of some extra overhead when using the indexes. Whether the net
effect is beneficial depends on your application.
Sample
IndexInConstruction.java
API documentation
com.xhive.index.interfaces.XhiveIndexInConstructionIf
com.xhive.index.interfaces.XhiveIndexInConstructionListIf
if ( index != null ) {
// remove existing index first
indexList.removeIndex(index);
}
3. Create an index node filter using the XhiveIndexNodeFilterIf interface to define which nodes to
include in the index, similar to the following:
index = (XhiveCCIndexIf) indexList.addNodeFilterIndex(
"samples.manual.SampleIndexFilter", indexName);
4. Add context conditioned index entries for a document using the indexDocument() method in the
XhiveCCIndexIf interface, similar to the following:
index.indexDocument(newDocument);
Example
The code sample below uses the created index to retrieve all titles of even-numbered chapters:
Iterator keyIter = index.getKeys();
while (keyIter.hasNext()) {
String key = (String) keyIter.next();
System.out.println(key);
}
The following example code uses the getNodesByKey() method to retrieve nodes from the index
based on a given key:
XhiveNodeIteratorIf nodesFound = index.getNodesByKey("AMENDMENTS");
while (nodesFound.hasNext()) {
XhiveNodeIf docFound = (XhiveNodeIf)nodesFound.next();
System.out.println(docFound.toXml());
}
Index scope
xDB supports nested library structures. Users are free to select the scope context of an index. For
example, the scope of the root library is larger than the scope of a nested library. The smaller the index
scope, the better the performance of data updates and queries can be.
Note: The scope of library indexes is of no importance because library indexes only apply to the
direct children of the library.
Index selectivity
In general, indexes provide the best query and update performance when the index is as selective as
possible, that is, when each key maps to as few nodes as possible.
Library indexes have both unique names and IDs, provide optimal selectivity, and therefore have
excellent query and update behavior. Value indexes and ID attribute indexes are the next best choice
for selective indexes. Element name indexes are not selective because most documents do not have
unique element names. To maintain a good data update performance, it is better to use selected
element name indexes instead of the default element name indexes. Selected element names only
index a subset of all elements.
Ignoring indexes
Queries can disable certain indexes by providing a comma-separated list of index names with the
xhive:ignore-indexes option. These indexes are not used to optimize the associated query.
declare option xhive:ignore-indexes 'myindex1,ftsindex';
for $x in ...
while (result.hasNext()) {
XhiveXQueryValueIf value = result.next();
// We know this query will only return nodes.
Node node = value.asNode();
// Do something with the node ...
}
Within the query, the context item (accessible via the . operator) is initially bound to the node on
which the query was executed:
XhiveNodeIf node = ...;
XhiveXQueryResultIf result = node.executeXQuery(
"./author/first, ./author/last, ./contents");
// using a Java 5 for each loop
for (XhiveXQueryValueIf value : result) {
// do something with the value ...
}
If you only want to display the result, you can use the toString() method on the values returned,
regardless of their type:
XhiveLibraryChildIf lc = ... ;
String query = ... ;
XhiveXQueryResultIf result = lc.executeXQuery(query);
while (result.hasNext()) {
System.out.println(result.next().toString());
}
For more control over serialization, nodes can be serialized using an LSSerializer obtained from a
library using XhiveLibraryIf.createLSSerializer().
If the query uses node constructors, nodes are created in a temporary document. If desired, these
nodes can be inserted into another document using the DOM importNode() method. If you want to
insert the nodes into a particular document, specifying an owner document for new nodes in the call
is more efficient than creating a temporary document and importing its nodes into the destination
document. For example:
XhiveLibraryChildIf lc = ... ;
XhiveDocumentIf doc = ... ; // Create new nodes in this document
XhiveXQueryResultIf result = lc.executeXQuery("<count>{count(//item)}</count>", doc);
// We know this query will only return a single node.
XhiveXQueryValueIf value = result.next();
Node node = value.asNode();
// Append it to the document element of destination document
doc.getDocumentElement().appendChild(node);
The query result is evaluated lazily each time the next() method is called on the result iterator. Avoid
calling result.next() within the same session after modification of searched documents or libraries, as
undefined results may occur. If you want to use the query output to modify the searched documents,
use extension function xhive:force() or the update syntax, page 211.
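For example, wrapping a path expression in xhive:force() evaluates it fully before the result is used (the document path is illustrative):

```xquery
xhive:force(doc("/mydoc")//elem)
```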
API documentation
com.xhive.dom.interfaces.XhiveNodeIf
External variables
XQuery provides a way to import external values into the query scope (parameters). To use this feature
in xDB, create a query using the createXQuery(String query) method on a XhiveNodeIf interface.
This method parses the query, resolves module imports, and returns an XhiveXQueryQueryIf object
that represents the query.
Note: The XhiveXQueryQueryIf object is only valid for the current database session, so do not
try to use it across multiple sessions.
XQFT Example
The syntax for using variables in XQFT (XQuery Full-Text) requires braces around the variable:
XhiveNodeIf node = ...;
XhiveXQueryQueryIf query = node.createXQuery(
"declare variable $term external; " +
"/book/chapter[. contains text {$term} ]/title");
query.setVariable("term", "bicycle");
Iterator<XhiveXQueryValueIf> result = query.execute();
...
All built-in DOM objects can be used directly, including the XhiveLibraryIf object. Nodes from
another DOM implementation are imported into the creation document for this XQuery. For more
information, see the XhiveXQueryQueryIf.setCreationDocument method.
Values that cannot be mapped are converted into a special Java value. For more information, refer to
Java objects and instance methods, page 218.
It is also possible to supply an Iterator over a sequence of objects. This can be especially handy
for executing XQuery queries over the results of other queries, effectively creating a lazily executed
XQuery pipeline. Iterators used by a query cannot be reused afterwards, not even by the same query.
The declared variable is empty if this query is run again.
Example
Custom functions
Due to optimizations, your function may be called at a different moment during evaluation than you
may expect. Therefore, it is best to avoid side effects in your extension functions.
Example
API documentation
com.xhive.dom.interfaces.XhiveNodeIf#createXQuery(java.lang.String)
com.xhive.dom.interfaces.XhiveXQueryExtensionFunctionIf
Examples
doc("/"), (: All documents in the database :)
doc("/document.xml"), (: Document "document.xml" in the root library :)
doc("/MyLibrary"), (: All documents in "MyLibrary" :)
doc("/id:10"), (: The document(s) in the library with id "10" :)
doc("/mylib/mydoc"),
doc("/mylib/mysublib/id:1234")
doc("relative/path")
doc("../steps/work/./too")
The argument does not have to be a string literal but can be any expression returning a string.
If there is a context node, an absolute path expression starts at the root (fn:root(.)) of the context
node. In an outer expression, it starts at the document, or at all documents in the library, on which
the XQuery was executed, or at the document containing an initial context node. You can use
xhive:input() to access the calling documents when there is a context node.
/docelem[//@id="2"]
(: this is equivalent to :)
xhive:input()/docelem[root(.)//@id="2"]
xDB also resolves URLs passed to the doc() function, such as
doc('http://example.com/mydoc.xml'), by retrieving the document and parsing it. The URL is
resolved using Java's java.net.URL class, so all URI schemes supported by Java are available from
XQuery. Note: Applications can control this behaviour by means of an XQuery security policy.
Applications can control document resolution in XQuery through a custom XQuery resolver.
Note: In xDB, the collection() function is similar to the doc() function, except that collection()
can be called without any parameter.
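For example, assuming that the parameterless call returns every document in the database:

```xquery
collection(),             (: all documents in the database :)
collection("/MyLibrary")  (: all documents in "MyLibrary" :)
```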
Exception Thrown on
XhiveXQueryErrorException Semantic errors within the query. Thrown either directly or
as one of its subclasses: XhiveXQueryTypeException or
XhiveStackOverflowException.
XhiveXQueryTypeException Errors related to the type system, for example when a
supplied value did not match the expected type. This
exception is a subclass of the XhiveXQueryErrorException
class.
• To set an option for a specific part of a query, you can use an extension expression. Extension
expressions are specified using the following syntax: (# QName Value #)
{ expr }
The QName option is set for the entire inner expression. Quotes around the Value parameter are
optional. Multiple options can be set at once by writing multiple (# #) parts before the curly
braces part.
Option Description
xhive:index-debug Checks if an index is used in a query. When its value is different from the
empty string, the query evaluator produces a message whenever a value is
looked up in an index selected by the optimizer.
xhive:queryplan-debug Similar to xhive:index-debug, but shows how the query is divided into
parts, the order in which the parts are executed, and which indexes and
options are looked up.
xhive:pathexpr-debug Similar to xhive:index-debug, but shows which low level expressions
within the XQuery are executed, and in what order.
xhive:optimizer-debug Similar to xhive:queryplan-debug, but shows how the query optimizer tries
to create an index plan for a path expression. The output contains detailed
information about the indexes that are considered, including those that
are eliminated, and how a query plan is constructed. The contents of the
output are currently not documented.
xhive:ignore-indexes Provides a comma-separated list of indexes that should not be used to
optimize accesses.
xhive:index-paths-values Provides a comma-separated list of paths whose values to retrieve directly
from a multipath index.
xhive:fts-analyzer-class Configures a fully qualified analyzer class name to use in text searches for
a full-text query or the xhive:fts function. If an index is present, the value
of this option takes precedence over the index's analyzer.
xhive:fts-implicit-conjunction Specifies the implicit conjunction operator for full-text searches. The only
valid values are AND and OR. Default is OR.
xhive:fts-similarity-class Configures a fully qualified class name of the similarity that is used for
score calculation in the full-text query.
xhive:fts-thesaurus-class Configures a fully qualified class name of the thesaurus handler used in the
full-text query. If a thesaurus handler is already set through the API, this
option takes precedence.
xhive:timer Specifies a timer for the encapsulated expression.
xhive:max-tail-recursion-depth Specifies the maximum recursion depth for tail recursive functions.
Default is 10000.
xhive:implicit-timezone Specifies the implicit time zone, used by functions and operations in the
various date types, such as xs:date, if no explicit time zone is supplied.
The default implicit time zone is PT0H. When set to an empty string,
the local time zone is used. For more information refer to Indexes and
timezones, page 171.
xhive:return-blobs If set to true, changes the behaviour of the doc() and related functions to
also return BLOBs.
xhive:return-versions-all Changes the behaviour of the doc() and related functions to return all
versions of queryable documents. Other documents, such as versioned
documents that have not been created with the queryable parameter set to
true, will behave as if the option is not defined (meaning that they always
only return the latest version).
xhive:return-versions-at-date Changes the behaviour of the doc() and related functions to return the
last version of queryable documents that has a timestamp less than the
specified timestamp. The timestamp is defined as an xs:dateTime string
(see example below). For regular documents the same caveat applies as
with xhive:return-versions-all.
xhive:return-versions-before-date Changes the behaviour of the doc() and related functions to return all
versions of queryable documents that existed before the specified
timestamp. Can be used in conjunction with xhive:return-versions-after-date
to define a closed date range. For regular documents the same caveat
applies as with xhive:return-versions-all.
xhive:return-versions-after-date Changes the behaviour of the doc() and related functions to return all
versions of queryable documents that existed after the specified timestamp.
Can be used in conjunction with xhive:return-versions-before-date to define
a closed date range. For regular documents the same caveat applies as
with xhive:return-versions-all.
Examples
(# xhive:index-debug "true" #) {
doc("/products")//product[@product_id = "42"]
}
(# xhive:queryplan-debug "true" #)
(# xhive:pathexpr-debug "true" #) {
doc("/products")//product[@product_id = "42"]
}
(# xhive:queryplan-debug "true" #)
(# xhive:optimizer-debug "true" #) {
doc("/products")//product[@product_id = "42"]
}
(# xhive:fts-implicit-conjunction 'AND' #) {
document("/manual")//paragraph[xhive:fts(.,"long list of words")]/text()
}
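The date-based version options take such an xs:dateTime string as their value; a sketch (the document path is illustrative):

```xquery
(# xhive:return-versions-at-date "2003-01-01T00:00:00Z" #) {
doc("/version-lib/book.xml")/book/title
}
```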
The values for external query variables can be specified using additional arguments to the
xhive:evaluate() function. The number of these arguments must be even, and they must alternate
between QNames (name of the external variable) and items (value of the external variable).
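As a sketch, assuming the query string is the first argument to xhive:evaluate() (the query and variable name are illustrative), a call with one external variable passes one QName/item pair:

```xquery
xhive:evaluate("declare variable $id external; //item[@id = $id]",
xs:QName("id"), "42")
```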
• xhive:input() as document-node()
This function returns the calling documents and is useful when there is another active context node.
Example: If the function xhive:full-path($doc) returns the path '/path/to/doc.xml', you can access
the same document with the function doc('/path/to/doc.xml').
This function evaluates its argument internally, which can make it possible to use the query result for
modifying the searched data; that is normally impossible due to lazy query result evaluation.
Example:
xhive:force(doc('doc')//elem)
It is possible to specify a set of documents as an argument. The following query retrieves all
document versions with the release2 label.
xhive:version(doc("/versioned-lib"), "release2")
– If the branch ID is specified, the result contains only those version IDs that are part of that branch
and the ones shared with other branches.
– If 1 is passed as the argument, the result contains a list of all branch IDs in the version space
of the document argument.
– If the version ID is specified, the result contains the version labels for that version.
For non-versioned documents, or when the branchversion argument refers to a nonexistent branch
or version, the result is the empty sequence.
The example query below gets all the different titles of all book versions created before 2003:
distinct-values(
let $doc := doc("/version-lib/book.xml")
for $version in xhive:version-ids($doc)
where xhive:version-property($doc, $version, "date") < "2003-01-01"
return xhive:version($doc, $version)/book/title
)
Querying inside documents is also possible. The example query below returns all document versions
in the date range that contained <name>John</name> in the XML:
for $doc in xhive:collection-between-dates(’/some/library’,
xs:dateTime(’2011-01-01T01:00:00Z’),xs:dateTime(’2012-05-13T10:00:00Z’))
where $doc//name=’John’ return $doc
Note: For regular documents or versioned documents that do not have the queryable option enabled,
the function ignores the passed date parameters and always returns the current version, as the
normal fn:collection() function would.
xhive:metadata(doc("/mydoc"), "author")
xhive:get-metadata-keys(doc("/mydoc"))
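The two functions can be combined to list every metadata key and value of a document. A minimal sketch, assuming the document at /mydoc carries string-valued metadata:

```xquery
let $doc := doc("/mydoc")
for $key in xhive:get-metadata-keys($doc)
return concat($key, "=", xhive:metadata($doc, $key))
```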
the highlighter function is called with four arguments: the token Rotterdam, position 1, phrase ID
1, and the matching para element.
In the following phrase query
the highlighter function is called twice with four arguments each: the tokens Rotterdam and harbour,
positions 1 and 2, phrase IDs 1 and 1, and the matching para element.
Examples:
xhive:glob-documents("/*")
xhive:glob-documents("doc*.xml")
xhive:glob-documents("/lib?/*/*")
Full text indexes can be used through the xhive:fts function, see Full-text queries, page 194.
• Library name indexes and library ID indexes, page 165.
If available, these are always used by the doc() XQuery function.
• ID attribute indexes, page 166
The id() XQuery function always uses document ID attribute indexes. ID attribute indexes on
libraries are never used by xqueries, except when explicitly used with the xhive:get-nodes-by-key()
extension function, page 182.
(:
: will not use indexes on any libraries as the doc calls
: are expanded to the single documents below before the
: path expression is evaluated
:)
for $doc in (doc(’/lib1’), doc(’/lib2’))
return $doc//foo[@bar = 12]
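By contrast, a sketch of the same lookup issued per library, which leaves each path expression eligible for index optimization (whether an index is actually applied still depends on the index definitions present):

```xquery
(: one library per doc() call, so library indexes remain usable :)
doc('/lib1')//foo[@bar = 12],
doc('/lib2')//foo[@bar = 12]
```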
//parent[@color = "red" and elem[@attr = "green"]] to have the query use possible
value indexes on parent/@color and elem/@attr.
• Contain a predicate or where-clause that checks the indexed value. A value or general comparison is
used against any expression that is constant for this path expression, and whose type corresponds to
the type of the value index.
• Contain a step with an indexed element name.
Examples
The following examples use a value index with the default type "STRING" on the attr attribute of
the elem element on the root library.
(: Use index without further checks :)
doc("/")//elem[@attr = $var]
(: Ditto :)
for $x in doc("/")//elem
where $x/@attr eq func(2)
return ...
With an element value index, the predicate or where-clause must check the contents of the element for
an index. The following example uses an element value index on the elem element in the root library.
(: Use the index directly :)
doc("/")//elem[. eq "red"]
(: In a flower expression :)
for $x in doc("/")//elem
where $x = "green"
(: Use the index and return the parent of the indexed node :)
doc("/")//person[elem = "black"]
(: Or equivalently :)
for $x in doc("/")//person
where $x/elem = "black"
return $x
Range queries
Range queries are queries that constrain data to a range of values, instead of to a single value. If the
optimizer finds a predicate or where-clause that uses both less or equal and greater or equal on the
same node, it can use an index to find the values in the requested range. For example:
doc(’/’)//book[@author >= "A" and @author < "B"]
If there is an index with sorted keys on the book/@author attribute, the optimizer scans the index
from A to B to find the result of this expression.
Both conditions must refer to the same node. Consider the following example:
(: Cannot use range query on author index :)
doc(’/’)//book[author >= "A" and author < "B"]
Here the optimizer cannot use the author index in a range query: the book could have one author
satisfying the first condition and a different author satisfying the second. To make both
conditions refer to the same author and allow use of the index, rewrite the query as follows:
(: Can use range query on author index :)
doc(’/’)//book[author[. >= "A" and . < "B"]]
Indexing metadata
The following example uses the doc() function and an index that has been created on the mylib
library and the author metadata field.
doc("/mylib")[xhive:metadata(., "author") = "Jane Doe"]
The function looks up Jane Doe in the author index and returns all matching documents. The
optimizer can only use indexes for expressions where the metadata key is a string literal or the literal
empty sequence, not a generic expression.
Full text indexes would use an expression like the following:
doc("/mylib")[xhive:fts(xhive:metadata(., "p"), "XQuery")]
Multiple indexes
Different parts of a query can use different indexes. The following example uses an index on attribute
x of element x, and an index on attribute y of element y. If both indexes can be used for a single path
expression, the optimizer creates a query plan with an intersection.
Examples
for $x in doc("/")//x[@x = "x"]
for $y in doc("/")//y[@y = $x/@yref]
return ...
(: or, equivalently :)
for $x in doc("/")//x
for $y in doc("/")//y
where $x/@x = "x"
and $y/@y = $x/@yref
return ...
In the following example, the optimizer first looks up "x" in the index for x/@x and stores the result in
a temporary set. The optimizer then looks up "y" in the index for y/@y and checks whether the parent of
each indexed element is present in the temporary set. If the parent is present, the y element is
added to the result set.
doc("/")//x[@x="x"]/y[@y="y"]
Examples
If there is a multi-valued path value index, the optimizer tries to use as many values from the index as
possible. In this case the order specs have to be in the same order as in the index specification.
(: with an index on foo.xml, this will use an order by index :)
for $book in doc(’foo.xml’)//book[@year]
(: have to mention child for index usage :)
order by $book/@year descending
return $book
Enable queryplan-debug if you want to verify whether an order by query is being optimized. For
example, a path value index like //book[@year<STRING> + title<STRING>] generates an
output like Found an index to support the first 2 order specs.
If the query plan does not match the expected plan, enable the optimizer debug statements to
check whether the optimizer considered the desired index:
declare option xhive:optimizer-debug 'true';
for $book in doc(’foo.xml’)//book[@year and title]
order by $book/@year, $book/title
return $book
It is also possible to optimize a subset of the order specs. For example, if an index can only support
two of three order specs, only the last order spec is evaluated dynamically, and only when the first two
values are equal.
Queries that use range or equality comparisons on index values in combination with an order by
statement also benefit from indexes. With the path value index from the last example, the following
query is faster.
for $book in doc(’/booklib/’)//book[@year = ’2002’ and title > ’V’]
order by $book/@year, $book/title
return $book
If the result value is true, the expression in parentheses is evaluated to a Boolean value and the result is
ordered ascending. If the result value is false, the result is ordered descending.
This syntax is useful for queries that must order data by many different columns, depending on
user input. Without it, ordering tabular data with 8 columns in all descending/ascending
combinations would require writing 256 different queries and encapsulating them in if statements.
declare variable $asc_order1 external;
declare variable $asc_order2 external;
(: retrieves all books with title containing terms "Unix" and "TCP"
or term "programming" :)
doc(’bib.xml’)/bib/book[title contains text "Unix" ftand "TCP" ftor "programming"]
(: retrieves all books with publisher containing the term "Daufman" or a term similar to it.
The minimum similarity value is 0.8. :)
doc(’bib.xml’)/bib/book[publisher contains text "Daufman" using
option xhive:fuzzy "similarity=0.8"]
Samples
XQueryWithThesaurus.java
MyThesaurusHandler.java
API documentation
com.xhive.query.interfaces.XhiveThesaurusHandlerIf
Anyall options
xDB supports the any, any word, all, all words, and phrase anyall options. For descriptions of anyall
options, see http://www.w3.org/TR/xpath-full-text-10/#ftwords.
Positional filters
xDB supports the ordered, window distance, and anchoring positional filters. For a description of
positional filters, see http://www.w3.org/TR/xpath-full-text-10/#ftposfilter.
(: retrieves all books with title containing both "unix" and "programming" and
the distance between matched terms must be at least 2 words :)
doc(’bib.xml’)/bib/book[title contains text "unix" ftand "programming"
distance at least 2 words]
Cardinality option
xDB supports the cardinality option. For a description of cardinality option, see
http://www.w3.org/TR/2010/CR-xpath-full-text-10-20100128/#fttimes
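A sketch of the cardinality option, following the W3C FTTimes syntax (the document and element names match the earlier examples and are assumptions):

```xquery
(: retrieves all books whose title contains the term "unix" at least twice :)
doc('bib.xml')/bib/book[title contains text "unix"
occurs at least 2 times]
```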
Score variables
xDB supports a scoring mechanism using score variables in for and let clauses of FLWOR
expressions. Score variables are xs:double types in the [0, 1] range. A higher score
value implies a higher degree of significance. For a description of XQFT score variables,
see http://www.w3.org/TR/xpath-full-text-10/#section-score-variables.
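A sketch of a score variable in a for clause, following the W3C syntax; the document path and search term are invented for the example:

```xquery
for $book score $s in doc('bib.xml')/bib/book[. contains text "usability"]
order by $s descending
return <hit score="{$s}">{$book/title/text()}</hit>
```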
Score calculation
Scoring is available for both indexed and non-indexed queries. When using indexes, the quality of
the score estimation is much higher. Depending on the options that were used, xDB has access to
frequency and occurrence counts for the node set that was searched.
For optimal score estimation, full-text indexes are created with the FTI_SUPPORT_PHRASES and
FTI_SUPPORT_SCORING full-text index options, page 163.
The scoring implementation of xDB is partially based on Lucene. xDB also uses a Lucene-based
similarity class that lets the user influence the results by changing the similarity measures through the
XQuery option xhive:fts-similarity-class.
For more information about the concepts used to estimate a query score, see the Lucene Similarity API.
The most significant difference between the xDB scoring implementation and the Lucene
implementation is that xDB does not evaluate all results: it estimates all scores before returning
the first result and the first score. xDB estimates the expected number of results and uses that estimate
to normalize and weight different query components.
Note: In the current xDB implementation it is not possible to increase the weight of a node manually,
and thus to increase the relevance of that node with respect to scoring. xDB cannot guarantee the
scoring relevance order stability.
For weighted scoring, use XhiveWeightedFreshnessBoostIf to set the scoring callback for the query.
The code fragment below shows how to define and set the callback.
...
XhiveXQueryQueryIf query = library.createXQuery(QUERY);
XhiveWeightedFreshnessBoostIf weightedScoresCallback =
    new XhiveWeightedFreshnessBoostIf() {
      @Override
      public WeightedFreshnessParameters
          getBoostParameters(final XhiveLibraryChildIf libraryChild) {
        return new WeightedFreshnessParameters() {
          @Override
          public double getFreshnessWeight() {
            return 40;
          }
          @Override
          public double getLibraryChildFreshness() {
            return someFreshnessFunction(libraryChild);
          }
        };
      }
    };
query.setLibraryWeightedFreshnessBoost(weightedScoresCallback);
...
Use XhiveScoreBoostFactorIf to set the factor/shift scoring callback for the query. The code to define
and set the shift/factor callback is analogous to the weighted model callback.
Samples
BoostLibraryScore.java
API documentation
com.xhive.query.interfaces.XhiveWeightedFreshnessBoostIf
com.xhive.core.interfaces.XhiveScoreBoostFactorIf
Boolean queries
A Boolean query represents a composite query that can contain subqueries of arbitrary nesting level,
with composition rules such as and, or, not.
For each subquery of a boolean query, two binary qualifiers control how its superquery is matched:
• prohibited - if set, the superquery is a match only when the subquery does not match. A subquery
can be marked as prohibited using the modifier -, !, or NOT.
• required - if set, the superquery is a match only when the subquery does match. This condition
is necessary but not sufficient for the superquery to match. Queries can be marked as required
using modifier +.
The default implicit conjunction is OR. For example, by default, the query "apples oranges bananas" is
equal to "apples OR oranges OR bananas". The implicit conjunction can be changed locally using
the xhive:fts-implicit-conjunction option. For example, the query
//element[xhive:fts(., "apples AND oranges")]
generates the same results as the query
(# xhive:fts-implicit-conjunction ’AND’ #) {
//element[xhive:fts(., "apples oranges")]
}
There is some overlap with plain XQuery operators. For example, the following query also generates
the same results:
//element[xhive:fts(., "apples") and xhive:fts(., "oranges")]
Prefix searches
A prefix search searches for all terms starting with a certain prefix.
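A minimal sketch of a prefix search through the xhive:fts function; the element name is an assumption:

```xquery
(: matches paragraphs containing terms such as
   "program", "programs", or "programming" :)
//para[xhive:fts(., "program*")]
```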
Phrase searches
A phrase query represents a query that is matched against a consecutive sequence of terms in the field.
A phrase query can have an optional boost factor and an optional slop parameter. The slop parameter
can be used to relax the phrase matching by accepting out of order term sequences. For example, the
phrase query ’winding road’ matches ’winding road’ but not ’road winding’, unless used with more
relaxed slop factors.
xDB allows using the * and ? characters as wildcards in searches. The * wildcard is a substitute for an
arbitrary number of characters, the ? wildcard substitutes a single character.
Only indexes built with the option FTI_LEADING_WILDCARD_SEARCH are suitable to search
for terms with a wildcard as the first character. If this option is not set, the search can become
extremely slow.
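Sketches of the wildcard forms described above (element names are assumptions; the leading-wildcard query assumes an index built with FTI_LEADING_WILDCARD_SEARCH):

```xquery
//para[xhive:fts(., "harb*")]    (: trailing wildcard :)
//para[xhive:fts(., "h?rbour")]  (: single-character wildcard :)
//para[xhive:fts(., "*bour")]    (: leading wildcard; needs the index option :)
```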
The Analyzer
The process of breaking content into terms (words) is called tokenization. In xDB, tokenization is done
by an analyzer. An analyzer breaks content into tokens, and can also change tokens to improve
searching. For example, an analyzer can change terms to lowercase, or change a term from
plural to singular.
Both the searched text and the query are passed through the same analyzer. If an index is available, the
same analyzer used for building the index is used for analyzing the query. If no index is available, the
value of the fts-analyzer-class option determines which analyzer is used. To use a different analyzer in
the query, the analyzer class name must be included in the options argument of the xhive:fts function.
The default analyzer for the fts function:
• Analyzes the input string as a list of terms, not as a list of characters. (By contrast, the contains
function in XQuery treats the text as a single monolithic string.)
• Creates terms containing only letters and/or digits. Everything else triggers the start of a new term.
• Converts all characters in a term to lower case.
• Filters out the English stopwords "a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
"into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "t", "that", "the", "their", "then", "there",
"these", "they", "this", "to", "was", "will", "with".
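Given these defaults, queries that differ only in letter case and stopwords match the same text. A sketch:

```xquery
(: both queries search for the terms "quick" and "fox";
   "The" is dropped as a stopword and terms are lowercased :)
//para[xhive:fts(., "The Quick FOX")]
//para[xhive:fts(., "quick fox")]
```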
• If an expression matches at most once, a predicate [1] allows the evaluator to stop searching after
the first occurrence.
(: Not tail recursive because the result is used in the ’+’ operation :)
declare function local:sum($x as xs:integer) as xs:integer
{
if ($x eq 0) then 0
else $x + local:sum($x - 1)
};
XQuery Profiler
To find out why a query runs slowly, how long parts of an XQuery take to execute, or how
much data is read, you can create a profile of an XQuery execution.
The Admin Client provides a simple graphical user interface for profiling XQueries, page 243.
Profiling produces an XML document containing the original query text and a tree of XML nodes that
represent the functions, modules, variables and expressions of the XQuery.
When an XQuery is executed with the xhive:profile option enabled (either as an XQuery option or by
your application), the XQuery contains profiling information after at least partial execution.
XQuery implementation
The xDB XQuery implementation is based on the XQuery 3.0 W3C Candidate Recommendation (08
January 2013) specification. xDB implements the Full Axis Feature and provides the ancestor,
ancestor-or-self, following, following-sibling, preceding, and preceding-sibling axes.
Unsupported features:
Unsupported functions:
• fn:format-date
• fn:format-time
• fn:format-dateTime
• fn:format-integer
• Serialization option ’html’ in combination with fn:serialize
• Context dependent functions fn:position(), fn:size() and fn:last() outside of a predicate
As in XPath 1.0, a path step does not set the context size and position, only predicates do so. A
work-around is to use a for iterator with a positional variable and the count() function.
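A sketch of that work-around, using a positional variable in place of fn:position() and count() in place of fn:last(); the document path and element name are invented:

```xquery
for $item at $pos in doc('/data.xml')//item
return concat('item ', $pos, ' of ', count(doc('/data.xml')//item))
```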
The following xhive functions cannot be used as named function references:
• xhive:fts
• xhive:java
• xhive:metadata
• All functions of the proprietary XQuery Update Syntax
xDB partially implements the W3C XQuery Full-Text Facility Standard available at
http://www.w3.org/TR/xpath-full-text-10/. For more information, refer to the section on XQuery full
text search, page 194.
Collation support
Several XQuery functions take a collation argument. The possible values of this argument are
implementation defined, according to the XQuery specification. In xDB, a collation consists of a locale
and an optional strength, separated by a slash. xDB's collation support relies on Java's built-in support
for locales and uses collators from IBM's ICU package (included in the xDB distribution).
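A hedged sketch of passing such a collation, assuming the locale en_US with the optional strength PRIMARY; the exact collation string accepted by xDB should be verified against the API documentation:

```xquery
(: case-insensitive de-duplication via a locale/strength collation :)
distinct-values(doc('/names.xml')//name, 'en_US/PRIMARY')
```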
Samples
XQueryCompiler.java
API documentation
com.xhive.query.interfaces.XhiveXQueryPolicyIf
com.xhive.query.interfaces.DefaultXhiveXQueryPolicy
XQuery modules
xDB implements the Module Import Feature for creating library functions. Modules can be imported
using the following syntax:
import module namespace prefix = ’http://some/namespace/uri’ at ’location’;
(: ... use functions from the module ... :)
XQuery modules have the following characteristics:
• The XQuery module location is implementation defined. In xDB, the location part
can be any valid Java URI, for example file://... or http://..., as well as a URI within the database.
A xhive:// URI, or a relative or absolute path without a protocol identifier, follows the same syntax
as the doc() function. Import paths are evaluated relative to the XhiveNodeIf node or the library
in which the query is executed or created.
• Importing a module into a current query makes available all functions and variables that have
been declared within the module namespace.
• Modules can import other modules.
If a module imports another module, functions and variables in the imported module are only
available in the importing module, and are not propagated.
• All variables and functions of a module must have the module namespace. To hide variables and
functions from other modules, use XQuery 3.0 %private annotations.
• XQuery modules can be stored as BLOB nodes or XML documents. BLOB nodes must contain
the module in flat UTF-8 text, XML documents can have any encoding as long as it is correctly
specified during the import. In XML documents, the string value of the document root element is
used as the query. The query is the concatenation of all text nodes below that root node.
• XQuery modules that are stored outside of xDB are always expected to use UTF-8 encoding.
• xDB supports multiple locations per module URI. However, xDB does not support XQuery 3.0
features like forward references to global variables and allowing modules to reference each other
without restriction. As a result, the order of the locations is of importance. For instance, the
module content of the second location can reference items declared in the module content of the
first location, but the module content of the first location cannot reference items declared in the
module content of the second location.
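A sketch of an import with two locations for one module URI; per the ordering rule above, code at the second location may reference items declared at the first, but not vice versa. The paths and the function name are invented:

```xquery
import module namespace m = 'http://some/namespace/uri'
  at '/modules/base.xq', '/modules/extensions.xq';
m:some-function()
```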
Examples
The following example ignores the name of the <queryModule> root element.
<queryModule><![CDATA[ module namespace mns = ’http://some/namespace/uri’;
declare variable $mns:pi := 3.14159265;
declare function mns:circle-area($r as xs:double) as
xs:double { $r * $r * $mns:pi }; ]]>
</queryModule>
Use a CDATA block for the contents of the module. Otherwise, embedded direct element constructors
are interpreted as XML syntax, as the following example shows:
<queryModule>
module namespace foo = ’bar’;
declare variable $foo:element := <element>"Hello World!"</element>;
</queryModule>
The $foo:element variable contains the string "Hello World!" rather than an element, because the
<element> markup was not escaped and was parsed as part of the storing document.
• Schemas can be imported using the import schema construct. The processor searches the catalog
of the initial context item for a matching schema. The processor first uses any location hints, and
then falls back to the namespace URI. DTDs are not supported.
A schema can be imported into queries and used to validate documents or XML fragments. Types
declared in a schema can be used in XQuery type annotations.
These functions directly move DOM nodes to a new target. By default, they insert $sources as the last
children of $target. If $anchor is specified and not empty, the $sources are inserted before $anchor.
Moving has a potential performance advantage over removing and inserting nodes: if the $sources and
$target values belong to the same document, nodes need not be copied or imported.
Nodes covered by indexes with UNIQUE_KEYS flags can be moved. If any of the $node child
nodes use a unique index, moving elements with a delete node $node and an insert
node $node into $target statement generates a DUPLICATE_KEY exception. Using
xhive:move($target, $node) instead works.
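A sketch of the move-based alternative; the document path, element names, and attribute are invented for the example:

```xquery
(: relocate obsolete chapters under the archive element without
   copying, keeping any unique-key indexes consistent :)
for $ch in doc('/doc.xml')//chapter[@obsolete = 'true']
return xhive:move(doc('/doc.xml')//archive, $ch)
```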
Example
for $book in doc(’bib.xml’)/bib/book
where $book/@year < 1990
return
xhive:remove($book)
for $book in doc(’bib.xml’)/bib/book,
$review in doc(’http://example.com/reviews.xml’)//review
where $review/@isbn = $book/@isbn
return
xhive:insert-into($book, $review)
xhive:insert-doc(’/lib/newfile.xml’,
document {
<root>
...
</root>
}
)
Samples
TypedIndex.java
Parallel queries
A particular subset of queries can be evaluated in parallel. For parallel evaluation, an executor instance
must be provided using code like the following:
XhiveXQueryQueryIf query = ... ;
Executor executor = Executors.newCachedThreadPool();
query.setParallelExecutor(new AbstractXhiveXQueryExecutor() {
@Override
public Executor getExecutor() {
return executor;
}
}, XhiveXQueryQueryIf.XHIVE_PATH_EXPR);
XhiveXQueryResultIf result = query.execute();
while (result.hasNext()) {
result.next();
}
result.close();
Parallel evaluation can improve performance: it can reduce the response time of queries, but there
is some overhead involved that can reduce the total throughput.
If the xhive:queryplan-debug option has been turned on for the query, the output contains a message if
the query is being parallelized.
Note: You have to call the close() method on the query result when the result items are no longer
needed. This method terminates all background threads executing jobs in parallel mode.
The parallel executor must be a custom class implementing XhiveXQueryExecutorIf. The class
AbstractXhiveXQueryExecutor provides an abstract implementation of the XhiveXQueryExecutorIf
interface. Extend this class, rather than implementing the interface XhiveXQueryExecutorIf, so your
code does not break if methods are added to the interface in future.
Interface XhiveXQueryParallelJobIf can be used to access sub-query information from within
ThreadPoolExecutor hook functions beforeExecute and afterExecute.
If an FLWOR or path expression is evaluated on a library and no relevant indexes can be found, the
query evaluation descends to the child libraries. The expression on each child library is evaluated
separately. This step can be parallelized. The database creates jobs for the expression evaluation on
each child library and submits them to the executor supplied by the user. Parallel query evaluation
is most useful in cases where the searched child libraries are located on different disks, so the I/O
load can be spread.
Generally, the expressions that can be parallelized are those that can use indexes, regardless of whether
indexes are present or used. For examples of the kind of expressions that can be optimized, see Value
and element name indexes, page 189.
Limitations
Parallel execution is more complex than non-parallel execution. If parallel execution does not improve
performance, run the queries non-parallel. The following XQuery constructs cannot
be handled by parallel XQuery parts:
• Variable declarations of function items.
• Variable declarations of other modules.
Samples
ParallelPathExpressionQuery.java
ParallelForEachQuery.java
Samples
XQueryResolver.java
XQueryCompiler.java
API documentation
com.xhive.query.interfaces.AbstractXQueryResolver
com.xhive.query.interfaces.XQueryResolverIf
com.xhive.query.interfaces.XQueryCompilerIf
com.xhive.query.interfaces.XhivePreparedQueryIf
Preparing XQueries
The XQuery Compiler in the XhiveXQueryCompilerIf interface allows creating prepared queries.
XQueries can be parsed once and used many times. The XQuery compiler also sets common options
for all XQueries, such as available namespaces, options, commonly used functions, or modules. Using
the same namespace prefixes or options for several queries reduces the amount of XQuery code.
The XhivePreparedQueryIf interface represents prepared XQueries. Prepared queries are thread safe
and can be used in parallel, either by first creating an XhiveXQueryQueryIf object or by executing
them directly.
For more information, see the XQueryCompiler.java sample code. The code prepares a query using
an XQuery compiler, with an additional namespace prefix set, and runs the same XQuery using
multiple threads.
Samples
XQueryCompiler.java
Examples
/* Java code */
public int foo(String bar) { ... }
(: XQuery code :)
import module namespace eg = ’java:mypackage.Eg’;
let $x := eg:new()
return eg:foo($x, ’param1’)
Instances can be created using a constructor with the eg:new(...) syntax or injected from the
outside as an external parameter.
import module namespace eg = ’java:mypackage.Eg’;
declare variable $x external;
eg:foo($x, ’param1’);
Type checking
XQuery parameters are checked for the correct type and promoted to Java objects according to the table.
public static String foo(String bar, int baz, Iterator<XhiveNodeIf> nodes) { ... }
(: legal call :)
eg:foo("bar", 5, <element/>)
(: wrong type :)
eg:foo("bar", "baz", ())
The return value of the function is transformed to XQuery values exactly as in the xhive:java() method.
It is possible to return Iterators, Collections, Arrays, and Sets.
Limitations
Two Java methods can have the same name with different parameter types. In XQuery, functions
with the same name are only allowed if they have a different number of parameters. In xDB, the query
parser analyzes the input types from the query and tries to select the correct Java method accordingly.
The parser calculates a score for each method based on how well the XQuery parameter types match
the Java parameters. An error is reported if more than one method has the best score. To direct the
parser to a particular method, users can add treat as or cast as statements to the call, for example:
eg:foo(/some/path treat as element(*, xs:integer))
xDB catalogs
xDB libraries store XML documents, sublibraries, and BLOBs. XML documents can be associated
with a document type definition (DTD) or XML Schema to validate the document. DTDs and XML
schemas are also referred to as models.
A catalog is linked to a library. By default, only the root library has a catalog where all models are
stored. However, it is also possible to place a catalog in a sublibrary and split models over multiple
catalogs. Catalogs in sublibraries are called local catalogs. Local catalogs override information
in the root catalog and during queries the local catalog is searched first. If a root catalog and a
local catalog contain a model with the same identifier, the model in the local catalog is used for all
documents and descendants.
Each model in a catalog has a unique identifier that depends on the schema type of the model:
• DTD models are identified by their public ID. If the DTD does not contain a public ID, xDB
automatically generates an ID.
• XML schema models are identified by their filename.
Linking DTDs
Generally, a document that contains a <!DOCTYPE> declaration is linked to a DTD. The
<!DOCTYPE> declaration specifies a public ID and a system ID, like the following:
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "svg10.dtd">
Where -//W3C//DTD SVG 1.0//EN is the public ID and svg10.dtd is the system ID.
When retrieving the active ASModel of a document, the system looks up the ASModel ID in all
catalogs up to the root library. The active ASModel is set using an abstract schema. If a document does
not contain a <!DOCTYPE> declaration, the system automatically adds a <!DOCTYPE> declaration
with the ASModel ID, linking the document to the ASModel. The model can be replaced by adding a
new model to the catalog and changing the ID to that of the new model.
<personnel xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation=’personal.xsd’>
• Using the schema-location parameter in the document configuration to normalize the XML
document.
A document with XML schema can have more than one active model attached.
After document validation or validated parsing, the string concatenation of the ASModel IDs is stored
in the xhive-schema-ids parameter of the document. The IDs are used to identify the ASModels
for validation or access to the PSVI interfaces.
Validated parsing
If a document is parsed with validation and the models are not found, the models are automatically
stored in the catalog. If a document is parsed without validation, the models are not stored in the
catalog. xDB includes DOM configuration options that allow validation without storing the model, or
storing the model and internal subsets without validating the document.
ASModel resolutions are handled separately for DTD and XML schema models.
DTDs
When a document contains a <!DOCTYPE> declaration and the document is parsed with validation,
xDB attempts to locate the DTD, as follows:
• If the <!DOCTYPE> declaration specifies a public ID, the public ID is used to identify the ASModel.
• If the <!DOCTYPE> declaration does not specify a public ID or a DTD matching the public ID
does not exist, the system uses the default DTD.
• If no default DTD is specified, the system ID specified in the <!DOCTYPE> declaration is used
to locate a DTD in the file system.
The DTD is stored in the closest local catalog, either under the specified public ID or a public ID
generated by xDB.
Note: If the <!DOCTYPE> declaration does not specify a public ID, xDB stores a DTD for each
document that is parsed with validation, even if the documents point to a DTD with the same system ID.
Loading the DTD in advance and using the resulting ASModel as the default prevents storing a DTD
for each document.
XML schema
When parsing with validation, XML schema models are identified, as follows:
• If the document is parsed using the LSParser interface and a schema-location parameter is
specified in the LSParser configurations settings, xDB validates against the defined ASModel. The
schema-location parameter is added to the document.
• If the namespace declaration of the document contains a noNamespaceSchemaLocation or
schemaLocation XML schema instance attribute, xDB uses the corresponding ASModel for
validation.
Note: If two models have the same target namespace, using the schema-location configuration
parameter to define a model overrides using the schema location attributes.
Catalog methods
A catalog is a special library for storing schema documents, which are represented by ASModels from
the W3C abstract schema specification. By default, only the root-library has a catalog, but you can set
a local catalog by calling addLocalCatalog() on a library. When that is done, catalog operations on
that library or its descendants consult this local catalog first.
A document with XML schema can have more than one active model attached.
Linked models can be changed by setting the schema-location parameter in the document configuration
settings or by using the setActiveASModel and addAS abstract schema functions. These functions also
modify the schema-location parameter value.
API documentation
com.xhive.dom.interfaces.XhiveLibraryIf
com.xhive.dom.interfaces.XhiveCatalogIf
org.w3c.dom.as.ASModel
DTDs
The DTD validation process uses the ASModel defined by the public ID of the document. Validation
fails if a DTD with that public ID does not exist in the catalog. The ASModel, page 126 interfaces contain
the methods for validating documents that use a DTD.
XML schema
The configuration settings of the document determine the XML schema validation process, as
described in Normalizing XML documents, page 102. The validation process attempts to locate the
ASModels associated with the document, then validates the document against the model. The location
of the model is specified either in the schema-location parameter in the configuration settings, or in the
noNamespaceSchemaLocation or schemaLocation attribute of the document.
If two models have the same target namespace, using the schema-location configuration parameter
to define a model overrides using the schema location attributes.
The XML Schema API can be used to traverse XML schema components like type definitions,
element declarations, and schema constraints. PSVI information, such as validity, validation context,
normalized value, type definition, and member type definition can be accessed for individual nodes.
Nodes storing PSVI state information require more disk space, so PSVI storage is controlled by a
configuration option. If this option is not set, queries do not support data types and the XML
Schema API is only partially accessible; for example, node validity information is not available.
When using the XML Schema API to access schema information, the xhive-schema-ids attribute
value identifies the corresponding ASModels. This value specifies the schema IDs corresponding to
information stored by the configuration schema-location parameter and schema-location attributes.
If possible, the xhive-schema-ids value excludes the schema locations of the attributes that the
schema-location value overrules.
Related topics
XML Schema data types
XML Schema model information
Xerces XML Schema API
Accessing PSVI information
DOM configuration
Samples
PSVI.java
API documentation
org.w3c.dom.TypeInfo
com.xhive.dom.interfaces.XhiveNodeIf
org.apache.xerces.xs
Admin Client
The xDB Admin Client (also known as Administration Client or administration tool) provides
developers, administrators and superusers access to xDB functions through a Swing-based graphical
user interface. The xDB distribution includes the Admin Client Java source code, allowing developers
to review its use of the xDB API to perform its tasks.
Admin Client features include:
• a menu-based GUI with menu bar, context-sensitive right-click menus and toolbars
• a data browser that displays database contents using an explorer-like tree view with tabbed content
panels, popup dialogs and messaging
• server, federation, database and segment management
• consistency checking
• backup and restore
• data import and export
• data serialization and deserialization
• user and group management
• library and document management
• index management
• XQuery and XPointer execution and debugging
Figure 5 Admin Client tree view with the context menu of the root library
Most functions are accessible by right-clicking an item in the tree view or in the Contents tab, and
selecting an option from the popup menu. For example, you can selectively Refresh a part of the
tree view, query, import or export the contents of a specific library, and add or delete users
and groups.
Additional functions are available in the main menu.
Note: Admin Client preferences, including the last query executed and the last database connection,
are automatically stored using java.util.prefs.
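The java.util.prefs mechanism mentioned above stores values per user in a platform-specific backing store, so they survive restarts of the application. A minimal sketch of how such preferences are written and read (the node path and key names below are examples, not the ones the Admin Client actually uses):

```java
import java.util.prefs.Preferences;

public class PrefsDemo {
    public static void main(String[] args) throws Exception {
        // Preferences are keyed per user and per node path
        Preferences prefs = Preferences.userRoot().node("com/example/adminclient");

        // Store values; these persist in the platform backing store
        prefs.put("lastQuery", "/library/doc.xml");
        prefs.put("lastConnection", "xhive://localhost:1235");

        // Read a value back, with a default if the key is absent
        System.out.println(prefs.get("lastQuery", "<none>"));

        // Remove the demo node so the example leaves no trace
        prefs.removeNode();
    }
}
```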
3. Enter optional parameters, if required. The Path option allows you to specify the location of the
default file of the default segment, either as an absolute path, or as a path relative to the default
database location of the federation. The Max size option specifies the maximum size (in bytes) that
the database file is allowed to grow to. A value of 0 means the size is unlimited.
4. If you want to use a Custom configuration, enter the path to your configuration file, page 230.
5. Click OK.
To connect to the new database, select Database > Connect to database from the Admin Client menu
bar. In the Connect to database dialog, choose the Database name you want to connect to, and
enter a valid username and password.
<xhive-clustering/>
This document element can be used in a database configuration file.
Child elements
The <xhive-clustering/> element can have the following child elements:
• <segment/>, page 231
<segment/>
The <segment/> element creates a database segment.
Attributes
max-size (default: 0) — The maximum number of bytes for the default file. A value of 0 means the
file can grow without limit. Optional attribute.
temporary (default: false) — Specifies whether this segment is a temporary data segment. Optional
attribute.
Child elements
The <segment/> element can have the following child element:
• <file/>, page 232
<file/>
The <file/> element creates a database file.
Attributes
path (no default) — The path to the location of the default database file. Optional attribute.
If the path is not supplied, the default database file is stored in the same directory as the
default file of the federated database.
max-size (default: 0) — The maximum number of bytes for the default file. A value of 0 means the
file can grow without limit. Optional attribute.
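Combining the elements described above, a minimal database configuration file could look like the following sketch. The path value is an example, and a real configuration file may require attributes beyond those documented here:

```xml
<xhive-clustering>
  <!-- One data segment; max-size 0 means the file can grow without limit -->
  <segment max-size="0" temporary="false">
    <file path="/var/databases/mydb/data1.xhive" max-size="0"/>
  </segment>
</xhive-clustering>
```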
2. Select the Database name of the database on which you want to reset the administrator password.
3. Enter the Superuser password of the current federation.
4. Enter the new administrator password in the field Administrator password.
5. Retype the new administrator password in the field Retype password.
6. Click OK.
Importing data
The Import option in the right-click menu of library nodes in the Admin Client opens a dialog for
importing data from documents and/or directories into the selected library.
By default, xDB recreates the directory structure of an imported library.
To import data into a library:
1. Right-click on a library node and select Import.
The Import into library dialog appears, with the Select sources tab selected.
2. Use the Add, Delete and Clear buttons of the Import into library dialog to assemble a list of files
and/or directories to import.
The Add button opens a dialog for selecting items to add to Files and directories to import.
3. Select Library settings as required:
4. Click the Filters tab and define import filters for all file types to import.
This tab contains user-definable filters that determine the storage type for each import file type.
Filter definitions are stored in the Admin Client preferences. By default, the importer includes
filters for files with .xml and .xsl extensions. Files with file extensions that have no filter defined
are ignored and not imported into the database.
If you want to import graphics from files of file type .gif, you should add a file filter for that type,
with storage type Blob.
5. Click the Parser configuration tab and specify the parser options.
You can specify whether to use validation, what information items of the original file to preserve in
the parsed Document, and other properties. The parser configuration options are not stored in the
Admin Client preferences.
Exporting data
Data can be exported using the Export option in the right-click menu of library, BLOB, and document
nodes. You can select export location and DOM configuration options.
To export data:
1. Right-click on the library, BLOB, or document node that you want to export.
2. Select option Export from the right-click menu.
The Export dialog appears.
Backing up a federation
You can back up a federation using the Backup option of the Federation menu. The Backup dialog
has options to create either a normal, an incremental or a standalone backup. If the federation contains
detachable libraries, they can be excluded from the backup.
Note: It is good practice to use consistent, self-explanatory file names and file extensions for backups,
and to keep related backup files together in a single, dedicated and secure location. For example,
keeping a backup and its subsequent increments in one place can make the restore process easier and
more reliable. Creating backups and their increments only as new files in an empty location helps to
prevent accidental overwriting of any preceding backups or increments. Preferably, a backup location
contains only valid backup files. The Admin Client, page 227 has a Backup management dialog
that shows files name, backup type, creation date/time and other backup metadata, page 266. This
dialog can display a single backup file, or all backup files in an entire directory, optionally with its
subdirectories, provided the selected directories contain only backup files.
To back up a federation:
1. Connect the Admin Client to the federation that you want to back up.
2. Select option Backup from the Federation menu. If required, enter the superuser password.
The Backup dialog appears.
3. Specify an output directory and filename.
Note: Choose the filename with care. If a file with the same name already exists, it will be
overwritten without warning.
4. Select a backup option: Default, Incremental or Standalone. With the Standalone option, you
can choose to keep log files.
5. Optionally, if the federation has detachable read-only libraries that you want to exclude from
the backup, expand the tree view of the Select libraries to exclude control, and mark them for
exclusion.
6. Click OK.
1. Select option Restore from the Federation menu of the Admin Client. If required, enter the
superuser password.
The Restore dialog appears:
Serializing data
You can serialize a library, document or BLOB to a file using the Serialize option in the right-click
menu of the object. Subsequently, you can deserialize the object from the file to any library.
You can use serialization and deserialization for backup/restore of content, and for copying content
from one library or database to another.
Note: Serialization files are not human-readable, and should not be edited.
To serialize data:
1. Right-click on the library, BLOB, or document node that you want to serialize.
a. If the object is a library, select menu option Library management from the right-click menu.
2. Select option Serialize.
The Serialize dialog appears.
3. Select a target directory.
4. Enter a file name.
Use a meaningful file name, for example a name that describes the file content, or a name that
relates to its purpose.
Note: If you use the name of an existing file, that file will be overwritten without warning.
5. Click Serialize.
Deserializing data
You can use the Deserialize option in the right-click Library management option of a library to add
content from a file created using the Serialize option.
Note: Serialization files are not human-readable, and should not be edited.
To deserialize data:
1. Right-click on the library where you want to deserialize data, and select submenu option Library
management.
2. Select option Deserialize.
The Deserialize dialog appears.
3. Select the source directory and file.
Note: The file that you select must have been created using the Serialize option.
4. Click Deserialize.
Note: The file that you select must have been created using the Serialize option of the Library
management menu of the root-library.
Note: The file that you select must have been created using the Serialize users and groups option.
Serialization and deserialization can be done using right-click menu options of Admin Client.
Editing documents
To edit documents in the xDB Admin Client:
1. Right-click on the document you want to edit.
2. The right-click menu offers a different editing option for versioned documents than for unversioned
documents, because versioned documents are read-only.
• If you want to browse an unversioned document as a tree, select Browse document.
In the document browser, you can right-click a node to open a menu with options to edit or
delete the node.
• Select Edit as text to open the text editor on an unversioned document or document node:
Adding indexes
When you select a library or a document in the left panel of the administration client, you can use the
Indexes tab of the right panel to view and manage its list of indexes. For information on indexing,
refer to indexes, page 150.
Running queries
XQuery is one of several querying mechanisms supported by xDB. XPath is a subset of XQuery - what
you can do in XPath, you can do in XQuery, and more.
For libraries and documents, the Execute XQuery option in the right-click menu opens a new
XQuery panel at the bottom of the window. Queries are executed in the context of the selected
library or document. The Admin Client uses a read-only transaction (which does not take locks) for
normal XQuery execution, but it will use a read-write transaction if it detects XQuery update syntax,
page 211 in your query.
The XQuery panel contains several tabs, as described in table (XQuery tab and options, page 242).
When executing an XPath/XUpdate/XQuery, if the results are left idle, the underlying session times
out. The user is then given a choice whether or not to re-execute the action; if not, the results
tree is disabled. The default timeout is 5 minutes and can be changed through the Options
dialog of the Settings menu.
Profiling XQueries
XQuery profiling can be useful in finding out why a query runs slowly. The XQuery panel of the Admin
Client provides an option for profiling XQueries. For information about the XQuery panel, see
Running queries, page 241.
Click the Show Query plan (clock) button of the XQuery panel to open a new window showing
a tree view of the static query plan.
Click the Profile button at the top of the Query plan window to run the XQuery with profiling enabled.
After profiling a query, the query plan dialog shows the values of performance-relevant attributes,
with calculated percentages of the totals in additional columns.
You can copy the resulting information to clipboard as XML text by clicking the Copy selection button.
If you first select a node or a range of lines, only the selected part of the display is copied to clipboard.
Format
Profiling produces an XML document that contains the original query text and a tree of XML nodes that
represent the functions, modules, variables and expressions of the XQuery.
Where applicable, these nodes have a location attribute that points to the filename, line, and column
number of the source file where this expression was parsed from. Many expressions also have
additional attributes, like the variable name for a for clause or the function name for a function. In
the document, outer expressions that are XML parent nodes consume the results from their child
nodes; input expressions (for example, the input to a for clause) come before output expressions
(like the return clause in a for clause).
Profiled expression nodes will have the attributes accumulatedTime, calls, values, and
pagesRead. These represent the total time spent evaluating this expression, the number of times this
expression was evaluated, the number of values this expression produced, and the number of database
pages that had to be accessed for producing the result.
In addition to timings, the <path/> nodes representing path expressions will have an <indexplans/>
child node that contains the different index plans chosen for different libraries, and within those plans a
description of the lookup steps used to evaluate the path expressions.
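A fragment of such a profile might look like the following sketch. Only the attribute names accumulatedTime, calls, values, pagesRead and location, and the <path/> and <indexplans/> elements, come from the description above; all other element names and all values are illustrative:

```xml
<profile>
  <!-- a for clause: input (the path expression) precedes output -->
  <for variable="$doc" location="query.xq:3:5"
       accumulatedTime="120" calls="1" values="500" pagesRead="42">
    <path accumulatedTime="90" calls="1" values="500" pagesRead="40">
      <indexplans>
        <!-- one index plan per library, with its lookup steps -->
      </indexplans>
    </path>
  </for>
</profile>
```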
Note: The profiling XML document format is a work in progress and subject to change without notice.
Note: In a client-server deployment, the profile does not account for pages read on the server but
not transferred to the client. In practice, this means pagesRead will not include pages read in
concurrent indexes and multipath indexes.
Note: If an expression uses a variable, the time spent and the pages read to evaluate that variable
are accounted twice: once for the expression that binds the variable, and once for the first
expression that uses the variable. This is due to the lazy evaluation nature of xDB’s XQuery
implementation. However, the time spent on the parent expression relative to the variable-binding
expression will typically be correct.
Web client
The web client provides administrators and superusers access to some key database functions, including
management of users and indexes and execution of queries. It is based on the xDB REST API.
By default, its server starts automatically as part of the database server and runs on port 1280. During
xDB installation, this can be disabled or set to a different port. These settings are stored in the
xdb.properties configuration file. When running the database server from the command line, you
can choose to disable the web server.
To unlock administrative actions on the federation, the username superuser must be entered, together
with the password provided during setup.
To connect to a database, the username Administrator must be entered, together with the password that
was entered when creating the database.
The command-line client can execute single commands and can also run as an interactive console,
page 247. Commands are always executed in auto-commit mode. All changes are made persistent
after each command, before control is returned to the command line.
To use the command-line client to run a single xdb command, enter that command on your operating
system’s command line in the bin subdirectory of the xDB installation.
Note: The Windows installer automatically adds the xdb command to the PATH variable.
Single commands must be entered using the following syntax:
xdb <command> [arguments]
Enter xdb help to display the available commands with their descriptions.
Enter xdb help <command> to display the available options for a command.
Parameters containing whitespace must be enclosed in single quotes (’) or double quotes (") or escaped
by preceding the whitespace with a backslash (\ ). Quotes within parameters can be escaped using the
backslash character (\’ or \"), the backslash itself is escaped by doubling (\\).
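The quoting rules above can be summarized in code. The following helper is hypothetical (not part of xDB) and simply illustrates the stated rules for double-quoted parameters:

```java
public class XdbQuote {
    /**
     * Quote a parameter for the xdb command line: escape backslashes by
     * doubling, escape embedded double quotes with a backslash, and wrap
     * the result in double quotes when it contains whitespace.
     */
    static String quote(String param) {
        // Escape backslashes first, then embedded double quotes
        String escaped = param.replace("\\", "\\\\").replace("\"", "\\\"");
        // Only parameters containing whitespace need surrounding quotes
        return param.matches(".*\\s.*") ? "\"" + escaped + "\"" : escaped;
    }

    public static void main(String[] args) {
        System.out.println(quote("plain"));          // no quoting needed
        System.out.println(quote("two words"));      // wrapped in quotes
        System.out.println(quote("say \"hi\" now")); // quotes escaped
    }
}
```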
The xdb command accepts GNU-style options, either in a long version such as --federation
or as an abbreviated version such as -f for some options. If an option takes a value, that
value must directly follow the option, separated by whitespace, for example --federation
xhive://localhost:1235. Abbreviated options can be clustered if only the last option takes a
value, for example -vyf xhive://localhost:1235.
Most commands have a number of options in common; these are called global options, page 251.
When passed to the xdb command, these options also serve as default values for any subsequent
commands in interactive mode. The interactive console will cache username and password, server
address, server port, web server properties and database name between commands, so after the first
command that accesses a given database, subsequent commands need not ask for them again.
Default values for most command options are stored in the xdb.properties configuration file in the
home directory of each xDB user. These options need not be specified on the command line every time
a command is invoked. The default values in the configuration file can be modified by specifying the
corresponding parameter on the command line. For more information about the configuration settings
and locations, see Configuration files, page 68.
The interactive console accepts all regular xdb commands, using the same syntax. To start the
interactive console, run the xdb command without options:
xdb
To close the console, type exit.
user@localhost: ~ $ xdb
xDB 10.5.0 command line client (c) 1999-2013 EMC Corporation
Type ’help’ for a list of commands and options, type ’exit’ to leave the shell.
xdb> ls -d united_nations
abc.xml
Commands
The table below lists the commands that are available from the xDB command-line client, page 246.
Table 35 Command line client commands
Command Description
add-binding bind a library to specified xDB server node in addition to all the existing
bindings.
add-log-directory add a secondary transaction log files directory to a node
add-node add a server node
add-segment add a segment to the database
add-file add a file to a database segment
admin start the admin client graphical user interface
attach-library attach a library back into a database from detached state.
backup backup a complete federation
backup-library backup a library or multiple libraries within a database
cat prints a file
cd change the current directory
change-binding bind a library to specified xDB server node.
check-database check database consistency
check-federation check federation consistency
check-librarychild check library child consistency
check-node check node consistency
clean-database clean a database’s MultiPath index segments
clean-library clean a library’s MultiPath index segments
configure-federation change federation settings (superuser password, license key)
create-database create a database
create-federation create a federation
create-replica create a replica
Command Description
delete-database delete a database
detach-library detach a library if it is detachable
federation-set manage a federation set (create, add federations, remove federations)
force-detach forcibly detach a library
help print a help message
import import documents into the database
info print session information
ll print a list of database objects and their attributes
ls print a list of database objects
mkdir create directories (always creates missing parents)
mv move a file or folder
put create a document in the database
remove-binding remove specified server node as one of the binding nodes of a library.
remove-log-directory remove a secondary transaction log files directory from a node
remove-node remove a server node
remove-segment delete a segment from the database
repair-blacklists repair MultiPath index blacklists
repair-addblacknode add black node to MultiPath index
repair-indexinfo repair MultiPath index information
repair-mappings repair MultiPath index file mappings
repair-merge merge MultiPath index
repair-removeblacknode remove black node from MultiPath index
repair-segments repair MultiPath index segments
repair-set-usable set MultiPath index usable
repair-searchable flip library non-searchable flag (also repairs non-searchable flag issue from
xDB 9.0.0)
repair-input-encoding repair or check input encoding after upgrade to xDB 7.1
restore restore a complete federation from a backup
restore-library restore one library or multiple libraries within a database from a backup
rm delete files/directories
run-server start an xDB server
run-server-repair Starts an xDB server node in repair mode, suitable for index repair.
set-file-maxsize set max size of a segment datafile
set-library-state set states of library
show-backup-metadata show metadata of a specified backup file or multiple backup files in a
specified folder. Metadata fields will be separated by tabs.
show-segment show properties of a segment. Fields will be separated by tabs.
show-unusable-indexes show information of all unusable indexes in specified database
Command Description
show-unusable-libraries show full paths of all unusable libraries in specified database
statistics print statistics about MultiPath index
statistics-bl print MultiPath index blacklists info
statistics-li print subindex specific information
statistics-ls print MultiPath index sub indexes info
stop-server stop a running xDB server
suspend-diskwrites suspend or resume disk writes
sync-entries sync index record and index_info entries
update-node update a server node
xquery execute an XQuery, either given as a direct argument or through the --file
option.
Creating a federation
When xDB is installed, a federation is created automatically. Additional federations can be created
using the xdb create-federation command or the Admin Client.
The xdb create-federation command supports the following options:
Option Description
-f, --federation ARG The absolute path and file name of the bootstrap file for the new federation.
--log ARG Comma-separated list of transaction log file directories for the new federation.
The first one in the list is the primary log directory. Relative paths, if
used, are resolved relative to the bootstrap file directory. If not specified, a single
log directory for the node is created in the default location.
Note: For performance reasons, it is best practice to keep the transaction log files
on a different physical hard disk than the database data pages.
--pagesize ARG The database page size for the new federation.
-p, --passwd ARG The initial superuser password.
Example
The following example uses the xdb create-federation command to create a federation.
xdb create-federation --log /var/dblogs/fed1logs \
--federation /var/databases/Federation1.bootstrap
The xdb info command only shows open sessions. Closed sessions are not listed in any internal
administration, and so are available for garbage collection if they are no longer referenced by user code.
A <page not in cache> entry in the description usually means that the locked object is new and its first
page has not yet been entered in the database server cache. Another possibility is that the first object
page has been removed from the cache to create space for other pages; in this case, the object name is
not retrieved from disk, to avoid affecting the performance of current transactions.
Option Description
-f --federation ARG The federation bootstrap path or URL.
-d --database ARG The name of the database.
-c --cache ARG The cache pages for the database session.
-y --non-interactive Runs the non-interactive mode that does not ask for missing parameters.
--debug Prints stack traces.
--stdout ARG Redirects standard output to a file.
--stdout-append ARG Redirects standard output to a file (append mode).
--stderr ARG Redirects standard error output to a file.
--stderr-append ARG Redirects standard error output to a file (append mode).
-v --verbose Runs in extra verbose mode.
-V --version Prints version information and exits.
-h --help Prints overview of common options and commands.
Server-related commands
add-node
add-log-directory
remove-log-directory
remove-node
Argument Description
run-server
Argument Description
--address ARG Listen address. The address argument can be used on
a multihomed host for a ServerSocket that will only
accept connect requests to one of its addresses. The
value "*" means the server will accept connections
on any/all local addresses.
--port ARG Port number of the server node.
--nodename ARG Name of the server node to start. Optional. If
nodename is not specified, the command will attempt
to start the primary node.
--force If specified, ignore errors during recovery and mark
offending libraries as unusable.
-f ARG | --federation ARG Path to bootstrap file.
run-server-repair
Starts an xDB server node in a special mode suitable for index repair.
Usage: xdb run-server-repair [options]
The arguments are the same as in the case of the run-server command.
stop-server
Argument Description
update-node
Argument Description
--logpath ARG New log directory for the server node to be updated.
Optional.
If not specified, the log path will not be updated. If
specified, the directory must be accessible to both the
primary and the specified non-primary nodes.
--port ARG New port number of the server node to be updated.
Optional.
--host ARG New host name of the server node to be updated.
Optional.
--nodename ARG Name of the server node to be updated. Required.
Library-related commands
add-binding
Binds a library to a specified xDB server node in addition to all the existing bindings.
Usage: xdb add-binding [options] path
Argument Description
path Path of a library which will be bound to the specified
node in addition to all the existing bindings. Supports
multiple libraries, separated by space.
--nodename ARG The name of the node to which to bind the library
(read-only libraries can bind to multiple nodes).
Required.
add-file
Argument Description
--maxsize ARG The maximum size (in bytes) that the file is allowed
to grow to. Default is 0. A value of 0 means the size
is unlimited.
-d ARG | --database ARG Database name. Required.
add-segment
attach-library
backup-library
change-binding
detach-library
force-detach
Argument Description
segment The ID of the segment on which the library was
created.
-d ARG | --database ARG Database name.
remove-binding
remove-segment
Argument Description
segmentid The ID of the segment to remove. Required. Supports
multiple segments, separated by space.
-d ARG | --database Database name. Required.
restore-library
Argument Description
path Path of one library or multiple libraries to restore,
separated by space. Optional. If not specified, restore
all libraries in the backup file.
Argument Description
--overwrite Overwrite data files if they exist. Optional.
--file Input file. Optional. If not specified, use standard
input.
set-file-maxsize
Argument Description
path The full path of the database file (as produced by the
show-segment command).
--segment ARG The id of the segment which the datafile belongs to.
--maxsize ARG The maximum size (in bytes) that the file is allowed
to grow to. A value of 0 means the size is unlimited.
Required.
-d ARG | --database ARG Database name. Required.
set-library-state
Argument Description
path Path of a library whose state will be changed.
Supports multiple paths, separated by space.
--searchable true|false Set the library to searchable or non-searchable.
Optional. If not specified, the library search-state
remains unchanged.
Argument Description
--readonly true|false [--recursive]
Set the library state to read-only or read-write.
Optional. If not specified, the library read-state
remains unchanged. If the --recursive option is
specified, set specified read-state on the library
including the descendants; if not specified, only set
state on this library.
show-segment
Shows all properties of a segment, including the segment paths. Properties will be separated by space.
Usage: xdb show-segment [options] segmentid
Argument Description
segmentid The ID of the segment whose properties will be shown.
Required. Supports multiple segments, separated by
space.
--datafile true|false If true, show only information about the database files
in the segment, including full path, maximum file size
and current file size.
-d ARG | --database ARG Database name. Required.
show-unusable-libraries
Argument Description
General commands
ls
Argument Description
path Path to perform the operation on. Supports multiple
paths, separated by space.
--type ARG Only list contents of the given type. Must be either
document, library, or detachable (only list detachable
library children).
--details List detailed properties in columns separated by tabs.
If this option is specified, all library properties are
shown, including segment IDs.
show-backup-metadata
Shows metadata related to a specified backup file, or to all backup files in a specified folder. Metadata
fields will be separated by spaces.
Usage: xdb show-backup-metadata [options] path
Argument Description
path File path. If the path denotes a normal backup file,
show metadata of the backup file; if the path denotes
a folder, show backup metadata of every file in the
specified folder.
--recursive Show backup metadata of every file in the specified
folder and its subfolders recursively. Ignored if the
path denotes a normal backup file.
show-unusable-indexes
Argument Description
It is best practice to back up your xDB federations on a regular basis, to minimize data loss in case of a
system failure. The xDB administration tools provide backup/restore functionality for:
• creating and restoring full and incremental backups of federations
• backing up and restoring libraries
• (de)serializing libraries and documents
A backup made while any xDB code is running is called an online backup, or hot backup.
A backup made while no xDB code is running is called an offline backup, or cold backup.
• Incremental backup, page 263: xDB backs up only the changes since the most recent online or
incremental backup. Note: xDB provides a standalone option to back up the entire federation
without affecting the current sequence of incremental backups.
• Offline backup, page 266: provided that no xDB code is running on a federation, a conventional file
backup utility can be used for a cold backup of the entire inactive federation. An alternative is to run
the xdb backup command specifying the federation bootstrap file as the federation. If xDB code is
running on the federation, use of external backup tools is not a good choice. Even if no transactions
are open, the server can still flush dirty pages from the cache to the database files during the backup,
which could result in data inconsistency in the backup files.
• Snapshot backup, page 266
To allow for a low-level software or hardware tool to take snapshots of an active federation, xDB
write activity can be temporarily suspended.
By default, xDB restores all files to the backup location, and does not overwrite existing files. The
following restrictions apply to restoring backups:
• Before restoring incremental backups, a full backup must be restored.
• Incremental backups must be restored in the order they were created.
• Any existing database files must be deleted or moved manually before restoring a federation.
Note: You must restore online backups using the same xDB version that they were taken on. If you
want to restore a database/federation from a backup taken on an older xDB version, you must do so
using that xDB version, then cleanly shut down the server and upgrade.
Note: A federation server must never be started before the last incremental backup has been restored.
It is not possible to restore any further incremental backups after starting the server.
Incremental backups store only the data that has been modified since the most recent full or
incremental backup.
The federation’s primary transaction log is stored in the backup file. By default, any log files that are
no longer needed for transaction rollback or recovery are automatically deleted.
To allow incremental backups, the keep-log-files option of the federation must be enabled before
creating the initial full backup. You can set this manually, either:
• in the Admin Client, by using the Set keep-log-files option dialog, accessible through the Federation
menu’s option Change keep-log-file option.
• in the bootstrap file, by setting the keep-log-files attribute of the log element to true. Note: The
bootstrap file can only be edited if xDB is not running.
When the keep-log-file option is enabled, obsolete log files are only removed when a backup
is created.
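The bootstrap edit described above amounts to a one-attribute change. The fragment below is a sketch, not a complete bootstrap file: any other attributes of the log element and the surrounding elements of your bootstrap file must be left as they are, and the file can only be edited while xDB is not running.

```xml
<!-- Fragment of the federation bootstrap file: the log element
     with the keep-log-files attribute set to true.
     All other attributes and content are kept unchanged. -->
<log keep-log-files="true"/>
```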
You can perform incremental backups with the Admin Client or the Command Line Client:
• in the Admin Client, select the Backup option of the Federation menu, and set the Incremental
option of the Backup dialog,
• in the Command Line Client, use the --incremental option of the xdb backup command,
page 263.
Note: An incremental backup is only valid relative to the latest full backup that precedes it. If you
need to do a full backup without disturbing your current sequence of incremental backups, create a
standalone backup. A standalone backup does not affect the next incremental backup, and can be
created by running the command xdb backup with the --standalone flag. It is not possible to
create an incremental backup relative to a standalone backup.
Note: During incremental backup of a federation that includes a library with MultiPath index,
final merge logging optimization should be disabled. You can enable or disable optimization on
the federation by setting the MultiPath indexing property xdb.lucene.finalMergeNoLogging to true
(default) or false. For more information, refer to the section about MultiPath index merge, page 156.
Argument Description
-o ARG | --file ARG Specifies the output file. If no output file is specified, the output is
sent to standard output.
--incremental Creates an incremental backup.
--standalone Creates a standalone backup.
--keeplogfiles Keeps obsolete log files after completing the backup.
--include-segments | --skip-segments Comma-separated list of either segments to be included or
segments to be skipped during backup. Segments are specified
as database:segmentID. The include and skip options are
mutually exclusive.
--include-segments-file | --skip-segments-file Location of a file containing a comma-separated list of
either segments to be included or segments to be skipped during
backup. Segments are specified as database:segmentID. The
include and skip options are mutually exclusive.
--overwrite Overwrites an existing output file.
Example
The following example creates an incremental backup and writes the output to the xdb_backup.bak file.
xdb backup --federation xhive://localhost:1235 --incremental --file xdb_backup.bak
If the amount of backup data is large, it is faster to back up from the same JVM as the page
server is using. Backing up from the same JVM avoids sending all the data over a TCP
connection. If such a backup on the server side is not practical, a possible alternative may be to
stop the page server and run a backup directly on the federation by passing the --federation
path/to/FederationFile.bootstrap option to the backup command.
Argument Description
--federation <value> The new bootstrap file location. If no location is specified, the federation
is restored to the same location from which it was backed up, including
configuration values from the xdb.properties file.
Relative paths in the original bootstrap file are interpreted relative to the new
bootstrap file. Database files specified with absolute paths are restored to
their original location.
--file <value> The name of the input file containing the backed up data. If no input file is
specified, the input is read from standard input.
--overwrite Overwrites database files that already exist in the target federation. If not
set, restore fails if any of the database files already exists, to protect data
from accidental overwriting.
--relative-mapper Maps restored paths relative to the restored bootstrap file.
--configurable-mapper <value> A file that maps new restore paths to the old paths. If no path is
provided an error will be raised. For an example, see below.
<xhive-configurable-restore>
<restore-path>
<old-path>
log
</old-path>
<new-path>
C:\foo\log
</new-path>
</restore-path>
<restore-path>
<old-path>
MyDatabase-default-0.XhiveDatabase.DB
</old-path>
<new-path>
C:\foo\MyDatabase-default-0.XhiveDatabase.DB
</new-path>
</restore-path>
</xhive-configurable-restore>
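The mapping file shown above can be inspected with any XML parser. The sketch below uses only the JDK (no xDB classes; the class name RestorePathMapping is illustrative, not part of the xDB API) to show how each restore-path entry pairs an old path with a new one:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class RestorePathMapping {
    public static void main(String[] args) throws Exception {
        // A minimal instance of the mapping format from the manual.
        String xml =
            "<xhive-configurable-restore>"
          + "  <restore-path>"
          + "    <old-path>log</old-path>"
          + "    <new-path>C:\\foo\\log</new-path>"
          + "  </restore-path>"
          + "</xhive-configurable-restore>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Collect each old-path/new-path pair into a map.
        Map<String, String> mapping = new LinkedHashMap<>();
        NodeList paths = doc.getElementsByTagName("restore-path");
        for (int i = 0; i < paths.getLength(); i++) {
            Element e = (Element) paths.item(i);
            String oldPath = e.getElementsByTagName("old-path").item(0).getTextContent().trim();
            String newPath = e.getElementsByTagName("new-path").item(0).getTextContent().trim();
            mapping.put(oldPath, newPath);
        }
        System.out.println(mapping.get("log")); // prints C:\foo\log
    }
}
```

The same structure applies when mapping database files, as in the MyDatabase-default-0.XhiveDatabase.DB entry above.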
If the data files have been lost, the current log files are intact, and the keep-log-files option was set,
the log files can be used to restore the database.
To restore lost data from log files:
5. Start the server. The server uses all the log files to recover the federation to its most
recent state.
Related topics
The xdb restore command
Offline backups
If no xDB code is running, a federation can be backed up and restored using any regular file backup
and restore utility. Another option is to run the xdb backup command using an in-JVM server by
specifying the federation bootstrap file as the federation.
If xDB code is running on the federation, a cold backup is not a good choice. Even if no transactions
are open, the server can still flush dirty pages from the cache to the database files during the backup,
which could result in data inconsistency in the backup files.
Related topics
Using the xdb backup command
Federation snapshots can be created using any appropriate software or hardware method. After
creating the snapshot, any regular file backup utility or the xDB backup command can be used to
back up the federation files.
If the snapshot is atomic, no special measures are required. The disk image of the federation files is
always in a consistent state. If the snapshot is not atomic, all xDB write activity can be temporarily
suspended to take a consistent snapshot of the federation. For example, if several snapshots of different
file systems are required to back up a federation.
From the command line, you can use the xdb suspend-diskwrites command. This command supports
the following options:
Option Description
--flush Flushes all dirty pages in the cache to the disk.
--checkpoint Takes a lightweight checkpoint. If used together with the --flush option,
it takes a heavyweight checkpoint. If creating a backup while disk writes are
suspended, this ensures that redo recovery is not necessary after restoring
the backup.
The checkpoint option is ignored on replicators. Replicators cannot take
independent checkpoints.
--sync Flushes all files to the disk. This option is useful if you use a low level backup
mechanism that bypasses the operating system when copying the federation
files.
--resume Resumes disk writes after suspension.
Example
The following examples use these commands to back up and restore a library.
The backup() method of XhiveFederationIf creates an online ("hot") backup of the federation. It
can only be called if a server is running. The xdb backup command is a simple wrapper for the
API method.
To create an incremental backup, use the backup method with the BACKUP_INCREMENTAL option.
To allow incremental backups, the keep-log-files option of the federation must be enabled before
creating the initial full backup, using the setKeepLogFiles() method of the XhiveFederationIf interface.
Example
The following example code uses the backup() method to create an online federation backup.
Example
The example code below restores a federation from a backup file.
XhiveFederationFactoryIf federationFactory = XhiveDriverFactory.getFederationFactory();
FileInputStream in = new FileInputStream("backupfile");
federationFactory.restoreFederation(in.getChannel(), null, null);
Example
The following example backs up a library.
XhiveSessionIf session = XhiveDriverFactory.getDriver().createSession();
session.connect("administrator", "password", "database");
XhiveLibraryIf lib = /* get a handle to the library for which to create the backup */ null;
if (lib == null) {
// throw exception
}
FileOutputStream out = new FileOutputStream("backupfile");
lib.backup(out.getChannel());
Example
The following example restores a library.
XhiveFederationFactoryIf federationFactory = XhiveDriverFactory.getFederationFactory();
FileInputStream in = new FileInputStream("backupfile");
// Restore all libraries in the backup to their original location
federationFactory.restoreLibrary(in.getChannel(), null, null, null);
The following method in XhiveDatabaseIf allows you to create a backup containing multiple
detachable libraries:
void backupLibrary(Collection<XhiveLibraryIf> libraries,
WritableByteChannel out)
This method backs up the specified libraries to the file specified by out. If out is null, the backup goes
to the standard output. The specified libraries must all be detachable and read-only.
You can restore a backup containing multiple libraries using the normal restore process.
To selectively restore a library from a backup containing multiple detachable libraries, use the following
method from XhiveFederationFactoryIf:
void restoreLibrary(ReadableByteChannel in, Collection<String>
librariesToRestore, String bootstrapFilename, PathMapper mapper)
This method restores the libraries specified in librariesToRestore (each represented by a full path
string), from a backup in. If librariesToRestore is null, the method restores all the libraries in the
backup. If any specified libraries are not found in the backup, the method throws an exception.
When you create a backup, metadata about the backup is stored in a backup header. In the Admin
Client, you can use the Backup information option of the Federation menu to view this backup
information.
You can use the XhiveBackupInfoIf interface to read this backup header. To get an
object of this type populated with the header information about a specified backup, use the
XhiveFederationFactoryIf.getBackupInfo method. You can then read the backup metadata using
the following methods of XhiveBackupInfoIf:
Methods Description
BackupType getBackupType() Returns the backup type.
String getDescription() Returns a textual description of the backup, for example: "Library
backup for database db1 created at 9/23/10 4:11 PM, backup LSN =
15609. Included libraries: /library1".
Date getCreationTime() Returns the time of creation.
String getDatabase() Returns the database name of the libraries included in the backup.
This method is applicable only to library backups.
Collection<String> getExcludedLibraries() Returns a collection of full paths of detachable libraries that are
excluded from the backup. This method is applicable only to
federation backups. In a federation backup, the library full path takes
the qualified format, for example: db1:/library1.
Collection<String> getIncludedLibraries() Returns a collection of full paths of detachable libraries that are
included in the backup. This method is applicable only to library
backups.
long getBackupLSN() Returns the backup LSN.
The workDir argument represents a working directory that will be used for storage of temporary
data resulting from processing the transaction log files in the backup. If workDir is null, the
operating system default temporary directory will be used.
The getBackupDriver() method takes a sequence of backups; in the code example above, a full
federation backup and two incremental backups are provided.
2. Initialize and use the backup driver like a regular driver:
driver.init();
XhiveSessionIf session = driver.createSession();
session.connect(UserName, UserPassword, DatabaseName);
// ... perform read-only work on the backup data ...
session.commit();
driver.close();
}
API documentation
com.xhive.XhiveDriverFactory
com.xhive.core.interfaces.XhiveDriverIf
com.xhive.core.interfaces.XhiveSessionIf
Methods Description
detach(String) Detaches a library if it is detachable. The library is logically removed
from the database once it is detached.
attach(String) Attaches a detached library to the original database from which it
was detached. When a detached library is attached to the original
database, it can be attached to the original location or moved to a
different location.
forceDetachChild(String) Forcibly detaches a child library. This method is designed to be used
only when the child library (and/or any descendants of this child
library) is believed to be corrupted.
forceAttach(String) Attaches a library that was previously detached through
forceDetachChild(String), or that has one or more currently unusable
segments, back into the database from the detached state.
getState() Identifies the state of a library.
setState(LibraryState) Changes the state of a library. Changing the state of a detachable
library to read-only also changes the state of all library children to
read-only. Changing the state of a detachable library to read-write
causes an exception if any of its ancestors is in a read-only state.
getAllSegmentIds() Gets the ids of all the segments occupied by a detachable library.
getBindingNodes(boolean) Gets the names of all the binding nodes of a detachable library.
addBinding(String) Binds a read-only detachable library and all its descendant libraries to
the specified node in addition to all its existing bindings.
remove(String) Removes the specified node as one of the binding nodes of a read-only
detachable library and all its descendant libraries.
changeBinding(String) Binds a detachable library and all its descendant libraries to the
specified node.
backup(WritableByteChannel) Creates a backup of a read-only detachable library.
getNonSearchable() Identifies whether a library is searchable.
setNonSearchable(boolean) Changes a searchable library to a non-searchable library.
Sample
DetachLibrary.java
Related links
Using the library backup() method
Using the restoreLibrary() method
API documentation
com.xhive.dom.interfaces.XhiveLibraryIf
lib.setState(LibraryState.READ_ONLY, true);
lib.detach();
Once the library is successfully detached, use the special XhiveLibraryIf.attach(String, String,
String, XhiveFederationFactoryIf.SegmentIdMapper) call in another session which is connected to
the destination database to actually move the library:
// move the library from the 1st database to the 2nd database
root2.appendChild(root2.attach(databaseName1,
administratorPassword1, segmentId, mapper));
session2.commit();
This special attach API is different from the regular XhiveLibraryIf.attach(String) API as it needs
to access the data of a different database. As a result, the caller must also be the
administrator of the source database and provide the administrator credentials in the API call.
The segments of the library are moved into the destination database together with the library. The
ID of a segment is unique only within the database in which it was originally created, so there
may be an ID collision when moving a segment to another database. Should it be necessary, an
XhiveFederationFactoryIf.SegmentIdMapper object can be specified to rename the IDs of the
segments in the destination database.
The data files of the segments are also moved, logically. Since xDB adopts a naming convention
which combines database name, segment ID and file ID for all its data files, the moved data files will
violate this convention after the move. xDB attempts to rename them under certain conditions or
the next time the server restarts.
Sample
MoveLibrary.java
API documentation
com.xhive.dom.interfaces.XhiveLibraryIf
com.xhive.core.interfaces.XhiveFederationFactoryIf.SegmentIdMapper
Method Description
setLibraryUsableBySegmentId(String segId, boolean usable) Marks the detachable library whose root page
is on the specified segment as usable if usable is true, unusable otherwise.
setLibraryUsableByPageId(long pageId, boolean usable) Marks the detachable library that contains the
specified page as usable if usable is true, unusable otherwise.
setLibraryUsableByPath(String fullPath, boolean usable) Marks a detachable library, which is specified
by the full path, as usable if usable is true, unusable otherwise.
getAllUnusableLibraries() Returns a collection of strings representing the full paths of all
unusable libraries in the specified database.
getAllUnusableExternalIndexes() Returns all unusable MultiPath indexes in the database.
server node always has one single primary log directory, and one or more secondary log directories can
be added. Secondary log directories can be removed; the primary log directory cannot be removed.
For performance reasons, it is preferable to have each transaction log location on a separate disk.
However, even if all duplicates are on the same disk, increased redundancy can still help protect the
transaction logs against the consequences of I/O errors, file corruption, and so on.
Note: When keeping duplicates of transaction log files, xDB concurrently writes the same information
to multiple identical log files. Duplicating transaction log files therefore increases the amount of
I/O that the page server must perform. Depending on your configuration, this may impact overall
performance.
To recover from a failure on the primary log directory, first shut down the page server, then copy
the duplicate log files from a secondary log directory to the primary log directory, and then restart
the page server.
xDB writes to the available secondary log directories while ignoring any unusable ones. If xDB
cannot write to a secondary log directory, it marks that log directory as "unusable" in the bootstrap
file and places an error message in the message log. Writing to the primary log directory and any
other secondary log directories proceeds normally.
Log directories can be added or removed using command line client commands, page 248.
You can create a federation with multiple log directories, and/or add a log directory to an existing
node, as follows:
XhiveFederationFactoryIf ff = XhiveDriverFactory.getFederationFactory();
XhiveLogConfigurationIf logConfig =
ff.createLogConfiguration(PRIMARY_LOG_DIR, SECONDARY_LOG_DIRS_LIST);
ff.createFederationWithLogConfig(BOOTSTRAP, logConfig,
PAGE_SIZE, SUPERPWD);
Monitoring statistics
xDB implements extensive monitoring capabilities, which can be used to analyze various aspects
of xDB performance.
xDB monitors statistics separately for each server, which allows for better understanding of each
server’s load.
xDB supports monitoring of various statistics categories, page 279, including number of cache hits,
page access, query response time, number of connected transactions and transaction rollback time.
Monitoring can be enabled and disabled for various categories, either on a system or on transaction
level.
By default, monitoring is disabled. It can be enabled or disabled through the property
XHIVE_STATISTICS_MONITORING_ENABLED in the xdb.properties file, which is set to false
by default.
If you set this property to true and then run an xDB server, monitoring is enabled for that
server.
With statistics monitoring enabled, you can use the Command Line Client command monitor-statistics
with the following arguments:
• path: The connection URL to the local JMX server (required).
• password: The password to the local JMX server (required).
• username: The username to the local JMX server (required).
• duration: The amount of time (in seconds) to monitor; infinite if empty.
• interval: The statistics polling interval (in seconds). Defaults to
XHIVE_STATS_MONITOR_INTERVAL.
The example code below shows how to subscribe to and receive monitored buffer pool cache statistics:
XhiveStatisticsSubscriptionIf subscriber =
getDriver().getStatisticsSubscription("xdb.bufferpool");
XhiveStatisticsSnapshotIf snapshot = subscriber.getSnapshot();
Map<String, XhiveStatisticValueIf> data = snapshot.getData();
for(String category : data.keySet()) {
System.out.println(snapshot.getTimeStamp() + ": " + category + "="+
data.get(category).toString());
}
♦ concurrent.internal
♦ concurrent.leaf
♦ namebase.root
♦ namebase.internal
♦ namebase.leaf
These values and their combined prefixes are accepted.
RAM segments
A RAM segment is a special type of database segment that is kept in the database cache but never
written to a file.
Note: The size of the database cache limits the amount of data that can be kept in a RAM segment.
Furthermore, if too large a part of the cache is used to store temporary data, database performance can
suffer due to a lack of cache pages for other uses.
In the Admin Client, a RAM segment for temporary data can be enabled in the Database properties
dialog, by selecting the option ram_segment from the Temporary data segment dropdown list.
Read-only federations
If your xDB application requires that data remains unchanged, you can put that data in a read-only
federation. For example, this can be useful for distribution on read-only media such as CD-ROM. To
facilitate such usage, xDB has some special features for read-only federations.
A federation can be changed into a read-only federation by changing the bootstrap file to a read-only
file.
A read-only federation has the following characteristics:
• It requires no log files. For example, copying a federation to a CD-ROM only requires copying the
bootstrap file and database files. The log files are not required. A federation can only be copied
when the xDB server is not running and has been shut down cleanly. A clean shutdown allows the
server to write all modified pages back to disk, so the log files are not needed on startup.
• Its bootstrap file is not locked. Multiple xDB applications can be run using the same read-only
federation.
• Its data cannot be modified.
Note: Although the data itself cannot be modified, a read-only federation can still allow for creation of
temporary data by using a RAM segment.
Federation sets
A federation set provides a convenient way to run multiple federations in a single server with a single
TCP port and a single page cache.
A federation set is defined by a federation set description file, which is a simple XML file that contains
a list of references to federations. Federation sets can be nested: a federation set description can
reference other federation set description files. The example below contains absolute references to two
federations and one relative reference to a (nested) federation set description file. Any relative paths to
federations or nested federation sets are interpreted relative to the federation set description file.
A federation referenced in a federation set description file does not have to exist. Adding a federation
to a federation set and creating that federation are separate and unrelated actions.
• Using the administration client and clicking Federation sets > Create federation set and entering
the file name for the new federation set description file. After creating the federation set description
file, the Modify federation set option can be used to add federations.
Configuring SSL on the client includes specifying a URL as the xhive.bootstrap property. The URL
has the format xhives://host:port. The system uses the default SSLSocketFactory to connect to the
xDB server. This may require setting JSSE system properties.
The Admin Client and the command line client offer options to check the consistency of a database,
federation, library, document or BLOB.
Select one or more check options, then click the Check button.
When the check is complete, you can select text in the Consistency checker report and copy the
selection to the clipboard.
3. Create a PrintWriter object that contains the consistency checker report, similar to the following:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PrintWriter pw = new PrintWriter(baos);
checker.setPrintWriter(pw);
Given an active session, the code sample below shows how to check a federation.
XhiveFederationIf fed = session.getFederation();
XhiveFederationConsistencyCheckerIf checker = fed.getConsistencyChecker();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PrintWriter pw = new PrintWriter(baos);
checker.setPrintWriter(pw);
if (checker.checkFederationConsistency().isConsistent()) {
System.out.println("Federation at driver: " +
session.getDriver().getFederationBootFileName() + " is consistent");
} else {
pw.flush();
System.out.println(baos.toString());
}
Message logging
Message logging provides run-time status information in a human-readable form, and is unrelated to
write ahead logging, page 43. This section applies to the java.util.logging logging back-end, which
is the default xDB message logging implementation. For more information about Java’s logging
system, refer to the Javadoc documentation for the class java.util.logging.LogManager, or the
documentation of your JVM. Note: To facilitate adaptation to environments that depend on specific
logging frameworks or configurations, xDB uses SLF4J as message logging framework. For more
information about SLF4J, refer to Message logging framework, page 287. If you want to use xDB with
a different logging implementation, refer to that implementation’s documentation.
The java.util.logging functionality relies on the global logging configuration for the JVM in which
it runs. By default, log messages with a priority of INFO or higher are written to the console.
Java’s logging system is typically configured through configuration files, either a default one
per JVM in JAVA_HOME/lib/logging.properties, or a file specified with the system property
java.util.logging.config.file, for example on the Java command line. For the xDB
standalone server, you can configure this in the additional Java VM option XHIVE_OPTS in
xdb.properties, or in the lax configuration file (see Configuration files for Windows, page 68).
For example, you can enable logging for the xdb command, by adding
-Djava.util.logging.config.file=[logging.properties] to JAVA_CMD.
Under Windows in xdb.bat:
"!JAVA_CMD!" -Xms!XHIVE_MIN_MEMORY! -Xmx!XHIVE_MAX_MEMORY! !XHIVE_OPTS! \
-Djava.util.logging.config.file=mylogging.properties \
-cp "!XHIVE_CLASSPATH!" com.xhive.tools.Cmd %*
Under UNIX in xdb.sh:
"${JAVA_CMD}" -Xms${XHIVE_MIN_MEMORY:-128M} -Xmx${XHIVE_MAX_MEMORY:-256M} \
${XHIVE_OPTS} \
-Djava.util.logging.config.file=mylogging.properties \
-cp "${CLASSPATH}" com.xhive.tools.Cmd "${@}"
You can configure multiple handlers to receive logging information (for example, to write log
messages to a file and to send them by email). You can set log levels for individual logging areas,
identified by hierarchical package names separated by '.'. Areas cascade: setting a log level on
com.xhive applies to all sub-packages, unless a sub-package has its own log level specified.
The example below configures a console handler, and specifies different logging levels for several
logging areas.
# specify a set of handlers, in this case just a console handler
handlers = java.util.logging.ConsoleHandler
# log level for a particular handler
java.util.logging.ConsoleHandler.level = INFO
# log level for xDB core messages
com.xhive.core.level = FINE
# less information from multi path indexes
com.xhive.index.multipath.level = SEVERE
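The cascade behavior can be observed with plain java.util.logging, independent of xDB. The sketch below loads the same settings programmatically (the logger name com.xhive.core.x is an arbitrary illustrative sub-package, and the class name LogLevelCascade is not part of the xDB API):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.Logger;

public class LogLevelCascade {
    public static void main(String[] args) throws Exception {
        // The same settings as the properties example above, loaded programmatically.
        String config =
            "handlers = java.util.logging.ConsoleHandler\n"
          + "java.util.logging.ConsoleHandler.level = INFO\n"
          + "com.xhive.core.level = FINE\n"
          + "com.xhive.index.multipath.level = SEVERE\n";
        LogManager.getLogManager().readConfiguration(
            new ByteArrayInputStream(config.getBytes(StandardCharsets.UTF_8)));

        Logger core = Logger.getLogger("com.xhive.core");    // picks up FINE from the config
        Logger child = Logger.getLogger("com.xhive.core.x"); // no own level: cascades from parent
        System.out.println(core.getLevel());                 // FINE
        System.out.println(child.isLoggable(Level.FINE));    // true, inherited from com.xhive.core
        // Note: the ConsoleHandler is still at INFO, so FINE records are
        // loggable on this logger but not printed by that handler.
    }
}
```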
xDB uses SLF4J (see www.slf4j.org) as message logging framework (with the exception of the xDB
Ant tasks, which use Ant’s built-in logging system).
SLF4J is a facade for Java logging frameworks such as java.util.logging, Log4J, logback and
commons-logging. SLF4J decouples the logging API from the actual logging implementation, making
it possible to plug in a different logging implementation at deployment time, without having to
modify the application code. This makes it easy to adapt xDB to environments that depend on specific
logging frameworks or configurations.
The logging in SLF4J is configured through the underlying logging implementation. How this is
done depends on the specific logging framework.
The xDB distribution uses java.util.logging as the default logging implementation. To use a different
logging framework, remove the lib/xhivedb/core/slf4j-jdk14.jar SLF4J binding library from the Java
classpath and substitute it with the desired SLF4J binding library.
Replication
Load sharing by spreading database reads over multiple page servers can provide a large performance
advantage. xDB supports this by means of lazy primary copy replication.
With xDB, replication involves maintaining one or more full copies of a federation, with each copy
having a separate page server. One page server maintains a “master” copy of the data, while one or
more additional page servers each maintain a read-only “slave” copy of the data.
Primary copy replication means that data updates occur on a master federation, known as the primary
or the primary copy. Applications write data only to the primary. The updates propagate from
the master to one or more read-only copies of the primary, known as replicas, secondary copies, or
slaves. One primary can have multiple replicas. For each replica, there is a separate, dedicated page
server called a replicator or replication server. A replicator can act as master for another replica.
Applications can perform read-only transactions and online backups on the replicas instead of on
the primary, to distribute query load and improve performance. Also, applications can provide for
using a replica as a failover.
The term lazy indicates that the master does not wait for a transaction to propagate to the replicas
before confirming the transaction. The replicas are updated asynchronously in the background.
The replication process sends the master’s primary transaction log files to the replicators. The
replicators read the log files and apply all updates to their own copy of the data. The master preserves
any required log files until the replica has confirmed that it received the files.
There are several ways to create the copy of the federation that serves as the replica:
• Use the Replicate federation option of the Admin Client.
• Use the xdb backup command or the Admin Client to create a full online backup, and restore
that backup on the desired machine.
• If no server is running for the federation, you can copy all the federation files to another machine
using a system file copy command.
Note: Do not make a system file copy while a server is running for the original federation, because
that will corrupt the data in the replica.
If you first register a replicator name with the master and then create the copy of the federation, the
replicator name is also registered with the copy. The copy then preserves its log records for a
replicator with this name, unless the name is unregistered from the copy.
To unregister a replicator name from a copy, use the Admin Client or the xdb create-replica command
with the --remove option.
• --replicator replicatorId
This option passes the replicator name. The replicator name must be registered with the master.
Clients can connect to the replicator server for read-only queries and online backups.
Example
The example below uses the xdb stop-server command to shut down a server, after updating
its two replicas.
Removing a replica
A replica is removed by stopping the replication server and removing the files.
Subsequently, the replicator must be unregistered from the master, so the master does not preserve the
log records for this replica. Otherwise the master preserves obsolete log files forever. A replicator
can be unregistered manually using the Admin Client or the xdb create-replica command with
the --remove option.
Example
The following example starts an internal replicator server.
XhiveDriverIf driver =
    XhiveDriverFactory.getDriver("/xhive/replica/XhiveDatabase.bootstrap");
driver.configureReplicator("xhive://masterhost:1234", "myReplicator");
driver.init(1024);
A temporary segment can also be created with the Admin Client: just right-click Segments, enter
an ID and check the temporary flag.
2. Create the replica by running the following command on the replica host:
The example assumes that the master server is running a page server that is accessible from other
hosts, and that the replica host has xDB installed. The create-replica command is a wrapper
around the XhiveFederationIf.registerReplicator(...) and XhiveFederationIf.replicateFully(...)
API calls. The calls register the replica in the master federation and copy the complete federation
over to the replica host.
return waitForUpdates(replicaDriver.createSession());
} else {
return masterDriver.createSession();
}
}
The code that pools the sessions replaces the createSession call. Since there are two drivers,
also set up two session collections.
The read-only transactions are faster because they use the replica. However, xDB replication is lazy.
When an update transaction on the master federation finishes, it is not guaranteed that a transaction
started directly afterwards on the replica sees the changes. This restriction can be problematic, for
example, if the application adds a document to a library and then immediately presents the contents of
the library to the user in a read-only transaction. The library contents might not be updated on the
replica yet. Therefore
it is better for a session on the replica to wait for updates made in a session on the master server. In the
session pool code, you could register a timestamp each time a read-write session completes:
XhiveSessionIf.TimeStamp currentWaitTimeStamp = null;
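The waiting mechanism itself can be illustrated without xDB classes. The sketch below is a conceptual model only, with invented names, not the xDB API: the replication thread records the last master timestamp it has applied, and a replica reader blocks until that timestamp reaches the one it observed on the master.

```java
import java.util.concurrent.TimeUnit;

/** Conceptual model: a reader on the replica blocks until the replication
 *  process has applied at least the master commit timestamp the reader
 *  observed. All names here are illustrative, not part of the xDB API. */
public class CatchUpGate {
    private long appliedTimestamp = 0;  // last master timestamp applied on the replica

    /** Called by the replication thread after applying a master transaction. */
    synchronized void markApplied(long ts) {
        appliedTimestamp = Math.max(appliedTimestamp, ts);
        notifyAll();
    }

    /** Blocks until the given timestamp has been applied, or the timeout expires. */
    synchronized boolean awaitApplied(long ts, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (appliedTimestamp < ts) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;  // replica did not catch up in time
            }
            wait(remaining);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        CatchUpGate gate = new CatchUpGate();
        // simulated replication thread applies master commit 42
        new Thread(() -> gate.markApplied(42)).start();
        // reader waits until the replica has caught up to timestamp 42
        System.out.println(gate.awaitApplied(42, 1000));
    }
}
```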
If data is evenly distributed among the various libraries, you can achieve parallel ingestion of the data.
One of the node servers is designated as primary server, and the other node servers are called
non-primary servers. Usually, each node server will run on a separate host machine, but multiple
node servers can run on a single host if necessary.
Note: The primary server must always be active to enable access to the database.
Although data to node server binding is implemented at library level, the actual binding information
is stored at segment level. Changing the binding of a library causes updates to all the segments of
the library.
An example configuration with two xDB nodes is shown in xDB multi-node configuration, page 298.
This example shows two front-end application servers, each hosting a web application with an xDB
client within it. Behind the scenes, each xDB client can connect to both of the server-side nodes. Each
node is bound to one or more libraries. One of the node servers is designated as the primary server, the
other node server is called a non-primary server.
Node server 1 serves the detachable libraries lib1, lib2, and lib3. As primary, it also serves the root
library, which cannot be detachable. Node server 2 serves detachable libraries lib2, lib3, and lib4.
Transaction recovery
Each node server maintains its own transaction log and recovers the libraries that are bound to it. A
transaction in a multi-node installation can only make updates on a single node. All transactions are
handled the same way as in a single-node installation. If a node server fails, restarting the node server
recovers the binding libraries to a consistent state.
Note: The primary node server must always be started first.
The sample bootstrap file below defines the example configuration shown in xDB multi-node
configuration, page 298.
The node and segment-to-server bindings are highlighted. There are two xDB nodes. The
name="primary" attribute for the first node indicates that this is the primary server. The federation has
six segments. The default segment and Seg1 are bound to the primary. Segments Seg4 and Seg5 are
bound to Node2. The two read-only segments Seg2 and Seg3 are bound to both nodes.
<?xml version="1.0" encoding="UTF-8"?>
<server version="xDB 10.1" pagesize="8192" license="<license>" passwd="<password>">
<node name="primary">
<log path="log" id="1210583277531" keep-log-files="false"/>
</node>
<node name="Node2" host="host2" port="1236">
<log path="Node2-log" id="1210583888574" keep-log-files="false"/>
</node>
<database name="Shanghai">
<segment id="default" temp="false" version="1" state="read-write"
usage="non-detachable" usable="true">
<file path="Shanghai-default-0.XhiveDatabase.DB" id="0"/>
<binding_server name="primary"/>
</segment>
<segment id="Seg1" temp="false" version="1" state="read-write"
usage="detachable_root" usable="true"
library-path="/Lib1" library-id="0">
<file path="Shanghai-Seg1-0.XhiveDatabase.DB" id="1"/>
<binding_server name="primary"/>
</segment>
<segment id="Seg2" temp="false" version="1" state="read-only"
usage="detachable_root" usable="true"
library-path="/Lib1/Lib2" library-id="0">
<file path="Shanghai-Seg2-0.XhiveDatabase.DB" id="2"/>
<binding_server name="primary"/>
<binding_server name="Node2"/>
</segment>
<segment id="Seg3" temp="false" version="1" state="read-only"
usage="detachable" usable="true"
library-path="/Lib1/Lib2" library-id="0">
<file path="Shanghai-Seg3-0.XhiveDatabase.DB" id="3"/>
<binding_server name="primary"/>
<binding_server name="Node2"/>
</segment>
<segment id="Seg4" temp="false" version="1" state="read-write"
usage="detachable_root" usable="true"
library-path="/Lib1/Lib4" library-id="1">
<file path="Shanghai-Seg4-0.XhiveDatabase.DB" id="4"/>
<binding_server name="Node2"/>
</segment>
<segment id="Seg5" temp="false" version="1" state="detach_point"
usage="detachable_root" usable="true">
<file path="Shanghai-Seg5-0.XhiveDatabase.DB" id="5"/>
<binding_server name="Node2"/>
</segment>
</database>
</server>
To the Administrator, the above bootstrap example would look like this in the Admin Client:
<node/>
The <node/> element describes a node server.
Attributes
host name: The host name of the node server, if the server is not the primary node.
port number: The port number of the node server, if the server is not the primary node.
directory: The default log directory of the primary server is log. The default log directory for a
non-primary node server is host name-log. With multi-node support, the "log" element is a child of
the "node" element (instead of the "server" element).
When a node server is added, the administrator can specify a different log directory for that
node server.
Child elements
<binding_server/>
The <binding_server/> element maps a segment (library) to multiple node servers.
Segment to server bindings are stored as child elements of <segment/>, page 231 element in the
bootstrap file.
Attributes
name The name of the node server to which the segment is bound.
Multi-node considerations
For multi-node, the recommended indexing method is multipath indexing.
Multi-node constraints
Example
The table below compares two examples of valid transactions with one invalid transaction, occurring
on the libraries in the multi-node architecture, page 297 example.
Transaction 1: begin; update Root-Lib; update Lib1; commit.
Valid, because the Root library and Lib1 are bound to the same server.
Transaction 2: begin; read Root-Lib; read Lib2; read Lib3; commit.
Also valid, because it performs no updates.
Transaction 3: begin; update Lib1; update Lib4; commit.
Not allowed. It violates the locking rules, because it updates Lib1 and Lib4, which are bound to
different nodes.
The following run-time restrictions apply to the primary and non-primary node servers:
• The primary server must always be started first. Non-primary servers can be started in any order.
• The primary server must be up and running to access the federation.
• When the primary server is down, the entire federation becomes unavailable.
• When a non-primary server is down, the libraries that are bound to the server become unavailable.
• A node server can be started only after the node server has been added.
• A node server can be updated or removed only if the node server is not running.
Managing nodes
xDB automatically creates a primary node as part of a federation. The primary node cannot be removed.
The command-line client provides xdb commands to manage and configure multi-node architecture
properly, including add-node, remove-node, update-node, add-binding, change-binding, and
remove-binding. The commands run-server and stop-server have multi-node options.
Note: If you need to modify the bootstrap file for multi-node, use xDB functions. Editing the bootstrap
file manually is NOT recommended.
Command examples
The following examples show use of the xdb add-node, xdb remove-node, and xdb update-node
commands to add, remove, and update node servers.
xdb add-node --passwd secret --nodename "Node1" \
--host "Node1Host" --port 1236
xdb remove-node --passwd secret --nodename "Node2"
xdb update-node --passwd secret --nodename "Node1" \
--host "Node1Host" --port 1238
Examples
The following code fragment adds, removes, and updates a node server.
XhiveDriverIf driver =
XhiveDriverFactory.getDriver("xhive://primaryHost:1235");
if (!driver.isInitialized()) driver.init(1024);
XhiveSessionIf session = driver.createSession();
session.connect(superUserName, superUserPassword, null);
XhiveFederationIf federation = session.getFederation();
// Add node Node1 with host Node1Host and port 1236
federation.addNode("Node1", "Node1Host", 1236);
// Remove node Node2
federation.removeNode("Node2");
// Change listening port of Node1 to 1238
federation.updateNode("Node1", "Node1Host", 1238);
The following code fragment uses the getAllNodeServerInfo() method to retrieve all non-primary node
server information.
XhiveFederationIf federation = session.getFederation();
List<XhiveNodeServerInfoIf> nodeInfoSet = federation.getAllNodeServerInfo();
for (XhiveNodeServerInfoIf nodeInfo : nodeInfoSet) {
System.out.println("Node name = " + nodeInfo.getNodeName());
System.out.println("Host name = " + nodeInfo.getHost());
System.out.println("Port number = " + nodeInfo.getPort());
System.out.println("Log directory = " + nodeInfo.getLogPath());
}
Locking rules
In an xDB multi-node configuration, a transaction can update data pages bound to a single node only.
Therefore a transaction cannot acquire write locks on more than one node. Despite this restriction,
distributed deadlock is still possible. For example, a transaction can read any data pages in the database
and thus acquire read locks on segments bound to any node server. Furthermore, when a transaction
updates a library, it has to acquire read locks on other segments/libraries on other nodes.
Distributed deadlock can be avoided by enforcing the following locking rules:
• A transaction that has only read locks can get a read lock on any objects bound to any node server.
• A transaction can only acquire a write lock on a node server if the transaction has no write locks
on the other node servers. All read locks on the other nodes must be on ancestor segments of the
segment on which the transaction is requesting write lock.
• Once a transaction has a write lock on a node server, it cannot request write locks on any other
node servers. The transaction can request read locks only on segments that are descendants of the
segment on which it has write locks, or on ancestor segments.
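These rules can be modeled in a few lines. The sketch below is a toy model for illustration only, not xDB code: segments are identified by their library paths, ancestry is path containment, and mayWrite applies the write-lock rules above.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the multi-node write-lock rules. All names are illustrative. */
public class LockRules {

    /** True if path a is the same as, or an ancestor of, path b. */
    static boolean isAncestor(String a, String b) {
        return b.equals(a) || b.startsWith(a.endsWith("/") ? a : a + "/");
    }

    /**
     * May a transaction acquire a write lock on (node, path)?
     * writeNodes: nodes on which it already holds write locks;
     * readLocks: node name mapped to the paths it has read-locked there.
     */
    static boolean mayWrite(String node, String path,
                            Set<String> writeNodes,
                            Map<String, List<String>> readLocks) {
        // rule: no write locks may be held on any other node
        for (String n : writeNodes) {
            if (!n.equals(node)) {
                return false;
            }
        }
        // rule: read locks held on other nodes must be on ancestor segments
        for (Map.Entry<String, List<String>> e : readLocks.entrySet()) {
            if (e.getKey().equals(node)) {
                continue;
            }
            for (String p : e.getValue()) {
                if (!isAncestor(p, path)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Like Transaction 3 in the example: a write lock is already held on
        // another node, so a second write lock is refused.
        System.out.println(mayWrite("Node2", "/Lib1/Lib4",
                Set.of("primary"), Map.of()));
    }
}
```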
Examples
The following code example creates a segment using the createSegment() method.
session.connect(dbaUserName, dbaUserPassword, "MyDatabase");
session.getDatabase().createSegment("node1", "segment1", null, 0);
The following code example changes the binding of a library to the node Node1.
XhiveLibraryIf library = session.getDatabase().getByPath("/library1");
library.changeBinding("Node1");
As an example of a possible multi-node application, consider how a fictional U.S. cell phone company
might use a multiple-node configuration to store cell phone call information. Each cell phone call is
logged as an XML document, and all call transactions are stored in a multi-node deployment, because
of the high volume of incoming data (3,000 call transactions per minute), as well as for the sake of
high availability and disaster recovery.
The application logic is implemented as a distributed web application, hosted within a Java application
server. The web application serves as the interface point for the company back-office systems. A
back-office system will generate the call transaction and send a transaction write request to the web
application. Each new call transaction will be saved as an XML document in a library for the region
where the customer was when placing or receiving the call. For example, if someone in New York
City receives a call, the resulting call transaction is logged to the USNortheast library. The web
application will also be the interface point for a web portal, where customers can search their personal
call transaction logs. When a user enters search criteria from the portal, the portal will generate a
query request to the web application.
Call volumes are spread equally across the various US regions, and the company models the data as
one XML library per US region, with all regional XML libraries sharing the same parent library:
MobileCallLog
• USNortheast
• USSoutheast
• USMidwest
• USSouthwest
• USWest
Deployment Topology
Due to its requirements and data model, the company chooses a multi-node deployment with 5 nodes,
where each node has read/write access to a single child library:
In this deployment topology, one server node has been mapped to each regional XML library to
handle read/write operations for that particular library. When the web application needs access
to call transaction data in a regional library, an App Server transparently obtains a connection to
the appropriate server node.
A spare server node is kept on hot standby, ready to be incorporated into the system if one of the
current server nodes should fail. As a result, 6 server nodes are deployed in all. Furthermore, in this
sample deployment topology the libraries reside on a SAN, therefore high availability and disaster
recovery of the actual data is abstracted away from the IT owner managing the deployment.
Three identical application server instances allow requests to be load-balanced across the application
servers. Each xDB client connects to appropriate server nodes as and when access to the library
bound to a particular server node is required.
Data access
To explore the relationship between calling application, client APIs, and server nodes, we will walk
through the system interaction for the following use cases:
• Calling application sends a request to store a call transaction to the database
• Calling application issues a query to find call transactions
• Calling application requests a specific call transaction
Issue a query
1. A customer uses a self-service web application to view his call log. In particular, the customer
would like to search for call transactions within the past week.
2. Customer-facing portal dispatches the query request to one of the 3 application servers. In addition
to handling call transaction write requests, the web application hosted on the 3 application servers
is also designed to handle query requests.
3. The application server receives the query request and turns the query request into a formal XQuery.
The web application within the application server invokes the XQuery client API.
4. The client API transparently dispatches page requests to all 5 server nodes. In this use case, the
customer’s query spans across all 5 libraries because the customer did not enter a qualifier in his
query such as “calls within the past week placed in NYC”.
5. The client API transparently gathers pages from the 5 server nodes, continues to process the
XQuery request as needed, then consolidates the results into a single XQuery result set.
6. The results ultimately get propagated from the web application up to the customer-facing portal
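Steps 4 and 5 describe a scatter/gather pattern. The sketch below models it with plain Java concurrency; the class, the per-node sub-queries, and the string results are illustrative, not part of the xDB client API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Conceptual scatter/gather: fan one query out to every node, then merge
 *  the partial results into a single consolidated result set. */
public class ScatterGather {

    static List<String> query(List<Callable<List<String>>> nodeQueries)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nodeQueries.size());
        try {
            List<String> merged = new ArrayList<>();
            // scatter: invokeAll submits one sub-query per node and waits for all
            for (Future<List<String>> f : pool.invokeAll(nodeQueries)) {
                merged.addAll(f.get());   // gather each node's partial result
            }
            Collections.sort(merged);     // consolidate into one ordered set
            return merged;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<List<String>>> nodeQueries = List.of(
                () -> List.of("call-NE-001"),
                () -> List.of("call-SW-002"),
                () -> List.of("call-MW-003"));
        System.out.println(query(nodeQueries));
    }
}
```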
Fetch a specific call transaction
1. Customer uses a portal to view his call log, and would like more information about a particular call
transaction. Therefore, the self-service portal issues a request to fetch a specific call transaction
to one of the 3 application servers.
2. The web application within an application server invokes a client API by providing information
about the call transaction, including the XML library where the underlying call transaction is stored.
3. Based on the XML library specified, the client API routes the fetch request to the correct server
node.
4. The server node receives the request and returns the call transaction.
5. The call transaction is ultimately propagated back up to the customer-facing portal.
One main benefit of multi-node lies in the area of high availability and disaster recovery.
The most comprehensive approach is for the multi-node deployment to include a server node whose
sole purpose is to act as a hot standby: this standby is running, but not bound to any of the XML
libraries. If one of the server nodes fails, the standby server node can be bound to the XML library
to which the failed server was previously bound. In the figure below, Node 4 has failed, and
the Standby Node has been bound to the US Southwest library to provide continuity of service.
High availability can also be achieved without a hot standby node. In the event of a
node failure, another bound node can take over read access for the affected library. In the
figure below, Node 2 has failed, and the US Southeast library has been bound to Node 1,
giving Node 1 access to 2 XML libraries, so continuity of service has been maintained.
Note: These high availability techniques can only be used for read-only libraries. If a node fails that
serves a read/write library, recourse is to either restart the failed node, or to replace the failed node with
another preconfigured one.
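The rebinding step itself can be sketched as a simple mapping change. The class below is a toy model of the failover described above, with invented library and node names, not xDB code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of failover: when a node serving read-only libraries fails,
 *  its libraries are rebound to a surviving or standby node. */
public class FailoverPlan {

    /** Returns a new binding map with every library of the failed node
     *  moved to the takeover node. */
    static Map<String, String> rebind(Map<String, String> binding,
                                      String failedNode, String takeoverNode) {
        Map<String, String> updated = new LinkedHashMap<>(binding);
        updated.replaceAll((lib, node) ->
                node.equals(failedNode) ? takeoverNode : node);
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> binding = new LinkedHashMap<>();
        binding.put("USSouthwest", "Node4");
        binding.put("USWest", "Node5");
        // Node4 has failed; its library moves to the standby node
        System.out.println(rebind(binding, "Node4", "StandbyNode"));
    }
}
```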
An additional, worst-case disaster recovery policy should be to back up the XML libraries periodically.
The XML library backup can occur in parallel. In case of a catastrophic failure, for example if the
XML libraries are lost along with all the server nodes, a new environment can then be created by
restoring the backup XML libraries in parallel, and bringing new server nodes online.
For developing and testing its new system, the phone company chooses a slightly simplified setup:
The partitioning of the documents is based on US regions, with a separate library for each region plus
one hot standby node, but with only three active node servers, and some libraries sharing a node server.
The Java code fragment below creates and starts the server nodes for the above multi-node
architecture.
class MobileCallLogBackend {
/*
* This method creates the Mobile application deployment topology.
* It creates federation and database and then
* creates and runs xDB nodes to serve incoming query/update queries.
*/
private void setUp() {
// create federation
XhiveFederationFactoryIf ff = XhiveDriverFactory.getFederationFactory();
ff.createFederation(BOOTSTRAP_FILE_NAME, PRIMARY_LOG_FOLDER, PAGE_SIZE, SUPERPWD);
//connect as a superuser
superSession.connect(SUPERUSER, SUPERPWD, null);
//start transaction
superSession.begin();
XhiveFederationIf federation = superSession.getFederation();
//create database
federation.createDatabase(DBNAME, DBPWD);
superSession.commit();
}
/**
* This method starts a node.
* @nodeName name of the node to run.
* @port specifies a port to listen to incoming requests.
* @return driver object of the running xDB node
*/
XhiveDriverIf startNode(String nodeName, int port) throws IOException {
//create and initialize node server
XhiveDriverIf nodeDriver =
XhiveDriverFactory.getDriver(BOOTSTRAP_FILE_NAME, nodeName);
if (!nodeDriver.isInitialized()) {
nodeDriver.init(1024);
}
//start listening thread on the server node to listen to incoming client requests
ServerSocket socket = getServerSocketFactory().createServerSocket(port);
nodeDriver.startListenerThread(socket);
return nodeDriver;
}}
The comments below apply only to multi-node API related features. See the xDB API javadocs for
information on API methods used in the Java code above. Each federation has a primary node,
which is created at the federation creation time. An Administrator can specify primary node
parameters like primary node log files directory in the XhiveFederationIf.createFederation(...)
API method. An Administrator can also add an arbitrary number of additional nodes using the
XhiveFederationIf.addNode(...) method. The method adds the node specification to the bootstrap file
and should be run within a transaction initiated by a superuser session.
Note that each node has a separate log directory, and that it is not possible to run a distributed
transaction that updates two nodes. It is possible to read libraries bound to
different nodes within a transaction, but you can update only one node. Otherwise, an XhiveException
will be thrown.
To start a node, get and initialize a driver for the node, and then start a listening thread to listen to
requests from clients. If deployment includes a multi-node architecture, then use of xDB client/server
mode is assumed. Embedded xDB mode is not applicable for the multi-node feature.
The setUp() method starts the server nodes. Each server node listens for client requests on a separate
port number, but a client should always connect to a primary node. This means that, to start to
work with multi-node servers, a client should first create a remote driver for the primary node using
XhiveDriverFactory.getDriver(primary-node-URL), and then create sessions using this driver. If a
client accesses a library bound to a non-primary node, then xDB automatically redirects requests to
the correct node server.
The following Java code fragment demonstrates how to create application libraries and bind them to
the server nodes according to the deployment topology.
/**
* Create a detachable concurrent library and bind it to the list of specified nodes.
* @rootLib root library of the database
* @segmentId id of the segment
* @libName name of the new library
* @nodeName binding node
*/
private void createLibrary(XhiveLibraryIf rootLib,
String segmentId, String libName, String nodeName) {
// create segment to store new library
rootLib.getDatabase().createSegment(segmentId, null, 0);
The createLibrary(...) method creates a detachable concurrent library in a separate segment and
binds it to a server node. The XhiveLibraryIf interface contains more methods like addBinding(...),
removeBinding(...) and getBindingNodes(...) for handling library bindings properly.
Note: Only detachable libraries can be bound to non-primary server nodes.
In the event of a node failure, the Administrator can bind the standby server node to the library served
by the failed node. However, this technique can only be used for read-only libraries. For a read/write
library, if the node serving it fails, the only options are to either restart the same node, or for another
preconfigured node to replace the failed one.
Samples
CreateMultinodeDatabase.java
Replacing a server
Various circumstances can require replacement of a host machine, for example: a need for a faster
server, or a hardware failure or server crash.
In multi-node environments, replacing a host for a non-primary server will be different than for a
primary server. Furthermore, replacement can involve replacing the entire node, or keeping the node
and replacing only the underlying host machine. The latter case is called a node identity change.
For guidelines on host replacement, see
• Changing node identity, page 314.
• Replacing a non-primary server, page 314.
• Replacing a primary server, page 315.
Example
The following code snippet launches a backend server as the primary server, by specifying the node
name primary as the second argument in the getDriver() method.
XhiveDriverIf driver = XhiveDriverFactory.getDriver(bsFile, "primary");
driver.init(1024);
ServerSocket socket = new ServerSocket(1235);
driver.startListenerThread(socket);
The primary server has a new xhive://host:port URL. Applications must reconnect to the primary
server using the new URL.
The loaderref attribute is required here, because the loaders for the xDB tasks and types must be the
same, whereas Ant uses different loaders for every task and type definition.
Unlike the command-line client of xDB, the xDB Ant tasks do not read the default values from the
xdb.properties configuration file. You can read the properties manually and pass them to the
xDB Ant tasks in your Ant build file.
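For example, a build file can load xdb.properties with Ant's standard &lt;property&gt; task and reference the values in the xDB types. The property names below are an assumption about what the file contains, not fixed xDB keys:

```xml
<!-- load the properties file; keys shown here are hypothetical -->
<property file="xdb.properties"/>

<federation id="MyFederation"
            bootstrap="${xhive.bootstrap}"
            password="${xhive.superuser.password}"/>
```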
The Ant tasks use Ant’s built-in message logging.
Related references
<database/>
<document/>
<federation/>
<group/>
<library/>
<user/>
<subpath/>
<database/>
This Ant type represents a database in a federation.
Example
<database id="database1"
name="MyDatabase"
bootstrap="c:/xhive/data/XhiveDatabase.bootstrap"
user="username"
password="Password"/>
<document/>
This Ant type represents a document path in the database.
Example
<federation/>
This Ant type represents a federation.
Example
<federation id="federation1"
bootstrap="c:/xhive/data/XhiveDatabase.bootstrap"
password="TheSuperUserPassword"/>
<group/>
This Ant type represents a user group in a database.
<library/>
This Ant type represents a library path in a database.
Example
<library id="myLibrary5" path="/existingLib"/>
<subpath/>
This Ant type represents an XhiveSubPathIf instance for a multipath index. For information about
multipath indexing, refer to <multipathindex/>, page 351 and Multipath indexes, page 153.
Attributes
Example
<subpath xpath="line"
fulltextsearch="true"
valuecomparison="false"
compressed="true"
returningcontents="true"
includedescendants="true"
enumerateelements="true"
startendmarkers="true"
leadingwildcard="true"/>
API documentation
com.xhive.index.interfaces.XhiveSubPathIf
<user/>
This Ant type represents a user in the database.
<!-- Define a federation type. You can reference it using its "id"
attribute -->
<federation id="MyFederation"
bootstrap="c:/xhive/data/XhiveDatabase.bootstrap"
password="MySuperUserPassword"/>
<!-- Define a database type. You can reference it using its "id"
attribute -->
<database id="MyDatabase"
bootstrap="c:/xhive/data/XhiveDatabase.bootstrap"
name="MyDatabase"
user="MyUser"
password="MyPassword">
<library path="/MyOtherLibrary"/>
</database>
</target>
allows you to change the examples to:
<createdatabase name="MyDatabase"
dbapassword="MyPassword">
<federation refid="MyFederation"/>
</createdatabase>
or using the databaseref attribute:
<createdatabase name="MyDatabase"
dbapassword="MyPassword"
databaseref="MyFederation"/>
and to:
<createlibrary name="MyLibrary">
<database refid="MyDatabase">
<library path="/MyOtherLibrary"/>
</database>
</createlibrary>
or using the databaseref attribute:
<createlibrary name="MyLibrary" databaseref="MyDatabase"/>
Note: Using the databaseref attribute allows referencing only one federation or database at a time.
Example
<target name="jeroen" depends="init">
<antcall target="test-createdatabase" inheritrefs="true"/>
<antcall target="test-deletedatabase" inheritrefs="true"/>
</target>
<addgroup/>
This Ant task adds a group to a database. There are two mutually exclusive ways to use this task:
• with the name attribute, to add a single group
• with one or more nested <group/> elements, to add a number of groups
failonerror: Fail the task if the group already exists. (Optional; default is true)
<database/>, page 318: The database that contains the new group(s). (Optional)
Examples
Use of the name attribute:
<target name="addgroup-using-name">
<addgroup databaseref="MyDatabase.ref" name="group1" />
</target>
Use of nested <group/> elements:
<target name="add-moregroups">
<addgroup databaseref="MyDatabase.ref">
<group name="group5" />
<group name="group6" />
</addgroup>
</target>
Use of nested <group/> elements with nested <user/> elements. The users are created as members
of the group.
<target name="addgroupuser">
<addgroup databaseref="test.database">
<group name="group1" />
<group name="group2">
<user name="alice" password="secret">
<group name="another_group" />
</user>
<user name="bob" password="secret" />
</group>
</addgroup>
</target>
<addsegmentfile/>
segmentid: The unique ID of the segment to which the file will be added. (Required)
path: The path to the server-side directory where the data file should be created. If not specified,
the path is the same as that for the federation default database. (Optional)
maxsize: The maximum size (in bytes) that the file is allowed to grow to. Default is 0, which means
the size is unlimited. (Optional)
Examples
The example below creates a segment "newsegment" and then adds a data file to it, with the
path and maxsize specified:
<target name="addMySegmentFiles">
<addsegment segmentid="newsegment" databaseref="test.database" />
<addsegmentfile segmentid="newsegment" databaseref="test.database"
path="seg2Path" maxsize="122880" />
</target>
<adduser/>
This Ant task adds one or more users to a database.
There are two mutually exclusive ways to use this task:
• with the name attribute, to add a single user.
• with one or more nested <user/> elements, to add a number of users.
<database/>, page 318 The database that contains the new user(s). No.
Examples
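The examples for this task did not survive in this section. The sketch below mirrors the <addgroup/> examples above, using the documented name attribute and nested <user/> forms; the password attribute is assumed to behave as in the nested <user/> example of <addgroup/>.

```xml
<!-- Single user via the name attribute -->
<target name="adduser-using-name">
  <adduser databaseref="MyDatabase.ref" name="alice" password="secret" />
</target>

<!-- Multiple users via nested <user/> elements -->
<target name="add-moreusers">
  <adduser databaseref="MyDatabase.ref">
    <user name="bob" password="secret" />
    <user name="carol" password="secret" />
  </adduser>
</target>
```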
<backup/>
This Ant task creates an online (hot) backup of the federation. This requires that the database server
is running.
Example
The following example passes a reference to a federation as a database reference.
<target name="make-a-backup">
<backup databaseref="MyFederation.ref" file="new_backup.db" />
</target>
<batchindexadder/>
This Ant task adds multiple indexes in one batch operation, using the XhiveIndexAdderIf interface.
For general information about indexing, refer to Indexes, page 150.
Index Description
<database/> Specifies a database and library where the index(es) must be added. For the
library, use a nested <library/> element (see example below).
<pathvalueindex/> Add a path value index.
<multipathindex/> Add a multi path index.
<elementindex/> Add an element index.
<fulltextindex/> Add a full-text index.
<idattributeindex/> Add an ID attribute index.
<libraryidindex/> Add a library index.
<metadatafulltextindex/> Add a metadata full-text index.
<metadatavalueindex/> Add a metadata value index.
<valueindex/> Add a value index.
Example
<target name="add-indexes">
<batchindexadder>
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
<metadatavalueindex name="MyMVIndex"
key="bla"/>
<valueindex name="MyValueIndex" elementURI="http://www.x-hive.com/ns"
elementName="title"/>
...
</batchindexadder>
</target>
API documentation
com.xhive.index.interfaces.XhiveIndexAdderIf.html
<checkdatabase/>
This task uses the checkDatabaseConsistency API call to check the consistency of a database.
Parameters
Example
The example below checks the consistency of a database.
<target name="checkdatabase-target">
<checkdatabase destination="NewName">
<database bootstrap="MyBootstrapFile"
name="MyDatabase"
user="Administrator"
password="MyPassword"
Checkdomnodes="false"
CheckIndexes="false"
BasicCheckIndexes="true"
CheckAdministrationPages="false"
CheckSegmentPages="true"
CheckPageOwner="false"/>
</checkdatabase>
</target>
API documentation
com.xhive.index.interfaces.XhiveConsistencyCheckerIf.html#checkDatabaseConsistency()
<checkfederation/>
This task uses the checkFederationConsistency API call to check the consistency of a federation.
Parameters
Example
</checkfederation>
</target>
API documentation
com.xhive.index.interfaces.XhiveFederationConsistencyCheckerIf.html#checkFederationConsistency()
<checklibrarychild/>
Parameters
Example
<database bootstrap="MyBootstrapFile"
name="MyDatabase"
user="Administrator"
password="MyPassword" >
<library path="/existingLib" />
</database>
</checklibrarychild>
</target>
<checknode/>
This Ant task checks the consistency of a node.
Parameters
This task uses the API call XhiveFederationConsistencyCheckerIf.checkNodeConsistency.
The following optional parameters can be specified as nested elements:
Example
<closedriver/>
This Ant task closes a federation driver. Note: The driver is also closed when the JVM of the Ant
process exits.
Other xDB Ant tasks do not close the XhiveDriver after they run, to avoid the overhead of closing and reopening a driver between subsequent xDB tasks. This can become a problem if, within the same Ant process, you use xDB Ant tasks and then spawn another process that needs to use the database; for example, deploying a servlet that uses the federation driver on Tomcat.
Example
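No example survives in this section. The sketch below assumes that <closedriver/> accepts a databaseref attribute, like the other xDB Ant tasks shown in this chapter.

```xml
<target name="deploy-webapp">
  <!-- ... xDB Ant tasks that opened the federation driver ... -->
  <closedriver databaseref="MyFederation.ref" />
  <!-- A process spawned after this point (for example, a Tomcat
       deployment) can now open the federation itself -->
</target>
```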
<copydatabase/>
This Ant task copies a federation database.
If the destination for the copy of the database exists, the task raises an
XhiveException.DATABASE_EXISTS.
Note: The copy process does not copy empty pages, so the copy of the database may be smaller than the original. Context conditioned indexes are not copied; only their definitions remain.
Example
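No example survives in this section. The sketch below is modeled on the <renamedatabase/> example later in this chapter, and assumes that <copydatabase/> takes the same destination attribute and nested <database/> element.

```xml
<target name="copydatabase-target">
  <copydatabase destination="MyDatabaseCopy">
    <database bootstrap="MyBootstrapFile"
              name="MyDatabase"
              user="Administrator"
              password="MyPassword" />
  </copydatabase>
</target>
```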
<createdatabase/>
This Ant task creates a federation database with a default configuration. If a database with the same
name already exists, the Ant execution script will fail unless failonerror is set to false.
failonerror Fail the task if the database already exists. No - default is true
Example
<!-- Declare a federation with an ID, which is used in the database creation task. -->
<federation id="federationId" bootstrap="${xhive.bootFilePath}"
password="${xhive.superpwd}" />
<target name="create-database">
<createdatabase name="${database}" dbapassword="${password}">
<federation refid="federationId" />
</createdatabase>
</target>
<createfederation/>
This Ant task creates a federation. The bootstrap file, superuser password and license key can be set
explicitly by inserting a <federation/> element, page 319.
logdir A comma-separated list of paths to the log file directories of the new federation. If specified, the first path in the list represents the primary log directory. Relative paths, if used, are resolved relative to the directory of the bootstrap file. No - default path is log
Example
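No example survives in this section. Following the description above, the sketch sets the bootstrap file, superuser password and license key through a nested <federation/> element (the licensekey attribute name is assumed by analogy with <updatefederation/>), and uses the documented logdir attribute to declare two log directories.

```xml
<target name="create-federation">
  <!-- First path in logdir is the primary log directory -->
  <createfederation logdir="logs/primary,logs/secondary">
    <federation bootstrap="${xhive.bootFilePath}"
                password="${xhive.superpwd}"
                licensekey="MyLicenseKey" />
  </createfederation>
</target>
```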
<createlibrary/>
This Ant task creates a library in a database.
documentslock Documents in the new library lock with the parent. No - default is true
lockwithparent The new library locks with its parent. No - default is false
<database/>, page 318 The database where the new library is created. No
Example
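No example survives in this section. The sketch below uses the documented lockwithparent attribute and a nested <database/> element; the bare <createlibrary name="..."/> usage is taken from the <session/> examples later in this chapter.

```xml
<target name="create-my-library">
  <createlibrary name="MyLibrary" lockwithparent="false">
    <database refid="MyDatabase" />
  </createlibrary>
</target>
```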
<deletedatabase/>
This Ant task deletes a database in a federation. If a database with the given name does not exist, Ant
stops execution, unless the quiet attribute is set to true, or the failonerror attribute is set to false.
failonerror Stop execution if database name does not exist. No - default is true
Example
The example below deletes a database.
<target name="DeleteMyDatabase" depends="init">
<deletedatabase name="MyDatabase">
<federation refid="MyFederation"/>
</deletedatabase>
</target>
<deletegroup/>
This Ant task deletes one or more groups from a database.
There are two mutually exclusive ways to use this task:
• Using the name attribute to delete a single group.
• Using one or more nested <group/> elements to delete a number of groups.
Attributes
name The name of the group to delete. The name attribute cannot be used in conjunction with nested <group/> elements. Yes.
quiet Specifies whether output about the task progress should be displayed. The default value is false. No.
Parameters
The following optional parameter can be specified as nested elements:
<database/> The database that contains the group that is deleted. No.
Example
The following example uses the name attribute to delete a group:
<target name="delete-one-group">
<deletegroup name="group1">
<database refid="test.database"/>
</deletegroup>
</target>
<deleteindex/>
This Ant task deletes an index.
failonerror Stop execution if the specified name does not exist. No - default is true
<database/>, page 318 The database containing the library where the index is deleted. No.
Example
The example below uses a nested <database/> with a nested <library/> element.
<target name="DeleteMyIndex" depends="init">
<deleteindex name="MyIndex">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</deleteindex>
</target>
<deletelibrary/>
This Ant task deletes a library. If the specified library does not exist, execution stops, unless the quiet
attribute is set to true or the failonerror attribute is set to false.
failonerror Stop the Ant task if the library does not exist. No - default is true
<database/>, page 318 The database from which to delete the library. No
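Example
No example survives in this section. The sketch below assumes that <deletelibrary/> takes a name attribute, by analogy with <deletedatabase/> and <deleteindex/>, together with the documented nested <database/> element.

```xml
<target name="DeleteMyLibrary" depends="init">
  <deletelibrary name="MyLibrary">
    <database refid="MyDatabase" />
  </deletelibrary>
</target>
```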
<deleteuser/>
This Ant task deletes one or more users from a database. There are two mutually exclusive ways
to use this task:
• Using a name attribute to delete a single user.
• Using one or more nested <user/> elements to delete a number of users.
Example
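No example survives in this section. The sketch below mirrors the <deletegroup/> task, using the documented name attribute and nested <user/> forms.

```xml
<!-- Single user via the name attribute -->
<target name="delete-one-user">
  <deleteuser name="alice">
    <database refid="test.database" />
  </deleteuser>
</target>

<!-- Several users via nested <user/> elements -->
<target name="delete-moreusers">
  <deleteuser>
    <database refid="test.database" />
    <user name="bob" />
    <user name="carol" />
  </deleteuser>
</target>
```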
<deserialize/>
This Ant task deserializes a library child from a specified file. The library child in the source file
becomes the last child of the target library.
Note: If no target library is specified, the deserialized library replaces the current root library of
the database.
Attributes
Example
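No example survives in this section. The sketch below assumes a file attribute naming the source file, mirroring the <serialize/> example later in this chapter; the target library is given with nested <database/> and <library/> elements.

```xml
<target name="DeserializeMyLibrary" depends="init">
  <deserialize file="c:/MyLibrary.xhd">
    <database refid="MyDatabase">
      <!-- The deserialized child becomes the last child of this library -->
      <library path="/TargetLibrary" />
    </database>
  </deserialize>
</target>
```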
<deserialize-users/>
This Ant task deserializes all users and groups of a database, replacing the current users and groups.
Example
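No example survives in this section. The sketch below assumes a file attribute naming the source file, mirroring the <serialize-users/> task later in this chapter.

```xml
<target name="DeserializeMyUsers" depends="init">
  <!-- Replaces the current users and groups of the database -->
  <deserialize-users file="c:/MyUsers.xhd">
    <database refid="MyDatabase" />
  </deserialize-users>
</target>
```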
<elementindex/>
This Ant task adds an element index to a library.
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
Example
The example below contains a nested <database/> element with a nested <library/> element.
<target name="AddMyElementIndex" depends="init">
<elementindex name="MyElementIndex">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</elementindex>
</target>
<exportlibrary/>
This Ant task exports a library from a database to a file system directory.
Attributes
Parameters
The following parameters can be specified as nested elements:
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="ExportMyLibrary" depends="init">
<exportlibrary destdir="c:/exportdata">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</exportlibrary>
</target>
<fulltextindex/>
This Ant task adds a value full-text index to a library.
Attributes
supportscoring Boolean attribute that specifies whether the use of scoring is supported. No - default is true
Parameters
The following parameters can be specified as nested elements:
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyFulltextIndex" depends="init">
<fulltextindex name="MyFullTextIndex"
elementName="title"
alltext="true">
<database refid="databaseRef">
<library path="/MyLibrary/SubLib"/>
</database>
</fulltextindex>
</target>
<idattributeindex/>
This Ant task adds an ID attribute index on a library.
Attributes
unique Boolean attribute that specifies whether to use unique keys. No - default is false
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
Parameters
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyIDAttributeIndex" depends="init">
<idattributeindex name="MyIDAttributeIndex">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</idattributeindex>
</target>
<libraryidindex/>
This Ant task adds a library ID index to a library.
Attributes
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
Parameters
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyLibraryIDindex" depends="init">
<libraryidindex name="MyLibraryIDindex">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</libraryidindex>
</target>
<listindexes/>
This Ant task lists all indexes for a database library path.
Attributes
Parameters
The following parameters can be specified as nested elements:
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="list-indexes">
<listindexes info="true">
<database refid="MyDatabase.ref">
<library path="path/to/SomeLibrary"/>
<library path="anotherLibrary"/>
</database>
</listindexes>
</target>
<metadatafulltextindex/>
This Ant task adds a metadata full-text index to a library.
Attributes
Parameters
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyMFTIndex">
<metadatafulltextindex name="MyMFTIndex"
key="bla">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</metadatafulltextindex>
</target>
<metadatavalueindex/>
This Ant task adds a metadata value index to a library.
Attributes
unique Boolean attribute that specifies whether to use unique keys. No - default is false
valuetype Specifies the indexed key value type, page 161. No - default is string
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
versioninfo Boolean attribute that specifies whether to store version information in the index. No - default is false
Parameters
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyMVIndex">
<metadatavalueindex name="MyMVIndex"
key="bla">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</metadatavalueindex>
</target>
<metadata/>
This Ant task can set or unset metadata (XhiveMetadataIf) on a document, library or BLOB.
Attributes
path The path of the library child whose metadata is going to be modified. Yes.
delete Whether to delete the metadata key or not. This attribute can only be set to true or false. No.
Parameters
The following parameters can be specified as nested elements:
<database/> The database containing the library child whose metadata is modified. No.
Example
The following example changes the metadata of the library child at path
"/myLib/MyOtherLib/document.xml" by setting the metadata field "sauce" to the value
"pommodoro".
<metadata key="sauce" value="pommodoro" path="/myLib/MyOtherLib/document.xml">
<database refid="test.database" />
</metadata>
<multipathindex/>
This Ant task adds a multipath index to a library child. For information about multipath indexes,
refer to Multipath indexes, page 153
Example
The example below contains two nested <subpath/> elements, a nested <database/> and a nested
<library/> element.
<multipathindex name="my-multipath-index"
path="/mainXPath"
analyzer="com.emc.textanalysis"
scorecustomizer="com.emc.scorecustomize"
lowercase="false"
stopwords="false">
<database refid="databaseRef">
<library path="/existingLib" />
</database>
<subpath xpath="line"
compressed="true"
returningcontents="true"
getalltext="true"
enumerateelements="true"
startendmarkers="true"
leadingwildcard="true"
fulltextsearch="true"
valuecomparison="true" />
<subpath xpath="bar" type="int" scoreboost=".5" valuecomparison="true" />
</multipathindex>
<parse/>
This Ant task parses files into a library. Include the <fileset/> element to indicate which files to parse.
The parse task copies the directory structure as a library structure into the target library, unless the
flatten attribute is set to true.
Attributes
Parameters
The following optional parameter can be specified as nested elements:
<database/> The database to which the library with the parsed files is No
added.
Example
<target name="ParseInMyLibrary" depends="init">
<parse>
<fileset dir="data">
<include name="**/*.xml"/>
</fileset>
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</parse>
</target>
<pathvalueindex/>
This Ant task adds a path value index to a library.
Attributes
unique Boolean attribute that specifies whether to use unique keys. No - default is false
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
versioninfo Boolean attribute that specifies whether to store version information in the index. No - default is false
Parameters
The following parameters can be specified as nested elements:
Example
The following example contains a nested <database/> and a nested <library/> element. The index is
created with path "/foo/bar[@x<INT>]". To insert the literal < and > characters into an Ant build file,
use the &lt; and &gt; entity notation.
<target name="create-path-index" depends="init">
<pathvalueindex name="MyPathValueIndex" path="/foo/bar[@x&lt;INT&gt;]">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</pathvalueindex>
</target>
<registerreplicator/>
This Ant task registers a replicator in a federation.
Attributes
Parameters
The following optional parameter can be specified as nested elements:
Example
The following example registers a replicator.
<target name="register-MyReplicator">
<registerreplicator name="MyReplicator">
<federation refid="test.federation"/>
</registerreplicator>
</target>
<renamedatabase/>
This Ant task renames a database in a federation.
This task uses the XhiveDatabaseIf.renameDatabase API call; it renames the database but not the
underlying database files. To rename the files as well, first copy the database with the
<copydatabase/> task and then delete the original database.
Attributes
Parameters
The following optional parameters can be specified as nested elements:
Example
<target name="renamedatabase-target">
<renamedatabase destination="NewName">
<database bootstrap="MyBootstrapFile"
name="MyDatabase"
user="Administrator"
password="MyPassword" />
</renamedatabase>
</target>
<replicatefederation/>
This Ant task replicates the whole federation. The task only performs an initial duplication: it is
effectively a standalone backup and a restore combined in one operation.
To move the federation to a new location, change the location of the bootstrap file and set the
relativepath attribute to true; all paths in the original federation are then first made relative and
resolved against the directory of the new bootstrap file.
Attributes
Parameters
The following optional parameters can be specified as nested elements:
<restore/>
This Ant task restores a federation from a backup. The <restore/> task does not overwrite existing
files. To restore incremental backups, use this task to restore the last full backup first, and then each
incremental backup in the order in which they were created. Do not restart the server during
the restore procedure.
Attributes
Example
<target name="restore-mybackup">
<restore file="lastBackup.db" bootstrap="MyBootstrapFile" />
</target>
<serialize/>
This Ant task serializes a library child into an output file.
Example
The following example serializes data from the MyDatabase database into the file MyLibrary.xhd.
<target name="SerializeMyLibrary" depends="init">
<serialize file="c:/MyLibrary.xhd">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</serialize>
</target>
<serialize-users/>
This Ant task serializes all users and groups of a database.
file The destination file for the serialized users and groups. Yes
<database/>, page 318 The database containing the users and groups to serialize. No
Example
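No example survives in this section. The sketch below uses the documented file attribute and a nested <database/> element, mirroring the <serialize/> example above.

```xml
<target name="SerializeMyUsers" depends="init">
  <serialize-users file="c:/MyUsers.xhd">
    <database refid="MyDatabase" />
  </serialize-users>
</target>
```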
<session/>
The session Ant task is a container task that can contain other Ant tasks. The nested tasks are executed
inside a single XhiveSession.
Attributes
Examples
The <session/> task can contain any other xDB Ant tasks specified as nested elements. The nested
Ant tasks cannot be a combination of Ant tasks that require superuser permissions and Ant tasks that
do not require superuser permission. Ant tasks operating at the federation level require a nested
<federation/> element.
<target name="use-session" depends="init">
<session databaseref="database.RefId">
<createlibrary name="testLib"/>
<createlibrary name="testLib2"/>
</session>
</target>
<target name="use-session" depends="init">
<session>
<database refid="test.database"/>
<createlibrary name="testLib3" />
<createlibrary name="testLib4" />
</session>
</target>
<target name="test-session-fed">
<session databaseref="federation.RefId">
<createdatabase name="MyDB1" dbapassword="${password}"/>
<createdatabase name="MyDB2" dbapassword="${password}"/>
<createdatabase name="MyDB3" dbapassword="${password}"/>
</session>
</target>
<setmaxfilesize/>
The setmaxfilesize Ant task sets the maximum size of a segment data file.
path The full path (case sensitive) of the data file (as produced by the show-segment command). Yes
maxsize The maximum size (in bytes) that the file is allowed to grow to. A value of 0 means the size is unlimited. Yes
Example
The example below sets the max file size of the newly created data file to 200000 bytes:
<target name="setFileMyMaxSize">
<addsegment segmentid="newsegment" databaseref="test.database" />
<addsegmentfile segmentid="newsegment" databaseref="test.database"
maxsize="0" />
<property name="segment.fullname"
location="..${file.separator}data${file.separator}MyDatabase2-newsegm
<setmaxfilesize segmentid="newsegment" databaseref="test.database"
path="${segment.fullname}" maxsize="200000" />
</target>
<unregisterreplicator/>
This Ant task cancels a replicator registration in a federation. Log files are no longer preserved for this
replicator.
Attributes
name The name of the replicator for which the registration is canceled. Yes
Parameters
Example
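No example survives in this section. The sketch below is modeled directly on the <registerreplicator/> example above, using the documented name attribute.

```xml
<target name="unregister-MyReplicator">
  <!-- Log files are no longer preserved for this replicator -->
  <unregisterreplicator name="MyReplicator">
    <federation refid="test.federation" />
  </unregisterreplicator>
</target>
```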
<updatefederation/>
This Ant task updates the xDB license key of a federation. The updatefederation task closes the
xDB driver.
Attributes
Example
The following example updates the license key in the bootstrap file.
<target name="update-federation">
<updatefederation bootstrap="MyBootstrapFile"
password="secret"
licensekey="MyLicenseKey"/>
</target>
<upload/>
This Ant task uploads files into a library. To indicate which files to upload, an Ant fileset element must
be included. This task can upload DOM Documents as well as BLOBs into the database. The task
copies the directory structure as a library structure in the target library unless the flatten attribute is
set to true.
Attributes
Parameters
Example
<upload xmlextensions="xml,xhtml">
<fileset dir="${data.dir}">
<include name="fbooks/*" />
</fileset>
<database refid="MyDatabaseRef">
<library path="/testLib" />
</database>
</upload>
<valueindex/>
This Ant task adds a value index to a library.
Attributes
unique Boolean attribute that specifies whether to use unique keys. No - default is false
exists What to do if an index with the same name already exists. Accepted values are: skip - do not create a new index, overwrite - delete the existing index with the same name and create the newly specified index, fail - fail the Ant task. No - default is skip
versioninfo Boolean attribute that specifies whether to store version information in the index. No - default is false
Parameters
Example
The following example contains a nested <database/> and a nested <library/> element.
<target name="AddMyValueIndex" depends="init">
<valueindex name="MyValueIndex"
elementURI="http://www.x-hive.com/ns"
elementName="title">
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</valueindex>
</target>
<xquery/>
This Ant task executes an XQuery in the context of a library. The query results can be stored in an Ant
property, written to a file, or logged in the Ant build. The query must be given using a nested <query/>
element; XQuery external variables can be set using nested <param/> elements.
Attributes
Parameters
Example
The following example runs an XQuery wrapped in a CDATA section and stores the result in a
property. The query takes an external variable ($addressee), which is supplied with a <param/>
element that in turn uses an Ant property. The library is specified by a nested <database/> element
with a nested <library/> element.
<target name="run-my-xquery" depends="init">
<property name="greeting.name" value="World"/>
<xquery outputproperty="xquery.result">
<query><![CDATA[
declare variable $addressee external;
'Hello, ', $addressee]]></query>
<param name="addressee" value="${greeting.name}"/>
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</xquery>
</target>
Example
This example loads the query from a file and stores the result in a file.
<target name="run-my-xquery-file" depends="init">
<property name="greeting.name" value="World"/>
<xquery outputfile="/tmp/xquery-out.txt">
<query file="query.xq"/>
<param name="addressee" value="${greeting.name}"/>
<database refid="MyDatabase">
<library path="/MyLibrary"/>
</database>
</xquery>
</target>
database
  BLOBs, 45
  catalogs, 44
  checking consistency, 284
  configuration file, 46
  connecting, 85, 271
  creating, 42, 62
  documents, 44
  files, 46
  groups, 44
  indexes, 44
  libraries, 44
  referencing objects, 144
  segments, 45
  superuser, 42
  users, 44
database configuration, 86
dedicated page server, 70
dedicated page server program, 92
detach, 274
detach point, 47
detachable
  library, 274
detachable library, 272
  unusable, 275
disconnect() method, 143
distributed deadlock, 305
document
  branching, 41
  creating, 108
  exporting, 120, 239, 272
  linking, 40
  normalizing, 102
  parsing, 99
  parsing with context, 100
  publishing
    PDF, 121
    using XSLT, 121
  retrieving, 109
    by ID, 111
    by library path, 112
    by name, 111
    previous versions, 124
    using indexes, 112
    using XQuery, 112
  storing, 103
  traversing, 116
    using DOM, 116
    using function objects, 119
  validating, 101, 224
  XQuery access, 178
documents
  editing, 240
DOM
  configuration, 103
  retrieving documents, 110
  support, 39
DTD
  managing, 71
  troubleshooting, 72
E
element name index, 167
element name index example, 167
exporting
  documents, 239, 272
  libraries, 239, 272
external editors, 71
F
failover, 294
federation, 42
  creating, 250
  creating replica, 290, 293
  log, 43
  read-only, 281
  replicating metadata, 291
  sets, 281
    creating, 282
    using, 92, 282
file system performance, 77
files, 46
FIPS, 67
FTP, 71
full-text index, 163
  create, 164
full-text queries, 195
  anyall options, 197
  Boolean queries, 202
  cardinality option, 198
  limitations, 195
  positional filters, 197
  score calculation, 199
  score variables, 198
  thesaurus, 196
  thesaurus handler, 196
  wildcards, 195
  xhive:fts function, 201
functions
  external, 177
G
group
  managing, 47, 90
H
high-availability, 314
host replacement, 314
hot backup, 262, 268
I
ID attribute index, 166
id collision, 274
incremental backup, 262–263, 268
index, 153–154, 160–161
  adding, 241
  concurrent, 167–168
  context conditioned, 168
  element name, 167
  full-text, 163–164
  ID attribute, 166
  ignoring, 170
  library, 165
  library ID, 165
  library name, 165
  live, 150
  metadata full text, 164
  metadata value, 164
  optimizing performance, 170
  path, 151
  query performance, 150
  scope, 170
  selectivity, 170
  types, 150
  using with XQuery, 188
  value, 161–162
J
JAAS, 97
Java, 49
Java command line
  classpath, 50
JDK, 49
join() method, 143
K
keep-log-files, 43, 263
L
lazy replication, 289
leave() method, 143
library
  backing up, 267
    using API, 269
  creating, 88
  detach point, 46
  detachable, 46, 272
  exporting, 239, 272
  ID index, 165
  index, 165
  metadata, 126
  name index, 165
  restoring
    using API, 269
  root, 44
  unusable, 275
  XQuery access, 178
locking
  context, 133
  namebase, 134
locking context, 133
locking rules
  multi-node configuration, 305
log, 42
log files, 43
lucene, 67
lucene blobs, 153
M
master, 289
memory, 66
message logging, 287
  java.util.logging, 286
message logging areas, 287
metadata
  backup, 266, 270
  indexing, 191
  replicating, 291
metadata full text index, 164
metadata value index, 164
model
  adding, 222
  linking, 222
models, 44
monitoring
  statistics, 277–278
move, 274
multi-node architecture, 297
multi-node configuration
  API examples, 310
  applications, 306
  bootstrap file, 299
  node server, 297
  primary server, 297
  upgrade, 301
multi-node locking rules, 305
multipath index, 153–154, 160–161
  Lucene segment, 153
  merge, 156
  specification, 155
  sub-index, 153
Multipath Index Limitations, 160
multipath index merge
  performance, 156
N
namebase, 134
naming convention, 274
node identity
  changing, 314
node identity change, 314
node versioning, 125
non-XML data, 106
O
offline backup, 262, 268
online backup, 262, 268
online backup method, 268
OSGi, 96
P
page cache, 57
page server, 38, 297
  configuring, 70
page server port, 56
Page server settings, 67
parallel queries, 215
parsing, 223
path index, 151
  specification, 152
performance
  cachepages, 75
  configuring JVM and cache pages, 75
  disabling disk-write caches, 78
  file system, 77
  internal server, 75
  multiple disks, 77
  page size, 77
  parallel queries, 215
  using indexes, 192
  XQuery tuning, 204
primary copy replication, 289
primary server, replacing, 315
property
  xhive.bootstrap, 69
PSVI, 194, 224
PSVI information, 225
Q
queries
  running, 241
query
  preparing, 217
queryable, 125
quick start, 35
R
RAM segment, 281
range queries, 191
read-only federations, 281
read-only transactions, 148
rename, 274
replica
  creating, 290, 293
replication
  lazy primary copy replication, 289
  moving master, 292
  removing replica, 292
  running replicator, 291
  using as failover, 294
REST API, 246
restore() method, 269
restoring
  from log file, 265
restoring backups, 262
rollback() method, 142
RPC tracing, 78
  console, 80
  file, 80
  session level, 81
  system level, 80
S
sample
  running, 84
scope, 86
score customization, 161
search
  versions, 125
segment
  temporary, 46
segments, 45
serialization, 239, 272
server, 38, 56
Server.lax file, 68
session, 86
sessions, 133–134
  connect() method, 142
  createSession() method, 141
  disconnect() method, 143
  join() method, 143
  joined, 136
  leave() method, 143
  lifecycle, 134
  locking conflicts, 146
  pools, 136
  terminate() method, 143
  transaction isolation, 146
slave, 289
snapshot backup, 262, 266, 268
SSL, 283
standalone backup, 262, 268
statistics, 277–278
superuser, 42
T
temporary data, 46
terminate() method, 143
terms, 201
trace file properties, 80
tracing
  RPC, 78
transaction log, 289
transaction recovery
  multi-node, 299
transactions, 86, 134
  begin() method, 142
  checkpoint() method, 142
  commit() method, 142
  distributed deadlock, 305
  locking, 133
  namebase and locking, 134
  read-only, 148
  rollback() method, 142
U
Unix
  background server, 70
unusable detachable library, 275
upgrade
  multi-node configuration, 301
user
  managing, 47, 90
users
  deserializing, 239
  serializing, 239
V
validated parsing, 223
value index, 161–162
  type, 161
versioned document, 123
versioning, 123
  node, 125
versions
  search, 125
W
web client, 246
Windows service, 56
X
xDB
  commands, 61
  features, 37
  installing on UNIX, 59
  installing on Windows, 50
  uninstalling, 61
xdb admin command, 62, 228
xdb backup command, 62, 263
xdb backup-library command, 267–268
xdb command
  syntax, 247
xdb configure-federation command, 62
xdb create-database command, 62, 251
xdb create-federation command, 62, 250
xdb delete-database command, 62
xdb info command, 62, 148, 251
xdb restore command, 62, 264
xdb restore-library command, 267
xdb run-server command, 62
xdb run-server command on Unix, 70
xdb stop-server command, 62
xdb suspend-diskwrites command, 62
xdb.properties file, 66
xhive-ant, 318
  run java samples, 84
xhive.bootstrap property, 69
xhive:fts function extends XQuery, 201
XHIVE_HOME, 67
XhiveGroupIf, 91
XhiveGroupIf interface, 91
XhiveGroupListIf, 91
XhiveGroupListIf interface, 91
XhiveUserIf interface, 90
XhiveUserListIf interface, 90
XLink, 40, 122
XQuery, 39
  accessing documents and libraries, 178
  collation support, 208
  collection(), 178
  data model, 213
  doc(), 178
  error reporting, 178
  extending using Java, 218
  extension expressions, 179
  extension function xhive:highlight, 186, 188
  extension functions, 182
  external variables, 175
  full-text queries, 194
  full-text support, 208
  implementation, 199
  instance methods, 218
  Java objects, 218
  limitations, 219
  methods, 173
  modules, 209
  multiple indexes, 191
  name element index, 189
  namespace declarations, 213
  options, 179
  parallel queries, 215
  preparing queries, 217
  proprietary extensions, 193
  range queries, 191
  security, 209
  security policy, 209
  supported, 207
  type checking, 219
  unsupported, 207
  updates, 211
  using, 173
  using indexes, 188
  using type information, 194
  using type information sample, 215
  value index, 189
  XML Schema, 210
XQuery collation support
  java, 206