2005 10 MarkLogic Server Government
Defense, intelligence, and law enforcement agencies collect a tremendous volume of raw data. For these agencies, the fast, accurate analysis of this raw data to produce actionable intelligence is vital to their national security missions. With the increasing flow of textual content (emails, field reports, immigration records, open source content streams, Web content, and the like), government agencies are forced to reevaluate their analysis strategies. What may have worked before is no longer viable. Solutions based on open, industry-standard XML technologies are providing more powerful ways to integrate, discover, analyze, and share actionable intelligence.

Load content as is without predefined DTDs or XML schemas

Automatic conversion to XML makes shredding or chunking documents a thing of the past by eliminating the time-consuming and costly first step most organizations experience when attempting to format, load, and process content. MarkLogic Server loads XML content as is and converts popular document formats, including Microsoft Office, HTML, and Adobe PDF, into structured XML, all without requiring adherence to predefined DTDs or schemas. In addition, MarkLogic Server is compatible with content archiving and interchange initiatives such as the Intelligence Community Metadata Working Group (ICMWG) and the Department of Defense Discovery Metadata Standard (DDMS), enabling full compliance with cross-agency standards for content integration and sharing.

Create powerful, custom content processing pipelines

Using the built-in Content Processing Framework, MarkLogic Server lets organizations define sequences of content processing steps and seamlessly incorporate functions such as document categorization, entity extraction, or linguistic analysis. MarkLogic Server can execute sequences of native XQuery statements plus call out to Web services-enabled external applications within the content processing flow.
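The idea of a pipeline as an ordered sequence of processing steps can be illustrated in a few lines. The Python sketch below is only an illustration of that idea: it does not use the Content Processing Framework, XQuery, or any MarkLogic API, and the names `categorize`, `extract_entities`, and `run_pipeline` are hypothetical stand-ins for real categorization and entity extraction functions.

```python
from typing import Callable, Dict, List

# A document is modeled as a plain dict; a step takes one and returns an updated copy.
Document = Dict[str, object]
Step = Callable[[Document], Document]


def categorize(doc: Document) -> Document:
    """Hypothetical categorization step: tag the document by a keyword match."""
    text = str(doc["text"]).lower()
    category = "immigration" if "visa" in text else "general"
    return {**doc, "category": category}


def extract_entities(doc: Document) -> Document:
    """Hypothetical entity extraction: collect capitalized words as candidate names."""
    words = str(doc["text"]).split()
    entities = [w.strip(".,") for w in words if w[:1].isupper()]
    return {**doc, "entities": entities}


def run_pipeline(doc: Document, steps: List[Step]) -> Document:
    """Apply each step in order, passing the result of one step to the next."""
    for step in steps:
        doc = step(doc)
    return doc


report = {"text": "Field report: visa application filed in Lisbon."}
result = run_pipeline(report, [categorize, extract_entities])
print(result["category"])
print(result["entities"])
```

Because each step returns a new dictionary rather than modifying its input, steps can be reordered or swapped out without side effects on the original document.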
Rapidly query and analyze large contentbases

MarkLogic Server combines XML element query, XML proximity search, and full-text search to create a scalable, fast, and complete content retrieval and analysis solution. Use XQuery to perform detailed, highly precise queries and content processing tasks that leverage all XML structural elements, with or without a formal DTD or XML schema. MarkLogic Server delivers millisecond response times against terabyte-scale contentbases.

Support mission-critical, multi-terabyte contentbases

Robust, enterprise-class capabilities allow government organizations to confidently deploy MarkLogic Server for their most mission-critical content assets. MarkLogic Server features high availability, error recovery, and cluster monitoring as well as a comprehensive administrator interface.
XML and XQuery

XML is revolutionizing the way organizations integrate, store, access, and analyze content, enabling them to more easily repurpose content and create custom documents and views that speed analysis and provide deeper insights. With cross-agency initiatives such as the Intelligence Community Metadata Working Group, XML is rapidly becoming the standard for information sharing within the defense and intelligence communities, enabling diverse organizations with unique mission requirements to more easily access and mine the vast amount of content-based information that must be sifted to discern vital, actionable intelligence.

XQuery is a query language being developed by the World Wide Web Consortium (W3C) for querying and manipulating XML data. It is an open standard endorsed and supported by many leading technology companies. XQuery is a powerful step forward in content processing technology and is on its way to widespread adoption. MarkLogic Server features the industry's most powerful XQuery implementation that will not only search, but also transform, query and manipulate content. Users can ask for and get exactly the results they want, while XQuery searches both the content and the structure of that content. XQuery can isolate and extract specific portions of XML content from multiple sources, analyze and manipulate the extracted content, and dynamically generate new content.
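To make the pattern of extracting portions of XML from several sources and generating new content concrete, here is a minimal sketch. It is written in Python with the standard library's `xml.etree.ElementTree`, not in XQuery, and it does not reflect how MarkLogic Server evaluates queries. The sample documents, their element names, and the helpers `matching_paragraphs` and `build_summary` are assumptions made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Two hypothetical source documents with different structures and no shared schema.
FIELD_REPORT = """<report><title>Harbor activity</title>
<para>Unmarked cargo vessel observed at the harbor.</para>
<para>Weather was clear.</para></report>"""

EMAIL = """<email><subject>Follow-up</subject>
<body><para>The cargo vessel left before dawn.</para></body></email>"""


def matching_paragraphs(xml_text: str, term: str) -> list:
    """Return every <para> element, at any depth, whose text contains the term."""
    root = ET.fromstring(xml_text)
    return [p for p in root.iter("para") if term.lower() in (p.text or "").lower()]


def build_summary(sources: dict, term: str) -> ET.Element:
    """Generate a new <summary> document from the paragraphs that match the term."""
    summary = ET.Element("summary", {"term": term})
    for name, xml_text in sources.items():
        for para in matching_paragraphs(xml_text, term):
            hit = ET.SubElement(summary, "hit", {"source": name})
            hit.text = para.text
    return summary


summary = build_summary({"field-report": FIELD_REPORT, "email": EMAIL}, "cargo vessel")
print(ET.tostring(summary, encoding="unicode"))
```

The sketch searches on structure (the `para` element, wherever it sits) and on content (the search term) at the same time, then assembles the matches into a new XML document that records which source each one came from.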
Standards
- XQuery 1.0 (May 2003)
- XML 1.0
- XPath 2.0 (May 2003)
- XML Schema 1.0
- XML Namespaces 1.0

Ingestion and Conversion
- Schema-independent, automatic indexing of content and structure
- Automatic conversion from:
  - Microsoft Office 97 or later (Word, PowerPoint, Excel)
  - Adobe PDF
  - HTML
- Other format conversions via third-party applications

Content Processing Framework
- Programmable, event-driven automation
- Content processing pipelines
- Web-services integration

Search
- Full-text, relevance-ranked XML search, including ordered/unordered Boolean, wildcards, proximity, stemming, thesaurus, spell check
- Programmable highlighting
- Full-text and XML search via XQuery

XQuery Support
- Complete XQuery specification
- High-performance, optimizing XQuery evaluator
- Dynamic XPath optimization includes multi-step paths, complex predicates, // and unions
- Extensive, built-in libraries support update, search and other functions

Storage
- Multi-terabyte XML scalability
- Transactional element-level update
- Automatic directory creation and management
- XML, text and binary documents
- Metadata sheet for every document
- Failover

Performance and Scalability
- Designed for modern processor architectures
- Multi-threaded, high-performance C++ implementation
- Single host or clustered configuration

APIs and Integration
- Java and .Net APIs
- Embedded HTTP and WebDAV server
- XQuery-level HTTP, SOAP and SMTP access

Administration
- Web-based administrator interface
- Hot, cluster-wide administration, backup and restore
- Role-based security
- Two-minute installation

Operating Systems and Platforms
- Red Hat Linux ES3 on AMD Opteron and x86 architectures
- Windows Server 2000 and Windows Server 2003 on x86 architectures
- Sun Solaris 8 and 9 on SPARC architectures
Copyright 2005 Mark Logic Corporation. Mark Logic is a registered trademark and MarkLogic Server is a trademark of Mark Logic Corporation, all rights reserved. All other product names mentioned herein are the property of their respective owners. 08/05