Biovia-Pipeline Pilot-Documents-And-Text
DOCUMENTS AND TEXT COLLECTION
The Documents and Text Collection for Pipeline Pilot brings
together the utility of document search, analysis and
manipulation with the power of process automation and data
integration in Pipeline Pilot. The Documents and Text Collection
uses natural language processing, advanced statistical modeling,
and subject-specific ontologies and thesauri to uncover patterns
and extract information from text content. Additionally, it
provides extensive capabilities for managing and manipulating
documents. The Documents and Text Collection can help
you achieve your most challenging document analysis and
manipulation objectives by linking together search, analysis and
reporting steps into automated routines. Integrate literature
mining and text analytics with your existing scientific protocols,
and run them interactively or automatically every night.
THE DOCUMENTS AND TEXT COLLECTION
PROVIDES A SUITE OF CAPABILITIES FOR:
Chemical Text Mining
• For biology, identify Amino Acids, Proteins, DNA, RNA, Cell
Lines, and Cell Types.
• For chemistry, identify chemical names and embedded
structures in text and convert these to structures with the
included OpenEye name-to-structure converter.
  ◦ Names: systematic names (IUPAC, InChI, and SMILES),
  formulae, family names, abbreviations, identifiers, CAS
  numbers, and non-systematic (trivial) names
  ◦ Embedded Structures: BIOVIA Draw, ISIS/Draw,
  ChemDraw and Accord molecules
• Create chemically-aware search applications by mapping
structures to documents in a structure-searchable
text database.
Documents and Search
• Search PubMed, US and European Patents, Twitter, Bing,
websites, SharePoint documents, local files and enterprise
databases and integrate with third-party search engines, to
find the documents of highest value and importance
• Search multiple data sources easily using a single query
language with phrase, wildcard, fielded and synonym
matching, allowing searches to be created and maintained
efficiently
• Create searchable databases of key documents for ongoing
review during the course of a project
• Create rich, interactive reports to explore and mine
your documents and text analyses, including specialist
visualizations such as Tag Clouds, highlighted terms in
reports and document clustering
• Generate Word reports from templates by automatically
adding content to Microsoft Word documents to streamline
the creation of internal reports and external regulatory
documents, reducing error-prone manual document editing
• Crawl the Web to extract content and tabular data
Text Analytics
• Analyze documents and text to extract key concepts and find
correlations in documents and online literature, for competitive
intelligence and to support primary research findings
• Automate information extraction and summarization
• Use relationships between documents for greater
understanding
SEARCH: LOCAL, ONLINE & ENTERPRISE DOCUMENTS
The Documents and Text Collection gives you the power to
extract knowledge from important online document resources
such as PubMed, US and European Patents, Bing, TOXNET,
Twitter and Wikipedia (user extendable to other remote
text data sources). Search these databases with interactive
queries, or mine them with large-scale document retrieval
and characterization routines. You can also search and mine
internal documents in exactly the same way. The Documents
and Text Collection indexes and searches folders that contain
PDF, Microsoft Office, HTML, or text files (extendable to other
file formats). You can even store the results of online searches
in local repositories for speedy retrieval and post processing.
Local databases of documents stay current automatically by
monitoring the folder contents for the introduction of any new
or edited documents.
CHEMICALLY-AWARE SEARCH
The Documents and Text Collection uses a chemically aware
search algorithm to identify and convert chemical names to
structures and enable researchers to query for information in a
more intuitive and effective way with substructure, similarity,
SMILES, and IUPAC name searches. Also included is a complete
end-user application, “Chemically Aware Search”, for indexing
and searching collections of documents containing structures.
It allows you to identify a set of interesting documents,
download the content to a local database and index the
documents for text or structure search, all without the need to
author protocols or additional configuration.
ANNOTATE SCIENTIFIC RESULTS
When reporting the results of pipelined data analyses, it is often
useful to include additional information about the output data
points. With the Documents and Text Collection, you can easily
add a few steps at the end of any Pipeline Pilot protocol and
have each data point serve as a query to search a database of
literature. For example, after clustering a set of genes with the
Biology Collections, you can annotate each gene with summary
information from its top reference in PubMed (and a link to
further search results). This kind of enhancement makes for
more easily interpretable results. Also, you can create your
own custom document templates in Microsoft Word, and use
the Documents and Text Collection to surgically fill in scientific
data and charts to generate beautifully styled final documents.
IDENTIFY EMERGING TRENDS
The Documents and Text Collection can monitor the scientific
literature for topics of interest, and it can even alert you
when new concepts are emerging for those topics. The latter
is achieved by searching for new articles about your topic
of interest and detecting the concept words they contain.
The association of each concept with the topic of interest is
calculated over time to detect emerging new relationships. This
allows you to stay on top of a broader class of topics, and learn
about breakthroughs before they become widely known.
MINE PATENT DATABASES
The Documents and Text Collection provides you with the tools
necessary to characterize research and intellectual property
trends in a field of interest. You can search and process the
U.S. patent databases (extendable to other patent databases)
for trends reflecting the quantity of patents, application areas,
companies engaged, and more. For example, by building a
protocol to process patents in the field of fuel cells, you can
discover how rapidly this emerging field is growing. You
can also see that applications for automobiles have come to
dominate the area and that Honda and General Motors are
leading innovators.
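The kind of patent trend summary described above (quantity of patents over time, application areas, companies engaged) comes down to counting retrieved records by year, assignee and area. The sketch below shows that counting step in plain Python. It is an illustration only, not Pipeline Pilot code or an API of the Documents and Text Collection: the record fields (year, assignee, area), the placeholder company names and the helper functions count_by and yearly_change are all assumptions made for this example, and the records are invented toy data rather than real patent counts.

```python
from collections import Counter

# Hypothetical, hand-made records standing in for retrieved patent metadata.
# The field names ("year", "assignee", "area") are assumptions for this sketch.
PATENTS = [
    {"year": 2001, "assignee": "Company A", "area": "automotive"},
    {"year": 2002, "assignee": "Company A", "area": "automotive"},
    {"year": 2002, "assignee": "Company B", "area": "stationary power"},
    {"year": 2003, "assignee": "Company B", "area": "automotive"},
    {"year": 2003, "assignee": "Company A", "area": "automotive"},
    {"year": 2003, "assignee": "Company C", "area": "portable"},
]


def count_by(records, field):
    """Count how many records share each value of `field`."""
    return Counter(record[field] for record in records)


def yearly_change(records):
    """Return (year, count, change versus the previous listed year) tuples."""
    rows = []
    previous = None
    for year, count in sorted(count_by(records, "year").items()):
        change = None if previous is None else count - previous
        rows.append((year, count, change))
        previous = count
    return rows


if __name__ == "__main__":
    # Growth of the field: patents per year and the year-on-year change.
    for year, count, change in yearly_change(PATENTS):
        print(year, count, change)
    # Most active assignees and the dominant application area.
    print(count_by(PATENTS, "assignee").most_common(2))
    print(count_by(PATENTS, "area").most_common(1))
```

In a real protocol the records would come from a patent search step rather than a hard-coded list; the counting logic itself would stay the same.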
©2016 Dassault Systèmes. All rights reserved. 3DEXPERIENCE®, the Compass icon and the 3DS logo, CATIA, SOLIDWORKS, ENOVIA, DELMIA, SIMULIA, GEOVIA, EXALEAD, 3D VIA, 3DSWYM, BIOVIA, NETVIBES, and 3DEXCITE are commercial trademarks or registered trademarks of Dassault Systèmes or its subsidiaries in the U.S. and/or other countries. All other trademarks are owned by their respective owners. Use of any Dassault Systèmes or its subsidiaries trademarks is subject to their express written approval.
Find important documents in the scientific literature (or your local files), detect and extract key concepts, and derive correlations
and trends that may provide new insights.
Our 3DEXPERIENCE® platform powers our brand applications, serving 12 industries, and provides a
rich portfolio of industry solution experiences.
Dassault Systèmes, the 3DEXPERIENCE® Company, provides business and people with virtual universes to imagine sustainable innovations. Its
world-leading solutions transform the way products are designed, produced, and supported. Dassault Systèmes’ collaborative solutions foster social innovation,
expanding possibilities for the virtual world to improve the real world. The group brings value to over 190,000 customers of all sizes in all industries in more than
140 countries. For more information, visit www.3ds.com.
DS-9008-1016