User Guide for PowerCenter: Informatica PowerExchange for MongoDB (Version 10.0)
November 2015
This software and documentation contain proprietary information of Informatica LLC and are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any
form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica LLC. This Software may be protected by U.S. and/or
international Patents and other Patents Pending.
Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14
(ALT III), as applicable.
The information in this product or documentation is subject to change without notice. If you find any problems in this product or documentation, please report them to us
in writing.
Informatica, Informatica Platform, Informatica Data Services, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange,
PowerMart, Metadata Manager, Informatica Data Quality, Informatica Data Explorer, Informatica B2B Data Transformation, Informatica B2B Data Exchange, Informatica
On Demand, Informatica Identity Resolution, Informatica Application Information Lifecycle Management, Informatica Complex Event Processing, Ultra Messaging and
Informatica Master Data Management are trademarks or registered trademarks of Informatica LLC in the United States and in jurisdictions throughout the world. All
other company and product names may be trade names or trademarks of their respective owners.
DISCLAIMER: Informatica LLC provides this documentation "as is" without warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of noninfringement, merchantability, or use for a particular purpose. Informatica LLC does not warrant that this software or documentation is error free. The
information provided in this software or documentation may include technical inaccuracies or typographical errors. The information in this software and documentation is
subject to change at any time without notice.
NOTICES
This Informatica product (the "Software") includes certain drivers (the "DataDirect Drivers") from DataDirect Technologies, an operating company of Progress Software
Corporation ("DataDirect") which are subject to the following terms and conditions:
1. THE DATADIRECT DRIVERS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
2. IN NO EVENT WILL DATADIRECT OR ITS THIRD PARTY SUPPLIERS BE LIABLE TO THE END-USER CUSTOMER FOR ANY DIRECT, INDIRECT,
INCIDENTAL, SPECIAL, CONSEQUENTIAL OR OTHER DAMAGES ARISING OUT OF THE USE OF THE ODBC DRIVERS, WHETHER OR NOT
INFORMED OF THE POSSIBILITIES OF DAMAGES IN ADVANCE. THESE LIMITATIONS APPLY TO ALL CAUSES OF ACTION, INCLUDING, WITHOUT
LIMITATION, BREACH OF CONTRACT, BREACH OF WARRANTY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS.
Table of Contents
Chapter 4: MongoDB Sources
    MongoDB Sources Overview
    Importing a MongoDB Source Definition
    MongoDB Reader Sessions
    Example: MongoDB Reader Mapping
Index
Preface
The Informatica PowerExchange for MongoDB User Guide describes how to use PowerExchange for
MongoDB with PowerCenter to extract data from and load data to MongoDB. The guide is written for
database administrators and developers who are responsible for developing mappings and workflows. This
guide assumes that you have knowledge of MongoDB and PowerCenter.
Informatica Resources
Informatica Documentation
The Informatica Documentation team makes every effort to create accurate, usable documentation. If you
have questions, comments, or ideas about this documentation, contact the Informatica Documentation team
through email at [email protected]. We will use your feedback to improve our
documentation. Let us know if we can contact you regarding your comments.
The Documentation team updates documentation as needed. To get the latest documentation for your
product, navigate to Product Documentation from https://mysupport.informatica.com.
Informatica Web Site
You can access the Informatica corporate web site at https://www.informatica.com. The site contains
information about Informatica, its background, upcoming events, and sales offices. You will also find product
and partner information. The services area of the site includes important information about technical support,
training and education, and implementation services.
Informatica Marketplace
The Informatica Marketplace is a forum where developers and partners can share solutions that augment,
extend, or enhance data integration implementations. By leveraging any of the hundreds of solutions
available on the Marketplace, you can improve your productivity and speed up time to implementation on
your projects. You can access Informatica Marketplace at http://www.informaticamarketplace.com.
Informatica Velocity
You can access Informatica Velocity at https://mysupport.informatica.com. Developed from the real-world
experience of hundreds of data management projects, Informatica Velocity represents the collective
knowledge of our consultants who have worked with organizations from around the world to plan, develop,
deploy, and maintain successful data management solutions. If you have questions, comments, or ideas
about Informatica Velocity, contact Informatica Professional Services at [email protected].
Online Support requires a user name and password. You can request a user name and password at
http://mysupport.informatica.com.
The telephone numbers for Informatica Global Customer Support are available from the Informatica web site
at http://www.informatica.com/us/services-and-training/support-services/global-support-centers/.
CHAPTER 1
Introduction to PowerExchange for MongoDB
You can use PowerExchange for MongoDB to integrate and migrate data from diverse data sources that are
incompatible with MongoDB architecture.
You can use PowerExchange for MongoDB for the following data integration scenarios:
• Create a MongoDB data warehouse. You can aggregate data from MongoDB and other source systems,
transform the data, and write the data to MongoDB.
• Migrate data from a relational database or other data sources to MongoDB. For example, you can write
data from multiple relational database tables with different schemas to the same MongoDB collection.
A collection holds the documents in a MongoDB database.
• Move data between operational data stores to synchronize data. For example, an online marketplace uses
a relational database as the operational data store. You want to use MongoDB instead of the relational
database. However, you want to maintain the relational database along with MongoDB for a period of
time. You can use PowerExchange for MongoDB to synchronize data between the relational data store
and the MongoDB data store.
• Migrate data from MongoDB to a data warehouse for reporting. For example, your organization uses a
business intelligence tool that does not support MongoDB. You must migrate the data from MongoDB to a
data warehouse so that the business intelligence tool can use the data to generate reports.
Introduction to MongoDB
MongoDB is an open source, document-based NoSQL database that supports dynamic schemas. You can
maintain more than one database on a MongoDB server.
A MongoDB database contains a set of collections. A collection is a set of documents and is similar to a table
in a relational database. MongoDB stores records as documents that are similar to rows in a relational
database. A document contains fields that are similar to columns in a relational database. A document can
have a dynamic schema. A document in a collection does not need to have the same set of fields or structure
as another document in the same collection. A document can also contain nested documents.
The following example shows a sample MongoDB document from the collection called Product:
{
    sku: "111445GB3",
    title: "CM Phone",
    description: "The best in the world.",
    manufacture_details: {
        model_number: "CMP",
        release_date: new ISODate("2011-07-17T22:14:15.656Z")
    },
    shipping_details: {
        weight: 350,
        width: 10,
        height: 10,
        depth: 1
    },
    quantity: 99,
    pricing: [
        {region: "North America",
         cost_price: 1000,
         sale_price: 1200},
        {region: "Europe",
         cost_price: 1200,
         sale_price: 1500}
    ]
}
In the example, sku, title, description, quantity, manufacture_details, shipping_details, and pricing are fields.
The fields manufacture_details and shipping_details are nested document type fields and pricing is an array
type field.
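The distinction between scalar, nested document, and array fields can be illustrated with a short Python sketch. It rewrites the sample Product document as a Python dictionary and labels each top-level field by type. The function name classify_fields is hypothetical and is not part of PowerExchange for MongoDB or of any MongoDB driver.

```python
from datetime import datetime, timezone

# The sample Product document, rewritten as a Python dictionary.
product = {
    "sku": "111445GB3",
    "title": "CM Phone",
    "description": "The best in the world.",
    "manufacture_details": {
        "model_number": "CMP",
        "release_date": datetime(2011, 7, 17, 22, 14, 15, 656000, tzinfo=timezone.utc),
    },
    "shipping_details": {"weight": 350, "width": 10, "height": 10, "depth": 1},
    "quantity": 99,
    "pricing": [
        {"region": "North America", "cost_price": 1000, "sale_price": 1200},
        {"region": "Europe", "cost_price": 1200, "sale_price": 1500},
    ],
}


def classify_fields(document):
    """Label each top-level field as 'nested document', 'array', or 'scalar'."""
    kinds = {}
    for name, value in document.items():
        if isinstance(value, dict):
            kinds[name] = "nested document"
        elif isinstance(value, list):
            kinds[name] = "array"
        else:
            kinds[name] = "scalar"
    return kinds


if __name__ == "__main__":
    for name, kind in classify_fields(product).items():
        print(f"{name}: {kind}")
```

In this sketch, manufacture_details and shipping_details are reported as nested documents and pricing as an array, which matches the description of the sample document.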
PowerExchange for MongoDB includes the Informatica MongoDB ODBC driver that connects to the
MongoDB server. PowerExchange for MongoDB supports the MMAPv1 storage engine in MongoDB. You can
create an ODBC connection to extract data from or load data to a MongoDB database. You can also
configure the replica sets for the MongoDB server so that the PowerCenter Integration Service can access
the secondary servers if the primary server is not available.
The Designer imports a document based on the schema that you set for the collection. If a document
contains hierarchical elements like arrays or nested documents, the Designer imports them as columns at the
same level as other columns.
For example, you need to import the collection product_details with the following schema:
{
    sku: "sku_name",
    title: "product_name",
    description: "description",
    manufacture_details: {
        model_number: "model_number",
        release_date: new ISODate("date")
    },
    shipping_details: {
        weight: <value>,
        width: <value>,
        height: <value>,
        depth: <value>
    },
    quantity: <value>,
    pricing: [
        {region: "North America",
         cost_price: 1000,
         sale_price: 1200},
        {region: "Europe",
         cost_price: 1200,
         sale_price: 1500}
    ]
}
The Designer imports the collection schema into a tabular format. You can identify arrays and nested
documents with the naming convention of the column. The naming convention of a nested document is <top
level element name>.<nested document name>.<nested document element name>. The naming convention
of an array is <array name>.<element number>.
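The naming convention can be sketched in Python as a recursive flattening step. This is an illustration only, not the algorithm that the Designer or the ODBC driver uses. The function name flatten is hypothetical, zero-based element numbers are an assumption, and the default cap of five array elements mirrors the default of the Maximum number of columns to flatten property.

```python
def flatten(document, separator=".", max_array_elements=5):
    """Flatten nested documents and arrays into a dict of column name -> value.

    Nested document fields become <parent><separator><field>.
    Array elements become <array><separator><element number> (zero-based here,
    which is an assumption made for this sketch).
    """
    columns = {}

    def walk(value, prefix):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, f"{prefix}{separator}{key}" if prefix else key)
        elif isinstance(value, list):
            # Only the first max_array_elements elements are turned into columns.
            for index, child in enumerate(value[:max_array_elements]):
                walk(child, f"{prefix}{separator}{index}")
        else:
            columns[prefix] = value

    walk(document, "")
    return columns


if __name__ == "__main__":
    sample = {
        "sku": "111445GB3",
        "manufacture_details": {"model_number": "CMP"},
        "pricing": [{"region": "North America"}, {"region": "Europe"}],
    }
    for name in flatten(sample):
        print(name)
```

With the period as the separator, the sketch produces column names such as manufacture_details.model_number and pricing.0.region. The actual column names that the Designer generates may differ.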
When you run a session, the PowerCenter Integration Service uses the MongoDB ODBC data source name
on the machine that runs the PowerCenter Integration Service to extract data from or load data to a MongoDB
database.
Prerequisites
You must complete the prerequisites before you can use PowerExchange for MongoDB.
For more information about product requirements and supported platforms, see the Product Availability Matrix
on the Informatica My Support Portal:
https://mysupport.informatica.com/community/my-support/product-availability-matrices
PowerExchange for MongoDB Upgrade
Before you upgrade to Informatica 10.0, back up the odbc.ini file.
After you upgrade to Informatica 10.0, replace the odbc.ini file with the backup copy of the odbc.ini file,
and verify that the MongoDB driver name in the odbc.ini file is libinformaticamongodbodbc64.so.
The Designer uses the Informatica MongoDB ODBC driver to import MongoDB collections as source or target
definitions. The PowerCenter Integration Service uses the driver to extract data from or load data to the
MongoDB database. Create ODBC data source names to connect to the MongoDB database.
To configure the driver, edit the odbc.ini file in the following location: <INFA_HOME>/tools/mongodb/Setup
1. Enter the correct ODBCInstLib for the ODBC Driver Manager in all the .ini files.
2. Replace <INSTALL_DIR> with the path to the Informatica services installation directory in all the .ini
files.
3. Add the following information to the LD_LIBRARY_PATH environment variable:
• <INFA_HOME>/tools/mongodb/lib
• 32-bit library directory of the ODBC Driver Manager
4. Add the path of the odbc.ini file to the ODBCINI environment variable.
5. Add entries for all the MongoDB data sources in the odbc.ini file.
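The environment variable settings from steps 3 and 4 can be checked with a short Python sketch. The function name check_odbc_environment and the /opt/infa paths are hypothetical. The environment is passed in as a dictionary so that the check can run against os.environ or against test data.

```python
import os


def check_odbc_environment(env, driver_lib_dir, odbc_ini_path):
    """Return a list of problems found in the ODBC-related environment variables."""
    problems = []
    # LD_LIBRARY_PATH is a list of directories joined by the platform path separator.
    library_dirs = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    if driver_lib_dir not in library_dirs:
        problems.append(f"LD_LIBRARY_PATH does not include {driver_lib_dir}")
    if env.get("ODBCINI") != odbc_ini_path:
        problems.append(f"ODBCINI is not set to {odbc_ini_path}")
    return problems


if __name__ == "__main__":
    # Hypothetical installation paths; replace them with the paths on your machine.
    for problem in check_odbc_environment(
        os.environ, "/opt/infa/tools/mongodb/lib", "/opt/infa/odbc.ini"
    ):
        print(problem)
```

An empty list means that both variables contain the expected values. The sketch does not check the library directory of the ODBC Driver Manager.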
The following section shows a sample entry in the odbc.ini file:
[Sample Informatica MongoDB DSN]
Description=Informatica MongoDB ODBC Driver DSN
Driver=<INFA_HOME>/tools/mongodbodbc/lib/libinformaticamongodbodbc64.so
Host=[Host]
Port=[Port]
Database=[Database]
ReadPreference=primary
ReplicaSetName=""
SecondaryServers=""
UseReplicaSet=0
VirtualTableDetection=0
VTAnyMatchColumnsDetection=0
VTAnyMatchString=ANY
VTAnyMatchTableNameSuffix=any
VTArrayCountPrefix=Number of
VTHideRealTables=0
VTIndexColSuffix=index
VTInsertUpdateSafeMode=0
VTKeyColumnSeparator=.
VTMainTableNameSeparator=main
VTMainTableShowArrayCounts=0
VTTableNameSeparator=_vt_
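An entry like the sample above can be read with the Python standard library to confirm that a data source name exists and that it points to the expected driver file. The following is a minimal sketch. The host, database, and installation path are hypothetical values, and the function names load_dsn and driver_name_ok are not part of any Informatica tool.

```python
import configparser

# A shortened entry with hypothetical host, database, and installation path.
SAMPLE_ODBC_INI = """\
[Sample Informatica MongoDB DSN]
Description=Informatica MongoDB ODBC Driver DSN
Driver=/opt/infa/tools/mongodb/lib/libinformaticamongodbodbc64.so
Host=localhost
Port=27017
Database=test
ReadPreference=primary
UseReplicaSet=0
"""

EXPECTED_DRIVER = "libinformaticamongodbodbc64.so"


def load_dsn(ini_text, dsn_name):
    """Parse odbc.ini text and return the properties of one data source name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the original case of the property names
    parser.read_string(ini_text)
    return dict(parser[dsn_name])


def driver_name_ok(dsn):
    """Check that the Driver property ends with the expected driver file name."""
    return dsn.get("Driver", "").rsplit("/", 1)[-1] == EXPECTED_DRIVER


if __name__ == "__main__":
    dsn = load_dsn(SAMPLE_ODBC_INI, "Sample Informatica MongoDB DSN")
    print(dsn["Host"], dsn["Port"], driver_name_ok(dsn))
```

A check of this kind is one way to verify the driver name after you restore the odbc.ini file during an upgrade.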
You must create a data source name in the ODBC Data Source Administrator to extract data from and load
data to a MongoDB database. The connection properties identify the MongoDB server and the
database. The advanced properties control read and write operations. You can also define a schema after you
create a database.
You can find the ODBC Data Source Administrator in the Control Panel on Windows. Configure the ODBC data
source name in the 32-bit ODBC Data Source Administrator on the client machine and on the machines where
you install the Informatica services. You can access the 32-bit ODBC Data Source Administrator, odbcad32.exe,
on 64-bit Windows from the following location: C:\Windows\SysWOW64
Property
    Description
Replica Set Name
    Optional. Name of the replica set of the database.
Advanced Properties
Configure the advanced properties when you create a data source name.
The following table describes the advanced properties in the Informatica MongoDB ODBC driver:
Property Description
Documents fetched per The maximum number of documents that the PowerCenter Integration Service
block reads for every call to the MongoDB database.
Default is 4096.
Nested column separator Separator character for arrays and nested documents. The nested column
separator must be consistent across connections used in a mapping. For example,
if one connection uses the period (.) as the nested column separator and another
connection in the same mapping uses the underscore (_) as the separator, then
the mapping fails.
You can use either the underscore (_) or the period (.) as the nested column
separator. Default is period (.).
Maximum number of The maximum number of array elements that the ODBC driver flattens into multiple
columns to flatten nested columns.
Default is 5.
Read preference Server that you prefer to read data from if you configure replica sets. You can
select one of the following server options:
- Primary. The PowerCenter Integration Service reads data from the primary server. If
the primary server is offline, the session fails.
- Primary Preferred. The PowerCenter Integration Service reads data from the primary
server if the primary server is available. If the primary server is offline, the
PowerCenter Integration Service reads data from the secondary server.
- Secondary. The PowerCenter Integration Service reads data from the secondary
server. If the secondary server is offline, the session fails.
- Secondary Preferred. The PowerCenter Integration Service reads data from the
secondary server if the secondary server is available. If the secondary server is
offline, the PowerCenter Integration Service reads data from the primary server.
- Nearest. The PowerCenter Integration Service reads data from the nearest available
server.
Default is primary.
Sampling strategy Number of rows to scan in the schema definition. You can select one of the
following sampling strategies:
- Start. Scans the specified number of rows from the start.
- End. Scans the specified number of rows from the end.
- Random. Scans the specified number of rows in random order.
Default is End.
String Columns Lengths The string column length to use for the fields. You can select one of the following
string column lengths:
- Standard. The string column length to use for the standard fields. Default is 255.
- Container. The string column length to use for the container fields. Default is 511.
- DocumentAsJSON. The string column length to use for the documentAsJSON fields.
Default is 1023.
Use SQL_WVVARCHAR for The PowerCenter Integration Service maps the String datatype to
String datatype SQL_WVARCHAR ODBC instead of SQL_VARCHAR.
Default is disabled.
Enable reading/writing as Read or write data as a JSON document. If enabled, the driver reports a special
JSON document. column named documentAsJSON that retrieves or stores whole documents as
JSON formatted strings.
Default is disabled.
Note: For a MongoDB connection, if you toggle between enabling and disabling
this option, the metadata cache might lose its integrity. Instead of changing the
Enable reading/writing as JSON document property for a MongoDB connection,
create separate connections with this property.
Show container columns when generating metadata
Show the container columns when the Integration Service generates the metadata.
Default is disabled.
Enable SSL
Establish secure communication to the MongoDB server over SSL. If enabled, you
must specify the path where the SSL certificates for the MongoDB server are
stored.
Default is disabled.
Check GetLastError on writes
Calls the MongoDB CheckGetLastError() function to check for failures after a write
operation.
Default is enabled.
Enable Updating Multiple Rows
The Informatica MongoDB ODBC driver updates multiple rows for each
PowerCenter Integration Service write call.
If enabled, the driver updates all rows that match the filter condition. If disabled,
the driver updates only the first row that matches the filter condition.
Default is disabled.
Omit default NULL column on insert
The PowerCenter Integration Service does not write columns with a NULL value to a
MongoDB target.
Default is enabled.
Truncate documents larger than 16 MB
Truncate the document size to 16 MB when you load data to MongoDB.
Default is disabled.
Active Metadata Location
Read metadata changes from the MongoDB database or from a local file. Required
if you choose to store the metadata in a local file.
Default is database.
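The Start, End, and Random sampling strategies can be pictured with the following minimal Python sketch. It only models the documented behavior and is not the driver's implementation. The function names, the sample documents, and the idea of collecting the union of keys as columns are assumptions made for illustration.

```python
import random


def sample_rows(rows, count, strategy="End", seed=None):
    """Return the rows that a scan of `count` rows would inspect.

    Models the documented Start / End / Random strategies only.
    """
    count = min(count, len(rows))
    if strategy == "Start":
        return rows[:count]
    if strategy == "End":
        return rows[len(rows) - count:]
    if strategy == "Random":
        # A seeded generator keeps the illustration repeatable.
        return random.Random(seed).sample(rows, count)
    raise ValueError(f"unknown sampling strategy: {strategy}")


def infer_columns(sampled):
    """Collect the union of top-level keys seen in the sampled documents."""
    columns = []
    for doc in sampled:
        for key in doc:
            if key not in columns:
                columns.append(key)
    return columns


# Hypothetical collection: only the last document has a "Type" key.
docs = [{"Name": "A"}, {"Name": "B"}, {"Name": "C", "Type": "CD"}]
print(infer_columns(sample_rows(docs, 2, "Start")))  # ['Name']
print(infer_columns(sample_rows(docs, 2, "End")))    # ['Name', 'Type']
```

In this sketch, a key that appears only in rows outside the sample does not become a column, which is why the choice of strategy and row count can change the detected schema.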
Schema Definition
A collection in MongoDB might contain several fields that you do not want to import. When you define the
schema, you can limit the metadata that you import. The driver dynamically detects the collection schema of a
MongoDB database. It flattens the MongoDB schema and displays the keys in a tabular format, with each
key as a column, in the Schema Editor.
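To picture what flattening a nested document into columns means, consider the following minimal Python sketch. It is an illustration only: the function name, the period used to join nested key names, and the sample values are assumptions, and the driver might name flattened columns differently.

```python
def flatten_keys(document, prefix="", separator="."):
    """Map each leaf key of a nested document to a flat column name."""
    columns = {}
    for key, value in document.items():
        name = f"{prefix}{separator}{key}" if prefix else key
        if isinstance(value, dict):
            # Nested documents contribute one column per nested key.
            columns.update(flatten_keys(value, name, separator))
        else:
            columns[name] = value
    return columns


# Hypothetical document with a nested Price document.
doc = {"Name": "Song A", "Units": 3,
       "Price": {"Cost_Price": 5, "Sale_Price": 8}}
print(list(flatten_keys(doc)))
# ['Name', 'Units', 'Price.Cost_Price', 'Price.Sale_Price']
```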
You can export the collection to an external schema definition file and edit the schema definition in the
Schema Editor. After you modify the collection properties and column metadata, you can save the
modifications in the schema definition file. The driver does not modify the schema of the actual MongoDB
collection. You can choose to store the modifications in the MongoDB database or as a file.
If you enable virtual table detection in the Informatica MongoDB ODBC driver, the driver creates virtual tables
in the schema if the collection contains arrays. You can import the virtual table as a source or target definition
in the Designer.
Schema Editor
Use the Schema Editor to view or edit the MongoDB collection schema that you want to import.
You can access the Schema Editor from the ODBC Data Source Administrator when you configure the
Informatica MongoDB ODBC Driver DSN. You can also find the Schema Editor in the following location:
$INFA_HOME/clients/tools/mongodb/Tools
When you define a schema in the Informatica MongoDB ODBC driver DSN, you must specify a schema
definition file. You can use an existing schema definition file or create a new one. After you specify a schema
definition file, you can import the collections in the database to the schema definition file. You can import all
the collections in the database or a particular collection. You can use a JSON filter to filter records on a
collection. You can also export those collections that are missing in the schema definition file.
When you open the Schema Editor, all the databases and collections in the schema definition file appear.
When you select a collection, the collection properties and document properties appear on the right pane.
You can modify the properties and save the schema. You can also save the schema changes to a new
schema definition file.
Collection Properties
Before you import a collection, you can view or edit the properties associated with the collection in the
Schema Editor.
You can view or edit the following collection properties in the Schema Editor:
Virtual Type
Indicates whether the collection is a virtual collection or not. Reserved for future use.
Permissions
The permissions assigned to you. Reserved for future use.
Column Metadata
When you select a collection in the Schema Editor, you can view or modify the column metadata of the
collection.
SQL Type
The ODBC data type of the column. The PowerCenter Integration Service uses the SQL type when you
run the session that uses the ODBC data source. You can modify the datatype based on your
requirement.
Source Type
The data type of the column in the source database. The PowerCenter Integration Service uses the SQL
type when you run the session that uses the ODBC data source. You can modify the datatype based on
your requirement.
Hide Column
You can choose to hide the column so that it does not appear in the schema.
Behavior
The behavior field shows whether the column is scalar or a container. Scalar columns contain a single
value like an integer or a string. Container columns have multiple values. Arrays and documents are
examples of container columns.
Key Type
The key type field shows whether the column is a key column.
• Primary key
• Foreign key
• Not a key
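The distinction that the Behavior field draws between scalar and container columns can be sketched in Python as follows. This is a simplified model in which arrays are Python lists and documents are Python dictionaries. The function name and the sample values are hypothetical.

```python
def column_behavior(value):
    """Classify a value as a container (array or document) or a scalar."""
    return "container" if isinstance(value, (list, dict)) else "scalar"


# Hypothetical document with two scalar and two container columns.
sample = {
    "Name": "City Lights",
    "Units": 4,
    "Ratings": [7, 8],
    "Price": {"Sale_Price": 8},
}
for column, value in sample.items():
    print(column, column_behavior(value))
```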
Virtual Tables
You can configure the Informatica MongoDB ODBC driver to create virtual tables in the schema if the
collection contains arrays.
Virtual tables depict the normalized view of a MongoDB collection. You can import virtual tables as an ODBC
data object and create mappings.
Note: You cannot use the Designer to preview virtual tables.
To configure virtual table creation, open the Informatica MongoDB ODBC Driver DSN. In the Schema
Definition dialog box, click Virtual Table Options.
If you enable virtual table creation, the driver creates the following virtual tables:
Main virtual table
The main virtual table contains all the data from the original MongoDB collection except the data in the
arrays. The driver replaces the cells that contain arrays with the number of arrays in the cell.
The main virtual table uses the following naming convention by default:
<original collection name>_vt_main
The columns that contain arrays use the following naming convention by default:
Number of <original column name>
The virtual table for an array column uses the following naming convention by default:
<original collection name>_vt_<original column name>
Each virtual table has a key column that references back to the primary key column in the original
collection. The key column uses the following naming convention by default:
<original collection name>.<primary key column name>
The virtual table has an index column that shows the position of the data within the original array. The
index column uses the following naming convention by default: <original column name>.index
Other columns in the virtual table represent the elements in the array and are named after the array
element. If the array is of scalar type, the data column uses the following naming convention by
default: <original column name>.value
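The default naming conventions above can be illustrated with a minimal Python sketch that normalizes a collection into a main virtual table and one virtual table for each array column. It is a model of the documented naming only, not the driver's implementation. The function name, the use of _id as the primary key, the sample data, and the index base are assumptions. In particular, this section does not state whether positions start at 0 or 1, so the sketch exposes the base as a parameter, and it treats the array count as the number of elements in the array.

```python
def build_virtual_tables(collection, documents, primary_key="_id",
                         main_suffix="main", separator=".",
                         index_suffix="index", count_prefix="Number of",
                         index_base=0):
    """Split documents into a main table and one table per array column."""
    main_name = f"{collection}_vt_{main_suffix}"
    key_column = f"{collection}{separator}{primary_key}"
    tables = {main_name: []}
    for doc in documents:
        main_row = {}
        for column, value in doc.items():
            if not isinstance(value, list):
                main_row[column] = value
                continue
            # The main table keeps only a count in place of the array.
            main_row[f"{count_prefix} {column}"] = len(value)
            rows = tables.setdefault(f"{collection}_vt_{column}", [])
            for position, element in enumerate(value, start=index_base):
                row = {key_column: doc[primary_key],
                       f"{column}{separator}{index_suffix}": position}
                if isinstance(element, dict):
                    row.update(element)  # columns named after the element keys
                else:
                    row[f"{column}{separator}value"] = element
                rows.append(row)
        tables[main_name].append(main_row)
    return tables


# Hypothetical customer document with one scalar array.
docs = [{"_id": 1111, "Name": "Customer A", "Ratings": [7, 8]}]
tables = build_virtual_tables("CustomerTable", docs)
print(tables["CustomerTable_vt_main"])
# [{'_id': 1111, 'Name': 'Customer A', 'Number of Ratings': 2}]
print(tables["CustomerTable_vt_Ratings"][0])
# {'CustomerTable._id': 1111, 'Ratings.index': 0, 'Ratings.value': 7}
```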
Note: You cannot use a DD_DELETE strategy in an Update Strategy transformation to delete rows from a
virtual table. You also cannot use the MongoDB ODBC driver to add an array element to an existing array
index because of a limitation from the C API used by the MongoDB driver.
The following table describes the virtual table options in the Informatica MongoDB ODBC driver:
Property Description
Enable Virtual Table Detection
The driver creates virtual tables if the collection contains arrays.
Default is disabled.
Virtual Main Table Suffix
The suffix for the main virtual table.
Default is main.
Virtual Key Column Separator
The separator for the key columns in a virtual table. The virtual key column separator
must be consistent across connections used in a mapping. For example, if one
connection uses the period (.) as the virtual key column separator and another
connection in the same mapping uses the underscore (_) as the separator, then the
mapping fails.
You can use either the underscore (_) or the period (.) as the virtual key column
separator. Default is period (.).
Virtual Table Index Column Suffix
The suffix for the virtual table index column.
Default is index.
Hide Real Table if Virtual Tables Created
Hide the real tables if the corresponding virtual tables are created.
Default is disabled.
Show Array Counts In Virtual Main Table
The virtual tables contain columns that show the array count.
Default is disabled.
Virtual Table Array Count Prefix
The prefix for the virtual table array count column.
Default is Number of.
Enable Any Match Columns Detection
The driver filters the data and selects rows where a value in a top-level array matches
a specified expression and then returns the results as columns in a virtual table.
Any Match Table Name Prefix
The prefix for naming the array column in an any match virtual table.
Any Match Column Separator
The separator for naming the columns in an any match virtual table.
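Because a mapping fails when its connections use different virtual key column separators, you might check the separators before you build the mapping. The following Python sketch shows one way to express that check. The function name and the connection names are hypothetical, and the check is a stand-in for a manual review of the DSN settings rather than a feature of the product.

```python
ALLOWED_SEPARATORS = {".", "_"}


def check_separators(connection_separators):
    """Raise ValueError if the separators are invalid or inconsistent."""
    for name, separator in connection_separators.items():
        if separator not in ALLOWED_SEPARATORS:
            raise ValueError(f"{name}: unsupported separator {separator!r}")
    if len(set(connection_separators.values())) > 1:
        raise ValueError(
            "connections in one mapping use different "
            "virtual key column separators")
    return True


# Hypothetical connections used in the same mapping.
print(check_separators({"mongo_src": ".", "mongo_tgt": "."}))  # True
try:
    check_separators({"mongo_src": ".", "mongo_tgt": "_"})
except ValueError as error:
    print("invalid:", error)
```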
If you enable virtual table detection, the driver creates the following virtual tables:
CustomerTable_vt_main
The following table shows the schema of CustomerTable_vt_main virtual table:
CustomerTable_vt_Invoices
The following table shows the schema of CustomerTable_vt_Invoices virtual table:
CustomerTable_vt_Contacts
The following table shows the schema of CustomerTable_vt_Contacts virtual table:
CustomerTable_vt_Ratings
The following table shows the schema of CustomerTable_vt_Ratings virtual table:
1111 1 7
1111 2 8
2222 1 5
2222 2 6
You must modify the schema definition if there are updates to the documents that require a change in the
definitions that you created in the Designer.
If you store the schema modification in a file, ensure that the file is available in the location that you configure
in the ODBC data source name when you import a source or target definition. If you store the schema
modification in the MongoDB database, PowerExchange for MongoDB stores the schema modification in a
collection called Mersenne_Collection_Metadata. If you edit Mersenne_Collection_Metadata, you may lose
the schema modifications.
Note: If you clear the metadata cache, you must re-create or re-import the source and target objects with the
same metadata that the existing mapping objects use.
Updating the Schema File
You can update the schema file to reflect metadata changes in the MongoDB database or make changes in
the imported metadata.
1. Open the schema definition by using the Informatica MongoDB ODBC Driver DSN.
2. Click Browse and select a schema definition file.
You can also enter a file name in the file selection dialog box to create and use a new schema definition
file.
3. Export the metadata to the SSD file.
a. To export the metadata imported by using the MongoDB ODBC driver, click Export Existing.
b. To export metadata sampled from the MongoDB database, click Generate All.
c. To export any missing tables and add metadata, click Generate Missing.
4. From the Database source table list, select the table to be updated.
5. Click Generate Table to update the schema of the table from the database.
6. Click Edit Schema File to open the schema file that you exported.
7. In the Schema Editor, make the required modifications in the schema file to reflect the metadata
changes.
Note: When you update metadata, press Enter and then click Save to ensure that the changes to the
metadata are saved.
8. Save the schema file and close the Schema Editor dialog box.
9. In the Schema Definition dialog box, click Update Metadata to replace the metadata with the metadata
from the SSD file.
MongoDB Sources
When you run a MongoDB reader session, the PowerCenter Integration Service uses the Informatica
MongoDB ODBC data source to extract data from MongoDB. MongoDB reader sessions might fail or
produce incorrect results if you enable pushdown optimization in the session properties. If the session
fails, set pushdown optimization to None.
You can configure advanced reader properties for the Informatica MongoDB ODBC driver in the ODBC driver
properties.
You can configure the following read options in the ODBC driver properties:
Read Preference
MongoDB server that you prefer to read data from if you configure replica sets.
• Primary. The PowerCenter Integration Service reads data from the primary MongoDB server. If the
primary MongoDB server is offline, the session fails.
• Primary Preferred. The PowerCenter Integration Service reads data from the primary MongoDB
server if the primary MongoDB server is available. If the primary MongoDB server is offline, the
PowerCenter Integration Service reads data from the secondary MongoDB server.
• Secondary. The PowerCenter Integration Service reads data from the secondary MongoDB server. If
the secondary MongoDB server is offline, the session fails.
• Secondary Preferred. The PowerCenter Integration Service reads data from the secondary MongoDB
server if the secondary MongoDB server is available. If the secondary MongoDB server is offline, the
PowerCenter Integration Service reads data from the primary MongoDB server.
• Nearest. The PowerCenter Integration Service reads data from the nearest available MongoDB
server.
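The fallback behavior of the read preference options can be summarized in a small Python model. It is a sketch of the rules listed above and not the driver's server selection code. The function name, the server labels, and the latency figures are hypothetical, and the sketch assumes that nearest means lowest latency. A return value of None stands for a session that fails.

```python
FALLBACK_ORDER = {
    "Primary": ["primary"],
    "Primary Preferred": ["primary", "secondary"],
    "Secondary": ["secondary"],
    "Secondary Preferred": ["secondary", "primary"],
}


def choose_server(preference, available, latency_ms=None):
    """Return the server to read from, or None if the session would fail."""
    if preference == "Nearest":
        latency_ms = latency_ms or {}
        # Assumption: "nearest" means the available server with the
        # lowest latency; the name breaks ties deterministically.
        candidates = sorted(
            available,
            key=lambda server: (latency_ms.get(server, float("inf")), server))
        return candidates[0] if candidates else None
    for server in FALLBACK_ORDER[preference]:
        if server in available:
            return server
    return None


# Hypothetical replica set in which the primary server is offline.
print(choose_server("Primary", {"secondary"}))            # None
print(choose_server("Primary Preferred", {"secondary"}))  # secondary
print(choose_server("Nearest", {"primary", "secondary"},
                    {"primary": 12, "secondary": 5}))     # secondary
```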
The business analysts use a business intelligence tool that cannot read data from MongoDB. The tool
requires the input data to be in a relational database or a flat file.
The data warehouse includes a collection called Music_Contents. The collection Music_Contents contains a
catalog of all of the songs in the store. You must move the data in the collection to a flat file to use the data
for business analysis. You must also remove those records with zero units to ensure that the data is current.
Field Datatype
Name String
Units Int
The following table describes the structure of the nested document, Price:
Field Datatype
Cost_Price Int
Sale_Price Int
Create a mapping with a MongoDB source definition to read the records from the collection. Include a flat file
target definition in the mapping so that the business intelligence tool can consume the data. Use a Filter
transformation to remove the documents that have zero units.
Filter transformation
The Filter transformation applies a filter on the Units field and writes those records that have one or more
units in the Units field.
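The logic of this mapping can be sketched outside PowerCenter in a few lines of Python. The sketch is an analogy for the Filter transformation and the flat file target, not PowerCenter expression syntax. The function names, the sample records, and the column layout of the output file are assumptions.

```python
import csv
import io


def filter_units(records):
    """Keep the records that have one or more units."""
    return [record for record in records if record.get("Units", 0) >= 1]


def write_flat_file(records, stream):
    """Write the records as delimited text, flattening the nested Price."""
    writer = csv.DictWriter(
        stream, fieldnames=["Name", "Units", "Cost_Price", "Sale_Price"])
    writer.writeheader()
    for record in records:
        writer.writerow({
            "Name": record["Name"],
            "Units": record["Units"],
            "Cost_Price": record["Price"]["Cost_Price"],
            "Sale_Price": record["Price"]["Sale_Price"],
        })


# Hypothetical documents shaped like the Music_Contents collection.
music = [
    {"Name": "Song A", "Units": 3, "Price": {"Cost_Price": 5, "Sale_Price": 8}},
    {"Name": "Song B", "Units": 0, "Price": {"Cost_Price": 4, "Sale_Price": 6}},
]
buffer = io.StringIO()
write_flat_file(filter_units(music), buffer)
print(buffer.getvalue())
```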
MongoDB Targets
When you run a MongoDB writer session, the PowerCenter Integration Service uses the Informatica
MongoDB ODBC data source to load data to the MongoDB database. MongoDB writer sessions might fail
or produce incorrect results if you enable pushdown optimization in the session properties. If the session
fails, set pushdown optimization to None.
You can configure advanced write options for the Informatica MongoDB ODBC Driver in the ODBC driver
properties.
You can configure the following write options in the ODBC driver properties:
Omit default null columns on insert
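The effect of two of the write options described in the connection properties, Omit default NULL column on insert and Enable Updating Multiple Rows, can be modeled with the following Python sketch. It operates on in-memory dictionaries instead of a MongoDB target. The function names and sample rows are hypothetical, and the sketch does not reproduce the driver's implementation.

```python
def prepare_insert(row, omit_null=True):
    """Drop columns that have a NULL (None) value when omit_null is set."""
    if omit_null:
        return {key: value for key, value in row.items() if value is not None}
    return dict(row)


def apply_update(collection, matches, changes, update_multiple=False):
    """Update the first matching row, or all matching rows if enabled."""
    updated = 0
    for document in collection:
        if matches(document):
            document.update(changes)
            updated += 1
            if not update_multiple:
                break
    return updated


row = {"Name": "City Lights", "Director": "Charlie Chaplin", "Artist1": None}
print(prepare_insert(row))
# {'Name': 'City Lights', 'Director': 'Charlie Chaplin'}

inventory = [{"Type": "Blu-ray", "Units": 1}, {"Type": "Blu-ray", "Units": 2}]
is_blu_ray = lambda document: document["Type"] == "Blu-ray"
print(apply_update(inventory, is_blu_ray, {"Units": 0}))  # 1
print(apply_update(inventory, is_blu_ray, {"Units": 0},
                   update_multiple=True))                 # 2
```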
You want to use a MongoDB database to store all inventory details. Create a mapping with two flat file source
definitions to read the records from the flat files. Include the MongoDB target definition to write data from the
flat files. Use a Joiner transformation with full outer join on the common fields to combine data in the flat file
sources before writing the data to MongoDB.
Field Datatype
Name String
Artist String
Units Integer
Field Datatype
Name String
Director String
Artist1 String
Artist2 String
Type String
Units Integer
The collection MDB_Inventory stores audio CD information and movie disks information.
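The full outer join that combines the two flat file sources can be sketched in Python as follows. This is an analogy for the Joiner transformation, not its implementation. The function name, the choice of Name and Units as the common fields, and the sample rows are assumptions. Fields that a row does not have are left out rather than set to NULL.

```python
def full_outer_join(left, right, keys):
    """Join two lists of rows on `keys`, keeping unmatched rows from both."""
    def key_of(row):
        return tuple(row.get(key) for key in keys)

    right_by_key = {}
    for row in right:
        right_by_key.setdefault(key_of(row), []).append(row)

    matched = set()
    result = []
    for left_row in left:
        partners = right_by_key.get(key_of(left_row), [])
        if partners:
            matched.add(key_of(left_row))
            result.extend({**left_row, **right_row} for right_row in partners)
        else:
            result.append(dict(left_row))  # unmatched row from the left source
    for key, rows in right_by_key.items():
        if key not in matched:
            result.extend(dict(row) for row in rows)  # unmatched right rows
    return result


# Hypothetical rows from the audio CD and movie disk flat files.
audio = [{"Name": "Album A", "Artist": "Artist X", "Units": 10}]
movies = [{"Name": "City Lights", "Director": "Charlie Chaplin",
           "Type": "Blu-ray", "Units": 2}]
for document in full_outer_join(audio, movies, keys=("Name", "Units")):
    print(document)
```

Because no audio CD row shares both key values with a movie disk row in this sample, the join returns each row unchanged, so every source record becomes one document in the target.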
The following sample document shows a movie disk document in the collection:
{
"Name" : "City Lights",
"Type" : "Blu-ray",
"Director" : "Charlie Chaplin"
}
The following figure shows the target definition that you import in the Designer:
Datatype Reference
The Informatica MongoDB ODBC driver reads MongoDB data and converts the MongoDB datatypes to ODBC
datatypes. The PowerCenter Integration Service converts the ODBC datatypes to transformation datatypes.
The following table lists the MongoDB datatypes and the corresponding ODBC and transformation datatypes: