Talend Open Studio For Data Integration: Installation and Upgrade Guide
5.4.0
Talend Open Studio for Data Integration
Adapted for v5.4.0. Supersedes any previous Installation and Upgrade Guide.
Copyleft
This documentation is provided under the terms of the Creative Commons Public License (CCPL).
For more information about what you can and cannot do with this documentation in accordance with the CCPL,
please read: http://creativecommons.org/licenses/by-nc-sa/2.0/
Notices
All brands, product names, company names, trademarks and service marks are the properties of their respective
owners.
Table of Contents
Preface
    1. General information
        1.1. Purpose
        1.2. Audience
        1.3. Typographical conventions
Chapter 1. Prior to installing the Talend products
    1.1. Installation requirements
    1.2. Studio specific prerequisites
        1.2.1. Installing database client software (for bulk mode)
        1.2.2. Installing the XULRunner package (for Linux users)
    1.3. Compatible Platforms
Chapter 2. Installing Talend Studio for the first time
    2.1. Downloading and installing Talend Studio
    2.2. Launching Talend Studio
        2.2.1. Launching the Studio
    2.3. Configuring Talend Studio
        2.3.1. Identify required external modules
        2.3.2. Install external modules
Chapter 3. Upgrading your Talend products
    3.1. Backing up the environment
        3.1.1. Saving the local projects
    3.2. Upgrading the Talend projects in the Studio
Appendix A. Supported Third-Party System/Database Versions
    A.1. Supported systems and databases
1. General information
1.1. Purpose
This Installation Guide explains how to install, configure and upgrade the Talend modules and related
applications. For detailed explanations of how to use and fine-tune the Talend applications, please refer
to the appropriate Administrator or User Guides of the Talend solutions.
1.2. Audience
This guide is intended for administrators of the Talend products.
The layout of GUI screens provided in this document may vary slightly from your actual GUI.
1.3. Typographical conventions
• text in bold: window and dialog box buttons and fields, keyboard keys, menus, and menu options,
• The icon indicates an item that provides additional information about an important point. It is
also used to add comments related to a table or a figure,
• The icon indicates a message that gives information about the execution requirements or
recommendation type. It is also used to refer to situations or information the end-user needs to be
aware of or pay special attention to.
• recommended: designates an environment already set up by Talend which has undergone QA tests prior to the release
of the software;
• supported: designates an environment that can be put in place by Talend for problem reproduction and testing within
24 hours;
• supported with limitations: designates an environment that is supported by Talend under certain conditions explained in
notes.
Memory usage heavily depends on the size and nature of your Talend projects. However, in summary, if your Jobs
include many transformation components, you should consider upgrading the total amount of memory allocated
to your servers, based on the following recommendations.
The same requirements also apply for disk usage. It also depends on your projects but can be summarized as:
• Define your JAVA_HOME environment variable so that it points to the JDK directory.
For example, if the JDK path is C:\Java\JDKx.x.x\bin, you must set the JAVA_HOME environment variable to
point to: C:\Java\JDKx.x.x.
It is highly recommended that the full path to the server installation directory is as short as possible and does not
contain any space character. If you already have a suitable JDK installed in a path with a space, you simply need to
put quotes around the path when setting the values for the environment variable.
For more information on how to set the JAVA_HOME variable on Unix and Windows systems, see the online Oracle
documentation.
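On Unix-like systems, the rule above (JAVA_HOME points to the JDK directory itself, not to its bin subdirectory) could be sketched as follows. The helper name java_home_from_bin and the JDK location /opt/java/jdk are assumptions made for illustration; they do not come from this guide.

```shell
# Hypothetical helper: derive the JDK directory from the path of its bin folder.
java_home_from_bin() {
    # Strip a trailing slash, if any, then drop the last path component ("bin").
    dirname "${1%/}"
}

# Example usage (the JDK location is an assumption; adapt it to your system):
# JAVA_HOME="$(java_home_from_bin /opt/java/jdk/bin)"
# export JAVA_HOME
# export PATH="$JAVA_HOME/bin:$PATH"
```

Quoting the expansions keeps the sketch usable even when the JDK path contains a space, in line with the recommendation above.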
On Windows XP and Windows Server 2003, the GDI is already installed. However, on Windows 2000, this installation is
required. The GDI can be downloaded from Microsoft’s Website. For further information, visit Eclipse’s FAQ.
• OracleBulkExec uses the sqlldr external utility. This utility is available in Oracle clients that must be installed
on the computer.
• Sybase uses the bcp.exe external utility. This utility is asked for in the Sybase bulk components’ Basic Settings
view. For more information, see tSybaseBulkExec, tSybaseOutputBulk and tSybaseOutputBulkExec components
in the appropriate Talend Components Reference Guide.
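Before using the bulk components, it can help to check that the external utilities are reachable. The sketch below is one possible check for a Unix-like shell; the function names are hypothetical, and it assumes the utilities are exposed on the PATH as sqlldr and bcp (on Windows the Sybase utility is bcp.exe).

```shell
# Return 0 if the named utility is found on the PATH, 1 otherwise.
has_utility() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the bulk-load utilities named above are reachable.
report_bulk_utilities() {
    for tool in sqlldr bcp; do
        if has_utility "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# Example usage:
# report_bulk_utilities
```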
The supported XULRunner package versions are v1.8.x to v1.9.x and v3.6.x.
2. Add the following line at the end of the Studio .ini file that corresponds to your Linux architecture:
-Dorg.eclipse.swt.browser.XULRunnerPath=</usr/lib/xulrunner-1.9.2.17>
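The edit described in this step could be scripted roughly as follows. This is a sketch rather than an official procedure: the function name add_xulrunner_path is hypothetical, and it assumes the .ini file already ends with a newline.

```shell
# Append the XULRunner path option to a Studio .ini file, unless it is already there.
# $1: path to the .ini file; $2: XULRunner installation directory.
add_xulrunner_path() {
    ini_file="$1"
    option="-Dorg.eclipse.swt.browser.XULRunnerPath=$2"
    # -x matches whole lines only, -F treats the option as a fixed string,
    # and -e protects the pattern, which starts with a dash.
    if ! grep -qxF -e "$option" "$ini_file"; then
        printf '%s\n' "$option" >> "$ini_file"
    fi
}

# Example usage (the .ini file name is an assumption that mirrors the launcher name):
# add_xulrunner_path TOS_DI-linux-gtk-x86.ini /usr/lib/xulrunner-1.9.2.17
```

Checking for the line first makes the function safe to run more than once on the same file.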
Please refer to the following grid for a summary of supported OS and Java Runtime environments.
1. Note that Java v.6 is no longer supported by Oracle and that it is recommended to use a recent update of JDK 1.6 (Update 11 or higher).
Note that the .zip file contains binaries for ALL platforms (Linux/Unix, Windows and MacOS).
2. Once the download is complete, extract the archive file on your hard drive.
It is recommended to avoid spaces and long names in the target installation directory path.
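The recommendation about spaces can be checked before extracting the archive. The following is a minimal sketch; the function name path_is_space_free, the archive name and the target directory are all hypothetical.

```shell
# Return 0 if the given installation path contains no space character, 1 otherwise.
path_is_space_free() {
    case "$1" in
        *" "*) return 1 ;;
        *)     return 0 ;;
    esac
}

# Example usage (archive name and target directory are assumptions):
# target="/opt/talend"
# if path_is_space_free "$target"; then
#     unzip studio-archive.zip -d "$target"
# else
#     echo "Choose an installation path without spaces" >&2
# fi
```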
If you only have 512 MB of memory on your computer, you can specify the memory allocation as follows,
for example:
On Unix-like systems, add execution rights on the desired TOS_DI-* binary before launching it.
$ chmod +x TOS_DI-linux-gtk-x86.sh
$ ./TOS_DI-linux-gtk-x86.sh
TOS_DI-macosx-cocoa.app/Contents/MacOS/TOS_DI-macosx-cocoa
Public license
• The first screen is the license screen. In the [License] window that appears, read and accept the terms of the
license agreement to proceed to the next step.
1. As a first-time user, you need to set up a new project. Alternatively, you can import a Demo project, which
gathers numerous Job samples.
To create a new project, enter the name of your project in the corresponding field and click Create... to
complete the description of your project.
Click Finish when complete, and the newly created project is displayed in the Login window.
3. In the Login window, open the project you just created. A registration window opens.
If required, follow the instructions provided to join the Talend community or click Skip to open a welcome
window and launch the Studio.
When you open the Basic settings or Advanced settings view of a component for which one or more required
external modules are missing, you will see a piece of highlighted information about missing external modules,
followed by an Install button. Clicking the Install button opens a wizard that will show you the external modules
to be installed.
The Modules view lists all the modules required to use the components embedded in the Studio, including those
missing Java libraries and drivers that you must install to get the relevant components or Metadata connection
working.
If the Modules view is not shown under your design workspace, go to Window > Show View… > Talend and then select
Modules from the list.
The table below describes the information presented in the Modules view.
Column       Description
Status       Indicates whether a module is installed or not installed on your system.
             The icon indicates that the module is not necessarily required for the corresponding component
             or Metadata connection listed in the Context column.
             The icon indicates that the module is absolutely required for the corresponding component or
             Metadata connection.
Context      Lists the name of the Talend component or Metadata connection using the module. If this column
             is empty, the module is required for the general use of Talend Studio.
             This column also lists any external libraries added to the routines you create and save in the
             Studio library folder. For more information, see the Talend Studio User Guide.
Module       Lists the exact name of the module.
Description  Explains why the module/library is required.
Required     The selected check box indicates that the module is required.
In addition to the Modules view, the Studio provides a mechanism that enables you to easily identify, download
and install most of the required third-party modules from the Talend website and directs you to valid websites
for the rest.
A Jar installation wizard appears whenever any required external module is found missing for any feature in the
Studio, including when you:
• drop a component from the Palette if one or more external modules required for that component to work are
missing in the Studio, or
• click the Check button in a Metadata connection setup wizard in Talend Studio if one or more external modules
required for the connection are missing in the Studio, or
• click the Guess schema button in the Component view of a component if one or more external modules required
for that component to work are missing in the Studio,
• click Install on the top of the Basic settings or Advanced settings view of a component for which one or more
required external modules are missing,
• run a Job that involves components or Metadata connections for which one or more required external modules
are missing, or
• click the button in the Modules view.
When you click this button, the wizard that appears will list all the required external modules that are not integrated in
the Studio.
Item                        Description
Jar                         The file name of the external module.
Module                      A short description of the nature of the module.
Required by component       Lists the components that require the external module.
Required                    The selected check box indicates that the module is required.
License                     The license under which the module is provided.
More information            Provides the URL of the valid website where you can find more information about
                            this module and download the module manually.
Action                      Depending on the module, this column offers one of the following:
                            • Download and Install button: click to open the [Download external modules]
                              dialog box to download and install the module, which is available on the
                              Talend website;
                            • a link: click the link to open the valid website to download the module, which
                              is not available on the Talend website, and then click the jar button to import
                              the downloaded module into your Studio. For a list of these external websites,
                              see the article How to install external modules in the Talend products;
                            • otherwise, you need to find and download the module yourself and click the jar
                              button to import it into your Studio.
Download and install all    Click to open the [Download external modules] dialog box to download and install
modules available           all the required modules that are available on the Talend website.
Do not show again           Select to prevent the wizard from appearing again unless you click the button in
                            the Modules tab view.
                            This check box shows only when you drop a component, set up a connection, or
                            guess the schema of a database that requires a missing external module, or click
                            the Install button on the Component tab of a component that requires a missing
                            external module.
Click here to obtain more   Click to go to the Talend online documentation on installing third-party modules.
information about external
modules
This wizard lists the external modules to be installed, the licenses under which they are provided, and the URLs
of the valid websites where they are downloadable. It allows you to download and install automatically all the
modules available on the Talend website. For modules not available on the Talend website, follow the links
provided in the Action column to download them, and then install them into your Studio manually.
When you drop a component, set up a connection, or guess the schema of a database, that requires an external
module for which neither the Jar file nor its download URL information is available on the Talend website, the
Jar installation wizard does not appear, but the Error Log view will present an error message informing you that
the download URL for that module is not available. You can try to find and download it by yourself, and then
install it manually into the Studio.
To show the Error Log view on the tab system, go to Window > Show views, then expand the General node and select
Error Log.
1. In the Jar installation wizard, click the Download and Install button to install a particular module, or click
the Download and install all modules available button to install all the available missing modules. The
[Download external modules] dialog box opens.
2. To download and install the external module(s) provided under a particular license, select that license from
the Licenses pane, review the license terms, select the I accept the terms of the license agreement option,
and click Finish to start the download and installation process.
To download and install all external modules provided under all the listed licenses, click the Accept all button
to start the download and installation process.
Upon installation of the chosen external module or modules, a dialog box appears to notify you about the
number of modules successfully installed and/or about the modules that failed to install, if any.
To manually install an external module you already have in your local file system, do the following:
1. Click the button in the upper right corner of the Modules view or in the Jar installation wizard to
browse your local file system.
2. In the [Open] dialog box of your file system, browse to the module you want to install, double-click
the .jar file, or select it and then click Open to install it.
The dialog box closes and the selected module is installed in the library folder of the current Studio.
You can now use the component or Metadata connection dependent on this module in any of your Job
designs.
1. Make sure CommandLine is not started, then download the missing modules from the Modules view as
explained in the previous procedure.
2. Copy the downloaded .jar files from <StudioPath>/lib/java and paste them into
<CommandLinePath>/lib/java, where <StudioPath> and <CommandLinePath> are the installation directories of
the Studio and CommandLine respectively.
Note that the <CommandLinePath>/lib/java folder is not created by default; it is created the first time you
start the CommandLine application.
3. Restart CommandLine.
You can now use the component or Metadata connection dependent on these modules.
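The copy step of this procedure could be scripted along the following lines on a Unix-like system. This is a sketch under stated assumptions: the function name copy_modules is hypothetical, and the two arguments stand for the <StudioPath> and <CommandLinePath> installation directories. Stopping and restarting CommandLine remains a manual step.

```shell
# Copy the downloaded .jar files from the Studio to CommandLine.
# $1: Studio installation directory; $2: CommandLine installation directory.
copy_modules() {
    studio_lib="$1/lib/java"
    cmdline_lib="$2/lib/java"
    # The target folder only exists once CommandLine has been started at least once.
    if [ ! -d "$cmdline_lib" ]; then
        echo "Start CommandLine once so that $cmdline_lib is created" >&2
        return 1
    fi
    cp "$studio_lib"/*.jar "$cmdline_lib"/
}

# Example usage (both paths are assumptions):
# copy_modules /opt/talend/studio /opt/talend/cmdline
```

The function refuses to create the lib/java folder itself, since the guide states that CommandLine creates it on first start.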
• For the Studio, the downloaded modules must be placed in the following folder:
<StudioPath>/lib/java
We assume that you have installed and configured these solutions as described in the chapter Installing Talend
Studio for the first time.
The migration and upgrade process includes the following mandatory steps:
2. Upgrading the Talend projects in the Studio, see the section Upgrading the Talend projects in the Studio.
2. Click the icon and export your local projects to an archive file.
2. In the login window, select Import, then import the archive file containing your local projects.
The local projects are displayed in the Project list and appear on the Studio Repository view.
For more information on how to export local projects to an archive file, see the section Saving the local projects.
Systems/Databases  Versions                                      OS
Amazon Redshift    Initial release of Amazon Redshift            N/A (1)
AS400              V5R2 to V5R4                                  N/A (1)
AS400              V5R3 to V6R1                                  N/A (1)
Access             2003                                          Windows
Access             2007                                          Windows
DB Generic         ODBC                                          Windows
DB2                9.5/9.7                                       Windows + Linux
EXASolution        4                                             Windows
FireBird           2.1                                           Windows + Linux
Greenplum          4.2.1.0                                       Windows (client only) + Linux
HSQLDb             1.8.0                                         N/A (1)
Hive               Hive 1 (HiveServer):                          Windows + Linux
                   - Hortonworks Data Platform V1.0.0 (0.9.0),
                     Kerberos (kinit and keytab)
                   - Hortonworks Data Platform V1.2.0 (Bimota),
                     Kerberos (kinit and keytab)
                   - Hortonworks Data Platform V1.3.0 (Condor),
                     Kerberos (kinit and keytab)
                   - Custom (2)
                   The security information is not available
                   to standalone servers.
Informix           11.50                                         Windows + Linux
Ingres             9.2                                           Windows + Linux
Interbase          7 and above                                   N/A (1)
JavaDB             6                                             Windows + Linux
LDAP               No version limitation                         Windows + Linux
MS SQL Server      2000/2003/2005/2008/2012                      Windows + Linux
MaxDB              7.6                                           N/A (1)
MySQL              MySQL 4                                       Windows + Linux
                   MySQL 5                                       Windows + Linux
Netezza            Version 6 and earlier have been tested.       Windows + Linux
                   MapR 2.1.2
                   MapR 2.1.3
                   MapR 3.0.1
                   Custom (2)
Oracle             Oracle 8i/9i/10g/11g/11g (11.6)               Windows + Linux
ParAccel           3.1/3.5                                       N/A (1)
PostgreSQL         8.3                                           Windows + Linux
PostgresPlus       8.3                                           Windows + Linux
Salesforce         up to V26                                     Windows + Linux
SAP                4.6                                           Windows
SQLite             3.6.7                                         Windows + Linux
Sybase             12.5/12.7/15.2/15.5/15.7                      Windows + Linux
SybaseIQ           12.5/12.7/15.2                                Windows + Linux
Teradata           12/13/14                                      Windows + Linux
VectorWise         2                                             Windows + Linux
Vertica            3/3.5/4/4.1/5.0/5.1/6.0                       Windows + Linux
eXist              1.4                                           Windows 32-bit + Linux 32-bit
Kerberos (kinit and keytab): The Kerberos authentication with a specific keytab is supported.
Kerberos (kinit only): The Kerberos authentication without a specific keytab is supported.