
Talend Metadata

Uploaded by

danukrishnan003
Copyright
© All Rights Reserved
INTRODUCTION TO TALEND
Talend Data Integration features
Talend Data Integration Components
Talend Data Integration features

• Talend Open Studio is divided into several parts, each with its own
functionality.
• The studio has four main parts, which are as follows:
1. Repository
2. Design workspace
3. Component palette
4. Configuration tabs
Repository:

• The Repository is where Talend Open Studio keeps the technical items
that are used to design jobs, create jobs, or manage metadata.
• Metadata is an essential part of the Repository
because it holds complete information about the
data that is available in Talend Studio.
• When we develop a project, we can use the
metadata in our jobs by dragging an object from
the Repository and dropping it into the design
workspace window.
• Metadata covers many kinds of sources, for
example DB connections, different kinds of files,
Azure, LDAP, Marketo, Salesforce, web services,
Hadoop clusters, FTP, and so on, all of which are
available under the Metadata node of the Repository.
Design Workspace:
• The Design workspace is the window where we lay out and design
jobs. It gives access to the Designer tab, which displays the job
graphically, and the Code tab, which shows the generated code and
highlights possible errors.
• The Design workspace contains two tabs, which are as follows:
Designer tab
• By default, the Designer tab is open when a job is created, and it
displays the job in graphical mode.
Code tab
• The Code tab lets us view the generated code and highlights
possible language errors.
Components Palette:

• The palette contains the technical components
used to build jobs, grouped into families.
• Each component in the palette is a preconfigured
connector that performs a specific data
integration operation.
• Components also minimize the amount of
hand-coding required to work on data from
multiple, heterogeneous sources.
Configuration Tabs:

• The configuration tabs are used to
display and edit the properties of
specific components within the
design workspace.
• These properties can be edited to
change or set the parameters that
are related to a particular
component or to the job as a whole.
• The Run tab is used to execute
our jobs.
The configuration tabs have four parts, which are as follows:
• Job Tab
The Job tab gives information about the current job in the design workspace
window, such as its name, version, creation date, purpose, description, and status.
• Context variables
Context variables let us set different values for different platforms, and we can
also create a context group that holds multiple context variables.
They also make it easier to move a job from one environment to another, for
example from development to production.
• Component Tab
The Component tab displays all the parameters that are required to configure the
components.
• Run Tab
The Run tab shows the progress of the execution of a job, and the log pane displays the
start and end of the execution along with any error messages.
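
The idea behind context variables and context groups can be sketched in plain Java. This is only an illustration of the concept, not the code Talend generates: the class name ContextGroupSketch, the environment names "dev" and "prod", and every variable name and value below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a context group: one set of named values per
// environment. Nothing here is part of Talend's own API.
class ContextGroupSketch {
    // environment name -> (variable name -> value)
    private final Map<String, Map<String, String>> environments = new HashMap<>();

    void put(String environment, String variable, String value) {
        environments.computeIfAbsent(environment, k -> new HashMap<>()).put(variable, value);
    }

    // Looks up a variable for the selected environment and fails loudly if it is missing.
    String get(String environment, String variable) {
        Map<String, String> values = environments.get(environment);
        if (values == null || !values.containsKey(variable)) {
            throw new IllegalArgumentException(
                "No value for '" + variable + "' in context '" + environment + "'");
        }
        return values.get(variable);
    }

    // Builds the example group used below (all values are made up).
    static ContextGroupSketch example() {
        ContextGroupSketch group = new ContextGroupSketch();
        group.put("dev", "dbHost", "localhost");
        group.put("dev", "inputDir", "/tmp/in");
        group.put("prod", "dbHost", "db.example.internal");
        group.put("prod", "inputDir", "/data/in");
        return group;
    }

    public static void main(String[] args) {
        ContextGroupSketch group = example();
        // The job logic stays the same; only the selected context changes.
        String selected = args.length > 0 ? args[0] : "dev";
        System.out.println("dbHost = " + group.get(selected, "dbHost"));
        System.out.println("inputDir = " + group.get(selected, "inputDir"));
    }
}
```

Running the sketch with the argument prod instead of dev switches every value at once, which is the point of grouping the variables.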
Talend Data Integration Components

• Components perform all the
operations in Talend, which provides
800+ connectors and components to
perform multiple actions.
• The components are available in
the palette panel, where they are
grouped into 21 main categories.
• We choose connectors by dragging
and dropping them into the designer
panel, and the studio automatically
generates the Java code.
Components for Data Integration   Description

tMysqlConnection                  Opens a connection to the MySQL database defined in the component.
tMysqlInput                       Runs a database query to read a database and extract fields (from tables, views, etc.) depending on the query.
tMysqlOutput                      Writes, updates, or modifies data in the MySQL database.
tFileInputDelimited               Reads a delimited file row by row, divides each row into separate fields, and passes them to the next component.
tFileOutputDelimited              Writes the incoming data to a delimited file based on the defined schema.
tFileInputExcel                   Reads an Excel file row by row, divides each row into separate fields, and passes them to the next component.
tFileOutputExcel                  Writes an MS Excel file with different data values based on a defined schema.
tFileList                         Gets all the files and directories that match a given file mask pattern.
tFileArchive                      Compresses a set of files or folders into a zip, gzip, or tar.gz archive file.
tRowGenerator                     Provides an editor where we can write functions or choose expressions to generate sample data.
tMsgBox                           Displays a dialog box with the specified message and an OK button.
tLogRow                           Monitors the data being processed by displaying the data/output in the run console.
tPreJob                           Defines the subjobs that run before the actual job starts.
tMap                              Transforms and routes data from single or multiple sources to single or multiple destinations.
tJoin                             Joins two tables by performing inner or outer joins between the main flow and the lookup flow.
tJava                             Enables you to use personalized Java code in the Talend program.
tRunJob                           Manages complex job systems by running one Talend job after another.
tCloudStart                       Starts instances on Amazon EC2 (Amazon Elastic Compute Cloud).
tCloudStop                        Changes the status of a launched instance on Amazon EC2 (Amazon Elastic Compute Cloud).
tDotNETInstantiate                Invokes the constructor of a .NET object, which is intended for later reuse.
tDotNETRow                        Transforms data by utilizing custom or built-in .NET classes.
tDB2Connection                    Opens a connection to a specified database, which can be reused in the subsequent subjob or subjobs.
tFileFetch                        Retrieves a file through the given protocol (HTTP, HTTPS, FTP, or SMB).
tFTPClose                         Closes an active FTP connection to release the resources it holds.
tFTPConnection                    Opens an FTP connection to transfer files in a single transaction.
tFTPDelete                        Deletes files or folders in a specified directory on the FTP server.
tFileInputJSON                    Extracts JSON data from a file and transfers the data to a file, database table, etc.
tFileOutputJSON                   Receives data and rewrites it as a JSON structured data block in an output file.
tFileInputXML                     Reads an XML structured file row by row, breaks the rows up into fields, and sends the fields defined in the schema to the next component.
tFileOutputXML                    Writes an XML file with separated data values based on a defined schema.
tReplicate                        Duplicates the incoming schema into two identical output flows.
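
To make the descriptions of tFileInputDelimited, tJoin, and tLogRow more concrete, here is a minimal plain-Java sketch of the same ideas: splitting delimited rows into fields, inner-joining a main flow against a lookup flow, and printing the result. It is not Talend-generated code. The class DelimitedJoinSketch, its method names, the ";" separator, and the sample rows are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names exist in Talend.
class DelimitedJoinSketch {
    // In the spirit of tFileInputDelimited: split each row into fields.
    static List<String[]> parse(List<String> lines, String separator) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            // Pattern.quote treats the separator literally; -1 keeps trailing empty fields.
            rows.add(line.split(Pattern.quote(separator), -1));
        }
        return rows;
    }

    // In the spirit of an inner join in tJoin: keep only main rows whose key
    // has a match in the lookup flow, and append the matching lookup row.
    static List<String> innerJoin(List<String[]> main, int mainKeyIndex,
                                  List<String[]> lookup, int lookupKeyIndex) {
        Map<String, String[]> byKey = new HashMap<>();
        for (String[] row : lookup) {
            byKey.put(row[lookupKeyIndex], row); // on duplicate keys the last row wins
        }
        List<String> joined = new ArrayList<>();
        for (String[] row : main) {
            String[] match = byKey.get(row[mainKeyIndex]);
            if (match != null) { // unmatched main rows are dropped
                joined.add(String.join("|", row) + "|" + String.join("|", match));
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Made-up sample data: orders (main flow) and customers (lookup flow).
        List<String[]> orders = parse(
            Arrays.asList("1;c1;10.50", "2;c9;3.00", "3;c2;7.25"), ";");
        List<String[]> customers = parse(Arrays.asList("c1;Alice", "c2;Bob"), ";");

        // In the spirit of tLogRow: print each joined row to the console.
        for (String row : innerJoin(orders, 1, customers, 0)) {
            System.out.println(row);
        }
    }
}
```

In this sketch the order with customer key c9 has no match in the lookup flow, so an inner join leaves it out. An outer join would keep that row and leave the lookup fields empty.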
Connectors:

1. Row
2. Iterate
3. Triggers
4. Link
