WP EN IP Talend Integrate Data Securely RB
Talend architecture
Hybrid infrastructure
Computation resources
Network perimeter
Data flows
Physical security
User access
Administrative access
Audit trails
Vulnerability management
Recovery strategy
Monitoring
Testing
Talend Cloud Data Fabric is a multitenant integration environment that allows businesses to collect,
transform, govern, and share data.
The Talend Cloud Data Fabric architecture comprises three layers that build in security and
privacy best practices by design:
• The Control Plane, known as Talend Management Console (TMC), manages all the administrative
and operational aspects of the platform.
• The Data Plane performs all the data processing and is composed of execution engines managed
either by Talend or by the customer.
• The Applications layer implements business domain logic (Job design, Data Quality rulesets, etc.).
Additionally, Talend Studio, which runs on a local workstation, allows users to design data
integration flows (or Talend Jobs) and publish them to Talend Cloud Data Fabric.
Figure 6: Talend Data Stewardship functional architecture in hybrid deployment
Computation resources
Talend Cloud Data Fabric is a multitenant platform and customers can set up isolated execution
environments for computation resources.
• Remote Engines are deployed by customers on their own systems and therefore serve as
computation resources that they manage and control.
• Cloud Engines are deployed within a Talend Cloud Data Fabric tenant-specific AWS or Azure
Kubernetes cluster. Each tenant gets its own Cloud Engine instance.
The live preview feature of Talend Pipeline Designer, which allows users to preview the output of
processors while designing a pipeline, is executed in a dedicated Remote Engine or Cloud Engine.
Talend Management Console, Talend Data Inventory, and Talend Pipeline Designer give separate
computation resources to each tenant.
Network perimeter
To function properly and deliver its services, Talend Cloud Data Fabric may need to communicate
with external third-party solutions. All communications between Talend Cloud Data Fabric and such
external solutions need to be authorized and initiated by Talend Cloud Data Fabric. No external
solution can communicate with Talend Cloud Data Fabric unless the communication was initiated by
Talend Cloud Data Fabric.
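The initiation rule above can be modeled in a few lines. This is a minimal illustrative sketch, not Talend's implementation: the class name `OutboundOnlyPerimeter`, the allowlist, and the endpoint names are all hypothetical. It shows only the logic that an external solution is accepted solely on a flow the platform itself opened to an authorized destination.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Flow:
    local: str   # endpoint inside the platform (hypothetical name)
    remote: str  # external third-party endpoint (hypothetical name)


@dataclass
class OutboundOnlyPerimeter:
    """Toy model: traffic is accepted only on flows opened from inside."""

    allowed_remotes: set[str]
    _established: set[Flow] = field(default_factory=set, init=False)

    def open_outbound(self, local: str, remote: str) -> bool:
        # The platform may only initiate connections to authorized solutions.
        if remote not in self.allowed_remotes:
            return False
        self._established.add(Flow(local, remote))
        return True

    def accept_inbound(self, remote: str, local: str) -> bool:
        # Inbound traffic is accepted only as part of an established flow.
        return Flow(local, remote) in self._established
```

In this model, a call to `accept_inbound` returns `False` until the matching `open_outbound` call has succeeded, which mirrors the rule that external solutions cannot start a conversation on their own.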
Talend Cloud Data Fabric supports both AWS and Azure PrivateLink™ private connectivity, offering
an extra layer of security by ensuring traffic is not exposed to the public internet. Talend private
endpoints are further documented in https://help.talend.com/r/en-US/Cloud/aws-private-link and
https://help.talend.com/r/en-US/Cloud/azure-private-link/activating-azure-private-link-with-talend
Talend networks and systems are protected via network and application firewalling, visibility
mechanisms, and micro segmentation strategies.
The types of data that can be exchanged between Talend Studio and Talend Cloud Data
Fabric include:
• Task artifact binaries
• Task artifact metadata (such as context variables and parameters)
• Talend API Designer definitions
Talend Cloud Data Fabric never initiates connections to Remote Engines. Remote Engines always
initiate outbound connections to Talend. Once a connection is established, all data is sent encrypted
over HTTPS.
Here are the types of data that can be exchanged between Remote Engines and Talend:
• Status information and metrics
• Lifecycle commands
• Task artifact metadata
• Job logs (optional)
• Task artifact binaries
Figure 11: Talend data flows when using Remote Engine Gen2
Figure 12: Talend data flows with Data Preparation and Stewardship (hybrid deployment)
Guiding principle — Talend applications and components always initiate outbound HTTPS
connections. Talend Cloud Data Fabric never initiates any inbound connection to these applications.
Here are the types of data that can be exchanged between hybrid applications and Talend Cloud
Data Fabric:
a) During user login: The client ID and client secret (as defined by the OIDC specification) of the installed
application are used to authorize its communication with Talend Cloud Data Fabric.
b) After user login: A JSON Web Token (JWT) that represents the user’s identity, metadata, and claims
is transferred back to the application.
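To make the token exchange in (b) more concrete, the sketch below reads the claims carried in a JWT using only the Python standard library. It is an illustration under stated assumptions: the claim names (`sub`, `tenant`) and the demo token are hypothetical, since the claims Talend actually issues are not listed here. The sketch also does not verify the signature, which a real application must do before trusting any claim.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url-encoded without padding; restore it first.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def read_jwt_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    return json.loads(b64url_decode(parts[1]))


def make_demo_token(claims: dict) -> str:
    # Builds an unsigned token for demonstration only (hypothetical claims).
    header = {"alg": "none", "typ": "JWT"}
    return ".".join([
        b64url_encode(json.dumps(header).encode("utf-8")),
        b64url_encode(json.dumps(claims).encode("utf-8")),
        "demo-signature",
    ])


demo_claims = {"sub": "user-123", "tenant": "acme"}  # hypothetical values
decoded = read_jwt_claims(make_demo_token(demo_claims))
```

The three dot-separated segments are the header, the payload, and the signature. Only the payload is decoded here, which is enough to inspect identity, metadata, and claims during troubleshooting.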
Physical security
Talend maintains security controls to prevent unauthorized physical access to buildings and data
centers and to protect its systems and software, and by extension the Talend environment, from
damage, interruption, misuse, or theft.
Authorizations are reviewed regularly and access is monitored continuously.
Administrative access
The Cloud environment is fully separated from corporate IT resources and assets. Only designated
members of the SRE team can access the Cloud environment, and that access is governed by the Principle
of Least Privilege. Privileged access to the Cloud environment must be requested, is time-constrained, and
is performed only via a bastion host.
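The three conditions on privileged access (requested, time-constrained, via a bastion host) can be expressed as a simple check. The sketch below is hypothetical: `AccessGrant`, `may_connect`, and the one-hour duration are illustrative names and values, not a description of Talend's internal tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class AccessGrant:
    engineer: str
    approved_at: datetime
    duration: timedelta  # hypothetical validity window

    def is_active(self, now: datetime) -> bool:
        # A grant is valid only inside its approved time window.
        return self.approved_at <= now < self.approved_at + self.duration


def may_connect(grant: Optional[AccessGrant], now: datetime, via_bastion: bool) -> bool:
    # Access requires an approved request, an unexpired window, and the bastion path.
    return grant is not None and via_bastion and grant.is_active(now)


approved = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
grant = AccessGrant("sre-1", approved, timedelta(hours=1))
```

With this grant, a connection attempt 30 minutes after approval through the bastion host is allowed, while the same attempt two hours later, or one that bypasses the bastion host, is refused.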
Audit trails
Talend Cloud Data Fabric provides always-on audit trail capabilities to help monitor user
activity. The audit logs are made available via a REST API. The logging service tracks all users and
their actions in the system, along with the timestamp and outcome of each action.
With this API, you can manage regulatory compliance risks by collecting and storing those logs
on your own system. More details at https://help.talend.com/r/en-US/Cloud/api-user-guide/
audit logging
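Collecting and storing audit logs on your own system could look like the sketch below. It deliberately does not call the Talend API. The `fetch_page` callable, the cursor-based paging, and the event fields (`user`, `action`, `outcome`, `timestamp`) are assumptions made for illustration, so consult the API user guide for the real endpoint, authentication, and response format. The sketch shows only the archiving pattern: page through the events and append each one as a JSON line.

```python
import io
import json
from typing import Callable, Optional, TextIO

# A page is (events, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[str]]


def archive_audit_logs(fetch_page: Callable[[Optional[str]], Page], out: TextIO) -> int:
    """Write every audit event as one JSON line and return the event count."""
    written = 0
    cursor: Optional[str] = None
    while True:
        events, cursor = fetch_page(cursor)
        for event in events:
            out.write(json.dumps(event, sort_keys=True) + "\n")
            written += 1
        if cursor is None:
            return written


def fake_fetch_page(cursor: Optional[str]) -> Page:
    # Stand-in for a real API client; the data below is invented for the demo.
    pages = {
        None: ([{"user": "alice", "action": "login", "outcome": "success",
                 "timestamp": "2024-01-01T09:00:00Z"},
                {"user": "alice", "action": "task.run", "outcome": "success",
                 "timestamp": "2024-01-01T09:05:00Z"}], "page-2"),
        "page-2": ([{"user": "bob", "action": "login", "outcome": "failure",
                     "timestamp": "2024-01-01T09:10:00Z"}], None),
    }
    return pages[cursor]


buffer = io.StringIO()
count = archive_audit_logs(fake_fetch_page, buffer)
```

Writing one event per line keeps the archive append-only and easy to load into log management or SIEM tooling. In practice, `out` would be a file or object-store stream rather than an in-memory buffer.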
Vulnerability management
Talend partners with an external vendor for Static Application Security Testing (SAST) and Software
Composition Analysis (SCA). Their product is used to scan our software for security vulnerabilities in
third party or community software and in our own code. Scans are automated and integrated in the
development process of every Talend product.
Monitoring
Talend monitors backups of all data backends and data replication to the failover region, and tracks
backup status on an internal dashboard to ensure the recovery point objective (RPO) target is met.
The latest uptime per region is available at https://trust.talend.com.
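The backup status check against the RPO target reduces to comparing the age of the most recent successful backup with the allowed window. The following sketch assumes a hypothetical one-hour RPO and invented data store names. Talend's actual targets and dashboard logic are not described in this document.

```python
from datetime import datetime, timedelta, timezone


def stale_backups(last_backup_at: dict[str, datetime],
                  now: datetime,
                  rpo: timedelta) -> list[str]:
    """Return the data stores whose latest backup is older than the RPO."""
    return sorted(
        name for name, taken_at in last_backup_at.items()
        if now - taken_at > rpo
    )


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
status = {
    "metadata-db": now - timedelta(minutes=20),   # within the window
    "artifact-store": now - timedelta(hours=3),   # breaches the window
}
breaches = stale_backups(status, now, rpo=timedelta(hours=1))
```

An empty result means every store could be restored to a point within the RPO window. Any name returned would warrant an alert.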
Testing
Talend tests its recovery plans regularly (at least annually) through the following exercises:
• Paper test: involved stakeholders review and update recovery plans
• Structured walkthrough: step-by-step review of disaster recovery plans and configurations
• “War Game Day” simulation: conduct scenario-based practice execution of plans
• Automatic biweekly backup data restoration and integrity tests at failover region
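An automated restoration and integrity test, as in the last item above, typically compares digests recorded at backup time with digests of the restored data. The sketch below uses SHA-256 from the Python standard library. The function names and sample objects are hypothetical, and the document does not state which integrity method Talend uses.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_restore(expected_digests: dict[str, str],
                   restored: dict[str, bytes]) -> list[str]:
    """Return the names of objects that are missing or fail the digest check."""
    failures = []
    for name, expected in expected_digests.items():
        blob = restored.get(name)
        if blob is None or sha256_hex(blob) != expected:
            failures.append(name)
    return failures


# Digests recorded when the backup was taken (invented sample data).
source = {"jobs.json": b'{"jobs": 3}', "users.json": b'{"users": 12}'}
recorded = {name: sha256_hex(blob) for name, blob in source.items()}

# A restore where one object came back corrupted.
restored = {"jobs.json": b'{"jobs": 3}', "users.json": b'{"users": 1'}
failed = verify_restore(recorded, restored)
```

A restore passes only when the returned list is empty. Any listed object would need investigation before the failover copy can be relied on.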