Content Modifier
In a content enricher, "combine" simply appends the additional data retrieved from a lookup source to
the original message, while "enrich" merges the additional data with the original message based on a
common field, allowing for more controlled integration of the new information.
There are two options for the aggregation algorithm:
Combine: On using this option, the Lookup Message is appended to the Original Message.
Enrich: On using this option, the Lookup Message is merged with the Original Message using a common field.
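To make the difference concrete, here is a hypothetical illustration (element names invented). Suppose the original message is an order and the lookup source returns customer details:

    Original message:
    <Order>
      <CustomerId>42</CustomerId>
      <Amount>100</Amount>
    </Order>

    Lookup message:
    <Customer>
      <CustomerId>42</CustomerId>
      <Name>ACME Corp</Name>
    </Customer>

With Combine, the result simply contains both payloads one after the other. With Enrich, using CustomerId as the common field, the customer details end up inside the matching order record, schematically:

    <Order>
      <CustomerId>42</CustomerId>
      <Amount>100</Amount>
      <Customer>
        <CustomerId>42</CustomerId>
        <Name>ACME Corp</Name>
      </Customer>
    </Order>

The exact shape of the enriched payload depends on the node paths you configure for the original and lookup messages.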
Aggregator: you can collect and store individual messages until a complete set of related messages has been received. The aggregated message is then sent to the actual receiver.
----------------------------------------------------------------------------------------------------------------------------------------
The scope of an element declared in the Header extends beyond the IFlow.
The scope of an element declared in the Property is only within the IFlow, i.e. the property parameters are not handed over to the receiver.
In the Message Body, you can do the following:
1. Call a header, like: ${header.elementname}
2. Call a property, like: ${property.elementname}
3. Call the body of the previous Content Modifier, like: ${in.body}
As shown in the screenshot below, I have called the Header and the body, and since the receiver expects the output in XML, the tags have been maintained likewise.
OUTPUT from this Content Modifier is:
Now, I have used one more Content Modifier to explain the functionality even better, so follow the series of images to see how it works.
Here, I have inserted a Groovy Script after every Content Modifier because I wanted to see the payload after the IFlow is deployed. I shall share the script's content in Part 7 of this series.
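Until then, a commonly used minimal sketch of such a payload-logging script (not necessarily the exact one from Part 7) looks like this; it attaches the current body to the message processing log so it can be inspected in the monitor:

import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Read the current payload as a string.
    def body = message.getBody(String) as String
    // Attach it to the message processing log (visible in the CPI monitor).
    def messageLog = messageLogFactory.getMessageLog(message)
    if (messageLog != null) {
        messageLog.addAttachmentAsString('Payload', body, 'text/plain')
    }
    return message
}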
In Content Modifier 2, below are the Header, Property, and Message Body:
As you can see in the screenshot above, I have used the code ${in.body} to call the content of the previous step, i.e. Content Modifier 1.
Now, the output of this, after you have "Saved as version" and "Deployed", is shown below.
Passing the header parameters into the iFlow
This post addresses a common challenge encountered when working with SAP BTP Integration Suite —
specifically, passing parameters to iFlows using the 'Header' section of the message. Let's walk through an
example to illustrate the issue.
Define a new exchange property. This property will be passed within the Header when triggering the API.
Step #4. Modify the message body
To access the added exchange parameter value within the iFlow, consider printing it out in the message
body.
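For instance (the header name myParam is hypothetical), a Content Modifier message body such as the following echoes the passed value back:

<result>
    <receivedParam>${header.myParam}</receivedParam>
</result>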
Deploy and test the newly created iFlow, ensuring to pass the parameter to the Header.
Upon testing, you may encounter an inability to read the value of the parameter passed to the iFlow.
Navigate to the Runtime Configuration tab of your iFlow to determine which headers are allowed; by default, an inbound header is only passed into the iFlow if its name is listed in the Allowed Header(s) field there.
See Specify the Runtime Configuration
Step 7. Testing
Re-test the iFlow to ensure the parameter is now successfully passed and readable.
As simple as that.
As shown in the section above, there are four parts to the Content Modifier at a high level: General, Headers, Properties, and Body. Let's deep dive into each section to see the different possibilities and brief use cases where we can leverage this step.
1. Constant: Any constant value that will be carried across the integration process.
2. Header: Pass the value from another header to the newly created header.
3. Local Variable: We can use the Write Variables step type to create a variable at a certain point within the message processing sequence of the same integration flow. To consume the local variable (with the scope of the integration flow), we set the type to Local Variable and assign it to a header or property in the Content Modifier.
4. Global Variable: We can use the Write Variables step type to create a variable at a certain point within the message processing sequence to be used across integration flows of the Cloud Integration tenant. To consume the global variable (with the scope of the tenant), we set the type to Global Variable and assign it to a header or property in the Content Modifier.
5. Property: Pass the value from an existing exchange property; here the source value is the property name.
6. Number Range: Helps to insert unique sequence numbers as part of the inbound or outbound messages. To consume number ranges, we set the type to Number Range and assign it to a header or property in the Content Modifier.
Value: Placeholder for the source value. What it holds depends on the Type (Header, Property, Local Variable, and so on).
Data Type: The Data Type column is used only for the types XPath and Expression.
Default: If you have selected Local Variable or Global Variable as Type, the value specified as Default is assigned to the header if the variable is not found at runtime.
Exchange Property:
Properties are an important aspect of the message; they do not get propagated to the receiver or target system. The boundary for properties is the integration flow. The rest of the configuration is the same as for headers.
4. Message Body
The body contains the actual message content. There are two ways to set the body value: either as a constant or using Camel Simple Expressions. If the body is left empty, the body is not changed.
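For readers who like to see these operations in script form, the following minimal Groovy sketch (header, property, and values are illustrative) does programmatically what the Content Modifier tabs do declaratively:

import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Headers tab: create or overwrite a header (propagated to the receiver).
    message.setHeader('myHeader', 'someValue')
    // Exchange Properties tab: create a property (scoped to this IFlow only).
    message.setProperty('myProperty', 'someValue')
    // Message Body tab: rebuild the body, reusing the incoming body
    // the way ${in.body} does in a Content Modifier.
    def oldBody = message.getBody(String) as String
    message.setBody('<wrapper>' + oldBody + '</wrapper>')
    return message
}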
Content Enricher:
Purpose: The Content Enricher is used to enrich the message by adding additional data fetched from an
external source (e.g., a REST service, OData service, etc.). It makes a call to an external system, retrieves
additional data, and merges this data with the original message.
Functionality:
Call External Service: The Content Enricher sends a request to an external service (e.g., REST or SOAP) to
fetch additional data.
Merge Data: The data retrieved from the external service is then merged with the original message. The
merging can be done in different ways, depending on the configuration.
Type of Operation: The operation is dynamic; it depends on external data sources, which can vary based
on the content of the original message.
Use Cases: Enrich a message with additional details from a backend system (e.g., fetching customer
details using a customer ID from a REST service).
Add supplementary data that wasn’t originally in the message, which is necessary for further processing.
Example: Fetching additional customer information based on a customer ID in the original message and
adding this information to the message payload.
Key Differences: Modification vs. Enrichment: The Content Modifier is used for direct modification of the
message content based on what’s already available, while the Content Enricher dynamically fetches and
adds new data from external systems.
Static vs. Dynamic: The Content Modifier operates with static values or expressions within the message,
whereas the Content Enricher involves dynamic data retrieval from external sources.
Complexity: The Content Modifier is simpler and does not involve external communication, while the
Content Enricher requires configuring connections to external systems and handling the integration of
the returned data.
When using the SFTP Sender Adapter in SAP Cloud Integration, you can filter files based on patterns.
Regex filtering provides a powerful way to match specific file names when polling an SFTP server. Here’s
how you can configure it:
1. Directory Path: Provide the relative directory path to read files, e.g., parentdir/childdir.
2. Enable Regex Filtering: Under the Source tab, select the Regex Filtering checkbox. This ensures the file name field is treated as a regular expression rather than a simple pattern.
3. Define File Name: Enter your regex expression in the File Name field.
Important Notes:
Ensure the regex syntax is correct; invalid patterns may cause errors.
A 5-second timeout is set for regex evaluation.
Avoid overly complex patterns to ensure performance efficiency.
Example Configuration:
Directory: data/input
Regex Filtering: Checked
File Name: ^report_\d{4}-\d{2}-\d{2}\.csv$
This matches files like report_2024-06-17.csv.
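As a quick sanity check before deploying, you can verify such a pattern with a few lines of Groovy (the sample file names are invented):

// Compile the same pattern the adapter will use (single quotes, so backslashes are doubled).
def pattern = ~'^report_\\d{4}-\\d{2}-\\d{2}\\.csv$'
// ==~ performs a full match, like the adapter's file-name matching.
assert 'report_2024-06-17.csv' ==~ pattern
assert !('report_2024-6-7.csv' ==~ pattern)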
Regex filtering in the SFTP sender adapter simplifies file processing by enabling precise control over
which files to pick during polling. This is especially useful for dynamic integration scenarios where file
naming follows a specific pattern.
SFTP Sender Adapter: Processing Parameters
Read Lock Strategy: Prevents files that are in the process of being written from being read from the SFTP server. The endpoint waits until it has an exclusive read lock on a file before reading it. Select one of the following options based on the capabilities of the SFTP server:
None (default): Does not use a read lock, which means that the endpoint can immediately read the file. None is the simplest option if the SFTP server guarantees that a file only becomes visible on the server once the process of writing it to the server has finished.
Content Change: Monitors changes in the file length/modification timestamp to determine whether the write operation on the file is complete and the file is ready to be read. If you have selected this option, the system waits for at least one second until there are no more file changes. Therefore, with this option files cannot be read as quickly as with the other options.
Done File Expected: Uses a specific file to signal that the file to be processed is ready for consumption. If you have selected this option, enter the name of the .done file. The .done file signals that the file to be processed is ready for consumption; it must be in the same folder as the file to be processed.
Rename: Renames the file on the SFTP server before reading it.
Poll on One Worker Only: In case the integration flow is deployed on multiple worker nodes, each worker node is connected as a separate consumer to the SFTP server. Note:
The SFTP server provides the files in an order that is not controlled by Cloud Integration.
The order of messages is only maintained per worker.
The technical communication between workers prevents processing of the same file on multiple workers.
Therefore, the setting of this parameter has an impact on how the Sorting and Max. Messages per Poll parameters behave at runtime, as explained under Integration Flow Deployed on Multiple Worker Nodes.
Stop on Exception (only if Poll on One Worker Only is enabled): Select to stop the processing of the current file batch if any exception or error is encountered. Polling is resumed in the next polling cycle. If you use an Exception Subprocess with Stop on Exception enabled, ensure that you end it with an Error End event. This ensures that exceptions are not suppressed, which would in turn disable Stop on Exception.
Sorting (this field is enabled only if Poll on One Worker Only is checked): Select the type of sorting to use to poll files from the SFTP server:
None (default): The sorting is specified by the SFTP server.
File Name: Files are polled sorted by file name.
File Size: Files are polled sorted by file size.
Time Stamp: Files are polled sorted by the modification time stamp of the file.
Note: From adapter version 1.16 onwards, the Sorting field is accessible only when Poll on One Worker Only is enabled.
Max. Messages per Poll: Maximum number of messages to gather in each poll. Enter any value between 1 and 500. The default is set to 20. Consider how long it will take to process this number of messages, and make sure that you set a higher value for Lock Timeout (in min). The messages are picked up sequentially. The system uses locks to ensure that each file from the SFTP server is only processed on one runtime node (see: Message Locks).
Note: If you are using the sender SFTP adapter in combination with an Aggregator step and you expect a high message load, set the value for Max. Messages per Poll to a small number larger than 0 (for example, 20). This ensures proper logging of the message processing status at runtime.
Lock Timeout (in min): Specify how long to wait before trying to process the file again in the event of a Cloud Integration outage. If it takes a very long time to process the scenario, you may need to increase the timeout to avoid parallel processing of the same file. This value should be higher than the processing time required for the number of messages specified by Max. Messages per Poll. Default: 15
Change Directories Stepwise: Select this option to change directory levels one at a time.
Include Subdirectories: Selecting this option allows you to look for files in all subdirectories of the directory.
Flatten File Names (only if Include Subdirectories is selected): Flatten the file path by removing the directory levels so that only the file names are considered.
Use Fast Exists Check: If selected, the file-exists check is performed on the SFTP server. If your server doesn't support this operation, switch back to the client-side check. This option is enabled by default.
Post-Processing: Allows you to specify how files are to be handled after processing.
Delete File (default): The file is deleted after it has been processed successfully. If you have also selected Done File Expected as Read Lock Strategy, the file to be processed as well as the done file is deleted.
Keep File and Mark as Processed in Idempotent Repository: Select this option for SFTP servers that do not allow deletion or moving of files, but where the files are to be read only once. Note that when you choose this option, the system only takes the file name into account to decide whether it is the same file or not; attributes such as file size, timestamp, or hash value are ignored. If you have also selected Done File Expected as Read Lock Strategy, an entry is created in the idempotent repository; the done file is not deleted.
Keep File and Process Again: The file is kept on the SFTP server and file processing is repeated. You can use this option for testing purposes, for example. If you choose this option, the file is processed with every message processing run, even if it has not been changed.
Move File: If you select this option, you need to specify the target directory (see Archive Directory). If you have also selected Done File Expected as Read Lock Strategy, only the file to be processed is moved and the done file is deleted.
Idempotent Repository (only if Keep File and Mark as Processed in Idempotent Repository is selected for Post-Processing): You can select one of the following idempotent repository options:
Database (default): Stores the file names in a database to synchronize between multiple worker nodes and to prevent the files from being read again when the runtime node is restarted. File name entries are deleted by default after 90 days. The idempotent repository uses the username, host name, and file name as key values to identify files uniquely across integration flows of a tenant.
In Memory: Keeps the file names in memory. Files are read again from the SFTP server when the runtime node is restarted. It is not recommended to use the In Memory option if multiple runtime nodes are used; in this case the other nodes would pick up the file and process it again, because the memory is specific to the runtime node.
Archive Directory (only if Move File is selected for Post-Processing): Specifies the target directory where the file is moved. Make sure that you specify a relative file path for the target directory; the specified file path is defined relative to the directory specified with the Directory parameter. If you specify an absolute file path, the file may not be stored correctly at runtime. You can also specify the target directory dynamically, for example using the timestamp of the message. The following example uses backup folders with timestamps and replaces the file extension with bak: backup/${date:now:yyyyMMdd}/${file:name.noext}.bak
Integration Flow Deployed on Multiple Worker Nodes
If the integration flow is deployed on multiple worker nodes, each worker node is connected as a separate consumer to the SFTP server.
The setting of parameter Poll on One Worker Only has the following impact on how the Sorting and
the Max. Messages per Poll parameters behave at runtime.
If Poll on One Worker Only is not selected, before evaluating the Sorting setting, the system
determines the maximum number of messages to be read from the SFTP server per poll (as
configured by the Max. Messages per Poll parameter).
For example, if there are 1000 files on the SFTP server and for Max. Messages per Poll you've
specified 500, the SFTP adapter reads the first 500 files from the SFTP server and, after this step,
sorts these files according to the Sorting settings.
If you don't restrict the polling to one worker, files are processed in parallel. As a consequence, you can run into a situation in which messages that are later in the sorting order overtake other messages that are currently being processed on a different worker. This disturbs the sequence of messages.
If Poll on One Worker Only is selected, all files are first sorted, and then the messages are selected according to the setting of the Max. Messages per Poll parameter.
Note that when the integration flow is deployed on multiple runtime nodes, each runtime node is connected as a separate consumer to the SFTP server and polls files independently. In such a case, the maximum number of polled files is the value specified for Max. Messages per Poll multiplied by the number of runtime nodes. For example, if the integration flow is deployed on two runtime nodes and Max. Messages per Poll is set to 10, the overall maximum number of polled files per scheduled poll is 20. The system uses locks to ensure that each file from the SFTP server is only processed on one runtime node (see: Message Locks).
🌟 Recently, I faced an issue in one of our SAP CPI production flows, where an HTTP client error was causing an integration failure with the OData adapter. Sharing the details to help others in the community:
🔴 Error:
HTTP Request failed with error: while trying to invoke the method java.lang.Object.hashCode() of a null
object loaded from local variable 'httpClient'
The issue occurred due to proxy-related system properties (https.proxyHost, https.proxyPort, etc.) being overridden in Groovy scripts with System.setProperty() calls.
These properties are global and shared across all HTTP clients, leading to conflicts and
NullPointerExceptions during initialization. 🌐
🛠️ The Fix:
To resolve this, I implemented a simple Groovy script to clear the proxy settings before any HTTP call:
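The exact script isn't reproduced here; a minimal sketch along those lines (adjust the property list to whatever your scripts set) would be:

import com.sap.gateway.ip.core.customdev.util.Message

def Message processData(Message message) {
    // Clear the JVM-global proxy keys so a stale or partial proxy
    // configuration cannot leak into later HTTP client initialization.
    ['http.proxyHost', 'http.proxyPort',
     'https.proxyHost', 'https.proxyPort'].each { key ->
        System.clearProperty(key)
    }
    return message
}

Since these properties are JVM-global and shared by every flow on the node, the cleaner long-term fix is to avoid setting them via System.setProperty() in scripts altogether; clearing them is a mitigation.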
Splitter:
XPath Expression – XPath of the node that marks each split message (see the example below).
Grouping – How many nodes are clubbed together in one split message. If this is blank, each node forms one split message.
Streaming – Enable this if you want to start splitting before the entire (potentially large) message is loaded into memory. The system first divides the message into chunks and then starts splitting the chunks into split messages. This is only relevant if the Splitter is the first step after the step that fetches the data; otherwise, the content has already been loaded into memory.
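For example (payload structure invented), with an XPath Expression of /Orders/Order and Grouping left blank, an input like this yields one split message per Order node:

<Orders>
    <Order><Id>1</Id></Order>
    <Order><Id>2</Id></Order>
</Orders>

With Grouping set to 2, both Order nodes would travel together in a single split message.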
OData Adapter: provide either the EDMX file or the address of the OData service.
Batch processing, as its name suggests, means that multiple requests are clubbed into one HTTP call. If you are updating 100 records, instead of making 100 PUT or update requests, the adapter makes just one call to the server with all the essential data. This greatly reduces the time taken to achieve such requirements and improves performance.
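Schematically, a batched update is a single multipart request (the service path, entity, and payload below are invented; the adapter assembles this envelope for you when batch processing is enabled):

POST /odata/MyService/$batch HTTP/1.1
Content-Type: multipart/mixed; boundary=batch_1

--batch_1
Content-Type: multipart/mixed; boundary=changeset_1

--changeset_1
Content-Type: application/http
Content-Transfer-Encoding: binary

PUT Products(101) HTTP/1.1
Content-Type: application/json

{"Price": 9.99}
--changeset_1--
--batch_1--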
1. Delta Load:
Definition: Delta load refers to the process where only the data that has changed (i.e., the new
or modified records) since the last synchronization are extracted and loaded into the target
system.
Use Case: This method is efficient for scenarios where the volume of data is large, and only
updates or new entries need to be transferred. It reduces the load on both the source and
target systems by minimizing the amount of data being processed.
Key Feature: Typically, a timestamp, version number, or unique identifier is used to identify the changes (delta) since the last load (see the query sketch after this list).
2. Period Delta:
Definition: Period delta refers to a variation of delta load where data is extracted for a specific
period (e.g., last 24 hours, last week, etc.). It’s based on a defined time window rather than a
general "last change" timestamp.
Use Case: It’s useful when you want to extract data that has changed within a specific time
frame or period, rather than just identifying individual changes. This is often used in cases where
you want to fetch data in periodic batches or during a regular sync.
Key Feature: It is period-based, meaning data within a defined time window (e.g., last 30 days) is
extracted and processed.
3. Full Load:
Definition: Full load refers to the process of extracting and loading all data from the source
system into the target system, regardless of whether it has changed or not. This is in contrast to
delta loads, which only process changed data.
Use Case: Full loads are typically used when it is necessary to refresh the entire dataset or when
setting up initial integrations. It can also be used in cases where the delta mechanism is not
available or reliable.
Key Feature: Full load can be resource-intensive because it involves transferring the entire
dataset.
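To make the first two patterns concrete, here is a hypothetical sketch of how they are often expressed against an OData source (entity and field names invented):

Delta load – filter on a change timestamp remembered from the previous run:
GET /odata/Customers?$filter=LastChangeDateTime gt datetime'2024-06-17T08:00:00'

Period delta – filter on a fixed time window, e.g. the last 30 days:
GET /odata/Customers?$filter=LastChangeDateTime ge datetime'2024-05-18T00:00:00'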
Key Differences:
Delta Load only brings in new or changed data (efficient for regular, incremental updates).
Period Delta extracts data within a specific timeframe, even if the data hasn't changed but is
within the set period (useful for periodic data extraction).
Full Load involves extracting and loading all data (typically used during initial loads or full
refreshes).