
Data Hub

SAP Commerce Data Hub is a powerful data integration and staging platform. It facilitates the loading of large amounts of data from one or more sources. It then processes the data and prepares it for delivery to any number of target systems.

Data Hub Fundamentals
The SAP Commerce Data Hub is designed to remove obstacles
related to the loading and manipulation of large amounts of data. It
acts as the staging platform for the core data that is essential to
operations in your business. It also protects your current investments
in back end master data management systems.

SAP Commerce Data Hub Business Advantages
Data Hub is a comprehensive business solution that takes care of
almost any data integration issue. Data Hub also serves as a robust
and flexible master data management tool.

Addressing Data Integration Challenges

Introducing new, front-end commerce applications into an environment with existing business data
management systems creates challenges for data integration. Traditionally, the introduction of these
new applications means lengthy and costly custom integration projects. Data Hub is designed to
overcome barriers related to the import and manipulation of large amounts of data. It acts as the
staging platform for data from multiple sources, and protects existing investments in back-end master
data management systems such as SAP ERP.
Data Hub is designed with SAP Commerce-SAP integration in mind, but is customizable for any data
integration scenario. Its vendor-independent workflow is fully extensible. Customizations can easily
be introduced that alter the data transformation workflow for specific integration needs. The ease of
customization greatly reduces project implementation costs and time-to-market wherever such data
integration is required.

Improving the Customer Experience

Data Hub works primarily asynchronously, which means the data transformation process is
independent of external systems. Removing this reliance on the availability or responsiveness of
back-end systems allows for faster response times to the customer. Customer data is transmitted to
back-end systems outside the customer transaction workflow, and the customer experience is
improved.

Supporting Master Data Management

With the right extensions, Data Hub brings together fragmented data from diverse systems into a
single, authoritative master view. It provides a full history of the data transformation workflow, and
performs error correction. It even retries data publication in unstable environments without any data
loss.

Data Hub offers all of the principal features of good Master Data Management (MDM) - collection,
aggregation, correlation, consolidation, and quality assurance. It allows for the synchronization of all
critical business data to a single reference system - an authoritative source of master data. These
features make Data Hub an ideal data staging platform for a unified MDM strategy.

SAP Commerce Data Hub Basic Data Flow
The SAP Commerce Data Hub loads large amounts of data from one
or more sources. It then processes the data and prepares it for
delivery to any number of target systems.

The SAP Commerce Data Hub provides a vendor independent, asynchronous workflow that is fully
extensible to support the loading, processing, and publication of any data structure.

1. Raw items come directly from the source system. The data they contain may undergo some
preprocessing before entering the Data Hub.

2. Canonical items have been processed by Data Hub and no longer resemble their raw origins.
They are organized, consistent, and ready for use by a target system.

3. Target items are transition items. The process of creating target items reconciles the
canonical items to target system requirements.

Context
Data Hub is a Java web application, and it utilizes a relational database. It requires certain prerequisites for a minimal installation. Unless otherwise noted, the following instructions are valid only for the third-party software versions stated here.

Solution Book
The Solution Book files are useful as examples or samples of what
can be done with Data Hub.

You can find the Solution Book files in the Data Hub Suite ZIP file. After you install Data Hub, you can find the files in hybris/bin/ext-integration/datahub/solution-book. You can also access the files from SAP Note 2764052.

Install SAP Commerce Data Hub
By default, Data Hub enables basic authentication on its REST API. To configure users for basic authentication, create the text file /opt/datahub/config/local.properties and add the following content:
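The exact property keys can vary between Data Hub versions; the following is a minimal sketch assuming the standard basic-authentication admin and read-only roles (choose your own user names and passwords):

# Credentials for the ADMIN role
datahub.security.basic.admin.user=admin
datahub.security.basic.admin.password=<your-admin-password>

# Credentials for the READ_ONLY role
datahub.security.basic.read_only.user=user
datahub.security.basic.read_only.password=<your-read-only-password>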

1. HSQLDB is the default database of SAP Commerce and, unless otherwise specified, is the
default database used by Data Hub. No configuration is necessary, and you do not need to
start it. It just works.

Configure Data Hub for a Test Environment
Set up a Data Hub test environment that you can use for the tutorials.

Context
By default, Data Hub enables basic authentication on its REST API. Configure your installation with
access credentials for the two basic security roles, then test your installation. After you complete
these steps, you can use this environment for the Tutorial: Setting Up and Running Hello World.

Creating Your Data Models


Whenever you process data through Data Hub, create a data model
for each of the three stages.

Data Hub data models are XML files that describe and sometimes transform the data. The
relationship between the data models and the respective raw, canonical, and target items is
represented in the following figure. A data model is required for each of the three stages of the Data
Hub workflow.

Summary

You have just created and loaded three custom Data Hub extensions. The extensions consist of a data
model for each of the three data types, Raw, Canonical, and Target. Data Hub uses the raw and
canonical models to structure and transform data internally during the load and composition phases.
It then uses the target model to transform data for specific target systems, and to connect to those
systems. Such data models are the minimum requirement for any custom Data Hub extension. Your
new extension is used for the tutorials that follow.
Composing Data
Once the raw data is loaded, compose it into canonical items. The
composition phase has two processes: grouping and composition.

Results
The POST request you sent in step one triggered the composition of all items in the GLOBAL data
pool. The request caused the raw item to be transformed into a canonical item, according to your
canonical data model. This canonical item is now ready for transformation into a target item for one
or more target systems. In the next tutorial you publish this item to a file using the file adapter.

Publishing Target Data


Once the items are composed into canonical items, it is time to publish
them.

Results
Congratulations! You've just successfully published your first set of target data. When sent in a
POST request to Data Hub, the contents of datafile2 triggered a publication of all data in the
specified pool. The resulting target item was picked up by the output adapter which then wrote the
data to the output file. There were no complex data transformations in your data model, only one-to-
one mappings. Therefore the output was the same as the input. The only thing that has changed is the
column headers.

Overview and Concepts


SAP Commerce Data Hub moves data items from one system to
another. During the process of moving the data, you have the
opportunity to modify the data. While your data is in the Data Hub,
each item has a status assigned to it. The status changes whenever
the data begins or completes a process.
The SAP Commerce Data Hub Workflow
At a high level, the SAP Commerce Data Hub workflow moves data
from a source system to a target system. This workflow comprises
three major phases, which are explained in the following sections. The
workflow includes the load phase, the composition phase, and the
publication phase.

Source System

The SAP Commerce Data Hub takes data from any source system. However, the SAP Commerce
Data Hub treats each data fragment as its own raw item during the load process. So, it is good to
understand the nature and complexity of your data before loading data into the SAP Commerce Data
Hub.
Load

During the load phase, data is converted into raw items ready for processing. Loading means
resolving the data into raw item fragments as key-value pairs ready for the next step of the process,
composition.

Data Feed and Data Pool

When the data is loaded, it goes through a data feed. Raw items enter the data feed as fragments, and
not as a single, monolithic, data block. The data feed routes the raw items to a data pool. Pools are
containers for isolating items during the processing lifecycle.

Once data is loaded, it is processed in the composition phase.

Compose

The composition phase converts raw items to canonical items using two subprocesses: grouping and
composition. Both of these subprocesses are influenced by data handlers. Handlers are extensions to the SAP Commerce Data Hub and are used to enhance its functionality.
Grouping handlers pull the raw imported items into logically ordered groups, while composition
handlers apply the composition rules by which the canonical items are composed.

Canonical items represent a master data type view that is independent of the structure of both the
source and target systems. It is during this phase that the power of the SAP Commerce Data Hub as a
data staging platform is seen. The canonical view provides a reference model or template that may be
reused regardless of source or target system.

It is also during the composition phase that data can be consolidated from multiple sources, the
integrity of data checked, and any quality issues remedied. Imported data is open to inspection at any
phase in the SAP Commerce Data Hub workflow. The inspections allow for complete transparency
of the data processing and error remediation before publishing to target systems. Once data is fully
composed, you can publish it to one or more target systems.
Publish

The publication phase transforms canonical items into target items ready for export to the target
system. Once the data has been processed into target items, outbound extensions or target adapters
then provide the means for delivering the data to target systems.

Target System

SAP Commerce Data Hub includes a target adapter that integrates with SAP Commerce. This target
adapter is called the SAP Commerce Data Hub Adapter, and it provides connectivity between SAP
Commerce Data Hub and SAP Commerce. If you are using a different target system, you create your
own custom target adapter.

The Backoffice Data Hub Cockpit
Backoffice Data Hub Cockpit exposes the primary processes of
the Data Hub workflow. Here you can access the basic functionality
of Data Hub, embedded as an extension in SAP
Commerce Backoffice.

The typical use case for Data Hub is to deploy it together with SAP Commerce. SAP Commerce provides the Backoffice Administration Cockpit, a framework for building web-based administration tools. The Backoffice Data Hub Cockpit is one of the provided Backoffice components shipped with SAP Commerce.

Backoffice Data Hub Cockpit is an easy-to-use graphical user interface for performing key Data
Hub tasks in a simple and intuitive way. With Backoffice Data Hub Cockpit, you can manage the
entire Data Hub workflow, including the following:

 View item and status counts


 Create named pools

 Create new feeds and assign them to pools

 Initiate load, compose, and publication actions for small data sets

 View and analyze any error messages and failures

Accessing Backoffice Data Hub Cockpit

When the Backoffice Data Hub Cockpit extension is installed, you can access it by simply logging
into the SAP Commerce Backoffice Administration Cockpit. Once logged in, choose the Data
Hub perspective from the perspectives menu to access the Backoffice Data Hub Cockpit.

Layout of Backoffice Data Hub Cockpit

The Backoffice Data Hub Cockpit consists of several main areas, as shown in the following figure.
These main areas are as follows:

1. The perspective menu - Using the options provided here, you can switch between the Data
Hub perspective and other perspectives as required.

2. The current Data Hub instance - Here you can choose which Data Hub instance you want to
use for this session. Just click the down arrow and select.

3. The function menu - Here you choose one of the available functions of Backoffice Data Hub
Cockpit:

o Dashboard - See an overview of status and item counts for raw, canonical, and target
items. You can filter this count by pool, to view only those items in a selected pool.

o Quick Upload - This tool is designed to accommodate the quick upload of data into
a Data Hub data feed. You can also initiate the composition of the data, and the
publishing of the data to a target system.

o Errors & Failures - This tool makes it simple to review the errors that may occur
during each step of the Quick Upload process. The Errors & Failures menu option
can be useful even for large production runs due to its advanced, field level searching
capability.
o Manage Pools - Here you can create named pools for data import and composition.
The GLOBAL pool is provided by default.

o Manage Feeds - Here you can create new feeds and assign them to pools. The
DEFAULT_FEED is provided by default, and is mapped to the GLOBAL pool.

4. The control pane - This pane contains the screens associated with the different menu options,
allowing you to perform tasks associated with these options. When you select each item in
the function menu, the contents of this pane change to reflect your choice.

Data Handling
Data Hub is all about data. Data is loaded from a source system,
which can be any kind of source. Once the data is in the Data Hub, it
is organized and converted into a canonical format. While the data is
in the Data Hub, there are opportunities to modify or filter it. When
preparing the data for a target system, further manipulation can occur.
Then the target adapter ensures that the data is compatible with the
target system.

All adapters and handlers are really just extensions. They have other names to designate their
function. There are several different types of extensions for processing data including the following:

 source adapters

 target adapters

 grouping handlers

 composition handlers

 custom data handlers - for grouping, composition, and publication grouping

Source and Target Adapters


Source adapters prepare data from a source system for easy load into the Data Hub. Target adapters
are responsible for sending data to a target system and reporting back the results.
Source Adapters

The source system adapter receives data from a source system and feeds it to the Data Hub in a
standardized way.

Some kinds of data, such as IDocs, require additional preparation to be efficiently processed. The
Data Hub SAP integration extensions help handle IDocs.

Target Adapters

After data is processed by the Data Hub, it is sent to a target system through the target system adapter
as part of the publication phase.

Data Handling During Composition: An Example
When raw item data enters the composition process, it is grouped by specific attributes and then
composed into canonical items. This work is done by handlers (both out-of-the-box handlers and
custom Data Hub extensions). There are two kinds of handlers in the composition process: grouping
handlers and composition handlers.

Grouping Handlers

Grouping handlers are responsible for determining what canonical items are going to be created
given an input set of raw items. The Data Hub ships with two default grouping handlers. One handler
groups by type and the other by primary key. Custom handlers can be added by the user to perform
more sophisticated data handling operations such as filtering.
Composition Handlers

The Data Hub ships with three default composition handlers that create localized values, collections,
and simple strings. Custom handlers can be added by the user to perform more sophisticated data
handling operations. Composition handlers put the grouped raw item data into canonical form.
Shaping the raw items includes the following:

 Populating the canonical data fields

 Handling empty data fields

 Merging data from several fields into one


Composition handlers are executed after grouping handlers in the composition phase, and the
resulting canonical items persist in the database.

Data Hub Extensions
Data Hub includes an engine for data management and
transformation, and a framework for building data integrations. With
a Data Hub extension you can construct these data integrations and
influence the functionality of the Data Hub engine.

Extensions are compiled as .jar files and placed on the Data Hub class path. They are loaded when
the Data Hub starts before it performs any data handling functions. Extensions are not loaded
haphazardly. Some extensions, like grouping or composition handlers, have explicit Spring
properties defined to control when the handler runs in relation to other handlers. Other extensions
have dependencies upon each other, so they only load after their defined predecessors. You control
loading order by defining the dependencies in the extension XML. Data Hub automatically resolves
these relationships during the extension loading process.

Data Hub Ships with Four Default Extensions

 Test Adapter

 File Adapter (an outbound adapter)


 CSV Inbound Adapter

 SAP Commerce Outbound Adapter

How an Extension is Built

1. The only required element in an extension is the XML file. It defines the data structure.

2. Optionally, you can write custom Java code for your extension when you want it to do more
than an XML file can accomplish. Custom composition handlers are a good example. You
may want to do more to the data during composition than the default handlers do. The simple
solution is to create your own composition handler and place it
in /opt/datahub/extensions.

3. Optionally, you can use a Spring XML file. If you create a custom composition or grouping
handler, you must create a Spring XML file, because that is where you set the processing
order property. The name template for a Data Hub Spring file is your-extension-name/datahub-extension-spring.xml, as sketched below.
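The following is a minimal sketch of such a Spring file, assuming a hypothetical handler class com.example.datahub.MyCompositionHandler; how the processing order is actually wired (for example, through an order property on the bean) can depend on your Data Hub version:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- Registers the custom composition handler so that Data Hub loads it together with the extension. -->
    <bean id="myCompositionHandler" class="com.example.datahub.MyCompositionHandler"/>

</beans>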

Summary: Concepts
In this topic area, you reviewed a more detailed data flow diagram for
the Data Hub. You were introduced to some of the basic data handling
structures, and learned some fundamental information about
extensions.

The following areas are the expected Learning Outcomes for this topic area:

 You have a much better understanding of the overall Data Hub data flow

 You have some familiarity with the Backoffice Data Hub Cockpit, and can relate the features
there to the data flow concepts you have learned

 You have been exposed to some of the Data Hub data manipulation tools and you understand
their basic purpose
 You understand the concept of an extension. You have seen what is used to create them, and
how Data Hub incorporates them.

The core of Data Hub functionality is its ability to manipulate data. One of the ways to manipulate data is by applying transformation expressions.

In this topic area, you use the extension.xml file from the Hello World tutorial to manipulate the
data during the publication phase. Data manipulation is achieved using various transformation
expressions that you learn about here.

Working with the Extension XML File
The extension.xml file is the most important ingredient of your Data
Hub extension. This file defines the data model and relationships of
each of the three phases of the Data Hub workflow (load, compose,
and publish). It describes your data to Data Hub.

The extension.xml file uses an easy-to-understand XML dialect that allows you to define a data
model for your project. This data model describes the internal structure of raw items, canonical
items, and target items within Data Hub. It also allows you to specify the necessary transformations
the data must undergo on its journey from raw to canonical, and then to target items suitable for
publishing to your target systems.

It is considered a best practice to maintain distinct extension.xml files for raw, canonical, and target items. This separation of data models for the different item types makes your data models easier to manage. Data Hub loads the extension files together during start-up and initialization.
An extension.xml file is the minimum requirement for a custom Data Hub extension. You can easily create custom extension.xml files by modifying existing examples, or creating new files using your favorite XML editor or IDE.
The following sections briefly outline the XML dialect for each of the item types described by
your extension.xml files, with examples. The examples are taken directly from the Hello
World tutorial you completed earlier. In the tutorials later in this chapter, you then modify some of
the attributes of these files to customize your data transformations.

A raw item represents the structure of the unprocessed data entering the Data Hub.
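As a rough sketch of the XML dialect (the element names are illustrative and may differ from the metadata schema of your Data Hub version), a raw item definition looks something like this:

<rawItems>
    <item>
        <type>RawProduct</type>
        <attributes>
            <!-- Each attribute corresponds to a key in the key-value pairs loaded into the feed. -->
            <attribute>
                <name>identifier</name>
            </attribute>
            <attribute>
                <name>name</name>
            </attribute>
            <attribute>
                <name>unit</name>
            </attribute>
        </attributes>
    </item>
</rawItems>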

The canonical data model is independent of both source and target systems. You therefore should not
need to modify it, even when changing the source or target.

The target item XML file differs only slightly in structure from the raw and canonical extension XML files. Data Hub can publish to more than one target, and each target can have a unique schema. You can therefore have multiple target data models. For the purposes of this chapter, there is only one target system defined in our file.

The following are specific to target items:

 The <transformationExpression> allows you to transform your data in some way. It can be done using simple mapping, similar to the way you map raw to canonical items. In that case, you simply provide the canonical item attribute name from which the item is to be sourced. Alternatively, you can use the Spring Expression Language (SpEL), and the custom SAP Commerce resolve() function, for more complex transformations.

 The <exportCode> provides the name of the target item attribute into which the canonical
item data is to be placed. Here, once again, you can simply provide a name,
but exportCode can also include an ImpEx expression.
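To make these elements concrete, the following is a minimal sketch of a target item definition. Only <transformationExpression> and <exportCode> are taken from the description above; the surrounding element names and the SpEL expression are illustrative and may differ from the schema of your Data Hub version:

<targetItems>
    <item>
        <type>TargetProduct</type>
        <exportCode>Product</exportCode>
        <canonicalItemSource>CanonicalProduct</canonicalItemSource>
        <attributes>
            <!-- Simple mapping: the canonical attribute "identifier" is written to the ImpEx column code[unique=true]. -->
            <attribute>
                <transformationExpression>identifier</transformationExpression>
                <exportCode>code[unique=true]</exportCode>
            </attribute>
            <!-- SpEL mapping: builds the target value from more than one canonical attribute. -->
            <attribute>
                <transformationExpression>name + ' (' + unit + ')'</transformationExpression>
                <exportCode>description</exportCode>
            </attribute>
        </attributes>
    </item>
</targetItems>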

Data Hub supports the Spring Expression Language. Using SpEL, you can apply more complex transformations to your target data items.

Tutorial: Overriding Target Item Definitions
Data Hub allows you to override an existing target item definition in a
custom extension.
Summary: Basics of Manipulating Data
In this topic area, you learned the important role of the extension.xml
file, and were able to use it to manipulate data in a simple way.

The following areas are the expected Learning Outcomes for this topic area:

 You understand that the extension.xml file is at the heart of Data Hub data structures, and you are familiar with its layout.

 You know how to manipulate data using basic transformation expressions.

 You understand how to override an existing item definition.

Build Essential Data Hub Knowledge
Once you've completed Start Your Journey, you're ready to dive in a
little deeper. Here is where you build a solid foundation of the
essential Data Hub concepts and processes.

Build Essential Data Hub Knowledge is aimed at anyone needing to understand at a deeper level how Data Hub works. The target audience includes project managers and solution architects, for whom this level of knowledge may be sufficient. But it is also essential learning for the experienced extension developer who wants to prepare Data Hub for a production environment. Build Essential Data Hub Knowledge gives you command of all the foundational concepts you need to start building custom data integrations. You must completely understand the concepts presented here before moving on to the final section, Master Your Data Hub Project.

A Generic Use Case


A typical Data Hub integration consists of several components,
including adapters, data models, data handlers, and configuration
files. Take all of these components into account when considering
your needs for a successful Data Hub deployment.

Careful planning is required to undertake a successful data integration project using SAP Commerce
Data Hub. Understanding what components are required for the needs of your project is the first step
in this planning. Certain components are mandatory, and for some SAP Commerce provides an out-
of-the-box solution. These out-of-the-box components are shipped together with the SAP
Commerce software package. Other components are purely optional, their inclusion being dictated
solely by the needs of your project.

The following figure provides a complete end-to-end overview of a Data Hub integration. Use this
overview and subsequent explanations to gain some insight into which components may be relevant
to you. If a component is optional, it is indicated in the description. You can find links to further
information after each description.

Source System Adapter - The source system adapter provides an integration between the source
system and Data Hub. The source system adapter is an essential part of your data integration solution
because it is required to import raw, unprocessed data from the source system to Data Hub. The
REST endpoint used to import data corresponds to the Data Hub integration channel and data feed. A
more complex adapter may include event responses and data preprocessing capabilities. If your
source system is SAP Commerce or SAP ERP, then you can use one of the provided, out-of-the-box
adapters for this purpose.

Inbound Channel - Data Hub provides a single inbound channel that is a Spring Integration channel.
Spring Integration is the preferred method for importing data into Data Hub. The CSV inbound
adapter provides a convenient method for data already in CSV format. It sends the data along to the
Spring Integration channel.

Data Feeds and Pooling Strategy - Do you need more than one feed or pool for either concurrent
processing or data segregation? Data Hub provides a single feed and a global pool by default. If you
need more than one feed or pool, you can use either a named pool or new pool per input pooling
strategy. Using a strategy other than the default feed and global pool is optional.

Data Model - The data model defines how raw, canonical, and target data items are structured, and
includes metadata that assists with composition and publication. Typically, you create separate XML
files for RawItem, CanonicalItem, and TargetItem definitions. The target item XML also includes
target system definitions. The data model is an essential component of your Data Hub integration
project.

Data Handlers - Data handlers are classes that are used to execute data transformations, including
grouping, especially during the composition phase of the Data Hub workflow. Data Hub is shipped
with a set of default grouping and composition handlers. You may develop your own dedicated
handlers in addition to the provided defaults. Handlers can run in any order and you can insert as
many handlers as you need. Publication grouping handlers are optional, but provide powerful data
transformation possibilities during the publication phase.

Target System Adapter - The target system adapter is a means to deliver data to target systems and
gather results. The target system adapter is the normal endpoint for the Data Hub publication phase.
It does the final packaging of target items for delivery, then forwards them to the target system. If
your target system is SAP Commerce, or either SAP ERP or SAP C4C, then you can use one of the
provided out-of-the-box adapters.

Configuration - Data Hub provides a number of configuration possibilities, including parameters related to performance, the database, security, and integration with external systems through the
adapters. Although not all configuration options are mandatory, you need to provide some essential
basic configuration for your integration scenario. Essential configuration includes credentials for
connecting to the Data Hub database, and for use with basic authentication. In addition to providing
basic configuration for your project in the local.properties file, you may choose to create a
custom Spring Security profile. This profile is used for the protection of Data Hub REST endpoints
using HTTP Basic Authentication.

In addition to the out-of-the-box solution described above, you may also find examples of solutions
for specific test and real-world scenarios in the Solution Book.
Data Hub – SAP ERP Integration Use Case
Data Hub plays a vital role in simplifying the data integration between
the SAP Commerce platform and SAP ERP systems.

As a part of the SAP ecosystem, SAP Commerce is the natural choice for companies already heavily
invested in SAP back-end systems. This includes companies who now also wish to expand their
footprint into digital commerce channels. Extending SAP ERP to interchange data with SAP
Commerce ordinarily creates challenges for data management. Leveraging the Data Hub and the out-
of-the-box SAP Commerce-SAP integration, companies have the solution to simplify, extend, and
develop these vital data integrations.

A Simple Use Case Overview

Companies have invested considerable resources and time into implementing their SAP ERP systems
and customizing them to suit their particular business needs. It is natural, then, to want to build upon
this system and its data that forms the basis of critical business processes. Using the Data Hub,
transactional, customer, pricing, and product data can be bi-directionally integrated between SAP
ERP and SAP Commerce.

SAP Commerce-SAP ERP Integration Overview


When a customer places an order in SAP Commerce, the customer and order details are stored in
the SAP Commerce database. The data is then propagated to Data Hub by the Data Hub Adapter core
extension. Data Hub processes the data, mapping it to SAP customer, order, product, and other data
fields, then exchanges the data with the SAP ERP system using SAP-native IDOC format. With
asynchronous integration, the processing of data has no impact on the responsiveness of the SAP
Commerce front end. Data Hub ensures data integrity between front and back end systems, while the
customer experience is preserved.

The SAP ERP system then handles this data according to its pre-defined business processes. It sends
an order confirmation back to SAP Commerce through Data Hub in a similar process. Later it may
update the customer view in SAP Commerce periodically with the order status or delivery
notifications.

The SAP ERP system also serves as an authoritative master data source for pricing information. The
pricing information includes campaigns and discounts, product information, and inventory, as well as
anticipated future stock. Data Hub can propagate this product and pricing data from SAP
ERP to SAP Commerce as it evolves, ensuring customers always have the most up-to-date view. In
addition, the event-driven architecture of Data Hub ensures that the propagation happens in near-real
time.

The SAP Integration Workflow

Data Hub is shipped with out-of-the-box SAP integration extensions. These extensions provide a
path to move data to and from SAP ERP and Data Hub using the native SAP IDOC format.

Support for SAP integration includes the extensions to receive inbound data and send outbound data,
and also a complete set of data model extensions. These extensions provide for the mapping of raw
data fragments to canonical items, and then to target data for SAP systems. Data models are also
provided for data flowing from SAP ERP to SAP Commerce. Components marked in color in the
following graphic represent current out-of-the-box supplied integration extensions.
The Integration Workflow for Data Flowing from SAP Commerce to SAP ERP

With data flow in both directions, it is possible to deploy the SAP Commerce solution as the
responsive, transactional front-end. SAP ERP remains the authoritative master data source. Some
customization may be necessary to adapt this workflow to specific environments. However, the SAP
Commerce software package includes out-of-the-box adapters and handlers for this scenario. The
complete package greatly simplifies the effort involved by reducing both time-to-market and total
project costs.

Data Integration in a Commerce Environment

Data Hub is able to integrate and consolidate various types of data between SAP ERP and commerce
front-end systems. Types of data that may be integrated include the following:

Customer Data
With asynchronous integration, customer data is replicated between the SAP Commerce solution
and SAP ERP. When a customer submits an order, there is the option to create a profile in SAP
Commerce. The unique customer data, along with preferences and transaction history, is stored
in SAP Commerce. It is also replicated to the SAP ERP with an asynchronous call along with the
order data. The replication enables a responsive customer experience while ensuring all master data
in the SAP ERP is up-to-date. Existing customer data in the SAP ERP can similarly be replicated
to SAP Commerce using either a bulk data transfer or a change event trigger.

Order Data

Submission of a customer order from SAP Commerce to SAP back-end systems is a critical part of
the SAP Commerce-SAP integration solution. Asynchronous order updates may be triggered by a
periodic cron job or an event. The ability to pause between the submission of the order and update to
SAP has the added benefit of allowing a window of time for updates to the order. This window could
be used as a "buyer's remorse" period, for example.

Product Data

In most cases, considerable time has been invested in specifying, organizing, updating, and
classifying product data in the SAP ERP. Such product data may be imported into SAP Commerce.
Once there, it is available in the SAP Commerce Product Information Management (PIM) solution
through the SAP Commerce Product Cockpit. It is also available to customers during the search and
transaction process. This import can be done in bulk, either manually or triggered by an automated
process. It can also be updated at the time of a customer transaction as described previously. PIM
plays a vital role in the overall MDM strategy.

Pricing Data

SAP Commerce provides extensive support for B2C pricing scenarios. In these scenarios, a single price for a product applies to all customers, and discounts or campaign strategies are applied across the board. In such a scenario, basic pricing data, including discounts and campaigns, can be stored in SAP Commerce. The information is then updated to SAP ERP on a per-transaction basis with the customer order.
In a B2B scenario, pricing is determined on a per-customer basis depending on various complex factors. It therefore may be the SAP ERP system that is best equipped to be the source of the master pricing data. In such a scenario, you may opt for synchronous integration between SAP ERP and SAP Commerce. Whether synchronous integration makes sense also depends on how often these prices are updated. An asynchronous integration solution generally means faster response times, decreased load on the SAP ERP back end, and a better customer experience.

Inventory Data

Your SAP ERP system is ordinarily the master data store for inventory, including availability,
shipping locations, back orders, and the number of items in stock. The business processes already in place there are not replicated in SAP Commerce. How and where to calculate inventory data is a
decision based on factors such as performance and customer satisfaction. The solution may be
implemented either in the SAP ERP or the SAP Commerce solution. It is then presented to the
customer through the SAP Commerce product details page, at checkout, or through the order history.

Order Status

After submitting an order, both B2C and B2B customers want to track the status of their order. The
order status data stored and updated in SAP ERP is potentially vast as notifications are issued at
every stage of the order fulfillment process. How and whether each of these statuses is mapped to order status updates in SAP Commerce is a matter of choice in your implementation. Out of the box, Data Hub provides support for data mapping using one of several pre-built extensions. As with other aspects of the SAP ERP integration scenario, these extensions may be customized to suit your business processes and customers' needs.

A More Detailed Data Flow Model

Getting data from SAP Commerce to SAP ERP and back again involves more than simply inbound
and outbound transport adapters. Data Hub ships complete with a dedicated suite of extensions for a
typical SAP Commerce-SAP integration use case. Each of the extensions has a specific role to play
in the preparation and manipulation of data on its path towards and through Data Hub.
The data flow models in the following diagram outline this path, and the role of the various
extensions in more detail. Extensions indicated with blue titles are standard SAP Commerce or Data
Hub extensions, while the yellow titled ones are for SAP integration use cases. All the extensions
included in these data flow models can be shipped together with Data Hub. They can be used out-of-
the-box as the foundation for a SAP Commerce-SAP data integration. Having out-of-the-box
solutions greatly simplifies and reduces the effort involved in your integration project.

Publishing a customer order from SAP Commerce to ERP


Data Hub – SAP C4C Integration Use Case
Data Hub plays a vital role in the data integration between SAP
Commerce and the SAP Cloud for Customer solution. It transforms
essential customer data, so the data can be used by the back-end customer support, sales, and service platform.

SAP Cloud for Customer (C4C) provides a complete solution for targeted customer support, sales,
and marketing that goes well beyond traditional Customer Relationship Management
(CRM). C4C allows sales and customer service teams to provide a relevant customer experience at
every stage of the customer journey. However, in a commerce environment, it is the front-end web
shop that is the first point of contact for customers. It is also the primary repository for customer
data, which is vital to enabling this targeted customer experience. With Data Hub, customer data
created in the SAP Commerce solution can easily be transferred to SAP C4C for use during sales,
marketing, and support activities.

A Simple Use Case Overview

With SAP Commerce and Data Hub, customer data integration with SAP C4C is unidirectional.

Customers create or update their information in the SAP Commerce solution, either within or outside
the context of a transaction. The data change is detected by the SAP Commerce y2ysync extension.
The relevant customer data, including billing and shipping addresses and contact details, are then
written to an ImpEx file to be collected by Data Hub. Data Hub processes this data through its
standard workflow and delivers target data items suitable for C4C over a SOAP interface.

In a telecommunications environment, for example, the SOAP interface allows the C4C system to utilize the SAP Customer Telephony Integration system. The SAP Customer Telephony Integration
system identifies the incoming customer telephone numbers and reroutes the customer to the most
suitable call-center and CS agent. In other commerce environments, it may be information such as
shipping or billing addresses that provide key information. The information helps CS agents provide
appropriate, targeted customer service.

The C4C Integration Workflow


Data Hub comes shipped with a complete set of out-of-the-box C4C integration extensions. Support
for C4C integration includes raw, canonical, and target data extension files. These extensions provide the mapping from one stage to the next. There are also dedicated adapters for receiving inbound and
delivering outbound data. Components marked in color in the following graphic represent current
out-of-the-box supplied integration extensions.
Data Integration in a C4C Environment

The y2ysync Commerce extension is able to detect the delta of any customer data changes in SAP
Commerce. It is possible to send only that delta to Data Hub. However, SAP C4C requires the
complete set of data for a single customer to be able to update the customer entry. Therefore
the C4C integration model includes all customer and address data related to any change. The
following types of data are integrated:

Customer Data

Customer Data includes names, addresses, IDs, payment addresses, and shipping addresses. During
the data transformation process, Data Hub adds new fields required by C4C that are not present in
the SAP Commerce platform. Fields might include items such as gender or form of address, which are not necessary for an online commerce platform but are useful in customer service. It additionally splits
or joins fields, such as address or name fields, where a single field maps to multiple fields in the
target system. It also works in the opposite way where there is only one field in the target system for
data in multiple fields in the source.

Address Data
Address data includes customer contact details, and also such items as shipping and billing addresses.
In SAP Commerce, billing and shipping addresses are closely tied to the customer persona. In C4C,
such details are distinct. The data mapping for C4C integration splits SAP Commerce address data,
transferring it from the CustomerItem to the AddressItem and splitting or combining fields as
necessary.

A More Detailed Data Flow Model

Getting customer data from SAP Commerce to SAP C4C involves various extensions and adapters. Data Hub is shipped with a dedicated suite of extensions for a typical SAP Commerce-
C4C integration use case. Each of the extensions has a specific role to play in the preparation and
manipulation of data on its path towards and through Data Hub.

The following data flow model outlines this path, and the role of the various extensions, in more detail. Extensions indicated with blue titles are standard SAP Commerce or Data Hub extensions. Dedicated C4C use case customer data integration extensions are shown with yellow titles. All the extensions included in this data flow model are shipped together with Data Hub. They can be used out-of-the-box as the foundation for an SAP Commerce-C4C integration. Having these extensions out-of-the-box greatly simplifies and reduces the effort involved in your C4C integration project.
Basic Aspects of Load, Compose, and Publish
What happens to data as it passes from the source system
through Data Hub to the target system? This more detailed overview
of the Data Hub workflow shows how Data Hub handles data end-to-
end.
This topic area discusses the three main phases the Data Hub passes through when handling data—
the load phase, the composition phase, and the publication phase. These phases are tightly
intertwined, and there are well defined pathways that the data takes from one phase to the next. With
a sound grasp of how these phases handle data, you have a deeper understanding of how Data
Hub actually works.

When you are finished with this topic area, you should understand how Data Hub loads raw data fragments. You should have a more detailed view of the composition of data into canonical items using both grouping and composition handlers. Finally, you should be aware of some of the key aspects of publishing data to target systems.

Data Flow Overview


What happens to an item as it moves from a source system through
the Data Hub to a target system?

1. Data is loaded into the Data Hub through an inbound extension (for example, the CSV
inbound extension). The extension converts the data into a standardized format and passes it
to the exposed Spring Integration inbound channel. At this point, the Data Hub converts the
data from the inbound channel into raw items and makes them available for further
processing.

2. The raw item begins the Composition phase by being copied. The copy of the raw item then goes through Grouping. During Grouping, the raw items are arranged by canonicalItemType or by primaryKey. If custom Grouping handlers exist, the raw items may also be arranged in other ways. The grouped raw items are then composed into canonical items, attribute by attribute, according to the following rules:

 Multiple values combine into a set. Sets occur when a given attribute on a type is a collection of values as opposed to a single value. Sets often occur in one-to-many relationships.
If an attribute has a single value that cannot be localized and cannot be a member of a set, the following transformation rules apply:

 The attribute is populated based on matching attribute names between the RawItem type and the CanonicalItem type.

 A reference transformation is used when the raw attribute name does not match the canonical attribute name. To implement this logic, populate the <expression></expression> tag in the XML file with the name of the raw attribute that supplies the desired value.

 Use a SpEL transformation if a basic reference attribute does not suffice. The SpEL expression is specified in the <expression spel="true"></expression> tag. A SpEL expression can be used to construct a value made up of zero or more reference attributes, as well as any other valid SpEL expression (see the sketch after this list).

 You can add custom Composition handlers that further manipulate the Grouped raw
fragments. After all of the handlers are done, the Grouped raw item has been composed into a
canonical item.


3. The canonical item is published to a target system. During publication, the publication phase uses the target.xml file to transform the canonical data into target-compatible output. Then the target system adapter accepts the target item from the publication process and passes it to the target system.
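As a minimal sketch of the reference and SpEL rules above (only the <expression> element is taken from the text; the surrounding element names are illustrative and may differ from the metadata schema of your Data Hub version), two canonical attribute definitions might look like this:

<attribute>
    <name>itemName</name>
    <transformations>
        <transformation>
            <!-- Reference transformation: the raw attribute "name" supplies the value. -->
            <expression>name</expression>
        </transformation>
    </transformations>
</attribute>
<attribute>
    <name>displayName</name>
    <transformations>
        <transformation>
            <!-- SpEL transformation: combines two raw attributes into one canonical value. -->
            <expression spel="true">name + ' - ' + description</expression>
        </transformation>
    </transformations>
</attribute>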
Load
Data Hub provides several methods for loading data. The primary and
recommended method is through the Spring Integration Channel, but
there is also a service for loading source data in CSV format.

What Happens to Data as It Loads

Data is loaded into Data Hub as key-value pairs. With CSV data, which is already a key-value
mapping, this is straightforward. With other data input formats, some processing occurs to extract the
key-value pairs.

Each new batch, file, or stream is considered a new data loading action. All data in the provided
batch, file, or stream, must be completely processed before the data loading action completes. As
each key-value pair is loaded, it is recognized as a valid raw fragment and assigned a PENDING
status. Only raw items with the PENDING status, and a completed data loading action, are eligible
for composition.

CSV Web Service Extension

The goal of the CSV Web Service extension is to simplify the load process of raw items into the Data
Hub. It is possible to make REST calls to the CSV web service to load the data. The REST call
resource contains the name of the raw item where you want to load data. The request body contains
the attribute names and data that you want to load.

CSV Header

The first line in the CSV body is called the header. The header of the CSV corresponds to the
attribute names of the raw item where you are trying to load data. The attribute names should be
comma separated with no spaces. For example, if you want to load data into a raw item that has the
raw attributes: identifier, name, description, and unit, you would place the following header in the
request body:

identifier,name,description,unit

The order of the attributes must match the order of the data.

CSV Body

The following rows in the CSV body include the data you are loading. Each row corresponds to
exactly one raw item. Following the previous example, if you have the raw attributes: identifier,
name, description, and unit, and you want to load three raw items, the corresponding CSV body
includes the header plus the values for each attribute:

identifier,name,description,unit

0,pants,wear it to cover your legs,pieces

1,shorts,kinda like pants but shorter,pieces

2,t-shirt,wear it to cover your torso,pieces

In this example, you create three raw items. Each value in a column corresponds to the value of the
attribute in the same column. So, for item 0, the value for identifier is 0, the value for name is pants,
and so on.

A blank value for an attribute implies that this attribute should be ignored. In this case, the value of
the attribute does not change. If this attribute value has never been set, it remains the default, and, if
it has been set before, the value is not updated. This is in contrast to an empty string, which causes
the existing attribute value to be replaced. In the following example, the value for description of item
0 is not set, and will therefore be ignored.

identifier,name,description,unit

0,pants,,pieces
1,shorts,kinda like pants but shorter,pieces

The CSV Web Service Extension
The CSV Web Service extension provides one of several paths to load
data into Data Hub. With the CSV Web Service, you can provide CSV-
formatted data in the body of REST requests.

The goal of the CSV Web Service extension is to simplify the load process of raw items into Data
Hub. It is possible to make REST calls to the CSV web service to load data into Data Hub. The
REST call resource contains the name of the raw item you want to load data into. The request body
contains the attribute names and data that you want to load.

REST Call Resource

The REST call resource contains the feed name, and the name of the raw item where you want to
load data. For example, if you want to load data into a raw item called RawProduct, you call the
following resource:

POST -
http://{host:port}/datahub-webapp/v1/data-feeds/{feed_name}/items/RawProduct
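For example, assuming the DEFAULT_FEED and a RawProduct item with the attributes used earlier, a complete request might look like the following (host, port, credentials, and the exact headers are placeholders that depend on your installation):

POST http://localhost:8080/datahub-webapp/v1/data-feeds/DEFAULT_FEED/items/RawProduct
Content-Type: text/csv
Authorization: Basic <base64-encoded credentials>

identifier,name,description,unit
0,pants,wear it to cover your legs,pieces
1,shorts,kinda like pants but shorter,pieces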

Localized Attributes

If you want to add values for attributes that are localized, the item should span multiple lines - one
line per locale. You must include the isoCode attribute when loading localized attributes. You must
also include the attributes that are defined as a primary key in each line. Let's assume, for example,
that identifier is the only attribute that is defined as a primary key in the metadata. You then must
include it in every line. Including attribute values for non-primary key attributes is optional. For
example, if you want to load item 0 with the name and description in English, German, and French,
you load the following CSV content:

identifier,isoCode,name,description,unit

0,en,pants,wear it to cover your legs,pieces

0,de,Hose,"tragen Sie es, um Ihre Beine zu bedecken",

0,fr,pantalon,porter pour couvrir vos jambes,

Note: In the snippet above, the German version uses quotation marks to escape the comma in tragen Sie es, um Ihre Beine zu bedecken. Also note that the unit value, pieces, is left blank in the German and French lines; because blank values are ignored, the unit set by the first line still applies.

Collection Attributes

Loading values for attributes that are defined as collections is similar to loading localized attribute
values. However, there is no need for the isoCode attribute to be present. For the following example,
you add the attribute category, which is defined as a collection in the Canonical model. You also
assume that identifier is the only primary key attribute. If you want to load several categories for
your item 0, you must include one category per line.

identifier,category,name,description,unit

0,clothes,pants,wear it to cover your legs,pieces

0,wearables,,,

0,accessories,,,

Emptying Attribute Values After They are Set

You can "empty" attribute values in SAP Commerce after they have been populated with values.
 Basic Attributes - In the case where you want to reset a value of an attribute back to its
default, you must use a special string reserved for the command to clear a field. By default,
the string <empty> instructs Data Hub to empty an attribute value. For example, if you want
to clear out the attribute unit for item 0, you use the following CSV:

identifier,unit

0,<empty>

 Localized Attributes - If you want to empty an attribute value for a localized attribute, you
must specify the isoCode of the language you are trying to clear. For example, if you would
like to clear the English name for item 0:

identifier,isoCode,name

0,en,<empty>

 Collection Attributes - There is no way to clear specific elements of a collection; however, it is possible to clear out the entire collection. To empty the entire collection, just include the string <empty>. To empty the category collection for item 0, include the following CSV content:

identifier,category

0,<empty>

Configuring the <empty> Property

You can configure the empty property csv.empty.value to be a different value. The default
value is <empty>, but you can easily configure it to be a different string in
the local.properties file.
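For example, to use the token <null> instead of <empty>, you might add the following line to local.properties:

csv.empty.value=<null>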
Return Codes

 If all attributes in the CSV header are defined for the raw item where you are trying to load
data, a 200 OK code is returned.

 At least one of the attributes in the CSV header must be a valid attribute of the raw item. The data for the attributes that are defined correctly for that raw item is loaded and persisted. However, the data that corresponds to the attributes that are not defined for that raw item is ignored, and a 200 OK code is returned.

 If none of the attributes in the CSV header exist for that raw item, a 400 Bad Request code
is returned from Data Hub. The message is No valid attribute names exist in csv header
for type 'MyRawType'.

 If the raw item type you are trying to load data into does not exist, or is not a subtype of raw
item, a 400 Bad Request code is returned. The message is InvalidRawType type is not a
subtype of RawItem.

 If the feed name you are trying to use does not exist, a 400 Bad Request code is returned.
The message is Invalid feed name specified - InvalidFeedName.

The SAP Commerce Test Adapter Extension
The SAP Commerce Test Adapter enables you to quickly set up Data
Hub instances for testing. It intercepts Data Hub publications that
would ordinarily be sent to a HybrisCore target. You can then
complete the full load, composition, and publication cycle without the
need to install and configure SAP Commerce.

The SAP Commerce Test Adapter is available as an optional extension for Data Hub. To install it,
simply place the extension in the /opt/datahub/extensions/ directory. No further configuration is
necessary. The Test Adapter automatically intercepts any publication being sent to
a HybrisCore target.
Data Hub Test Adapter provides the following benefits:

 It enables you to test all Data Hub processes in isolation.

 When used in performance testing, it removes the extra latency and processing time of
publishing to SAP Commerce, thus providing more accurate results.

 It can optionally write the generated ImpEx to a file on the file system.

Tutorial: Using the SAP Commerce Test Adapter
Here you learn how to use the SAP Commerce Test Adapter to
intercept publications to a SAP Commerce Core target system. You
test the export of data to a file in ImpEx format.

The Data Hub Adapter


Data Hub Adapter is an SAP Commerce extension that
links Commerce Platform to Data Hub. This extension is necessary to
integrate SAP Commerce with the high performance, flexible, data
loading capabilities provided by Data Hub.

Data Hub provides a stand-alone, platform-independent environment in which data from multiple data sources, such as an SAP ERP system, databases, online catalogs, or other business systems, is consolidated and prepared for loading into one or more instances of SAP Commerce. To import data from this platform-independent Data Hub to an SAP Commerce installation, SAP Commerce requires
the Data Hub Adapter. Data Hub Adapter is supplied out of the box, along with the SAP
Commerce software bundle. It provides a single point of communication between Commerce
Platform and Data Hub.
Install this extension if:

1. You are going to utilize Data Hub for extraction and preparation of data for your SAP
Commerce installation.

2. Your target system is based on Commerce Platform.

To install Data Hub Adapter, you first include it as an extension in your SAP Commerce build. Then
configure the endpoints and authorization credentials on both Commerce Platform and Data Hub.

Summary: Default Data
Hub Adapters
The following areas are the expected Learning Outcomes for this topic area:

 You have an understanding of the difference between an input channel and an adapter.

 You can use the CSV Web Service for basic data load using a REST call.

 You are aware of the available out-of-the-box adapters shipped with Data Hub.

 You can set up and use the Data Hub Test Adapter.

 Optionally, you can set up and use the Data Hub Adapter with a local SAP
Commerce installation.
In the Next Topic Area...

You receive:

 A more advanced overview of the various methods used to transform data in Data Hub

 An opportunity to try more advanced transformation rules such as SpEL expressions

 A detailed overview of grouping and composition handlers

Transforming Data
The Data Hub provides many default tools to simplify data
transformation. In addition, you can create custom tools that make
your options endless.

In this topic area, you build upon what you have already learned about transforming data using XML
and handlers. By the end of this topic area you should be able to create your own Grouping and
Composition handlers. You are also able to transform data using the extension XML.

Using Extension XML to Transform Data
There are several techniques within Data Hub for changing data. The techniques available within an extension XML file are described here.

You can use simple expressions, SpEL expressions, and/or custom SpEL expressions provided
by SAP Commerce to modify, alter, or disable XML attribute definitions.
Using a Simple Expression

Using SpEL Expressions in Extension XML

Using the resolve Function

Overriding or Disabling Canonical Attribute Definitions
Sometimes your canonical extension setup just needs a minor adjustment to work fine for a different
job. Rather than recreate the entire extension, you can create a custom extension that modifies your
original extension definition. In the custom extension, you use the XML override attribute to modify
the transformation expression of the canonical attribute or the XML disable attribute to ignore the
target attribute altogether. In the following code snippet, you see <attribute override="true">. You can now change the values of the transformationExpression, and those changes are adopted by the canonical extension. In a similar way, if your XML said <attribute disable="true">, the transformationExpression of this canonical attribute is completely ignored.
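The referenced snippet is not reproduced here, but the following sketch illustrates the idea. The element nesting, the hypothetical extension and type names, and the exact spelling of the transformation expression element are assumptions based on the tags described elsewhere in this documentation; check them against your own extension XML.

<extension name="myCustomExtension">
    <dependencies>
        <!-- Assumption: declare a dependency so this extension loads after the original one -->
        <dependency>myOriginalExtension</dependency>
    </dependencies>
    <canonicalItems>
        <item>
            <type>CanonicalProduct</type>
            <attributes>
                <!-- override="true": replace the original transformation expression -->
                <attribute override="true">
                    <name>name</name>
                    <transformationExpression>description</transformationExpression>
                </attribute>
                <!-- disable="true": ignore this canonical attribute definition entirely -->
                <attribute disable="true">
                    <name>unit</name>
                </attribute>
            </attributes>
        </item>
    </canonicalItems>
</extension>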

It is essential the custom extension containing the override or disable XML attribute loads after the
original extension. To ensure the proper load order, define a dependency in the custom extension for
the original extension.

Using Handlers to Transform Data
A handler is a component of a Data Hub extension that provides more
powerful data transformations than the extension XML. It implements
the methods used to group and transform data items as they move
from one stage of the Data Hub workflow to the next.
Pre-requisites: For additional background or context regarding the following material, see Data Hub
Extensions.

While you can describe some transformations with the extension XML, a handler provides
opportunities for implementing advanced logic using Java. Handlers are most important during the
composition phase, but may also be used during publication. A handler is typically a class file
included in your extension project that implements the relevant handler interface. It extends the
default abstract handler classes, or adds new process logic to the data chain.

There are three different types of handler, each executing a vital part of the data integration process:
grouping handlers, composition handlers, and publication grouping handlers. Data Hub is shipped
with a set of default grouping and composition handlers that provide logic essential to its workflow.
You can also add your own handlers to perform custom transformations.

Grouping Handlers

Grouping handlers group raw item fragments according to specified attributes. These may be
attributes such as canonicalItemType or primaryKey. For example, a grouping handler that
groups by canonicalItemType may place all raw item fragments related
to canonicalProduct in one group, and all item fragments related to canonicalVariant in
another. A grouping handler that groups by primaryKey would look for all raw item fragments
with the same primary key and group these such that the result represents a single row in the
database. Grouping handlers work on copies of the original raw items, which remain in the database
with their status unchanged at this stage. This grouping behavior, the first stage of the composition
process, delivers unified raw data items from the data pool ready for transformation into canonical
items.

By redefining the order of execution of the default grouping handlers, you may influence the final
result. You can also introduce new, custom grouping handlers of your own. The execution of
grouping handlers always precedes the execution of composition handlers.
Composition Handlers
Composition handlers put the grouped raw item data into canonical form. This may include
populating the canonical data fields, handling empty data fields, merging data from several fields into
one, and creating new canonical item primary keys. Composition handlers are executed after
grouping handlers in the composition phase, and the resulting canonical items are persisted in the
database. Upon successful execution of the composition handlers, the raw items, which remain in the
database, are marked with the status PROCESSED, while the resulting canonical items are given the
status SUCCESS. As with grouping handlers, you may introduce your own custom composition
handlers.
Default Handlers

Data Hub is shipped with default grouping and composition handlers. These are defined as abstract
classes, which may be extended by your own custom handlers by way of the available handler
interfaces in the SDK. These default handlers represent a simple use case involving the set of test
data shipped with Data Hub, and provide essential grouping and composition processes that form a
good foundation for many data integration scenarios. You may extend the default handlers, but it is
not recommended you override or exclude them unless you are certain this is necessary for your
project.

Custom Handlers

Custom handlers introduce new grouping or composition logic into the process chain alongside the
default handlers. They are implemented as part of your custom extension and loaded with that
extension during Data Hub startup. Because the default handlers are implemented as abstract classes,
you may easily extend or modify them with your own custom handler logic. Custom handlers need to
be registered in your Spring application context. The order property in the Spring context determines
the order in which the custom handler executes in the process chain.
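As an illustration, registering a custom composition handler in an extension's Spring application context might look like the following sketch. The bean and class names are hypothetical, and the base class or interface to extend comes from the Data Hub SDK; only the use of the order property is taken from the description above.

<!-- Hypothetical registration of a custom composition handler -->
<bean id="myCompositionHandler"
      class="com.acme.datahub.composition.MyCompositionHandler">
    <!-- Assumption: a lower order value places the handler earlier in the process chain -->
    <property name="order" value="50"/>
</bean>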

Summary: Transforming Data


In this topic area, you were provided an in-depth review of how to use
extension XML and handlers to manipulate data.

The following areas are the expected Learning Outcomes for this topic area:

 You learned the different techniques for using extension XML to manipulate data

 You learned about Grouping handlers and how they work

 You learned about Composition handlers and how they work


In the Next Topic Area...

You receive:

 You get a thorough overview of the Backoffice Data Hub Cockpit

 You learn how you can use the Backoffice Data Hub Cockpit to perform many of the
fundamental functions of Data Hub

 You learn how to view and analyze errors using the Backoffice Data Hub Cockpit

Using the Backoffice Data Hub Cockpit
The Backoffice Data Hub Cockpit is an SAP Commerce
Platform Backoffice extension built upon the Backoffice Framework. It
gives you easy and intuitive access to the major functions of Data
Hub.

The Backoffice Data Hub Cockpit is presented as a perspective alongside other Backoffice perspectives such as Administration or Commerce Search.

With the Backoffice Data Hub Cockpit installed and configured, you can do the following:

 Create new Data Hub instances

 View item counts and statuses for each of the three phases of the Data Hub workflow

 Set up new feeds and pools

 Perform load, compose, and publish actions

 Investigate any errors that may occur


Tutorial: Installing and
Initializing the Backoffice Data
Hub Cockpit
Data Hub comes with a user interface that provides the basic
functionality of Data Hub, embedded as an extension in SAP
Commerce Backoffice.

Context
Backoffice applications are created as part of the Backoffice Framework. This enables application
designers to create or modify widgets in order to customize Backoffice without writing code.
Described below is the process for installing and initializing the Backoffice Data Hub Cockpit.

Procedure

Results
You can now access Backoffice Data Hub Cockpit by going
to http://<hybris_host>:9001/backoffice/, and logging in with the Data Hub Admin
Group role. Select the Data Hub perspective icon from the drop down perspectives menu.
Tutorial: Create a New Data
Hub Instance
You can define new Data Hub instances directly in the Backoffice
Data Hub Cockpit.

Prerequisites
For this tutorial you must have both Data Hub and SAP Commerce installed and running. Backoffice Administration Cockpit must include the datahubbackoffice extension.

Procedure
1. Connect to Backoffice Administration Cockpit in your browser at the appropriate URL.

http://localhost:9001/backoffice/

After launching Backoffice, you should find yourself in the Administration perspective. If not, select it from the drop down perspectives menu. You then see a comprehensive menu and panel layout.

2. In the navigation menu on the left, perform the following steps:

a. Open the System menu option.
b. Within the System menu, select the Types menu option.

3. In the right pane, enter datahubinstance (one word) in the search box.

a. Click the Search button.
b. Click the DataHubInstanceModel called DataHub Instance.

4. Click the Search by type button.

a. Clear the search field to see all current Data Hub instances.
b. Click the plus sign.
c. A pop-up appears on your screen. Fill out the Instance URL and Instance Name; you do not have to fill out the remaining fields. Click the Done button.
d. Log out of Backoffice and log back in again, then select the Data Hub perspective from the perspectives menu.

Results
Your new Data Hub instance should now appear in the drop-down instances menu. If there is an
issue resolving the instance you just created, it appears as grayed out with a red x in the instance list.

There are two possible reasons for this situation:

 An incorrect URL.

 The Data Hub server instance is not currently running.

Tutorial: Perform a Quick Load


The Data Hub Backoffice Quick Upload area gives you the ability to
upload data in CSV format, then compose and publish the result. This
process is suitable for quickly uploading smaller data sets.

Procedure
Tutorial: Perform a Quick
Compose
You can trigger a composition of previously loaded data using
the Data Hub Backoffice Quick Upload page.
Tutorial: Perform a Quick
Publish
It is easy to trigger a publication of previously composed data using
the Data Hub Backoffice Quick Upload page. Quick publish allows you to define one or more target systems for publication.

Prerequisites
Ensure that you have canonical items in Data Hub that have a SUCCESS status, awaiting publication. You can verify this by issuing a curl command against the Data Hub REST API.

Tutorial: Review Errors Found in Any Quick Step
Use the Errors and Failures section of the Data Hub Backoffice to
examine the possible causes of issues during quick upload. This
tutorial shows you how.

Context
Occasionally you will encounter errors when using the quick upload, compose, and publish options
in the Data Hub Backoffice. Data Hub returns these errors to Backoffice where the details are stored
for review.

Procedure
Tutorial: Create a Feed and a
Pool
You can use the Backoffice Data Hub Cockpit to create new feeds and
pools, and to define pooling strategies. This tutorial walks you through
the process.

Context
The DEFAULT_FEED and GLOBAL pool, along with the global pooling strategy, are present by
default in Data Hub. Create additional feeds and pooling strategies for custom data management
requirements.

Procedure
Master Your Data Hub Project
Master Your Data Hub Project gives you the advanced knowledge to prepare Data Hub for a production environment. Complete the earlier modules before proceeding with this one.

Master Your Data Hub Project is aimed at the experienced extension developer. It provides essential knowledge for advanced control of Data Hub workflow and processes. You not only learn how to write custom input and output adapters, but also how to take total control over data transformations. In addition, you gain the tools to ensure that your Data Hub installation is secure, robust, and high-performance. Master Your Data Hub Project allows you to put Data Hub into production confidently.

Upon completion of this module you have all of the tools necessary for any Data Hub integration project. To proceed with the first topic of this topic area, click the following related link.

Installing Data Hub
There are several things you must do to complete your installation
of Data Hub and prepare it for a production environment.
This topic area discusses the final steps you take for your Data Hub installation to progress to a
production ready installation. There is also information about version upgrade and running under
Oracle WebLogic.

At the end of this topic area, your Tomcat installation is going to be properly tuned. A relational database is going to be installed and integrated with Data Hub. Encryption is going to be enabled. You will have determined what, if any, database cleanup strategy to employ.

Data Hub Pre-Requisites
You must have the following knowledge to take full advantage of Data Hub capabilities.

 Program in Java

 Understand and be able to apply the Spring Framework

 Know and understand Tomcat or Oracle WebLogic

 Understand and be able to apply Maven

 Understand and be able to apply XML and JSON

 Have a good understanding of relational database fundamentals

You must have the following properties set in your local.properties file.

Tuning Tomcat
Tomcat can be tuned to use memory more efficiently, which improves
performance and helps Data Hub move more data.

Create your Tomcat startup file. There is a Linux example and a Windows example listed below. Just copy and paste to your system. Note that the <CATALINA_OPTS> described below is just an example and should not be used as-is in a production environment. You need to determine your own best configuration.
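The following sketches show the general shape of such startup files, assuming the default Tomcat behavior of reading setenv.sh (Linux) or setenv.bat (Windows) from its bin directory. The memory and garbage collector values are placeholders only.

Linux - <TOMCAT_HOME>/bin/setenv.sh:

#!/bin/sh
# Illustrative values only; size the heap for your own data volumes
CATALINA_OPTS="-Xms2g -Xmx4g -XX:+UseG1GC -Dfile.encoding=UTF-8"
export CATALINA_OPTS

Windows - <TOMCAT_HOME>\bin\setenv.bat:

rem Illustrative values only; size the heap for your own data volumes
set "CATALINA_OPTS=-Xms2g -Xmx4g -XX:+UseG1GC -Dfile.encoding=UTF-8"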
Optional: If you need a more secure environment for your Data Hub installation, you can add SSL authentication.

Choosing a Database
Data Hub requires a dedicated database for staging data, and saving
item metadata and statuses related to load, composition, and
publication actions. Data Hub can be configured to use several
common relational databases. Flexibility of this type provides you the
opportunity to select the ideal database solution for your data
integration project.

Overview
Data Hub is a data integration and staging platform that works primarily asynchronously. Raw item
data is first loaded from source systems. It is then composed into canonical items. Finally, it is
published to one or more target systems in a form suitable for those systems. Each of these stages
occurs independently. Because load, composition, and publication events may be triggered at any
time, Data Hub must store the following:

 Raw items during the load phase, to be ready for composition.

 Canonical items during the composition phase, to be ready for publication

In addition, each item includes metadata that defines both its structure, and its relationship to the next state in the data transformation workflow, that is, raw to canonical, or canonical to target. Additional metadata is also required to describe each target system.

Third, each item is marked with a status that indicates its progress in the Data Hub workflow, that is, the outcome of any load, composition, or publication event. Statuses are also recorded for target system publication events.

All of these data types - data items, metadata, and statuses - must be persisted. In sum, they not only enable the function of Data Hub, but also constitute a complete history of all data transformations: a sort of golden record of Data Hub events for auditing purposes.
Persisting this data requires a dedicated database.

Performance

Data Hub employs highly concurrent processing for maximum efficiency and throughput. It uses Hibernate for non-performance-critical transactions and has its own implementation of a JDBC repository for performance-critical transactions. These Data Hub features provide a level of
persistence abstraction that is compatible with a range of common relational databases. Some of the
databases may have their own performance limitations, depending on configuration. The choice of
database does not affect Data Hub performance in any significant way.

Please refer to the related links section about the individual database topics. Consult with a DBA for
the right choice of database. The DBA can help you create a performance-related configuration
tailored for the needs of your data integration project.

Data Retention

Any data from completed publications remains in the database, forming a complete auditing record
of your data transformation history, as described previously. You may wish to keep the auditing
record indefinitely, but over time it can affect Data Hub performance. Previous auditing records can
be cleaned up, either manually or automatically, using the provided Data Hub clean up extension.
You may also wish to develop your own extension to perform the clean-up according to your
requirements. See Activating Data Hub Database Cleanup.

Database Schema

During initialization, Data Hub creates its own schema and initializes this schema with the metadata
loaded from its extensions. By default, the kernel.autoInitMode is set to update to prevent
data loss. To refresh the database at any time, drop and create the database manually. Then
restart Data Hub to regenerate the schema.
Supported Databases
By default, Data Hub is configured to use the HSQL database. However, HSQL is not a supported database for production deployments.

Note: Data Hub is case sensitive, so your chosen database must also be case sensitive. Of the databases supported by Data Hub, only MySQL is not natively case sensitive. Instructions for configuring it to be case sensitive are included.

For production, Data Hub supports several relational databases. They include:

 MySQL

 Oracle

 SAP HANA DB

 MSSQL

To avoid potential issues in certain cases, ensure that your database supports case-sensitive queries.
More information is provided in the individual database topics.

Using MySQL
MySQL is a popular, open-source relational database system. Data
Hub can be easily configured to use MySQL

Note

By default, MySQL performs case-insensitive queries, which may be an issue in some cases. Case sensitivity is set using the collate parameter. To enable case-sensitive queries in Data Hub, create the schema so that it is configured as follows:
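For example, assuming the default database name of integration, a case-sensitive schema can be created with a binary collation along these lines; verify the character set and collation against your own requirements and MySQL version.

CREATE DATABASE integration CHARACTER SET utf8 COLLATE utf8_bin;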

Using Oracle
Oracle is an enterprise database management system (DBMS)
produced by the Oracle Corporation. Data Hub can be easily
configured to use an Oracle database.

Context
Complete the following steps to configure your Data Hub installation to use an Oracle database.

Restriction: When using Oracle SE, Data Hub and SAP Commerce cannot share the same Oracle SE instance.

Note: Oracle is case sensitive by default.

Procedure

Using SAP HANA


SAP HANA DB is a high-performance, in-memory database that is
part of the SAP HANA platform. Data Hub can be easily configured to
use HANA DB

Context
Complete the following steps to configure your Data Hub installation to use an SAP HANA
database.

Note: SAP HANA is case sensitive by default.

Procedure
Using MSSQL
MSSQL is a relational database management system (DBMS)
produced by Microsoft. Data Hub can be easily configured to use
MSSQL.

Context
Complete the following steps to configure your Data Hub installation to use an MSSQL database. Note: MSSQL is case sensitive by default.

Procedure
Create your database.
The default Data Hub installation relies on a database instance with the name of integration, with an
administrative user named hybris and the password hybris. You can change the database instance
name as well as the username and password. Reflect the changes in your database connection
information located in the local.properties file. The database should be created by a user with sufficient privileges to grant the necessary rights for the Data Hub database, including full schema privileges on the database instance.

Auto Init Mode
Data Hub provides a configuration property that allows you to control
what happens to the database schema during start-up.

Auto-initialization of the Data Hub database schema is possible during the start up cycle of Data
Hub. To specify how and if this auto initialization occurs, add the
property datahub.autoInitMode to your local.properties file, as follows:
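A minimal sketch of such an entry follows. The values create and ignore are referenced in the next section, and update is mentioned earlier as a default, so treat the complete list of supported values as something to verify for your Data Hub version.

# Controls how the Data Hub schema is initialized at start-up (sketch only)
datahub.autoInitMode=update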

The Version Table


The Data Hub database includes a version table, DataHubVersion, which holds the current Data Hub version number. This table is used during initialization to check the existing schema against the schema of the Data Hub version starting up. If datahub.autoInitMode is set to
either ignore or create, and an incompatible version of the schema is found, Data Hub fails to
start. In this case a warning appears in the logs.

Securing Your Data
Hub Application
There are several ways in which you can secure your Data
Hub application using simple configuration. These steps ensure basic
end-to-end security of REST endpoints and data attributes.

Configure the Default Security Profile

Context

Data Hub provides a default Spring security profile. You must provide authentication credentials for
the roles defined in this profile.

Procedure

Provide Credentials for Data Hub Adapter

Context

If you are using Data Hub with SAP Commerce, provide connection credentials for the Data Hub
Adapter so it can connect to Data Hub.
Procedure

Create an OAuth Client for Data Hub


Adapter

Context

If you are using Data Hub with SAP Commerce and the Data Hub Adapter, configure a dedicated
OAuth client for Data Hub. This configuration is done in the Backoffice Administration Cockpit.

Procedure

Set Up Encryption

Context

Data Hub comes with some built-in encryption capabilities for attributes that you wish to keep secure
in the data store. Use this service for such items as passwords and other sensitive data. This is a
mandatory step, as target system passwords are encrypted by default.

Procedure

Define Encrypted Attributes

Context

Once you have configured encryption and stored your key, you can specify which attributes you wish
to secure.
Procedure

Activating Data Hub Database
Cleanup
Over time, a Data Hub database accumulates database records that
are used for auditing, but as these records accumulate, they can also
affect performance. The following document describes these records
and which database tables are affected

When a Data Hub instance has been running for a long time, too many audit records accumulate in the Data Hub database. These records do affect performance. If not needed in the active Data
Hub database, the historical auditing database records can be migrated to an archive database before
elimination. They can also be eliminated without affecting the current state of Data Hub. These
operations can be performed even when there are processes currently being performed by Data
Hub on the database. The only consequence is that removing the records also removes the audit
history for the records. The audit records show how and when they have been imported, composed,
and published.

Using Data Hub's Built-In Extension

The datahub-cleanup extension is deployed with the Data Hub WAR file.

Out of the box, the datahub-cleanup extension executes a set of default deletion behaviors. The
default behavior deletes the following:

 All raw items after they are processed in a composition

 All archived canonical items and their associated publication errors and status after a
composition

 All target items after they are finished being used during a publication

To enable the default behavior, the corresponding properties must be set to true in the deployed local.properties file as follows:
Data Hub is shipped with these property values set to false.
Defining the Cleanup Batch Size

The datahub-cleanup extension processes all audit item deletes. The extension is triggered by an event and does its work in batches. The split of one large transaction into multiple smaller batches
comes at the cost of total time for deletion. However, it gives the benefit of a more responsive and
robust system. It also avoids potential issues with limitations certain databases may impose on the
number of unique record IDs included in the IN clause of a query. For example, with Oracle, this
number is limited to 1000.

If not explicitly specified, the default batch size is 1000 - the Oracle maximum. You can set the
default batch size to any positive integer, but it cannot be set to a negative value.

Defining Canonical Item Cleanup Delay Times

When Data Hub publishes, it gathers data from two groups:

 all the new canonical items

 all canonical items that have previously failed to publish but have not reached their max retry
limit

All of these canonical items in the pool are processed with the publication. The publication is
comprised of one or more publication actions. If there is a maximum publication action size, then the
publication action is limited to that number of items. Whatever the number of publication actions
needed to publish all items, Data Hub creates them and queues them. One set of input data has the
possibility of being split across several publication actions, because items are not pre-assigned to
specific publication actions.

The cleanup extension is required for any Data Hub installation, because it has a powerful, positive
impact on performance. However, it can have negative impacts if it is misconfigured. You use the
following two timeout properties to configure it. The two timeout properties described below are
critical for a proper configuration. The properties must be in your local.properties file
at Data Hub startup. If you activate the cleanup extension without specifying a value for these
properties, they each default to 12 hours.
Excluding Certain Types from Cleanup

It may be useful to exclude certain canonical item types from deletion by the cleanup extension

Data Hub Installation Using Recipes
Out-of-the-box, Data Hub has powerful functionality. However, the
functionality does not present itself until you add custom extensions.
The extensions tell Data Hub what to do with your data. Data
Hub does not come with any extensions.

Data Hub CLASSPATH
Configuration Files and
Recipes
Data Hub relies on configuration files and property files. Your recipe
can create and properly place these files.

Configuring CLASSPATH Resources

A typical Data Hub deployment requires some resources that are placed on the application classpath.
All resources can be configured inside a resources element within the Data Hub configuration clause.
Here is how you can do that
Configuring Data Hub Binaries
with a Recipe
Data Hub is useless unless some extensions are deployed with it. SAP Commerce already contains some Data Hub extensions, which can be used in the recipes. The extensions can be found in the <PLATFORM_HOME>/../ext-integration/datahub/extensions directory. If you develop a new Data Hub extension, and it exists somewhere on your local file system, it may be included in the recipe also.

Customizing Data
Hub Deployment
You can use a recipe to customize a Data Hub deployment.

Deploying to a Docker Image


You can choose to convert and deploy the Data Hub image as a
Docker image

Installing the Basic Prerequisites
In the following steps, you set up the basic prerequisites for
installing Data Hub.

Context
Data Hub is a Java web application, and it utilizes a relational database. It requires certain
prerequisites for a minimal installation. Unless otherwise stated, the following instructions
are not valid for third party software versions other than those stated here.

Procedure
1. Install the Java Development Kit 1.8.
Using your browser, go to http://docs.oracle.com/javase/8/docs/technotes/guides/install/install_overview.html and install the Java JDK for your operating system. 64-bit Java is supported.
2. Install Apache Tomcat 7.x.
In your browser, go to https://tomcat.apache.org/download-70.cgi. Download Tomcat version 7. Install it as described in the Apache Tomcat documentation, https://tomcat.apache.org/tomcat-7.0-doc/setup.html.

Tip: Step 3 through Step 5 are critical to a successful installation of Data Hub. You are going to use these folders often as you work through the documentation and use Data Hub.
3. Create an /opt/datahub folder in the root directory.
4. In the datahub folder, create the following new folders:
a. config - for example, to be used for the files local.properties and datahub.encryption.key.txt
b. extensions
5. Create a Tomcat webapp context XML file called datahub-webapp.xml. datahub-
webapp is going to be the name of the web service.
Remember: All of the Data Hub documentation is based on the following datahub-webapp.xml file.
a. Open <TOMCAT_HOME>/conf/Catalina/context.xml in your favorite
text editor. If it does not exist, go to
the <TOMCAT_HOME>/conf/Catalina/localhost folder, open a new file,
and move on to the next step.
b. If it is already customized, edit it as follows. If it is not customized, replace the
contents of the file with:
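The original contents are not reproduced here; a minimal sketch of such a context file follows, assuming the Data Hub WAR ships under the SAP Commerce ext-integration directory. Adjust docBase to the actual path and version of your installation.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: point docBase at the Data Hub WAR file of your installation -->
<Context docBase="/path/to/hybris/bin/ext-integration/datahub/web-app/datahub-webapp-x.y.z.war"/>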

6.
a. Note: Edit the <docBase> parameter to reflect the path to the SAP Commerce installation and the current Data Hub version. You can find this version by following the path mentioned for the <docBase> to the WAR file. Also, the version number on the end of the docBase attribute is just for identification. It serves no other purpose and has no other impact.
b. Save the file with the new name datahub-webapp.xml into the <TOMCAT_HOME>/conf/Catalina/localhost folder.
7. Install the cURL command-line tool.
Using your browser, go to https://curl.haxx.se/. Download and install the cURL software that is appropriate for your operating system.

The Solution Book files are useful as examples or samples of what can be done with Data Hub.

You can find the Solution Book files in the Data Hub Suite ZIP file. After you install Data Hub, you can find the files in hybris/bin/ext-integration/datahub/solution-book. You can also access the files from 2764052.

Note

The Solution Book files and recipes are optimized for versions 6.7 and above, and are provided as
examples only. It is assumed and recommended that you regularly patch the Data Hub application.
Deploying to a Docker Image
You can choose to convert and deploy the Data Hub image as a
Docker image

Customizing Data Hub
Out of the box, Data Hub allows you to perform the tutorials in this
documentation. Customizing Data Hub functionality requires custom
tools and extensions. With customizations, the Data Hub can do a
great variety of data handling operations

Tutorial: Setting Up Your Environment for Custom Extension Development
Generating an empty extension is the first step in any custom data
integration project using Data Hub. The provided Maven Archetype
makes it a relatively quick and simple task.

Anatomy of an Extension
A Sample Extension XML File

The extension XML file is used to define the data structure. As mentioned previously, Data Hub uses three main data structures: raw, canonical, and target. Each one has its own XML file. Following is an abbreviated canonical extension XML file. The key tags in the canonical item XML file are:
 The extension tag is the root element for the extension definition. The attribute name is
mandatory, and must contain your extension name. Other extensions refer to your extension
by this name, so be thoughtful about the name you choose. The name should be descriptive
enough that other developers, teams, or companies know what business domain your
extension addresses. It should also be unique in a way that other teams or companies do not
pick an identical name for their extension. If the version attribute is used, it should help
you identify the version of the extension you are deploying.

 The dependencies tag declares any possible dependencies your extension may have on


other extensions. If your extension is dependent on another extension, load the other
extension first. If your extension has no dependencies on other extensions, omit this element
in the extension.xml file.

 The canonicalItems tag declares canonical item types used by your extension. Each


type is represented by the item element and, besides the type name, contains definitions of
the item attributes. It also defines how those attributes can be populated from the declared
raw items and their attributes.

Java Source Code in an Extension

While Java source is optional in an extension, it is common. You can do some data manipulation within the XML file, but for intricate work or faster bulk work, Java source code is more powerful.

The XML that defines your data structures is the minimum requirement for a Data Hub extension.
Package your extension into a JAR file, and place it in the Data Hub class path. Follow the steps here
to create and load an extension jar file for your canonical data structure.

Procedure
If you develop multiple Data Hub extensions, you can define multiple build profiles with additional
dependencies declared. With these, you can easily deploy each of the extensions separately or certain
combinations of the extensions simply by adding the profile name to the build command.
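For example, assuming a Maven build profile named sample-extension is declared in the project's pom.xml, that extension could be packaged on its own with a command such as:

mvn clean package -P sample-extension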
Illustrating a Dependent
Extension
Your extension may declare dependencies on other Data Hub extensions. Ensure that all dependencies can be successfully resolved.

To discover and resolve dependencies, Data Hub loads all of the extension XML files, reads all of the
declared dependencies, and checks whether they have already been loaded or not. If the declared,
required extension is found in Data Hub but is not loaded yet, Data Hub loads it before loading your
extension. If the required extension is not found, your extension is not loaded, and a corresponding
warning or error message is reported in the log files. The failure of your extension to load means any
other extension dependent on your extension is not loaded either.

Besides missing a required extension, it is possible that two or more extensions create a circular
dependency. For example, your extension depends on another extension, the other extension depends
on yet another extension, and that one depends back on your extension. Such circular dependencies
are identified by Data Hub during extension loading, and Data Hub excludes any extension in the
circular dependency chain from loading.

Dynamically Loading an
Extension
Data Hub primarily relies on extensions that are loaded at startup.
However, it is possible to dynamically load an extension while Data
Hub is running. There are some limitations with this process but using
dynamic extensions is better than shutting Data Hub down.
You can either load an extension statically or dynamically. Extensions loaded statically are loaded
during Data Hubstartup and initialization. Extensions loaded dynamically are loaded during runtime
using a REST call, without the need to shut down and restart Data Hub.

Restriction: Dynamic extensions can only be used on standalone Data Hub systems. Attempts to use dynamic extensions in a multi-node Data Hub cluster return the following response: "This operation is not allowed when there is more than one node running in the Data Hub cluster."

In most circumstances, if you want to load a complete new extension, you shut down Data Hub. But what if all you want to do is change some parts of the metadata? In that case, you can load a dynamic extension, and keep Data Hub running while you do so.

Security
Various aspects must be considered when planning a secure Data
Hub deployment. Alongside application-specific features, you should
also pay attention to the network, database, web container, and
monitoring.

In this topic area you get an overview of all aspects of security related to your Data Hub installation
and operation. You learn the importance of setting up a secure network and container,
configuring Data Hub to use basic authentication for all REST endpoints, and using SSL for all
communication to and from Data Hub. In addition, you learn how to encrypt data stored in the Data
Hub database.

Infrastructure Security
Security is an important topic, and Data Hub provides several security
features and is compatible with others
Network Security

To prevent denial-of-service and similar attacks, install Data Hub within a DMZ. Then the network
ports used by Data Hub are not exposed to the internet. The DMZ ensures that only known clients
can connect to Data Hub resources.

Container Security
Configure your application server to use SSL, thus encrypting all requests and responses and further
securing your data. Apache Tomcat, for example, can be configured to force any Data
Hub connections over SSL/TLS (HTTPS). To configure SSL, have an experienced system administrator with knowledge of generating self-signed certificates do the work. More information and a tutorial are available at the Apache Tomcat website in the topic SSL/TLS.

Monitoring Security
Secure monitoring communications between a network administration center and Data Hub to prevent man-in-the-middle attacks. Data Hub and its JVM can be monitored using the out-of-the-box monitoring solution JConsole. There are also other monitoring solutions that are compatible with the Java Management Extensions (JMX).
Application Security
Data Hub implements basic authentication configured with Spring Security to minimally secure its RESTful API endpoints. Authentication is enabled by default, and Data Hub returns an HTTP status code 401 Unauthorized when secured REST endpoints are accessed without a valid authentication header.

Basic authentication requires credentials (username and password) to be included in all requests to Data Hub REST endpoints. Only the status and version endpoints are not secured. These can be used to validate your Data Hub installation without the need for authentication.

Overriding the Default Configuration


To extend the provided security concept, override the default configuration in your extension by
carrying out the following steps:

1. Define a Spring profile containing your custom security definition

2. Start Data Hub using that profile

By creating a separate profile, you ensure that no conflicting filters are defined, and that you properly
override the default security configuration.

A custom Spring Security profile enables you to expand and customize the available user roles, or to
use other Spring Security features such as Spring LDAP, thus providing a single-sign-on experience
using an external LDAP server and circumventing the need to define user credentials in a properties
file.

Using SSL

When using basic authentication, the username and password are encoded using base64 and transmitted in the Authorization header of REST requests. This request can be intercepted and decoded to reveal the username and password, so it is strongly recommended that you configure your Tomcat installation to use SSL for all such requests, and to refuse non-encrypted connections. To read about enabling SSL with Tomcat, see SSL/TLS.
Configuring Data Hub Adapter for OAuth

Tutorial: Creating a Custom Spring Security Profile
Create a custom Spring Security profile to define a new user role with
specific access restrictions to Data Hub REST endpoints.

Context
Data Hub ships with a default Spring security profile that includes definitions for an admin user, and
a read-only user. 

If you want to define an additional role to gain more granular control over Data Hub access, you can
do so in a custom security profile. With a custom security profile, you can, for example:

 allow a user to load and compose data, but not publish

 restrict a user to only using the GET and POST methods, but not PUT or DELETE

Both of these examples are included in this tutorial. Many more possibilities exist for your own
custom security profile.

Procedure
The username must be unique. Do not define a user name for the load user that is already defined for
another role.
You now have a new security profile that you can use in place of the existing default profile. It can include any number of new user roles, allowing fine control over access to the various Data Hub REST endpoints and available methods.
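A sketch of such a profile is shown below. It assumes the Spring Security XML namespace, and the endpoint patterns, role names, and user names are illustrative assumptions only; check the actual Data Hub REST paths and the shipped default profile before adapting it.

<!-- Hypothetical custom security profile: a load user may POST data and trigger
     composition, but may not publish, and may only use GET and POST. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:security="http://www.springframework.org/schema/security"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/security
           http://www.springframework.org/schema/security/spring-security.xsd"
       profile="custom-security">

    <security:http use-expressions="true">
        <!-- Endpoint patterns are assumptions; verify them against your Data Hub REST API -->
        <security:intercept-url pattern="/data-feeds/**" method="POST" access="hasAnyRole('ROLE_ADMIN','ROLE_LOAD')"/>
        <security:intercept-url pattern="/pools/**/compositions" method="POST" access="hasAnyRole('ROLE_ADMIN','ROLE_LOAD')"/>
        <security:intercept-url pattern="/**" method="GET" access="isAuthenticated()"/>
        <!-- Everything else (including publications, PUT, and DELETE) requires the admin role -->
        <security:intercept-url pattern="/**" access="hasRole('ROLE_ADMIN')"/>
        <security:http-basic/>
    </security:http>

    <security:authentication-manager>
        <security:authentication-provider>
            <security:user-service>
                <security:user name="loaduser" password="changeme" authorities="ROLE_LOAD"/>
                <security:user name="admin" password="changeme" authorities="ROLE_ADMIN"/>
            </security:user-service>
        </security:authentication-provider>
    </security:authentication-manager>
</beans>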

Data Security
Data Hub data is stored in your chosen RDBMS. When configuring it for use with Data Hub, create a dedicated user for Data Hub with only the permissions required.
The two aspects of database security you should consider at a minimum are the user and permissions,
and attribute encryption.

Database User and Permissions


The default Data Hub installation relies on a database instance with the name of integration, with an
administrative user named hybris and the password hybris. For security purposes, change these
values in a production environment. Create a dedicated runtime user to ensure that only Data
Hub can access the database, and only with those privileges that are necessary.

Attribute Encryption
Data stored in the Data Hub database can include target system passwords and other sensitive data from source systems. Data Hub provides support for the encryption of sensitive data in the database. It also masks this type of data when returned in the body of RESTful responses or recorded in log files. Always enable encryption and attribute masking to protect sensitive data. By default, the Data Hub application password is always masked.

Data Hub comes with some built-in encryption capabilities for


attributes that you wish to keep secure in the data store. The
encryption allows you to secure specific data attributes, decrypt an
encrypted password, and mask attribute values in RESTful responses
and log files.
Data stored in the Data Hub database can include target system passwords, and other sensitive data
from source systems, such as personal data. Data Hub provides support for the encryption of
sensitive data in the database. It also masks this data when returned in the body of RESTful
responses or recorded in log files. By enabling encryption and attribute masking, you can protect
sensitive data. By default, the Data Hub application password is always masked.

Tutorial: Using Attribute Encryption
Some basic setup is needed before you can use Data Hub encryption
features. Perform the following steps to configure Data Hub to use
encryption for secure attributes.

Set Up Encryption

Context

Data Hub comes with some built-in encryption capabilities for attributes that you wish to keep secure
in the data store. Use this service for such items as passwords and other sensitive data. This is a
mandatory step, as target system passwords are encrypted by default.

Procedure
Adapter Security
Basic Authentication is implemented in both directions between Data
Hub and SAP Commerce. The credentials configured in Data
Hub must also be configured in Data Hub Adapter in order for them to
communicate.
Data Hub Adapter is an SAP Commerce extension that links SAP Commerce to Data Hub. Data
Hub and SAP Commerce interact in a client/server architecture where Data Hub is the client,
and SAP Commerce is the server. Authentication is implemented on both client and server side.

Inbound Connections
For inbound connections (client to server), set up a user in SAP Commerce with a username and password. Then provide these credentials in the target-item.xml file of your Data Hub extension, in the target system definition.
Outbound Connections
For outbound connections (server to client), Data Hub employs HTTP Basic Authentication. This is defined in the Spring Security configuration specifically for Data Hub. Provide these credentials to SAP Commerce in the SAP Commerce local.properties file.

Data Hub Adapter and OAuth


OAuth authorization between Data Hub and Data Hub Adapter requires that you configure an OAuth client. A default configuration exists that uses a client called eic. However, SAP recommends you define your own OAuth client. To override the default configuration and use a custom client, add the following properties to your Data Hub local.properties file, updating the values to match your own client.
As Backoffice is a standard extension of SAP Commerce, the properties definition must follow the procedure documented for the platform extensions. If not defined, these properties are supplied with blank values.
Backoffice does not support a separate security configuration (authentication method and user
credentials) for each Data Hub server defined in the system. Therefore all servers must have the same
basic authentication configuration.

To use a custom OAuth client configuration with Data Hub Adapter, first define a new OAuth client. You do this in the SAP Commerce Backoffice.

Context
A default OAuth configuration exists that uses a client called eic. However, you should define your
own OAuth client in any production system.

Advanced Aspects of Load, Compose, and Publish
This topic area describes the advanced parts of load, compose, and
publish that were not covered in earlier topics.

Examples of things covered include the concept of actions and a series of topics addressing items
with specific issues in each of these three areas. The three primary topics, load, compose, and
publish, comprise the action points for the Data Hub. There is a linear relationship between them.

Once you complete this topic area, you should have a full understanding of these three main
processes.

Load
Load is the most straightforward of the three main Data
Hub processes. As such, there are not many modifications you can
make to the load process. However, there are a few things you can do
to make it more efficient with specific kinds of data and with specific
load actions.
The Data Hub Spring Integration mechanism enables data loading of raw (fragmented) data
as key/value pairs. Each fragment is represented as a Map<String, String> where the key
represents the name of the raw attribute. The second element is the value corresponding to the key. In
the following snippet from a rawExtension.xml file, the raw
attribute <name>city</name> has city as the key. When the source data is prepared for loading,
the key city is matched with the correct source data; a city name.
If there is no value (for example, the value is null), the attribute is ignored. Any string, including the
empty string, is supported as an attribute value. When an attribute value is set to an empty string, this
value means that the attribute has been cleared. There is a difference between empty strings and null
values in raw items. Null is treated like an ‘ignore’. An empty string, however, is treated as an
explicit value, so the empty string overrides any previous value.

Limiting the Size of Loading Datasets


Send data to Data Hub in smaller increments and allow it to be processed before sending the next set
of items. Sending volumes of items that are too large can result in Data Hub slowing down
drastically and appearing to 'freeze' and not complete actions. For example, break a 10 million item
dataset into smaller chunks. The size of these chunks depends on your specific Data Hub installation.
Data Loading Actions
Data Hub uses data loading actions to control the loading of raw
fragments.
Data loading actions are started by the DefaultDataLoadingModificationService. The
service is part of Data Hub core. Load data into Data Hub using either the CSV extension or one of
the two Spring integration channels.

CSV
Before the CSV data begins to load, Data Hub creates a data loading action, assigns an ID to it, and
gives it a status of IN_PROGRESS. Because it is CSV data, all of it must load before the data
loading action is complete. Once it is complete, the data loading status changes to COMPLETE. All
of the raw item statuses are set to PENDING. The raw items are ready to be composed.

Spring
Since the custom code loads data into the Spring channel, each piece of data is given its own data
loading action. Each data loading action is assigned an ID and given a status of IN_PROGRESS.
After the piece of data is loaded, the data loading status changes to COMPLETE. The raw item status
is set to PENDING, and the raw item is ready to be composed. If the data fails to load, the action is
set to a FAILURE status.
In both cases, when the composition is complete, the data loading action status is set to
PROCESSED.

Unique Aspects of Loading IDocs
IDocs are composed of segments. Data Hub treats each segment as a
raw item. So, it is very important for the IDoc data to only contain
segments pertinent to the Data Hubprocessing role.
Every IDoc is composed of many, many segments. It is very important that the IDoc data be carefully
screened. The uploaded IDocs must only contain segments requiring Data Hub processing.
Extraneous IDocs or segments can have a serious effect on Data Hub performance.
Data Hub converts fragments into RawItems. More specifically, each segment of an IDoc becomes
one RawItem. Data Hub performance degrades with every additional, unneeded RawItem. This
degradation is non-linear, so operating on large sets of RawItems can have a negative impact
on performance. To maximize the performance of Data Hub, carefully analyze the specific ECC
instance and what data is contained in each IDoc sent. Sending the minimum number of IDocs, and
the relevant (minimum possible) segments inside of each IDoc has been shown to give massive
performance gains. Consider investigating the following three transactions to improve the
performance:

1. BD53 - Reduction of Message Types

2. BD56 - Maintain IDoc Segment Filters

3. BD64 - Maintenance of Distribution Model


The sapidocintegration extension filters out (removes) duplicates based on their unmapped
attributes. As a result, it creates unique rows and reduces the total amount of raw items input to Data
Hub. It uses a conservative approach to only remove consecutive duplicates. It reads in each of the
relevant Data Hub raw model types in advance to validate the required set. Then it only creates those
raw items that have been defined in the raw model.

Compose
There are several things mentioned here that help you understand the
fundamentals of composition. Additionally, there is information about
composing IDocs, which may improve the performance of Data Hub.

Canonical Metadata

The canonical metadata defines the following:

1. What item types can be created from the imported item type.

2. What attributes the composed types have.

3. How the composed item attributes are constructed from the imported raw items.
Don't use reserved identifiers such as "id", "status", or "type" when naming attributes of Raw or
Canonical models. If a Composition or Publication expression uses one of these attributes, the actual
value read is not the one provided by the user on the Raw or Canonical model, but the underlying
technical attribute.
At the end of Data Composition, the canonical data set is complete and the data pool is populated
with the new canonical items. The metadata defining this process consists of:

 CanonicalItemMetadata – defines metadata for a canonical item type and holds a collection of canonical attribute definitions. One instance exists for a canonical type.

 CanonicalAttributeDefinitions – holds all metadata about a canonical attribute including raw item source and composition metadata. One instance exists for each raw source.

 CanonicalAttributeModelDefinition – holds the basic model information about a canonical attribute (including name and primaryKey). Only one instance exists per canonical attribute.

IntegrationKey
The IntegrationKey is a critical element in the Data Hub data structure. It is created automatically by Data Hub and is composed by concatenating the primaryKeys you have identified in your canonical extension XML. The method used to create it
is IntegrationKeyGenerationStrategy, which can be found in the SDK. It looks like:

There are some instances where you might want to override the default creation method. The
simplest way to do this is to override this bean with your own implementation in an extension.
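For example, a custom extension's Spring context could redefine the strategy along these lines. The bean id shown here and the implementation class are assumptions; confirm the actual bean name and the strategy interface in the SDK before overriding it.

<!-- Hypothetical override of the default integration key generation strategy -->
<bean id="integrationKeyGenerationStrategy"
      class="com.acme.datahub.keys.MyIntegrationKeyGenerationStrategy"/>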

Canonical Model Design Constraints


Metadata is an important ingredient in defining the design constraints of the canonical model. See the
list below

 Every attribute existing in a RawItem model must be referred to by a CanonicalAttributeDefinition attribute in order to be used in the canonical model.

 Additional attributes defined as CanonicalAttributeDefinitions that are not in the RawItem model must be in the CanonicalItem model.

 Each attribute can be defined as single string attributes, sets, or as localized values.

 Every CanonicalItem model type must derive from the CanonicalItem model.

 The CanonicalItem type contains the attributes integrationKey and dataPool only. Neither of these can have a CanonicalAttributeDefinition created.

Ensuring IDocs Process Properly During Composition
It is critical that each set of IDocs finish Composition before the next batch begins. When using the
Auto-Compose feature, it is important that the IDoc load interval ensures that the queue does not
build. Larger queues create performance issues. Considering the effect of larger queues, the IDoc
load interval must be longer than the time it takes to Compose the prior IDoc load.
The best way to ensure that processing occurs at the correct intervals is to set up the process deterministically. Explicitly load the next set of IDocs only after the first set completes Composition. You can do this in an automated way using events to create the trigger, which is the best option. It ensures that the process occurs in the correct sequence without any risk of queuing and has the most efficient processing time.

Composition Actions
Composition actions are the construct within which raw items are
moved through the composition phase.
The composition phase starts when you initiate it using an event
(InitiateCompositionEvent) or a POST request. The Data Hub then creates a composition
action and opens the composition queue. It then composes raw items from the specified pool. The
composition phase runs asynchronously, and the POST request returns immediately indicating the
composition action is IN_PROGRESS. If you are using events, Data Hub fires several events that
help track composition activity.
Data Hub deals with composition actions in one of two ways.

1. All raw items are composed. Data Hub assigns the appropriate status to the composition action, and it is done.
2. Composition continues until datahub.max.composition.action.size is reached (the default is 100,000). At that point, Data Hub assigns the appropriate status to the composition action, and it is done. If further composition is needed, a new composition action is triggered.

Changing a Canonical Item During Composition
Existing canonical items can be updated by merging them with
incoming items when they share the same integration key. Canonical
items can also be deleted with a delete request.

Updating and Merging Existing Canonical Items During Composition
If you have an existing canonical item you wish to update, you can load a raw item to the same pool
with the same integration key. Data Hub creates a new, merged canonical item during composition.
The new, merged canonical item automatically updates or includes any new attributes passed to it.
There is a difference between empty strings and null values in raw items. Null is treated like an
‘ignore’. An empty string, however, is treated as an explicit value, so the empty string overrides any
previous value.
Your starting point for the update is a single data pool that holds both the existing canonical items,
and the new raw items. Data Hub goes through the following steps to complete the merge and
update:

1. Current canonical items with a SUCCESS status are updated when new raw items matching
their data pool and integration key are composed.

2. For each item to be updated, a new merged canonical item is created. Its values are populated
by merging the existing canonical item and the just composed raw item. The new, merged
canonical item has a status of SUCCESS.

3. The status of the old matching canonical item and the just composed raw item are set to
ARCHIVED.
Transforming Data (Advanced)
Transforming data is the primary task of Data Hub. Data Hub provides
several tools for complex data transformations, in addition to simple
data item mapping in your data model XML files.
All common SpEL expressions are supported by Data Hub. You can use SpEL to add more powerful
transformations to your data model XML. Data Hub also provides the custom SpEL resolve()
function, which allows the use of a lookup table for resolving data values. For even more complex
transformations in the publication phase, consider using a Publication Grouping Handler.

ImpEx
SAP Commerce is shipped with a text-based import and export functionality called ImpEx. The ImpEx engine allows creating, updating, removing, and exporting Platform items such as customer, product, or order data to and from comma-separated value (CSV) data files, both during runtime and during the initialization or update process.
The ImpEx key features are:

 Import and export of SAP Commerce data from and into CSV files.

 Support for import using database access.

With ImpEx, you can:

 Update data at run time.

 Create initial data for a project, or migrate existing data from one SAP Commerce instance
into another (during an upgrade, for example).
 Facilitate data transfer tasks, for example, for CronJobs, or for synchronization with third-
party systems (such as an LDAP system, or SAP R/3).
An SAP Commerce extension may provide functionality that is licensed through different SAP Commerce modules. Make sure to limit your implementation to the features defined in your contract license. In case of doubt, please contact your sales representative.

Function Overview
The ImpEx engine matches the data to be imported to the SAP Commerce type system. ImpEx allows the import data to be split across two individual CSV files: one for the mapping to the type system, and the other for the actual data. That way, swapping input files is easy. Importing items via XML-based files is not supported. For details, refer to the ImpEx documentation. There are three main fields of use for ImpEx:
During Development:

 Importing sample data.

 Testing of business functionalities.

 Creating sample data during SAP Commerce initialization.

 Providing an easy way to create initial data for a project.


Migration:

 Migrating existing data from one SAP Commerce installation to another (during a version


upgrade, for example).
In an Operational SAP Commerce:

 Synchronizing data in SAP Commerce with other systems.


 Creating a backup of SAP Commerce data.

 Providing run-time data for CronJobs.

Tip: ImpEx import is based on the ServiceLayer. This means that actions like INSERT, UPDATE, and REMOVE are performed using ModelService, and thus the ServiceLayer infrastructure, such as interceptors and validators, is triggered.
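For illustration, a minimal ImpEx script might look like the following sketch. The header syntax is standard, but the catalog identifier, product codes, and values are hypothetical:

# create or update two products, then remove one of them
$catalogVersion=catalogVersion(catalog(id[default='myCatalog']),version[default='Staged'])[unique=true]

INSERT_UPDATE Product;code[unique=true];name[lang=en];unit(code);$catalogVersion
;HW-1000;"Sample product";pieces;
;HW-1001;"Another sample product";pieces;

REMOVE Product;code[unique=true];$catalogVersion
;HW-1001;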

Extract, Transform, and Load (ETL) Tools
SAP Commerce is usually part of an entire software environment:
ERP software, database systems, application servers, technology
stacks. In such cases, SAP Commerce needs to be able to exchange
data with other sources.
In data warehousing contexts, the process of retrieving pieces of data from an external source, modifying that data's format, and then importing the data is referred to as Extract, Transform, Load (ETL).
ETL tools are used to route data to and from the SAP Commerce system. They help to integrate various systems with each other, and they can transform different data formats into each other. They can also be used to clean data by running checks, for example verifying that a name value is set. A key benefit is that ETL tools keep the rules for extracting and transforming data outside of the application.
While SAP Commerce contains the ImpEx module as a means of importing and exporting data, creating an interface to actually connect to other pieces of software can be complex. For this purpose, SAP has created an MDM data integration tool called the Data Hub.

The Data Hub is a service-oriented, standards-based data integration solution, designed to help lower implementation time and costs while allowing you to maintain control over ongoing data integration and maintenance. The Data Hub is designed to easily import and export data between SAP Commerce and external data storage systems (including ERP, PLM, and CRM). Taking advantage of features like event publication and subscription, data fragments can easily be consolidated and grouped, and users can create categories and assign data accordingly. The Data Hub reduces the effort for systems integration and initial data migration through a fully integrated toolset and pre-built data mappings, including SAP pre-built extension integrations for key types.
In December 2008, SAP conducted an evaluation of existing ETL tools available at that time as
shown below:

 Talend Open Studio 3.0

 Pentaho Data Integrator (Kettle)

 Oracle Data Integrator

During the evaluation, SAP found that Talend Open Studio was quite powerful and easy to use and
might be a good choice for smaller fields of use. For large-scale applications, the Oracle Data
Integrator might be a good choice, especially if an Oracle support contract is already available.

ImpEx stands for Import and Export. As the name suggests, ImpEx in hybris is used for importing data from a CSV/ImpEx file into the hybris system and for exporting data from the hybris system to a CSV file. ImpEx import always means that data flows into the hybris system.
A Hot Folder is a pre-designated common folder on the server. Any CSV data placed in the folder invokes the ImpEx import process, and the result of the import is instantly loaded into the SAP Hybris system using a pre-defined data translation mode.

ImpEx API
The ImpEx API allows you to interact with the ImpEx extension. You can use the API to import and export data, and extend or adapt it to your needs.
Import API

You can trigger an import using the import API in a number of ways. These include using the
back end management interfaces, as well as triggering it programmatically

Export API

You can trigger an export using the export API in a number of ways. These include using the
back end management interfaces, as well as triggering it programmatically.

Validation Modes

The validation mode controls validation checks on ImpEx. By default, strict mode is enabled
meaning all checks are run.
Customization

You can extend your import or export process with custom logic. Customization allows you to address requirements that cannot be achieved completely with the ImpEx extension.

Scripting

You can use Beanshell, Groovy, or Javascript as scripting languages within ImpEx. In
addition, ImpEx has special control markers that determine simple actions such
as beforeEach, afterEach, if.

User Rights

The ImpEx extension also allows modifying access rights for users and user groups.

Translator

A translator class is a converter between ImpEx-related CSV files and values of attributes of SAP Commerce items.
There are four basic ways of triggering an import of data for the ImpEx extension:

1. Hybris Management Console using the ImpEx Import Wizard. For more details, see Using
ImpEx with Hybris Management Console or SAP Commerce Administration Console.

2. Hybris Management Console creating an Import CronJob. For more details, see Using ImpEx
with Hybris Management Console or SAP Commerce Administration Console.

3. Using ImpEx extension page. For more details, see Using ImpEx with Hybris Management
Console or SAP Commerce Administration Console.

4. Using the Import API, which is described here.


To support systems with large numbers of products, SAP Commerce is capable of multithreaded import operations.
You have several possibilities to perform an import programmatically. The decision depends mainly on the specialized configuration needs. The basic kind of processing is the instantiation and configuration of the Importer class (for detailed information, see Using an Importer Instance). Here you have the full range of configuration possibilities. The second alternative, the instantiation and configuration of an ImpExImportCronJob, triggers an import too, but it additionally provides the features of a cronjob, that is, all settings, results, and logs are stored persistently, which is strongly preferred. The third convenient alternative is the usage of the API methods of the ImpExManager class (for detailed information, see Using a Method of ImpExManager). These also use an import cronjob, but you do not have to create and configure it on your own.

Using an Importer Instance


The Importer class is the central class for processing an import. The
process for importing by directly using this class has 3 steps.

Procedure
Instantiate the class.
While instantiating the Importer class, the CSV stream is given using a CSVReader or an ImpExImportReader. If you only want to specify the input stream, use a CSVReader; the Importer then instantiates a corresponding ImpExImportReader. Using an ImpExImportReader directly is only needed if special settings are required at instantiation time (settings after instantiation can be made using the getReader() method of the Importer instance).

Configure the import process.


You have several possibilities for configuring the import process.
Trigger the import.

Using ImpExImportCronJob
When using an ImpExImportCronJob, you have the advantage of persistent logging, as well as persistent storage of results and settings.

Using a Method of ImpExManager


The ImpExManager class provides methods with the
qualifier importData. These methods all use a cron job for performing
an import from a given source and help you to simplify the import call.

Important parameters for the import methods are as follows:

 synchronous - sets whether the created cronjob is performed synchronously or asynchronously.
 removeOnSuccess - sets whether the cronjob and created medias are removed if the import finishes successfully.
 codeExecution - sets whether the execution of BeanShell code is enabled.

Export API
You can trigger an export using the export API in a number of ways.
These include using the back end management interfaces, as well as
triggering it programmatically.

You can trigger an export of data for the ImpEx extension in the following ways:

1. In Hybris Management Console, using the ImpEx Export Wizard. See Export Wizard.

2. In Hybris Management Console, using an ImpExExportCronJob. See Export Using an ImpexExportCronJob.

3. In SAP Commerce Administration Console, using the ImpEx extension page. See Export via ImpEx Web.

4. Using the export API, which is described here.


While in an import script the data to be imported is specified via value lines, an export script has a different structure that defines the set of items to be exported, as well as the export file format.
You have several possibilities to perform an export programmatically. The decision depends mainly on the specialized configuration needs.
The basic kind of processing is the instantiation and configuration of the Exporter class. Here you have the full range of configuration possibilities. The second alternative, using an ImpExExportCronJob, performs an export too, but it additionally provides the features of a cronjob, that is, all settings, results, and logs are stored persistently, which is strongly preferred. The third convenient alternative is the usage of the API methods of the ImpExManager. These also use an export cron job, but you do not have to create and configure it on your own.

Export Using an Exporter Instance


The Exporter class is the central class for processing an export. You
can use this class directly for exporting data in three steps.
Procedure
Define your export configuration.
Create an instance of the class.
Start the export.

Export Using an ImpexExportCronJob


You can generate an export using the ImpEx export cron job.

When using the ImpExExportCronJob, you have the advantage of persistent logging, as well as persistent storage of results and settings. You can create a cron job using the createDefaultImpExExportCronJob method of the ImpExManager class. Possible settings are provided in the API of the cron job class.

Using an Export Method of the ImpExManager
You can export data using dedicated methods provided by
the ImpExManager class.

The ImpExManager class has different methods with the qualifier exportData. These methods all use a cronjob for performing an export with a given ExportConfiguration and help you to simplify the export call. The only important additional parameter for these methods is synchronous. This parameter defines whether the created cronjob is performed synchronously or asynchronously.

Structure of an Export Script


An export script needs to specify the target file to export items to, a
header line for defining how to export the items, and a statement
specifying which items to export.
Exporter API by default uses pagination of search results, therefore, to have accurate results, your
FlexibleSearch queries must contain the ORDER BY clause, for example ORDER BY {pk}.
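As a rough sketch, an export script combines these three elements. The scripting calls shown here (impex.setTargetFile and impex.exportItemsFlexibleSearch) follow the classic export script examples and should be treated as assumptions to verify against your version:

"#% impex.setTargetFile( ""Product.csv"" );"
INSERT_UPDATE Product;code[unique=true];name[lang=en]
"#% impex.exportItemsFlexibleSearch( ""SELECT {pk} FROM {Product} ORDER BY {pk}"" );"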
Validation Modes
The validation mode controls validation checks on ImpEx. By default,
strict mode is enabled meaning all checks are run.
There are five different modes available; two are only applicable for import and three for export:

 Import Strict - Mode for import where all checks relevant for import are enabled. This is the
preferred one for an import.

 Import Relaxed - Mode for import where several checks are disabled. Use this mode for modifying data in ways not allowed by the data model, like writing non-writable attributes. Be aware that this mode only disables the checks performed by ImpEx; if there is any business logic that prevents the modification, the import fails anyway.

 Export Strict (Re)Import - Mode for export where all checks relevant for a re-import of the
exported data are enabled. This is the preferred mode for export if you want to re-import the
data as in migration case.

 Export Relaxed (Re)Import - Mode for export where several checks relevant for a re-
import of the exported data are disabled.

 Export Only - Mode for export where the exported data are not designated for a re-import.
There are no checks enabled, so you can write for example a column twice, which cannot be
re-imported in that way. Preferred export mode for an export without re-import capabilities.

Customization
You can extend your import or export process with custom logic. Customization allows you to address requirements that cannot be achieved completely with the ImpEx extension.
Writing A Custom Cell Decorator
Using a cell decorator, you can intercept the interpretation of a specific cell of a value line between the parsing and the translating of it. The cell value is parsed, then the cell decorator is called, which can manipulate the parsed string, and only then does the translation of the string start.
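For example, a decorator is attached to a header attribute with the cellDecorator modifier; the class name and values below are hypothetical and shown only as a sketch:

INSERT_UPDATE Product;code[unique=true];name[cellDecorator=org.example.MyCellDecorator]
;HW-1000;"decorated value"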

Writing A Custom Translator


If you have to change the translation logic for a value, then a decorator is not enough for your needs.
In such cases you can write your own translator and configure it for the translation of specific
attributes. You can do the configuration by simply adding the translator modifier to the desired
header attribute like:
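For example, with a hypothetical translator class org.example.MyProductTranslator (a sketch only; substitute your own implementation class and attribute):

INSERT_UPDATE Product;code[unique=true];myAttribute[translator=org.example.MyProductTranslator]
;HW-1000;some value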

Writing A Custom Special Translator


A special translator has to be written and configured in case you want to change the translation logic for special attributes.

Writing A Custom Script Modifier for Script Generator

Scripting
You can use BeanShell, Groovy, or JavaScript as scripting languages within ImpEx. In addition, ImpEx has special control markers that determine simple actions such as beforeEach, afterEach, and if.
With the scripting engine support in SAP Commerce, you can set the value of the flag impex.legacy.scripting to false to benefit from the new scripting features in ImpEx. You can then use not only BeanShell, but also Groovy and JavaScript.
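For example, a script line can be embedded directly between ImpEx lines. The following minimal sketch uses the classic impex.setLocale example from the legacy documentation, so verify the call against your version; the Title value is illustrative only:

#% impex.setLocale( Locale.GERMAN );
INSERT_UPDATE Title;code[unique=true];name[lang=de]
;dr;"Dr."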
Standard Imports

A number of standard imports are always provided in ImpEx scripting, so you do not need to import them yourself.

User Rights
The ImpEx extension also allows modifying access rights for users
and user groups.
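User rights are defined in a dedicated block inside an ImpEx script. The following is a hedged sketch of the common pattern; the group names are hypothetical, and the exact column set should be verified against your version's documentation:

$START_USERRIGHTS
Type;UID;MemberOfGroups;Password;Target;read;change;create;remove;change_perm
UserGroup;impexgroup;employeegroup;;;;;;;
;;;;Product;+;+;+;+;-
$END_USERRIGHTS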

Translator
A translator class is a converter between ImpEx-related CSV files and
values of attributes of SAP Commerce items
A translator is one of the two ways SAP Commerce offers for using business logic when importing
or exporting items.
On import, a translator converts an entry of a value line into a value of an SAP Commerce item. It
writes the value from the CSV file into SAP Commerce.
On export, a translator converts a value of an attribute of a SAP Commerce item into a value line. It
writes the value from SAP Commerce into a CSV file.

Standard Value Translators


Standard value translators are used for values that are mapped to standard type attributes, in contrast
to special header attributes. If you do not specify a translator for a standard attribute, a fixed
translator is chosen by default depending on the specified attribute type. Be aware that, with the exception of the default translators, Standard Value Translators are not all part of the ImpEx extension itself; they can also be part of other modules, such as the europe1 extension. As a consequence, a translator is not available if it is part of an extension that is not included in your installation.
Special Value Translators
Special Value Translators are used for values that have no exact attribute match, or for values that
require complex business logic to resolve. Unlike Standard Value Translators, you have to explicitly
enable Special Value Translators via the translator modifier

ImpEx Import for Essential and Project Data
The Convention over Configuration principle is adopted in SAP
Commerce to simplify and reduce the need for writing configuration
files. As a result, ImpEx files for essential data and project data can
be prepared without the need for additional data configuration.
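In practice, this convention usually means placing appropriately named ImpEx files in an extension's resources folder, along the lines of the following sketch. The extension and file names are hypothetical, and the exact naming convention should be checked against your version:

myextension/resources/impex/essentialdata-usergroups.impex
myextension/resources/impex/projectdata-sampleproducts.impex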

ImpEx Syntax
SAP Commerce ships with an integrated text-based import/export
extension called ImpEx, which allows creating, updating, removing,
and exporting platform items such as customer, product, or order data
to and from Comma-Separated Values (CSV) data files - both during
run-time and during the SAP Commerce initialization or update
process.

CSV Files

The ImpEx extension uses Comma-Separated Values (CSV) files as the data format for
import and export. As there is no formal specification, there is a wide variety of ways in
which you can interpret CSV files.

ImpEx Syntax in CSV Files


An ImpEx-compliant CSV file contains several different kinds of data, such as comments, macro definitions, header lines, and value lines.
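A minimal hypothetical example illustrating these kinds of data (the annotations are ordinary ImpEx comments, and the customer values are illustrative only):

# this line is a comment
$customerGroups=groups(uid)
# the line above is a macro definition
INSERT_UPDATE Customer;uid[unique=true];name;$customerGroups
# the line above is a header line; the lines below are value lines
;john.doe;John Doe;customergroup
;jane.doe;Jane Doe;customergroup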

ImpEx Syntax Highlighting with UltraEdit

You can use syntax highlighting rules in UltraEdit to get a color-coded view of ImpEx files.

Using ImpEx with Hybris Management Console or SAP Commerce Administration Console
You can use the impex extension via the SAP Commerce API,
the Hybris Management Console (HMC) or the SAP Commerce
Administration Console.

Import
For importing data to the platform via the Hybris Management
Console (HMC), you have to create and configure a CronJob of type
ImpExImportCronJob.

The configuration of such a CronJob and the import result attributes are described in the next paragraph. To make the configuration of such a CronJob easier, the HMC provides a wizard of type ImpExImportWizard. For more information, see Import Wizard.

Another possibility for importing data, which has nothing to do with the HMC but is also a more graphical alternative, is the usage of the ImpEx Web. This alternative is intended for development only and can only be used by administrators.
Export
For exporting data from the Platform to CSV files via the Hybris Management Console (HMC), you have to create and configure a CronJob of type ImpExExportCronJob.

You first need an export script, which you can generate using the Script Generator. The configuration of such an export CronJob and the export result attributes are described in the next section. To make the configuration of such a CronJob easier, the HMC provides a wizard of type ImpExExportWizard. This wizard is described in the Export Wizard section below.

Another possibility for exporting data, which has nothing to do with the HMC but is also a more graphical alternative, is the usage of the ImpEx Web. This alternative is intended for development only and can only be used by administrators.

ImpEx Media
An ImpEx media generally represents a CSV/ImpEx file or a ZIP archive containing CSV/ImpEx files. ImpEx medias are used only by the ImpEx extension for import and export processes.

ImpEx Import - Best Practices


Importing data is a common project task. If the amount of data being imported surpasses a certain threshold, the import time becomes a factor to consider. For example, the performance of some business methods is not suitable for importing mass data. The reason is that some checks or implemented service code, although very useful for normal requirements, are not performance-optimized for importing mass data. A typical case is importing Customers into a system, so this article refers to that example.
ImpEx Distributed Mode
SAP Commerce Platform offers a completely new ImpEx engine that
enables you to import large volumes of data using the power of the
whole cluster.
The Distributed ImpEx engine enables you to import Platform items from huge and complex
external files (for example, files that contain many dependencies between items), and at the same
time it delivers exceptional performance.
Distributed ImpEx leverages the existing ImpEx framework to parse and analyze input, and dump
unresolved value lines. It also leverages ServiceLayer for persistence, as well as TaskEngine to
process single batches of data.
Importing data using Distributed ImpEx (the distributed mode for short) consists of three phases:

1. Prepare and split phase

2. Single task execution phase

3. Finish phase

Executing Import from Administration Console
To import data in the distributed mode using Administration Console,
use the same Administration Console page that the classical ImpEx
uses.

Context
Choose a file that includes data you want to import and start importing.

Procedure
1. Log into Administration Console.
2. Hover the cursor over Console to roll down a menu bar.
3. In the menu bar, click ImpEx Import. The ImpEx Import page displays.
4. Switch to the Import script tab.
5. Click Choose file and load your file.
6. Tick the Distributed mode option.
7. Click Import. You should get a message about the import status.

Executing Import from Backoffice
To import data in the distributed mode, use the standard ImpEx import
wizard.

Context
Choose a file that includes data you want to import and start importing.

Procedure
1. Log into Backoffice.
2. Open ImpEx import wizard.
a. Click System.
b. Click Tools.
c. Click Import.
ImpEx import wizard opens.
3. Choose the data file to upload:
a. Click upload and choose your file.
b. Click create.
4. Click Next to switch to the Advanced Configuration tab.
5. Select Distributed Mode.
6. Click Start.
The view switches to ImpEx Import Results.
7. Click Done.
Executing Import on Selected Node Groups
Distributed ImpEx uses TaskEngine internally, which was designed to
work well in a cluster environment. This enables you to choose which
node group to execute import on.

Execution Results and Logs


Backoffice enables you to search for logs from a given CronJob.

Context
For backward compatibility, an instance of ImpExImportCronJobModel is available as a result of data import execution. This CronJob contains all the logs from a given execution, as well as its status.

Caution: An instance of this CronJob must not be executed or scheduled for further execution.

To look up logs from a particular Distributed ImpEx import execution, follow the procedure.
Procedure
1. Log into Backoffice.
2. Look up a CronJob from a given import execution:
a. Click System.
b. Click Background Processes.
c. Click CronJobs.
You can see a list of CronJob instances from particular import executions.
3. Click the CronJob instance you're interested in.
The CronJob's editor area opens. Here you can find the status of the CronJob execution, as
well as a list of items containing the logs:

Using ServiceLayer Direct


Distributed ImpEx allows you to use ServiceLayer Direct.

Import Cockpit Module


The SAP Import Cockpit Module reduces the complexity of importing data by allowing you to define import mappings using an intuitive graphical user interface tool. It helps to improve the overall quality of your data assets through the consolidation and validation of heterogeneous data.

Using the Import Cockpit Module, you can:

 Reduce import complexity and empower data managers to define import mappings using an
intuitive graphical user interface.

 Support the long-tail approach by enabling easy supplier on-boarding.

 Keep your content accurate by importing the most current product information from your
suppliers and business partners.

 Aggregate all product information scattered across various systems and departments.

Importing Product Data Using ImpEx Files
Importing data from various systems into a central application is a complex task that requires detailed knowledge of creating and maintaining import files. Usually these imports have to be done by an external service provider or by the internal IT department. This typically means that product managers and other business users depend on the support of a third party before new data can be imported into the central system, which adds effort and delays the availability of the new data. Due to the import complexity, even an import of smaller amounts of data is time consuming, because, for efficiency reasons, smaller imports are often accumulated until a larger amount of new data is available.

The Import Cockpit enables the user to import data into the SAP Commerce platform using a CSV source file without the need to specify an ImpEx import script. Using the Import Cockpit, you can avoid the extensive integration effort of creating import interfaces.

A flexible and high-performing Import Cockpit integrates with multiple source systems and supports the rapid and seamless migration of vast amounts of product data. All data imports are managed from a central location. Via import mappings, you can load high-volume data from various external sources as a flat CSV file into the SAP Commerce application. The Import Cockpit allows you to manage, configure, run, cancel, and monitor import jobs. You can also attach files and define import parameters.

The Import Cockpit also facilitates a selective data import. You can define a number of data lines to
be skipped. It is also possible to define constant values for attributes in all imported objects instead
of using field values of the CSV source data file. In addition, it is not necessary to map all source
data columns to attributes, so that certain columns can be omitted during import.

Moreover, the Import Cockpit provides the user with additional information about the current job status, results, and the history of imports. These records provide a basis for analyzing imports and enhancing the supporting service.

Using the Import Cockpit


Mapping Tab
In the Mapping tab of a particular import job, you can define or edit the mapping of data columns. The data columns can be dragged to the mapping area in order to assign a column to the internal SAP Commerce data structures. The mapping area shows a list of mapping entries that provide a correlation between the SAP Commerce type attribute and the source column.

Quick and Reliable Performance


The Import Cockpit includes intelligent wizards that you can use for initial job creation, enabling you to create frequently used jobs quickly. Stored job templates also accelerate the creation of import jobs. Import status functionality lets you track import jobs through the whole process, from creation through execution to monitoring, so the user always has an up-to-date overview and can see whether supervision is necessary.

Benefits
 Reduce import complexity and empower data managers with an intuitive, graphical UI.

 Support the long tail approach by enabling easy supplier on-boarding.

 Keep your content accurate by importing the most current product information from your
suppliers and business partners.

 Aggregate all product information scattered across systems and departments.

The following features are available, with the release in which each was introduced:

 Graphical user interface for importing vast amounts of data. (4.5.0)
 Data mapping via drag & drop - allows cost-efficient supplier self-service models without IT involvement. (4.5.0)
 Mapping validation - ensures data consistency and reduces the follow-up cost of bad data. (4.5.0)
 Progress tracking - monitors the progress of an import job, including the last started job in the current session. (4.5.0)
 Reusable saved mappings - enhance productivity on recurring imports. (4.5.0)
 Browser-based, no client installation needed - easy roll-out to a large and external user base.
Import Cockpit Interface
SAP Commerce Import Cockpit enables you to import data into the SAP Commerce using a CSV
source file without the need of specifying an ImpEx import script. You can perform the import
operations in the user-friendly interface of the SAP Commerce Import Cockpit.

The Import Cockpit consists of the following main areas:

 Navigation area on the left side for previewing of the import jobs history.

 Browser area in the center for browsing import jobs and mappings.

 Editor area on the right side for editing the details of the import jobs.

Navigation Area
The navigation area consists of the following UI (user interface) components:

 Menu

 Execution history box

 History box

 Info box

You can expand or collapse all boxes using the triangle button on the upper right side of a box.

You can rearrange most of the boxes in the navigation area using drag-and-drop operations. The Info box cannot be moved.
Menu
Use the Menu for the following:

 Choosing the language in which screen texts, catalog names, product descriptions, and the
like appear

 Choosing the user group

 Logging out

Execution History Box


The Execution History box displays the history of recently executed import jobs.

History Box
The History box displays a list of up to 20 modification steps. Every entry represents a modification you have made in the course of the current Import Cockpit session. The list displays the earliest modification at the top, while the bottom entry indicates the latest modification.

Click an entry to undo the represented modification plus any others that were done chronologically
after the modification step.

Undone modification steps are displayed with gray text. Click an undone entry in the list to redo the modification of that entry and all modifications prior to it. In this context, redo means undo for undo, that is, a change you have undone is redone.

You can also click the Undo and Redo buttons, or use the keyboard shortcuts Ctrl+Z for undoing and Ctrl+Y for redoing.

As the modifications you make are written to the database outright and the undo/redo history is kept with the Import Cockpit session, you cannot directly review modifications that other Import Cockpit users have made. For a modification history of SAP Commerce items, see SavedValues - Keeping Track of Attribute Value Modification.

Info Box
The Info box displays how many workflow-based tasks a user currently has assigned. It also displays the number of comments.

 If you have any comments, clicking on the number of comments brings up the comments
screen. You can review, edit, add attachments, delete, and reply to existing comments here.

 If you have tasks assigned, clicking on the number brings up the task screen. You can
review your tasks and select an outcome for the tasks here.

The Product Cockpit has an interface to the Workflow Module to enable using workflows in


the Product Cockpit.
Browser Area
The Browser Area in the Import Cockpit consists of these main UI components:

 Tabs:

o Welcome tab

o Job tab

o Mapping tab

 Caption

Caption
The caption bar consists of the following UI components:

 Import Cockpit Interface

Content Browser
Use the content browser to enter a search string that narrows the number of displayed products. Click the Search button to perform the search.

Results are displayed on the active tab of the browser area.

Advanced Search
To access the advanced search dialog, click the Advanced Search button in the search input field.

Use the Clear button to delete everything you entered in the input fields.

Use the Edit button to add search criteria. They appear as additional input fields.

Select the Sticky check box to keep the advanced search options visible all the time. If you want to hide them, clear the Sticky check box.
Click the Search button to perform the search.

Use the Close Browser button to close the advanced search.

Welcome Tab
The Welcome tab is the default tab of the Import Cockpit. Here you can find the information on

 Your user role

 Tasks assigned to you

 Last edited import jobs

You can also create a new import job or go to the jobs you created. It is also possible to go to the Wiki documentation on the Import Cockpit.

Job Tab
The Job tab is composed of two parts

 Main area, displaying the list view of defined import jobs with the information on their status
and with action buttons.

 Context area, with tabs showing detailed information on the selected job.

Mapping Tab
The Mapping tab is composed of two parts

 Main area, displaying three sections and a toolbar, used to create and edit mappings.

 Context area, with tabs showing detailed information on the mapping.

Editor Area
By double-clicking on a job in the browser area, you open the job in the editor area. Here, you can
edit job data
The editor area consists of the header at the top plus a number of sections.

Header
The header displays the job's name. You can also browse through the products displayed in the
browser area using arrows.

Figure: Header containing the job's name.

Sections
The sections of the editor area display a number of job attributes you can maintain. You can
configure the list of attributes to be displayed. Use the TAB key to jump to the next attribute field,
and SHIFT + TAB to jump to the previous attribute field. Via the drag-and-drop operation, you can
move sections and the attribute fields within the editor area. You can re-order sections and the
attribute fields, even across sections. In addition, you can show hidden attribute fields and hide
displayed attribute fields.

The editor area of the Import Cockpit consists of the following sections:

 Basic

 Source File

 Timetable

 Logging

Basic

In the Properties section you can change the job name and upload the new source file. Here you
can also change the default Ignore error mode to Fail or Pause.
Source File

In the Source File section you can set the properties of the source file. Pay special attention to the
field Separator Character and to the radio button Has Header Line.

Timetable

It is possible to define an import job as a cron job. You can configure the trigger and other cron job
details in the Timetable section.
Logging

In the Logging section you can define the logging settings. By default the Log level database for
import jobs is set to ERROR, and Log level file is set to INFO.

Welcome Tab
The Welcome tab of the SAP Commerce Import Cockpit displays the essential information for
starting the work with the SAP Commerce Import Cockpit.

The Welcome tab is the default tab of the SAP Commerce Import Cockpit. Here you can find the
information on

 Your user role

 Tasks assigned to you

 Last edited import jobs


You can also create a new import job or go to the jobs you created. It is also possible to consult the documentation on the SAP Commerce Import Cockpit.

Job Tab
In the Job tab of the SAP Import Cockpit you can browse the import jobs.

The Job tab is composed of two parts

 Main area, displaying the list view of defined import jobs with the information on their status
and with action buttons.

 Context area, with tabs showing detailed information on the selected job.

Main Area
In the main area of the Job tab you can see the import jobs in a list view.

To start the job, click the Start Import Job button. Note that if this button is inactive, you must first create a mapping for this job.

To create or edit the mapping, click the Edit Mapping button.

To delete a job, click the Delete Job button.

Figure: The main area of the Job tab displaying the status of the Test Job

The context area of the Job tab consists of the following tabs:

 Trace Log tab, showing errors that occurred during the job run time.

 Source Data tab, displaying the source file in its original format.

 Source Data (Table) tab, displaying the source file in the table format.

 Output Impex tab providing output ImpEx script, generated from the source file.
 Log History tab containing the logs from the job run time.

Figure: The context area of the Job tab with the displayed Output Impex tab.


Mapping Tab
In the Mapping tab of the SAP Import Cockpit you can create and edit the mapping for your import
job.

The Mapping tab is composed of two parts

 Main area, displaying three sections and a toolbar, used to create and edit mappings.

 Context area, with tabs showing detailed information on the mapping.

Main Area
In the main area of the Mapping tab you can see three sections:

 Source section, containing the names of the imported attributes from the source file.

 Mapping section, used for matching the source attributes with the SAP Commerce attributes. Drag the attributes from the Source column and drop them into the Mapping column. Then select the equivalent from the SAP Commerce attributes column and drop it into the same row.

 Target section, containing the list of attributes of a previously selected SAP Commerce type, for example Product or User.

In the upper part of the main area you can find the toolbar, containing buttons for uploading ZIP files,
validating, and saving the mapping.
Figure: The main area of the Job tab displaying the mapping of the Test Job with the Product type.

Context Area
The context area of the Mapping tab consists of the following tabs:

 Console tab, showing errors in mapping.

 Preview tab, displaying the source file in the table format.

Figure: The context area of the Job tab with the displayed mapping errors.

Working with the Import Cockpit
The Import Cockpit enables you to easily import data into your SAP Commerce.
In the Import Cockpit you can easily do the following:

 Create and edit import jobs.

 Create and edit mappings for the imported files.

 Create recurrent cron jobs.

Getting Started with the Import Cockpit
The SAP Import Cockpit allows you to easily import data into your SAP Commerce.

You can also use the Hybris Management Console (HMC) to perform the tasks that you can perform in the SAP Import Cockpit. For more information, see Using ImpEx with Hybris Management Console or SAP Commerce Administration Console.

Procedure
1. Open an Internet browser.

2. Enter the URL of the SAP Commerce Cockpit in the browser's address bar. The default URL is http://localhost:9001/mcc. If you are not already logged in, the SAP Commerce Cockpit Login appears.

The Import Cockpit Menu


The SAP Commerce Import Cockpit menu allows you to select your screen language, define your
user settings, and log out.

 Data Language: Choose the language in which the screen text should appear.

 User Group: View to which user group your user account is assigned.

 User Settings: Select which user settings you want to reset.

 Logout: Log out of the current cockpit.

The Import Cockpit Perspective


The Import Cockpit perspective consists of the following areas:

 A navigation area on the left side for previewing of the import jobs history.

 A browser area in the center for browsing import jobs and mappings.
 An editor area on the right side for editing the details of the import jobs.

Creating Import Jobs


The SAP Import Cockpit allows you to create import jobs.

On the Welcome tab, click the Create a new import job button.

The Mandatory Fields page of the Create/Add Item wizard is displayed

Select a source file: click the Create an Item button  .


The Please Select an Option page is displayed. 

1. You can select an already uploaded file (Select an Existing Reference) or create a new upload (Create a New Item). Select Create a New Item.

2. Click the Next button. The Mandatory Fields page is displayed.

3. Enter the file identifier and click the Next button. The Upload File page is displayed.

4. Click the Upload Dialog button.

5. Browse your file folders to locate the file that you want to upload.

6. Select the file that you want to upload and click the Done button. The Mandatory Fields page is displayed.

7. Select the job you want to use. If no jobs are defined, you must create one. Enter a name in the Job field.

8. Click the Done button.

Editing Import Jobs


You can edit existing import jobs

Context

To edit an existing import job:


Procedure
1. On the Welcome tab, click the View your jobs button.

2. Double-click on the name of the job that you want to edit.

The job attributes are displayed in the editor area. All changes are saved automatically;
you do not need to save them manually. For more information on the editor area, see

the Import Cockpit Interface. 

3. Modify the attributes and settings as required.

Creating Mappings for Imported Files
The SAP Import Cockpit enables you to easily create mappings. You just drag and drop the
attributes from your imported file and map them to SAP Commerce attributes.

Context

You cannot start an import job without using a mapping. If you do not have a mapping specified,
the Start import job button   is inactive and you cannot import the desired file. The Start Import
Job button   is only activated once you have defined a mapping.

To create a mapping:

Procedure
1. On the Welcome tab, click the View Your Jobs button.
In the Overview Import Jobs page, all your defined jobs are displayed.

2. In the Actions column of the job for which you want to create a new mapping, click the Edit
Mapping button   .

The Load or Create New Mapping page of the wizard is displayed.


3. In this example, select Create New Mapping.

The Create New Mapping page is displayed.


4. Select the Target object and the Catalog version from the drop-down lists.

5. Click the Done button.

The job opens in a new mapping tab.


Because the mandatory attributes are not yet mapped, the console in the context area displays error
messages.
6. Drag the attributes from the Source column and drop them to the Mapping column on its left
side. 

7. Drag the corresponding attributes from the Product column and drop them to the Mapping column on its right side, to create a mapping.

8. Click the Validate button to check if the mapping is correct.

If the mapping is correct, there are no errors indicated in the console.

9. Click the Save button to save the mapping. You can later reuse this mapping for other jobs.
In the example below you can see the created Test Job with the active Start import job button 

You successfully created a new mapping.

Editing Mappings for Import Jobs


Steps walking you through editing your import job mappings.

Context

You need to have your import job mappings created as described in the Creating Mappings
documentation.

Procedure
1. On the Welcome tab, click the View your jobs button.
2. Click the Edit mapping button  . Load the mapping that you want to edit.

3. Edit the mapping as required.

4. Click the Save button to save your edits.

Running Import Jobs


Use the SAP Import Cockpit to run your jobs.

Context

You can start your job directly after creating a mapping. To do so:
Procedure

1. Click the Run job now button in the top right corner.

You can also start your job from the Jobs tab by clicking the Start import job button.

The job is started.

2. To stop it, click the Stop import job button  .


You can see the result of the executed job in
the Jobs tab: 

3. You can also check the history of all jobs executed since you logged in, in the Execution History box in the navigation area of the Import Cockpit.

importcockpit Extension
The importcockpit extension provides the SAP Commerce Import Cockpit Module, which
enables you to import data into SAP Commerce using a CSV source file without specifying an ImpEx
import script. You simply define the type of data to be imported and the target attributes each source
data column corresponds to.

To ensure proper mapping and avoid omitting mandatory attributes, a set of mapping validation rules is applied.
The importcockpit extension offers a preconfigured user interface, also known as a perspective. This perspective is widely customizable. Below, you can find more information about this topic.

For information about how to add the importcockpit extension to your SAP Commerce installation, please go to Configuring Available Extensions.

An SAP Commerce extension may provide functionality that is licensed through different SAP


Commerce modules. Make sure to limit your implementation to the features defined in your contract
license. In case of doubt, please contact your sales representative.

Customization Options
The Import Cockpit consists of a single perspective that can be configured to offer multiple role-based sets of user interface elements. For example, a principal group can be restricted from creating jobs or mappings and only work with existing ones. The principal group can be further restricted in the attributes the group is allowed to see, modify, or map. This level of customization is a result of the Cockpit framework on which the importcockpit extension is built. The Cockpit framework offers several customization options described in cockpit Extension. You can distinguish different levels of customization: easy, medium, and expert.

Easy Customization
You can also use the easy customization options described in cockpit Extension, section Easy
Customization, to configure most elements of the importcockpit extension. Further options not
fully covered in cockpit Extension are discussed below.

Customization Options and Documentation:

 Configure the job tab - Configuring the Job Tab of the SAP Import Cockpit
 Learn about the customization options of the mapping tab - Configuring the Mapping Tab of the SAP Import Cockpit
 Change the console message descriptions - Configuring the Console Message Descriptions
 Configure the mapping wizard - Configuring the Mapping Wizard
 Learn how to display and hide attributes in the target section - Configuring the Target Section of the Mapping Tab
 Learn how to configure the main area - Configuring the Main Area of the Mapping Tab
 Learn about the customization options of the editor area - Configuring the Editor Area of the Import Cockpit

Medium Customization
The medium customization needs very little implementation, because it can be done by using an
existing importcockpit extension as a template to be modified. For details, see the cockpit
Extension, section Medium Customization.

You need a valid ZK Studio Enterprise Edition license for the medium customization.

Expert Customization
You can also construct a new cockpit extension, based on the importcockpit extension or using
the yCockpit template. For details see the cockpit Extension, section Expert Customization.

The expert customization needs experienced implementation, because it uses the SAP Cockpit framework independently of the existing cockpit. You need a valid ZK Studio Enterprise Edition license for the expert customization.
Validation in the Import Cockpit
This document describes how the validation works in the importcockpit extension.

Validation Types
There are five types of validation:

Validator 1 - ValidateMappedSourceColumns - Checks whether every dragged source column is mapped to the target.
Validator 2 - validateSourceColumns - Checks if the mapped source columns are still in the source file.
Validator 3 - validateCollectionMappingLine4Attributes - Checks if the Translator- and ComposedMappingLine is mapped. Note that only the Product type has this line.
Validator 4 - validateMappingInsert - Checks if a mandatory attribute is missing.
Validator 5 - validateMappingUpdate - Checks if a unique attribute is missing.

Validators in Different Adding Data Modes
There are three different ways of adding data: INSERT, UPDATE, and INSERT_UPDATE. Below
you can find a table presenting which validator covers which adding mode:

Adding Data Mode Responsible Validator

INSERT Mode Validators 1 to 4

UPDATE Mode Validators 1 to 3 and 5

INSERT_UPDATE Mode Validators 1 to 5 (all possible validators)

Please note that there is no way to configure the rules or strategies for validation.

Import Cockpit - Data Model Overview
This document looks at the data model of the importcockpit extension. It provides enough information to enable a developer to understand the extension well enough to design an importcockpit-based customer project or to customize parts of the extension itself.

importcockpit Extension Basics


The importcockpit extension provides a means for importing data into the SAP Commerce system through a user-friendly interface and avoids the need to learn the ImpEx syntax in detail. The output of this process is a combination of import ImpEx scripts and import CSV files, all of which is stored in an ImportCockpitJob object and saved in the database.
ImportCockpitCronJob
The ImportCockpitCronJob type extends ImpExImportCronJob. See ImpEx for more information
about this latter type. Here you can find the information about the extended attributes relevant to
the importcockpit extension.

An ImportCockpitCronJob contains the following attributes:

Extended Attribute Name - Used For

job - An abortable ImportCockpitJob, which is run to do the actual import of the data (the jobMedia) held by the parent ImportCockpitCronJob.

inputMedia - This is an ImportCockpitInputMedia typed attribute. It is the CSV file that contains the data to be mapped. It needs to be structured in such a way that all fields for a particular record to be imported into the SAP Commerce system are contained on one line.
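The original sample file is not reproduced here; a hypothetical input file consistent with the header line quoted below could look like this (the values are illustrative only):

Number;category;approval;MinOrderQuantity;Currency;Unit;Price
10001;screws;approved;10;EUR;pieces;1.99
10002;nails;unapproved;25;EUR;pieces;0.49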
The header line in the example, Number;category;approval;MinOrderQuantity;Currency;Unit;Price, is used as the source field names in the Source section of the Mapping tab. The other lines each represent a record that is transformed into data files (CSV) matching the related SAP Commerce type. These data files make up the jobMedia.

Below is a list of the relational rules of this data model.

An ImportCockpitCronJob can have:

 Only one job object, which has to be of the type ImportCockpitJob. This is a mandatory attribute and has to be added when the job is created and before saving.
 Only one inputMedia object, which has to be of the type ImportCockpitInputMedia. This is a mandatory attribute and has to be added when the job is created and before saving.
 Only one mappingValid flag set. Although this is a mandatory attribute, if it is not set at creation time the default value of false is assigned.
 Only one nextExecutionTime object, which has to be of the type java.util.Date. This is an optional attribute and can be added after creation.
 Only one jobMedia object, which has to be of the type ImpExImportCockpitMedia. This is an optional attribute and can be added after creation.
 Only one mapping object, which has to be of the type ImportCockpitMapping. This is an optional attribute and can be added after creation.

ImportCockpitJob
The ImportCockpitJob type extends ImpExImportJob.

ImpExImportCockpitMedia
The ImpExImportCockpitMedia type extends ImpExMedia.

ImportCockpitMapping
The ImportCockpitMapping type extends Media.

ImportCockpitInputMedia
The ImportCockpitInputMedia type extends ImpExMedia.

An ImportCockpitInputMedia contains the following extended attributes:

Attribute Name - Used For

hasHeaderLine - A boolean flag that indicates whether the source input media CSV file contains any header column names.
Configuring the Job Tab of the SAP Import Cockpit
This document covers the default configuration shipped out of the box as well as other custom
configurations available to the importcockpit extension.

The following documents outline the general configuration options for the cockpit browser area, including the List View, Advanced Search, and the base views:

 Configuration of the Cockpit UI

 How to Configure the Advanced Search Dialog Box

 How to Configure the List View

 Job Tab end-user documentation of the SAP Import Cockpit


 Automatic Storing of UI Configuration - The SAP Cockpit framework gives you the opportunity to automatically store your configured user interface (UI) without using XML.

The Job Tab


The Job tab is discussed in the Job Tab end-user documentation of the SAP Import Cockpit.

Runtime Configuration

The list view is the default view mode for the Job tab. It is constructed as a table in which each row is
a separate ImportCockpitCronJob instance. This view mode can be configured without having to
modify any XML configuration files.

Advanced Configuration
The list view can also be configured based on user roles, where each user group can have a different
UI setup. This type of configuration is done by editing the XML UI configuration files.

Simple Search and Sort Order


Here you can edit and change the search properties, including sort order, that are used for
the simple search.

Advanced Search and Sort Order

Figure: Advanced Search and Sort Order options for the Job tab of the SAP Import Cockpit.

Figure: Runtime Advanced Search configuration for the Job tab of the SAP Import Cockpit.


Here you can edit and change the search properties, including sort order, that are used for
the advanced search.

List View Page Size


Figure: Default paging options for the list view in the SAP Import Cockpit.

On the toolbar, a user can choose the number of ImportCockpitCronJob items to display in
the list view. The values in the drop-down box show the defaults.

Figure: The list view in the main area of the Job tab of the SAP Import Cockpit.

The default layout of the list view shows only a subset of the properties that can be displayed in
the list.

Configuring the Mapping Tab of the SAP Import Cockpit
The Mapping Tab Overview
The Mapping tab is composed of two parts:

 Main area, displaying three sections and a toolbar, used to create and edit mappings.

 Context area, with tabs showing detailed information on the mapping.

Customization Options  Documentation

The target section     Configuring the Target Section of the Mapping Tab

The mapping section    Configuring the Main Area of the Mapping Tab

The context area       Configuring the Console Message Descriptions

Mapping Wizard Configuration

The mapping wizard used to create a new mapping, or to select an existing one, for an
ImportCockpitCronJob is a custom wizard with minimal configuration options. It does not follow the
standard configuration options discussed in the cockpit Extension - Technical Guide. You also have
the possibility to manually set the catalog version.

Customization Options  Documentation

Mapping wizard         Configuring the Mapping Wizard

Configuring the Mapping Wizard
The following documents outline the configuration options for the mapping wizard.

Configuring the Target Types of the Mapping Wizard
It is possible to configure the available target object types of the mapping wizard.

Configuring the Editor Area of the Import Cockpit
The How to Configure the Editor Area document outlines the general configuration options for the
editor area. This is expanded here to cover the default configuration shipped with SAP Commerce,
as well as other custom configurations available to the importcockpit extension.

Tip
Automatic Storing of UI Configuration

The Cockpit framework allows you to automatically store your configured user interface (UI)
without using XML. For more information, see Storing UI Configuration.

Impex Scripts and Media Generation in the Import Cockpit
The ImpEx scripts and related import media are generated in the importcockpit extension.

Creating Mappings for Imported Files
The SAP Import Cockpit enables you to easily create mappings. You simply drag and drop the
attributes from your imported file and map them to SAP Commerce attributes.

Context

You cannot start an import job without a mapping. If you do not have a mapping specified,
the Start import job button is inactive and you cannot import the desired file. The Start import
job button is only activated once you have defined a mapping.

To create a mapping:
Running Import Jobs
Use the SAP Import Cockpit to run your jobs.

Context

You can start your job directly after creating a mapping. To do so:

Procedure
1. Click the Run job now button in the top right corner.

To stop the job, click the Stop import job button.

You can see the result of the executed job in the Jobs tab.

You can also check the history of all jobs executed since you logged in, in the Execution
history box in the navigation area of the Import Cockpit.
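For completeness, here is a minimal sketch of checking the outcome of an executed job from code rather than from the Jobs tab. It assumes the Platform's ModelService and the cron job model instance from the earlier sketch; this is an illustration only, not part of the cockpit UI.

// Sketch: inspect the outcome of the executed import cron job.
modelService.refresh(cronJob); // re-read the status/result written by the cron job engine
if (CronJobStatus.FINISHED.equals(cronJob.getStatus())
        && CronJobResult.SUCCESS.equals(cronJob.getResult()))
{
    // The generated ImpEx script and CSV media were imported successfully.
}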

importcockpit Extension
The importcockpit extension provides the SAP Commerce Import Cockpit Module, which
enables you to import data into SAP Commerce from a CSV source file without writing an ImpEx
import script. You simply define the type of data to be imported and the target attribute that each
source data column corresponds to.

To ensure proper mapping and avoid omitting mandatory attributes, a set of mapping validation rules
is applied.

The importcockpit extension offers a preconfigured user interface, also known as a
perspective. This perspective is highly customizable.

Impex Scripts and Media Generation in the Import Cockpit
The ImpEx scripts and related import media are generated in the importcockpit extension.

Import Cockpit Job Media Generation Overview
The figure above shows a basic overview of the activities involved in setting up an import job in
the importcockpit extension. Below is a breakdown of the areas depicted in the image. The
two main services required to generate the ImpEx script and the import CSV files are
the ImpExMediaGenerationService and the ImpExTransformationService.

The following activities cover creating the mapping and are provided as background information:

Activity Number  Explanation

1  Creating a new ImportCockpitCronJob.

2  Attaching a source CSV file, that is, the data to be imported.

3  Creating a mapping via the mapping wizard and the Mapping perspective.

4  Validating and saving the mapping to the job.

This section looks at the call that starts the media generation process:

Activity Number  Explanation

5  Run the ImportCockpitCronJob either via the Run job now button in the Mapping perspective or via the Start import job action in the list view on the Job Overview tab.

6  The perform cron job method is then called on the active ImportCockpitJob object.

7  A call to the ImportCockpitCronJobService is made to start creating the job media, that is, the ImpEx file and related CSV files.

8  The ImportCockpitCronJobService calls the ImpExTransformationService to generate the ImpEx file and related CSV files.

This section looks at the call to the ImpExTransformationService, which is responsible for
interpreting the mapped lines and generating data that represents the SAP Commerce target item
type:

Activity Number  Explanation

9  A call is made to generate the import ImpEx script. It involves several sequential calls that result in the ImpEx file used later in the process to import the generated CSV data. See Import Impex Header Generation in the SAP Commerce Import Cockpit for more details.

10  A call is made to generate the import data files. It involves several sequential calls that result in one or more CSV files, depending on the related SAP Commerce item type that the data represents. For example, if you are importing Product data with prices, two CSV files are generated: one for the product and the other for the price information. See Import Data File Generation in the Import Cockpit for more details.

The last two activities are for background information only. A sketch of triggering this flow programmatically follows the table.

Activity Number  Explanation

11  An Importer is created, making use of the cron job with its related media and an ImpExImportReader.

12  The doImport method is called. It extracts the media from the job and imports the generated CSV files using the generated ImpEx file.
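The activities above are normally triggered from the UI. As an illustration only, the same flow can also be started through the Platform's CronJobService; the method name and variables below are illustrative, and running synchronously is just one possible choice.

// Sketch: trigger the ImportCockpitCronJob prepared in activities 1-4 without the UI.
public void runImportCockpitJob(final CronJobService cronJobService,
                                final ImportCockpitCronJobModel cronJob)
{
    // Passing true runs the job synchronously, blocking until the ImpEx/CSV
    // generation and the subsequent import (activities 5-12) have finished.
    cronJobService.performCronJob(cronJob, true);
}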

Required Services and Related ImpEx Generator Operation Strategies
The main services and ImpEx generator operation strategies required to produce the ImpEx script
generated above include:

 ImpExMediaGenerationService

o ImportCockpitMappingService

o HeaderGeneratorOperation

o FileGeneratorOperation

Required Services and Related Data Generator Operation Strategies
The main services and data generator operation strategies required to produce the CSV-formatted
data above include:

 ImpExTransformationService

o ImportCockpitMediaService

o DataGeneratorOperation

o FileGeneratorOperation

Importing the Classification Attribute Value
You can use the Import Cockpit to import classification attribute values.

Context

Once the import job and a mapping object for that job are created, mapping classification attributes
becomes similar to mapping normal attributes. There is, however, one exception: classification
attributes are not listed in the target attribute list on the right side. Instead, they need to be selected
from a dialog box that pops up. For more information, see the following:

 Creating Import Jobs

 Creating Mappings for Imported Files


To create a classification attributes mapping:

Synchronization Between SAP Commerce Installations
SAP Commerce-to-SAP Commerce synchronization consists of transferring data from a source
system to a target system, for example from one SAP Commerce installation to another, using
the Data Hub in between. SAP Commerce-to-SAP Commerce synchronization is possible thanks to
the y2ysync framework.

The y2ysync framework ensures high performance of SAP Commerce while you synchronize your
data. Synchronization is divided into stages that are independent of each other: transferring data
from the source system to the Data Hub, and transferring it from the Data Hub to the target
system. The y2ysync framework guarantees that transferring data from the source
system to the Data Hub doesn't affect the target system's performance. Similarly, transferring data
from the Data Hub to the target system doesn't affect the performance of the source system or of
other target systems.

The y2ysync data flow architecture ensures high performance and supports scalability of SAP
Commerce.

y2ysync Framework
The y2ysync framework shipped with Commerce Platform allows you to develop your own data
synchronization solutions.

With the y2ysync framework, you can create sophisticated system architectures consisting of
multiple SAP Commerce clusters. Imagine, for example, one SAP Commerce cluster serving as a
Product Information Management (PIM) server, and multiple region-specific storefront clusters. The
PIM server would only manage and process the product and catalog master data. It would push the
required data out to a SAP Commerce installation that serves the storefront and collects orders.
You can extend the y2ysync framework to create solutions that integrate data between SAP
Commerce and other systems.

Configuration and Synchronization
y2ysync synchronization relies heavily on the y2ysync and the Data Hub extensions. For that
reason, it requires proper configuration, both on the Commerce Platform side and on the Data Hub
side.

Item  Description

Y2YColumnDefinition  The most basic configuration item. It represents a single attribute (a column) that you can synchronize. The product's code is an example of such an attribute.

Y2YStreamConfiguration  Represents a single type that you can synchronize. It contains multiple Y2YColumnDefinition items that define the attributes you want to synchronize. The Product type, as defined in the Platform (containing multiple attributes), is an example of such a type.

Y2YStreamConfigurationContainer  Represents a single unit of synchronization that contains multiple Y2YStreamConfiguration items. When you initiate synchronization, changes are searched for in all the items that belong to the types specified in the Y2YStreamConfigurationContainer. These changes are then saved as media for export.

In addition to preparing the configuration on the source Platform side, you have to configure Data
Hub with the y2ysync-datahub-ext extension installed. Such a configuration describes:

 The data model that you feed into Data Hub from Commerce Platform

 The transformations the data model should undergo during the composition phase

 The target systems to which you want to push the data

You can generate your Data Hub configuration using the Data Hub Configuration Generator. It
generates an XML file based on the Y2YStreamConfigurationContainer. Use the
same Data Hub Configuration Generator to upload your configuration to the running Data Hub.

Synchronization

The synchronization process usually involves three parties:

 The source system detects items that you created, changed, or deleted since the last
synchronization. It creates files, called media, that contain information about the changes. Finally, it
sends a request to Data Hub to start executing its part of the synchronization process.

 Data Hub acts as a middleman between the source and the target systems. Data Hub imports
data from the source system, composes it, and publishes it to the target system. Depending on the
configuration, you can trigger these steps automatically or manually. For more information,
see Target System Definition and Auto Passthrough.

 The target system receives the new or updated data via ImpEx.
You can initiate synchronization programmatically using the startSync method
from SyncExecutionService, or by clicking the Perform Y2YSyncJob Action button
available in Backoffice. To initiate synchronization, an instance of Y2YSyncJob with a unique
code is required. It must also be assigned to a
specific Y2YStreamConfigurationContainer. As a result of the
initiation, a Y2YSyncCronJob is created. Each Y2YSyncCronJob represents a single
execution of a synchronization action at a given time. It also holds the SyncImpExMedia objects
that contain the data that has changed since the last synchronization.
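As a rough sketch of the programmatic route: startSync on SyncExecutionService is the entry point mentioned above, but its exact parameter list is not shown in this document, so the signature used below (the Y2YSyncJob instance plus an execution mode) is an assumption and may differ in your version.

// Sketch: start a y2ysync synchronization for a prepared Y2YSyncJob.
// The startSync signature shown here is assumed, not confirmed by this document.
public void triggerY2YSync(final SyncExecutionService syncExecutionService,
                           final Y2YSyncJobModel syncJob)
{
    // Each invocation creates a new Y2YSyncCronJob holding the SyncImpExMedia
    // objects with the changes detected since the last synchronization.
    syncExecutionService.startSync(syncJob, SyncExecutionService.ExecutionMode.ASYNC);
}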

You can make Data Hub import the y2ysync media files again if the previous attempt failed. To
do so, click the Resend to datahub action button available in Backoffice.

The relation between Y2YSyncJob and Y2YStreamConfigurationContainer is
somewhat similar to that between a Factory and the objects it produces. A second execution of
synchronization creates another Y2YSyncCronJob object; the result doesn't overwrite
the previous Y2YSyncCronJob. Thanks to this design approach, you can keep the history of
synchronization attempts.

Data Hub Processing Details

When all y2ysync media have been created, the source Platform calls the Data Hub endpoint. It is
exposed by the y2ysync-datahub-ext extension at the address /y2ysync/v60. The request
from the Platform to Data Hub to start executing its part of the synchronization process contains the
following information:

 the synchronization execution ID

 the source Platform URL

 links to all media created in the change detection process

 a pool and a feed

 the auto-publish target systems


y2ysync uses a feed and a pool just as they are specified
in the Y2YStreamConfigurationContainer. If they don't exist in Data Hub, they are
created.

The next step of the synchronization process depends on whether you have specified the
target system earlier or not. If not, y2ysync assumes the manual mode for
composition and publication. In this case, you must trigger composition and publication
manually in Data Hub. However, if you have specified your target system
in the Y2YStreamConfigurationContainer in the source Platform, the target system
implicitly enables the automatic passthrough mode. In the automatic passthrough mode,
composition and publication are triggered automatically. For more information, see Target
System Definition and Auto Passthrough.

Caution: When using the automatic passthrough mode, make sure before the first synchronization
that you have set your target system on
the Y2YStreamConfigurationContainer object. The first synchronization request
creates the necessary feed and pool, and they must have a specific publication strategy set with the
target system names. Therefore, if the target system is invalid or empty, automatic passthrough
won't take place.

Change Consumption Modes
The process of marking items as synchronized (consumed) is called change consumption. There
are two change consumption modes that you can use to synchronize data: the synchronous mode
and the asynchronous mode. The asynchronous mode may improve the overall
performance of the synchronization process.

Data Hub uses REST to notify the source Platform that it has completed downloading all media with
changed items created in the change detection process. The endpoint responsible for receiving this
notification call is exposed
by de.hybris.y2ysync.controller.ConsumeChangesController at
the URL /changes/.

In the synchronous mode, Data Hub doesn't proceed with data composition or publication until all
items are consumed. The notification call to the change consumption endpoint is blocking, and the
response is returned only after all items are consumed. This behavior may not be desired, as it
can slow down the whole synchronization process.

The asynchronous mode uses the task engine and returns the acknowledgment immediately. This
allows Data Hub to resume processing data right away. As a result, the composition and the
publication in Data Hub take place simultaneously with the change consumption in the source
Platform. The asynchronous processing logic is defined by
the consumeY2YChangesTaskRunner Spring bean.

The change consumption process works in batches and by default uses multiple threads. You can
configure it with the following properties:

Generating and Uploading a Data Hub Configuration
You can generate the configuration required for a Data Hub instance that participates in
a y2ysync synchronization scenario.
