SAP BW
Implementing Delta Updates in the Financial Domain
By Sergei Peleshuk
Challenges with delta updates
Delta updates in SAP BW are used when we have to update our data targets with recently changed information. These could be newly created documents or old documents that have recently been modified. The way this mechanism works in SAP BW differs from extractor to extractor; however, there are some similarities. If we take the financial line item extractor (datasource 0FI_GL_4), the delta process picks up all newly created documents based on a document timestamp and, in addition, all documents modified since the last extract.
There are a number of challenges in implementing this process and moving it into production. First of all, the data flow architecture has to be set up as follows: Source -> ODS object -> Data target (Figure A). The ODS object identifies the changes made to individual characteristics and key figures within a delta data record. Other data targets (InfoCubes) can then be supplied with data from this ODS object. If documents are corrected in the source system, this is tracked in the ODS object and a reversal entry for the InfoCube is created automatically.
Figure A
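To illustrate the reversal mechanism, the following sketch shows the principle in plain Python rather than ABAP. The field names and the recordmode flag values are illustrative stand-ins for the ODS change-log structure, not the real 0FI_GL_4 fields.

# A minimal sketch of the idea behind the ODS change log: when a document line is
# corrected, the delta passed on to the InfoCube consists of a reversal of the old
# values (before image) plus the new values (after image). Field names are illustrative.

def ods_change_records(old_record, new_record):
    """Return the records forwarded to the InfoCube for a corrected document line."""
    key_figures = ("amount", "quantity")

    # Before image: same keys, key figures negated, so the old values cancel out
    # when the InfoCube aggregates additively.
    before_image = dict(old_record)
    for kf in key_figures:
        before_image[kf] = -old_record[kf]
    before_image["recordmode"] = "X"   # 'X' = before image (reversal)

    # After image: the corrected values as a normal additive record.
    after_image = dict(new_record)
    after_image["recordmode"] = ""     # ' ' = after/new image

    return [before_image, after_image]


if __name__ == "__main__":
    old = {"doc_no": "4711", "item": 1, "company_code": "01", "amount": 100.0, "quantity": 1.0}
    new = {"doc_no": "4711", "item": 1, "company_code": "01", "amount": 120.0, "quantity": 1.0}
    for rec in ods_change_records(old, new):
        print(rec)

Because the InfoCube aggregates additively, posting the negated before image together with the new after image nets out to the corrected value.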
This setup has certain advantages and drawbacks. For example, it is not easy to reload documents for a certain period of time or for a specific range of documents. Depending on how the delta process is initialized, you may be able to simplify the data refresh procedure if a refresh is needed at a later stage.
Another challenge in this process is the historical data load. If you have to load several years of transactional data, the system will not let you run full updates year by year and then switch back to delta. On the other hand, if you try to initialize the delta process for the whole history, your database engine may not be able to handle days of uninterrupted processing time.
So what is the solution? There is a way to initialize the delta process that allows you to transfer historical data step by step and still make selective updates to the system later on.
Which data is picked up by delta
Delta extraction enables you to load into the BW system only the data that has been added or has changed since the last extraction. Data that has already been extracted and has not changed is kept; it does not need to be deleted before a new upload.
There are two streams of data picked up by the delta process via the datasource 0FI_GL_4 (a simplified sketch of how they are combined follows the list):
1) All documents created in the source system with a timestamp later than the documents picked up by the last delta. A timestamp on the line items identifies the status of the delta data; timestamp intervals that have already been read are stored in a timestamp table (BWOM2_TIMEST).
2) All document line items changed since the last data request in the SAP R/3 system. All line items that are changed in a way relevant for BW are logged in the source system in the delta queue table (BWFI_AEDAT).
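The sketch below combines the two streams in simplified Python form. The field names (doc_no, cpudt_tstmp) and the in-memory structures are illustrative assumptions; in the real extractor the timestamp intervals are managed in BWOM2_TIMEST and the changed documents come from BWFI_AEDAT.

# A simplified sketch of one delta request: new documents selected by timestamp,
# plus documents logged as changed since the last request.

from datetime import datetime

def select_delta(documents, change_log, last_timestamp, now):
    """Return document line items for one delta request.

    documents      -- all line items, each with a 'cpudt_tstmp' creation timestamp
    change_log     -- document numbers logged as changed (the role BWFI_AEDAT plays)
    last_timestamp -- upper limit of the previous delta (kept in BWOM2_TIMEST)
    """
    # Stream 1: documents created after the last extracted timestamp interval.
    new_docs = [d for d in documents if last_timestamp < d["cpudt_tstmp"] <= now]

    # Stream 2: previously extracted documents whose change is relevant for BW.
    changed_docs = [d for d in documents
                    if d["doc_no"] in change_log and d["cpudt_tstmp"] <= last_timestamp]

    return new_docs + changed_docs


if __name__ == "__main__":
    docs = [
        {"doc_no": "100", "cpudt_tstmp": datetime(2004, 1, 10), "amount": 50.0},
        {"doc_no": "101", "cpudt_tstmp": datetime(2004, 1, 20), "amount": 75.0},
    ]
    changed = {"100"}  # document 100 was modified after it had already been extracted
    delta = select_delta(docs, changed, datetime(2004, 1, 15), datetime(2004, 1, 31))
    print(delta)  # document 101 (new by timestamp) and document 100 (logged as changed)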
Stumbling points when moving to production
As soon as all objects are transported to production, we have to start the historical data load. This process can be complex and time consuming. At the same time, when loading production data you may encounter problems you never faced in the QA environment. For example, production transactions may contain characters that are disallowed for some characteristics, or lowercase characters where they are not permitted. This is usually discovered during the historical data loads and therefore requires corrections in the development environment, transporting the changes all the way to production, and finally reloading the data.
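A simple pre-load check can catch many of these cases before they break a production load. The following is a minimal sketch in Python, assuming an allowed-character set roughly equivalent to the BW default; the field names are hypothetical and this is not the RSKC transaction itself.

# Flag characteristic values that contain characters outside an allowed set, or
# lowercase letters, before they cause a load failure.

ALLOWED_EXTRA = set(' !"%&\'()*+,-./:;<=>?_')  # assumption: roughly the BW default set

def invalid_values(records, fields):
    """Yield (record_index, field, value) for values BW would likely reject."""
    for i, rec in enumerate(records):
        for field in fields:
            value = str(rec.get(field, ""))
            has_lower = any(ch.islower() for ch in value)
            has_bad_char = any(not (ch.isupper() or ch.isdigit() or ch in ALLOWED_EXTRA)
                               for ch in value)
            if has_lower or has_bad_char:
                yield i, field, value


if __name__ == "__main__":
    sample = [{"costcenter": "CC_100"}, {"costcenter": "cc#200"}]  # second value is invalid
    for hit in invalid_values(sample, ["costcenter"]):
        print("needs correction:", hit)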
On the other hand, dealing with huge data volumes creates problems related to database capacity, server processing power and, in some cases, disk space. In practice, it is extremely important to load data in reasonable portions, which allows monitoring and control over the database.
How to deal with historical volumes
When historical volumes run to millions of records, it is important to find a way to split the loads into portions and upload the data step by step. In the scenario of delta loads to the ODS and later to the InfoCube, you have to use initial loads in order to apply the delta extractors.
The first step in this process is to analyze the historical data and identify data objects that allow you to split the loads into reasonable portions. These could be company code ranges, cost center ranges, and so on; it is important that these ranges do not change over time. Time periods can be used as well. However, if you split only by time period and later discover that documents for a certain company code have to be reloaded, you would need to refresh the whole history for the cube. If, on the contrary, you make your initial loads by company code, you gain the extra flexibility to refresh the history for a certain company code only.
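As a rough illustration of this analysis step, the following Python sketch counts historical line items per candidate portion (company code and fiscal year). The field names are hypothetical; in practice this profiling would be run against the source tables themselves.

# Count records per candidate split attribute to check that portions stay manageable.

from collections import Counter

def portion_sizes(line_items):
    """Return record counts keyed by (company_code, fiscal_year)."""
    return Counter((item["company_code"], item["fiscal_year"]) for item in line_items)


if __name__ == "__main__":
    items = [
        {"company_code": "01", "fiscal_year": 2000},
        {"company_code": "01", "fiscal_year": 2001},
        {"company_code": "15", "fiscal_year": 2000},
    ]
    for (cc, year), count in sorted(portion_sizes(items).items()):
        print(f"company code {cc}, year {year}: {count} records")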
The second step involves running the initial loads step by step for each company code / time period range. For example, we start with company codes 01-10 for the year 2000 and continue year by year up to the last completed year. For the continuous delta update we then run an initial load for the time period from the current year until 31/12/9999. We then proceed to the next company code range, say 11-20, and finally 21-99 (Figure B).
This approach ensures reasonable loads of historical data and leaves you the flexibility to refresh data at a later stage.
Figure B
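The following sketch generates such a load plan in illustrative Python: closed yearly selections per company code range, followed by an open-ended selection up to 31/12/9999 from which the delta update continues. Treating 2004 as the current year is an assumption for the example.

# Build the ordered list of initialization selections described above.

CURRENT_YEAR = 2004  # assumption for the example

def init_load_plan(company_code_ranges, first_year, current_year=CURRENT_YEAR):
    """Return (company code range, date range) selections in load order."""
    plan = []
    for cc_low, cc_high in company_code_ranges:
        # Closed yearly portions for the history.
        for year in range(first_year, current_year):
            plan.append(((cc_low, cc_high), (f"01/01/{year}", f"31/12/{year}")))
        # Open-ended portion that the ongoing delta update will continue from.
        plan.append(((cc_low, cc_high), (f"01/01/{current_year}", "31/12/9999")))
    return plan


if __name__ == "__main__":
    for selection in init_load_plan([("01", "10"), ("11", "20"), ("21", "99")], 2000):
        print(selection)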
Job scheduling
The delta update process can be fully automated. This means that no manual involvement is required in the daily data updates unless there are system problems or breakdowns.
A daily update process consists of three major phases:
1) Master data updates, e.g. customer, vendor and G/L account data;
2) Running the daily transactional extractor from the source system to the ODS;
3) A further upload from the ODS to the InfoCube - at this stage additional transaction lines may be generated, depending on the number of reversal entries required in the cube.
This standard job schedule may be preceded or followed by other relevant jobs, depending on the system design and on any other extractions required by the overall solution.
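As a conceptual sketch only (not a real BW process chain), the three phases above can be thought of as an ordered sequence that stops at the first failure so the chain can be restarted; the load functions below are placeholders.

# A simplified stand-in for the daily job sequence: master data first, then the
# transactional delta to the ODS, then the further update from the ODS to the InfoCube.

def load_master_data():
    print("1) Loading master data: customers, vendors, G/L accounts ...")

def load_delta_to_ods():
    print("2) Running the daily 0FI_GL_4 delta into the ODS object ...")

def update_ods_to_infocube():
    print("3) Updating the InfoCube from the ODS change log (incl. reversal entries) ...")

DAILY_CHAIN = [load_master_data, load_delta_to_ods, update_ods_to_infocube]

def run_daily_chain():
    """Run each phase in order; stop on the first failure so the chain can be restarted."""
    for step in DAILY_CHAIN:
        try:
            step()
        except Exception as exc:          # in BW this would be process-chain error handling
            print(f"chain stopped at {step.__name__}: {exc}")
            break


if __name__ == "__main__":
    run_daily_chain()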
Sergei Peleshuk
ABSS Europe
Tel. +32 2 375 9752
www.abss.be