Towards Automated Enterprise Architecture Documentation: Data Quality Aspects of SAP PI
Key words: Enterprise Service Bus (ESB), SAP PI, data quality, automated
Enterprise Architecture documentation
2 Sebastian Grunow, Florian Matthes, and Sascha Roth
2 Related Work
Existing EA frameworks, including The Open Group Architecture Framework
(TOGAF) [20], the Integrated Architecture Framework (IAF) [22], and Enterprise
Architecture Planning (EAP) [18], commonly do not detail how to acquire and
incorporate EA knowledge. Only a few approaches that consider the documentation
of the status quo could be identified, and their descriptions usually remain at a
high level of abstraction without considering concrete process tasks. For instance,
TOGAF suggests using existing architecture definitions as a starting point,
which, if necessary, can be updated and verified. If such information is
unavailable, TOGAF advises collecting data in “whatever format comes to hand” [20].
Moser et al. [13] give a first idea of an automated, tool-aided collection
process by introducing a set of EA process patterns, one of which is called Automatic
Data Acquisition / Maintenance. The authors propose a process that automatically
collects data from various sources and converts them into an EA information model
instance. However, they detail neither the possible information sources nor the
quality of the data these sources provide.
Based on identified requirements for an automated documentation process,
Farwick et al. [6] develop a basic structure of an automated maintenance process
comprising the collection of data as well as the propagation of changes. However,
for implementation details the authors refer to future work.
Buschle et al. [5] take a more detailed look at an implementation. They use
NeXpose, a vulnerability scanner aimed at determining weaknesses within system
landscapes. Apart from weaknesses, the scanner also collects information about the
underlying system landscape, which is subsequently mapped to an EA information
model. In this way, information on existing services, installed software, and
operating systems in use could be gathered. While the publication considers the
information coverage, i.e. the extent to which the demanded EA information can be
determined, it does not discuss the usefulness of the collected data, leaving open
questions as to, for instance, the correctness and completeness of the data.
Instead of using a vulnerability scanner, [4] employ an ESB. While the degree to
which data of a productive system can be used for EA documentation is thoroughly
analyzed, the quality of these data is not evaluated.
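The notion of information coverage mentioned above can be illustrated with a small sketch. It assumes, for illustration only, that coverage is measured as the share of demanded EA attributes for which a source supplies data; the function name `information_coverage` and all attribute names are hypothetical and are not taken from [5] or [4], which may define coverage differently.

```python
def information_coverage(demanded, available):
    """Return the fraction of demanded EA attributes that a source can supply.

    Assumption: coverage is a plain set ratio; this is an illustrative
    definition, not the one used in the cited publications.
    """
    demanded = set(demanded)
    if not demanded:
        raise ValueError("at least one demanded attribute is required")
    return len(demanded & set(available)) / len(demanded)


# Hypothetical attribute names, for illustration only.
demanded_attributes = {"service.name", "software.name", "software.version", "os.name"}
scanner_attributes = {"service.name", "software.name", "os.name", "open.ports"}

coverage = information_coverage(demanded_attributes, scanner_attributes)
print(f"coverage: {coverage:.2f}")
```

Such a ratio only states whether an attribute can be obtained at all; it says nothing about whether the obtained values are correct or complete, which is the gap addressed in this paper.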
To the best of the authors' knowledge, there is no existing research analyzing
data quality aspects of ESB systems, and in particular of SAP PI systems, in
productive IT environments.
In line with Steen et al. [19], we observed EAs to be an important starting point
for analysis, design, and decision processes. For this to work, EA information must
provide an accurate, correct, and up-to-date model of the real world [6].
Accordingly, when developing an automated EA documentation process, the suitability
of an ESB system as an information source is determined not only by its information
content but also by the quality of the data stored in the system, e.g., Are the data
up-to-date? Are the attribute values correct? Are the data consistent?
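The quality questions raised above can be phrased as simple automated checks. The following is a minimal sketch under stated assumptions: the record type `SystemRecord`, its fields, the 180-day age threshold, and all sample values are hypothetical and do not reflect the SAP PI data model. Correctness of attribute values is deliberately left out, since it can only be judged against a trusted reference; completeness is checked instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SystemRecord:
    """Hypothetical record of one landscape element exported from an ESB."""
    name: str
    host: Optional[str]   # reference to a computer system; may be missing
    last_updated: date


def is_up_to_date(record, today, max_age_days=180):
    """Are the data up-to-date? Assumes a fixed age threshold in days."""
    return (today - record.last_updated).days <= max_age_days


def is_complete(record):
    """Are all attribute values present? None and empty strings count as missing."""
    return bool(record.name) and bool(record.host)


def is_consistent(record, known_hosts):
    """Are the data consistent? A referenced host must exist in the landscape."""
    return record.host is None or record.host in known_hosts


# Hypothetical sample data, for illustration only.
records = [
    SystemRecord("ERP", "host-a", date(2013, 1, 10)),
    SystemRecord("CRM", None, date(2012, 3, 1)),
    SystemRecord("BW", "host-z", date(2013, 2, 1)),
]
known_hosts = {"host-a", "host-b"}
today = date(2013, 3, 1)

report = {
    r.name: {
        "up_to_date": is_up_to_date(r, today),
        "complete": is_complete(r),
        "consistent": is_consistent(r, known_hosts),
    }
    for r in records
}
for name, checks in report.items():
    print(name, checks)
```

Passing `today` explicitly keeps the checks reproducible. In practice, the thresholds and the reference set of hosts would have to be agreed upon with the stakeholders responsible for the system.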
Subsection 3.1 gives an overview of SAP PI. In order to gain a deeper insight into
the data quality of an SAP PI system in practice, we conducted interviews with SAP
PI experts and an online survey with EA practitioners. Overall, 19 EA practitioners
participated in the subsequent online survey and could be identified as responsible
for an SAP PI system. In the survey, the data content of SAP PI was assessed in
terms of selected quality dimensions (see Subsection 3.2). In Subsection 3.3 we
present the survey results in greater detail.
[Figure (caption not recoverable): component diagram with the labels SAP Application,
Third Party Application, Integration Server, Enterprise Service Builder, Integration
Builder, Marketplace/Business Partner, and Third Party Middleware]
As no research concerned with the quality of ESB systems in general and SAP PI
systems in particular could be identified, we conducted an online survey aimed at
evaluating the quality of data contained in SAP PI systems in practice. The survey
was open for 45 days. 45 SAP PI experts started the survey: 19 completed it fully,
24 completed it partly, and 2 answered only the first two questions. These last two
respondents are excluded from the subsequent analysis.
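The response figures above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the text; the variable names are chosen for illustration.

```python
started = 45            # experts who started the survey
fully_completed = 19    # completed all questions
partly_completed = 24   # completed part of the survey
first_two_only = 2      # answered only the first two questions

# The three groups must add up to everyone who started the survey.
assert fully_completed + partly_completed + first_two_only == started

analyzed = started - first_two_only           # respondents kept for the analysis
completion_rate = fully_completed / started   # share of full completions

print(f"analyzed respondents: {analyzed}")
print(f"full completion rate: {completion_rate:.0%}")
```

This leaves 43 respondents in the analysis, of whom 19 answered every question.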
and South America. Of all respondents, 67 percent stated that they work for a
consulting company. This circumstance benefits the survey results, as the answers
reflect experience with the situation in more than one organization.
Perception of the overall quality: All respondents rate their data quality as nearly
perfect (80 percent) or even perfect (20 percent), which is partly attributable to the
high correlation between data quality and the functioning of the system. In most
cases, incorrect or out-of-date data would lead to unwanted system behavior, e.g.
malfunctions.
[Figure (caption not recoverable): bar chart with the legend entry "Only elements
involved in the collaborative processes" and the answer option "I don't know";
categories include Databases, Computer Systems, and SAP Software Products]
[Figure (caption not recoverable): bar chart with the answer options "Elements are
complete with some exceptions (missing attribute values, default values)", "Commonly
the elements are incomplete (missing attribute values, default values)", and "I don't
know"; categories include SAP Technical Systems, Databases, SAP Software Products,
and Computer Systems]
[Bar chart. Y-axis: number of participants. Categories: SAP Software Products,
Databases, Technical Systems, Third Party Software Products, Computer Systems,
Standalone Java Systems. Answer options: Elements are accurate; Elements are accurate
with some exceptions (e.g. incorrect attribute values); Commonly the elements are
faulty; I don't know.]
Fig. 4 Correctness of SLD data (n=19)
[Figure (caption not recoverable): two pie charts with the categories 0-1 Month,
1-3 Months, 3-6 Months, 6 Months – 1 Year, > 1 Year, and I don’t know. The extracted
percentages (3%, 7%, 17%, 23%, 28%, 10%, 28%, 13%, 17%, 34%, 20%) cannot be reliably
assigned to the categories.]
information beyond is commonly abstracted even though the SLD officially is meant
to be the central information source about the system landscape.
This paper gives a first impression of the impact that data quality aspects of
productive IT systems have on automated EA documentation endeavors. The survey
results presented show that the majority of data in productive SAP PI systems seems
to be accurate (complete and correct). Combined with the results of our recent study
[4], an SAP PI system seems to be 1) suitable for and 2) a reasonable starting point
for an automated EA documentation endeavor. However, this statement cannot be
generalized, and further research on productive IT environments is necessary to show
the real potential of automated EA documentation.
Our current efforts are centered around a particular ESB system, namely SAP PI.
Further research could integrate Configuration Management Databases (CMDBs)
and infrastructure monitoring tools. Both could be very beneficial for impact
analyses and strategic planning. Our long-term hypothesis is that when information is
gathered from upper layers, e.g. business processes and capabilities, more
unstructured information is to be expected, which would have an impact on 1) the data
quality and 2) the automation potential for EA documentation. With this in mind, a
closer look at the field of process mining could be worthwhile.
References
19. Steen, M., Strating, P., Lankhorst, M., Doest, H., Eugenia, I.M.E.: Service-Oriented Enterprise
Architecture. Idea Group Publishing (2005)
20. The Open Group: TOGAF Version 9 – A Manual, 9th edn. Van Haren Publishing (2009)
21. Wang, R.W., Strong, D.M.: Beyond accuracy: What data quality means to data consumers.
Journal of Management Information Systems 12, 5–30 (1996)
22. van’t Wout, J., Waage, M., Hartman, H., Stahlecker, M., Hofman, A.: The Integrated
Architecture Framework Explained: Why, What, How, first edn. Springer, Berlin (2010)
23. Zachman, J.A.: A framework for information systems architecture. IBM Syst. J. 26(3),
276–292 (1987). URL http://portal.acm.org/citation.cfm?id=33596