
Lecture 2

Evolution of System Integration


Source: Course Book, chapters 3 to 5

System Integration of IT-Based Business Systems (Systemintegration av IT-baserade affärsystem)


SYSTINT
Jelena Zdravkovic
Goal of the Lecture
To:

 explain system integration solutions from a historical perspective,

 emphasize the benefits and drawbacks of the presented solutions.


What is evolution of SI about?
 Aside from the challenge of enabling applications to connect in a non-local (i.e.
distributed) system environment…
 …the main principle behind integration has been to facilitate loose
coupling among participating systems/applications, i.e. to reduce the
assumptions that two systems/applications make about each other when
they exchange information or functionality.
 Aim: to make it possible for systems to integrate while keeping their
dependence on each other low (so that each can execute other tasks at the
same time, integrate with other systems for the same needs, make internal
changes without requiring changes in the others, etc.)
What is evolution of SI about?
 The more assumptions two parties make about each other (i.e. the
higher the coupling), the more efficient the communication can be, but
the less tolerant the solution is to change, because the parties are
tightly coupled to each other in some or even all of the aspects
below:
o Time – do all the connecting applications/systems have to be available at
the same time?
o Exchange protocol – must it match (example: HTTPS) for all the
connecting applications/systems?
o Data format – must the list of parameters and their types match?
A View on the Integration Evolution

[Figure: integration approaches ordered from tight coupling (left) to loose coupling (right)]
Data Sharing – non-real-time data sharing
Sockets – real-time data sharing
RPC – functionality sharing; interface description; marshalling*
ORBs – functionality sharing; language/platform independence; separate
components for network and marshalling functions
Messaging – functionality sharing; separate components for network and
marshalling functions; scalable connectivity; guaranteed delivery

Improved, combined to… IoSI, SoSI, PoSI

*Serialisation of an object from its in-memory representation into a format suitable for transmission to
another system
Content
 Data sharing and Sockets
 Remote Procedure Calls (RPC)
 Distributed Objects (DO) and Application Servers
 Messaging (part II)
Data Sharing and Sockets
 They encompass different methods for sharing data between
applications.
 Applications started to share data long before they started sharing
functionalities.
 Data integration approaches have developed along three layers:
from the OS layer, to the database layer, and to the application layer.
 There have been 3 significant methods for sharing data between
systems:
1. File(-based data) sharing (non real-time integration)
2. Common database (non real-time integration)
3. Sockets (real-time integration)
Real Vs. Non-Real Time Integration
1. Real-time integration refers to the processing of (ability to use) data as
soon as the input data are created (received).

2. In contrast, non-real-time integration occurs after an input has been
created, i.e. the processing is done at a later time. Use of the data is
delayed, after the input data is created, by the time needed to receive or
process it (near-real-time integration and batch integration also fall
into this category).

 There are advantages and disadvantages to both methods:
o Real-time processing produces data that is up to date
o Non-real-time processing is typically more cost-effective (in terms of resources)
1- File-based Data Sharing
 The method facilitates the sharing of data through files.
 It is the most common integration method, because storing data in
files is universal.
 In this method, one application writes data to a file, while the other
application reads data from the same file.

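[Figure: left, on a single server (machine), Application A writes to a file and
Application B reads from it. Right, Application A on Server 1 writes to a file,
the file is transferred via FTP to Server 2, and Application B reads it there.]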
 If applications A and B are running on the same machine, they use
local R/W functions to read and write; when they run on two different
machines, a file-transfer technology between the machines must be
used, such as the File Transfer Protocol (FTP).
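The same-machine case can be sketched in a few lines. This is only an illustration, assuming Python and a temporary directory standing in for the agreed file location; the function names and the file name `orders.txt` are hypothetical and not part of the lecture.

```python
import tempfile
from pathlib import Path


def application_a_write(shared_file: Path, records: list[str]) -> None:
    """Application A: write one record per line to the agreed file."""
    shared_file.write_text("\n".join(records) + "\n", encoding="utf-8")


def application_b_read(shared_file: Path) -> list[str]:
    """Application B: read the records back at some later time."""
    return shared_file.read_text(encoding="utf-8").splitlines()


# The temporary directory stands in for a location both applications agreed on.
with tempfile.TemporaryDirectory() as shared_dir:
    shared = Path(shared_dir) / "orders.txt"  # hypothetical file name
    application_a_write(shared, ["Yellow,145,2456", "Blue,200,1000"])
    received = application_b_read(shared)
```

Note that B only sees what A has already written, which is why the method counts as non-real-time integration. Across two machines, the write and the read would be separated by a transfer step (for example FTP or SFTP).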
1 - File-based Data Sharing
Try ‘ftp ftp.dsv.su.se’, or use a “sftp” utility
1 - File-based Data Sharing
 Today, systems share files using more visual and interactive
methods, such as File Sharing in Microsoft Windows. Enabling such a
solution for data integration between systems remains a continuous design
effort, because of the distributed environment and the different operating
systems involved (Windows, Mac, mobile).
1 - File-based Data Sharing
 The method can be used for sharing different kinds of files, including
images, video, or text files. For automated processing, text files can
be “flat” or written in a structured markup language, such as XML.
o Flat files (meanings not included):
• Fixed-length record files:
Yellow01452456 (6,4,4)
• Varying-length record files:
Yellow,145,2456
o XML files (meanings are included):
<paper>
<color>Yellow</color>
<length>145</length>
<width>2456</width>
</paper>
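To make the difference between the three formats concrete, here is a hedged sketch of how a receiving application might parse each one, assuming Python's standard library. The function names are illustrative; the field widths (6,4,4) and the sample records are the ones shown above.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_fixed(record: str, widths=(6, 4, 4)) -> list[str]:
    """Fixed-length record: the reader must already know the field widths."""
    fields, pos = [], 0
    for width in widths:
        fields.append(record[pos:pos + width])
        pos += width
    return fields


def parse_delimited(record: str) -> list[str]:
    """Varying-length record: fields are separated by commas."""
    return next(csv.reader(io.StringIO(record)))


def parse_xml(document: str) -> dict[str, str]:
    """XML record: each value carries its meaning in the element name."""
    root = ET.fromstring(document)
    return {child.tag: child.text for child in root}


fixed = parse_fixed("Yellow01452456")
delimited = parse_delimited("Yellow,145,2456")
labelled = parse_xml(
    "<paper><color>Yellow</color><length>145</length><width>2456</width></paper>"
)
```

With the flat formats, the reader gets back positional values only and must know out of band that the second field is a length. The XML version returns the field names together with the values.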
1 - File-based Data Sharing
 Main advantage – simplicity of use.
 However, data sharing through files has a number of disadvantages:
o The data is not shared in real time.
o The method is not reliable if a large number of files are involved:
• The files in use must be locked for reading/writing during transmission.
• For processing, the involved applications must agree on the format of the files.
• There must be an agreement on where to place files for the transfer, i.e. for writing
and reading.
• Because of the point-to-point integration*, the method may be less suitable for
involving a large number of applications in the same integration context (because of
increased network traffic or heavy maintenance of the connections).

*Each distinct pair of integrated applications (systems) involves a separate connection, i.e. a point-to-
point link is a dedicated logical link that connects exactly two systems. A multipoint link is a link that
connects more than two systems.
2 - Common Database
 In this method, one application writes data to a common database,
and another application reads the data from that database, using a
common database language (SQL).
 The common (“shared”) database is typically placed on a separate
machine.

[Figure: Applications A and B on System 1 and Applications C and D on System 2
all connect to one shared Database; Application A writes to it, while
Applications B, C and D read from it.]
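A minimal sketch of the idea, assuming Python's built-in `sqlite3` module and an in-memory database standing in for the shared database server (in practice the database would sit on a separate machine and each application would open its own connection). The table name `paper` and the function names are hypothetical.

```python
import sqlite3

# Stand-in for the shared database; a real setup would use a database server.
shared_db = sqlite3.connect(":memory:")
shared_db.execute(
    "CREATE TABLE paper (color TEXT, length INTEGER, width INTEGER)"
)  # the unified schema every application has to agree on


def application_a_write(db: sqlite3.Connection, color: str, length: int, width: int) -> None:
    """Application A: insert a row using SQL."""
    with db:  # commits on success, rolls back on error
        db.execute("INSERT INTO paper VALUES (?, ?, ?)", (color, length, width))


def application_b_read(db: sqlite3.Connection) -> list[tuple]:
    """Application B (or C, D): query the same table using SQL."""
    return db.execute("SELECT color, length, width FROM paper").fetchall()


application_a_write(shared_db, "Yellow", 145, 2456)
rows = application_b_read(shared_db)
```

Any number of readers can run the same query, which is what makes the structure multipoint. They still only see the data when they decide to query, so the sharing is not real-time.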
2 - Common Database
 The method’s main advantages:
o It is a multipoint integration structure, and
o It uses relational databases as the storage (SQL is used by all, with the
same data format).
 Disadvantages:
o Data is not shared in real time.
o A unified database schema is needed for all the involved applications.
o Bottlenecks arise from database locks when a large number of
applications are involved.
o It is not a suitable solution for long-distance networks, because of
transmission performance.
3 - Sockets
 To avoid the problem of “stale” data, a real-time connection was needed.

 The first approach to establishing a real-time connection between
applications was through sockets.

 A socket is a communication end-point that one can address on a
network, i.e. it is a combination of an IP address and a port number:
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # TCP socket
sock.connect(("1.2.3.4", 80))   # placeholder IP address and port number
sock.sendall(b"Hello, world!")
sock.close()

 Sockets have become the standard for connecting over TCP/IP.
Many of today’s integration approaches rely on them.
3 - Sockets
 A socket API* is needed to develop socket applications.

 Sockets assume the existence of a pair of machines:
o a server (”listens” on its socket for an incoming data request) and
o a client (writes a data request to the socket of the server)

*An application programming interface (API) is a library that includes a set of functions for
connecting to and integrating with specific application software, such as Java’s “socket API”.

https://en.wikipedia.org/wiki/Network_socket
http://en.wikipedia.org/wiki/Port_number
https://docs.oracle.com/javase/tutorial/networking/sockets/definition.html
https://en.wikipedia.org/wiki/Application_programming_interface
(An API example)
3 - Sockets
Client steps:
1. Create a Socket
2. Build the Address Structure for the Server
3. Connect to the Server
4. Write and Read to the Socket (i.e. exchange data with the Server)
5. Close the Socket

Server steps:
1. Create a Socket
2. Bind to a Port
3. Listen to Connection Requests
4. Accept a Connection Request
5. Read and Write to the Socket
6. Close the Socket

Both the server and the client code are written in a programming language
according to the steps*.


*Your first exercise - complete the socket example (“Chat Exercise”)
described on iLearn2 and present it on the deadline (Lab 2)!
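The client and server steps map almost one to one onto a socket API. The following is a minimal sketch, assuming Python's `socket` module, with both sides running in one process on the loopback address purely for illustration. The `serve_once` helper and the "echo" reply are made up for this example and are not the Chat Exercise solution.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Server: accept one connection, read the request, write a reply."""
    conn, _client_addr = server_sock.accept()  # accept a connection request
    with conn:
        data = conn.recv(1024)                 # read from the socket
        conn.sendall(b"echo: " + data)         # write to the socket


# Server steps: create a socket, bind to a port, listen.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))                  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()
worker = threading.Thread(target=serve_once, args=(server,), daemon=True)
worker.start()

# Client steps: create a socket, build the server address, connect, exchange, close.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
client.sendall(b"Hello, world!")
reply = client.recv(1024)
client.close()

worker.join()
server.close()                                 # server closes its socket
```

Both sides have to be running at the same time and have to agree that the payload is raw bytes with an "echo" reply, which illustrates the tight coupling in time and in data format.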
3 - Sockets, an Example
3 - Sockets
 The main advantages of sockets:
o Real-time connection.
o Fast, because of a small communication overhead (small message).

 Disadvantages:
o Only data can be shared/integrated (not functionality)
o Socket programming is ”low-level”, i.e. difficult and not suitable for
integration of complex data.
o Different platforms have developed different ”socket” structures.
o The connection is point-to-point and both sides must have their
sockets open at the same time (a tight coupling).
Data and File Sharing - Conclusion
 The major drawback of the three methods at the time they were
proposed was the inability to share functionality.

 Also, sockets have been the only one of the three methods to enable
real-time connectivity.

 Because they provide ”connectivity” and unique addressing, sockets have
become an essential element of many other integration
approaches that allow for sharing of functionality, such as RPC and
distributed objects.
RPC Overview
 The Remote Procedure Call method allows the functions defined in
one application to be called by other applications, on the same
machine or over a network.

 Another advantage: RPC builds on encapsulation, one of the most
powerful structuring mechanisms in application design, where modules
hide their data behind a function call interface (“process order”). In
this way, they can intercept changes to the data and carry out the various
actions needed when the data is changed.
RPC Overview
 RPC follows the ”client/server architecture”, and is a conceptual
solution above socket programming for sharing functionality.

[Figure: Application A (Client) sends a function call over the network to
Application B (Server), which sends back the return.]

Most of the Internet’s main application protocols, including HTTP, SMTP, telnet, and
others, use the client/server model. XML-RPC (http://en.wikipedia.org/wiki/XML-RPC)
is used as an alternative to the SOAP and REST service protocols.
RPC in Details
 RPC has introduced several core elements of system integration
that are used in today’s solutions, such as:
o The declaration of the function interface.
The interface is defined in a specification file describing the input and output
parameters of a server function, which is exposed to the client.
o Marshaling and unmarshaling of parameters over the network.
Transforming the local memory representation (at the client or at the server) of a
function’s parameters and return value to a data format suitable for transmission,
and vice versa.
o An RPC API that includes the calls for network and application routines.
Thus, the application programmer just needs to invoke the remote function, while the
library handles the transport over the network.
o The RPC API has been supported by many programming languages.
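Marshaling and unmarshaling can be tried in isolation, without any network. The sketch below assumes Python's `xmlrpc.client` module, whose `dumps` and `loads` functions convert between Python values and the XML-RPC wire format. XML-RPC is just one possible wire format, and the function name `add` is hypothetical.

```python
import xmlrpc.client

# Client side: marshal the call add(3, 4) into a transmittable text format.
request_wire = xmlrpc.client.dumps((3, 4), methodname="add")

# Server side: unmarshal the request back into parameters and a function name.
params, method = xmlrpc.client.loads(request_wire)

# Server side: marshal the return value (a response carries exactly one value).
reply_wire = xmlrpc.client.dumps((sum(params),), methodresponse=True)

# Client side: unmarshal the reply.
(result,), _ = xmlrpc.client.loads(reply_wire)
```

`request_wire` and `reply_wire` are plain strings, i.e. the format "suitable for transmission"; everything else lives in local memory on one side or the other.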
RPC Process (Informative)
1. The server starts and registers a temporary port to listen for
incoming calls.
2. The client starts and connects to the server (using its network
address and the port).
3. The client calls the client stub, which does the conversion / packing of
parameters (marshaling).
4. The RPC runtime library is invoked to send the packaged messages
over the network to the server machine.
5. As the message is received on the server side, it is sent to the
server stub by the RPC runtime.
6. The server stub unmarshals the input parameters and invokes
the requested (local) function call.
7. After the function is completed, the server stub marshals the
return value into one or more network messages.
8. The server’s RPC runtime sends the packaged return value via
network routines to the client.
9. The client stub reads the message obtained from its network
kernel through the RPC runtime functions.

[Figure: two matching stacks connected over the network. Client side: Client
Application, Client Routines, Client stub, RPC runtime, Network Routines. Server
side: Server Application, Server Routines, Server stub, RPC runtime, Network
Routines.]
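The whole round trip can be observed with an off-the-shelf RPC library. This is a hedged sketch using Python's `xmlrpc.server` and `xmlrpc.client` modules, which play the roles of the stubs and the RPC runtime; it is one RPC implementation among many, not the specific system described in the course book. The function `process_order` and its reply text are invented for the example, and client and server share one process only to keep the sketch self-contained.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def process_order(item: str, quantity: int) -> str:
    """The server function that is exposed to clients (hypothetical)."""
    return f"accepted {quantity} x {item}"


# Step 1: the server starts and registers a port (0 lets the OS choose one).
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(process_order)
host, port = server.server_address
threading.Thread(target=server.serve_forever, daemon=True).start()

# Steps 2-9: the proxy acts as the client stub. The call below is marshaled,
# sent over the network, executed on the server, and the reply is unmarshaled.
with ServerProxy(f"http://{host}:{port}/") as proxy:
    result = proxy.process_order("paper", 3)

server.shutdown()
server.server_close()
```

From the client programmer's point of view, `proxy.process_order(...)` looks like a local function call. The call is also synchronous: the client blocks until the server has answered.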
Conclusion
 RPC has allowed, for the first time, real distributed computing (integration)
by allowing applications to share functionality.
 In Java, it has been implemented through the RMI API*.
 The method has also introduced several integration components used in
today’s integration solutions (such as the “function interface”).
 However, RPC has a number of shortcomings:
o RPC is not language independent, i.e. client and server must use the same language.
o Client-server calls are synchronous.
o The method does not scale to a large number of remote calls because of their
synchronous nature (blocking of client and server).
o The integration is “point-to-point”, i.e. not efficient for a constellation of
multiple clients and servers.
*Compared to a function, an object is more complex: it encapsulates both
attributes (data) and behavior (functions).
Distributed Objects (DO)
 Based on the RPC solution, the method was proposed to solve the major
shortcomings of RPC, by:
o Enabling loose coupling between client and server, by separating the code for
marshaling and network communication from the application into a standalone
software component (“middleware”).
o Enabling language (Java, C++) and platform (Windows, Unix) independence*.
o Enabling multipoint connection.

 The distributed objects method extends the concept of objects
introduced in Object-Oriented Programming (OOP), by enabling more
robust and scalable programming. With OOP alone, only objects belonging
to the same application, and written in the same language, could interact
with each other at runtime.
Distributed Objects (DO)
 Three different proposals of DO are available:
o Common Object Request Broker Architecture (CORBA) – any language (L), any platform (P)
o Microsoft Distributed Component Object Model (DCOM) – single P, any L
o Enterprise Java Beans (originating from Java RMI) – single L, any P

 Platform independence means that code written in some language
can run on multiple platforms with no modification.

 Language independence means that the (compiled) functional
interface of a piece of code can be invoked, without modification,
through arbitrary language bindings.
Distributed Objects (DO)
 The most important component of DO solutions is the Object Request
Broker (ORB).
 An ORB is middleware (software enabling multipoint connections
among different applications). As a component separated from the
applications, the ORB encapsulates the marshaling tasks and the
network calls.
o Conceptualized in this way, an ORB can be reused by several applications,
o Decoupling is increased (i.e. beyond point-to-point, a first step toward ESB).

[Figure: Applications A and B on Machine 1 share one ORB; Applications C and D
on Machine 2 share another ORB; the two ORBs communicate over the network.]
DO Process
1. When a client (i.e. object) wants to use the functions and the data of
another object (on the server), it first obtains a reference (address) to
the server object through the ORB. The server itself needs to register its
objects through its ORB, using a well-defined object interface.

2. On the client side, the ORB accepts the parameters of the object function
being invoked and marshals them onto the network. The ORB on the
server unmarshals the parameters and passes them to the server object.

3. The server object returns the result through the same path.
DO Process (Informative)
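[Figure in three steps, each showing a Client and a Server with an ORB on each side]
Step 1: the Client requests an object reference through its ORB; the ORB locates the Server.
Step 2: the Client’s ORB marshals the parameters; the Server’s ORB unmarshals the parameters.
Step 3: the Server’s ORB marshals the return; the Client’s ORB unmarshals the return.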
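The three steps can be imitated with a toy broker. The sketch below is not CORBA, DCOM or EJB: it is a single-process stand-in, assuming Python and JSON as the marshaling format, and it collapses the client-side and server-side ORBs into one `ToyBroker` object with no real network in between. All class and method names are hypothetical.

```python
import json


class ToyBroker:
    """Toy stand-in for an ORB: a name registry plus marshaling."""

    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        """Server side: register an object under a name."""
        self._objects[name] = obj

    def resolve(self, name):
        """Step 1: the client obtains a reference to the server object."""
        if name not in self._objects:
            raise LookupError(f"no object registered as {name!r}")
        return Proxy(self, name)

    def dispatch(self, request):
        """Server side of steps 2 and 3: unmarshal, invoke, marshal the return."""
        message = json.loads(request)
        target = self._objects[message["object"]]
        result = getattr(target, message["method"])(*message["args"])
        return json.dumps({"result": result})


class Proxy:
    """Client-side reference: marshals each call and unmarshals the return."""

    def __init__(self, broker, name):
        self._broker = broker
        self._name = name

    def __getattr__(self, method):
        def call(*args):
            request = json.dumps(
                {"object": self._name, "method": method, "args": list(args)}
            )
            return json.loads(self._broker.dispatch(request))["result"]
        return call


class Inventory:
    """A server object with data (stock) and behavior (reserve)."""

    def __init__(self):
        self._stock = {"paper": 10}

    def reserve(self, item, quantity):
        self._stock[item] -= quantity
        return self._stock[item]


broker = ToyBroker()
broker.register("inventory", Inventory())
inventory = broker.resolve("inventory")    # step 1
remaining = inventory.reserve("paper", 3)  # steps 2 and 3
```

Neither `Inventory` nor the calling code contains any marshaling or transport logic; that all sits in the broker and the proxy, which is the separation the ORB concept is about.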
ORB – Additional Services
 Naming services – they allow DO to register and be located by name.

 Security services – they provide authentication, authorization, auditing
and non-repudiation features to DO.

 Concurrency control services – they enable concurrent use of DO,
through R/W locks.

 Transaction services – they enable several tasks (i.e. calls) to be
executed atomically (i.e. “all or nothing”).

 Life cycle services – they are responsible for creating and deleting DO.
Application Servers (Informative)
 Application servers are the tools (products) that realize the concept of
distributed objects, i.e. ORBs.
 A number of such products are available, such as:
o Microsoft COM+ (Visual Studio)
o IBM WebSphere
o BEA WebLogic Server
o JBoss, etc.

 The backbone for these servers is CORBA, DCOM, or Enterprise Java
Beans (EJB). All three use the ORB concept.
 A conclusion: the distributed object approach to integration has been a
big step forward. Compared to RPC, it has enabled more reusable, less
coupled, language- and platform-independent communication between remote
applications.
Questions?
To Answer

 What is coupling in the SI context about?
 How tight is coupling in Data Sharing solutions?
 How does the RPC integration approach work?
 How does the DO integration approach work?
 What are the benefits / drawbacks of the integration approaches?
Tasks for the next session…
