Eight Styles of Data Integration
A White Paper
Executive Summary
Most people assume that the starting point for any data integration or business intelligence (BI)
project is a data warehouse. While data warehouses are important for many types of initiatives, they
aren’t always necessary. Building a data warehouse can dramatically increase the cost of a project,
while reducing the value, relevance, and timeliness of enterprise information. Many projects can
benefit just as much – or perhaps even more so – from alternate data integration scenarios.
Data warehouses themselves are not the issue. Problems arise when a data warehouse is viewed
as the only solution to a data integration or BI deployment, or there is an expectation that simply
building a data warehouse will address a specific information need. Data warehouses should not
be implemented without a clear understanding of the business challenges they will solve.
If you are a project, business, or IT manager with responsibility for data integration or BI activities,
you need to carefully research potential data access architectures and understand the various
techniques for accessing data, so you can devise the best method for leveraging your data within
the scope of your project.
In this paper, we’ll highlight eight proven integration strategies, and use real-world examples to
demonstrate the high-value, high-return data access options that are available to you. You’ll learn
about data warehouses, as well as other methods for making relevant, timely information available
to your business users, systems, and processes. These eight ways to integrate and access data – all
supported by Information Builders’ market-leading intelligence, integration, and integrity solutions
– can be applied to effectively solve various business problems:
3 Operational data access, for a real-time view of business activity from operational data
and applications
6 Search technology, to rapidly scan indexed content, creating Google-style results from data
sources throughout the enterprise
7 Web services and native APIs, which can expose or extract data from multiple sources,
irrespective of underlying operating systems, applications, or databases
8 Cloud data, to optimize the way cloud-based information is accessed and leveraged
There are many valid reasons for building a data warehouse, including:
■■ To reduce overhead on a transaction processing system or production application by staging
data to a centralized database
■■ To reduce the complexity of the data and put it in a form that is suitable for reporting or
other purposes
■■ To maintain and retrieve historical data that is no longer accessible in operational applications
For example, Moneris Solutions, Canada’s leading payment processor, created a data warehouse
that allows merchants to view daily and historical sales information. Developers used Information
Moneris maintains three months’ worth of daily transactions and 24 months of summary data in its
data warehouse, which is known as Merchant Direct and is dimensionally modeled to accelerate
reporting and analysis. The company downloads about five million rows of new transactional
information into the warehouse each day to support a merchant base of more than 300,000
customers. It’s a massive data access, summarization, and delivery exercise, and a data warehouse
is an ideal way to supply the information that customers need. These merchants use Information
Builders’ WebFOCUS BI and analytics platform to run parameterized reports such as the daily
authorization logs, monthly merchant statements, and daily corporate summaries, and to create
reports about individual stores and roll up summary information to reflect larger operations.
Some companies try to solve this problem by migrating customer data from multiple systems
into a central data warehouse, which customer support reps can query for insight about customer
activities. But keeping the information current is a challenge. You might call to ask a question
about your cell phone plan, and a few minutes later send a message requesting information
about a new feature the rep just told you about. How long will it take the company to update your
customer records in the data warehouse so all the reps can see?
A North American telecommunications company faced a similar challenge. The firm created a data
warehouse that accumulates data from five different operational sources each night. Data was
moved into the warehouse in batch mode at the end of each day. This architecture was adequate
for most customer inquiries, except for cases where a customer issue involved several different
calls or e-mail messages during the course of a day. These inquiries sometimes entailed accessing
data from several different operational systems. To get the information, customer support reps
had to transfer phone calls to other reps, delaying call resolutions and increasing support costs.
To resolve the situation, this telecommunications company used iWay integration technology to
trickle-feed the data warehouse – meaning new records are added right away. Today, as soon as
new data is entered into any one of these five operational systems, it is extracted, transformed,
and loaded into a real-time repository that includes information about customer accounts,
invoices, service orders, products, support histories, and much more. Call center representatives
always have up-to-the-minute information about customer accounts and inquiries, and customers
no longer get passed from one division to another.
In this scenario, the data warehouse is updated simultaneously with operational systems, one
record at a time.
Is this a complicated architecture? Not if you have the right integration tools. iWay listens for
transactions as they occur within each of the operational systems, then makes corresponding
updates to the real-time data warehouse, transposing information into a common format along
the way. As a result, updates to any of the operational systems are reflected in the data warehouse
within five minutes of any customer interaction, regardless of which venue the customer uses
to contact the company. The telecommunications company also used WebFOCUS to create a
business intelligence portal for displaying the data – a real-time window that enables reps to stay
up-to-date on the history of each account.
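
The trickle-feed pattern described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not iWay's implementation: the source names (`billing`, `orders`), the field names, and the `on_transaction` handler are hypothetical, and an in-memory SQLite table stands in for the real-time warehouse.

```python
import sqlite3

# In-memory SQLite database standing in for the real-time warehouse (assumption).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customer_activity ("
    "source TEXT, customer_id TEXT, event TEXT, "
    "PRIMARY KEY (source, customer_id, event))"
)

# Hypothetical per-source mappings from native field names to the common format.
FIELD_MAPS = {
    "billing": {"customer_id": "acct_no", "event": "invoice_status"},
    "orders": {"customer_id": "cust", "event": "order_state"},
}


def to_common_format(source, record):
    """Transpose one source-specific record into the warehouse's common format."""
    mapping = FIELD_MAPS[source]
    return (
        source,
        str(record[mapping["customer_id"]]),
        str(record[mapping["event"]]),
    )


def on_transaction(source, record):
    """Called once per operational transaction; loads a single row immediately."""
    row = to_common_format(source, record)
    with warehouse:  # commits the single-row load
        warehouse.execute(
            "INSERT OR REPLACE INTO customer_activity VALUES (?, ?, ?)", row
        )


# Simulated events arriving from two operational systems.
on_transaction("billing", {"acct_no": 1001, "invoice_status": "paid"})
on_transaction("orders", {"cust": "1001", "order_state": "shipped"})

rows = warehouse.execute(
    "SELECT source, event FROM customer_activity "
    "WHERE customer_id = '1001' ORDER BY source"
).fetchall()
print(rows)
```

The point of the sketch is the shape of the flow: each event is transformed and loaded on its own, so a query against the warehouse sees both systems' activity for the same customer without waiting for a nightly batch.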
Integration technology is important to both operational and analytic BI systems, but in different
ways. Analytical BI applications rely on extract, transform, and load (ETL) tools to keep a data
warehouse current, perhaps once a day or once a week. Operational BI applications generally get
their information from an automated workflow process or directly from production systems. There
is less latency between when an event occurs and when the BI system is aware of that event,
putting business users in touch with timelier information.
For example, RBC Royal Bank provides real-time loan status information to its asset-based lending
(ABL) customers. Asset-based lending is a flexible way of providing fast-growing or highly
leveraged companies with working capital. The lending institution approves revolving lines of
credit secured by accounts receivable and inventory. The major difference between asset-based
lending and traditional commercial lending is control; lenders must continually assess the makeup
and status of each borrower’s collateral. This enables them to maximize the borrower’s margin
availability based on the underlying value of its current assets.
To make its ABL calculations, RBC considers more than a million invoices each month, along
with lengthy inventory reports. They use iWay solutions to translate this steady stream of data
into meaningful information that can be directly input to the ABL reporting system. This gives
customers a real-time view of the status of their loans – up to the millisecond. If the operations
group updates the data, it is posted immediately, so the customer always obtains the latest
information.
Thanks to this real-time reporting architecture, RBC’s asset-based lending clients can view their
borrowing base position, outstanding loan balances, collateral composition, and listings of
ineligible accounts through a secured and encrypted website.
4 Data Virtualization
When an operational BI application accesses multiple sources of information, it is typically referred
to as data virtualization or enterprise information integration (EII). This architecture enables BI
systems to look across multiple business applications and accept events from multiple sources,
such as those supporting customer relationships, the supply chain, and sales transactions. These
federated queries can propagate information from any source – real-time ERP transactions,
warehoused data, and business-to-business systems – and deliver it to line managers, executives,
or automated business processes.
Data virtualization refers to the real-time aggregation of corporate data across multiple data
sources. It presents distributed data as if it exists in a single location. This distinguishes it from
other types of data access technologies, since data is not permanently moved or replicated into a
new location or database. The source data remains intact.
Data virtualization combines data from several sources, which can include
operational systems and data warehouses.
A major Canadian airline used this architecture to create a BI application that helps maintenance
workers identify deviations, such as aircraft maintenance issues, and track the parts needed for repairs.
Previously, even trivial maintenance issues such as a faulty seatback table or a torn seat cushion
prevented the airline from selling those seats on its flights, reducing revenue and profitability.
However, maintenance workers weren’t always notified in time, since the information the airline
needed to expedite these repairs was distributed across three different applications. The airline
needed real-time information to service these planes between flights.
At first glance, it might seem that integrating data from the three different applications into a
staged data warehouse would satisfy the requirements. After carefully analyzing the requirements,
the airline realized it could generate a federated query to access data from all of these sources
simultaneously. They didn’t need to build a warehouse to maintain this information.
Developers used WebFOCUS to build a report that combines data from three operational sources:
■■ The primary maintenance system, which holds information about seat and other problems on
the plane
■■ The parts inventory system, which holds information about the location of the necessary
replacement parts
■■ The plane routing system, which holds scheduling information
This single report informs maintenance workers about which planes need which parts in which
locations, enabling them to fix each problem as soon as possible.
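
A federated query of this kind can be sketched as a join performed at request time. The Python below is an illustration only, not the airline's WebFOCUS report: the three dictionaries are made-up stand-ins for the maintenance, parts inventory, and routing systems, and every field name is an assumption.

```python
# Three stand-ins for the operational systems; in practice each would be
# a live query against its own application (all data here is made up).
maintenance = [
    {"tail": "C-FAAA", "issue": "torn seat cushion", "part": "CUSHION-12"},
    {"tail": "C-FBBB", "issue": "faulty seatback table", "part": "TABLE-7"},
]
parts_inventory = {"CUSHION-12": "YYZ", "TABLE-7": "YVR"}  # part -> location
routing = {"C-FAAA": "YYZ", "C-FBBB": "YUL"}               # plane -> next stop


def federated_report():
    """Join the three sources at query time; nothing is copied to a warehouse."""
    report = []
    for item in maintenance:
        part_at = parts_inventory.get(item["part"])
        next_stop = routing.get(item["tail"])
        report.append({
            "tail": item["tail"],
            "issue": item["issue"],
            "part": item["part"],
            "part_location": part_at,
            "next_stop": next_stop,
            # The repair can happen at the next stop only if the part is there.
            "part_on_site": part_at is not None and part_at == next_stop,
        })
    return report


report = federated_report()
for line in report:
    print(line)
```

Because the sources are read when the report runs, the result reflects their current state, which is the property the airline needed for repairs between flights.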
Based on this report, the airline can attend to maintenance problems in a timely fashion, increasing
seat sales and improving profitability. Maintenance personnel use WebFOCUS to list all the deviations
requiring attention. They can generate standard or parameterized reports that list the type,
location, and destination of each affected plane, along with a catalog of available parts. This
federated system not only makes it easy to identify the required parts, it has become an important
performance management tool for monitoring the activities of each maintenance crew, such as
their success identifying, classifying, and closing deviations.
5 Process Integration
While users querying a database or running a report typically initiate analytical BI systems, the
business process itself triggers process-driven systems. For example, when an order entry system
receives an order, or a manufacturing process updates a bill of materials, these events might notify
other applications within the enterprise. In some cases, users are asked to supply input, perhaps to
correlate events with data obtained from other parts of a business process. In other cases, there is
no user input involved.
iWay is a key technology in these scenarios, because it enables applications to listen for events,
detect them, propagate them, and determine which actions to take according to conditions that
have been determined in advance. Setting up triggers and alerts enables a process to interface
with transaction systems and be triggered by events occurring in those systems. You might set
up a trigger to send a message when conditions reach a predefined threshold, such as when
inventory falls below a certain level or new sales figures are available.
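
A threshold trigger of the kind just described might look like the following Python sketch. The `ThresholdTrigger` class and the `notify` callback are hypothetical names chosen for illustration; they are not part of iWay or WebFOCUS, and a plain list stands in for the messaging system.

```python
class ThresholdTrigger:
    """Fires a callback when a monitored value drops below a preset level."""

    def __init__(self, threshold, action):
        self.threshold = threshold
        self.action = action

    def on_event(self, item, quantity):
        # The condition is fixed in advance; the event only supplies the data.
        if quantity < self.threshold:
            self.action(item, quantity)
            return True
        return False


sent = []  # collects outgoing messages in place of a real messaging system


def notify(item, quantity):
    sent.append(f"Reorder {item}: only {quantity} left")


trigger = ThresholdTrigger(threshold=20, action=notify)
trigger.on_event("widget", 75)  # above the threshold, no message
trigger.on_event("widget", 12)  # below the threshold, message sent
print(sent)
```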
In all three cases, the application acquires data before it ever gets loaded into a database.
For example, let’s say a customer orders 50 widgets through your online store. A BI application
might send a real-time alert to verify that there is enough stock on hand to fill that order. A process-
driven BI application not only checks the inventory, but also makes a decision to replenish it by
sending a message to the supplier. Transaction integration is similar, but in this case a database
transaction triggers the event. In other words, simply committing the order to the database triggers
an alert to verify the stock on hand, along with a message to the supplier to replenish the inventory.
All three scenarios are closely related, since they involve delivering real-time information based
on a business event or as part of a business process. Messages are generated, monitored, and
interpreted so that applications can take the necessary actions.
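
The transaction-integration case, in which committing the order is itself the event, can be sketched with an ordinary database trigger. The example below uses Python's built-in `sqlite3` module; the table names, the `outbox` table, and the reorder rule are assumptions made for illustration rather than a description of any Information Builders product.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE inventory (item TEXT PRIMARY KEY, on_hand INTEGER, reorder_level INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, quantity INTEGER);
CREATE TABLE outbox (recipient TEXT, message TEXT);

CREATE TRIGGER order_placed AFTER INSERT ON orders
BEGIN
    UPDATE inventory SET on_hand = on_hand - NEW.quantity WHERE item = NEW.item;
    INSERT INTO outbox
        SELECT 'supplier', 'replenish ' || item
        FROM inventory
        WHERE item = NEW.item AND on_hand < reorder_level;
END;

INSERT INTO inventory VALUES ('widget', 60, 25);
""")

# Inserting the order fires the trigger: stock is adjusted and, because the
# new level is below the reorder level, a supplier message is queued.
with db:
    db.execute("INSERT INTO orders (item, quantity) VALUES ('widget', 50)")

on_hand = db.execute(
    "SELECT on_hand FROM inventory WHERE item = 'widget'"
).fetchone()[0]
outbox = db.execute("SELECT recipient, message FROM outbox").fetchall()
print(on_hand, outbox)
```

No application code checks the stock here; the order of 50 widgets alone leaves 10 on hand and queues the replenishment message.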
Sometimes, this type of integration scenario is referred to as business activity monitoring (BAM).
But whatever term is used, it involves monitoring events related to business processes like
EDI transactions, message bus activity, FTP activity, e-mail activity, database transactions, and
application updates. Very few business intelligence products can monitor and interpret these real-
time events. WebFOCUS is an exception, thanks in part to its close relationship with iWay.
Consider IPC, the largest group purchasing organization for independent pharmacies in the
United States. In 2006, when the U.S. Food and Drug Administration (FDA) announced that
drug wholesalers must track the pedigree of all pharmaceutical products dispensed in retail
pharmacies, IPC had to set up a real-time environment to track each bottle of drugs that passes
through its warehouses.
IPC used iWay to track the information and update the associated information systems. Now, when
pharmaceutical wholesalers send product data to IPC, the drug-tracking system automatically
ties it to a purchase order. iWay matches the purchase order with a shipping notice, then sends
back an e-mail confirmation of the transaction. After the wholesaler verifies that the correct
manufacturer is listed, iWay updates the pharmaceutical database as well. Thanks to this
automated business flow, IPC knows the exact pedigrees of the drugs that it has purchased before
the products even arrive at the shipping dock. In turn, they are able to provide this information to
their individual pharmacies, fulfilling the FDA requirement.
IPC plans to use WebFOCUS to expand its operational reporting capabilities, creating reports to
drill down into pharmaceutical deliveries by region, as well as to provide inventory summaries to
individual pharmacies. Now that iWay is monitoring its business processes, current order activity,
shipping status, and inventory levels are always listed in these reports.
Many state and local governments also rely on process integration to facilitate collaboration
among agencies. For example, the New York City Department of Health (DoH) has developed a
first response system to help hospitals, emergency workers, and the Centers for Disease Control
and Prevention to proactively monitor the outbreak of diseases. Nearly three dozen New York
hospitals routinely feed patient data to the DoH, which uses iWay and WebFOCUS to combine
and analyze the information. There’s no time to put the data into a traditional data warehouse,
let alone expect healthcare workers to go looking for it. These are real-time problems and
they demand a real-time solution. The same data-sharing partnership applies to the city’s
911 emergency system. As information comes in from both sources, the DoH uses business
intelligence tools to spot trends that indicate a disease cluster in specific neighborhoods, then
immediately sends messages to the appropriate authorities.
6 Search Technology
Everybody is familiar with the convenience and far-reaching capabilities of search technology.
But few companies have learned how powerful this technology can be in the context of BI
applications. The problem is that search engines are designed to index and track web pages, not
necessarily database transactions. Enter the iWay Enterprise Index, which taps into these streams
of information, transforms them into a usable format, and prepares them to be searched by end
users. This unleashes information that was previously locked up in proprietary information systems
– no data warehouse required.
An enriched version of each transaction is sent to a search engine in HTML format, in concert with
the operational system. Subsequent searches link transactions to reports that will further reveal
necessary information.
Why is this unique? The real breakthrough is in its scope. Search technology allows users to find
data across disparate applications and databases, even when they don’t know what they are
looking for. The iWay Enterprise Index can turn database transactions into web pages, then feed
those pages to a search engine. Subsequent searches will not only return the usual web page
findings, but also uncover information stored in transactional web pages, which iWay creates
on the fly. These special web pages contain links to the original database sources, as well as to
relevant reports, revealing new insight into the items you are searching for.
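
The idea of turning transactions into searchable pages can be illustrated with a small inverted index. This Python sketch is not the iWay Enterprise Index; the sample transactions, the URL scheme in `to_page`, and the prefix-matching rule in `search` are all assumptions chosen to keep the example short.

```python
import html
import re
from collections import defaultdict

# Made-up transactions from three hypothetical source systems.
transactions = [
    {"source": "billing", "id": 17, "text": "Invoice 8841 overdue for customer C1001"},
    {"source": "shipping", "id": 204, "text": "Package delayed for customer C1001"},
    {"source": "orders", "id": 311, "text": "New order from customer C2002"},
]


def to_page(txn):
    """Render one transaction as a small HTML page linking back to its source."""
    link = f"/{txn['source']}/{txn['id']}"  # hypothetical URL scheme
    body = html.escape(txn["text"])
    return f'<html><body><p>{body}</p><a href="{link}">source record</a></body></html>'


def build_index(txns):
    """Build an inverted index: lowercase token -> set of (source, id) keys."""
    index = defaultdict(set)
    for txn in txns:
        for token in re.findall(r"\w+", txn["text"].lower()):
            index[token].add((txn["source"], txn["id"]))
    return index


def search(index, term):
    """Return records containing any token that starts with the search term."""
    term = term.lower()
    hits = set()
    for token, keys in index.items():
        if token.startswith(term):
            hits |= keys
    return sorted(hits)


index = build_index(transactions)
pages = {(t["source"], t["id"]): to_page(t) for t in transactions}
results = search(index, "C1001")
print(results)
```

Each hit identifies the source system and record, so the generated page can link the user back to the original transaction without the data ever being joined in a warehouse.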
For example, since September 11, 2001, law enforcement officials have realized the importance of
sharing information across local, state, and federal databases. They have made great strides with
BI applications that can combine and access data from many different places. However, like most
BI solutions, these applications assume you know what you are looking for before you generate a
report or submit a query. Unfortunately, that’s not always how law enforcement personnel operate.
With WebFOCUS Magnify, a simple search for a license plate number could uncover transactions
across multiple data sources and law enforcement organizations. Magnify indexes transactions
across multiple data sources, then allows you to reach back into those data sources to find related
information – without having to create a data warehouse or join databases. A simple search on
indexed web pages reveals database records that help the user qualify the information. The search
results might represent three or four different databases along with references to transactions,
such as records of moving violations. It’s easy to tie those results back to a WebFOCUS report that
presents a history of the registered owner of the vehicle.
This progression is illustrated in the following screenshots. In Figure 1, a law enforcement officer
enters a partial license plate number (YOR) in the Magnify search box at the top of the page. This
returns a list of database transactions that include this string of plate numbers (on the right) and a
list of the associated data sources (on the left).
Figure 1.
In Figure 2, the officer has clicked on the data source to reveal the database records from which
these search results were derived. The officer found eight records in the Incident and Criminal
database, and five records in the Vehicle Registration database, along with an Officer Activity
Report about a particular offense.
In Figure 3, the officer clicks on Arrest History to narrow the search further. This brings up the
complete record of the incident in question.
Figure 3.
Magnify takes search technology to the next level. You begin with freeform, Google-style
searches, then reach into the associated transactions and databases to find additional information.
Customer support reps could use this same type of technology to investigate a problem. The
billing system, marketing system, shipping system, and order entry system might all reference the
same customer. Simply entering a customer number or phone number could turn up records of
customer activity, so the rep can find additional information related to the customer’s problem.
Other companies need to combine their own internal data with external information. For example,
they may want to compare customer information with external demographic information and
plot it on a map. With the iWay Web Services Adapter, you can join data from an internal system
with demographic data from the Internet, perhaps using a zip code column as a point of similarity.
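
Joining on a zip code column can be sketched as follows. The Python below is an illustration under stated assumptions: the customer rows and the demographic values are made up, the `demographics` dictionary stands in for the response of an external web service, and `join_on_zip` is a hypothetical helper rather than part of the iWay Web Services Adapter.

```python
# Internal customer records (made-up data).
customers = [
    {"name": "Acme Hardware", "zip": "10121", "annual_sales": 120000},
    {"name": "Bolt Supply", "zip": "55401", "annual_sales": 80000},
    {"name": "Corner Goods", "zip": "99999", "annual_sales": 15000},
]

# Demographics as they might be returned by an external web service,
# keyed by zip code (values are illustrative, not real statistics).
demographics = {
    "10121": {"median_income": 90000, "lat": 40.75, "lon": -73.99},
    "55401": {"median_income": 70000, "lat": 44.98, "lon": -93.27},
}


def join_on_zip(internal_rows, external_by_zip):
    """Left join: keep every customer, adding demographics where the zip matches."""
    joined = []
    for row in internal_rows:
        extra = external_by_zip.get(row["zip"], {})
        joined.append({**row, **extra})
    return joined


enriched = join_on_zip(customers, demographics)
mappable = [r["name"] for r in enriched if "lat" in r]
print(mappable)
```

A left join is used so that customers without matching demographic data still appear in the result; only the rows that gained coordinates can be plotted on a map.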
In other situations, developers create web services to extract a subset of information from an
internal database or application, enabling multiple departments to access their own slices of the
data. For example, the marketing department might need to tap into certain parts of a sales or
finance system. A web service can reveal just the pertinent data.
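
As a rough sketch of a service that reveals only the pertinent data, the Python function below returns a department-specific projection of a sales record as JSON. The field names, the `DEPARTMENT_VIEWS` mapping, and the `sales_service` function are hypothetical; a real deployment would sit behind an HTTP endpoint with authentication, which is omitted here.

```python
import json

# A single row from a hypothetical sales system.
SALES_ROWS = [
    {"order_id": 1, "region": "East", "product": "widget",
     "amount": 500.0, "cost": 320.0, "sales_rep": "pat"},
]

# Which fields each department's service is allowed to expose (assumption).
DEPARTMENT_VIEWS = {
    "marketing": ["region", "product", "amount"],
    "finance": ["order_id", "amount", "cost"],
}


def sales_service(department):
    """Return a JSON payload containing only the department's slice of the data."""
    fields = DEPARTMENT_VIEWS[department]
    rows = [{f: row[f] for f in fields} for row in SALES_ROWS]
    return json.dumps(rows)


payload = sales_service("marketing")
print(payload)
```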
This flexibility is especially important in today’s highly distributed manufacturing world, where
a company might want to source production and assembly tasks to multiple partners, plants, or
contract manufacturers. Consider Guardian Industries Corp., a leading manufacturer of float glass and
fabricated glass products. Guardian and its affiliates manufacture glass in 24 plants in 14 countries.
The company selected WebFOCUS because it can work with a service-oriented architecture (SOA)
and because it preserves the hierarchy of information represented in reporting objects.
Guardian publishes web services from its enterprise resource planning (ERP) applications and
passes the information to WebFOCUS. WebFOCUS consumes the services and generates pertinent
reports for each department and contractor. To integrate its information systems, Guardian uses
WebFOCUS ReportCaster, via its API, to automatically generate and print reports using the ERP
system. This enables the ERP system to create and maintain documents as discrete objects that are
accessible to many types of applications. Guardian can therefore securely exchange information,
both inside and outside the company, without the overhead of developing and maintaining a
data warehouse.
8 Cloud Data
Data integration within the cloud comes in many forms. Two of the more common scenarios
include:
■■ Leveraging cloud sources
■■ Storing data in the cloud
Cloud sources can also come in the form of online applications, such as Salesforce.com, which
store operational data for an organization.
When integrating with the cloud, multiple data integration patterns may be used. When used
within an appropriate framework, such as the iWay Integration Suite, cloud sources appear no
differently than any other on-premise database or application. This flexibility allows the use of
different integration patterns which leverage cloud, on premise, or both simultaneously.
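
One way to picture cloud sources appearing no different from on-premise ones is a shared interface that every source implements. The Python sketch below is an assumption-laden illustration, not the iWay Integration Suite's design: `DataSource`, `OnPremiseDatabase`, and `CloudApplication` are hypothetical classes, and the cloud adapter returns canned data where a real one would call the application's API.

```python
from typing import Iterable, Protocol


class DataSource(Protocol):
    """Common interface; callers do not need to know where the data lives."""

    def fetch(self, entity: str) -> Iterable[dict]: ...


class OnPremiseDatabase:
    """Stand-in for a database inside the firewall."""

    def __init__(self, tables):
        self._tables = tables

    def fetch(self, entity):
        return list(self._tables.get(entity, []))


class CloudApplication:
    """Stand-in for a cloud application; a real adapter would call its API."""

    def __init__(self, objects):
        self._objects = objects

    def fetch(self, entity):
        return list(self._objects.get(entity, []))


def all_accounts(sources):
    """Combine the same entity from every source, cloud or on premise."""
    rows = []
    for source in sources:
        rows.extend(source.fetch("accounts"))
    return rows


sources = [
    OnPremiseDatabase({"accounts": [{"id": 1, "name": "Acme"}]}),
    CloudApplication({"accounts": [{"id": 2, "name": "Bolt"}]}),
]
accounts = all_accounts(sources)
print([a["name"] for a in accounts])
```

Because `all_accounts` depends only on the `fetch` method, the same integration logic works against cloud sources, on-premise sources, or both at once.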
[Figure labels: Enterprise, Custom, Legacy Database, Apps]
There are many advantages to storing data in the cloud. It can be faster and more affordable, because
it eliminates the need to purchase, deploy, and maintain hardware. This potential for lower TCO,
combined with increased agility, makes cloud storage a very attractive option for many organizations.
Conclusion
Organizations sometimes create data warehouses for reasons that are not entirely valid. While some
integration and BI projects may call for the deployment of a data warehouse, there are many scenarios
where other data access methods may be more appropriate. As the examples in this paper illustrate,
a variety of options exist for accessing and consolidating enterprise data. Which one will be
most effective will depend on the nature and scope of your initiative.
We suggest that you analyze each business challenge to understand whether a data warehouse
or other type of information-access technique will present the best solution for your needs.
Always try to identify the best method at the outset of the project, and don’t assume that a data
warehouse is the correct solution before assessing all the options.
Corporate Headquarters Two Penn Plaza, New York, NY 10121-2898 (212) 736-4433 Fax (212) 967-6406 DN7507777.0714
Connect With Us informationbuilders.com
Copyright © 2014 by Information Builders. All rights reserved. All products and product names mentioned in this publication are
trademarks or registered trademarks of their respective companies.