
General Enterprise Data Flow

The document discusses the different data sources in an enterprise and how data flows between them. There are three main categories of data sources: the ERP system, which is the central system but has limitations; other databases like legacy systems or remote databases; and flat files from remote locations. An enterprise data warehouse is often used to consolidate data from these various sources into a single version of the truth to power analytics and reporting without overloading the ERP system. Building a data warehouse is a complex project that involves determining reporting needs, technical requirements, implementation, and testing.

Uploaded by

Angela Magtibay

General Enterprise Data Flow

Chapter 2
Business Analytics implementation in an enterprise is not a venture that can
be done alone. It is a collaborative effort in which multiple teams from
multiple departments communicate constantly to work out the information the
main stakeholders need. Determining that information is the starting point of
an implementation, because it dictates which data will actually be useful for
the desired analysis.
Data Sources
An enterprise’s data needs grow as the business scales up. As a result, the
machines (servers and clients) bought to address data needs a few years ago
might no longer be enough today. Yet they are so system-critical that taking
them offline even briefly could leave business users unable to transact, which
in turn makes the data reported to upper management inaccurate or unavailable.
There are three main categories of data sources in an enterprise: the ERP
System, other databases, and flat files.
ERP System
In an ideal world, ALL of an enterprise’s data is fed into its ERP (Enterprise
Resource Planning) System and all reports are obtained directly from it. The
real world makes this difficult because, in the end, ERP Systems are still just
machines with finite capacity and processing power. An ERP System might also
stop meeting an enterprise’s needs as the company grows larger. The company
will then have to procure a new ERP System or upgrade its current one, which
requires a significant investment.
An ERP System makes extensive use of Master Data to keep track of Business
Partners and Items. Maintenance of Master Data is usually assigned to key
people, who manage the creation of new records and the updating of existing
ones. Lastly, when new equipment is bought or an existing ERP System is
upgraded, the company might need to schedule some downtime to implement the
change. The ERP System is unavailable during that time, so the downtime must be
scheduled ahead of time and the concerned parties informed so they can work
around it. Adding system memory, for example, requires the system to be shut
down before new memory modules can be installed.
3 Real-Life ERP System Examples
1. Amazon
• Amazon uses an ERP software called Systems Analysis and Program
  Development (SAP).
• SAP was created in Germany in 1972 by five former IBM employees who
  envisioned a software integration of all business and data processing in
  real time.
• By 1975, the small company had built applications for:
  • Financial accounting
  • Invoice verification
  • Inventory management
• Now SAP business customers can manage the following through just one
  database:
  • Finances
  • Logistical business needs
  • Human resources
  • Order management
  • Sales
  • ...and more
2. Starbucks
• Starbucks uses Oracle ERP, a cloud-based software solution used to automate
  back-office processes and day-to-day business activities. It is a business
  management software suite that includes financial management, supply chain
  management, project management, accounting, and procurement.
• Oracle E-Business Suite provides users with applications for customer
  relationship management (CRM), enterprise resource planning (ERP), and
  supply chain management (SCM) processes.
• The Oracle ERP above shows revenue analyses and includes information you
  need to know at a glance, including:
  • Revenue
  • Expenses
  • Sales data
  • Inventory management
  • Operations updates
3. Toyota
• Toyota Industries Corporation is the company from which the Toyota Group
  originated. It wanted to expand its reach globally and offer high-quality
  services, with goals such as improved operational management accuracy, a
  paperless system, reduced work hours, and an increase in overall efficiency.
• So, Toyota chose Microsoft Dynamics 365 for the job. Dynamics helps manage
  the after-sales service skills and operations of distributors that service
  its products for customers all over the world.
• Here’s an example of a Dynamics Summary page. This section allows the
  company to view budget information, opportunities for sales, and timelines.
Other Databases
Sometimes, due to geographical or cost constraints, it might be physically
impossible to connect a branch of the company to the corporate network. This
means the branch can’t use the ERP System without resorting to workarounds. One
such workaround is to maintain a separate database that records all
transactions for the day. At the end of the day, the collected data is uploaded
from that database to the ERP System. In other instances, databases might be
part of a legacy system that is still being used. The legacy system might be
integrated into a Business Process that is system-critical, and current
Cost/Time/Technical constraints mean that it can’t be assimilated into the ERP
System just yet.
In order to decommission these systems, the business process and the data they
produce must be integrated into the ERP. If this is impossible, then an
Enterprise Data Warehouse will be required to consolidate their data. This will
require additional cost in time and manpower, as it is a project that requires
specialized knowledge of both the legacy system AND the ERP/EDW (this is an
example of Data Migration).
Flat Files
As mentioned before, in a perfect world, all of an enterprise’s data is present
in the ERP for instant extraction and reporting. In reality, there is a process
in place so that data within the ERP cannot be tampered with. Transactions will
usually have an approval process to help keep out doubtful and fraudulent
records, while Master Data is managed by key employees. There are also
instances where a branch is in such a remote location that an internet
connection is not available.
This is where Flat Files come in. Transactions for that branch will be recorded
in a flat file, later to be sent to the Head Office for processing and
consolidation. Flat files are usually Excel or delimited text files that
business users create in order to make their own reports when needed. Delimited
text files are usually either tab-delimited or comma-separated value (CSV)
files. These files can still be opened in Excel, though tab-delimited files
might need a few extra steps before they can be read (because they are text
files, Notepad will also do). In order to keep an accurate enterprise-wide
report, flat files have to be formatted in such a way that they can be uploaded
back into the ERP or Enterprise Data Warehouse.
Delimited and CSV File
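As a minimal sketch of how the two delimited formats relate, the Python snippet below parses the same branch transactions once as CSV and once as tab-delimited text. The column names and values are hypothetical, made up only for illustration.

```python
import csv
import io

# Hypothetical branch transactions; the column names are assumptions.
CSV_TEXT = (
    "branch,item_code,quantity,amount\n"
    "B001,SKU-001,2,150.50\n"
    "B001,SKU-002,1,99.00\n"
)
# The same data, tab-delimited instead of comma-separated.
TSV_TEXT = CSV_TEXT.replace(",", "\t")


def read_flat_file(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


csv_rows = read_flat_file(CSV_TEXT, ",")
tsv_rows = read_flat_file(TSV_TEXT, "\t")
```

Only the delimiter differs between the two calls. Every value comes back as text, so quantities and amounts still have to be converted to numbers before they are uploaded.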
Enterprise Data Warehouse
While the ERP system has some built-in reporting functionality, it is far from
a complete solution. The most obvious limitations are that custom reports are
difficult to create and that data visualization capabilities are lacking, if
present at all. What’s more, reports consume system memory while they are being
processed. This can have an adverse impact on the ERP system’s ability to
transact, especially if large, detailed reports (per customer or per item, or
worse, both) are needed. An Enterprise Data Warehouse is needed in order to
work around these limitations.
The Enterprise Data Warehouse is built in order to consolidate the disparate
data sources so that only the data necessary for reporting will actually be
used. Consolidating data is an important aspect of Business Analytics because
Business Analytics is first and foremost, above even facilitating data
analysis, concerned with delivering “a single version of the truth”: an
accurate representation of the business from any viewpoint. From an
implementation standpoint, this will require the following:
1. New hardware that will become the server hosting the Data Warehouse. It must
be connected to the corporate network.
2. A dedicated project team from the Enterprise Side made up of Business Users.
3. A dedicated project team, either from the Enterprise IT Team or an external
organization, that will be responsible for setting up the environment.
Building an Enterprise Data Warehouse is a massive undertaking that can take weeks,
months, even years to complete, depending on how large the target scope is. In order
to build an Enterprise Data Warehouse:
1. The Business Users will need to determine the reports they want to derive from
their data sources.
2. The Business Users will then convene with the IT Team in order to iron out the
technical requirements (Blueprinting). This includes providing information on
business processes and where the data can be obtained. This could take a few
days to a few weeks.
3. Once the IT Team has worked out the actual requirements needed by the
Business Users, it is time to implement the EDW to those specifications.
4. Testing will follow for data accuracy with the help of the Business Users.
Because the EDW is on separate hardware, it usually follows a daily “load
schedule” in which the previous day’s transactions are loaded into it. The load
is scheduled during off-peak hours, usually midnight or very early morning,
because those are the times when the source systems, the ERP especially, are
not being used.
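The selection step of such a daily load can be sketched as follows. This is only an illustration in Python: the record layout, the field name `posted_on`, and the function name are assumptions, and a real load would read from the ERP and write to the EDW rather than filter an in-memory list.

```python
from datetime import date, timedelta


def previous_day_transactions(transactions, run_date):
    """Return the transactions dated the day before run_date."""
    load_date = run_date - timedelta(days=1)
    return [t for t in transactions if t["posted_on"] == load_date]


# Hypothetical ERP transactions.
erp_transactions = [
    {"id": 1, "posted_on": date(2024, 3, 4), "amount": 120.0},
    {"id": 2, "posted_on": date(2024, 3, 5), "amount": 75.5},
    {"id": 3, "posted_on": date(2024, 3, 5), "amount": 40.0},
]

# A load that runs shortly after midnight on 6 March picks up 5 March only.
batch = previous_day_transactions(erp_transactions, date(2024, 3, 6))
```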
Data Reliability
The one inviolable rule when working with numbers and computers is this:
“Garbage In, Garbage Out”. Some people say “Numbers don’t lie”, but that is
incomplete, because the veracity of the numbers must be checked before
calculations are made and before any definitive statements can be made. This is
a constant challenge with Analytics. As the data travels and transforms through
the enterprise, something might get lost or unintentionally changed, and these
anomalies have a significant impact on the correctness of the reports being
produced. If one item is inaccurate, are the other items that came with it also
affected? Sometimes anomalies can be easily traced; other times, not so much.
It is for this reason that Clear Communication is a must, not just within the
company, but with everyone involved in a Business Analytics Project. The
following are just some of the ways inconsistencies can be introduced:
1. Inconsistent Terminology
A department might refer to an SKU as a
“Product” and another might refer to it as
“Material”. This extends to more than just the
labels. The “Product” department might be
using only the first 5 characters of the SKU’s
Code for their reporting, while the “Material”
department might need the whole 20-character
string for their own reporting. In that case, both
must be present and accounted for.
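One way to keep both views available is to store the full code and derive the short one from it. The Python sketch below assumes a made-up 20-character SKU code and hypothetical field names.

```python
def sku_views(sku_code):
    """Keep the full code and derive the 5-character prefix from it."""
    return {
        "material_code": sku_code,     # full string, used by "Material"
        "product_code": sku_code[:5],  # first 5 characters, used by "Product"
    }


# Hypothetical 20-character SKU code.
record = sku_views("ABCDE-0001-RED-XL-24")
```

Storing only the 5-character prefix would make the full code unrecoverable, which is why the derivation runs from long to short.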
2. Rounding Errors and Truncation
Consider the number of decimal places a given piece of numeric data has. As it
travels from the Source to the EDW to the Reporting Tool, it will have to be
encoded into different formats. Potential side effects include Rounding Errors.
This could cause final numbers to deviate from the source.
Truncation
Truncation will have the same effect (though more pronounced); however, instead
of rounding the number, decimal places are outright omitted.
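A small worked example, using Python's `decimal` module and made-up amounts, shows how both effects move a total away from the source. The three source values and the choice of two decimal places are assumptions for illustration.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# Hypothetical source amounts stored with three decimal places.
source_amounts = [Decimal("10.456"), Decimal("20.456"), Decimal("30.456")]
cent = Decimal("0.01")  # the target format keeps two decimal places

# Rounding: each value moves to the nearest cent.
rounded = [a.quantize(cent, rounding=ROUND_HALF_UP) for a in source_amounts]
# Truncation: the third decimal place is simply dropped.
truncated = [a.quantize(cent, rounding=ROUND_DOWN) for a in source_amounts]

source_total = sum(source_amounts)  # 61.368
rounded_total = sum(rounded)        # 61.38
truncated_total = sum(truncated)    # 61.35
```

Here the rounded total is off by 0.012 and the truncated total by 0.018, and the gap grows with the number of rows being summed.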
3. NULLs and Zeroes
Null Values represent “nothing”.
However, in computing, Nulls and
Zeroes are considered as different
entities. This can have an impact on the
evaluation of conditional formulas and
averages.
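The effect on averages can be seen in a short Python sketch, where `None` stands in for NULL. The `average` helper is hypothetical and skips missing values, which is how SQL's `AVG` treats NULLs.

```python
def average(values):
    """Average that skips NULLs (None), as SQL's AVG does."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


with_nulls = [100, None, 200]  # the middle amount is unknown
with_zeroes = [100, 0, 200]    # the middle amount is known to be zero

avg_nulls = average(with_nulls)    # 150.0, computed over two values
avg_zeroes = average(with_zeroes)  # 100.0, computed over three values

# A conditional such as "amount equals zero" also treats the two differently.
null_is_zero = (None == 0)  # False
```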
4. Incorrect Inputs
This is where the concept of “Garbage In, Garbage Out” is very
apparent. While ERP Systems usually have a built-in way to reject
incorrect inputs (inputting letters in a field that only accepts
numbers), some legacy systems don’t have this functionality. Even
worse still are the “technically correct” inputs that get accepted but
are gibberish (nonsense data and fields left blank that shouldn’t be).
Data cleanup to ensure consistency is a lot of work, and should only
be done as a last resort. The best way to avoid Garbage Inputs is to
put policies in place that will ensure correctness.
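One way such a policy can be enforced at the point of entry is to check each record before it is accepted. The Python sketch below is only an illustration: the field names `customer` and `quantity` and the rules applied to them are assumptions, not the checks of any particular ERP.

```python
def validate_transaction(row):
    """Return a list of problems found in one transaction row."""
    problems = []

    # A required field that is missing or only whitespace counts as blank.
    customer = row.get("customer") or ""
    if not str(customer).strip():
        problems.append("customer is blank")

    # Letters in a numeric field are rejected instead of stored.
    try:
        quantity = int(row.get("quantity"))
        if quantity <= 0:
            problems.append("quantity must be positive")
    except (TypeError, ValueError):
        problems.append("quantity is not a whole number")

    return problems


good = validate_transaction({"customer": "ACME", "quantity": "3"})
bad = validate_transaction({"customer": "  ", "quantity": "three"})
```

Checks like these catch blanks and wrong types, but not values that are well-formed yet meaningless, which is why policies on who enters data and how still matter.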
5. Outright Data Discrepancies
A company usually makes some tactical decisions (particularly in marketing)
where its products and services are joined together into promos and bundles in
order to take advantage of a gap in the market or a season and increase sales.
Since the bundles consist of different products, they also have an impact on
inventory. In other cases, a trial run of a new product is made available to
the market to test its viability. In both situations, a new Item should be
present in the ERP in order to reflect the numbers properly. However, because
these Items are special cases that had to be created quickly, they are for
internal use only by the departments responsible. They will have to be pushed
into the ERP later in order to get a more accurate reading on the enterprise as
a whole.
The Star Schema
A schema, or logical data model, is a representation of the abstract structure
of domain information. It is often expressed as a diagram and is used as a
foundation for designing database structures. There are many different kinds of
schemas, but the most commonly used one in enterprise computing is the Star
Schema.
A Star Schema is the simplest approach used in designing enterprise data
warehouses. It is composed of a Fact Table (usually just one) referencing any
number of Dimension Tables.
A Fact Table records measurements for a specific event. These are
typically referred to as Transaction Tables that contain very granular
numeric data. In addition to this numeric data (typically amounts and
quantities), it will also contain surrogate keys that define its relationships
to many Dimension Tables, which contain descriptive data. In an
enterprise, an “event” can be any sale that occurs. In other words,
A Dimension Table, by contrast, will contain fewer records than a Fact Table.
Dimension Tables don’t contain transactions; rather, they contain descriptive
information like Customer Information, Addresses, Date and Time, etc. The data
they contain is sometimes referred to as Master Data. In an enterprise, there
will be dedicated custodians for this kind of data, who follow a strict process
to add or edit it, because changes can alter the view of the enterprise data.
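A minimal sketch of a Star Schema, built in an in-memory SQLite database from Python, is shown below. The table names, column names, and rows are all hypothetical. One fact table holds the numeric measures plus keys pointing at two dimension tables, and a report is produced by joining the fact table to a dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive data, few rows.
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name TEXT,
    city TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    full_date TEXT
);
-- Fact table: one row per sale, numeric measures plus surrogate keys.
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    amount REAL
);
INSERT INTO dim_customer VALUES (1, 'ACME', 'City A'), (2, 'Globex', 'City B');
INSERT INTO dim_date VALUES (20240305, '2024-03-05');
INSERT INTO fact_sales VALUES
    (1, 20240305, 2, 300.0),
    (2, 20240305, 1, 120.0),
    (1, 20240305, 5, 750.0);
""")

# Report: total sales amount per customer name.
rows = conn.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

The fact table stores only keys and numbers, so correcting a customer's name in `dim_customer` changes every report that uses it without touching any sales row.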
