
Project Two: Gap Analysis & Proposal: Presented by John Miere

The document presents a gap analysis and proposal for an ETL process to move data from an outdated AS400 database into a SQL data warehouse. Currently, the Kansas City facility's data is stored separately from other facilities' data. The gap analysis identifies issues with the Kansas City data's format and accuracy. The proposal recommends extracting, transforming, and loading the Kansas City data into the SQL warehouse using an ETL process to standardize the data across all facilities. It also suggests implementing sandbox testing environments and additional ETL tools to support the project.


PROJECT TWO: GAP ANALYSIS & PROPOSAL
Presented by John Miere
GAP ANALYSIS
Kansas City Warehouse
Current and Future States
Current State
• Kansas City data is stored in AS400 database, resulting in errors during extraction of data to Excel spreadsheet.
• Data of other facilities are being stored in MSQL server warehouse.
• AS400 database is incompatible with MSQL server.
Future State
• Kansas City data is extracted and loaded into MSQL server to correspond with other facilities.
• Kansas City data is formatted to be free of errors and missing values in order to have uniform data.
• All future data from Kansas City will be loaded into the MSQL warehouse.
Identification of the Gap
◦ The Kansas City facility has stored its data in a database system that is now
defunct. This has caused issues with validity and consistency.
◦ The number of months of data extracted from the Kansas City dataset differed
from the number extracted from the datasets of the other stores.
◦ To close the gap and have everything running smoothly, the Kansas City
data must be formatted, corrected, and loaded into the MSQL warehouse
alongside the data of the other stores.
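The month-count variation noted above can be checked mechanically before any loading is attempted. The following is a minimal sketch only: the row layout, the facility names, and the helper names `months_by_facility` and `coverage_gaps` are assumptions made for illustration and do not come from the project data.

```python
def months_by_facility(rows):
    """Group the distinct month numbers seen for each facility."""
    coverage = {}
    for row in rows:
        coverage.setdefault(row["facility"], set()).add(row["month"])
    return coverage


def coverage_gaps(rows, reference_facility):
    """Return, per facility, the months the reference has but that facility lacks."""
    coverage = months_by_facility(rows)
    reference = coverage[reference_facility]
    return {
        facility: sorted(reference - months)
        for facility, months in coverage.items()
        if facility != reference_facility and reference - months
    }


# Hypothetical sample rows; facility names and month values are illustrative only.
sample_rows = [
    {"facility": "Other", "month": 1},
    {"facility": "Other", "month": 2},
    {"facility": "Other", "month": 3},
    {"facility": "Kansas City", "month": 2},
    {"facility": "Kansas City", "month": 3},
]
gaps = coverage_gaps(sample_rows, reference_facility="Other")
print(gaps)
```

A non-empty result flags which facility is missing which months, which is a cheap first check before deciding how to correct the numbering.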
Move From Current State to Future State
Extract Kansas City data from incompatible AS400 database into Excel spreadsheet.

Use supplemental Word document to correct any null and/or error values.

Format data to match the format in the MSQL warehouse, such as the numbering of
the months. Kansas City months start at “2”; this needs to be corrected in order
to be consistent with the other facilities' data.

Load the formatted and corrected data into the MSQL warehouse alongside the
other facilities' data.

Ensure all future data from the Kansas City store flows into the MSQL
warehouse.
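The month-numbering correction in the formatting step could look roughly like the sketch below. It assumes the fix is a uniform shift so that the first month lines up with a chosen starting number. The function name `renumber_months`, the `target_start` parameter, and the sample rows are hypothetical. Whether a shift is the right correction depends on why the numbering differs (a mislabeled sequence versus a genuinely missing month), which should be confirmed against the supplemental document first.

```python
def renumber_months(rows, target_start=1):
    """Return copies of rows with months shifted so the smallest equals target_start."""
    if not rows:
        return []
    offset = target_start - min(row["month"] for row in rows)
    # Build new dictionaries so the extracted rows stay untouched.
    return [{**row, "month": row["month"] + offset} for row in rows]


# Hypothetical Kansas City rows whose months start at 2.
kc_rows = [{"month": 2, "sales": 100.0}, {"month": 3, "sales": 120.0}]
fixed = renumber_months(kc_rows, target_start=1)
print(fixed)
```

Returning new rows instead of editing in place keeps the extracted data available for comparison if the correction turns out to be wrong.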
Visualize the Gap Analysis
Current State → Desired State
• Database is incompatible with MSQL warehouse → Extract to Excel spreadsheet
• KC data is being loaded into defunct AS400 database → Future data will be loaded into MSQL warehouse
• Data stored in Excel spreadsheet, separate from other facilities → Data will be moved into MSQL warehouse with other facilities
• Data stored in Excel with null and error values → Data to be corrected, free of null and errors, formatted to match other facilities' data
Summarize the Gap Analysis
◦ The analysis shows that, to operate efficiently as a business, a warehouse
storing the data of ALL facilities is crucial to accurate, timely reporting and
data analysis.
◦ Creating an ETL process would be beneficial, as it makes data easier to access and
analyze and supports quicker, better-informed decisions.
ETL (Extract, Transform, and
Load) Process
◦ Extract
◦ Extract the KC data, in its current form, from the AS400 database where it currently
resides into an Excel spreadsheet to view and correct as necessary.
◦ Transform
◦ Correct any null and error values with the aid of the supplemental Word document,
and verify accuracy.
◦ Load
◦ Load all KC data accurately and consistently into the MSQL warehouse that houses
the data of the other facilities.
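The three stages above can be sketched end to end as follows. This is an illustration under stated assumptions, not the project's actual pipeline: the CSV text stands in for the spreadsheet export, the `CORRECTIONS` dictionary stands in for the supplemental Word document, SQLite stands in for the MSQL warehouse, and every column, table, and function name is hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical export of the Kansas City data; column names are assumptions.
KC_EXPORT = """month,product,units
1,widget,10
2,widget,
3,widget,#ERROR
"""

# Hypothetical stand-in for the supplemental document:
# (month, product) -> corrected units value.
CORRECTIONS = {(2, "widget"): 12, (3, "widget"): 9}


def extract(text):
    """Read the exported rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, corrections):
    """Replace null or non-numeric unit values using the corrections lookup."""
    cleaned = []
    for row in rows:
        month = int(row["month"])
        key = (month, row["product"])
        raw = (row["units"] or "").strip()
        if raw.isdigit():
            units = int(raw)
        elif key in corrections:
            units = corrections[key]
        else:
            # Fail loudly rather than load a value nobody has verified.
            raise ValueError(f"no correction available for {key}")
        cleaned.append(("Kansas City", month, row["product"], units))
    return cleaned


def load(conn, records):
    """Insert cleaned records into the shared facility table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facility_data "
        "(facility TEXT, month INTEGER, product TEXT, units INTEGER)"
    )
    conn.executemany("INSERT INTO facility_data VALUES (?, ?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # SQLite stands in for the MSQL warehouse
load(conn, transform(extract(KC_EXPORT), CORRECTIONS))
loaded = conn.execute(
    "SELECT month, units FROM facility_data ORDER BY month"
).fetchall()
print(loaded)
```

The transform step raises an error when a bad value has no documented correction, so unverified data never reaches the warehouse.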
Sensitive or Confidential
Information Handling
◦ The data involved is not sensitive, but access should still be limited to the
internal processes and workers who need to use it, such as those who currently
have access to this data at the other facilities.
PROPOSAL TO
DATA STEWARD
Issues Identified in Project One
◦ Kansas City data is stored in a defunct AS400 database that is incompatible with the
MSQL warehouse.
◦ KC data initially shows null and error values.
◦ KC data is missing one month.
◦ Months may have to be renumbered to start at “2”.
◦ Values loaded into Excel are in scientific notation.
◦ Data needs to be reformatted to match the other facilities.
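Values that arrive in scientific notation can be converted back to plain numbers during the transform step. A minimal sketch, assuming the values reach the script as text such as `1.2345E+05`. The helper names `from_scientific` and `as_plain_string` are hypothetical. Digits that were already dropped when the spreadsheet was exported cannot be recovered this way.

```python
from decimal import Decimal, InvalidOperation


def from_scientific(text):
    """Parse text such as '1.2345E+05' into a Decimal; return None if not numeric."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None
    # Decimal accepts 'NaN' and 'Infinity'; treat those as error values too.
    if not value.is_finite():
        return None
    return value


def as_plain_string(value):
    """Render a Decimal in fixed-point form, without an exponent."""
    return format(value, "f")


parsed = from_scientific("1.2345E+05")
print(as_plain_string(parsed))
print(from_scientific("#N/A"))
```

Using `Decimal` rather than `float` avoids introducing binary rounding on top of whatever formatting the spreadsheet applied.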
Production and Testing
Environments
◦ Sandbox environments are important to preserving data integrity and
should be supported for formatting and loading new data.
◦ A sandbox allows data to be manipulated without putting the preserved production data at risk of error.
◦ A master copy of the data should be secured and kept separate as a backup to prevent
data loss during the ETL process.
◦ The ETL process should be tested in the sandbox, then moved to the production
environment if proven successful.
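The test-then-promote flow described above could be sketched as below. It assumes a throwaway in-memory SQLite database as the sandbox and a second SQLite connection standing in for production. The table layout, the validation rules, and the names `run_load`, `validate`, and `promote_if_valid` are all hypothetical.

```python
import sqlite3


def run_load(conn, records):
    """Apply the same load logic to whichever database is passed in."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS facility_data "
        "(facility TEXT, month INTEGER, units INTEGER)"
    )
    conn.executemany("INSERT INTO facility_data VALUES (?, ?, ?)", records)


def validate(conn, expected_rows):
    """Check that every row arrived and that no units value is null."""
    count = conn.execute("SELECT COUNT(*) FROM facility_data").fetchone()[0]
    nulls = conn.execute(
        "SELECT COUNT(*) FROM facility_data WHERE units IS NULL"
    ).fetchone()[0]
    return count == expected_rows and nulls == 0


def promote_if_valid(production, records):
    """Run the load in a sandbox first; touch production only if it validates."""
    sandbox = sqlite3.connect(":memory:")
    try:
        run_load(sandbox, records)
        ok = validate(sandbox, expected_rows=len(records))
    finally:
        sandbox.close()
    if ok:
        run_load(production, records)
        production.commit()
    return ok


production = sqlite3.connect(":memory:")  # stand-in for the production warehouse
good_records = [("Kansas City", 1, 10), ("Kansas City", 2, 12)]
promoted = promote_if_valid(production, good_records)
print(promoted)
```

Because the sandbox and production runs share `run_load`, what gets tested is the same code that later runs against production.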
Additional Data Resources
◦ Additional resources that may be needed for this project include ETL
tools such as Talend or Stitch.
