A Proposal to Enhance Infrastructure for Managing Big Data within Xxxxxxx

Student Name
Module Code
Tutor: Tutor name
Submission: Date
Contents Page

1.1 Introduction and Business Context
2.1 Current Data and Metrics within the Xxxxxxx Supply Chain
3.1 Current Infrastructure and Technology for Big Data
4.1 Proposals for Enhancements
5.1 Discussion and Conclusion
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
Bibliography

1.1 Introduction and Business Context
Big data is, as Lee (2017, page 293) states, “one of the most important areas of future information technology and is evolving at a rapid speed”. As Ularu et al (2012) discuss, the phrase “big data” was first discussed in the 1970s but has only become popular from around 2005 onwards. Big data is defined by the 2011 McKinsey Global report (page 1) as “datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse”. This data can be collected in traditional structured forms but is also likely to have an element of unstructured forms such as social media content, free text, video content, and emotional response data. It is this combination and vast array of data which is largely considered “big data”. As Bhadani and Kothar (2015, page 168) suggest, businesses are now “churning these unstructured data for gaining insights about present and future trend of the business, or predicting consumer behaviour” amongst other uses.

Broadly speaking, today the world has more data available than ever before, but as Harris (2012) discusses, simply having data alone is not enough. For an organisation to get real value out of the data, it must analyse it, and to do so, the data must be stored in a way that is conducive to being analysed. Further, the right tools to capture, store and analyse the data must exist, along with the right skillset to carry out these activities. This report is designed to critically analyse the current data and the infrastructure for managing big data within the context of Xxxxxxx Motor Manufacturing. It will then examine potential opportunities to improve the current infrastructure to make data analytics easier across the organisation, unlocking increased business value from the data and helping to give the organisation a competitive edge.

Xxxxxxx Motor Manufacturing is a global auto-maker which, in partnership with the Renault-Xxxxxxx-Mitsubishi alliance, annually makes over 10.7 million passenger vehicles (Reuters, 2019). Xxxxxxx traces its heritage back to 1914 and has been in the European market since the 1960s. The UK manufacturing operation, Xxxxxxx Motor Manufacturing United Kingdom (NMUK), started production in 1986 (Xxxxxxx, 2019). Since this point, Xxxxxxx has grown its UK operations to produce over 500,000 vehicles a year, or 115 cars an hour, in “one of the most efficient car plants in the world” (Moody, 2016). With such a large manufacturing footprint, supported by over 500 globally located suppliers in the supply chain, and with finished vehicle customers across the globe, there are considerable amounts of data produced, stored and analysed by the organisation.

The author of this report works within the inbound supply chain development team, specifically considering how the supply chain supporting the European manufacturing footprint can be made more efficient by both business improvement and systems enhancements, with an underpinning drive to increase the usage of data to drive decision making. It is, however, impossible to look at one function alone when talking about data. Whilst this report will touch on the full global and European business at a high level, it will focus predominantly on the inbound supply chain and the immediate data interdependencies between external sources and other internal functions.

Xxxxxxx, like many of its counterparts in the automotive industry, the manufacturing industry and global industry as a whole, has latched onto the concept and importance of big data and has begun to store more data and try to gain insights from it. For many companies, the logic and reasoning behind this is summarised by what the 2011 McKinsey report goes on to explain, that “data can create significant value for the world economy, enhancing the productivity and competitiveness of companies”. BMW is leading the way within the automotive industry, restructuring its logistics function into a start-up style organisation, with the aim of creating connected supply chains, where more data is generated by utilising the Internet of Things (IoT) and analytics can then be run on this data to make the operations “more transparent and more efficient” (Peters, 2019). Looking outside the automotive industry but considering manufacturing supply chains more generally, it is important to look at and learn from Intel, which can be seen as an innovative supply chain leader, having appeared in the top six of Gartner's Top 25 Supply Chains for the past five years (Gartner, 2015-2019). Through its application of a data lake and analytics, Intel was able to save $121 million from its supply chain inventory.

The Xxxxxxx Supply Chain Management (SCM) function is responsible for developing new suppliers, managing suppliers and their on-time supply of parts, controlling inventory levels and inventory costs, managing all elements of forecasting and scheduling, and overall supplier appraisal. This requires cross-functional collaboration between the design teams, purchasing function, sales companies and the production management team. As a result, there is a lot of data and information shared across the internal organisation, but equally, a lot of data is exchanged with suppliers, logistics companies and external sales teams. The subsequent sections in this report will look at how this information is shared and made available to the various functions which can use it, and will make recommendations in section 4.1 about how this exchange, storage and sharing of data could be improved.

Whilst it has had sales operations in Europe since the late 1960s, Xxxxxxx started to grow and expand rapidly within Europe from the 1980s, when it established its manufacturing footprint there. The internal operations system was built on an IBM Mainframe, as was common in this era. Within the mainframe, various applications were developed, from customer ordering, pipeline management and parts forecasting to stock ordering to suppliers, all heavily reliant upon the traditional batch processing that was common across mainframe systems developed at this time. The driver behind the mainframe was, and still is, an Adabas pseudo-relational database with six main data tables within it. Since this time, systems have developed and evolved, and more detailed discussion on the infrastructure currently in use within Xxxxxxx is carried out in section 3.1.

The automotive industry as a whole is a complex ecosystem of original equipment manufacturers (OEMs), tier 1
suppliers and a multi-layered web of sub tier suppliers. Masoud and Manson (2016, page 1) note that “due to the
nature of this industry, companies operate under tremendous pressure to carry low inventory levels while still
meeting acceptable customer service levels.” They go on to comment on how this manifests through putting
pressure on supply chain management functions, to carefully manage an ever increasingly complex supply chain,
to keep costs as low as possible, whilst also reducing exposure to risk; ultimately seeking to balance and improve
“productivity, profitability, and competitiveness.”

Within the automotive industry, there is a high level of combination and complexity in the potential buildable specifications, calling for high management complexity to ensure stock profiles are correct based on production build plans. Masoud and Manson talk about how nearly all OEMs have now adopted the Japanese-originated just-in-time (JIT) manufacturing methodology, to differing degrees, whereby the manufacturer holds a minimal quantity of parts and relies on continued, uninterrupted deliveries from their global supply chains. In support of this, automotive companies must consider their supplier locations, order lead times, cost of inventory holding and cost of inventory recovery, and weigh this data against pipeline flexibility, customer demand fluctuation and building to order or to customer demand. Therefore, to get the optimal inventory holding, which allows minimised costs and maximum profits, a great deal of data has to be considered; this is why data analytics offers such a big opportunity within the automotive sector’s supply chain function.

In a recent automotive industry report, Liguori (2019) suggests that through data analysis, there is potential for $425 billion of value to be added to the automotive industry (currently worth $2 trillion worldwide (Cision News, 2018)) and over 350 million tons of CO2 to be saved. From this claim, it is evident that high value can potentially be unlocked by the automotive industry by analysing data and making changes based on these analyses. There will be winners who manage to unlock profitability by applying analytics to improve their operational efficiencies, and winners who enter new markets faster and more competitively than incumbents who rely on continuing to do the same thing. As Ylijoki and Porras (2018, page 486) confirm, “big data and advanced analytics are capable of enabling disruptive innovations”. This is already being seen within the automotive industry, with some big names, like Fiat, Vauxhall and GM, all being bought or merged in recent years after standing still and not using data to improve efficiencies, and as a result being left behind (NB/ in each of these cases other factors were also at play which led to their downfalls and need to merge or be bought). New players like Tesla, who are doing as Ylijoki and Porras suggested in using big data, advanced analytics and new technology to cause disruptive innovation in the industry, have in turn pushed other incumbent organisations to try and adapt to keep up.

Before moving on with this report and studying the detail of what data is being stored and the infrastructure for it, it is important to reference the organisational structure so that context for the organisation's data is understood.

Figure 1 below shows a simplified Xxxxxxx global structure to illustrate where the Xxxxxxx Europe SCM function sits. One of the key things to note is that it is a large and distributed organisation, with regional functions reporting to both regional management and global functional management. Within the three key regions (Asia and Africa, Europe and America) there is a degree of process autonomy, which often leads to regional systems being developed. There are global Centre of Excellence (COE) forums which aim to align best practice and share common systems where possible, but the nature of the organisation's structure means that these global systems are often limited to the largest value systems, something which is often not aligned across the global organisation.
Figure 1, Simplified Xxxxxxx Global Organisation Chart

NB/ Figure 1 does not show the full organisation. There is replication of functions within the other two regions
(Asia and Africa and Americas) and there is replication of the illustrated UK functions within Spain and Russia
within the European Manufacturing and Supply Chain function. Further, this illustration does not go into
granularity within each of the other functions.

2.1 Current Data and Metrics within the Xxxxxxx Supply Chain
As discussed in section 1.1, for any data to be of value, it must first be analysed to allow insight to be gained
(Harris, 2012). In 2001, Laney proposed three key characteristics of big data – volume, variety and velocity.
Whilst these characteristics have been accepted as the basic characteristics of big data, it has since been widely
acknowledged that this list is not exhaustive and veracity, variability and value are all additional metrics considered
relevant to the conversation. (For context as to how these terms are defined and used within this report, a brief
definition of each of these characteristics can be found in Appendix 1). Simply having data which subscribes to these characteristics is meaningless. Marr (2014) agrees with this and expands upon it further, proposing that true value comes not from simply having big data but from utilising it to transform business decision making. He presents
a model (figure 2) in which organisations need to look firstly at their organisational strategy, before then reviewing
their data metrics and understanding if the data they are collecting is helping them deliver their organisational
strategy. By understanding this model, then carrying out analytics in line with it and allowing the outputs to
transform decision making within a business, Marr suggests organisations will begin to see real value from big
data. Based on Marr’s model, technology is a foundation which helps with all pillars of the model, but specifically
in performing the data collection and analysis.
Figure 2, Marr SMART Model

Marr, 2014 page 21


For Xxxxxxx Europe, the overarching strategy, alongside mid-term strategies of achieving specific profit margins, is
to become the leading Asian car brand in Europe. More specifically, the key strategy of the Xxxxxxx SCM function
is to support the overall profitability of the production function, which in turn supports the overarching strategy of
becoming the most desirable Asian brand in Europe. Supporting overall profitability of the production function
means reducing supply chain-based costs (inventory, logistics, pipeline management); managing risks and the
impact of risks to minimise production impact; and optimising production runs to maximise output. To explore the
strategy further, another of Marr’s tools comes in useful, the SMART strategy board. This can help an organisation,
or function, define, redefine or simply think about their strategy in more detail. For Xxxxxxx SCM, a SMART
strategy board has been completed and a short commentary added in Appendix 2.

As suggested by Marr, understanding this strategy is a prerequisite for being able to understand if the data metrics
being collected and analysed are correct, or if they are simply metrics which are reported on but add no value to
supporting the functional and organisational strategy. It is possible that current metrics need to be refined or removed if more appropriate measures exist that would enable the organisation to better support its strategy. This report will now
critique the current data metrics recorded within Xxxxxxx SCM to understand if big data is being recorded and if
the data being recorded is useful in supporting the organisational strategy.

Within the Xxxxxxx SCM function, there is a vast variety of data. This is the first indication that there is big data within Xxxxxxx's SCM function and that appropriate big data storage systems and analytics processes should be in place. Exploring this idea of variety further, table 1 shows a list of the main data of interest within Xxxxxxx's SCM function.

Table 1, Big Data Within Xxxxxxx Europe SCM Function
Data | Structured/Unstructured | System of creation | System of capture | System of storage | Accessibility | Value of data | Volume | Velocity | Does metric support strategy
Parts projections | Semi-structured | MRP | MRP | Adabas table | 3270 screen/web interface | Operation critical | 150 buckets per 18,000 PNs – 2.7M records per run | 1 file per 12 hrs | Yes
Parts achievements | Semi-structured | MRP | MRP | Adabas table | 3270 screen/web interface | Operation critical | 1.2 million records per run | 1 file per 12 hrs | Yes
Real-time parts achievements | Semi-structured | RTA | Replicator | AWS S3 bucket | Web screen | High | 5,000 records per minute | 1 run per minute | Yes
Supplier capacity planning | Structured | External creation | ADCP/APPV | Renault data lake | – | High | 3MB per file, 1,500 per study | 1,500 files every 6 months | Yes
Plant pipeline | Semi-structured | SCOPE/plant schedule | – | Adabas table | 3270 screen | Operation critical | Within MF table | Changes once per day with batch | Yes
Plant schedule proposals | Structured | Central scheduling | Central scheduling | Oracle database | Web screen | Medium | 25MB per run | Weekly run | Yes
Pipeline optimisation requests | Unstructured | SCOPE | Email | None/email | n/a | Medium | Ad hoc | Ad hoc | Yes
Pipeline optimisation feedback | Unstructured | External creation | Email | None/email | n/a | Medium | Ad hoc | Ad hoc | Yes
Inhouse shop schedules | Semi-structured | Excel | MRP | Adabas table | Excel/3270 screen | Operation critical | 100MB per run | Changed up to 5 times per day | Yes
Vehicle specs | Semi-structured | CSM | CSM | Adabas table | 3270 screen | Operation critical | Within MF table | Changes once per day with batch | Yes
Part application | Semi-structured | G2B BOM | M-BOM | Adabas table | Web screen/3270 screen | Operation critical | Within MF table | Changes once per day with batch update | Yes
Supplier delivery performance | Semi-structured | SAIS | SAIS | Oracle database | Web screen | Medium | 35GB | Ad hoc updates | Yes
Supplier quality performance | Semi-structured | SAIS | SAIS | Oracle database | Web screen | Medium | 35GB | Monthly processing run | Yes
Parts ordering | Structured | MRP | MRP and WCS | Adabas table | 3270 screen/web interface | Operation critical | Within MF table | Run in real time, orders sent in daily batches | Yes
Trial part ordering | Semi-structured | MRP | MRP | Adabas table | 3270 screen | Medium | Within MF table | Periodically based on trial reqt | Yes
Product structures | Semi-structured | MRP | MRP | Adabas table | 3270 screen | Operation critical | Within MF table | Ad hoc | Yes
Vehicle achievement | Semi-structured | VET | VET | Adabas table | 3270 screen | High | ~600 records per run | 1 file per 12 hrs | Yes
Body build planning | Structured | MRP | Sequencer Reporting Application System | Oracle DB | Web screen | Operation critical | 35GB | Multiple runs per day | Yes
Concession traceability | Structured | MS Access DB | MS Access DB | MS Access DB | Key users with access to the DB in a table format | High – legal compliance | 10GB | Ad hoc updates | Yes
Supplier ordering | Semi-structured | EDI | MRP | Adabas table | 3270 screen/web interface | Operation critical | Within MF table | Daily run with up to 18,000 records per run | Yes
Order tracking | Semi-structured | External creation | ASN | Adabas table | 3270 screen/web interface | Medium | Within MF table | Real-time updates | Yes
Vendor static data | Structured | MRP | MRP | Adabas table | 3270 screen/web interface | Medium | Within MF table | Ad hoc updates | Yes
Part static data | Semi-structured | MRP | MRP | Adabas table | 3270 screen/web interface | Medium | Within MF table | Ad hoc updates | Yes
MUSEP | Semi-structured | MRP | MRP | MRP | Flat file | High | 25MB file | 1 file per 12 hrs | Yes
Re-scheduled MUSEP | Structured | STS/LTS and V2S | MRP | MRP | Web screen | Medium | Created ad hoc | Ad hoc | Yes
Consumables consumption | Semi-structured | CSS | CSS | Adabas table | 3270 screen | Medium | Within MF table | Daily batch job run | Yes
Packaging data | Semi-structured | PVS | PVS | Oracle database | Email and DB | Medium | 35GB | Daily updates in real time | Yes
CAT3 ordering data | Semi-structured | MRP and BOM | MRP | Oracle database | Web screen | Operation critical | 10GB | Daily updates | Yes

This table primarily looks at what data is currently being generated and stored, even if where it is stored is not deemed the most appropriate or optimal location. It is worth noting that other data is generated during the production and supply chain processes; however, these are the key, systemised pieces of data which are centrally stored. Other data is often unstructured or may be stored locally in spreadsheets or email chains in a highly distributed way, whereby it is not easily accessible for further analytics. With better structures in place to capture, store and use these data streams, benefits could be unlocked to support the organisation's goals.

As well as exploring the variety of data relevant to Xxxxxxx’s SCM function, table 1 begins to look at some of the
other characteristics of big data, to help assess further whether the data dealt with truly is big data, or just a high
variety of data. In terms of volume, value and velocity, all these characteristics are examined in the table, to
varying degrees. Each individual data set has its own set of values against each of these characteristics.

Individually, some data types can certainly not be considered “big data”. For example, packaging data, whilst of medium value to the business, is very low volume, especially in comparison to the parts achievement records. Nevertheless, it is still a valuable piece of data in the wider SCM data set, if used correctly. Therefore, the author proposes that, once considered as a full data set, the SCM function has a big data set that should be analysed.

One important characteristic to look at is the volume of data. Whilst at first glance none of the records in this table symbolise a traditional “big data” metric in terms of volume, the combination of the fields marked “Within MF table” is currently around 64 terabytes. It is difficult to break this volume down to the specific records due to the Adabas data storage style, whereby records are maintained within the same tables to improve retrieval performance. By far the highest volume data sets are the parts achievement data sets. Whilst it might be true to say that, by itself, this cannot be classed as “big data” in terms of volume, as big data is often considered to range from around 30 terabytes to multiple petabytes (Navint, 2012) and the full MF data table is only 64 terabytes, this is only part of the picture. Firstly, the Xxxxxxx systems archive data, to differing degrees depending on the system. Therefore, some live systems only hold six months’ worth of some data. To hold more data to enable trend analysis is currently not always possible based on the individual system limitations. It is also true to say that these volumes only reflect the current volume held for each metric, which is not to say it is the optimal volume of each data metric. If the velocity of a metric were to increase, so too would the volume. Further, as already discussed, volume is only one characteristic of big data and, whilst volume might be low, the true value in a metric may come from its velocity or veracity instead.

Whilst table 1 looks primarily at the data captured by, and of most interest to, the European SCM function, it is
worth noting that, whilst some of this data is purely generated by and used by the SCM function (for example the
supplier delivery performance and order tracking ASN data), much of the data has interdependencies, cross
references, links and usages with other departments. Some of the data is captured by other functions and
exposed to the SCM function for usage, for example the pipeline data is owned outside of the SCM function by the
sales function, and this is then exposed to SCM by having a common viewable interface, without having write
access to the dataset. Equally, the SCM function records and creates data records like supplier capacity which is
used by other SCM functions globally to support demand planning and shortage prioritisation and by purchasing
functions to support capacity investments. In support of this, a common, global, input interface has been created
called ADCP (Alliance Demand Capacity Planning) with the data being exported to the Renault data lake. Access
can then be gained by multiple functions through the Renault data lake to enable extraction of this data to be used
as appropriate within their functional scope (Section 3.1 discusses the Renault data lake in more detail).

Developing upon this idea further, it is important to note that, as well as data generated internally by the various functions within Xxxxxxx, there is a vast array of big data available from sources outside of Xxxxxxx, which can be of key interest to the supply chain management function in enabling it to achieve its functional goals. There are many reasons why an organisation should seek to use data from outside its immediate organisation, but one main reason can be seen in the Deloitte (2013) Risk Intelligent Enterprise framework. Within this framework, Deloitte suggest that business units have a responsibility and opportunity to identify risks and design and implement responses to risk. To truly capture risks which can affect the supply chain, data must be gathered from outside the immediate organisation, provided it appropriately links back to Marr’s model and can, in some way, support the organisation in meeting its overall aims and objectives. By being able to connect with external data sources, organisations can better identify risks, enabling them to prepare for and respond to them, with the aim of avoiding catastrophic business impact.

External data, based on its variety, volume and velocity, can often be classified as big data by itself. One key example of this, of particular importance to the Xxxxxxx SCM function, is weather-related data, whilst another is geological data (e.g. earthquake prediction data). A key reason for this, specifically for the aforementioned data sources but applicable to other external data streams also, is, as Vieira et al (2019, page 1) note, that such events “May jeopardize supply chain crucial activities, such as production, which, in its turn, may also lead to customers’ orders unfulfillment and hence negatively affect the supply chain”. This emphasises the importance of including these external data sources in the wider discussions on data architecture and future architecture in relation to big data. Data from both internal and external sources must be able to be analysed together to drive as much value for the organisation as possible. This may mean storing external data to bring it into an appropriate location to use in conjunction with internal company data, or it may mean linking into the external data through APIs. This decision should be made based on the external data source, its stability, the latency in getting a response and its usage within an organisation.
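
To illustrate the API-based option, the sketch below shows how an external weather feed might be pulled programmatically for a plant or supplier location. The endpoint, parameters and coordinates are purely illustrative assumptions and do not represent any provider Xxxxxxx currently uses; the point is simply that an external source can be queried on demand and the response landed somewhere it can later be analysed alongside internal data.

```python
import json
import requests

# Hypothetical endpoint; a real integration would use the chosen provider's
# documented API and authentication scheme.
WEATHER_API_URL = "https://api.example-weather.com/v1/forecast"

def fetch_weather_for_site(latitude: float, longitude: float) -> dict:
    """Pull a short-range forecast for a plant or supplier location."""
    response = requests.get(
        WEATHER_API_URL,
        params={"lat": latitude, "lon": longitude, "days": 3},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    # Illustrative coordinates near the NMUK plant; the response is stored
    # locally so it can later be ingested alongside internal SCM data.
    forecast = fetch_weather_for_site(54.90, -1.38)
    with open("weather_forecast_nmuk.json", "w") as f:
        json.dump(forecast, f)
```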

Whilst the author of this report sees value in all the data that is currently produced, there are some important points to note which influence the overall value derived from the data. Firstly, not all data is of equal value; some has greater significance than others, and even within a data type, some records have more significance than others. One example is ASN tracking data (ASN is advanced shipping notification, where Xxxxxxx is made aware in advance of what has been shipped, so it knows whether there are any short shipments en route to the plant rather than finding out about missing parts upon the truck’s arrival at the warehouse). Whilst this data provides high value in illuminating production stock coverage, missing ASN data is more significant for some part numbers than for others. Secondly, whilst the current data types may all add value to the organisation, the author
how long processing data takes (both in terms of creation and any analytics of it), and this processing of data can in
turn lead to a lower velocity in data availability. A specific example of this within the context of Xxxxxxx is the
speed with which the mainframe system processes vehicle achievement. Due to the speed and batch processing
structure of the mainframe this happens twice per day rather than dynamically in real time. If this could be
improved to achieve in real time, effectively achieving a vehicle roughly once per minute as it is built rather than
half a day’s production (approx. 600 vehicles) in one batch, then this higher velocity of data would allow better
decision making to take place which would enable the JIT process to work better, further supporting the functional
goal of reducing the production costs. In this case, the data metric is an appropriate metric to record and can be
used for analytics, however if its velocity can be improved, potentially there is an opportunity to get greater value
out of the same metric. Overall, the volume would not change, as the same achievements are being recorded, but
at differing periods.

Another area where the data value is lower than it could be is management perception of data veracity. In its simplest form, Spacey (2017) states that “Data veracity is the degree to which data is accurate, precise, and trusted”. Whilst each piece of data has its own veracity, dependent upon how it was derived, its source and the inputs to this source, data overall is often seen as one tool in management decision making, and “data” can be given a confidence level as a whole, rather than on an individual data source level. There are two possible ways to work around this situation: firstly, increase veracity in each piece of data which is feeding the wider “data” pool; and secondly, separate data sources out and have some form of veracity mark assigned to source data to ensure that not all data is disregarded or tainted by some having a lower veracity. The current Xxxxxxx management have a vast wealth of operational experience and therefore often make decisions based on experience and feeling rather than data when faced with difficult decisions. Long (2017, page 55) suggests this is often the case within supply chains, noting that “a large proportion of the decisions are made based purely on domain knowledge or long experience of decision makers in their supply chain.” To unlock the full potential of data in the Xxxxxxx organisation, data-driven decision making must be the norm. Long emphasises the benefit of data-driven decision making by pointing out that “Data-driven decision making…derives macro knowledge and rules from the micro operational data of supply chain networks to support decision making.”
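
As a minimal sketch of the second approach, the snippet below attaches an explicit veracity mark to each data source so that low-confidence inputs can be filtered or weighted rather than dragging down trust in the whole data pool. The source names and scores are invented for illustration only.

```python
from dataclasses import dataclass

@dataclass
class DataSource:
    name: str
    veracity: float  # 0.0 (untrusted) to 1.0 (fully trusted)

# Illustrative scores; in practice these would be agreed with data owners.
SOURCES = [
    DataSource("Parts achievements (MRP batch)", 0.95),
    DataSource("Supplier capacity planning (external input)", 0.70),
    DataSource("Pipeline optimisation requests (ad hoc email)", 0.40),
]

def trusted_sources(sources, threshold=0.6):
    """Return only sources whose veracity mark meets the decision threshold,
    so low-confidence inputs do not taint the wider data pool."""
    return [s for s in sources if s.veracity >= threshold]

if __name__ == "__main__":
    for source in trusted_sources(SOURCES):
        print(f"{source.name}: veracity {source.veracity:.2f}")
```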

The final area to comment on in terms of data metrics links to the next section on data infrastructure. Within the current data infrastructure, there is much duplication of the data that is held, due to there being no strong master data process. This means many data fields are repeated and stored in multiple systems and data sets. The author believes that an improvement in this area could lead to a reduction in the data volume without removing any real data, so the value of data could improve as data analysis would perform quicker. Further, due to the current infrastructure of data storage, it is often not easy to transfer data between systems for analysis or to transpose all of the right data into an analytical tool to carry out the analysis. By building the infrastructure further in terms of master data and data storage, analysis will become more effective and more value can be driven from the data.

3.1 Current Infrastructure and Technology for Big Data
This section will now explore the data further by looking at the current capture and ingestion processes, any data
cleansing processes and the overall infrastructure for storing data for data analytics. To be of greatest value, data
must not only be stored, but be stored in a way which allows analysis to take place.

As introduced in section 1.1 Xxxxxxx uses a mainframe as the backbone for much of its production and SCM
management systems as well as for the sales and pipeline systems infrastructure. Table 1 in section 2.1 begins to
highlight how many internal Xxxxxxx applications sit on and processes occur within the mainframe (note, this table
only shows processes of specific interest to the SCM function, many additional processes exist on the mainframe
not mentioned in this table). Over the years since its first introduction to Xxxxxxx Europe as a traditional IBM
mainframe with 3270 terminal screens, additional applications have been built both within the original mainframe
architecture providing an increased library of 3270 screens and more recently on web pages which write back to
the same mainframe Adabas database tables. The key advantage of utilising the Adabas database over more
traditional relational databases is the way data can be indexed in Adabas. This can change the way applications
read and retrieve data from the database, increasing performance speed of the applications linked to the
database.

Within Xxxxxxx’s SCM function, systems have been developed over the last 15 years to move the function away
from the traditional IBM mainframe system that has run the business over the last 30 years, often introducing
other databases and technologies to perform specific tasks. Some of these developments have been forced due to
necessity (e.g. to allow unstructured data to be captured, processed and stored for example in the SAIS system for
supplier appraisal), and some have been evolutionary based on the opportunities that have arisen to incrementally
improve the processing and capture of data (e.g. moving the body build application from a mainframe application
to an Angular Java application with an Oracle database behind it.) In many of these newer systems and
applications there are still strong links and dependencies on the mainframe system to power elements of the
processing, and to reference data sets stored within the mainframe’s Adabas database tables. In table 1, in section
2.1, the applications in the “systems for data creation, capture and storage” columns, shaded in yellow are all
applications and data sets on the mainframe. From this table, it’s easy to see the vast reliance on mainframe and
Adabas databases and the fact that, even if data is not generated or processed on the mainframe initially, it is
often stored on it, or vice versa.

Whilst thinking about the data collection, cleansing and ingestion processes, it is important to think about the
processing which occurred to generate the source data as this can greatly affect the collection and ingestion
process. The processing which has already occurred to the data will greatly affect the availability of raw data and
the degree of cleansing that must occur to allow it to be ingested into the data storage layer. Ghasemaghaei and Calic (2019) write in their paper about the importance of high quality, well cleansed data, and highlight the risk of poor quality data, noting that “the cost of poor data quality could be as high as 8% to 12% of a firm's revenue”. Based on this, it is important Xxxxxxx either only generates clean data, or in some way cleanses it before
storing it.

In terms of thinking about the processing which has occurred during the capture and collection layer, when
discussing mainframe systems, the conversation is often linked back to batch data processing vs online transaction
processing (OLTP). Within the Xxxxxxx SCM systems, a combination of both approaches is adopted, whereby
immediate transactions can be created on some applications and the impact seen in real time (e.g. inventory
transactions) but some data requires batch processing (e.g. vehicle and part achievements) due to the size of these
transactions and the amount of records that require updating. Baer (2013) helpfully explains that OLTP isn’t
designed for processing large volumes and varieties of big data in real time, but is designed for locating the most
frequently or recently used data in the most accessible parts of the disk.

This naturally leads the discussion to consider system latency and the importance latency (“how fast a user can get a response after the user sent out a request”, Tian et al, 2015, page 33) has both in general system performance and specifically in the analysis of data. In some cases, e.g. supply risk, a high latency could be deemed acceptable, as batch creation of analytics overnight might be sufficient for reacting to events occurring in other continents; however, in other instances, more real-time data is needed to make quicker, more reactive decisions, so in these instances low latency is required from the big data infrastructure. Therefore, to fully optimise the load balancing of the overall architecture, it is beneficial to be able to support both, based on the specific need of an application or specific set of analytics and the type of action that will occur based on the analytics.

Big data, by definition, has a high velocity, and often some, or indeed much, of the value in using big data can be linked to a company's ability to utilise this high velocity of data, to get analysis and actions out of the data in real time. This can mean the creation of data needs to be in real time (e.g. having real-time vehicle and part achievement will create data at a higher velocity than it would via batch), the ingestion of the data into the appropriate storage location needs to be in real time, and the analysis of the data also needs to be in real time, or, in other terms, streamed. This can in turn allow for real-time responses and actions based on the data.
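
A minimal sketch of what streamed achievement data could look like is shown below, assuming an event stream such as Amazon Kinesis; the stream name and record fields are assumptions made for illustration rather than part of any current Xxxxxxx system.

```python
import json
import boto3

# Assumed stream name and record shape, shown only to illustrate streaming
# achievement records one at a time rather than in a twice-daily batch.
kinesis = boto3.client("kinesis", region_name="eu-west-1")
STREAM_NAME = "vehicle-achievement-stream"

def publish_achievement(vin: str, line: str, achieved_at: str) -> None:
    """Send a single vehicle achievement event as soon as the vehicle is built."""
    record = {"vin": vin, "line": line, "achieved_at": achieved_at}
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=line,  # keeps events from one production line ordered together
    )

# Example call (hypothetical VIN):
# publish_achievement("SJNFAAJ11U1234567", "Line1", "2020-01-15T09:32:00Z")
```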

The discussion has so far looked at the collection and ingestion processing in terms of data sources which are connected to the mainframe. One further point of infrastructure discussion, in relation to figure 2, must be made in terms of data storage. As briefly touched upon in section 2.1, due to the nature of the Adabas database behind the mainframe, data does not need to be normalised, but rather is stored in periodic groups. Whilst this optimises the storage of data, enabling increased performance and access speed to the database, it does mean extraction of data to perform analytics in another environment requires a degree of conversion and requires skilled database analysts to set up extractions as required by business users (Software AG, 2015).

In addition to the main Adabas tables that have traditionally stored much of the Xxxxxxx SCM data, Xxxxxxx SCM also has data storage in non-mainframe-based databases. These are mainly linked to web-based applications which have different processing demands to what can be delivered by the mainframe's batch or OLTP capabilities. The following four data storage structures are currently being utilised to greater or lesser extents -
• Hortonworks HDFS – Hadoop Distributed File System
• AWS
• Oracle Databases
• Renault HUE data lake

When data lakes were first being discussed as bringing businesses value and Xxxxxxx decided to pursue this avenue, a Hortonworks Hadoop-based architecture was introduced, bringing a full stack of Hadoop applications including the HDFS big data storage application. (Full details of this tech stack can be found in appendix 3.) Whilst this technology and architecture was deemed beneficial in initially getting Xxxxxxx to move towards becoming an organisation that was utilising and analysing its data to drive business value, there was limited rollout of the tech stack, with slow ingestion of data into the data lake, due to the legacy system data sources and the complexity of managing a global data project which was often perceived to have low value to business users. As such, due to the global and cross-functional scope of this project, Xxxxxxx Europe SCM data was never ingested into this storage option and no new data sources are currently being ingested into this architecture.

This initial exploration into big data storage did, however, pave the way for what is now possibly the most important area of non-mainframe-based architecture: a move towards AWS cloud for IaaS, with a focus on exporting data into S3 and S3 Glacier to form the foundations of a data lake. Broadly speaking, the concept involves developing new applications on the AWS platform and exporting the data from those directly into an S3 bucket, alongside selected data from other legacy systems into S3 Glacier. This concept is still in its infancy and is a globally led project. Within Europe, Xxxxxxx SCM hosts the first pilot application, being developed in early 2020, that will utilise this technology stack. With this, the full AWS suite of applications (currently approx. 165 applications (AWS, 2019)) becomes available for developing new applications. Section 4.1 talks about other opportunities that become available with AWS and some of the additional elements which Xxxxxxx, and specifically Xxxxxxx Europe's SCM function, could utilise to support their organisational aims and see functional benefits.
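
The sketch below illustrates, in outline, how data might land in such a structure: new application output written directly to an S3 data lake bucket, and selected legacy extracts archived to the Glacier storage class. The bucket names and keys are assumptions for illustration only, not Xxxxxxx's actual naming conventions.

```python
import boto3

s3 = boto3.client("s3", region_name="eu-west-1")

def land_new_application_data(local_path: str, key: str) -> None:
    """New AWS-hosted applications write their output straight to the data lake bucket."""
    s3.upload_file(local_path, "xxxxxxx-scm-data-lake", key)

def archive_legacy_extract(local_path: str, key: str) -> None:
    """Selected extracts from legacy systems are kept cheaply under the Glacier storage class."""
    with open(local_path, "rb") as f:
        s3.put_object(
            Bucket="xxxxxxx-scm-archive",
            Key=key,
            Body=f,
            StorageClass="GLACIER",
        )

# Example call with an illustrative file and key:
# land_new_application_data("parts_achievements.csv", "mrp/2020/01/15/parts_achievements.csv")
```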

The next key non-mainframe data storage infrastructure in use within Xxxxxxx SCM is Oracle databases. Due to the demand for applications which provide more computational power, the storage of unstructured data and more intuitive user interfaces, many online applications have been developed over the last 15 years which utilise Oracle databases for backend storage and processing - this is evidenced in table 1 in section 2.1. Whilst this does provide increased processing of data outside the batch mainframe processing, there is still an issue whereby much of this data is distributed across individual Oracle databases, with little or no relation between them. As stand-alone applications, therefore, they work quickly and perform the tasks they are designed to do well. Where these applications fall short is in being able to work with the data from other applications and data sets to enable deeper analytics and drive further value. Further, many of these applications rely on webservices and WSDLs to bring data in from the mainframe Adabas data tables.

Finally, whilst both are still individual companies, due to a strategic alliance with Renault started in 1999, Xxxxxxx shares many functions with Renault and seeks synergies where possible. Whilst there is an Alliance Information Technology (IT) department that aims to create synergies in IT and Information Systems (IS), there are still individual Renault and Xxxxxxx IT and IS functions. Where possible, and when it makes business sense, instead of both companies developing an IS system, a common system will be developed. Within SCM, there have been two recent examples of this: the ADCP system (Alliance Demand Capacity Planning system) and the APPV system (Alliance Parts Pipeline Visibility system). Despite being shared applications, they are hosted on the Renault data layer, which is currently hosted on HUE but is planned to be migrated to Google Cloud Platform (GCP) during 2020. As a result of being hosted in this environment, there is direct output of data into the Renault data lake, which has various interfaces with analytics tools, mainly Apache Zeppelin, Jupyter and Spotfire. (See appendix 4 for discussion.)

Whilst considering the current infrastructure, it is important to remember there is a vast array of systems architecture available to businesses today. More cloud-based servers are becoming available, and each brings with it a technology stack with common benefits across infrastructure types, but also unique benefits. Before moving on to discuss recommendations for potential improvements, it is important at this stage to critique further the current infrastructure laid out above, to understand where the current weaknesses are and where the greatest opportunities to improve lie. To do this, the SWOT analysis in Table 2 explores the current strengths and weaknesses, alongside the opportunities and the threats that the current infrastructure landscape holds. Individual SWOTs for the technology of each key element of architecture are contained within appendix 5.

Table 2, SWOT Analysis on the Xxxxxxx SCM Data Storage Infrastructure and Technology

Strengths
• The mainframe system is highly reliable
• The mainframe can support large-scale transaction processing
• The mainframe can support many users at the same time without performance issues
• The mainframe can manage both terabytes of data storage and high-bandwidth user access, making it beneficial for big data storage
• High performance speed in writing data to the Adabas DB (1 million+ commands per second)
• User-friendly interfaces can be created which utilise data stored in the mainframe alongside Oracle database processing capabilities
• The mainframe has very high security levels, meaning data is very secure

Weaknesses
• No single solution for data storage – some data is within the Adabas DB linked to the mainframe, some is within multiple Oracle DBs – making data collection and access for analytics longer, slower and disrupted
• Low levels of data democracy, meaning getting access to other functional and global data is difficult and bureaucratic
• As there is no central storage, much local analytics and data storage takes place, meaning the same data is being stored in multiple locations across the organisation
• Functional strategies, training and approaches to big data and analytics are not centralised
• Current infrastructure is unsuitable for real-time streaming, limiting the velocity of supportable data

Opportunities
• Centralise storage of data – reduce the total amount of data being stored
• Introduce data cleansing policies – reduce duplicated stored data
• Cost saving in not having to maintain and upgrade servers if cloud storage of all data can be adopted
• Opportunity to reduce processing power and pay for only what is needed if IaaS is more widely adopted
• Opportunity to empower business users to become more capable in data analytics
• Opportunity to incorporate more unstructured data and external data into analytics if the infrastructure supports its storage
• Opportunity to increase the velocity of current data metrics if the infrastructure can support real-time streaming

Threats
• As data storage and analytics are often done locally within functions and teams, there is a threat to the business as full system support is not available for any locally developed systems and databases
• Potential security or network failures – cloud-based storage has built-in backup and globally distributed storage for security and safety
• Locally stored data is unlikely to be as well protected; this poses a threat of data leaks to the organisation – potentially violating GDPR regulations, but also potentially leaking business intelligence to competitors
• The organisation misses market opportunities by not having capable applications to analyse the market and react – lack of ability to run predictive analytics

Finally, when discussing the infrastructure for big data storage, discussion must be had around the security of the stored data. This conversation needs to happen to ensure that the security of the infrastructure is sufficient to meet both ethical and legal compliance requirements. In addition to protecting against legal and ethical issues, good data security can also help protect the company's added value which is derived from the data, and protect it from being stolen and used by a competitor. This means protecting data against outside threats through sufficient firewalls, but in the Renault-Xxxxxxx case, it also means protecting access to individual company data which is stored in the common Renault data layer. This is managed via user IDs, with system administrators having to authenticate and grant access to each system.

Building upon older regulations, GDPR came into effect in May 2018 (GDPR.org, 2019), with the aim to “Protect and empower all EU citizens data privacy”. It is the world's most extensive set of data compliance regulations and has been put in place primarily to protect the data which companies hold on individuals, and to ensure this data is being used for ethical purposes (GDPR.org, 2019).

Whilst the wider Xxxxxxx organisation holds much information on the customer base, on a local level within the Xxxxxxx SCM function, consideration must be given to ensuring that the data stored is compliant with GDPR regulations. The Xxxxxxx Compliance Team has oversight of all data being stored and ensures full compliance is adhered to. As discussed in section 2.1, the data that Xxxxxxx SCM holds is largely production management related data, so is therefore predominantly related to vehicle specifications, build plans, achievement rates, component orders, supplier data and so forth. No personal customer data reaches the SCM function; this is managed in the sales systems outside of SCM. There is, however, personal data recorded within the SAIS application, which stores names, emails and phone numbers of vendor contacts for the >750 registered vendors. This data is supplied and uploaded into the system by vendors, so the right to share this data is obtained from the vendors before data is uploaded. It is openly accessible for the vendor to return and remove any personal data that is no longer valid, or should no longer be shared, and security protocols are in place which ensure access to this data is restricted within Xxxxxxx to those with business justification to access it. This data is maintained purely for emergency contact information and to auto-route concerns back to supplier contacts through the portal. There is no value in using this data for further analysis and therefore it does not leave the portal.

4.1 Proposals for Enhancements
After analysing the current data metrics and the infrastructure around big data within Xxxxxxx, specifically within
the European SCM function, this report will now look at proposals to overcome some of the current issues and
limitations.

Figure 3 below shows a proposal for how big data infrastructure could look within Xxxxxxx. There have been many big data infrastructures proposed by industry experts and academics, all of which largely highlight the same core elements; however, many of the different proposals have unique characteristics, normally in terms of the routing of data into and through the infrastructure. This model has been chosen based on its holistic coverage and a flow that is understandable by both business and IT specialists. The infrastructure shown below demonstrates that data must pass through several layers from initial data sources to the visualisation layer. It is the output of this process (within the visualisation layer, like a Spotfire report) that is often seen as adding value to the organisation, but it is important to remember each layer plays a part in capturing, preparing and analysing the data to allow the visualised output. For this reason, it is important to invest in each layer and have a well-defined infrastructure, to enable the most effective output.

Figure 3, Big Data Infrastructure within Xxxxxxx SCM

Adapted from Gill, 2017

Based on the analysis of the current data and infrastructure in the previous sections and the above proposed big
data infrastructure, the following core issues have been identified and proposals are discussed in how each of
them can be improved.
• Lack of strategy for big data → either centrally or locally within functions
• Lack of central storage and wider data accessibility issues internally for analytics → in part due to lack of
strategy for big data
• Distributed architecture of systems that generate data across mainframe and Oracle databases → server
capacity and inconsistent archiving processing means extraction and consistent analysis across systems
can be difficult
• Lack of user skills in terms of data storage and usage among business users
• Velocity of data is not sufficient for all data metrics to be as effective as they could be → real time
streaming to increase velocity

First and foremost, the discussion about enhancing and improving the organisation's approach to big data must begin with strategy: as big data expert Marr (2019) comments, “Instead of starting with the data itself, every business should start with strategy”. Before an organisation can look at utilising big data properly for analytics, it must first have a business strategy for its approach to data and how data is stored, managed and accessed across the organisation. Marr's SMART strategy board (discussed in section 2.1 and appendix 2) helps firms understand their data and the business use case for it, before establishing and defining the strategy for it. Whilst considering the strategy for big data, it is important to remember the idea raised in both sections 2.1 and 3.1, that data is often not exclusive to individual functions. This report has tried to address the different functional data metrics in terms of how they supported the function's strategy and therefore, in turn, the overall organisational strategy. This recommendation goes further in expecting not just the organisation's core strategy to be clear, but also the overarching strategy for how data is stored, analysed and processed to be aligned across functions. Currently, different functions and divisions within the global Xxxxxxx organisation have their own data lakes, systems and architecture, which makes cross-functional access to data difficult. The author's recommendation is that the data strategy should be clearly defined, with common architecture available across functions and regional divisions, saving both time and costs. A business intelligence (BI) leader from each function should come together to form a steering committee (Steerco), meeting regularly with the IT function to ensure alignment on data storage and data analytics across the wider Xxxxxxx Europe organisation. Another key task for this group would be to work closely with the senior management team to improve the veracity of data, by improving management perception and trust in the data, as highlighted as an area for improvement in section 2.1.

Following this clear strategy setting, the newly formed cross functional BI Steerco should review and consider
options to implement improvements in the next biggest area that can be enhanced - the data ingestion and
storage layers of the model presented in figure 3 by building a better centralised data storage layer. Often the
value of analytics, and therefore the value of data, comes from being able to process different data and types of
data together. This means cross functional sharing or access to data, alongside analytical tools which can process
different types of structured and unstructured data. Storey and Song (2017, page 54) argue that “Traditional
relational database management systems (RDMS) are simply not capable of handling big data. The data is too big,
too fast, and too diverse to store and manipulate.” In Xxxxxxx SCM’s case, storage of most of the data on the
mainframe database is possible at current volume, but the limited ability to manipulate and store unstructured
data, and the restricted ability to analyse it alongside external data, demonstrate the mainframe's limitations. The
author believes that the most appropriate way to build a better data storage layer is to move away from Adabas
tables and Oracle databases and establish an AWS S3 data lake whereby all data can be stored centrally and
accessed by functional data analysts to do data mining and analysis.
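To illustrate what a first step towards such a storage layer might look like, the following minimal sketch (Python with the boto3 AWS SDK) uploads a daily batch extract from an existing system into an S3 bucket, partitioned by source system and date so that analysts and query tools can locate it. The bucket name, prefix layout and file names are illustrative assumptions rather than existing Xxxxxxx resources.

```python
import datetime

import boto3  # AWS SDK for Python

# Illustrative names only -- the bucket and prefix layout are assumptions,
# not existing Xxxxxxx resources.
BUCKET = "xxxxxxx-scm-data-lake"
SOURCE_SYSTEM = "adabas-parts-orders"


def upload_daily_extract(local_file: str, extract_date: datetime.date) -> str:
    """Upload one batch extract into the data lake, partitioned by source and date."""
    s3 = boto3.client("s3")
    key = (
        f"raw/{SOURCE_SYSTEM}/"
        f"year={extract_date:%Y}/month={extract_date:%m}/day={extract_date:%d}/"
        f"extract.csv"
    )
    # Server-side encryption keeps the data protected at rest in the central lake.
    s3.upload_file(local_file, BUCKET, key, ExtraArgs={"ServerSideEncryption": "AES256"})
    return f"s3://{BUCKET}/{key}"


if __name__ == "__main__":
    print(upload_daily_extract("parts_orders_20200131.csv", datetime.date(2020, 1, 31)))
```

Once extracts land in a consistent raw layout such as this, catalogue and query services (for example AWS Glue and Athena) can work over them in place, which is what gives the central storage layer its analytical value.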

Whilst an S3 data lake could be introduced by itself, with ingestion from the current mainframe and Oracle based
applications, further significant benefit could be achieved by migrating the full tech stack away from its current
distributed mainframe and Oracle architecture. This forms the third key recommendation of this report: moving to
AWS for IaaS and migrating applications onto the new AWS infrastructure. As previously mentioned, new applications
are already being developed on AWS infrastructure, but this recommendation goes further, proposing a migration of
current applications onto the AWS infrastructure. Whilst this would potentially carry a high upfront investment
cost to facilitate and manage the migration, as Farr and de Valence (2018) suggest, significant cost savings can be
achieved by running applications on the cloud rather than on mainframes: the investment in renewing and servicing
the physical servers of a mainframe is reduced, and capacity limitations are eased by utilising cloud-based elastic
computing such as Amazon EC2 or Lambda, which Xxxxxxx could begin to utilise through the introduction of the AWS
tech stack. A detailed proposal for this is presented in appendix 6.

In addition to the areas already discussed, once the S3 data lake is established and populated with current data
(something that can be facilitated easily by AWS Lake Formation, which moves current data into the new S3 data
lake), the author would also suggest that it is worth investing resources into data discovery. As Marr (2015, page
26) notes, “Data discovery is a process of looking at data from the other direction”. Ingesting the data which has
been built up and stored on the mainframe into the data lake will provide a wealth of material which can then be
used across many of the other applications, for data analytics, and to begin training machine learning algorithms.
Because Xxxxxxx SCM has a well-established operational strategy, and much of the data generated exists as a direct
result of this strategy and to fulfil specific elements of it, finding the time (and capable resource) to look at
the data without any preset objective may enable new discoveries to be made.
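As a sketch of how that discovery work could be bootstrapped, the snippet below (boto3 again, with assumed names) uses an AWS Glue crawler to catalogue whatever has been ingested into the raw area of the data lake, so that analysts can browse the resulting tables without knowing the source systems in advance. The crawler name, IAM role, database and bucket path are hypothetical.

```python
import boto3

glue = boto3.client("glue")

# All names below are assumptions for illustration only.
CRAWLER = "scm-raw-discovery"
DATABASE = "scm_raw"
ROLE_ARN = "arn:aws:iam::123456789012:role/GlueDataLakeCrawler"


def catalogue_raw_zone() -> None:
    """Create (or reuse) a crawler over the raw zone of the lake and run it."""
    try:
        glue.create_crawler(
            Name=CRAWLER,
            Role=ROLE_ARN,
            DatabaseName=DATABASE,
            Targets={"S3Targets": [{"Path": "s3://xxxxxxx-scm-data-lake/raw/"}]},
            # Re-crawl nightly so newly ingested mainframe extracts appear in
            # the catalogue without manual effort.
            Schedule="cron(0 3 * * ? *)",
        )
    except glue.exceptions.AlreadyExistsException:
        pass  # crawler already defined on a previous run
    glue.start_crawler(Name=CRAWLER)


if __name__ == "__main__":
    catalogue_raw_zone()
```

The tables the crawler produces can then be queried ad hoc or fed into machine learning experiments, which is the "looking at the data from the other direction" that Marr describes.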

The next recommendation builds upon this, suggesting that once accessible data is in place, it is important to have
both sufficiently skilled personnel and data democracy within the organisation. For Xxxxxxx to get true value from
its data, it must recruit the right people and continue to invest in their development. It is often taken for
granted that people with the right skill set will be available to manage the data. The skill
sets required are continually evolving and therefore continued development of, and investment in, people must be
part of the overarching data strategy. As Harris (2012) discusses, organisations need to think about the skill sets
that their business users have, not just the IT specialists. With easier access to data and better tools, employees
must not simply continue to report on the data; they must instead possess the correct analytical and statistical
skills, as well as the drive and motivation to question the data, and the workload capacity to have time to explore
it. Harris goes on to summarise that getting value out of big data is “as much about fostering a data-driven
mindset and analytical culture as it is about adopting new technology.” Specifically, within the SCM function, to
get true value out of the data, even if it is more accessible by being stored in a more optimal way, it is important to
have experienced business professionals, who understand the data but also have an analytical skill set to probe the
data, ask questions of its meaning and explore it fully. Currently within Xxxxxxx SCM, pockets of these skill sets can
be seen, but further emphasis must be put on these skills in recruitment, and the right organisation structure and
culture must be provided to enable these individuals to question and explore the data as part of their functional
tasks. The other important side of this is data democracy. This is the “idea that if each employee, with full
awareness, can have easy access to as much data as possible, the company as a whole will reap the benefits”
(Bodet et al, 2019, page 4). The theory is that providing employees with as much data as possible will encourage
data-driven decisions at all levels within the organisation and broader analytics than if data were confined to
a BI team. The author recommends that the cross-functional BI Steerco team act as champions of this process,
supporting continued development of all employees' analytical skills and helping citizen data scientists to emerge
across the organisation.

The final recommendation within the report relates to utilising AWS as IaaS to increase the velocity of some
of the data metrics. With data in a central data lake and data-generating applications within the AWS
environment, cross-system and cross-functional data sharing for analytics becomes easier. This change in
infrastructure will in turn enable a higher velocity of data to flow from data collection through the ingestion and
analytics layers of the infrastructure.

In general, there are three distinct ways that data from mainframe databases can be transferred to the AWS
cloud for consumption by users, analytics tools and other applications:

• Batch File transfer (largely in use within Xxxxxxx already to transfer data between systems)
• Database queries – either direct or through APIs/webservices (largely in use within Xxxxxxx already to
transfer data between systems)
• Real-time streaming

Batch file transfers present a limitation to data access, as they often result in snapshots of data being taken
rather than real-time streams of data. An alternative would be direct database query transfer of data, or using
APIs to call data back; however, as de Valence (2019) notes, “The database queries also increase the expensive
mainframe Millions of Instructions Per Second (MIPS) consumption.” Therefore, another area where there is an
opportunity for value to be added to the current data infrastructure at Xxxxxxx is through real time streaming
leading to real time analytics.

As Corallo et al (2018, page 730) illustrate, real time analytics “promises significant performance improvements
along the entire supply chain”. By utilising this technology, the velocity of data will increase as data moves from
being available periodically after batch processes to being made available continuously through live streams. As
highlighted in section 2.1, one example of this is parts availability in production, allowing the supply chain to
react more quickly and pull more stock in if the production facility is overachieving, or to delay delivery
of goods if there has been a delay in production. This helps keep lineside stock at the lowest optimal JIT stock
levels, supporting the functional role of minimising costs. To begin to do this, the recommendation is to utilise
AWS Kinesis, and build data streams to capture data from both the customer side in terms of changing requests
and demands and the manufacturing/SCM functions in terms of vehicle and part achievements. Using these data
streams, various optimisation tools can then be built that can unlock significant benefits in terms of optimising
build plans and improving JIT operations.
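A minimal sketch of the producing side of such a stream is shown below, assuming a Kinesis data stream (the stream name and event fields are hypothetical) into which the production system writes an event each time a part achievement is recorded at a tracking point; downstream consumers, such as a JIT stock optimiser, would read from the same stream.

```python
import datetime
import json

import boto3

kinesis = boto3.client("kinesis")

# Hypothetical stream name and event shape, for illustration only.
STREAM_NAME = "scm-part-achievements"


def publish_part_achievement(part_number: str, line_id: str, quantity: int) -> None:
    """Publish one lineside achievement event to the stream in near real time."""
    event = {
        "part_number": part_number,
        "line_id": line_id,
        "quantity": quantity,
        "achieved_at": datetime.datetime.utcnow().isoformat() + "Z",
    }
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(event).encode("utf-8"),
        # Partitioning by line keeps events for one production line in order.
        PartitionKey=line_id,
    )


if __name__ == "__main__":
    publish_part_achievement(part_number="54010-XX123", line_id="line-1", quantity=24)
```

On the consuming side, a Lambda function or streaming analytics application subscribed to the stream could maintain rolling lineside stock figures and trigger pull or delay signals when achievement deviates from plan.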

It must be acknowledged that implementing real-time visibility is a complex migration to execute, requiring careful
planning and involving both experts in the current mainframe applications and those with the skill set to add
streaming to the cloud. As a first step, the author of this report proposes the use of a replication tool, Attunity,
to replicate the mainframe data within the cloud. This allows the mainframe systems to run without a large-scale
code re-write, while exposing the data to the S3 data lake. This will accelerate data analytics with low latency
whilst reducing the cost of MIPS (Qlik, 2019).

Real-time streaming comes as a secondary step, once AWS has already been implemented. For practical reasons
(cost and resources) it is unrealistic to migrate the full current systems architecture to AWS in a single step.
Therefore, consideration of which data metrics would benefit most from increased velocity via real-time streaming
should be used to prioritise which systems to migrate in the earlier stages of the AWS migration.

Considering these recommendations, and indeed any business changes, it is important to reflect on whether any
legal or ethical implications arise that were not previously a consideration; if so, they need to be provisioned
for and managed. In this instance, when talking about the SCM data, the author believes no new legal or ethical
implications are created as a result of the proposals. Whilst the proposals change where and how data is stored,
and increase the velocity and volume of some current data metrics, there is no proposal to add any additional data
metrics. One point worth noting, however, is that if the same S3 data lake is to be used across the full
organisation as a central data lake, as proposed, consideration will be needed as to how to sufficiently remove any
personal or customer data that other functions wish to store, or how to partition access to it, so as to avoid any
potential for it to be misused or inappropriately accessed and cause the organisation to breach GDPR standards.
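One way the partitioning mentioned above could be expressed, sketched here under assumed names, is a bucket policy that denies read access to a personal-data prefix for every principal except an approved role. This is illustrative only; real GDPR controls would also need encryption, retention and audit measures alongside it.

```python
import json

import boto3

# Assumed resources for illustration only.
BUCKET = "xxxxxxx-europe-data-lake"
APPROVED_ROLE = "arn:aws:iam::123456789012:role/gdpr-approved-analyst"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyPersonalDataExceptApprovedRole",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/personal/*",
            # Everyone except the approved analyst role is denied access
            # to the personal-data prefix of the shared lake.
            "Condition": {"ArnNotEquals": {"aws:PrincipalArn": APPROVED_ROLE}},
        }
    ],
}

# Attach the policy to the bucket; an explicit Deny overrides any Allow elsewhere.
boto3.client("s3").put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```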

5.1 Discussion and Conclusion
Whilst a number of recommendations have been put forward in section 4.1, in practical terms it would be
unrealistic to implement all of them straight away. This is due both to monetary limitations, as many of these
recommendations represent high capital and operational expenditure for the organisation, and to organisational
readiness to adopt them. Some steps will be easier to implement in the shorter term as “quick wins” whilst others
will need more foundational work to occur before they can be implemented to achieve the optimum benefit. Whilst
the author believes the items in section 4.1 have largely been presented in a logical order of delivery, Figure 4
sets out a roadmap suggesting how these ideas can be delivered to the organisation, with proposed timescales showing
when this work can realistically take place in line with the organisation's current digital portfolio and budget
limitations. The recommendation that can most obviously be worked upon straight away is the formation of a BI
Steerco group, bringing together the key personnel cross-functionally to begin aligning and developing a clear
strategy. Whilst it is the cheapest item to implement, it is also a foundational one that will shape and guide all
other items on the roadmap.

Figure 4, Roadmap for Delivery

[Roadmap chart spanning FY20 H1 to FY23: formation of the BI Steerco and investment in training (from FY20 H1), followed by full IaaS migration, leading to big data utilisation within the organisation.]

Speaking about SCM as a discipline, Biswas and Sen (2016, page 8) claim that “Exploiting the rich capabilities of
analytics, organizations can reap the benefits of big data-driven insights to work with optimal lead time and
improve prediction of future to cope up with uncertainties.” This shows that big data, and the various
recommendations set out above, can be seen as a delivery pillar for the functional goals and strategy of the
Xxxxxxx Europe SCM function, as discussed in section 1.1. They further note that “every object linked
with a supply chain is now acting as a continuous generator of data in structured or unstructured form”, and thus
begin to build the case for big data driven supply chain analytics. The key point in this work is that by effectively
implementing big data and analytics, supply chains can benefit from both operational efficiency improvements and
prediction of future events.

In their invaluable piece of work, they go on to explain how “SCM has witnessed a paradigm shift in its focus from
cost reduction and serviceability to cost effectiveness, reliability and predictability of the future” (2016, page
9). It is from this narrative that many automotive makers are trying to build upon their lean supply chains and
drive more cost effectiveness for their organisations, not simply by trying to squeeze out small operational
efficiency gains, but by making strides in predicting and preventing the impact of the large scale, infrequent but
costly risk events which can affect a supply chain. One such event that proved costly to Xxxxxxx and the global
automotive industry was the 2011 earthquake and tsunami, which had both a deadly impact in Japan and a costly
impact across the global automotive supply chain: Xxxxxxx attributed a 10.4 percent decrease in profit to it, and
Peugeot took a 300 million Euro hit to profits in Europe for the second half of 2011, showing the global impact of
such an event (Reuters, 2011). The automotive industry as a whole is starting to move further into big data and
big data analytics to better predict the future. This can be clearly seen in North America through the emergence
of the data sharing alliance “Surgere”, in which nine of the large automotive companies and their main suppliers
are coming together with the aim of sharing resource and knowledge in the pursuit of big data to predict risk
(Henderson, 2018).

In order to remain competitive, utilising big data is key in all industries and organisations today. For Xxxxxxx,
the desire is to harness as much information from the data as possible, to position itself competitively against
other automotive manufacturers and indeed, in an ever-changing market, against mobility providers more generally as
the industry is disrupted by the likes of Uber, Lyft and Alphabet. The desire is to add value to operations by
understanding data, using it for future prediction, operational optimisation and risk reduction through advanced,
predictive analytics. In well-established industries, disrupters are challenging the status quo of big players,
forcing organisations to continually adapt and develop to stay ahead and stay relevant. In incumbent,
long-established organisations like Xxxxxxx, which have built a complex organisation and culture without big data
over the last century, there needs to be a degree of re-engineering of processes, systems and potentially even
organisational structures to allow these analytics to take place, all of which has been touched upon in this
report.

As this report has discussed in detail, Xxxxxxx’s SCM function has a vast array (variety) and volume of well stored,
structured data and is in early stages of adding to this with an increasing volume of structured and unstructured
data, from a variety of internal and external sources. This report explored proposals for enhancing how the data is
stored, primarily through the recommendation of moving to AWS S3 as a provider of a data lake due to its
scalability and its on-demand pricing strategy. Further justification comes from research published by Aberdeen
Group (Lock, 2017) who found that 43% of companies who had implemented and effectively used a data lake had
seen operational efficiencies compared to similar organisations in their industries. What’s more, they had, on
average, 9% better operating profits than similar organisations in their industry.

This report also looked at how the velocity of data may be increased to add greater value going forward through
real-time streaming of some data metrics. For certain metrics, this is perceived as having significant value
as it can reduce the reaction time to concerns and help support the lean manufacturing disciplines of JIT.
Importantly, in all the recommendations, the foundation is having a strong strategy, governed and owned by a
cross-functional BI Steerco team, championing big data and analytics best practice and keeping the organisation
aligned across functions.

To conclude, it is clear to see that Xxxxxxx has a vast array of data currently being collected, some of which has big
data characteristics and some of which does not. Whilst there are some systems and data infrastructure in place,
largely capable of handling the current data, this report has suggested ways to develop the infrastructure and
enhance the data metrics, to better support big data storage and to better facilitate data analytics. The author
believes Kusiak’s comments to be very true in the Xxxxxxx SCM context when he says, “Most companies do not
know what to do with the data they have, let alone how to interpret them to improve their processes and
products.” (2017, page 23). There are clear opportunities to improve and move towards being a more data driven
organisation. The key benefit of embracing better big data storage within Xxxxxxx’s SCM is that it will, as McKinsey
(2011) begins to suggest, allow for replacing, or at least supporting, human decision making with automated
algorithms based on the vast array of data, too complex for a human to process and calculate without
sophisticated algorithmic support. As highlighted by Biswas and Sen, using BI, Xxxxxxx can begin to “build an
intelligent SCM system which is capable of analyzing what-if scenarios and take smart decision. Hence, supply chain
analytics is of paramount importance for enhancing dynamic capabilities of supply chains.” (2016, page 14)

Appendix 1
As discussed in section 2.1, there are broadly “6 V's” associated with big data. The following is a brief explanation
of how each of these has been defined and interpreted within the context of this report:

Volume – This relates to the amount of data generated, in terms of the quantity of records and the size of those records.
Marr (2014) tries to demonstrate the sheer volume of data available today: “On Facebook alone we send 10 billion
messages per day, click the ‘like’ button 4.5 billion times and upload 350 million new pictures each and every day.”
Each type of data record has a volume associated with it, made up of both the individual record size and the
quantity of records – this in turn gives the full volume of data being stored.

Variety – Variety relates to the different types of data that are being stored. With developments in analytical tools,
we are able to analyse both structured and unstructured data sets together to gain the highest value (Marr, 2014).
This means we are able to process records held in traditional databases alongside video content, social media
comments, message content as well as live feeds from semi-structured sources like weather channels, volcanic
trackers and real time processors within production areas.

Velocity – Erl (2015, page 34) explains that “From an enterprise’s point of view, the velocity of data translates into
the amount of time it takes for the data to be processed once it enters the enterprise’s perimeter. Coping with the
fast inflow of data requires the enterprise to design highly elastic and available data processing solutions and
corresponding data storage capabilities.” The velocity of data generation is becoming faster and faster, and
therefore there is a growing requirement for businesses to move from traditional batch processing of data to high
velocity processing in real time as data becomes available. With a higher velocity of data generation, ingestion
and processing, there is also a demand for a higher velocity of data analysis.

Veracity – This characteristic “refers to the messiness or trustworthiness of the data.” (Marr, 2014). Whilst many
structured data sources have a high level of validity built into them, this may not hold true for less structured
sources which lack validation – for example social media data does not have validation to confirm its accuracy. In
such instances as social media, where validation is low but volume is high, this can, at times, provide a different
form of validation (through mass confirmation of similar inputs) allowing veracity to remain in the data (Marr,
2014).

Value – All data has a value, and that value will be different to different organisations. The same set of data will be
worth more to one company than another. The value a company perceives in a data set is partly related to its
market, but also partly related to its ability as an organisation to take and use that data within its analytical
tools and its ability, and willingness, to implement decisions based on the data (Erl, 2015; Marr, 2014).

Variability – This characteristic brings in a number of factors. “One is the number of inconsistencies in the data”
(Firican, 2017), which is something more prevalent in certain types of data set (e.g. social media, unvalidated
content) than in others. It is also important to note that “Variability can also refer to the inconsistent speed at
which big data is loaded into your database” (Firican, 2017). There can be times when records are refreshed
multiple times a minute, followed by large, inconsistent gaps of hours before another record is generated. Any big
data analytics tool must be able to cope with this inconsistent supply of data.

Appendix 2

Section 2.1 discusses using Marr's SMART strategy board to help understand and define the organisation's
strategy. Below, a strategy board has been completed at a Xxxxxxx Europe SCM functional level to demonstrate
the internal functional strategy. Whilst the function's overarching purpose is to support the production function
and minimise cost, in this instance the author has taken this further, stating that the strategy is to do this by
utilising big data and data analytics.

Appendix 3

As discussed in section 3.1, a full stack of Hortonworks Hadoop based big data architecture was introduced to
Xxxxxxx. The below summarises the main applications within the stack.

Appendix 4
There is a vast amount of data currently being stored within the Renault Data Lake. The below schematic shows
the data currently being ingested into it. As can be seen, the source is predominantly Renault, but it also
includes some Xxxxxxx data from the applications mentioned in section 3.1. It currently contains approximately
24 million files loaded onto the platform from both the Renault and Xxxxxxx organisations, and on average 10,000
Spark jobs are run every day on the platform. From a Xxxxxxx SCM perspective, it is clear to see some of the
benefits which can be achieved by having this array and volume of data in the same place to allow more in-depth
analysis to take place. Whilst Xxxxxxx SCM is only using the data lake to access information for the two shared
applications currently in this cloud data lake, it is clear to see the additional value which could start to be
unlocked should all data go either to this data lake or to a similar Xxxxxxx data lake. By having a wider, more
complete picture, with data supplementary to what is collected internally within Xxxxxxx, a better understanding
and analysis of the capacity situation within suppliers can be gained and risks better managed. By utilising a
common data lake going forward, there are further opportunities to harness benefits from the data that would
otherwise not be available, or that would need integration and cleansing work if the data were stored separately
and analysed together at a later point.
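To make the kind of cross-organisation analysis described above concrete, the following PySpark sketch joins a Xxxxxxx SCM parts-demand extract with Renault supplier-capacity data held in the same lake; it is the sort of job that could run among the roughly 10,000 daily Spark jobs already executed on the platform. The table locations and column names are assumptions for illustration, not real lake paths.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("supplier-capacity-risk").getOrCreate()

# Paths and column names are illustrative assumptions, not real lake locations.
demand = spark.read.parquet("s3://shared-data-lake/xxxxxxx/scm/parts_demand/")
capacity = spark.read.parquet("s3://shared-data-lake/renault/supplier_capacity/")

# Flag suppliers whose declared weekly capacity falls below combined alliance demand.
risk = (
    demand.groupBy("supplier_code", "week")
    .agg(F.sum("required_qty").alias("alliance_demand"))
    .join(capacity, on=["supplier_code", "week"], how="inner")
    .withColumn("shortfall", F.col("alliance_demand") - F.col("weekly_capacity"))
    .filter(F.col("shortfall") > 0)
)

# Write the flagged supplier risks back to the lake for downstream reporting.
risk.write.mode("overwrite").parquet("s3://shared-data-lake/analytics/supplier_risk/")
```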

Appendix 5
Due to the nature of Xxxxxxx’s distributed and varied systems for data capture, creation and storage, multiple
individual SWOT analyses have been carried out, one for each type of infrastructure. The SWOT analysis in the main
body of the report looks collectively at the strengths, weaknesses, opportunities and threats of the current
infrastructure landscape.

SWOT 1 – Mainframe with Adabas DB

Strengths
• Mainframe highly regarded for reliability, availability, and serviceability (IBM, 2014)
• Mainframe computers have extensive capabilities to simultaneously share, but protect, a firm's data among multiple users (IBM, 2014)
• Capable of supporting large-scale transaction processing (thousands of transactions per second) (IBM, 2014)
• Can support a vast multitude (thousands) of users and application programs concurrently accessing numerous resources (IBM, 2014)
• Able to manage both terabytes of data storage and high bandwidth user access, making it beneficial for big data storage and processing (IBM, 2014)
• High performance speed in writing data to Adabas DB, 1 million+ commands per second (Software AG, 2015)
• Data privacy and encryption through Data Privacy Passports mean mainframes are a very secure location to store data (Dignan, 2019)
• Storing data to Adabas DB removes the need for normalisation (Software AG, 2015)

Weaknesses
• High cost to maintain and service physical, onsite servers and need to hold physical back-up servers in an additional location → high cost (Kra, 2018)
• Unable to store unstructured data
• No single solution for data storage – some data is within Adabas DB linked to the mainframe whilst other data created by other technologies (Oracle DBs or unstructured sources) is hard/impossible to ingest into the mainframe
• Difficult to increase data democracy due to the structure and file access style of mainframes
• Ingestion and processing limitations prohibit the ability to widely introduce real time streaming
• Difficult to find capable staff with sufficient skill level

Opportunities
• By utilising the mainframe, Xxxxxxx is able to process lots of transactional records very fast
• Easy to leverage the current data that is stored within the mainframe Adabas data tables
• Opportunity to keep data safe (no mainframe system has ever been hacked by an outsider (Elliot, 2016))

Threats
• Due to the nature of mainframes, unstructured data cannot be ingested; there is a threat to the organisation of missing insights from this data if it is not analysed and processed in a different way
• Mainframes are not capable of running advanced and predictive analytics, so there is a threat of missing these insights also
• By maintaining a separate/additional system to process unstructured data, there is a threat of duplication of systems effort and cost
• The cost of maintaining mainframe servers could become prohibitive and prevent investment elsewhere

SWOT 2 – Oracle Databases

Strengths
• Stores data in a relational manner, meaning data is easily visualised by users (Arteaga, 2019)
• Ability to group several transactions into the same batch for processing sets Oracle apart from its competitors (Sullivan, 2019)
• High scalability (Sullivan, 2019)
• Capable of storing unstructured data (Oracle, 2016)
• High scalability, protection and performance for activities aimed at business productivity (Arteaga, 2019)
• Standardisation: allows standardisation between different implementations of SQL (Arteaga, 2019), meaning recruitment of skilled workers is easier
• High versatility – database can be run via any operating system (Sullivan, 2019)

Weaknesses
• Cost – Oracle licences tend to be expensive compared to other SQL databases (Sullivan, 2019)
• Not capable of providing the same speed and serviceability levels for the volume of transactions required by Xxxxxxx
• More complex than some other SQL databases (Sullivan, 2019)

Opportunities
• Able to scale performance more easily by adding additional servers (overall lower cost than scaling the mainframe) (Sullivan, 2019)
• Able to create more user-friendly systems, which provide more analytics than mainframe-based systems
• Able to export data into AWS S3 buckets to make data created through Oracle databases available for analysis with other data
• Opportunity to access a wider resource pool, as there is more available, trained resource capable of utilising and developing Oracle DBs

Threats
• As Oracle is not capable of replacing the mainframe, maintaining both means no central storage location for all data, meaning potential duplication in data storage, processing and costs
• Missing the opportunity to migrate all data processing, analytics and data storage to a single location – a cost threat as well as a data integrity threat
• Security threat – easier to hack than mainframe applications

Appendix 6
As discussed in section 4.1, there are various elastic load balancing opportunities which can be utilised on the AWS
infrastructure. The nature of the Xxxxxxx SCM operations means at certain times of the day and periods in the
week, as well as end of month processing, more work is carried out on the server. By utilising elastic load
balancing within AWS, on demand pricing will reflect this usage rather than maintaining server capacity to support
peaks in demand, as well as potentially facilitating a change to when and how data can be processed, which could
in some instances bring increased business value. Despite these operational IT benefits and potential cost savings,
the most significant benefit is in terms of making data available in the data layer to take forward into the
processing layer and analytics engine for data analytics.

This appendix discusses in more detail the options that were examined and why EC2 was recommended.

AWS (2019) define elastic load balancing as something which “distributes incoming application or network
traffic across multiple targets”. There are multiple approaches to how this can be technically structured within a
systems architecture, and AWS provide multiple tools within their suite to support elastic load balancing. The ones
considered by this report were:
• Lambda
• EC2

Using AWS EC2 is recommended by this report as the best option, partly based on its own merits and partly because
of the limitations the alternative option presents.

Lambda, whilst highly scalable, dynamically scaling with traffic up to 500 concurrent instances, has a lower
timeout of 300 seconds. In comparison, EC2 is more flexible in enabling longer running times (AWS, 2019), which
certain workloads within the Xxxxxxx infrastructure would require. Whilst EC2 does require more up-front
administrator effort to manage scalability, it allows full control to the business deploying it, and there are Auto
Scaling groups which provide much of the auto-scaling functionality that Lambda benefits from, albeit with more
work required to set these up.
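As an indication of how the scaling groups mentioned above could be configured, the sketch below (boto3; the launch template, subnets and thresholds are assumptions, not existing Xxxxxxx resources) creates an Auto Scaling group for one migrated SCM workload and attaches a target-tracking policy so capacity follows the daily and month-end peaks rather than being provisioned for them permanently.

```python
import boto3

autoscaling = boto3.client("autoscaling")

# Names, subnets and sizing below are illustrative assumptions.
GROUP = "scm-batch-processing-asg"

# Create the group around an existing (assumed) launch template.
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName=GROUP,
    LaunchTemplate={"LaunchTemplateName": "scm-batch-node", "Version": "$Latest"},
    MinSize=2,                      # quiet periods
    MaxSize=10,                     # month-end processing peaks
    DesiredCapacity=2,
    VPCZoneIdentifier="subnet-aaa111,subnet-bbb222",
)

# Track average CPU so the group grows and shrinks with the actual workload.
autoscaling.put_scaling_policy(
    AutoScalingGroupName=GROUP,
    PolicyName="cpu-target-60",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"},
        "TargetValue": 60.0,
    },
)
```

With on-demand pricing, the group's instance count (and therefore cost) then tracks the usage profile described above instead of being sized for the peak.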

Bibliography
EUGDPR – Information Portal [online]. Available at: https://eugdpr.org/ [Accessed Nov 12, 2019].

AMAZON, 2019. Amazon Web Services (AWS) - Cloud Computing Services [online]. Available
at: https://aws.amazon.com/ [Accessed Dec 18, 2019].

ARTEAGA, A., 2019. Oracle Definition, Functions, Advantages and Disadvantages [online]. Available
at: https://www.symantec.com/connect/articles/oracle-definition-functions-advantages-and-
disadvantages [Accessed Dec 15, 2019].

BAER, T., 2013. Process big data at speed [online]. Available
at: https://www.computerweekly.com/feature/Process-big-data-at-speed [Accessed Dec 7, 2019].

BISWAS, S. and SEN, J., 2016. A Proposed Architecture for Big Data Driven Supply Chain Analytics. IUP Journal of
Supply Chain Management; Hyderabad, 13 (3), 7-33.

BODET, G., et al., 2019. How does Data Democracy Strengthen Agile Data Governance? Online: Zeenea.

CORALLO, A., et al., November 2018. Processing Big Data in Streaming for Fault Prediction: An Industrial
Application. In: 14th International Conference on Signal-Image Technology & Internet-Based Systems, 2018. pp.
730-736.

DE VALENCE, P., 2019. Demystifying Legacy Migration Options to the AWS Cloud [online]. Available
at: https://aws.amazon.com/blogs/apn/demystifying-legacy-migration-options-to-the-aws-cloud/ [Accessed Nov
15, 2019].

DE VALENCE, P. and FARR, E., 2018. Yes, You Should Modernize Your Mainframe with the Cloud [online]. Available
at: https://aws.amazon.com/blogs/enterprise-strategy/yes-you-should-modernize-your-mainframe-with-the-
cloud/ [Accessed Nov 15, 2019].

DELOITTE, 2013. Supply Chain Resilience: A Risk Intelligent approach to managing global supply chains. Online:
Deloitte.

DIGNAN, L., 2019. IBM launches z15 mainframe, aims to automate compliance via Data Privacy Passports [online].
Available at: https://www.zdnet.com/article/ibm-launches-z15-mainframe-aims-to-automate-compliance-via-
data-privacy-passports/ [Accessed Dec 15, 2019].

ELLIOT, T., 2016. Is it true that mainframe computers have never been hacked? [online]. Available
at: https://www.quora.com/Is-it-true-that-mainframe-computers-have-never-been-hacked [Accessed Dec 15,
2019].

ERL, T., 2015. Big Data Fundamentals: Concepts, Drivers & Techniques. 1st ed. London: Pearson Education.

FIRICAN, G., 2017. The 10 Vs of Big Data [online]. Available at: https://tdwi.org/articles/2017/02/08/10-vs-of-big-
data.aspx [Accessed Nov 9, 2019].

GHASEMAGHAEI, M. and CALIC, G., 2019. Can big data improve firm decision quality? The role of data quality and
data diagnosticity. Decision Support Systems, 120, 38-49.

GILL, N.S., 2017. Big Data Ingestion, Processing, Architecture and Tools [online]. Available
at: https://www.xenonstack.com/blog/big-data-ingestion/ [Accessed Dec 2, 2019].

HARRIS, J., 2012. Data Is Useless Without the Skills to Analyze It. Harvard Business Review.

HENDERSON, J., 2018. Nine automakers to share supply chain data [online]. Available
at: https://www.supplychaindigital.com/scm/nine-automakers-share-supply-chain-data [Accessed Dec 10, 2019].

IBM, 2014. Mainframe Concepts [online]. Available
at: www.ibm.com/support/knowledgecenter/zosbasics/com.ibm.zos.zmainframe/zconc_onlinetrans.htm [Accesse
d Nov 12, 2019].

KRA, D., 2018. What are the disadvantages of introducing mainframe Computers? [online]. Available
at: https://www.quora.com/What-are-the-disadvantages-of-introducing-mainframe-computers [Accessed Dec 15,
2019].

KUSIAK, A., 2017. Smart manufacturing must embrace big data. Nature; London, 544 (7648), 23-25.

LEE, I., 2017. Big data: Dimensions, evolution, impacts, and challenges | Elsevier Enhanced Reader. Business
Horizons, 60 (3), 293-303.

LIGUORI, G., 2019. Adapt or die: Why Europe’s business must embrace Industry 4.0 [online]. New Europe. Available
at: https://www.neweurope.eu/article/adapt-or-die-why-europes-business-must-embrace-industry-4-0/ [Accessed
Nov 8, 2019].

LOCK, M., 2017. Angling for Insight in Today's Data Lake. Waltham, Massachusetts: Aberdeen Group.

LONG, Q., 2017. Data-driven decision making for supply chain networks with agent-based computational
experiment | Elsevier Enhanced Reader. Knowledge-Based Systems, 141, 55-56.

MARR, B., 2019. Why every business needs a data and analytics strategy [online]. Available
at: https://www.bernardmarr.com/default.asp?contentID=768 [Accessed Nov 22, 2019].

MARR, B., 2015. Big Data: Using SMART Big Data, Analytics and Metrics to Make Better Decisions and Improve
Performance. 1st ed. John Wiley & Sons.

MARR, B., 2014. Big Data: The 5 Vs Everyone Must Know [online]. Available
at: https://www.linkedin.com/pulse/20140306073407-64875646-big-data-the-5-vs-everyone-must-know
[Accessed Nov 9, 2019].

MASOUD, S.A. and MASON, S.J., 2016. Computers & operations research. Computers & Operations Research, 67, 1-
11.

MCKINSEY GLOBAL INSTITUTE, 2011. Big Data: The Next Frontier for Innovation, Competition, and
Productivity. Online: McKinsey Global Institute.

MOODY, S., 2016. Two born every minute: inside Xxxxxxx’s Sunderland factory [online]. Car Magazine. Available
at: https://www.carmagazine.co.uk/features/car-culture/two-born-every-minute-inside-nissans-sunderland-
factory-car-february-2016/ [Accessed Nov 1, 2019].

NAVINT, 2012. Why is Big Data Important? [online]. Navint Partners. Available at: https://navint.com/wp-
content/uploads/2017/06/Big-Data.pdf [Accessed Dec 12, 2019].

NISSAN, 2019. 10 Millionth Car Built at Xxxxxxx Sunderland Plant | News | Xxxxxxx UK [online]. Xxxxxxx. Available
at: https://www.nissan.co.uk/experience-nissan/news/ten-millionth-vehicle-built-at-nissan-sunderland-
plant.html [Accessed Nov 1, 2019].

ORACLE, 2019. Creating the autonomous business. Online: Oracle.

ORACLE, 2016. Unstructured Data Management with Oracle Database 12c. Online: Oracle.

PETERS, N., 2019. How is every single link in the BMW supply chain being tightened? [online]. Available
at: https://www.themanufacturer.com/articles/how-is-every-single-link-in-the-bmw-supply-chain-being-
tightened/ [Accessed Dec 6, 2019].

QLIK, 2019. Mainframe Modernization | Qlik [online]. Available at: https://www.qlik.com/us/mainframe-to-
cloud/mainframe-modernization [Accessed Dec 14, 2019].

RASHMI ALOK BHADANI and S N KOTKAR, 2015. Big Data: An Innovative way to Gain Competitive Advantage
Through Converting Data into Knowledge. International Journal of Advanced Research in Computer Science, 6 (1),
168.

REUTERS, 2019. Renault-Xxxxxxx group sold most cars last year, but VW's No.1 including trucks [online]. Reuters.
Available at: https://www.reuters.com/article/us-automakers-sales-japan/renault-nissan-group-sold-most-cars-
last-year-but-vws-no-1-including-trucks-idUSKCN1PO0R1 [Accessed Oct 31, 2019].

SOFTWARE, A.G., 2015. Adabas & Natural: Database Management System Platform. Online: Software AG.

SPACEY, J., 2017. 12 Examples of Data Veracity [online]. Available at: https://simplicable.com/new/data-
veracity [Accessed Nov 9, 2019].

SULLIVAN, D., 2019. The Advantages of DB2 | Techwalla.com [online]. Available
at: https://www.techwalla.com/articles/the-advantages-of-db2 [Accessed Dec 15, 2019].

TIAN, X., et al., 2015. Latency critical big data computing in finance | Elsevier Enhanced Reader. The Journal of
Finance and Data Science, 1 (1), 33.

ULARU, E., et al., 2012. Perspectives on Big Data and Big Data Analytics. Database Systems Journal, 3 (4), 3-13.

STOREY, V.C. and SONG, I.-Y., 2017. Big data technologies and management: What conceptual modeling can do. Data & Knowledge Engineering, 108, 50-67.

VIEIRA, A., et al., 2019. Simulation of an automotive supply chain using big data | Elsevier Enhanced
Reader. Computers & Industrial Engineering, 137, 1-14.

YLIJOKI, O. and PORRAS, J., 2018. What managers think about big data. International Journal of Business
Information Systems, 29 (4), 486.
