INFS2030: Digital Business Management: Week 6 - The Data-Driven Organisation: Big Data, Analytics and Decision-Making


INFS2030: Digital Business Management

Week 6 – The Data-Driven Organisation: Big Data, Analytics and Decision-Making

Uri Gal
Rm 4067/H70
[email protected]
Consultation: By appointment

Page 1
Some admin issues…

Mid-term exam April 1


– 10 multiple-choice questions
– Covers material from weeks 1-5

Page 2
Exercise 1: What is Big Data?

What is big data?


– “Big”, “lots”, “exponentially growing” data – so there is just massive volume!
– Human or machine-generated
– New types of data incl. unstructured data (clickstream, pictures, video, social media text, etc.) – variety
– Speed (velocity) of data

What is its relevance for decision-making in business?


– Relevant, valuable for business/marketing
– Or “potentially valuable” (hence we hoard it for its potential for future analyses)

Page 19
The 3 Vs of Big Data

“The 3 Vs” of Big Data:

– Volume – massive amount of data (quantity)


– Variety – unstructured & structured data; different sources (variations)
– Velocity – high frequency (speed)

Page 20
1) Human-Generated Big Data

Intentionally created:
– Text messages
– Posts on social media
– Photos, video, audio
– “Likes” or “helpful” votes
– Web searches
– Webpage bookmarks
– Emails
– Phone calls
– Online purchases


Page 22
1) Human-Generated Big Data

Unintentionally created:
– Photo metadata: Time, GPS location, direction of phone
– Phone call metadata: Time, location (cell tower), length
– Email metadata: From/to/cc/timestamp etc.
– Twitter metadata: Header includes location, creation date of account, application sent with…

– Metadata can be much larger than the actual user data
– May not be user readable but is machine readable

Page 23
2) Machine-Generated Data

– Cell phone <> cell tower


– Satellite > GPS device
– RFID readings
– Medical device readings
– Spam bots vs. spam filters!

– The Internet of Things


– Smart homes, sensors etc.

Page 24
Uses of Internet of Things

– Monitoring production lines


– Metering utilities
– Tracking animals
– Measuring temperature/environment (e.g., earthquake, tsunami warnings)
– Automating home/building
– Managing energy
– Directing industrial appliances
– Managing infrastructure
– Monitoring health

– Content: machine-readable, often high volume/velocity but low variety

Page 25
Some Examples

– Fraud detection in banking industry:


– Identifying patterns in customer behavior
– Detecting any deviations from known patterns

– Healthcare:
– Drug development and new treatments
– Personalized medicine

– Public policy:
– Leveraging data from healthcare, finance, and education provides governments with the insights needed to create policies
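
The fraud-detection example above (learn a customer's usual pattern, then detect deviations from it) can be sketched with a simple z-score rule. This is a minimal illustration, not a description of how any bank actually does it: the function name `flag_deviations`, the threshold of 3 standard deviations, and the transaction amounts are all assumptions made up for the example.

```python
import statistics

def flag_deviations(history, new_amounts, threshold=3.0):
    """Return the new amounts that deviate strongly from the known pattern.

    history      past transaction amounts for one customer (the "known pattern")
    new_amounts  incoming transactions to check
    threshold    how many standard deviations counts as a deviation (assumed)
    """
    mean = statistics.mean(history)
    spread = statistics.stdev(history)  # sample standard deviation
    flagged = []
    for amount in new_amounts:
        if spread == 0:
            # Customer always spent the same amount: anything else is unusual.
            is_unusual = amount != mean
        else:
            z_score = abs(amount - mean) / spread
            is_unusual = z_score > threshold
        if is_unusual:
            flagged.append(amount)
    return flagged

# Hypothetical customer who usually spends around 20-30 per purchase.
usual_spending = [20, 25, 22, 30, 18, 24]
print(flag_deviations(usual_spending, [23, 400]))
```

A rule like this only captures one dimension (amount); in practice the "pattern" would also involve time, location, merchant and so on.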

Page 26
Some Examples

– Starbucks:
– Concern over the taste of a new coffee product
– Social media sentiment analysis revealed the taste was fine, but the price was too high

– Chevron:
– Each drilling miss can cost $100M
– Up to 50 terabytes of seismic survey data is analyzed before drilling
– Odds of success have improved from 1 in 5 to 1 in 3

– U.S. Xpress:
– In-cab system generates more than 950 pieces of data
– Data indicates the truck’s location, where it has been, whether it is idle or moving, and which customer is being served
– Data is used in real time to re-route the fleet
– Reduced idle time has saved millions
Page 27
Three types of Big Data Analytics

Descriptive: What happened?
– Dashboards
– Business intelligence
– E.g., Team A is 30% below quarterly sales quota

Predictive: What will happen?
– Forecasting performance
– Predicting failure of machines
– Estimating risk (credit scores)
– E.g., Team A will finish 5% under quota

Prescriptive: What should we do?
– Optimization (achieve the best outcome)
– Minimize risk
– Make best use of resources, minimize wastage
– E.g., Change team A’s incentive structure
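
The three questions above can be made concrete with a toy sales example. The sketch below is only an illustration under stated assumptions: the function names, the run-rate forecast, the weekly sales figures and the list of options (with their uplift and cost) are all hypothetical and not taken from the lecture.

```python
def descriptive(sales, quota):
    """What happened? Fraction by which sales fall short of quota."""
    return 1 - sales / quota

def predictive(weekly_sales, weeks_in_quarter):
    """What will happen? Naive run-rate forecast of quarter-end sales."""
    run_rate = sum(weekly_sales) / len(weekly_sales)
    return run_rate * weeks_in_quarter

def prescriptive(forecast, options):
    """What should we do? Pick the option with the best projected net sales.

    options maps an action name to (assumed sales uplift, assumed cost).
    """
    def projected_net(name):
        uplift, cost = options[name]
        return forecast * (1 + uplift) - cost
    return max(options, key=projected_net)

# Hypothetical figures for one team.
shortfall = descriptive(sales=70, quota=100)          # about 0.30, i.e. 30% below quota
forecast = predictive([6, 7, 8, 7], weeks_in_quarter=13)
options = {
    "do nothing": (0.00, 0),
    "change incentive structure": (0.10, 3),
    "hire more staff": (0.15, 20),
}
print(round(shortfall, 2), forecast, prescriptive(forecast, options))
```

Note that each step adds assumptions: the descriptive figure is a plain calculation, the forecast assumes the run rate continues, and the recommendation depends entirely on the guessed uplift and cost of each option.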
Page 28
Why Big Data Analytics? → The promises

– Data abundance
– Predictive power
– Unbiased decisions

Page 29
1) The data abundance argument:

– More data coming from more sources


– Social media, sensor data, machine-generated data, RFID, image, audio, GPS
– A terabyte used to be a Data Warehouse; today many are over a petabyte in size

– The argument then goes something like this:


– We have more and more data at our disposal
– There is little value in simply storing the data
– We need to analyse the data
– Imagine what we could learn from all this data

– Core assumption: The more data we have the more we can learn (more = better)

Page 30
But: Are claims about BIG data justified?

– Does data speak for itself?


– All possible correlations between 1,000 variables in a data-set lead to 2^1000 possible combinations; much larger than the number of particles in the Universe (10^80).
– Claims that data will simply speak to us are nonsensical
• We must decide what variables to focus on, what questions to ask, which correlations are most informative, which data to ignore!

– Is more data actually better?


– Any analysis will create spurious or wrong correlations (“false positives”)
– The more data we have, the higher the likelihood of false positives
– Volume by itself increases noise; data must be relevant, understood, timely
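
Both objections on this slide can be checked numerically. The sketch below is a minimal illustration, not part of the original slides. It first compares 2^1000 with 10^80, then generates purely random, unrelated variables and counts how many pairs look "correlated" anyway. The helper names `pearson` and `count_spurious`, the sample size of 50 observations and the cut-off |r| > 0.3 are assumptions chosen for the example.

```python
import math
import random

# Size of the search space: every subset of 1,000 variables.
subsets = 2 ** 1000
print(len(str(subsets)))    # 302 decimal digits
print(subsets > 10 ** 80)   # True
print(math.comb(1000, 2))   # 499500 pairwise correlations alone

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def count_spurious(n_vars, n_obs=50, threshold=0.3, seed=0):
    """Count variable pairs whose |correlation| exceeds the threshold.

    All variables are independent random noise, so every hit is a
    false positive by construction.
    """
    rng = random.Random(seed)
    columns = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    hits = 0
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if abs(pearson(columns[i], columns[j])) > threshold:
                hits += 1
    return hits

for n_vars in (10, 20, 40):
    print(n_vars, "variables:", count_spurious(n_vars), "spurious correlations")
```

Because the number of pairs grows roughly with the square of the number of variables, adding more variables gives chance more opportunities to produce a correlation that means nothing.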

Page 31
2) The prediction argument:

– “In the next two decades, we will be able to predict huge areas of the future with far greater accuracy than ever before in human history, including events long thought to be beyond the realm of human inference.”
– (Patrick Tucker “The Naked Future”)

– Causality is dead: “Petabytes allow us to say: ‘Correlation is enough.’”
– Using machine learning and predictive analytics we are able to simply predict what will happen next – there is no longer a need to understand ‘why’
– The end of theory and science?
– “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete”, Chris Anderson, Wired Magazine, June 23, 2008

– Assumption: The future can be predicted on the basis of the past

Page 32
But: Are predictions realistic?

– Extrapolation is not prediction
– Only works when the future resembles the past
– The behavior of dynamic complex systems is an emergent property of the interconnections between their parts
– Potential for error increases with prediction time-scale

– Self-fulfilling prophecy
– We help create the future we predict
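
The claim that the potential for error increases with the prediction time-scale can be illustrated with a small simulation. The sketch below assumes a random walk (each step adds random noise to the previous value) as a stand-in for a process whose future is only partly determined by its past; that choice, the function name `naive_forecast_rmse` and the horizons used are illustrative assumptions, not a model from the lecture. The forecast is simply "the future equals the last observed value".

```python
import math
import random

def naive_forecast_rmse(horizons, n_paths=2000, seed=1):
    """Root-mean-square error of a 'nothing changes' forecast at each horizon.

    Each simulated path starts at 0 (the last observed value), which is also
    the forecast for every future step. The error at a given step is therefore
    how far the path has drifted from 0 by then.
    """
    rng = random.Random(seed)
    squared_error = {h: 0.0 for h in horizons}
    for _ in range(n_paths):
        position = 0.0
        for step in range(1, max(horizons) + 1):
            position += rng.gauss(0, 1)  # one more step of random change
            if step in squared_error:
                squared_error[step] += position ** 2
    return {h: math.sqrt(total / n_paths) for h, total in squared_error.items()}

errors = naive_forecast_rmse([1, 5, 20, 50])
for horizon, rmse in errors.items():
    print(f"{horizon:>2} steps ahead: typical error {rmse:.2f}")
```

In this setup the typical error grows roughly with the square root of the horizon, so a forecast 50 steps ahead is far less reliable than one 5 steps ahead even though the same data and method are used.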

Page 33
3) The decision-making argument

– “Analytics brings rigorous techniques to decision making; big data is at once simpler and more powerful.”
– “It’s a simple formula: Using big data leads to better predictions, and better predictions yield better decisions.”
– “Leaders will either embrace this fact or be replaced by others who do. Companies that figure out how to combine domain expertise with data science will pull away from their rivals.”

– Assumption: Data is objective, based on facts and analytics (not intuition) and will therefore lead to correct and better decisions (Data is unbiased!)

Page 34
Does Big Data analytics lead to objective decisions?

– Reducing complex (qualitative) phenomena to their (quantitative) digital residues

– Using simplistic proxies for multi-faceted traits or skills

– Working with Big Data requires making subjective (human) decisions

– Numbers are not neutral

Page 35
Next week

– AI or Design thinking?

Page 41
