Topic 06-Exploring Data For Analysis

Topic 06: Exploring

Data for Analysis

ICT601 Business Analytics


Dr Umera Imtinan and Dr Saeed Shariati
Please don’t forget to turn the recording on
Resources for this topic

• See Topic 06 Readings in Moodle for links


• #MakeoverMonday (Chapter 1, pp. 33-62)
• This topic also refers to sources you have met earlier:
• Knaflic
• Few
Learning outcomes

At the completion of this topic, you should be able to:


• Demonstrate an ability to work with unfamiliar data to produce valuable insights
• Define the different types and sources of metadata and demonstrate an ability to
extract and record the metadata for a given data set
• Prepare an exploratory analysis of an unfamiliar data set, demonstrating a
reproducible process
Lecture outline

• Unfamiliar Data
• Metadata
• Explore the Data
• Now what?
Topic 06: Part 01
Unfamiliar Data
Unfamiliar Data

“Data is more than numbers, and to visualize it, you must know what it
represents. Data represents real life. It’s a snapshot of the world in
the same way that a photograph captures a small moment in time.”
(Yau, 2013, p.2)

It is unrealistic to expect that we will only ever be presented with data
and datasets of which we have a complete understanding

IDCJAC0009 8147 1909 8 1 15.2 1Y
IDCJAC0009 8147 1909 8 2 82.6 1Y
IDCJAC0009 8147 1909 8 3 0 Y
IDCJAC0009 8147 1909 8 4 0 Y
IDCJAC0009 8147 1909 8 5 4.8 1Y
IDCJAC0009 8147 1909 8 6 17.5 1Y
IDCJAC0009 8147 1909 8 7 20.8 1Y
IDCJAC0009 8147 1909 8 8 15.2 1Y
IDCJAC0009 8147 1909 8 9 10.2 1Y
IDCJAC0009 8147 1909 8 10 0 Y
So what should we do?

“Exploratory analysis is what you do to understand the data and figure
out what might be noteworthy or interesting to highlight to others.”
(Knaflic, 2015, p.19)

Of course, we would always start off by trying to get as full a picture
of the data as possible
- This will come from engagement with:
  - Stakeholders
  - Analysis consumers
  - Data owners
What else do we need to know before we start?

“Working with constantly changing topics, data sizes, and data
complexity… helps you develop the skills to make you better at your job.”
(Kriebel and Murray, 2018, p.33)

Let’s also make sure we are clear about:
- What the customer needs
- What they want to know
- Do they want/need it in a particular format?
- Are there ways in which we can deliver more than asked for?
- How can we find out more?
Topic 06: Part 02
Metadata
What is metadata?

https://www.abc.net.au/news/2014-08-07/brandis-explanation-adds-confusion-to-metadata-proposal/5654186?pfmredir=sm
…again…

“You have to know the who, what, when, where, why, and how … before you
can know what the numbers are actually about.” (Yau, 2013, p.37)

• At its most basic, metadata is data about data
• This definition tells us very little, and will not attract any marks
should it be used as an answer in an exam :)
• It is far more complicated than simply “data about data”
…metadata

“..a wealth of useful metadata [can be created] with the user having to
take no action at all to generate it.” (Jackson and Lockwood, 2018, p.66)

• Metadata will help us to understand the data
• It can be technical…
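As one illustration of technical metadata that is generated without any action from the user, the sketch below reads what the operating system records about a file. The file name `rainfall_sample.txt` and the helper `file_metadata` are hypothetical and exist only for this demonstration; which fields matter will depend on the data set at hand.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path):
    """Collect metadata the operating system records automatically for a file."""
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified_utc": modified.isoformat(),
    }


# Hypothetical file, created only so the example is self-contained.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "rainfall_sample.txt")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write("IDCJAC0009 8147 1909 8 1 15.2 1Y\n")
    meta = file_metadata(sample_path)

print(meta)
```

Nobody typed the size or the modification time; the system recorded them as a side effect of saving the file.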
…sources of metadata

“When you find yourself looking at unfamiliar data, you want to avoid
feeling overwhelmed and instead feel excited to explore, to analyze, to
ask questions, and to find answers…” (Kriebel and Murray, 2018, p.150)

• Metadata can come from computer systems
  • Operational systems
  • Database systems
  • Applications
…sources of metadata

“The first thing to do when faced with a visualization challenge is to
make sure you have a robust understanding of the context and what you
need to communicate.” (Knaflic, 2015, p.188)

• Metadata can also come from organisational systems
  • Processes
  • Procedures
  • Policies
…sources of metadata

“The true promise of the information age isn’t tons of data but decisions
and actions that are better because they are based on an understanding of
what’s really going on in the world” (Few, 2012, p.xv)

• …and it can come from people
  • Project managers
  • Sponsors
  • Subject matter experts
  • Customers
  • …
…gain insights from the data

“It is beneficial to first get familiar with the data itself.”
(Kriebel and Murray, 2018, p.35)

• A good starting point is to use the data to help us establish what we
know, what we think we might know, and what we don’t know…
• for example:
  • What types of data are in the data set?
  • What ranges of values do the fields contain?
  • Is the data complete?
Data about the data…

• What types of data are in the data set?


• What ranges of values do the fields
contain?
• Is the data complete?
IDCJAC0009 8147 1909 8 1 15.2 1Y
IDCJAC0009 8147 1909 8 2 82.6 1Y
IDCJAC0009 8147 1909 8 3 0 Y
IDCJAC0009 8147 1909 8 4 0 Y
IDCJAC0009 8147 1909 8 5 4.8 1Y
IDCJAC0009 8147 1909 8 6 17.5 1Y
IDCJAC0009 8147 1909 8 7 20.8 1Y
IDCJAC0009 8147 1909 8 8 15.2 1Y
IDCJAC0009 8147 1909 8 9 10.2 1Y
IDCJAC0009 8147 1909 8 10 0 Y
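The three questions above can be put to the sample rows directly, even before any column names are known. The sketch below is one possible approach, not a prescribed method: it splits each row on whitespace, guesses a type for every token, and reports types, numeric range and number of distinct values per column position. The helper names `infer_type` and `profile` are illustrative.

```python
SAMPLE = """\
IDCJAC0009 8147 1909 8 1 15.2 1Y
IDCJAC0009 8147 1909 8 2 82.6 1Y
IDCJAC0009 8147 1909 8 3 0 Y
IDCJAC0009 8147 1909 8 4 0 Y
IDCJAC0009 8147 1909 8 5 4.8 1Y
IDCJAC0009 8147 1909 8 6 17.5 1Y
IDCJAC0009 8147 1909 8 7 20.8 1Y
IDCJAC0009 8147 1909 8 8 15.2 1Y
IDCJAC0009 8147 1909 8 9 10.2 1Y
IDCJAC0009 8147 1909 8 10 0 Y
"""


def infer_type(token):
    """Guess whether a token is an integer, a decimal or plain text."""
    for caster, label in ((int, "integer"), (float, "decimal")):
        try:
            caster(token)
            return label
        except ValueError:
            continue
    return "text"


def profile(rows):
    """Summarise each column position: types seen, range, distinct values."""
    summary = []
    for position, values in enumerate(zip(*rows)):
        numeric = [float(v) for v in values if infer_type(v) != "text"]
        summary.append({
            "position": position,
            "types": sorted({infer_type(v) for v in values}),
            "distinct": len(set(values)),
            "min": min(numeric) if numeric else None,
            "max": max(numeric) if numeric else None,
        })
    return summary


rows = [line.split() for line in SAMPLE.splitlines()]
token_counts = {len(row) for row in rows}  # more than one count hints at gaps
profile_result = profile(rows)

for entry in profile_result:
    print(entry)
```

A column that mixes integers and decimals, or a text column with only two distinct values such as `1Y` and `Y`, is exactly the kind of finding to take back to the data owner as a question.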
Topic 06: Part 03
Explore the data
Explore the data

“Typically, when you approach a new data set, you will know something
about it and the purpose of the analysis.” (Kriebel and Murray, 2018,
p.151)

• If we have some understanding of the data, then we can start to explore
the data
  • Do we have any attribute names?
  • Do we know what the attribute names mean?
  • How much variety is there in the attributes’ values?
  • Is there any hierarchy in the data?
  • What happens if we compare the data across attributes?
  • Does one attribute affect another?
Data about the data…
• Do we have any attribute names?
• Do we know what the attribute names mean?
• How much variety is there in the attributes’
values?
• Is there any hierarchy in the data?
• What happens if we compare the data across
attributes?
• Does one attribute affect another?
Product code | Bureau of Meteorology station number | Year | Month | Day | Rainfall amount (millimetres) | Period over which rainfall was measured (days) | Quality
IDCJAC0009 | 8147 | 1909 | 8 | 1 | 15.2 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 2 | 82.6 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 3 | 0 | | Y
IDCJAC0009 | 8147 | 1909 | 8 | 4 | 0 | | Y
IDCJAC0009 | 8147 | 1909 | 8 | 5 | 4.8 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 6 | 17.5 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 7 | 20.8 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 8 | 15.2 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 9 | 10.2 | 1 | Y
IDCJAC0009 | 8147 | 1909 | 8 | 10 | 0 | | Y
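Once attribute names are available, the rows can be read into named fields and one attribute compared against another. The sketch below assumes, based only on the column headings on this slide, that the final token of each raw row holds the measurement period (when present) followed by a one-letter quality flag. The class `Observation` and the function `parse_row` are illustrative names, not part of any Bureau of Meteorology tooling.

```python
from dataclasses import dataclass
from typing import Optional

RAW_ROWS = [
    "IDCJAC0009 8147 1909 8 1 15.2 1Y",
    "IDCJAC0009 8147 1909 8 2 82.6 1Y",
    "IDCJAC0009 8147 1909 8 3 0 Y",
    "IDCJAC0009 8147 1909 8 4 0 Y",
    "IDCJAC0009 8147 1909 8 5 4.8 1Y",
    "IDCJAC0009 8147 1909 8 6 17.5 1Y",
    "IDCJAC0009 8147 1909 8 7 20.8 1Y",
    "IDCJAC0009 8147 1909 8 8 15.2 1Y",
    "IDCJAC0009 8147 1909 8 9 10.2 1Y",
    "IDCJAC0009 8147 1909 8 10 0 Y",
]


@dataclass
class Observation:
    product_code: str
    station_number: str
    year: int
    month: int
    day: int
    rainfall_mm: float
    period_days: Optional[int]  # None when the raw row has no period
    quality: str


def parse_row(line):
    """Split a raw row into named fields (assumes period and quality are fused)."""
    product, station, year, month, day, rainfall, last = line.split()
    period_text, quality = last[:-1], last[-1]
    return Observation(
        product, station, int(year), int(month), int(day), float(rainfall),
        int(period_text) if period_text else None, quality,
    )


observations = [parse_row(line) for line in RAW_ROWS]

# Does one attribute affect another? Compare rainfall against period.
dry_days_missing_period = all(
    o.period_days is None for o in observations if o.rainfall_mm == 0
)
wet_days_have_period = all(
    o.period_days == 1 for o in observations if o.rainfall_mm > 0
)
print(dry_days_missing_period, wet_days_have_period)
```

In this sample the period is blank exactly when no rain fell. Whether that holds across the full data set, and what it means, is a question for the metadata or the data owner rather than something ten rows can settle.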
What else?

“These questions are a set of the parameters within which you will
undertake your analysis...” (Kriebel and Murray, 2018, p.151)

• What type of data is it?
  • Is it survey data, sensor measurements, summary statistics?
• What is the topic of the data?
• Who are the audience or consumers of the analysis?
• What tools do you have to explore, analyse and visualise the data?
• Are there any constraints you have to take into account?
Exploring…

“Explore the data by building lots and lots of charts. Remember, you are
ultimately a data analyst.” (Kriebel and Murray, 2018, p.37)

Start simple!
- Do lots of charts!
- If it is time-based data, start with year and drill down
- Look for patterns. Are these important? Can they be explained? Are the
patterns the same from year to year?
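The "start with year and drill down" advice can be sketched as a grouping at increasing depth. The example below uses only the ten sample rainfall rows from earlier slides, so the year and month levels each contain a single group; with a full data set the same function would return one total per year or per month. The function name `total_by` and the text bar chart are illustrative choices, standing in for whatever charting tool is available.

```python
from collections import defaultdict

# (year, month, day, rainfall_mm) taken from the sample rows on the slides.
rows = [
    (1909, 8, 1, 15.2), (1909, 8, 2, 82.6), (1909, 8, 3, 0.0),
    (1909, 8, 4, 0.0), (1909, 8, 5, 4.8), (1909, 8, 6, 17.5),
    (1909, 8, 7, 20.8), (1909, 8, 8, 15.2), (1909, 8, 9, 10.2),
    (1909, 8, 10, 0.0),
]


def total_by(rows, depth):
    """Sum rainfall grouped by the first `depth` date parts (1=year, 2=month, 3=day)."""
    totals = defaultdict(float)
    for *date_parts, rainfall in rows:
        totals[tuple(date_parts[:depth])] += rainfall
    return dict(totals)


by_year = total_by(rows, 1)
by_month = total_by(rows, 2)
by_day = total_by(rows, 3)

# A crude text chart of the most detailed level: one '#' per 5 mm.
for (year, month, day), rainfall in by_day.items():
    bar = "#" * round(rainfall / 5)
    print(f"{year}-{month:02d}-{day:02d} {rainfall:5.1f} {bar}")
```

Even this rough chart makes the second day stand out, which is the point of charting early and often.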
Simplify for exploration

“After undertaking an entire analysis, it can be tempting to want to show
your audience everything, as evidence of all the work you did…Concentrate
on …the information your audience needs to know.” (Knaflic, 2015, p.20)

Some data sets you get to analyze will have many fields, so it pays, in
the early stages at least, to simplify where possible
- Remove unnecessary fields
- Focus on a subset of the data

***Of course, this does not mean you should lose or delete any data that
is not in your early analyses
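One way to simplify without losing anything is to build a reduced working copy and leave the original records untouched. The sketch below assumes the sample rainfall rows have already been loaded as dictionaries; the field names and the helper `simplify` are illustrative.

```python
# Field names are assumptions based on the column headings shown earlier.
FIELDS = ("product_code", "station_number", "year", "month", "day",
          "rainfall_mm", "quality")
VALUES = [
    ("IDCJAC0009", "8147", 1909, 8, 1, 15.2, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 2, 82.6, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 3, 0.0, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 4, 0.0, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 5, 4.8, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 6, 17.5, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 7, 20.8, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 8, 15.2, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 9, 10.2, "Y"),
    ("IDCJAC0009", "8147", 1909, 8, 10, 0.0, "Y"),
]
records = [dict(zip(FIELDS, row)) for row in VALUES]


def simplify(records, keep_fields, condition=lambda record: True):
    """Return a reduced copy: only some fields, only rows meeting the condition."""
    return [
        {field: record[field] for field in keep_fields}
        for record in records
        if condition(record)
    ]


# Working view for early exploration: just day and rainfall, wet days only.
wet_days = simplify(records, ("day", "rainfall_mm"),
                    lambda record: record["rainfall_mm"] > 0)

print(len(records), "original records,", len(wet_days), "in the working view")
```

Because `simplify` builds new dictionaries, the full records remain available when a later question needs a field that was set aside.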
Exploring…

“Sticking to basic and easy-to-understand insights while making enough
time to practice regularly is perfectly fine.” (Kriebel and Murray, 2018,
p.154)

When looking at your initial exploratory charts, think about the
following:
- What do you notice?
- Are there obvious outliers?
- Are there trends that are immediately apparent?
- Are there possible correlations between two metrics?
- Are there interesting clusters in your data?
- Are there repetitive patterns in the data, such as seasonal
spikes/troughs?
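A chart is usually the quickest way to spot an outlier, but a simple numeric check can confirm what the eye suggests. The sketch below applies the common interquartile-range rule (values more than 1.5 times the IQR beyond the quartiles) to the sample rainfall amounts. The rule and the 1.5 multiplier are a conventional choice swapped in here for illustration, not something the slides prescribe.

```python
import statistics

# (day, rainfall_mm) for August 1909, taken from the sample rows on the slides.
daily_rainfall = [
    (1, 15.2), (2, 82.6), (3, 0.0), (4, 0.0), (5, 4.8),
    (6, 17.5), (7, 20.8), (8, 15.2), (9, 10.2), (10, 0.0),
]


def iqr_outliers(pairs, multiplier=1.5):
    """Return the (label, value) pairs lying outside the IQR fences."""
    values = [value for _, value in pairs]
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - multiplier * spread, q3 + multiplier * spread
    return [(label, value) for label, value in pairs
            if value < low or value > high]


outliers = iqr_outliers(daily_rainfall)
print(outliers)
```

Flagging a value is only the first step. Whether the 82.6 mm reading is an error or a genuine storm is a question to check against the metadata or with a subject matter expert.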
Topic 06: Part 04
Now what?
You’ve explored the data …

“…if we use preattentive attributes strategically, they can help us
enable our audience to see what we want them to see before they even know
they’re seeing it!” (Knaflic, 2015, p.104)

…and feel confident that you now have a good understanding of:
- What you know
- What you don’t know
- What you are not sure of

…now what?
Check and check again…

“Hmmm, can’t think of anything interesting to say here…” (Toohey, 2019)

Remember the Rock Project?
- This is now the point at which you need to make sure that you and the
client are on the same page
- …asking more questions
  - “I see there is a pattern here; every June there is a fall in the
  sales of swimming pools… is that important?”
Explanatory analysis

“Exploratory analysis is what you do to understand the data and figure
out what might be noteworthy …to highlight to others… [explanatory
analysis is where] you have a specific thing you want to explain, a
specific story you want to tell” (Knaflic, 2015, p.19-20)

…and this is where the focus changes from exploring to explaining
Topic 06
Topic Summary
Learning outcomes

At the completion of this topic, you should be able to:


• Demonstrate an ability to work with unfamiliar data to produce valuable insights
• Define the different types and sources of metadata and demonstrate an ability to
extract and record the metadata for a given data set
• Prepare an exploratory analysis of an unfamiliar data set, demonstrating a
reproducible process
Where to from here…

Next topic is:


• Visualisation Best Practice
• Choosing the right chart type
• Creating effective views
• Designing holistic dashboards
• Perfecting your work
• Evaluating your work
References

Few, S., 2012, Show Me the Numbers: Designing tables and graphs to
enlighten. 2nd Ed., Analytics Press, Burlingame
Jackson, T.W., and Lockwood, S., 2018, Business Analytics: A contemporary
approach. Macmillan International, London
Knaflic, C.N., 2015, Storytelling with Data: a data visualization guide for
business professionals. Wiley, Hoboken
Kriebel, A., and Murray, E., 2018, #MakeoverMonday: Improving how we
visualize and analyse data, one chart at a time. Wiley, Hoboken
Wexler, S., Shaffer, J., and Cotgreave, A., 2017, The Big Book of Dashboards:
Visualizing your data using real-world business scenarios. Wiley, Hoboken
Yau, N., 2013, Data Points: Visualization that Means Something. Wiley,
Hoboken
