
Course 2 Google

Course 2 Google Data Analytics

Uploaded by

abdirahmanja23
Copyright
© All Rights Reserved

Course 2: Google Data Analytics

MODULE 1

Introduction and content


Hello, and welcome to the second course of the Google Data Analytics Certificate program. You’re
on an exciting journey!
In this part of the program, you’ll learn how data analysts use structured thinking to solve business
problems. Then, you’ll explore how to ask effective questions and use the answers to tell a
meaningful story about data. Finally, you’ll discover strategies for effectively communicating and
collaborating with your stakeholders when defining a problem and presenting data insights. This
will enable you to support and advance business goals with data!

Course 2 content
Each course is broken into modules. Here’s a quick overview of the skills you’ll gain in each of
the four Course 2 modules.

Module 1: Ask effective questions

Data analysts are constantly asking questions in order to find solutions and identify business
potential. In this part of the course, you’ll learn about effective questioning techniques that will
help guide your analysis.

Module 2: Make data-driven decisions

In analytics, data drives decision-making, and this is your opportunity to explore data of all kinds
and its impact on all sorts of business decisions. You’ll also learn how to effectively share your
data through reports and dashboards.

Module 3: Spreadsheet magic

Spreadsheets are a key data analytics tool. Here you’ll learn both why and how data analysts use
spreadsheets in their work. You’ll also investigate how structured thinking helps analysts
understand problems and come up with solutions.

Module 4: Always remember the stakeholder

Successful data analysts balance the needs and expectations of their team and the stakeholders they
support. In this part of the course, you’ll learn strategies for managing stakeholder expectations
while establishing clear communication with your team.
Terms and definitions for Course 2, Module 1

• Action-oriented question: A question whose answers lead to change
• Cloud: A place to keep data online, rather than on a computer hard drive
• Data analysis process: The six phases of ask, prepare, process, analyze, share, and act
whose purpose is to gain insights that drive informed decision-making
• Data life cycle: The sequence of stages that data experiences, which include plan, capture,
manage, analyze, archive, and destroy
• Leading question: A question that steers people toward a certain response
• Measurable question: A question whose answers can be quantified and assessed
• Problem types: The various problems that data analysts encounter, including categorizing
things, discovering connections, finding patterns, identifying themes, making predictions,
and spotting something unusual
• Relevant question: A question that has significance to the problem to be solved
• SMART methodology: A tool for determining a question’s effectiveness based on whether
it is specific, measurable, action-oriented, relevant, and time-bound
• Specific question: A question that is simple, significant, and focused on a single topic or a
few closely related ideas
• Structured thinking: The process of recognizing the current problem or situation,
organizing available information, revealing gaps and opportunities, and identifying
options
• Time-bound question: A question that specifies a timeframe to be studied
• Unfair question: A question that makes assumptions or is difficult to answer honestly

Evaluate your current data analytics skills
The Google Data Analytics Certificate program is designed for anyone who wants to gain the skills
required to become an entry-level data analyst. If that sounds like you, move on to the next item
in this course, Meet and greet. However, if you already have some experience with data analytics,
you may consider earning the Google Advanced Data Analytics Certificate or the Google Business
Intelligence Certificate instead. In this reading, you’ll learn more about the knowledge and skills
you need for one of the advanced certificate programs. You’ll also discover more about those
programs and why they might be a great next step for you!
Data analytics knowledge and skills
Data analysts must have a comprehensive understanding of the data analytics process, as well as
the technical skills that allow them to complete the data analysis process. In this section, you’ll
consider questions about the data analytics process and specific technical skills to determine your
readiness for the advanced certificate programs.
First, evaluate your knowledge of the data analytics process by considering whether the following
statements apply to you:

• I have a thorough understanding of data-driven decision-making and how it helps
organizations guide their business strategy based on facts.
• I’m able to ask questions and make hypotheses about business problems and use them to
guide me through the data analysis process.
• I know the steps to verify data credibility and perform data validation.
• I understand data modeling and know how organizations use it as a tool to understand their
data.
• I can select and design visualizations that help me effectively communicate analysis
insights to stakeholders.
If the previous statements apply to you, you probably know the basics of the data analysis process.
Continue reading to evaluate your technical skills.
Data analysts use a variety of tools, including software and programming languages, to analyze
data. You will be most successful in an advanced certificate program if you’re able to use
spreadsheets, SQL, Tableau, and R, which are covered in this program. Consider whether the
following statements apply to you:
• I’m able to join data from multiple sources to use for data analysis.
• I can sort data in both a spreadsheet and a database.
• I’m able to clean data by ensuring it contains no duplicate or incorrect entries and is in the
correct format.
• I know how to create data visualizations using a spreadsheet, Tableau, and R.
• I can write a SQL command that would select several columns from a table.
• I understand packages in R and can select and install the packages I need to complete
specific tasks.
Choose your next certificate program
If you confidently answered, “Yes” to all of the questions in the previous section, you might choose
to pursue the Google Advanced Data Analytics Certificate or Google Business Intelligence
Certificate to further your knowledge of data analytics.
In the Google Advanced Data Analytics Certificate program, you’ll build on your data analytics
skills and explore what it means to be a data scientist throughout seven courses. You’ll enhance
your Tableau skills. And you’ll learn another programming language, Python, and practice using
it to prepare, process, clean, analyze, and visualize data. Finally, you’ll delve into statistics and
use statistical techniques such as regression and machine learning to answer business questions.
This program is ideal for individuals preparing for data science or more advanced data analytics
roles.
The Google Business Intelligence Certificate program is composed of three courses, which will
expand your knowledge through practical, hands-on projects featuring tools and platforms such as
BigQuery, Dataflow, and Tableau. You’ll learn about data management and the systems required
to successfully manage it in a business environment. Throughout the program, you’ll discover how
to design and interpret dashboards that provide dynamic, live data insights to stakeholders. The
Google Business Intelligence Certificate program is ideal for individuals seeking entry-level
business intelligence roles.

Key takeaways
If you’re already familiar with the data analytics concepts and skills presented in the Google Data
Analytics Certificate, you may wish to proceed directly to our more advanced programs:
• Google Advanced Data Analytics Certificate
• Google Business Intelligence Certificate

If you’re not sure which program to take, feel free to explore each of them. You can return to the
Google Data Analytics Certificate program at any time if you decide it’s the best starting point for
you!
From issue to action: the six data analysis phases
There are six data analysis phases that will help you make seamless decisions: ask, prepare,
process, analyze, share, and act. Keep in mind, these are different from the data life cycle, which
describes the changes data goes through over its lifetime. Going through the steps will help you
solve all kinds of business problems that you might face on the job.

➢ Step 1: Ask
It’s impossible to solve a problem if you don’t know what it is. These are some things to consider:
• Define the problem you’re trying to solve
• Make sure you fully understand the stakeholder’s expectations
• Focus on the actual problem and avoid any distractions
• Collaborate with stakeholders and keep an open line of communication
• Take a step back and see the whole situation in context

Questions to ask yourself in this step:

• What are my stakeholders saying their problems are?
• Now that I’ve identified the issues, how can I help the stakeholders resolve their questions?

➢ Step 2: Prepare
You will decide what data you need to collect in order to answer your questions and how to
organize it so that it is useful. You might use your business task to decide:
• What metrics to measure
• Where to locate data in your database
• What security measures to create to protect that data

Questions to ask yourself in this step:

• What do I need to figure out how to solve this problem?
• What research do I need to do?

➢ Step 3: Process
Clean data is the best data, and you will need to clean up your data to get rid of any possible errors,
inaccuracies, or inconsistencies. This might mean:
• Using spreadsheet functions to find incorrectly entered data
• Using SQL functions to check for extra spaces
• Removing repeated entries
• Checking as much as possible for bias in the data

Questions to ask yourself in this step:

• What data errors or inaccuracies might get in my way of getting the best possible answer
to the problem I am trying to solve?
• How can I clean my data so the information I have is more consistent?
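
The cleaning tasks listed above (checking for extra spaces and removing repeated entries) can be sketched in a few lines of Python. This is only a minimal illustration, not the method taught in the course: the records, the field names, and the helper names `clean_text` and `clean_rows` are all hypothetical.

```python
# Hypothetical raw records; the names and values are invented for illustration.
raw_rows = [
    {"customer": " Ana Lopez ", "city": "Austin"},
    {"customer": "Ana Lopez", "city": "Austin "},
    {"customer": "Ben  Okafor", "city": "Denver"},
]


def clean_text(value):
    """Trim leading/trailing spaces and collapse repeated inner spaces."""
    return " ".join(value.split())


def clean_rows(rows):
    """Return rows with tidy text fields and repeated entries removed."""
    seen = set()
    cleaned = []
    for row in rows:
        tidy = {key: clean_text(val) for key, val in row.items()}
        fingerprint = tuple(sorted(tidy.items()))
        if fingerprint not in seen:  # keep only the first copy of each entry
            seen.add(fingerprint)
            cleaned.append(tidy)
    return cleaned


cleaned_rows = clean_rows(raw_rows)
print(cleaned_rows)
```

Notice that the first two records only turn out to be duplicates after the extra spaces are removed, which is why the spaces are cleaned before the repeated entries are dropped.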

➢ Step 4: Analyze
You will want to think analytically about your data. At this stage, you might sort and format your
data to make it easier to:
• Perform calculations
• Combine data from multiple sources
• Create tables with your results

Questions to ask yourself in this step:

• What story is my data telling me?
• How will my data help me solve this problem?
• Who needs my company’s product or service? What type of person is most likely to use
it?

➢ Step 5: Share
Everyone shares their results differently, so be sure to summarize your analysis with clear and
engaging visuals, using tools like graphs or dashboards. This is your chance
to show the stakeholders you have solved their problem and how you got there. Sharing will
help your team:
• Make better decisions
• Make more informed decisions
• Achieve stronger outcomes
• Communicate findings successfully

Questions to ask yourself in this step:

• How can I make what I present to the stakeholders engaging and easy to understand?
• What would help me understand this if I were the listener?

➢ Step 6: Act
Now it’s time to act on your data. You will take everything you have learned from your data
analysis and put it to use. This could mean providing your stakeholders with recommendations
based on your findings so they can make data-driven decisions.

Questions to ask yourself in this step:

• How can I use the feedback I received during the share phase (step 5) to actually meet the
stakeholder’s needs and expectations?

These six steps can help you to break the data analysis process into smaller, manageable parts,
which is called structured thinking. This process involves four basic activities:
• Recognizing the current problem or situation
• Organizing available information
• Revealing gaps and opportunities
• Identifying your options

When you are starting out in your career as a data analyst, it is normal to feel pulled in a few
different directions with your role and expectations. Following processes like the ones outlined
here and using structured thinking skills can help get you back on track, fill in any gaps and let
you know exactly what you need.

Six common problem types


Data analytics is so much more than just plugging information into a platform to find insights. It
is about solving problems. To get to the root of these problems and find practical solutions, there
are lots of opportunities for creative thinking. No matter the problem, the first and most
important step is understanding it. From there, it is good to take a problem-solver approach to
your analysis to help you decide what information needs to be included, how you can transform
the data, and how the data will be used.

➢ Data analysts typically work with six problem types

A video, Common problem types, introduced the six problem types with an example for each.
The examples are summarized below for review.

➢ Making predictions
A company that wants to know the best advertising method to bring in new customers is an
example of a problem requiring analysts to make predictions. Analysts with data on location,
type of media, and number of new customers acquired as a result of past ads can't guarantee
future results, but they can help predict the best placement of advertising to reach the target
audience.

➢ Categorizing things
An example of a problem requiring analysts to categorize things is a company's goal to improve
customer satisfaction. Analysts might classify customer service calls based on certain keywords
or scores. This could help identify top-performing customer service representatives or help
correlate certain actions taken with higher customer satisfaction scores.

➢ Spotting something unusual
A company that sells smart watches that help people monitor their health would be interested in
designing their software to spot something unusual. Analysts who have analyzed aggregated
health data can help product developers determine the right algorithms to spot and set off alarms
when certain data doesn't trend normally.
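
To make the idea of spotting something unusual more concrete, here is a minimal Python sketch. It assumes one simple rule: flag any reading that sits more than two standard deviations from the average. Both the rule and the heart-rate readings are assumptions made for illustration, and a real health product would need a far more careful algorithm.

```python
from statistics import mean, stdev


def flag_unusual(readings, threshold=2.0):
    """Return readings more than `threshold` standard deviations from the mean."""
    avg = mean(readings)
    spread = stdev(readings)
    if spread == 0:  # every reading is identical, so nothing stands out
        return []
    return [r for r in readings if abs(r - avg) / spread > threshold]


# Hypothetical resting heart-rate readings (beats per minute), invented for illustration.
heart_rates = [62, 64, 61, 63, 65, 62, 64, 63, 61, 140]
unusual = flag_unusual(heart_rates)
print(unusual)
```

The `threshold` parameter is a design choice: a lower value raises more alarms, while a higher value risks missing readings that matter.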

➢ Identifying themes
User experience (UX) designers might rely on analysts to analyze user interaction data. Similar
to problems that require analysts to categorize things, usability improvement projects might
require analysts to identify themes to help prioritize the right product features for improvement.
Themes are most often used to help researchers explore certain aspects of data. In a user study,
user beliefs, practices, and needs are examples of themes.
By now you might be wondering if there is a difference between categorizing things and
identifying themes. The best way to think about it is: categorizing things involves assigning
items to categories; identifying themes takes those categories a step further by grouping them
into broader themes.

➢ Discovering connections
A third-party logistics company working with another company to get shipments delivered to
customers on time is a problem requiring analysts to discover connections. By analyzing the wait
times at shipping hubs, analysts can determine the appropriate schedule changes to increase the
number of on-time deliveries.

➢ Finding patterns
Minimizing downtime caused by machine failure is an example of a problem requiring analysts
to find patterns in data. For example, by analyzing maintenance data, they might discover that
most failures happen if regular maintenance is delayed by more than 15 days.
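
As a rough sketch of how an analyst might check for a pattern like this, the Python below compares failure rates for maintenance done within the 15-day limit against maintenance delayed beyond it. The maintenance log is entirely hypothetical, and names such as `failure_rate` are made up for this example.

```python
# Hypothetical maintenance log: (days maintenance was delayed, machine failed afterwards?)
maintenance_log = [
    (0, False), (3, False), (7, False), (12, True), (14, False),
    (16, True), (18, True), (21, False), (25, True), (30, True),
]


def failure_rate(records):
    """Share of records in which the machine failed."""
    if not records:
        return 0.0
    return sum(1 for _, failed in records if failed) / len(records)


DELAY_LIMIT_DAYS = 15  # threshold taken from the example in the text

on_time = [rec for rec in maintenance_log if rec[0] <= DELAY_LIMIT_DAYS]
late = [rec for rec in maintenance_log if rec[0] > DELAY_LIMIT_DAYS]

on_time_rate = failure_rate(on_time)
late_rate = failure_rate(late)
print(f"within limit: {on_time_rate:.0%}, beyond limit: {late_rate:.0%}")
```

A large gap between the two rates would suggest a pattern worth investigating, although a log this small could not prove it.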

Key takeaway
As you move through this program, you will develop a sharper eye for problems and you will
practice thinking through the problem types when you begin your analysis. This method of
problem solving will help you figure out solutions that meet the needs of all stakeholders.
More about SMART questions
Companies in lots of industries today are dealing with rapid change and rising uncertainty. Even
well-established businesses are under pressure to keep up with what is new and figure out what is
next. To do that, they need to ask questions. Asking the right questions can help spark the
innovative ideas that so many businesses are hungry for these days.
The same goes for data analytics. No matter how much information you have or how advanced
your tools are, your data won’t tell you much if you don’t start with the right questions. Think of
it like a detective with tons of evidence who doesn’t ask a key suspect about it. Coming up, you
will learn more about how to ask highly effective questions, along with certain practices you want
to avoid.

➢ Highly effective questions are SMART questions:

• Specific: Is the question specific? Does it address the problem? Does it have context? Will it
uncover a lot of the information you need?

• Measurable: Will the question give you answers that you can measure?

• Action-oriented: Will the answers provide information that helps you devise some type of
plan?

• Relevant: Is the question about the particular problem you are trying to solve?

• Time-bound: Are the answers relevant to the specific time being studied?

Examples of SMART questions


Here's an example that breaks down the thought process of turning a problem question into one or
more SMART questions using the SMART method: What features do people look for when buying
a new car?

• Specific: Does the question focus on a particular car feature?
• Measurable: Does the question include a feature rating system?
• Action-oriented: Does the question influence creation of different or new feature
packages?
• Relevant: Does the question identify which features make or break a potential car
purchase?
• Time-bound: Does the question validate data on the most popular features from the last
three years?
Questions should be open-ended. This is the best way to get responses that will help you
accurately qualify or disqualify potential solutions to your specific problem. So, based on the
thought process, possible SMART questions might be:
• On a scale of 1-10 (with 10 being the most important), how important is your car having
four-wheel drive? Explain.
• What are the top five features you would like to see in a car package?
• What features, if included with four-wheel drive, would make you more inclined to buy
the car?
• How does a car having four-wheel drive contribute to its value, in your opinion?

Things to avoid when asking questions

➢ Leading questions: questions that only have a particular response

Example: This product is too expensive, isn’t it?


This is a leading question because it suggests an answer as part of the question. A better question
might be, “What is your opinion of this product?” There are tons of answers to that question, and
they could include information about usability, features, accessories, color, reliability, and
popularity, on top of price. Now, if your problem is actually focused on pricing, you could ask a
question like “What price (or price range) would make you consider purchasing this product?”
This question would provide a lot of different measurable responses.

➢ Closed-ended questions: questions that ask for a one-word or brief response only

Example: Were you satisfied with the customer trial?

This is a closed-ended question because it doesn’t encourage people to expand on their answer. It
is really easy for them to give one-word responses that aren’t very informative. A better question
might be, “What did you learn about customer experience from the trial?” This encourages people
to provide more detail besides “It went well.”

➢ Vague questions: questions that aren’t specific or don’t provide context

Example: Does the tool work for you?


This question is too vague because there is no context. Is it about comparing the new tool to the
one it replaces? You just don’t know. A better inquiry might be, “When it comes to data entry, is
the new tool faster, slower, or about the same as the old tool? If faster, how much time is saved?
If slower, how much time is lost?” These questions give context (data entry) and help frame
responses that are measurable (time).

END MODULE 1
MODULE 2

Terms and definitions for Course 2, Module 2

• Algorithm: A process or set of rules followed for a specific task
• Big data: Large, complex datasets typically involving long periods of time, which enable
data analysts to address far-reaching business problems
• Dashboard: A tool that monitors live, incoming data
• Data-inspired decision-making: The process of exploring different data sources to find out
what they have in common
• Metric: A single, quantifiable type of data that is used for measurement
• Metric goal: A measurable goal set by a company and evaluated using metrics
• Pivot chart: A chart created from the fields in a pivot table
• Pivot table: A data summarization tool used to sort, reorganize, group, count, total, or
average data
• Problem types: The various problems that data analysts encounter, including categorizing
things, discovering connections, finding patterns, identifying themes, making predictions,
and spotting something unusual
• Qualitative data: A subjective and explanatory measure of a quality or characteristic
• Quantitative data: A specific and objective measure, such as a number, quantity, or range
• Report: A static collection of data periodically given to stakeholders
• Return on investment (ROI): A formula that uses the metrics of investment and profit to
evaluate the success of an investment
• Revenue: The total amount of income generated by the sale of goods or services
• Small data: Small, specific data points typically involving a short period of time, which are
useful for making day-to-day decisions

Data trials and triumphs

Introduction
A data analytics professional’s job is to provide the data necessary to inform key decisions. They
also need to frame their analysis in a way that helps business leaders make the best possible
decisions. In this reading, you’re going to explore the role of data in decision-making and the
reasons why data analytics professionals are so important to this process. You’ll compare data-
driven and data-inspired decisions to understand the difference between them. You’ll also check
out some examples where projects failed or succeeded based on how the data was applied.
Both data-driven and data-inspired approaches are rooted in the idea that data is inherently valuable
for making a decision. Well-curated data can provide information to decision-makers that
improves the quality of their decisions. Remember: Data does not make decisions, but it does
improve them.

➢ Data-driven decisions
As you’ve been learning, data-driven decision-making means using facts to guide business
strategy. The phrase “data-driven decisions” means exactly that: Data is used to arrive at a
decision. This approach is limited by the quantity and quality of readily available data. If the
quality and quantity of the data are sufficient, this approach can greatly improve decision-making. But
if the data is insufficient or biased, this can create problems for decision-makers. Potential dangers
of relying entirely on data-driven decision-making can include overreliance on historical data, a
tendency to ignore qualitative insights, and potential biases in data collection and analysis.

Example of a data-driven decision


A/B testing is a simple example of collecting data for data-driven decision-making. For example,
a website that sells widgets has an idea for a new website layout they think will result in more
people buying widgets. For two weeks, half of their website visitors are directed to the old site;
the other half are directed to the new site. After those two weeks, the analyst gathers the data about
their website visitors and the number of widgets sold for analysis. This helps the analyst understand
which website layout resulted in more widget sales. If the new website performed better in
producing widget sales, then the company can confidently make the decision to use the new layout!
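
The comparison described above can be sketched in Python. The visitor and sales counts below are hypothetical, and the two-proportion z-statistic is an addition that the reading does not mention: it is one common way to gauge whether a gap between two conversion rates is larger than chance alone would explain.

```python
from math import sqrt


def conversion_rate(sales, visitors):
    """Share of visitors who bought a widget."""
    return sales / visitors


def two_proportion_z(sales_a, visitors_a, sales_b, visitors_b):
    """z-statistic for the difference between two conversion rates."""
    rate_a = conversion_rate(sales_a, visitors_a)
    rate_b = conversion_rate(sales_b, visitors_b)
    pooled = (sales_a + sales_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    return (rate_b - rate_a) / std_error


# Hypothetical two-week totals; these numbers are invented for illustration.
old_visitors, old_sales = 5000, 200
new_visitors, new_sales = 5000, 260

old_rate = conversion_rate(old_sales, old_visitors)
new_rate = conversion_rate(new_sales, new_visitors)
z_score = two_proportion_z(old_sales, old_visitors, new_sales, new_visitors)
print(f"old layout: {old_rate:.1%}, new layout: {new_rate:.1%}, z = {z_score:.2f}")
```

As a rule of thumb, a z-statistic above roughly 2 is often read as evidence that the difference is unlikely to be chance alone, but the right cutoff depends on how much risk the business is willing to accept.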

➢ Data-inspired decisions
Data-inspired decisions include the same considerations as data-driven decisions while adding
another layer of complexity. They create space for people using data to consider a broader range
of ideas: drawing on comparisons to related concepts, giving weight to feelings and experiences,
and considering other qualities that may be more difficult to measure. Data-inspired decision-
making can avoid some of the pitfalls that data-driven decisions might be prone to.
Example of a data-inspired decision
A customer support center gathers customer satisfaction data (often known as a “CSAT” score).
They use a simple 1–10 score along with a qualitative description in which the customer describes
their experience. The customer support center manager wants to improve customer experience, so
they set a goal to improve the CSAT score. They start by analyzing the CSAT scores and reading
each of the descriptions from the customers. Additionally, they interview the people working in
the customer support center. From there, the manager formulates a strategy and decides what needs
to improve the most in order to raise customer satisfaction scores. While the manager certainly
relies on the CSAT data in the decision-making process, input of support center representatives
and other qualitative information informs the approach as well.

A data analysis triumph

When data is used strategically, businesses can transform and grow their revenue. Consider the
example below.

PepsiCo

Since the days of the New Coke launch, things have changed dramatically for beverage and other
consumer packaged goods (CPG) companies.
According to a Think with Google article by Shyam Venugopal, PepsiCo “hired analytical talent
and established cross-functional workflows around an infrastructure designed to put consumers’
needs first. Then [the company] set up the right processes to make critical decisions based on data
and technology use cases. Finally, [it] invested in the right technology stack and platforms so that
data could flow into a central cloud-based hub. This is critical. When data comes together, we
develop a holistic understanding of the consumer and their journeys.”
In this data-inspired decision, PepsiCo is not just using its own set of data, but also employing
external sources to supplement its datasets and expand its market reach. Learn about how PepsiCo
is delivering a more personal and valuable experience to customers using data in How one of the
world’s biggest marketers ripped up its playbook and learned to anticipate intent.

➢ Data analysis failures


You’ve been learning why data is such a powerful business tool and how data analysts help their
companies make data-driven decisions for great results. Using data to draw accurate conclusions
and make good recommendations starts with having complete, correct, and relevant data.
Note: It’s important to remember that it’s possible to have solid data and still make the wrong
choices. It’s up to data analysts to interpret the data accurately. When data is interpreted
incorrectly, that incorrect interpretation can lead to huge losses. Consider the following.

➢ Coke launch failure
In 1985, New Coke was launched, replacing the classic Coke formula. The company had done
taste tests with 200,000 people and found that test subjects preferred the taste of New Coke over
Pepsi, which had become a tough competitor. Based on this data alone, classic Coke was taken off
the market and replaced with New Coke. The company thought this was the solution to take back
the market share that had been lost to Pepsi.
But as it turns out, New Coke was very unpopular—and the company ended up losing tens of
millions of dollars. The data seemed correct, but it was incomplete: The data didn't consider how
customers would feel about New Coke replacing classic Coke. The company’s decision to retire
classic Coke was a data-driven decision based on incomplete data.

➢ Mars Orbiter loss


In 1999, NASA lost the $125 million Mars Climate Orbiter even though the teams had good data.
The spacecraft burned to pieces because of poor collaboration and communication. The Orbiter’s
navigation team was using the International System of Units (newtons) for their force calculations,
but the engineers who built the spacecraft used the English Engineering Units system (pounds) for
force calculations.
No one realized there was a problem until the Orbiter burst into flames in the Martian atmosphere.
Later, a NASA review board investigating the cause of the problem discovered the issue was in
the software that controlled the thrusters. One program calculated the thrusters’ force in pounds;
another program working with the data assumed it was in newtons. The software controllers were
making data-driven decisions to adjust the thrust based on 100% accurate data, but these decisions
were wrong because of inaccurate assumptions when interpreting it. The two teams could have
communicated so that they picked a single unit of measure, or so that the analysts knew that
conversion was a necessary step in the process to prepare the data. A conversion of the data from
one system of measurement to the other could have prevented the loss.
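
To show how large the mismatch described above is, here is a small Python sketch of the missing conversion step. The thruster force value is hypothetical. The conversion factor is the standard definition of one pound-force in newtons.

```python
LBF_TO_NEWTON = 4.4482216152605  # one pound-force expressed in newtons


def pounds_force_to_newtons(force_lbf):
    """Convert a force from pounds-force to newtons."""
    return force_lbf * LBF_TO_NEWTON


reported_force_lbf = 10.0  # hypothetical thruster force reported in pounds-force

# What happens when the receiving program assumes the number is already in newtons:
misread_as_newtons = reported_force_lbf

# What the receiving program should have used after converting:
converted_newtons = pounds_force_to_newtons(reported_force_lbf)

error_factor = converted_newtons / misread_as_newtons
print(f"misread: {misread_as_newtons} N, converted: {converted_newtons:.2f} N")
print(f"the unconverted value is too small by a factor of about {error_factor:.2f}")
```

Keeping the unit in each variable name, as done here with `_lbf` and `_newtons`, is one simple habit that makes this kind of mistake easier to catch.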
There’s a difference between making a decision with incomplete data and making a decision with
a small amount of data. You learned that making a decision with incomplete data is dangerous.
But sometimes accurate data from a small test can help you make a good decision. Stay tuned:
You’ll learn about how much data to collect later in the program.

Key takeaways
As a data analyst, you’ll rarely need to consider, “Am I being data-driven or data-inspired?” It’s
helpful to have some context for these two approaches, though your own skills and knowledge will
be the most important parts of any analysis project. So, keep a data-driven mindset and ask lots of
questions. Experiment with many different possibilities. And use both logic and creativity along
the way. Using this approach, you’ll be prepared to interpret your data with the highest levels of
care and accuracy.
Qualitative and quantitative data in business

This reading further elaborates on the meaning of qualitative versus quantitative.


As you have learned, there are two types of data: qualitative and quantitative.
Now, take a closer look at the data types and data collection tools. In this scenario, you are a data
analyst for a chain of movie theaters. Your manager wants you to track trends in:

• Movie attendance over time
• Profitability of the concession stand
• Evening audience preferences

Assume quantitative data already exists to monitor all three trends.

Movie attendance over time


Starting with the historical data the theater has through its loyalty and rewards program, your first
step is to investigate what insights you can gain from that data. You look at attendance over the
last 3 months. But, because the last 3 months didn’t include a major holiday, you decide it is better
to look at a full year’s worth of data. As you suspected, the quantitative data confirmed that average
attendance was 550 per month but then rose to an average of 1,600 per month for the months with
holidays.
The historical data serves your needs for the project, but you also decide that you will resume the
analysis again in a few months after the theater increases ticket prices for evening showtimes.

Profitability of the concession stand


Profit is calculated by subtracting cost from sales revenue. The historical data shows that while the
concession stand was profitable, profit margins were razor thin at less than 5%. You saw that
average purchases totaled $20 or less. You decide that you will keep monitoring this on an ongoing
basis. Based on your understanding of data collection tools, you will suggest an online survey of
customers so they can comment on the food at the concession stand. This will enable you to gather
qualitative data to revamp the menu and potentially increase profits.
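
The profit calculation described above can be sketched in a few lines of Python. This is only an illustration: the revenue and cost figures are hypothetical (chosen so the margin lands under 5%), and the sketch assumes that "profit margin" means profit divided by sales revenue.

```python
def profit(sales_revenue: float, cost: float) -> float:
    """Profit is sales revenue minus cost."""
    return sales_revenue - cost


def profit_margin(sales_revenue: float, cost: float) -> float:
    """Profit as a fraction of sales revenue (assumed definition of margin)."""
    if sales_revenue == 0:
        raise ValueError("sales revenue must be non-zero to compute a margin")
    return profit(sales_revenue, cost) / sales_revenue


# Hypothetical monthly figures for the concession stand (not from the theater's data).
revenue = 10_000.00
cost = 9_600.00

print(f"Profit: ${profit(revenue, cost):,.2f}")
print(f"Margin: {profit_margin(revenue, cost):.1%}")
```

A margin this thin means a small rise in cost could erase the profit, which is why ongoing monitoring makes sense here.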

Evening audience preferences


Your analysis of the historical data shows that the 7:30 PM showtime was the most popular and
had the greatest attendance, followed by the 7:15 PM and 9:00 PM showtimes. You may suggest
replacing the current 8:00 PM showtime that has lower attendance with an 8:30 PM showtime. But
you need more data to back up your hunch that people would be more likely to attend the later
show.
Evening movie-goers are the largest source of revenue for the theater. Therefore, you also decide
to include a question in your online survey to gain more insight.
Qualitative data for all three trends plus ticket pricing
Since you know that the theater is planning to raise ticket prices for evening showtimes in a few
months, you will also include a question in the survey to get an idea of customers’ price sensitivity.
Your final online survey might include these questions for qualitative data:
1. What went into your decision to see a movie in our theater today? (movie attendance)
2. What do you think about the quality and value of your purchases at the concession stand?
(concession stand profitability)
3. Which showtime do you prefer, 8:00 PM or 8:30 PM, and why do you prefer that time?
(evening movie-goer preferences)
4. Under what circumstances would you choose a matinee over a nighttime showing? (ticket
price increase)

Key takeaways

Data analysts will generally use both types of data in their work. Usually, qualitative data can help
analysts better understand their quantitative data by providing a reason or more thorough
explanation. In other words, quantitative data generally gives you the what, and qualitative data
generally gives you the why. By using both quantitative and qualitative data, you can learn when
people like to go to the movies and why they chose the theater. Maybe they really like the reclining
chairs, so your manager can purchase more recliners. Maybe the theater is the only one that serves
root beer. Maybe a later show time gives them more time to drive to the theater from where popular
restaurants are located. Maybe they go to matinees because they have kids and want to save money.
You wouldn’t have discovered this information by analyzing only the quantitative data for
attendance, profit, and showtimes.

Tools for visualizing data

In this course, you’ll work with Tableau and spreadsheets. Both of these tools have advantages
and disadvantages. Often, data analysts will discover they need to use multiple tools, even on a
single project. What you use will largely be determined by the work you’re doing and your goals.
This reading explores two of the tools you might use to visualize and present data: spreadsheets
and Tableau.

 Spreadsheets
Google Workspace and Microsoft Office Suite both offer spreadsheet applications. You’ve worked
with Google Sheets in this course, and it’s very similar in function to Microsoft Excel. If you want
to compare some of the features of Sheets to Excel, check out the Microsoft video Create a chart
from start to finish.
Both Sheets and Excel are go-to choices for creating static charts and graphs. They offer basic
data visualization capabilities that are often enough for simple visualizations. In addition, you can
use them to clean, sort, and filter data. And both offer a range of chart types, graphing tools, and
pivot tables for creating effective data visualizations. These charts are easy to manage; they update
when the source data is updated, so they don’t require much manual intervention once
implemented.
Sheets and Excel are connected to other apps in their product suites. Google Docs and Slides are
very similar to Microsoft Word and PowerPoint, for example. You can incorporate data
visualizations from Sheets or Excel into reports and documents in Docs and Word. Presentation
programs such as Slides and PowerPoint allow you to create engaging presentations that include
data visualizations so you can share insights in a presentation format. Learn more about the power
of this interconnectivity among Google tools in the article Link a chart, table, or slides to Google
Docs or Slides.

 Tableau
Tableau is used to create powerful and interactive visualizations, making it an excellent choice
for data visualizations such as live dashboards. Tableau also makes it easy to create charts, graphs,
and dashboards in a drag-and-drop interface. The application supports a wide range of data sources
and provides advanced analytics capabilities. These features allow for in-depth exploration of data
trends and patterns.
Tableau is particularly useful for creating visualizations using huge datasets, like in this World
Happiness Report by Sustainable Development Solutions which uses global reporting data on
different countries' happiness ratings. Likewise, this visualization of Population and Housing State
Data from 2020 United States Census Data compares population rates in the United States and
available housing.
Tableau is widely known and used for its versatility and power, but it can take quite a bit of time
to learn to use Tableau effectively. Soon, you’ll begin practicing with Tableau. But if you’d like
to check it out now, there is a free environment you can access at Tableau Public.

Key takeaways
There are many visualization tools you will have the opportunity to use as a data professional.
Different tools have different advantages and disadvantages. Although Tableau ultimately has
more power than a basic spreadsheet application, it’s most often used for specific cases and to
work with large datasets. Don’t underestimate how much you can do with spreadsheets or how
powerful interconnectivity between apps can be!
Most of the time, especially for something like a quick report, you’re more likely to reach into
your toolkit for your spreadsheet app of choice. But your data career will definitely benefit from
Tableau, so as you progress take advantage of opportunities to work with it. With so many different
data analysis situations, familiarity with all of these tools will help you know which is the best for
each situation.
Design compelling dashboards

Dashboards are powerful visual tools that help you tell your data story. A dashboard is a tool that
monitors live, incoming data. It organizes information from multiple datasets into one central
location, offering huge time savings. Data analysts use dashboards to track, analyze, and
visualize data in order to answer questions and solve problems. For a basic idea of what
dashboards look like, refer to this article: “Real-world examples of business intelligence
dashboards.”

The beauty of dashboards

The following table summarizes the benefits of using a dashboard for both data analysts and
their stakeholders.

Benefits | For data analysts | For stakeholders
Centralization | Share a single source of data with all stakeholders | Work with a comprehensive view of data, initiatives, objectives, projects, processes, and more
Visualization | Show and update live, incoming data in real time* | Spot changing trends and patterns more quickly
Insightfulness | Pull relevant information from different datasets | Understand the story behind the numbers to keep track of goals and make data-driven decisions
Customization | Create custom views dedicated to a specific person, project, or presentation of the data | Drill down to more specific areas of specialized interest or concern

*It’s important to remember that changed data is pulled into dashboards automatically only if the
data structure is the same. If the data structure changes, you have to update the dashboard design
before the data can update live.
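
One way to picture this caveat is a check that runs before each refresh. The following is a minimal sketch, not a description of how Tableau or any other tool works internally; the column names and the `can_refresh` helper are hypothetical.

```python
EXPECTED_COLUMNS = ["date", "region", "sales"]  # hypothetical dashboard data structure


def can_refresh(incoming_columns: list[str]) -> bool:
    """Return True only when incoming data has the structure the dashboard was designed for."""
    return incoming_columns == EXPECTED_COLUMNS


same_structure = ["date", "region", "sales"]
changed_structure = ["date", "territory", "sales", "returns"]

print(can_refresh(same_structure))     # structure unchanged: data can update live
print(can_refresh(changed_structure))  # structure changed: update the design first
```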
Tableau

There are many different visualization tools available. One of the most powerful is Tableau, which
supports a range of data sources and has advanced analytics capabilities that allow for in-depth
exploration of data trends and patterns. Tableau can handle more data and larger datasets than
many other tools and offers real-time data availability.
It does take some time to learn to use Tableau, but your efforts can be well-rewarded, as Tableau
visualizations are pleasantly interactive. For a dashboard to be successful, it needs to engage users
and help them learn. Tableau has put in a lot of effort to ensure that its users have a great experience
and the platform is accessible to everyone.

 Create a dashboard
Here’s a process you can follow to create a dashboard, whether in Tableau or another visualization
tool:

1. Identify the stakeholders who need to see the data and how they will use it

Begin by asking effective questions. Check out this dashboard requirements gathering worksheet
to explore a wide range of good questions you can use to identify relevant stakeholders and their
data needs. This is a great resource to help guide you through this process again and again.

2. Design the dashboard (what should be displayed)

Use these tips to help make your dashboard design clear and easy to follow:

 Use a clear header to label the information.
 Add short text descriptions to each visualization.
 Show the most important information at the top.

3. Create mockups if desired

A mockup is a simple draft of a visualization used for planning a dashboard and evaluating its
progress. This is optional, but a lot of data analysts like to sketch out their dashboards before
creating them.

4. Select the visualizations

You have a lot of options here. Which visualizations you select depends on the data story you are
telling. If you need to show a change in values over time, line charts or bar graphs might be the
best choice. If your goal is to show how each part contributes to the whole amount being reported,
a pie or donut chart is probably a better choice.
To learn more about choosing the right visualizations, check out Tableau’s galleries:

 For more samples of area charts, column charts, and other visualizations, visit the Tableau
Dashboard Showcase. This gallery is full of great examples that were created using real data;
explore this resource on your own to get some inspiration.
 Explore Tableau’s Viz of the Day to check out visualizations curated by the community. These are
visualizations created by Tableau users and are a great way to learn more about how other data
analysts are using data visualization tools.

5. Create filters as needed

Filters show certain data while hiding the rest of the data in a dashboard. This can be a big help to
identify patterns while keeping the original data intact. It’s common for data analysts to use and
share the same dashboard, but manage their part of it with a filter. To dig deeper into filters and
find an example of filters in action, visit Tableau’s page on Filter Actions. This is a useful resource
to save and come back to when you start practicing using filters in Tableau on your own.
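
The idea behind a filter can also be sketched outside of any dashboard tool. In the Python example below, the rows, field names, and the `apply_filter` helper are all hypothetical; the point is only that a filter returns a narrowed view while the original data stays intact.

```python
# Hypothetical rows behind a shared dashboard; names and values are illustrative only.
rows = [
    {"analyst": "A", "region": "North", "sales": 120},
    {"analyst": "B", "region": "South", "sales": 95},
    {"analyst": "A", "region": "South", "sales": 80},
]


def apply_filter(data, **criteria):
    """Return a new list with only the rows matching every criterion; the input is not modified."""
    return [row for row in data if all(row.get(k) == v for k, v in criteria.items())]


analyst_a_view = apply_filter(rows, analyst="A")

print(len(analyst_a_view))  # number of rows visible in analyst A's view
print(len(rows))            # the original data still has every row
```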

Key takeaways

Just like how the dashboard on an airplane shows the pilot their flight path, your dashboard does
the same for your stakeholders. It helps them navigate the path of a project inside the data. If you
add clear markers and highlight important points on your dashboard, users will understand where
your data story is headed. Then, you can work together to make sure the business gets where it
needs to go.
Big and small data

As a data analyst, you will work with data both big and small. Both kinds of data are valuable,
but they play very different roles.
Whether you work with big or small data, you can use it to help stakeholders improve business
processes, answer questions, create new products, and much more. But there are certain
challenges and benefits that come with big data and the following table explores the differences
between big and small data.

Small data | Big data
Describes a dataset made up of specific metrics over a short, well-defined time period | Describes large, less-specific datasets that cover a long time period
Usually organized and analyzed in spreadsheets | Usually kept in a database and queried
Likely to be used by small and midsize businesses | Likely to be used by large organizations
Simple to collect, store, manage, sort, and visually represent | Takes a lot of effort to collect, store, manage, sort, and visually represent
Usually already a manageable size for analysis | Usually needs to be broken into smaller pieces in order to be organized and analyzed effectively for decision-making
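
The last point in the comparison, breaking big data into smaller pieces, can be sketched in Python. This is only an illustration under stated assumptions: the `in_chunks` helper is hypothetical, and a range of integers stands in for a dataset that would normally live in a database.

```python
from itertools import islice
from typing import Iterable, Iterator


def in_chunks(records: Iterable[int], size: int) -> Iterator[list[int]]:
    """Yield successive lists of at most `size` records."""
    iterator = iter(records)
    while chunk := list(islice(iterator, size)):
        yield chunk


# Stand-in for a dataset too large to handle in one piece.
records = range(1, 1_000_001)

total = 0
for chunk in in_chunks(records, size=100_000):
    total += sum(chunk)  # analyze one manageable piece at a time

print(total)
```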

Challenges and benefits


Here are some challenges you might face when working with big data:
 A lot of organizations deal with data overload and way too much unimportant or
irrelevant information.
 Important data can be hidden deep down with all of the non-important data, which makes
it harder to find and use. This can lead to slower and more inefficient decision-making
time frames.
 The data you need isn’t always easily accessible.
 Current technology tools and solutions still struggle to provide measurable and reportable
data. This can lead to unfair algorithmic bias.
 There are gaps in many big data business solutions.
Now for the good news! Here are some benefits that come with big data:
 When large amounts of data can be stored and analyzed, it can help companies identify
more efficient ways of doing business and save a lot of time and money.
 Big data helps organizations spot the trends of customer buying patterns and satisfaction
levels, which can help them create new products and solutions that will make customers
happy.
 By analyzing big data, businesses get a much better understanding of current market
conditions, which can help them stay ahead of the competition.
 As in our earlier social media example, big data helps companies keep track of their
online presence—especially feedback, both good and bad, from customers. This gives
them the information they need to improve and protect their brand.

The three (or four) V words for big data


When thinking about the benefits and challenges of big data, it helps to think about the three Vs:
volume, variety, and velocity. Volume describes the amount of data. Variety describes the
different kinds of data. Velocity describes how fast the data can be processed. Some data
analysts also consider a fourth V: veracity. Veracity refers to the quality and reliability of the
data. These are all important considerations related to processing huge, complex datasets.

Volume | Variety | Velocity | Veracity
The amount of data | The different kinds of data | How fast the data can be processed | The quality and reliability of the data

END MODULE 2
MODULE 3

Terms and definitions for Course 2, Module 3

 AVERAGE: A spreadsheet function that returns an average of the values from a selected
range
 Borders: Lines that can be added around two or more cells on a spreadsheet
 Cell reference: A cell or a range of cells in a worksheet typically used in formulas and
functions
 COUNT: A spreadsheet function that counts the number of cells in a range that meet
specific criteria
 Equation: A calculation that involves addition, subtraction, multiplication, or division (also
called a math expression)
 Fill handle: A box in the lower-right-hand corner of a selected spreadsheet cell that can be
dragged through neighboring cells in order to continue an instruction
 Filtering: The process of showing only the data that meets specified criteria while hiding
the rest
 Header: The first row in a spreadsheet that labels the type of data in each column
 Math expression: A calculation that involves addition, subtraction, multiplication, or
division (also called an equation)
 Math function: A function that is used as part of a mathematical formula
 MAX: A spreadsheet function that returns the largest numeric value from a range of cells
 MIN: A spreadsheet function that returns the smallest numeric value from a range of cells
 Open data: Data that is available to the public
 Operator: A symbol that names the operation or calculation to be performed
 Order of operations: Using parentheses to group together spreadsheet values in order to
clarify the order in which operations should be performed
 Problem domain: The area of analysis that encompasses every activity affecting or affected
by a problem
 Range: A collection of two or more cells in a spreadsheet
 Report: A static collection of data periodically given to stakeholders
 Return on investment (ROI): A formula that uses the metrics of investment and profit to
evaluate the success of an investment
 Revenue: The total amount of income generated by the sale of goods or services
 Scope of work (SOW): An agreed-upon outline of the tasks to be performed during a
project
 Sorting: The process of arranging data into a meaningful order to make it easier to
understand, analyze, and visualize
 SUM: A spreadsheet function that adds the values of a selected range of cells
Spreadsheets and the data life cycle

To better understand the benefits of using spreadsheets in data analytics, let’s explore how they
relate to each phase of the data life cycle: plan, capture, manage, analyze, archive, and destroy.

Plan for the users who will work within a spreadsheet by developing organizational standards.
This can mean formatting your cells, the headings you choose to highlight, the color scheme, and
the way you order your data points. When you take the time to set these standards, you will improve
communication, ensure consistency, and help people be more efficient with their time.

Capture data at the source by connecting spreadsheets to other data sources, such as an online
survey application or a database. This data will automatically be updated in the spreadsheet. That
way, the information is always as current and accurate as possible.

Manage different kinds of data with a spreadsheet. This can involve storing, organizing, filtering,
and updating information. Spreadsheets also let you decide who can access the data, how the
information is shared, and how to keep your data safe and secure.

Analyze data in a spreadsheet to help make better decisions. Some of the most common
spreadsheet analysis tools include formulas to aggregate data or create reports, and pivot tables for
clear, easy-to-understand visuals.

Archive, using built-in tools, any spreadsheet that you don’t use often but might need to reference
later. This is especially useful if you want to store historical data before it gets updated.

Destroy your spreadsheet when you are certain that you will never need it again, if you have
better backup copies, or for legal or security reasons. Keep in mind, lots of businesses are required
to follow certain rules or have measures in place to make sure data is destroyed properly.
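
To make the Analyze phase more concrete, here is a rough Python sketch of the aggregation functions defined in this module’s glossary (SUM, AVERAGE, COUNT, MAX, and MIN). The cell values are hypothetical. One detail worth naming: in Google Sheets and Excel, COUNT counts only the cells that contain numbers, so the sketch filters out text and empty cells first.

```python
from statistics import mean

# A hypothetical column of cells; text and empty cells are mixed in on purpose.
cells = [12, 7.5, "n/a", None, 30, 4]

# Keep only numeric cells, the way spreadsheet aggregation functions skip text and blanks.
numbers = [c for c in cells if isinstance(c, (int, float)) and not isinstance(c, bool)]

print(sum(numbers))   # like SUM
print(mean(numbers))  # like AVERAGE
print(len(numbers))   # like COUNT (numeric cells only)
print(max(numbers))   # like MAX
print(min(numbers))   # like MIN
```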

Resources for more information


Spreadsheet shortcuts can help you become more efficient with spreadsheets. If you’d like to learn
more, you can explore the collection of Google Sheets shortcuts, or visit the Microsoft Excel
shortcuts page if you are using Excel. Both of these resources contain a list of spreadsheet shortcuts
you can save and reference as you work more with spreadsheets on your own.
Formulas in a spreadsheet

Now that you have learned some basic ways to avoid errors, you can focus on what to do when
that dreaded pop-up does appear. The following table is a reference you can use to look up
common spreadsheet errors and examples of each. Knowing what the errors mean takes some of
the fear out of getting them.

Error | Description | Example
#DIV/0! | A formula is trying to divide a value in a cell by 0 (or an empty cell with no value) | =B2/B3, when the cell B3 contains the value 0
#ERROR! | (Google Sheets only) Something can’t be interpreted as it has been input. This is also known as a parsing error. | =COUNT(B1:D1 C1:C10) is invalid because the cell ranges aren't separated by a comma
#N/A | A formula can't find the data | The cell being referenced can't be found
#NAME? | The name of a formula or function used isn't recognized | The name of a function is misspelled
#NUM! | The spreadsheet can't perform a formula calculation because a cell has an invalid numeric value | =DATEDIF(A4, B4, "M") is unable to calculate the number of months between two dates because the date in cell A4 falls after the date in cell B4
#REF! | A formula is referencing a cell that isn't valid | A cell used in a formula was in a column that was deleted
#VALUE! | A general error indicating a problem with a formula or with referenced cells | There could be problems with spaces or text, or with referenced cells in a formula; you may have additional work to find the source of the problem.

If you are working with Microsoft Excel, an interactive page, How to correct a #VALUE! error,
can help you narrow down the cause of this error. You can select a specific function from a drop-
down list to display a link to tips to fix the error when using that function.
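
If you know a little Python, two of the errors in the table can be mimicked in code to show what triggers them. This is a loose sketch, not how any spreadsheet application is implemented; the `divide` and `months_between` helpers are hypothetical and simply return the spreadsheet error label as text.

```python
from datetime import date


def divide(numerator, denominator):
    """Mimic =B2/B3: a zero or empty (None) divisor yields the #DIV/0! label."""
    if not denominator:
        return "#DIV/0!"
    return numerator / denominator


def months_between(start: date, end: date):
    """Mimic =DATEDIF(start, end, "M"): whole months, or #NUM! if start falls after end."""
    if start > end:
        return "#NUM!"
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


print(divide(10, 0))
print(divide(10, None))
print(months_between(date(2024, 5, 1), date(2024, 1, 1)))
print(months_between(date(2024, 1, 15), date(2024, 4, 14)))
```

Seeing the condition that produces each label makes the fix easier to spot: check the divisor for #DIV/0!, and check the order of the two dates for #NUM!.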

The importance of context

Context in data analytics is the condition and circumstances that surround and give meaning to the
data. Context is important in data analytics because it helps make disorganized data accessible and
understood. The fact is, data has little value if it is not paired with context.
Understanding the context behind the data can help us make it more meaningful at every stage of
the data analysis process. For example, you might be able to make a few guesses about what you're
looking at in the following table, but you couldn't be certain without more context.

2010 28000

2005 18000

2000 23000

1995 10000

On the other hand, if the first column was labeled to represent the years when a survey was
conducted, and the second column showed the number of people who responded to that survey,
then the table would start to make a lot more sense. Take this a step further, and you might notice
that the survey is conducted every 5 years. This added context helps you understand why there are
five-year gaps in the table.

Years (Collected every 5 years) Respondents

2010 28000

2005 18000

2000 23000

1995 10000
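
The same step, from bare numbers to labeled data, can be sketched in Python. The values come from the table above; the field names `survey_year` and `respondents` are assumptions chosen for illustration.

```python
# The raw, unlabeled values from the first table.
raw_rows = [(2010, 28000), (2005, 18000), (2000, 23000), (1995, 10000)]

# Adding context: name each column so every value says what it measures.
labeled_rows = [{"survey_year": year, "respondents": count} for year, count in raw_rows]

# With labels in place, the collection cycle is easy to check.
years = sorted(row["survey_year"] for row in labeled_rows)
gaps = {later - earlier for earlier, later in zip(years, years[1:])}

print(labeled_rows[0])
print(gaps)  # the set of gaps, in years, between consecutive surveys
```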
Context can turn raw data into meaningful information. It is very important for data analysts
to contextualize their data. This means giving the data perspective by defining it. To do this, you
need to identify:

 Who: The person or organization that created, collected, and/or funded the data collection

 What: The things in the world that data could have an impact on

 Where: The origin of the data

 When: The time when the data was created or collected

 Why: The motivation behind the creation or collection

 How: The method used to create or collect it

Understanding and including the context is important during each step of your analysis process,
so it is a good idea to get comfortable with it early in your career. For example, when you collect
data, you’ll also want to ask questions about the context to make sure that you understand the
business and business process. During organization, the context is important for your naming
conventions, how you choose to show relationships between variables, and what you choose to
keep or leave out. And finally, when you present, it is important to include contextual information
so that your stakeholders understand your analysis.

END MODULE 3
MODULE 4

Terms and definitions for Course 2, Module 4

 Cloud: A place to keep data online, rather than on a computer hard drive
 Reframing: Restating a problem or challenge, then redirecting it toward a potential
resolution
 Turnover rate: The rate at which employees voluntarily leave a company

Working with stakeholders

Your data analysis project should answer the business task and create opportunities for data-driven
decision-making. That's why it is so important to focus on project stakeholders. As a data analyst,
it is your responsibility to understand and manage your stakeholders’ expectations while keeping
the project goals front and center.
You might remember that stakeholders are people who have invested time, interest, and resources
into the projects that you are working on. This can be a pretty broad group, and your project
stakeholders may change from project to project. But there are three common stakeholder groups
that you might find yourself working with: the executive team, the customer-facing team, and the
data science team.
Let’s get to know more about the different stakeholders and their goals. Then we'll learn some
tips for communicating with them effectively.

 Executive team
The executive team provides strategic and operational leadership to the company. They set goals,
develop strategy, and make sure that strategy is executed effectively. The executive team might
include vice presidents, the chief marketing officer, and senior-level professionals who help plan
and direct the company’s work. These stakeholders think about decisions at a very high level and
they are looking for the headline news about your project first. They are less interested in the
details. Time is very limited with them, so make the most of it by leading your presentations with
the answers to their questions. You can keep the more detailed information handy in your
presentation appendix or your project documentation for them to dig into when they have more
time.
For example, you might find yourself working with the vice president of human resources on an
analysis project to understand the rate of employee absences. A marketing director might look to
you for competitive analyses. Part of your job will be balancing what information they will need
to make informed decisions with their busy schedule.
But you don’t have to tackle that by yourself. Your project manager will be overseeing the progress
of the entire team, and you will be giving them more regular updates than someone like the vice
president of HR. They are able to give you what you need to move forward on a project, including
getting approvals from the busy executive team. Working closely with your project manager can
help you pinpoint the needs of the executive stakeholders for your project, so don’t be afraid to
ask them for guidance.

 Customer-facing team
The customer-facing team includes anyone in an organization who has some level of interaction
with customers and potential customers. Typically they compile information, set expectations, and
communicate customer feedback to other parts of the internal organization. These stakeholders
have their own objectives and may come to you with specific asks. It is important to let the data
tell the story and not be swayed by asks from your stakeholders to find certain patterns that might
not exist.
Let’s say a customer-facing team is working with you to build a new version of a company’s most
popular product. Part of your work might involve collecting and sharing data about consumers’
buying behavior to help inform product features. Here, you want to be sure that your analysis and
presentation focus on what is actually in the data, not on what your stakeholders hope to find.

 Data science team


Organizing data within a company takes teamwork. There's a good chance you'll find yourself
working with other data analysts, data scientists, and data engineers. For example, maybe you team
up with a company's data science team to work on boosting company engagement to lower rates
of employee turnover. In that case, you might look into the data on employee productivity, while
another analyst looks at hiring data. Then you share those findings with the data scientist on your
team, who uses them to predict how new processes could boost employee productivity and
engagement. When you share what you found in your individual analyses, you uncover the bigger
story. A big part of your job will be collaborating with other data team members to find new angles
of the data to explore. Here's a view of how different roles on a typical data science team support
different functions:

 Working effectively with stakeholders


When you're working with each group of stakeholders, from the executive team to the
customer-facing team to the data science team, you'll often have to go beyond the data. Use the
following tips to communicate clearly, establish trust, and deliver your findings across groups.

Discuss goals. Stakeholder requests are often tied to a bigger project or goal. When they ask you
for something, take the opportunity to learn more. Start a discussion. Ask about the kind of results
the stakeholder wants. Sometimes, a quick chat about goals can help set expectations and plan the
next steps.
Feel empowered to say “no.” Let’s say you are approached by a marketing director who has a
“high-priority” project and needs data to back up their hypothesis. They ask you to produce the
analysis and charts for a presentation by tomorrow morning. Maybe you realize their hypothesis
isn’t fully formed and you have helpful ideas about a better way to approach the analysis. Or maybe
you realize it will take more time and effort to perform the analysis than estimated. Whatever the
case may be, don’t be afraid to push back when you need to.
Stakeholders don’t always realize the time and effort that goes into collecting and analyzing data.
They also might not know what they actually need. You can help stakeholders by asking about
their goals and determining whether you can deliver what they need. If you can’t, have the
confidence to say “no,” and provide a respectful explanation. If there’s an option that would be
more helpful, point the stakeholder toward those resources. If you find that you need to prioritize
other projects first, discuss what you can prioritize and when. When your stakeholders understand
what needs to be done and what can be accomplished in a given timeline, they will usually be
comfortable resetting their expectations. You should feel empowered to say no; just remember to
give context so others understand why.

Plan for the unexpected. Before you start a project, make a list of potential roadblocks. Then,
when you discuss project expectations and timelines with your stakeholders, give yourself some
extra time for problem-solving at each stage of the process.

Know your project. Keep track of your discussions about the project over email or reports, and
be ready to answer questions about how certain aspects are important for your organization. Get
to know how your project connects to the rest of the company and get involved in providing the
most insight possible. If you have a good understanding about why you are doing an analysis, it
can help you connect your work with other goals and be more effective at solving larger problems.

Start with words and visuals. It is common for data analysts and stakeholders to interpret things
in different ways while assuming the other is on the same page. This illusion of agreement* has
been historically identified as a cause of projects going back-and-forth a number of times before a
direction is finally nailed down. To help avoid this, start with a description and a quick visual of
what you are trying to convey. Stakeholders have many points of view and may prefer to absorb
information in words or pictures. Work with them to make changes and improvements from there.
The faster everyone agrees, the faster you can perform the first analysis to test the usefulness of
the project, measure the feedback, learn from the data, and implement changes.

Communicate often. Your stakeholders will want regular updates on your projects. Share notes
about project milestones, setbacks, and changes. Then use your notes to create a shareable report.
Another great resource to use is a change-log, which is a tool that will be explored further
throughout the program. For now, just know that a change-log is a file containing a chronologically
ordered list of modifications made to a project. Depending on the way you set it up, stakeholders
can even pop in and view updates whenever they want.
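
As a rough illustration of the definition above, a change-log can be modeled as a list of dated entries that is always shown in chronological order. This is only a sketch: the `ChangeEntry` fields, the `render_changelog` helper, and the sample entries are assumptions made up for the example, not a format the program prescribes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChangeEntry:
    """One modification made to a project (hypothetical fields)."""
    when: date
    author: str
    description: str


def render_changelog(entries):
    """Return the entries as plain text, oldest change first."""
    ordered = sorted(entries, key=lambda e: e.when)
    return "\n".join(
        f"{e.when.isoformat()}  {e.author}: {e.description}" for e in ordered
    )


# Illustrative entries only; not taken from a real project.
entries = [
    ChangeEntry(date(2024, 3, 12), "analyst", "Removed duplicate survey rows"),
    ChangeEntry(date(2024, 3, 4), "analyst", "Added a new variable to the analysis"),
]
changelog_text = render_changelog(entries)
print(changelog_text)
```

Saving the rendered text to a shared file is one way stakeholders could view updates whenever they want.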

*Jason Fried, Basecamp, www.inc.com/magazine/201809/jason-fried/illusion-agreement-team-project.html

Use multiple communication strategies to reach your audience

Being able to communicate in multiple formats is a key skill for data analysts. Listening, speaking,
presenting, and writing skills will help you succeed in your projects and in your career. This
reading covers effective communication strategies, including examples of clearly worded emails
for common situations.
Here's an important first tip: Know your audience! When you communicate your analysis and
recommendations as a data analyst, it's vital to keep your audience in mind.
Be sure to answer these four important questions related to your audience:

1. Who is your audience?

2. What do they already know?

3. What do they need to know?

4. How can you best communicate what they need to know?

Project example

As a data analyst, you'll get plenty of requests and questions through email. Let’s walk through an
example of how you might approach answering one of these emails. Assume you're a data analyst
working at a company that develops mobile apps. Let's start by reviewing answers to the four
audience questions we just covered:

• Who is your audience?

Kiri, Product Development Project Manager

• What do they already know?

Kiri received updates about our project from its planning stages, including the most recent project
report, sent two weeks ago.

• What do they need to know?

Kiri needs an update on the analysis project’s progress and needs to know that the executive team
approved changes to the data and timeline. You know that adding a new variable to the analysis
will impact the current project timeline. Kiri will need to change the project’s milestones and
completion date.

• How can you communicate that effectively to them?

You can start by sending an email update to Kiri with the latest timeline for the project, but a
meeting might be necessary if she wants to talk through her concerns about missing a deadline.

Updated timeline email sample

After answering the audience questions, you have the key building blocks you need to write an
email to Kiri. Here's an example of how these questions can help organize the flow of the email
message:

After receiving your email, Kiri will have a clearer view of the changes to the analysis project and
will be able to make adjustments to work with the new timeline.

Project follow-up email sample

After the next report is completed, you can also send out a project update offering more
information. The email could look like this:

Good communication keeps stakeholders updated on progress and ultimately helps prevent
problems. Carefully worded responses are key. Whether you gather and address feedback using
email, meetings, or reports, everyone you work with will know what to expect. As a result, they
will be able to better manage their own schedules, resources, and teams.

Limitations of data

Data is powerful, but it has its limitations. Has someone’s personal opinion found its way into
the numbers? Is your data telling the whole story? Part of being a great data analyst is knowing
the limits of data and planning for them. This reading explores how you can do that.

• The case of incomplete (or nonexistent) data

If you have incomplete or nonexistent data, you might realize during an analysis that you don't
have enough data to reach a conclusion. Or, you might even be solving a different problem
altogether! For example, suppose you are looking for employees who earned a particular
certificate but discover that certification records go back only two years at your company. You
can still use the data, but you will need to make the limits of your analysis clear. You might be
able to find an alternate source of the data by contacting the company that led the training. But to
be safe, you should be up front about the incomplete dataset until that data becomes available.
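
One way to spot this kind of gap early is to compare the earliest record you have with the period your question covers. The sketch below assumes a hypothetical list of certification records and a hypothetical `coverage_gap` helper; the names and dates are invented for illustration.

```python
from datetime import date

# Hypothetical certification records: (employee, date the certificate was earned).
records = [
    ("Ana", date(2023, 5, 1)),
    ("Ben", date(2024, 1, 15)),
    ("Cy", date(2024, 9, 30)),
]


def coverage_gap(records, needed_from):
    """Return (earliest_record_date, has_gap) for the requested start date."""
    earliest = min(earned for _, earned in records)
    return earliest, earliest > needed_from


earliest, has_gap = coverage_gap(records, needed_from=date(2019, 1, 1))
if has_gap:
    # State the limit of the analysis instead of silently undercounting.
    print(f"Caveat: records start on {earliest}; earlier certificates are not counted.")
```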

• Don't miss misaligned data

If you're collecting data from other teams and using existing spreadsheets, it is good to keep in
mind that people use different business rules. So one team might define and measure things in a
completely different way than another. For example, if a metric is the total number of trainees in
a certificate program, you could have one team that counts every person who registered for the
training, and another team that counts only the people who completed the program. In cases like
these, establishing how to measure things early on standardizes the data across the board for
greater reliability and accuracy. This will make sure comparisons between teams are meaningful
and insightful.
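
To see how much the business rule matters, here is a small sketch that counts the same trainees under both definitions. The records and field names are assumptions created for this example.

```python
# Hypothetical trainee records; the field names are assumptions for illustration.
trainees = [
    {"name": "Ana", "registered": True, "completed": True},
    {"name": "Ben", "registered": True, "completed": False},
    {"name": "Cy", "registered": True, "completed": True},
    {"name": "Dee", "registered": True, "completed": False},
]

# Team A's business rule: everyone who registered counts as a trainee.
count_registered = sum(1 for t in trainees if t["registered"])

# Team B's business rule: only people who completed the program count.
count_completed = sum(1 for t in trainees if t["completed"])

print("Registered:", count_registered, "Completed:", count_completed)
```

Both numbers could be reported as "total trainees", which is why the definition needs to be agreed on before teams are compared.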

• Deal with dirty data

Dirty data refers to data that contains errors. Dirty data can lead to productivity loss, unnecessary
spending, and unwise decision-making. A good data cleaning effort can help you avoid this. As a
quick reminder, data cleaning is the process of fixing or removing incorrect, corrupted,
incorrectly formatted, duplicate, or incomplete data within a dataset. When you find and fix the
errors, while tracking the changes you made, you can avoid a data disaster. You will learn how
to clean data later in the training.
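
As a preview, the sketch below removes a duplicate and an incorrectly formatted value while keeping a note of each change. The rows and the `clean` helper are hypothetical, and real cleaning rules depend on your dataset.

```python
# Hypothetical raw rows: (email, score as text). Values are illustrative only.
raw_rows = [
    ("ana@example.com", "88"),
    ("ANA@example.com ", "88"),   # duplicate with inconsistent formatting
    ("ben@example.com", "n/a"),   # score is not a number
    ("cy@example.com", "91"),
]


def clean(rows):
    """Return (cleaned_rows, change_notes) so every fix is recorded."""
    cleaned, notes, seen = [], [], set()
    for email, score in rows:
        normalized = email.strip().lower()
        if normalized in seen:
            notes.append(f"removed duplicate: {normalized}")
            continue
        if not score.isdigit():
            notes.append(f"removed non-numeric score for: {normalized}")
            continue
        seen.add(normalized)
        cleaned.append((normalized, int(score)))
    return cleaned, notes


cleaned_rows, change_notes = clean(raw_rows)
print(cleaned_rows)
print(change_notes)
```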

• Tell a clear story

Avinash Kaushik, a Digital Marketing Evangelist for Google, has lots of great tips for data
analysts in his blog: Occam's Razor. Below are some of the best practices he recommends for
good data storytelling:

Compare the same types of data: Data can get mixed up when you chart it for visualization. Be
sure to compare the same types of data and double check that any segments in your chart
definitely display different metrics.

Visualize with care: A 0.01% drop in a score can look huge if you zoom in close enough. To
make sure your audience sees the full story clearly, it is a good idea to start your Y-axis at 0.
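
The effect of a zoomed-in axis can be shown with simple arithmetic. The sketch below compares how tall two bars would be drawn relative to each other under two axis choices; the scores and the `drawn_height_ratio` helper are assumptions for illustration.

```python
def drawn_height_ratio(before, after, axis_min):
    """Ratio of the two bar heights as drawn when the Y-axis starts at axis_min."""
    return (after - axis_min) / (before - axis_min)


before, after = 90.00, 89.99  # illustrative scores, a drop of about 0.01%

# Axis starting just below the data: the second bar is drawn about half as tall.
zoomed = drawn_height_ratio(before, after, axis_min=89.98)

# Axis starting at 0: the two bars are drawn almost the same height.
from_zero = drawn_height_ratio(before, after, axis_min=0)

print(round(zoomed, 3), round(from_zero, 5))
```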

Leave out needless graphs: If a table can show your story at a glance, stick with the table
instead of a pie chart or a graph. Your busy audience will appreciate the clarity.

Test for statistical significance: Sometimes two datasets will look different, but you will need a
way to test whether the difference is real and important. So remember to run statistical tests to
see how much confidence you can place in that difference.
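
The reading does not name a specific test, so the sketch below uses one possible choice, a permutation test, written with only the Python standard library. It repeatedly shuffles the group labels and checks how often chance alone produces a gap in means at least as large as the observed one. The scores, the helper name, and the number of trials are all assumptions.

```python
import random


def permutation_p_value(a, b, trials=10_000, seed=0):
    """Estimate how often a random relabeling gives a mean gap at least as large."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        gap = abs(sum(left) / len(left) - sum(right) / len(right))
        if gap >= observed:
            hits += 1
    return hits / trials


# Illustrative scores only.
group_a = [72, 75, 71, 78, 74, 73]
group_b = [80, 83, 79, 85, 82, 81]
p_value = permutation_p_value(group_a, group_b)
print(p_value)
```

A small value suggests the difference is unlikely to be due to chance alone; whether the difference is important for the business is still a judgment call.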

Pay attention to sample size: Gather lots of data. If a sample size is small, a few unusual
responses can skew the results. If you find that you have too little data, be careful about using it
to form judgments. Look for opportunities to collect more data, then chart those trends over
longer periods.
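
To make the sample-size point concrete, the sketch below adds one unusual response to a small sample and to a larger one, then compares how far the average moves. The ratings are invented for illustration.

```python
from statistics import mean

# Illustrative ratings on a 1-5 scale.
small_sample = [4, 4, 5, 4]
large_sample = [4, 4, 5, 4] * 25  # 100 responses with the same pattern

# How far does the average drop when one unusual rating of 1 is added?
shift_small = mean(small_sample) - mean(small_sample + [1])
shift_large = mean(large_sample) - mean(large_sample + [1])

print(round(shift_small, 2), round(shift_large, 2))
```

The same unusual response moves the small-sample average far more than the large-sample average.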

• Be the judge

In any organization, a big part of a data analyst’s role is making sound judgments. When you
know the limitations of your data, you can make judgment calls that help people make better
decisions supported by the data. Data is an extremely powerful tool for decision-making, but if it
is incomplete, misaligned, or hasn’t been cleaned, then it can be misleading. Take the necessary
steps to make sure that your data is complete and consistent. Clean the data before you begin
your analysis to save yourself and possibly others a great amount of time and effort.

Lead great meetings

One day soon, you might find yourself planning a meeting in your role as a data analyst. Great
things can happen when participants anticipate a well-executed meeting. Attendees show up on
time. They aren’t distracted by their laptops and phones. They feel like their time will be well
spent. It all comes down to good planning and communication of expectations. The following are
our best practical tips for leading meetings.

Before the meeting


If you are organizing the meeting, you will probably talk about the data. Before the meeting:

• Identify your objective. Establish the purpose, goals, and desired outcomes of the
meeting, including any questions or requests that need to be addressed.

• Acknowledge participants and keep them involved by drawing on their different points of
view and experiences with the data, the project, or the business.

• Organize the data to be presented. You might need to turn raw data into accessible
formats or create data visualizations.

• Prepare and distribute an agenda. We will go over this next.

Crafting a compelling agenda

A solid meeting agenda sets your meeting up for success. Here are the basic parts your agenda
should include:

• Meeting start and end time
• Meeting location (including information to participate remotely, if that option is
available)
• Objectives
• Background material or data the participants should review beforehand

Here's an example of an agenda for an analysis project that is just getting started:

Sharing your agenda ahead of time

After writing your agenda, it's time to share it with the invitees. Sharing the agenda with
everyone ahead of time helps them understand the meeting goals and prepare questions,
comments, or feedback. You can email the agenda or share it using another collaboration tool.

During the meeting


As the leader of the meeting, it's your job to guide the data discussion. With everyone well
informed of the meeting plan and goals, you can follow these steps to avoid any distractions:

• Make introductions (if necessary) and review key messages

• Present the data

• Discuss observations, interpretations, and implications of the data

• Take notes during the meeting

• Determine and summarize next steps for the group

After the meeting


To keep the project and everyone aligned, prepare and distribute a brief recap of the meeting
with next steps that were agreed upon in the meeting. You can even take it a step further by
asking for feedback from the team.

• Distribute any notes or data

• Confirm next steps and timeline for additional actions

• Ask for feedback (this is an effective way to figure out if you missed anything in your
recap)

A final word about meetings


Even with the most careful planning and detailed agendas, meetings can sometimes go off track.
An emergency situation might steal people’s attention. A recent decision might unexpectedly
change requirements that were previously discussed and agreed on. Action items might not apply
to the current situation. If this happens, you might be forced to shorten or cancel your meeting.
That's all right; just be sure to discuss anything that impacts your project with your manager or
stakeholders and reschedule your meeting after you have more information.

END MODULE 4
