Unit 2 Project - Data Analysis Project
Here is a link to the data. When you click on it, you will be prompted to make a copy.
1) Put your last name in this title (replace the “NAME” in the title with your name)
4) Follow the instructions below; some of the instructions will require you to work in your Google Sheets while
others will have you type your answer in this document. You will be submitting both your Google Sheet as well
as this document with your typed answers.
Gathering/Cleaning Data
1) Looking at the data, why does it appear to be "messy"?
TYPE ANSWER HERE
2) Trying to do math calculations or trying to graph this data will be impossible; we have to clean the data first.
Recall that "cleaning data" is a process that makes the data uniform without changing their meaning.
● Change all heights over to inches: 5' 11" would become 71, 71" would become 71, etc. You want all
integers, with no spaces or symbols. Do this CAREFULLY. Details matter.
● Change all the ages to years and make sure that they are all integers (no units).
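The height rule above can also be written out as a short Python sketch. This is only an illustration of the conversion logic, not part of the required Google Sheets work; the function name height_to_inches is made up for this example, and it assumes heights are written either as feet and inches (5' 11") or as inches only (71" or 71).

```python
import re

def height_to_inches(text):
    """Convert a height such as 5' 11" or 71" to a whole number of inches."""
    cleaned = text.strip()
    # Feet and (optional) inches, e.g. 5' 11" or 6'
    match = re.fullmatch(r"(\d+)\s*'\s*(\d+)?\s*\"?", cleaned)
    if match:
        feet = int(match.group(1))
        inches = int(match.group(2) or 0)
        return feet * 12 + inches
    # Inches only, e.g. 71" or 71
    match = re.fullmatch(r"(\d+)\s*\"?", cleaned)
    if match:
        return int(match.group(1))
    # Anything else needs a person to look at it (assumption: fail loudly)
    raise ValueError(f"Unrecognized height: {text!r}")

print(height_to_inches("5' 11\""))  # prints 71
print(height_to_inches("71\""))     # prints 71
```

Any entry written in a different style raises an error instead of guessing, which matches the "Details matter" warning: unusual entries should be checked by hand.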
Analyzing Data
1) To sort the data, click on the column; to sort numbers from least to greatest, select A → Z. When you
do that for the height column, what happens to the other columns? Why?
TYPE ANSWER HERE
2) To count up how many people participated in this survey, we use a function called COUNT. To use a
function, you always start with the "=" sign in your cell and then type your function's name. Inside the ( ), you
usually put a range of some sort; it depends on the function!
=COUNT(A2:A#)
Where A2 is the starting cell and A# is the cell that you end at. Using this, how many people participated in the
survey?
TYPE ANSWER HERE
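COUNT only counts cells that contain numbers, which is one reason the data had to be cleaned first. The Python sketch below imitates that behavior on a small made-up column; the name count_numeric and the sample values are assumptions for illustration and are not taken from the project data.

```python
def count_numeric(cells):
    """Count the entries that are numbers, skipping text and blank cells."""
    return sum(
        1
        for cell in cells
        if isinstance(cell, (int, float)) and not isinstance(cell, bool)
    )

# Made-up column: three cleaned heights, one uncleaned height, one blank cell
sample_column = [71, 64, "5' 11\"", "", 68]
print(count_numeric(sample_column))  # prints 3
```

The uncleaned entry and the blank cell are skipped, so the count is 3 even though the column has 5 cells.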
3) Now we will look at some functions for finding summary statistics. In the table below, use the formulas
to find each value and type the results.
Average
Min
Visualizing Data
1) We will be creating a bar graph of the ages of the people who participated in the
survey. To make a bar graph, we first need a frequency table. A frequency
table lists equal-width intervals of ages and the number of ages that fall in
each interval. The table to the right is what you should put in your Google
Sheets.
Why is it necessary to make a table like this before we create our graph?
TYPE ANSWER HERE
2) Getting the count (the number of ages that fall in each interval) by hand would take a long time and could
lead to human error. There is a function in Google Sheets called "=COUNTIF()" that counts
the number of cells that fit a certain description.
With our age ranges, however, we want MULTIPLE criteria. Think about this: for the first range, we want to
count the ages that are 10 or larger but that are also 19 or lower. Because of this, we use the function
=COUNTIFS(
The range for each criterion will be the same; we just want to check two different inequalities: ">=10" and
"<=19":
Repeat this for each range to complete your table. Fill in the table below with the formulas you used:
To create your bar graph, highlight the table with your data, click on INSERT, go down to
CHART, and then select the Column Chart under chart type. Make sure your graph is clearly labeled in
your Google Sheet.
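The two-condition counting used for the frequency table in step 2 can be sketched in Python as well. This is a rough illustration of the idea behind checking ">=10" and "<=19" together, not the Google Sheets formula itself; the ages are made up, and only the 10 to 19 interval comes from this document (the other intervals are hypothetical).

```python
def count_in_range(values, low, high):
    """Count the values that are >= low AND <= high (both must be true)."""
    return sum(1 for v in values if v >= low and v <= high)

ages = [12, 19, 20, 25, 31, 10, 47]  # made-up ages, not the survey data

# Only 10-19 appears in the instructions; the rest are hypothetical
intervals = [(10, 19), (20, 29), (30, 39), (40, 49)]

frequency = {
    f"{low}-{high}": count_in_range(ages, low, high)
    for low, high in intervals
}
print(frequency)  # prints {'10-19': 3, '20-29': 2, '30-39': 1, '40-49': 1}
```

Because both ends of each interval are included and the intervals do not overlap, every age is counted exactly once.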
3) Create a graph that compares the heights and the weights of everyone in the data set. Make sure the graph is
well labeled and visually satisfying. Leave the graph in your Google Sheet for evaluation.
Submitting
Once you are finished, submit this document with your typed answers, as well as your Google Sheet, to
the Unit 2 Project assignment in Google Classroom.