
Unit 2 Project - Data Analysis Project

The document outlines a data analysis project involving a spreadsheet containing personal data such as height, weight, and age. Participants are instructed to clean the data, perform calculations, and create visualizations using Google Sheets. The project requires submission of both the completed Google Sheet and this document with typed answers.

Uploaded by

27datala

Unit 2: Digital Information

Data Analysis Project


The link below opens a spreadsheet containing a large amount of data. Each row in the spreadsheet represents a
person, and that person’s height, weight, and age were gathered. Your goal today is to sort, organize, analyze,
and help create a visualization of this data.

Here is a link to the data. When you click on it, you will be prompted to make a copy.

1) Put your last name in the title (replace the “NAME” in the title with your last name)

2) Label the tab "Original Data"

3) Make a copy ("Duplicate") and call that tab "Cleaned Data"

4) Follow the instructions below; some of the instructions will require you to work in your Google Sheet, while
others will have you type your answer in this document. You will be submitting both your Google Sheet and
this document with your typed answers.

Gathering/Cleaning Data
1) Looking at the data, why does it appear to be "messy"?
TYPE ANSWER HERE

2) Trying to do math calculations or graph this data will be impossible until we clean the data.
Recall that "cleaning data" is a process that makes the data uniform without changing its meaning.

● Change all heights over to inches: 5' 11" would become 71, 71" would become 71, etc. You want all
integers, with no spaces or symbols. Do this CAREFULLY. Details matter.

● Change all weights to pounds and make sure there are no units attached to them: 235 pounds would
become 235

● Change all the ages to years and make sure that they are all
integers (no units)
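
For anyone curious how the height rule above could be written as code, here is a minimal sketch in Python. It is only an illustration and is not part of the Google Sheets work. The function name height_to_inches is made up for this example, and it assumes every height is typed either as feet and inches (5' 11") or as inches alone (71" or 71). Any other format raises an error instead of being guessed at.

```python
import re


def height_to_inches(raw: str) -> int:
    """Convert a height such as 5' 11", 6', 71" or 71 to whole inches."""
    text = raw.strip()

    # Feet-and-inches form: digits, an apostrophe, then optional inches.
    match = re.fullmatch(r"(\d+)\s*'\s*(\d+)?\s*\"?", text)
    if match:
        feet = int(match.group(1))
        inches = int(match.group(2) or 0)
        return feet * 12 + inches

    # Inches-only form: digits with an optional trailing inch mark.
    match = re.fullmatch(r"(\d+)\s*\"?", text)
    if match:
        return int(match.group(1))

    # Anything else is left for a person to look at by hand.
    raise ValueError(f"Unrecognized height format: {raw!r}")
```

Both of the worksheet's examples, 5' 11" and 71", come out as 71 with this sketch. Details still matter: a real data set may contain formats this sketch does not expect.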

3) To gather the heights, I just posed an open-ended question saying "What is your height?". What is a way I
could have gathered this information that would result in me not having to "clean" so much of it?
TYPE ANSWER HERE

Analyzing Data
1) To sort the data, click on the column. To sort numerically from least to greatest, select A → Z. When you
do that for the height column, what happens to the other columns? Why?
TYPE ANSWER HERE

2) To count up how many people participated in this survey, we use a function called COUNT. To use a
function, you always start with the "=" sign in your cell and then type your function's name. Inside the ( ), you
usually put a range of some sort; it depends on the function!

For our COUNT function, it looks like this:

=COUNT(A2:A#)

Where A2 is the starting cell and A# is the cell that you end at. Using this, how many people participated in the
survey? TYPE ANSWER HERE
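
COUNT only counts cells that contain numbers, which is one reason the cleaning step matters. The Python sketch below imitates that behavior on a made-up column. The name count_numeric and the sample values are assumptions for illustration only and are not the project data.

```python
def count_numeric(cells):
    """Count the cells that hold numbers, ignoring text and blanks."""
    return sum(
        1
        for cell in cells
        if isinstance(cell, (int, float)) and not isinstance(cell, bool)
    )


# Hypothetical column contents: five numbers, one text entry, one blank.
sample_column = [71, 64, "5' 11\"", 68, None, 70, 66]
participants = count_numeric(sample_column)
```

In this made-up column the text entry and the blank are skipped, so an uncleaned column can give a count that is too low.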

3) Now we will look at some functions for finding summary statistics. In the table below, use the formulas
to find each value and type it in place of "answer".

Average             Average Height: answer                Average Weight: answer                Average Age: answer
Max                 Maximum Height: answer                Maximum Weight: answer                Maximum Age: answer
Min                 Minimum Height: answer                Minimum Weight: answer                Minimum Age: answer
Standard Deviation  Standard Deviation of Height: answer  Standard Deviation of Weight: answer  Standard Deviation of Age: answer
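
As a way to double-check what the four statistics mean, here is a minimal sketch of the same calculations in Python. The heights are invented for the example. The comparison to the Google Sheets functions AVERAGE, MAX, MIN, and STDEV is an assumption, since this document does not spell out which formulas go in the table.

```python
import statistics

# Hypothetical cleaned heights in inches; not taken from the project data.
heights = [60, 64, 66, 70, 75]

average = statistics.mean(heights)  # comparable to =AVERAGE(range)
tallest = max(heights)              # comparable to =MAX(range)
shortest = min(heights)             # comparable to =MIN(range)

# Sample standard deviation (divides by n - 1), comparable to =STDEV(range).
spread = statistics.stdev(heights)
```

Note that statistics.stdev is the sample standard deviation. If a population standard deviation is wanted instead, Python offers statistics.pstdev, and the result will be slightly smaller.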

Visualizing Data
1) We will be creating a bar graph of the ages of the people who participated in the
survey. To make a bar graph, we first need a frequency table. A frequency
table lists even intervals of our ages and then the number of ages that fall in
each interval. The table to the right is what you should put in your Google
Sheet.

Why is it necessary to make a table like this before we create our graph?
TYPE ANSWER HERE

2) Getting the count (the number of ages that fall in each interval) by hand would take a long time and could
lead to human error. There is a function in Google Sheets called "=COUNTIF()", and it is a way for us to count
the number of cells that fit a certain description.

With our age ranges, however, we want MULTIPLE criteria. Think about this: for the first range, we want to
count the ages that are 10 or larger but that are also 19 or lower. Because of this, we use the function
=COUNTIFS(). Our range for each criterion will be the same; we just want to check two different inequalities:
">=10" and "<=19".

Repeat this for each range to complete your table. Fill in the table below with the formulas you used:

Age Range    Count
10 - 19      =COUNTIFS(C2:C98, ">=10", C2:C98, "<=19")
20 - 29      TYPE FORMULA HERE
30 - 39      TYPE FORMULA HERE
40 - 49      TYPE FORMULA HERE
50 - 59      TYPE FORMULA HERE
60 - 69      TYPE FORMULA HERE
70 - 79      TYPE FORMULA HERE
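
The two-criteria idea behind COUNTIFS can also be sketched in Python: a value is counted only when it passes both the "at least" test and the "at most" test. The ages below are invented for illustration, and the helper name count_in_range is hypothetical.

```python
def count_in_range(ages, low, high):
    """Count the ages that are at least `low` and at most `high`."""
    return sum(1 for age in ages if low <= age <= high)


# Hypothetical ages; not taken from the project data.
ages = [12, 19, 20, 25, 31, 38, 39, 47, 63]

# The same intervals as the table: 10 - 19, 20 - 29, ..., 70 - 79.
bins = [(start, start + 9) for start in range(10, 80, 10)]

# Build the frequency table as a mapping from interval label to count.
frequency = {
    f"{low} - {high}": count_in_range(ages, low, high)
    for low, high in bins
}
```

Because both ends of each interval are included and the intervals do not overlap, every age from 10 to 79 is counted exactly once.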

To then create your bar graph, you will highlight the table with your data, click on INSERT and then down to
CHART and then you will select the Column Chart under chart type. Make sure your graph is clearly labeled in
your Google Sheet.

3) Create a graph that compares the heights and the weights of everyone in the data set. Make sure the graph is
well labeled and visually satisfying. Leave the graph in your Google Sheet for evaluation.

Submitting
Once you are finished, submit this document with your typed answers, as well as your Google Sheet, to
the Unit 2 Project assignment in the Google Classroom.
