
Assignment Unit 2

The document is an assignment for a statistics course, focusing on data interpretation and analysis using R programming. It includes definitions of botanical terms, questions about cumulative frequency tables, and tasks related to analyzing a dataset of flower measurements. The assignment requires students to perform calculations, interpret results, and explain their findings based on the provided data.

Uploaded by

teacher.naderah

Assignment Unit 2

Course Unit: MATH 1280 - Introduction to Statistics

Course Instructor: SP Chan

Student name:

Date: 24th November 2022


1. Sometimes it is difficult to understand data if you do not know what the
numbers represent. Provide short definitions of two words: sepal,
and petal (be sure to cite your sources even if you paraphrase):

Sepal: One of the segments of a flower's calyx, which encloses the petals and
is usually green and leaf-like.

Petal: One of the modified, leaf-like segments of a flower's corolla, which
are usually colored.

2. There is a cumulative relative frequency table printed above for petal
lengths (using rounded values for petal length). Below the number 3 in that
table is the number .35. What does .35 represent? (multiple choice)

d. Of all the flowers measured in this sample, 35% had a petal length of 3 or
less (after rounding the petal lengths).
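As a rough illustration of where such a number comes from, the sketch below builds a cumulative relative frequency table from rounded values. The petal lengths are hypothetical values chosen for the example, not the assignment data, and the object names are placeholders.

```r
# Hypothetical petal lengths (illustrative only, not the assignment data)
petal.length <- c(1.4, 1.3, 1.6, 4.7, 4.4, 4.9, 6.0, 5.1, 5.9, 3.3)

rounded  <- round(petal.length)        # round each length to a whole number
freq     <- table(rounded)             # frequency of each rounded value
rel.freq <- freq / length(rounded)     # relative frequency of each value
cum.rel  <- cumsum(rel.freq)           # running total of the relative frequencies

print(cum.rel)

# The entry under 3 is the share of flowers whose rounded length is 3 or less
print(cum.rel["3"])
```

In this toy sample the entry under 3 is the proportion of all flowers with a rounded petal length of 3 or less, which is the same reading that applies to the .35 in the assignment's table.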

3. Using only the cumulative relative frequency table printed above
combined with some simple paper-and-pencil calculations, which petal
length occurs most frequently?

4 and 5

4. Describe how you determined your answer to the previous question
(describe the calculations that you used). Do not show R code for this task--it
will not be counted as an answer. The relative frequency of each petal length
is its cumulative relative frequency minus the cumulative relative frequency
of the previous length. The length with the largest difference occurs most
frequently.
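For readers who want to double-check the paper-and-pencil differencing afterwards, here is a minimal sketch. The cumulative values are hypothetical (the assignment's table is not reproduced here), and the tolerance used to detect ties is an assumption of the example.

```r
# Hypothetical cumulative relative frequencies, named by rounded petal length
cum.rel <- c("1" = 0.2, "2" = 0.3, "3" = 0.4, "4" = 0.5, "5" = 0.8, "6" = 1.0)

# Current cumulative value minus the previous one; a leading 0 covers the first entry
rel.freq <- diff(c(0, cum.rel))
names(rel.freq) <- names(cum.rel)

# Report every length within a small tolerance of the maximum, so ties are kept
is.max <- abs(rel.freq - max(rel.freq)) < 1e-9
most.frequent <- names(rel.freq)[is.max]

print(rel.freq)
print(most.frequent)
```

Comparing with a tolerance rather than `==` avoids missing a tie because of floating-point rounding in the subtraction.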

5. Assuming that you read the flowers.csv file into an R object called
flower.data, run the following R code (do not paste the ">" character into R)
and paste both the command and the output into your answer (you should
see five names, each of which should be enclosed in quotes--if you do not
see this, try again or contact your instructor):

> names(flower.data)

[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
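The step of reading the file is not shown in the assignment text, so here is a hedged sketch of one way to do it. Because flowers.csv is not available here, the example writes R's built-in iris data to a temporary CSV as a stand-in; that the two files share the same five columns is an assumption.

```r
# Stand-in for flowers.csv: write the built-in iris data to a temporary file.
# Assumption: flowers.csv has the same five columns as iris.
csv.path <- tempfile(fileext = ".csv")
write.csv(iris, csv.path, row.names = FALSE)

# stringsAsFactors = TRUE turns text columns into factors;
# since R 4.0.0 the default is FALSE, which would leave Species as character
flower.data <- read.csv(csv.path, stringsAsFactors = TRUE)

names(flower.data)
```

With the real file, `csv.path` would simply be replaced by the path to flowers.csv.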

6. The number of observations in the "flower.data" data frame is: __150__.

7. List the variables in the data frame (you can do this by entering the name
of the R object that holds the data that you read using the read.csv
command--you should have called it flower.data). If you do not see five
columns of data, then there was a problem reading the input file--try again
or contact your instructor. For each variable identify the type of the variable
(factor or numeric).
The name and type of the 1st variable:__ Sepal.Length, numeric
The name and type of the 2nd variable:__ Sepal.Width, numeric
The name and type of the 3rd variable:__ Petal.Length, numeric
The name and type of the 4th variable:__ Petal.Width, numeric
The name and type of the 5th variable:__ Species, factor
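The observation count and the variable types can also be checked directly rather than read off the printed data frame. A small sketch, using the built-in iris data as a stand-in for flower.data (an assumption, since flowers.csv is not available here):

```r
# Stand-in for the data read from flowers.csv (assumption: same structure as iris)
flower.data <- iris

# Number of observations (rows) in the data frame
n.obs <- nrow(flower.data)
print(n.obs)

# Class of each column: "numeric" for measurements, "factor" for categories
var.types <- sapply(flower.data, class)
print(var.types)
```

If Species shows up as "character" instead of "factor" on a real file, the file was probably read without `stringsAsFactors = TRUE`.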

8. Round the data for the variable Sepal.Length so that it contains integers,
then find the frequency of the value 7 (not the relative frequency): __24
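One possible way to carry out this rounding and counting is sketched below, again with the built-in iris data standing in for flower.data (an assumption). Note that R's `round()` sends exact halves to the nearest even integer, so a value of 6.5 is counted under 6, not 7.

```r
# Stand-in for the data read from flowers.csv (assumption: same values as iris)
flower.data <- iris

# Round sepal lengths to whole numbers, then tabulate the rounded values
rounded.length <- round(flower.data$Sepal.Length)
length.freq <- table(rounded.length)
print(length.freq)

# Frequency (not relative frequency) of the value 7, looked up by name
count.7 <- length.freq["7"]
print(count.7)

# Equivalent direct count without building the table
print(sum(rounded.length == 7))
```

Indexing the table with the string "7" looks the entry up by its label; indexing with the number 7 would ask for the seventh position instead.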

Assuming that you read the flowers.csv file into an R object called
flower.data, run the following R code (do not paste the ">" character into R).
Note that we are not rounding the numbers here. Use the output for the next
five tasks:

> table(flower.data$Sepal.Width)
> plot(table(flower.data$Sepal.Width))

9. What is the sum of the first three frequencies in the frequency table for
sepal width? _8____

10. What does your answer to the previous question represent (in terms of
sepal width and frequency and the percentage of all sepal measurements)?
8 flowers in the sample had a sepal width of 2.3 or less, which is 8 of the
150 sepal measurements (about 5.3%).

11. What is the sum of the last three frequencies in the frequency table for
sepal width? __3___
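The sums for the first and last three frequencies can be taken from the table by position. This is a sketch under the assumption that the built-in iris data matches flowers.csv; the object names are placeholders.

```r
# Stand-in for the data read from flowers.csv (assumption: same values as iris)
flower.data <- iris

# Frequency table of unrounded sepal widths, sorted by width
width.freq <- table(flower.data$Sepal.Width)
n.values <- length(width.freq)

# Sum of the first three and the last three frequencies
first.three <- sum(width.freq[1:3])
last.three  <- sum(width.freq[(n.values - 2):n.values])
print(c(first = first.three, last = last.three))

# Share of all measurements covered by the first three widths
print(first.three / sum(width.freq))
```

Because `table()` orders its entries by value, the first three entries are the three smallest widths and the last three are the three largest.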

12. How many flowers in the sample had sepal widths less than 4 (do NOT
round the sepal width numbers for this, but you can round your final answer
to 3 decimal places)? _________
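A count like this can be obtained by summing a logical comparison, since TRUE counts as 1. A minimal sketch, with the built-in iris data as an assumed stand-in for flower.data:

```r
# Stand-in for the data read from flowers.csv (assumption: same values as iris)
flower.data <- iris

# TRUE where the unrounded sepal width is strictly below 4
below.4 <- flower.data$Sepal.Width < 4

# Number of flowers below 4, and the same quantity as a proportion
n.below.4 <- sum(below.4)
prop.below.4 <- round(mean(below.4), 3)

print(n.below.4)
print(prop.below.4)
```

The comparison is strict, so flowers with a sepal width of exactly 4 are not included.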

13. What does the tallest bar in the plot represent? (multiple choice)
b. mode
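The tallest bar corresponds to the value with the highest frequency, which can be read from the table as well as from the plot. A short sketch, assuming the built-in iris data as a stand-in for flower.data:

```r
# Stand-in for the data read from flowers.csv (assumption: same values as iris)
flower.data <- iris

# Frequency table of unrounded sepal widths (the same table that is plotted)
width.freq <- table(flower.data$Sepal.Width)

# The mode is the width (or widths, if tied) with the largest frequency
mode.width <- names(width.freq)[width.freq == max(width.freq)]
mode.count <- max(width.freq)

print(mode.width)
print(mode.count)
```

The labels of a table are character strings, so `as.numeric(mode.width)` would be needed before doing arithmetic with the result.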

14. Create a frequency table that shows the frequencies for each species of
flower in the sample. Paste your R command and output into your answer
(do NOT display data from a data frame, display data using the table()
command)_________
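A frequency table for a categorical column is built the same way as for a numeric one. The sketch below uses the built-in iris data as a stand-in (an assumption), so the output for the real flowers.csv should be pasted from an actual R session rather than taken from here.

```r
# Stand-in for the data read from flowers.csv (assumption: same structure as iris)
flower.data <- iris

# Frequency of each species: the first row of the printout holds the factor
# levels (species names), the second row holds the counts
species.freq <- table(flower.data$Species)
print(species.freq)

# The labels in the first row are the levels of the factor
print(levels(flower.data$Species))
```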

15. Explain two things about the table that you created for the previous task:

Why did the frequency table for flower species contain words in the first row
as opposed to numbers? Species is a factor, so its values are category names
(words) rather than numbers.
What is the meaning of the numbers in the second row of the table? They
represent the frequency of each species in the sample.

