
Assignment Unit 2

1) The document provides instructions and examples for analyzing iris flower data from a file called "flowers.csv" using R. It includes tasks to create data frames, calculate statistics, and answer questions about the data. 2) A frequency table is shown for petal length, and tasks ask the learner to interpret values in the cumulative relative frequency column and determine which petal length is most frequent. 3) Additional tasks have the learner read the flower data, identify variable names and types, calculate frequencies of values, and create and interpret frequency tables in R.

Quizzes and graded assignments in MATH1280 are NOT group projects.

Please answer the questions and do not share the questions or answers with
others or accept advance access to questions or answers from others.

The file "flowers.csv" contains measurements of iris flowers. Create an R data
frame named "flower.data" that contains the data in the file.
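One way to create the data frame is sketched below. The path is hypothetical, and the fallback to R's built-in iris data set is an assumption made only so the sketch runs when the file is not in the working directory; the assignment itself expects the data to come from flowers.csv.

```r
# Hypothetical path: adjust to wherever flowers.csv was saved.
csv.path <- "flowers.csv"

if (file.exists(csv.path)) {
  # stringsAsFactors = TRUE stores text columns (such as the species
  # names) as factors; since R 4.0 the default is FALSE.
  flower.data <- read.csv(csv.path, stringsAsFactors = TRUE)
} else {
  # Assumption: the built-in iris data set stands in for the missing file.
  flower.data <- iris
}

# Quick checks: number of rows and columns, then the first few rows.
dim(flower.data)
head(flower.data)
```

If the species column shows up as character rather than factor, the likely cause is the read.csv default for stringsAsFactors in recent R versions.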

The following R code shows an example of how to round a vector of numbers to
zero decimal places and then calculate some statistics using the rounded
numbers. You might need some of the calculations for this assignment, but you
might not need others. You would replace example$years with the name of the R
object that you want to analyze (in other programming languages, you might call
example$years a variable).

> x <- round(example$years, 0)
> freq <- table(x)
> rel.freq <- freq/sum(freq)
> cumsum(rel.freq)
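As a minimal sketch of what these commands produce, the block below runs them on a small made-up vector. The data frame example and its years values are invented for illustration and are not part of the assignment data.

```r
# Made-up data for illustration only.
example <- data.frame(years = c(1.2, 2.7, 3.1, 2.4, 3.6, 1.4, 2.8, 3.3))

x <- round(example$years, 0)      # 1 3 3 2 4 1 3 3
freq <- table(x)                  # counts: 1 -> 2, 2 -> 1, 3 -> 4, 4 -> 1
rel.freq <- freq / sum(freq)      # 0.250 0.125 0.500 0.125
cum.rel.freq <- cumsum(rel.freq)  # 0.250 0.375 0.875 1.000

print(cum.rel.freq)
```

The last cumulative value is always 1, because the relative frequencies sum to 1.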

Cumulative Relative Frequency Table for Petal Length

Use the following table to answer tasks 2-4.

Value:                           1     2     3     4     5     6     7
Cumulative Relative Frequency:  .16   .33   .35   .58   .81   .97  1.00
Relative Frequency:             .16   .17   .02   .23   .23   .16   .03
Tasks

1. Sometimes it is difficult to understand data if you do not know what the
numbers represent. Provide short definitions of two words: sepal, and petal (be
sure to cite your sources even if you paraphrase):

sepal: one of the individual parts of the calyx of a flower (Dictionary.com, n.d.)

petal: one of the often colored segments of the corolla of a flower (Dictionary.com, n.d.)

2. There is a cumulative relative frequency table printed above for petal lengths
(using rounded values for petal length). Below the number 3 in that table is the
number .35. What does .35 represent? (multiple choice)

a. Three of the flowers had petal length of 0.35.
b. There were 0.35 observations that had petal length of 3 (after rounding the
petal lengths).
c. Of all the flowers measured in this sample 35% had a petal length of 3 (after
rounding the petal lengths).
d. Of all the flowers measured in this sample 35% had a petal length of 3 or less
(after rounding the petal lengths).
e. A study of all flowers on the planet would show that about 35% had petal
lengths of 3 or less (after rounding the petal lengths).

Answer: d. Of all the flowers measured in this sample 35% had a petal length of 3 or less
(after rounding the petal lengths).

3. Using only the cumulative relative frequency table printed above combined
with some simple paper-and-pencil calculations, which petal length occurs most
frequently?

Petal lengths 4 and 5 are tied for most frequent, each with a relative frequency of 0.23.

4. Describe how you determined your answer to the previous question (describe
the calculations that you used). Do not show R code for this task--it will not be
counted as an answer.

I reached this answer by subtracting the preceding cumulative relative frequency
(CRF) value from each CRF value, which gives the relative frequency of each petal
length. For example, the relative frequency of 7 is the 7th CRF value minus the
6th. The largest difference, 0.23, occurs at both 4 and 5.

My work:

7) 1.00 - 0.97 = 0.03

6) 0.97 - 0.81 = 0.16

5) 0.81 - 0.58 = 0.23

4) 0.58 - 0.35 = 0.23

3) 0.35 - 0.33 = 0.02

2) 0.33 - 0.16 = 0.17

1) 0.16 - 0 = 0.16
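The hand calculation can be cross-checked in R; this is only a verification sketch and not part of the answer to task 4. The cumulative values are typed in from the table above, and the variable names are chosen for illustration.

```r
# Cumulative relative frequencies copied from the petal length table.
value <- 1:7
cum.rel.freq <- c(0.16, 0.33, 0.35, 0.58, 0.81, 0.97, 1.00)

# Subtract each preceding cumulative value (0 before the first one).
rel.freq <- diff(c(0, cum.rel.freq))
names(rel.freq) <- value

# Compare with a small tolerance, because 0.58 - 0.35 and 0.81 - 0.58
# are not stored as exactly the same floating-point number.
most.frequent <- value[rel.freq > max(rel.freq) - 1e-9]

print(round(rel.freq, 2))
print(most.frequent)  # 4 5
```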

------------------------------------------------------------------------
5. Assuming that you read the flowers.csv file into an R object called flower.data,
run the following R code (do not paste the ">" character into R) and paste both
the command and the output into your answer (you should see five names, each
of which should be enclosed in quotes--if you do not see this, try again or
contact your instructor):

> names(flower.data)

Answer: It output the names of the columns (variables) of the data frame: "Sepal.Length",
"Sepal.Width", "Petal.Length", "Petal.Width", "Species"

6. The number of observations in the "flower.data" data frame is: 150.

7. List the variables in the data frame (you can do this by entering the name of
the R object that holds that data that you read using the read.csv command--you
should have called it flower.data).  If you do not see five columns of data, then
there was a problem reading the input file--try again or contact your instructor. 
For each variable identify the type of the variable (factor or numeric).
    The name and type of the 1st variable: Sepal.Length - numeric
    The name and type of the 2nd variable: Sepal.Width - numeric
    The name and type of the 3rd variable: Petal.Length - numeric
    The name and type of the 4th variable: Petal.Width - numeric
    The name and type of the 5th variable: Species - factor
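The observation count and the variable types can also be checked programmatically. The sketch below assumes that R's built-in iris data set has the same layout as flowers.csv and uses it as a stand-in for the data frame read from the file.

```r
# Assumption: built-in iris stands in for the data read from flowers.csv.
flower.data <- iris

# Number of observations (rows).
n.obs <- nrow(flower.data)

# Class of every column, returned as a named character vector.
var.types <- sapply(flower.data, class)

print(n.obs)      # 150
print(var.types)  # four numeric columns, then Species as factor
```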

8. Round the data for the variable Sepal.Length so that it contains integers, then
find the frequency of the value 7 (not the relative frequency): ____.


Answer

> round.flower <- round(flower.data$Sepal.Length)
> round.flower
  [1] 5 5 5 5 5 5 5 5 4 5 5 5 5 4 6 6 5 5 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 5 5 6
 [38] 5 4 5 5 4 4 5 5 5 5 5 5 5 7 6 7 6 6 6 6 5 7 5 5 6 6 6 6 7 6 6 6 6 6 6 6 6
 [75] 6 7 7 7 6 6 6 6 6 6 5 6 7 6 6 6 6 6 6 5 6 6 6 6 5 6 6 6 7 6 6 8 5 7 7 7 6
[112] 6 7 6 6 6 6 8 8 6 7 6 8 6 7 7 6 6 6 7 7 8 6 6 6 8 6 6 6 7 7 7 6 7 7 7 6 6
[149] 6 6
> table(round.flower)
round.flower
 4  5  6  7  8
 5 47 68 24  6

Hence the frequency of the value 7 is 24.
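The count for a single value can be pulled out of the table by name instead of being read off the printout. A short sketch, again assuming the built-in iris data set matches flowers.csv:

```r
# Assumption: built-in iris stands in for the data read from flowers.csv.
flower.data <- iris

# Round sepal lengths to whole numbers and tabulate them.
round.flower <- round(flower.data$Sepal.Length)
freq <- table(round.flower)

# Table entries are named by value, so index with the string "7".
freq.of.7 <- as.integer(freq["7"])

print(freq.of.7)  # 24
```

Indexing with the number 7 instead of the string "7" would ask for the seventh entry of the table, which does not exist here.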

------------------------------------------------------------------------

Assuming that you read the flowers.csv file into an R object called flower.data,
run the following R code (do not paste the ">" character into R). Note that we
are not rounding the numbers here. Use the output for the next five tasks:

> table(flower.data$Sepal.Width)

  2 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9   3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9   4 4.1 4.2 4.4
  1   3   4   3   8   5   9  14  10  26  11  13   6  12   6   4   3   6   2   1   1   1   1
> plot(table(flower.data$Sepal.Width))

9. What is the sum of the first three frequencies in the frequency table for sepal
width? 1 + 3 + 4 = 8

10. What does your answer to the previous question represent (in terms of sepal
width and frequency and the percentage of all sepal measurements)?

It represents the number of flowers in the sample with the three smallest sepal
widths (2, 2.2, and 2.3): 8 of the 150 measurements, or about 5.3%.

11. What is the sum of the last three frequencies in the frequency table for sepal
width? 1 + 1 + 1 = 3

12. How many flowers in the sample had sepal widths less than 4 (do NOT round
the sepal width numbers for this, but you can round your final answer to 3
decimal places)? 146 flowers had sepal widths less than 4.
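The sums in tasks 9, 11, and 12 can be reproduced without adding the printed counts by hand. The sketch below assumes the built-in iris data set matches flowers.csv; the variable names are chosen for illustration.

```r
# Assumption: built-in iris stands in for the data read from flowers.csv.
flower.data <- iris

# Frequencies of the unrounded sepal widths, as a plain integer vector.
counts <- as.vector(table(flower.data$Sepal.Width))

first.three <- sum(head(counts, 3))              # 1 + 3 + 4 = 8
last.three <- sum(tail(counts, 3))               # 1 + 1 + 1 = 3
below.four <- sum(flower.data$Sepal.Width < 4)   # 146

# Share of all measurements covered by the first three frequencies.
first.three.share <- first.three / sum(counts)   # about 0.053
```

Counting with a logical comparison, as in below.four, avoids having to work out which table columns fall under 4.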

13. What does the tallest bar in the plot represent? (multiple choice)

a. mean
b. mode
c. median

Answer: b. mode (the tallest bar marks the sepal width that occurs most often).
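The value under the tallest bar can be identified directly from the frequency table. A minimal sketch, assuming the built-in iris data set matches flowers.csv:

```r
# Assumption: built-in iris stands in for the data read from flowers.csv.
flower.data <- iris

width.freq <- table(flower.data$Sepal.Width)

# which.max() returns the position of the largest count; the table
# names hold the corresponding sepal width as a string.
modal.width <- names(width.freq)[which.max(width.freq)]
modal.count <- max(width.freq)

print(modal.width)  # "3"
print(modal.count)  # 26
```

Note that which.max() reports only the first maximum, so a tie between two values would need a separate check.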
------------------------------------------------------------------------

14. Create a frequency table that shows the frequencies for each species of flower
in the sample. Paste your R command and output into your answer (do NOT
display data from a data frame, display data using the table() command).

> table(flower.data$Species)

    setosa versicolor  virginica
        50         50         50

15. Explain two things about the table that you created for the previous task:

Why did the frequency table for flower species contain words in the first row as
opposed to numbers?

Because the variable Species is qualitative data, which means it holds words rather than
numbers. R therefore stored it as a factor, not as a numeric value, and the first row of the
table shows the factor levels.

What is the meaning of the numbers in the second row of the table?

They are the number of flowers of each species in the sample. Each species has 50 flowers,
for a total of 150 observations, so each species makes up 1/3 of the sample.
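The equal shares described above can be confirmed with prop.table(), which converts counts to proportions. The sketch assumes the built-in iris data set matches flowers.csv.

```r
# Assumption: built-in iris stands in for the data read from flowers.csv.
flower.data <- iris

# Counts per species, then each count divided by the total.
species.freq <- table(flower.data$Species)
species.share <- prop.table(species.freq)

print(species.freq)             # 50 flowers per species
print(round(species.share, 3))  # 0.333 for each species
```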

References

Sepal. (n.d.). Dictionary.com. Retrieved April 21, 2021, from
https://www.dictionary.com/browse/sepa

Petal. (n.d.). Dictionary.com. Retrieved April 21, 2021, from
https://www.dictionary.com/browse/petal
