Assignment Unit 2
Please answer the questions and do not share the questions or answers with
others or accept advance access to questions or answers from others.
Value:                          1     2     3     4     5     6     7
Cumulative relative frequency:  .16   .33   .35   .58   .81   .97   1.00
Relative frequency:             .16   .17   .02   .23   .23   .16   .03
Tasks
2. There is a cumulative relative frequency table printed above for petal lengths
(using rounded values for petal length). Below the number 3 in that table is the
number .35. What does .35 represent? (multiple choice)
Answer: d. Of all the flowers measured in this sample, 35% had a petal length of 3 or less
(after rounding the petal lengths).
3. Using only the cumulative relative frequency table printed above combined
with some simple paper-and-pencil calculations, which petal length occurs most
frequently?
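4. Describe how you determined your answer to the previous question (describe
the calculations that you used). Do not show R code for this task--it will not be
counted as an answer.
Answer: I reached this answer by subtracting each cumulative relative frequency (CRF)
value from the one that follows it (for example, the 6th CRF value from the 7th). Each
difference is the relative frequency of one petal length, and the largest difference
marks the petal length that occurs most frequently.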
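My work:
1) 0.16 - 0 = 0.16
2) 0.33 - 0.16 = 0.17
3) 0.35 - 0.33 = 0.02
4) 0.58 - 0.35 = 0.23
5) 0.81 - 0.58 = 0.23
6) 0.97 - 0.81 = 0.16
7) 1.00 - 0.97 = 0.03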
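The task itself asks for paper-and-pencil work, so the following is only an optional cross-check and not part of the graded answer. It is a minimal sketch in R that recovers the relative frequencies from the cumulative row of the table above; the names value, crf, rel.freq and most.frequent are illustrative and do not come from the assignment.

```r
# Cumulative relative frequencies copied from the table above
value <- 1:7
crf <- c(0.16, 0.33, 0.35, 0.58, 0.81, 0.97, 1.00)

# Each relative frequency is the current cumulative value minus the
# previous one; prepend 0 so the first value is kept as-is.
# Rounding to 2 places hides floating-point noise from the subtraction.
rel.freq <- round(diff(c(0, crf)), 2)
names(rel.freq) <- value
print(rel.freq)

# Petal length(s) with the largest relative frequency
most.frequent <- value[rel.freq == max(rel.freq)]
print(most.frequent)
```

With the numbers in the table, the largest relative frequency is .23, which is shared by the values 4 and 5.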
------------------------------------------------------------------------------------------
5. Assuming that you read the flowers.csv file into an R object called flower.data,
run the following R code (do not paste the ">" character into R) and paste both
the command and the output into your answer (you should see five names, each
of which should be enclosed in quotes--if you do not see this, try again or
contact your instructor):
> names(flower.data)
Answer: The command output the names of the columns (variables) of the table: "Sepal.Length",
"Sepal.Width", "Petal.Length", "Petal.Width", "Species"
7. List the variables in the data frame (you can do this by entering the name of
the R object that holds the data that you read using the read.csv command--you
should have called it flower.data). If you do not see five columns of data, then
there was a problem reading the input file--try again or contact your instructor.
For each variable identify the type of the variable (factor or numeric).
The name and type of the 1st variable: Sepal.Length - numeric
The name and type of the 2nd variable: Sepal.Width - numeric
The name and type of the 3rd variable: Petal.Length - numeric
The name and type of the 4th variable: Petal.Width - numeric
The name and type of the 5th variable: Species - factor
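The variable types can also be read off programmatically rather than by eye. The sketch below uses R's built-in iris data frame as a stand-in for flower.data; this is an assumption, based only on the column names matching those listed above, and var.types is an illustrative name. Note that in R 4.0.0 and later, read.csv() reads text columns as character unless stringsAsFactors = TRUE is passed, so Species may show as "character" on a freshly read file.

```r
# Assumption: the built-in iris data stands in for the object read
# from flowers.csv (same five column names).
flower.data <- iris

# class() applied to every column gives one type label per variable
var.types <- sapply(flower.data, class)
print(var.types)
```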
8. Round the data for the variable Sepal.Length so that it contains integers, then
find the frequency of the value 7 (not the relative frequency): ____.
Answer: 24 (the count listed under the value 7 in the frequency table below)
  [1] 5 5 5 5 5 5 5 5 4 5 5 5 5 4 6 6 5 5 6 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 6 5 5 6
 [38] 5 4 5 5 4 4 5 5 5 5 5 5 5 7 6 7 6 6 6 6 5 7 5 5 6 6 6 6 7 6 6 6 6 6 6 6 6
 [75] 6 7 7 7 6 6 6 6 6 6 5 6 7 6 6 6 6 6 6 5 6 6 6 6 5 6 6 6 7 6 6 8 5 7 7 7 6
[112] 6 7 6 6 6 6 8 8 6 7 6 8 6 7 7 6 6 6 7 7 8 6 6 6 8 6 6 6 7 7 7 6 7 7 7 6 6
[149] 6 6
> table(round.flower)
round.flower
 4  5  6  7  8
 5 47 68 24  6
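The command that created round.flower is not shown in the answer. A minimal sketch of how it could have been produced follows, again assuming the built-in iris data frame as a stand-in for flower.data; freq and freq.of.7 are illustrative names.

```r
# Assumption: the built-in iris data stands in for flowers.csv
flower.data <- iris

# Round sepal length to whole numbers, then count each rounded value
round.flower <- round(flower.data$Sepal.Length)
freq <- table(round.flower)
print(freq)

# Look up the frequency (a count, not a proportion) of the value 7.
# Table entries are indexed by name, so "7" is passed as a string.
freq.of.7 <- freq[["7"]]
print(freq.of.7)
```

Indexing by the name "7" rather than by position avoids picking the wrong column if a rounded value is absent from the data.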
------------------------------------------------------------------------------------------
Assuming that you read the flowers.csv file into an R object called flower.data,
run the following R code (do not paste the ">" character into R). Note that we
are not rounding the numbers here. Use the output for the next five tasks:
9. What is the sum of the first three frequencies in the frequency table for sepal
width? (1 + 3 + 4) = 8
10. What does your answer to the previous question represent (in terms of sepal
width and frequency and the percentage of all sepal measurements)?
Answer: It represents the number of flowers in the sample whose sepal width is one of the
three smallest values in the frequency table: 8 of the 150 flowers, or about 5.3% of all
sepal measurements.
11. What is the sum of the last three frequencies in the frequency table for sepal
width? (1 + 1 + 1) = 3
12. How many flowers in the sample had sepal widths less than 4 (do NOT round
the sepal width numbers for this, but you can round your final answer to 3
decimal places)? 146 flowers had a sepal width less than 4.
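The sums and the count in tasks 9 to 12 can be checked with a short sketch. The R code referred to in the task text is not reproduced in this document, so the lines below are not that code; they assume the built-in iris data frame as a stand-in for flower.data, and the names width.freq, first.three, last.three, below.4 and pct.first.three are illustrative.

```r
# Assumption: the built-in iris data stands in for flowers.csv
flower.data <- iris

# Frequency table of the unrounded sepal widths, sorted by width
width.freq <- table(flower.data$Sepal.Width)
n <- length(width.freq)

# Sum of the first three and of the last three frequencies
first.three <- sum(width.freq[1:3])
last.three <- sum(width.freq[(n - 2):n])

# Number of flowers with sepal width strictly below 4
below.4 <- sum(flower.data$Sepal.Width < 4)

# Share of all measurements covered by the first three frequencies
pct.first.three <- round(100 * first.three / sum(width.freq), 1)

print(c(first.three = first.three, last.three = last.three,
        below.4 = below.4, pct.first.three = pct.first.three))
```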
13. What does the tallest bar in the plot represent? (multiple choice)
a. mean
b. mode
c. median
------------------------------------------------------------------------------------------
14. Create a frequency table that shows the frequencies for each species of flower
in the sample. Paste your R command and output into your answer (do NOT
display data from a data frame, display data using the table() command)
> table(flower.data$Species)
50 50 50
15. Explain two things about the table that you created for the previous task:
Why did the frequency table for flower species contain words in the first row as
opposed to numbers?
Because Species is a qualitative (categorical) variable: it holds words, not numbers, so R
stored it as a factor rather than as a numeric variable.
What is the meaning of the numbers in the second row of the table?
They are the number of flowers of each species in the sample. Each species has the same
count (50), which gives a total of 150 observations. In other words, each species makes up
1/3 of the sample.
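The equal shares described in this answer can be confirmed with prop.table(), which converts counts to proportions. This is a minimal sketch that assumes the built-in iris data frame as a stand-in for flower.data; species.freq and species.share are illustrative names.

```r
# Assumption: the built-in iris data stands in for flowers.csv
flower.data <- iris

# Counts per species: the first row of the printed table holds the
# factor levels (words), the second row holds the counts
species.freq <- table(flower.data$Species)
print(species.freq)

# Convert the counts to proportions of the whole sample
species.share <- prop.table(species.freq)
print(round(species.share, 3))
```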