HW1 PDF
2- A survey was conducted on 193 Yale University undergraduate students who took a
Statistical Inference course in 2016. The survey asked the students about their GPA,
which can range from 0 to 4 points, and the number of hours they studied per week. The
relationship between these two variables is shown in the scatterplot below:
3- The number of infant deaths per 1,000 live births is called the infant mortality rate.
The estimated infant mortality rate in 2012 is shown for 222 countries in the histograms
below. The histogram on the left displays frequencies, and the other displays relative
frequencies.
Homework 1
Statistical Inference, Spring 97
a. Examine the shape of the distribution in each histogram. Which of these words are
appropriate to describe the shape of the distribution:
left-skewed, right-skewed, symmetric, uniform, bimodal, large outliers?
b. Estimate the minimum, Q1, median, Q3, maximum, and mode of the distribution.
c. Is the mean larger or smaller than the median? Why?
d. Which boxplot corresponds to the infant mortality data?
e. What features of the distribution are apparent in the histogram but not in the
boxplot?
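As a rough illustration of parts b and c, the sketch below computes the requested summary statistics for a small right-skewed sample. The values in `hypothetical_rates` are made up for illustration only and are not the 222-country data; the function name `summarize` is also hypothetical.

```python
import statistics

def summarize(values):
    """Return min, Q1, median, Q3, max, mean, and mode(s) of a sample."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return {
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "modes": statistics.multimode(values),
    }

# Made-up, right-skewed values (NOT the real infant mortality data).
hypothetical_rates = [2, 3, 4, 4, 5, 6, 8, 10, 15, 22, 40, 75, 110]

summary = summarize(hypothetical_rates)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In a sample like this one, the few very large values pull the mean above the median, which is the pattern to look for when comparing the two in a skewed histogram.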
5- The following paragraph describes a study. Identify the explanatory and
response variables. Is a confounding variable likely to be present in this study? If
so, how could you change the study in order to eliminate the effect of the
confounding variable?
In an observational study, more than 50 people who previously worked for an
international organization applied for a position and were asked to get a letter of
recommendation from their previous employers. It was observed that letters of
recommendation that contained more detail were much more persuasive and
acceptable than letters that contained less detail.
6- The EIA test (a test used to detect whether an individual is infected with HIV) gives a
positive result with probability 0.006 if the person is HIV negative.
a. Two individuals who are both HIV negative are tested. What is the probability of
getting at least one positive test result?
b. What is the probability of getting at least one positive test result if three people who
are all HIV negative take the test?
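One way to approach both parts is through the complement rule: if the test results of different people are independent (an assumption the problem does not state explicitly), the probability of at least one positive result among n HIV-negative people is 1 - (1 - 0.006)^n. A minimal sketch, where the function name `prob_at_least_one_positive` is hypothetical:

```python
def prob_at_least_one_positive(false_positive_rate: float, n_people: int) -> float:
    """P(at least one positive) among n HIV-negative people.

    Assumes the test results are independent across people.
    """
    prob_all_negative = (1.0 - false_positive_rate) ** n_people
    return 1.0 - prob_all_negative

FALSE_POSITIVE_RATE = 0.006  # given in the problem statement

for n in (2, 3):
    p = prob_at_least_one_positive(FALSE_POSITIVE_RATE, n)
    print(f"n = {n}: P(at least one positive) = {p:.6f}")
```

Working with the complement avoids enumerating every combination of positive and negative results.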
7- A misleading graph may result from trying to impress readers by showing them false
information. One way of producing such a graph is to misrepresent performance.
a. Why is the bar chart below misleading? How should the information be
represented?
8- In an observational study, 60 patients who suffered from high blood pressure were given
a specific treatment. 16 out of 25 women who were under treatment were cured. Also, 12
out of 35 men under treatment were cured. We want to test whether this treatment has the
same effect on men as on women.
a. Based on the mosaic plot, is gender independent of whether the treatment works
or not? Explain your reasoning.
b. State the null hypothesis and the alternative hypothesis in words.
c. Suppose we want to use a simulation method to test our hypothesis.
Describe the simulation process.
d. What do the simulation results shown below suggest about the effect of gender on
whether the patients are cured or not?
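The simulation in part c can be sketched as a randomization (permutation) test, which is one common choice and not necessarily the method the course intends. Under the null hypothesis that gender and cure are independent, the 28 "cured" outcomes (16 + 12) are shuffled among all 60 patients, 25 of them are treated as women, and the difference in cure proportions is recorded. The function name `simulate_differences`, the number of repetitions, and the seed are all assumptions made for illustration.

```python
import random

def simulate_differences(n_women=25, n_men=35, cured_total=28,
                         n_sims=10_000, seed=0):
    """Simulate (women's cure rate - men's cure rate) under independence."""
    rng = random.Random(seed)
    # 1 = cured, 0 = not cured, for all 60 patients.
    outcomes = [1] * cured_total + [0] * (n_women + n_men - cured_total)
    diffs = []
    for _ in range(n_sims):
        rng.shuffle(outcomes)                    # break any link with gender
        cured_women = sum(outcomes[:n_women])    # first 25 play the role of women
        cured_men = cured_total - cured_women
        diffs.append(cured_women / n_women - cured_men / n_men)
    return diffs

# Observed difference from the problem statement: 16/25 versus 12/35.
observed = 16 / 25 - 12 / 35

diffs = simulate_differences()
# Two-sided: fraction of simulated differences at least as extreme as observed.
p_value = sum(abs(d) >= observed - 1e-12 for d in diffs) / len(diffs)

print(f"observed difference = {observed:.4f}")
print(f"simulated p-value   = {p_value:.4f}")
```

A small fraction means a difference as large as the observed one rarely arises by chance alone when gender and cure are unrelated.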
9- Find a news item from the past year that contains statistical information and
critically analyze it. Look for sources of bias, statistical mistakes, fallacies,
misinformation, misleading graphs, and other common deficiencies.