Assignment 1 STAT2263
Submit before 5:00 PM on Monday, January 22, 2024. Please scan or take CLEAR
pictures of the solutions you submit. (55 points)
Input your solution (either via typed text or by uploading a photo of your answer) in the correct
box on the Crowdmark assignment. For example, your solution to Q1 will be inserted into the Q1
answer box. Please make sure your answers align with the appropriate questions.
(21) Q1. For each of the scenarios in a)-c), answer the following questions:
Q1.a) A 2015 study used data from the Canadian Community Health Survey (CCHS) to study
the smoking habits of Canadian residents. The survey was completed by 57 000 Canadians.
Of primary interest was the participant’s smoking status (smoker or non-smoker) and the
number of cigarettes smoked per day.
Q1.b) Researchers are interested in examining the relationship between nitrogen dioxide (air
pollutant) levels and gestational age (period of time between conception and birth). During
the study period, researchers obtained data on nitrogen dioxide (in parts per million)
collected by air quality monitoring stations scattered throughout the city of Ottawa, Canada.
Data on the length of gestation (in completed weeks) was collected for 15 000 births in
Ottawa between 1989 and 1993, and nitrogen dioxide exposure during gestation was
calculated for each birth.
Q1.c) A group of researchers was interested in determining whether acupuncture relieves
migraine pain. To do this, the researchers enrolled individuals diagnosed with migraine
headaches at a single clinic in Ottawa, Canada, and randomly assigned them to one of
two groups: treatment or control. 50 individuals in the treatment group received
acupuncture that is specifically designed to treat migraines. 50 individuals in the control group
received placebo acupuncture (needle insertion at non-acupoint locations). 24 hours later,
all individuals were asked to rate their migraine pain as mild, moderate, or severe.
(6) Q2. Suppose you would like to replicate the study in the research scenario described in Q1.a) at
UNBSJ. You design a questionnaire that you will distribute to students asking them about
their smoking habits. Now, you must decide on how you will select students to participate in
your study.
Q2.a) Briefly (2-3 sentences) describe how simple random sampling could be used to select study
participants (2 pt).
Q2.b) Briefly (2-3 sentences) describe how random cluster sampling could be used to select study
participants (be sure to identify what your clusters are) (2 pt).
Q2.c) Briefly (2-3 sentences) describe how random stratified sampling could be used to select
study participants (be sure to identify what your strata are) (2 pt).
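The three sampling designs named in Q2 differ only in where the randomness is applied. The sketch below is a minimal illustration under stated assumptions: the sampling frame of 200 students, the "section" and "year" fields, and the helper function names are all hypothetical and are not part of the assignment. The choice of clusters and strata here is one possibility among many.

```python
import random
from collections import defaultdict

# Hypothetical sampling frame: 200 made-up students, each assigned to one of
# 10 class sections and one of 4 years of study (assumptions for illustration).
rng = random.Random(2263)
students = [
    {"id": i, "section": f"S{i % 10}", "year": 1 + i % 4}
    for i in range(200)
]

def simple_random_sample(frame, n, rng):
    """Draw n students so that every student is equally likely to be chosen."""
    return rng.sample(frame, n)

def cluster_sample(frame, key, n_clusters, rng):
    """Randomly choose whole clusters and keep every student inside them."""
    clusters = sorted({s[key] for s in frame})
    chosen = set(rng.sample(clusters, n_clusters))
    return [s for s in frame if s[key] in chosen]

def stratified_sample(frame, key, n_per_stratum, rng):
    """Draw a simple random sample separately within every stratum."""
    strata = defaultdict(list)
    for s in frame:
        strata[s[key]].append(s)
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

srs = simple_random_sample(students, 20, rng)          # 20 individuals
clu = cluster_sample(students, "section", 2, rng)      # 2 whole sections
strat = stratified_sample(students, "year", 5, rng)    # 5 students per year
```

In this sketch the cluster sample contains everyone from a few randomly chosen sections, while the stratified sample contains a few randomly chosen students from every year.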
(12) Q3. Describe the distribution in the histograms below and estimate the mean and median (3 pt).
Match each histogram to the corresponding boxplot (1 pt).
Q3.a) Input your description of the histogram in plot (a), an estimate of the mean and median,
and whether it corresponds to boxplot (1), (2), or (3).
Q3.b) Input your description of the histogram in plot (b), an estimate of the mean and median,
and whether it corresponds to boxplot (1), (2), or (3).
Q3.c) Input your description of the histogram in plot (c), an estimate of the mean and median,
and whether it corresponds to boxplot (1), (2), or (3).
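When estimating a mean and median from a histogram, it can help to recall how skewness separates the two. The sketch below uses made-up values (not read from the assignment's plots) to show that a long right tail pulls the mean above the median.

```python
import statistics

# Made-up right-skewed values: most observations are small, a few are large.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9, 15]

mean = statistics.mean(right_skewed)      # pulled upward by the values 9 and 15
median = statistics.median(right_skewed)  # middle value, unaffected by the tail

print(f"mean = {mean:.2f}, median = {median}")
```

For a left-skewed distribution the relationship typically reverses, and for a roughly symmetric one the two values are close.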
Minitab Questions: Download the dataset “2022 US Birth Data” from the Assignments module
on D2L. This dataset contains birth information from 500 randomly selected pregnancies in
2022 in the United States.
(5) Q4. Consider the variable “Maternal Educ”. This is the highest degree or level of school completed
by the mother at the time of the delivery.
Q4.a) Classify this variable as quantitative or qualitative (categorical). If quantitative, state
whether the variable is discrete or continuous and the appropriate units. If qualitative, state
whether the variable is nominal or ordinal and the number of levels. (1 pt)
Q4.b) Construct a bar chart for maternal education (note: ensure the bars are appropriately
ordered, if applicable). Upload a copy of the chart. (2 pt)
Q4.c) Construct a frequency distribution and display it as a table. Your table should include a
column for frequency (counts), percents, and cumulative percents. (note: ensure the rows
are appropriately ordered, if applicable). Upload a copy of the Minitab output. (1 pt)
Q4.d) What percentage of mothers have a Bachelors Degree or Associate Degree? (1 pt)
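The structure of a frequency table with counts, percents, and cumulative percents can be sketched outside Minitab as follows. The level names and the ten responses are hypothetical placeholders; the labels and counts in the real dataset may differ.

```python
from collections import Counter

# Hypothetical ordinal levels, listed from lowest to highest (an assumption).
levels = ["High school", "Associate", "Bachelors", "Graduate"]

# Ten made-up responses, not taken from the assignment dataset.
responses = [
    "Bachelors", "High school", "Associate", "Bachelors", "Graduate",
    "High school", "Bachelors", "Associate", "High school", "Bachelors",
]

counts = Counter(responses)
total = len(responses)

rows = []
cumulative = 0.0
for level in levels:  # follow the ordinal order, not alphabetical order
    percent = 100 * counts[level] / total
    cumulative += percent
    rows.append((level, counts[level], percent, cumulative))

print(f"{'Level':<12} {'Count':>5} {'Percent':>8} {'Cum Pct':>8}")
for level, count, percent, cum in rows:
    print(f"{level:<12} {count:>5} {percent:>8.1f} {cum:>8.1f}")
```

Because the rows follow the ordinal order, the cumulative percent column has a meaningful reading ("this level or lower"), and the percent for a combination of levels is the sum of their individual percents.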
(5) Q5. Consider the variable “Infant Birthweight”. This is the birthweight of the baby measured in
grams.
Q5.a) Classify this variable as quantitative or qualitative (categorical). If quantitative, state
whether the variable is discrete or continuous and the appropriate units. If qualitative, state
whether the variable is nominal or ordinal and the number of levels. (1 pt)
Q5.b) Use Minitab to compute descriptive statistics for this variable. Upload a copy of the
Minitab output. (1 pt)
Q5.c) Construct the appropriate graph to visualize the distribution of infant birthweight. Upload
a copy of your graph. Make sure that the axes are correctly labelled. (2 pt)
Q5.d) Describe the distribution of infant birthweight. (1 pt)
(6) Q6. Consider the variable “Maternal Age”. This is the age of the mother (in years) at the time of
delivery.
Q6.a) Use Minitab to compute descriptive statistics for this variable. Upload a copy of the
Minitab output. (1 pt)
Q6.b) Using the Minitab output in a), what is the interquartile range? (1 pt)
Q6.c) Using the Minitab output in a), are there any outliers? How do you know? (2 pt)
Q6.d) Create a boxplot of the data in “Maternal Age”. Label the first quartile (Q1), median,
and third quartile (Q3) on the plot. (2 pt)
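The interquartile range and an outlier check can be illustrated with a short sketch. It assumes the common 1.5 × IQR fence rule, which may or may not be the rule used in the course, and it uses made-up ages rather than the assignment dataset. Quartile conventions differ between software packages, so values computed this way are not guaranteed to match Minitab output.

```python
import statistics

# Made-up ages in years (not from the assignment dataset).
ages = [19, 22, 24, 25, 27, 28, 29, 30, 31, 33, 52]

# statistics.quantiles with n=4 returns the three quartile cut points;
# the default method is "exclusive".
q1, median, q3 = statistics.quantiles(ages, n=4)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # values below this are flagged
upper_fence = q3 + 1.5 * iqr   # values above this are flagged

outliers = [a for a in ages if a < lower_fence or a > upper_fence]

print(f"Q1 = {q1}, median = {median}, Q3 = {q3}, IQR = {iqr}")
print(f"fences = ({lower_fence}, {upper_fence}), outliers = {outliers}")
```

The same three quartile values are the ones a boxplot draws as the bottom of the box, the line inside it, and the top of the box.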