Comparison of Means: Hypothesis Testing
Comparison of Means
There are many cases in statistics where you’ll want to compare means for two populations or
samples. Which technique you use depends on what type of data you have and how that data is
grouped together.
The four major ways of comparing means from data that is assumed to be normally distributed are:
1. Independent Samples T-Test. Use the independent samples t-test when you want to compare means for two data sets that are independent from each other.
2. One-Sample T-Test. Choose this when you want to compare means between one data set and a specified constant (like the mean from a hypothetical normal distribution).
3. Paired Samples T-Test. Use this test if you have one group tested at two different times. In other words, you have two measurements on the same item, person, or thing. The groups are "paired" because there are intrinsic connections between them (i.e. they are not independent). This comparison of means is often used for groups of patients before treatment and after treatment, or for students tested before remediation and after remediation.
4. One-way Analysis of Variance (ANOVA). Although not strictly a test for comparing two means, ANOVA is the main option when you have more than two levels of an independent variable. For example, if your independent variable were "brand of coffee," your levels might be Starbucks, Peet's, and Trader Joe's. Use this test when you have a group of individuals randomly split into smaller groups and completing different tasks (like drinking different coffee).
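As a quick illustration, here is a minimal R sketch of all four approaches; the data vectors and seed below are invented purely for demonstration:

set.seed(42)
x <- rnorm(30, mean = 10, sd = 2)    # hypothetical sample 1
y <- rnorm(30, mean = 11, sd = 2)    # hypothetical sample 2

t.test(x, y)                  # 1. independent samples t-test
t.test(x, mu = 10)            # 2. one-sample t-test against a constant
t.test(x, y, paired = TRUE)   # 3. paired samples t-test (same subjects twice)

# 4. one-way ANOVA with three levels of one factor
scores <- c(x, y, rnorm(30, mean = 12, sd = 2))
brand  <- factor(rep(c("Starbucks", "Peets", "TraderJoes"), each = 30))
summary(aov(scores ~ brand))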
Testing a Proportion
The One-Sample Proportion Test is used to assess whether a population proportion (P1) is
significantly different from a hypothesized value (P0). This is called the hypothesis of inequality. The
hypotheses may be stated in terms of the proportions, their difference, their ratio, or their odds
ratio, but all four hypotheses result in the same test statistics.
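In R, a rough sketch of this test might look as follows (the counts are hypothetical; prop.test() uses a chi-squared approximation, while binom.test() is exact):

prop.test(x = 54, n = 100, p = 0.5)   # H0: the population proportion is 0.5
binom.test(54, 100, p = 0.5)          # exact version of the same test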
If the test concludes that the correlation coefficient is significantly different from zero,
we say that the correlation coefficient is “significant.”
If the test concludes that the correlation coefficient is not significantly different from zero (it is close to zero), we say that the correlation coefficient is "not significant."
The one-sample t-test can be used only when the data are normally distributed. It is used to compare the mean of one sample to a known standard (or theoretical/hypothetical) mean (μ).
What Is a T-Test?
A t-test is a type of inferential statistic used to determine if there is a significant
difference between the means of two groups, which may be related in certain
features. It is mostly used when the data sets, like the data set recorded as
the outcome from flipping a coin 100 times, would follow a normal distribution
and may have unknown variances. A t-test is used as a hypothesis testing
tool, which allows testing of an assumption applicable to a population.
• Calculating a t-test requires three key data values. They include the
difference between the mean values from each data set (called the
mean difference), the standard deviation of each group, and the number
of data values of each group.
• There are several different types of t-test that can be performed
depending on the data and type of analysis required.
• A large t-score indicates that the groups are different.
• A small t-score indicates that the groups are similar.
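As a sketch of how those three values combine, here is the t-score for two independent samples computed by hand in R (all numbers hypothetical; this uses Welch's formula, which allows unequal variances):

m1 <- 10.2; s1 <- 2.1; n1 <- 30   # mean, standard deviation, size of group 1
m2 <- 11.0; s2 <- 1.9; n2 <- 30   # mean, standard deviation, size of group 2
t_score <- (m1 - m2) / sqrt(s1^2 / n1 + s2^2 / n2)   # Welch t-statistic
t_score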
Z-TEST
• A z-statistic, or z-score, is a number representing the result from the z-
test.
• Z-tests are closely related to t-tests, but t-tests are best performed when
an experiment has a small sample size.
• Also, t-tests assume the standard deviation is unknown, while z-tests
assume it is known.
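Base R has no built-in z.test(), but because the standard deviation is assumed known, the z-statistic is easy to compute directly; a minimal sketch with invented numbers:

x_bar <- 10.4; mu0 <- 10          # sample mean and hypothesized mean
sigma <- 2; n <- 100              # population sd assumed KNOWN, sample size
z <- (x_bar - mu0) / (sigma / sqrt(n))
p_value <- 2 * pnorm(-abs(z))     # two-tailed p-value
c(z = z, p = p_value)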
F-TEST
An F-test is any statistical test in which the test statistic has an F-distribution under the null
hypothesis. It is most often used when comparing statistical models that have been fitted to a
data set, in order to identify the model that best fits the population from which the data were
sampled.
A technique that compares the samples on the basis of their means is called ANOVA.
Analysis of variance (ANOVA) is a statistical technique that is used to check if
the means of two or more groups are significantly different from each other.
ANOVA checks the impact of one or more factors by comparing the means of
different samples.
Another measure to compare the samples is called a t-test. When we have only
two samples, t-test and ANOVA give the same results. However, using a t-test
would not be reliable in cases where there are more than 2 samples. If we
conduct multiple t-tests for comparing more than two samples, it will have a
compounded effect on the error rate of the result.
There are two kinds of means that we use in ANOVA calculations: separate sample means and the grand mean. The grand mean is the mean of the sample means, or the mean of all observations combined, irrespective of the samples.
The null hypothesis in ANOVA holds when all the sample means are equal, or they don't have any significant difference. On the other hand, the alternative hypothesis holds when at least one of the sample means is different from the rest. In mathematical form, they can be represented as:
H0: μ1 = μ2 = … = μk
HA: at least one μi differs from the rest
The one-way ANOVA compares the means between the groups you are
interested in and determines whether any of those means are statistically
significantly different from each other. Specifically, it tests the null hypothesis:
H0: µ1 = µ2 = µ3 = … = µk
where µ = group mean and k = number of groups. If, however, the one-way
ANOVA returns a statistically significant result, we accept the alternative
hypothesis (HA), which is that there are at least two group means that are
statistically significantly different from each other.
ANOVA with interaction effects:
Interaction effects represent the combined effects of factors on the dependent measure. When an
interaction effect is present, the impact of one factor depends on the level of the other factor. Part
of the power of ANOVA is the ability to estimate and test interaction effects.
F-Statistic
The statistic which measures whether the means of different samples are significantly different is called the F-ratio. The lower the F-ratio, the more similar the sample means are; in that case, we cannot reject the null hypothesis.
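A minimal R sketch showing where the F-ratio appears in the output (the groups and values are invented):

group <- factor(rep(c("A", "B", "C"), each = 10))
value <- c(rnorm(10, mean = 5), rnorm(10, mean = 6), rnorm(10, mean = 7))
summary(aov(value ~ group))   # the "F value" column is the F-ratio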
When the outcome or dependent variable (in our case the test scores) is affected by two independent variables/factors, we use a slightly modified technique called two-way ANOVA. That's why a two-way ANOVA can have up to three hypotheses, which are as follows:
Two null hypotheses will be tested if we have placed only one observation in each cell; the third (interaction) hypothesis requires more than one observation per cell. For this example, those hypotheses will be:
H01: All the music treatment groups have equal mean scores.
H02: All the age groups have equal mean scores.
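A minimal R sketch of a two-way design (the factor levels, cell counts, and scores below are all invented for illustration):

set.seed(9)
df <- expand.grid(music = c("none", "classical", "pop"),
                  age   = c("young", "middle", "older"),
                  rep   = 1:4)                    # 4 observations per cell
df$score <- rnorm(nrow(df), mean = 70, sd = 5)    # hypothetical test scores
# music:age is the interaction term; with only one observation per cell it
# is not estimable, so you would fit score ~ music + age instead
summary(aov(score ~ music * age, data = df))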
Multiple linear regression (MLR), also known simply as multiple regression, is a
statistical technique that uses several explanatory variables to predict the
outcome of a response variable. The goal of multiple linear regression (MLR) is
to model the linear relationship between the explanatory (independent)
variables and response (dependent) variable.
▪ An outlier is a data point whose response y does not follow the general
trend of the rest of the data.
▪ A data point has high leverage if it has "extreme" predictor x values.
With a single predictor, an extreme x value is simply one that is
particularly high or low. With multiple predictors, extreme x values may
be particularly high or low for one or more predictors, or may be
"unusual" combinations of predictor values (e.g., with two predictors
that are positively correlated, an unusual combination of predictor
values might be a high value of one predictor paired with a low value of
the other predictor).
A data point is influential if it unduly influences any part of a regression
analysis, such as the predicted responses, the estimated slope coefficients, or
the hypothesis test results. Outliers and high leverage data points have
the potential to be influential, but we generally have to investigate further to
determine whether or not they are actually influential.
Outliers
Data points that diverge in a big way from the overall pattern are
called outliers. There are four ways that a data point might be considered an outlier:
▪ It could have an extreme X value compared to other data points.
▪ It could have an extreme Y value compared to other data points.
▪ It could have extreme X and Y values.
▪ It might be distant from the rest of the data, even without extreme X or Y values.
Influential Points
An influential point is an outlier that greatly affects the slope of the regression
line. One way to test the influence of an outlier is to compute the regression
equation with and without the outlier.
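A minimal R sketch of that with-and-without comparison (the data and the planted outlier are invented):

set.seed(1)
dat <- data.frame(x = 1:20, y = 1:20 + rnorm(20))
dat$y[17] <- 40                            # plant a hypothetical outlier
fit_all <- lm(y ~ x, data = dat)
fit_wo  <- lm(y ~ x, data = dat[-17, ])    # refit without the suspect point
coef(fit_all)
coef(fit_wo)                               # compare slopes and intercepts
cooks.distance(fit_all)[17]                # a formal per-point influence measure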
If your data set includes an influential point, here are some things to consider.
▪ Compare the decisions that would be made based on regression
equations defined with and without the influential point. If the
equations lead to contrary decisions, use caution.
Consider the following statements. Which are true?
I. When the data set includes an influential point, the data set is nonlinear.
II. Influential points always reduce the coefficient of determination.
III. All outliers are influential data points.
(A) I only
(B) II only
(C) III only
(D) All of the above
(E) None of the above
Solution
The correct answer is (E). None of the statements is true: an influential point does not make a data set nonlinear; removing an influential point can raise or lower the coefficient of determination; and outliers are only potentially influential.
In a regression model, the independent variables should be independent. If the degree of correlation between variables is high enough, it can cause problems when you fit the model and interpret the results. There are two common ways to deal with a large set of correlated variables:
• By only keeping the most relevant variables from the original dataset (this technique is called feature selection)
• By finding a smaller set of new variables, each being a combination of the input variables, containing basically the same information as the input variables (this technique is called dimensionality reduction)
Dimensionality reduction has two key benefits:
• It takes care of multicollinearity by removing redundant features. For example, suppose you have two variables, 'time spent on treadmill in minutes' and 'calories burnt'. These variables are highly correlated, as the more time you spend running on a treadmill, the more calories you will burn. Hence, there is no point in storing both, as just one of them does what you require.
• It helps in visualizing data. As discussed earlier, it is very difficult to visualize data in higher dimensions, so reducing our space to 2D or 3D may allow us to plot and observe patterns more clearly.
Principal Component Analysis
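PCA finds a new set of uncorrelated variables (principal components) that capture decreasing shares of the total variance. A minimal R sketch using the treadmill example from above, with simulated data:

set.seed(7)
minutes  <- rnorm(100, mean = 30, sd = 8)
calories <- 9.5 * minutes + rnorm(100, sd = 15)    # highly correlated pair
pca <- prcomp(cbind(minutes, calories), scale. = TRUE)
summary(pca)    # PC1 should capture nearly all of the variance
head(pca$x)     # the new, uncorrelated coordinates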
Factor Analysis
Probability:
Probability is the branch of mathematics concerning numerical descriptions of
how likely an event is to occur, or how likely it is that a proposition is true.
The probability of an event is a number between 0 and 1, where, roughly
speaking, 0 indicates impossibility of the event and 1 indicates certainty.
• The probability of an event can only be between 0 and 1 and can also be
written as a percentage.
• The probability of event A is often written as P(A).
• If P(A) > P(B), then event A has a higher chance of occurring than event B.
• If P(A) = P(B), then events A and B are equally likely to occur.
Types of Probability
There are three major types of probabilities:
• Theoretical Probability
• Experimental Probability
• Axiomatic Probability
Theoretical Probability
It is based on the possible chances of something happening. The theoretical
probability is mainly based on the reasoning behind probability. For example, if
a coin is tossed, the theoretical probability of getting a head will be ½.
Experimental Probability
It is based on the observations of an experiment. The experimental probability is calculated by dividing the number of times an event occurs by the total number of trials. For example, if a coin is tossed 10 times and heads is recorded 6 times, then the experimental probability for heads is 6/10, or 3/5.
Axiomatic Probability
In axiomatic probability, a set of rules or axioms is established that applies to all types of probability. These axioms were set by Kolmogorov and are known as Kolmogorov's
three axioms. With the axiomatic approach to probability, the chances of
occurrence or non-occurrence of the events can be quantified.
First axiom
The probability of an event is a non-negative real number: P(E) ≥ 0 for every event E.
Second axiom
This is the assumption of unit measure: the probability that at least one of the elementary events in the entire sample space will occur is 1, that is, P(S) = 1.
Third axiom
This is the assumption of σ-additivity: any countable sequence of disjoint sets (synonymous with mutually exclusive events) E1, E2, … satisfies P(E1 ∪ E2 ∪ …) = P(E1) + P(E2) + …
Conditional Probability
The probability of one event given the occurrence of another event is called
the conditional probability. The conditional probability of one random variable given one or more other random variables is referred to as the conditional probability distribution.
For example, the conditional probability of event A given event B is written
formally as:
• P(A given B)
The “given” is denoted using the pipe “|” operator; for example:
• P(A | B)
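Numerically, P(A | B) is the joint probability divided by the marginal probability of B; a tiny R illustration with made-up numbers:

p_A_and_B   <- 0.12               # hypothetical joint probability P(A and B)
p_B         <- 0.30               # hypothetical marginal probability P(B)
p_A_given_B <- p_A_and_B / p_B    # P(A | B) = P(A and B) / P(B)
p_A_given_B                       # 0.4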
Marginal Probability
The probability of one event in the presence of all (or a subset of) outcomes of
the other random variable is called the marginal probability or the marginal
distribution. The marginal probability of one random variable in the presence
of additional random variables is referred to as the marginal probability
distribution.
There is no special notation for the marginal probability; it is just the sum or
union over all the probabilities of all events for the second variable for a given
fixed event for the first variable.
The marginal probability is different from the conditional probability (described above) because it considers the union of all events for the second variable rather than the probability of a single event.
Bayes' Theorem
Bayes' theorem gives the probability of an event based on prior knowledge of conditions related to the event: P(A | B) = P(B | A) × P(A) / P(B).
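A classic illustration in R, using hypothetical diagnostic-test numbers:

p_D      <- 0.01                 # prior: P(disease)
p_pos_D  <- 0.95                 # sensitivity: P(positive | disease)
p_pos_nD <- 0.05                 # false alarm rate: P(positive | no disease)
p_pos    <- p_pos_D * p_D + p_pos_nD * (1 - p_D)  # total probability of positive
p_pos_D * p_D / p_pos            # posterior P(disease | positive), about 0.16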
KEY TAKEAWAYS
· A discrete probability distribution is applicable to the scenarios
where the set of possible outcomes is discrete (e.g. a coin toss, a roll of a die), and the probabilities are here encoded by a discrete list of the
probabilities of the outcomes, known as the probability mass function.
Normal Distribution
The normal distribution is also called the Gaussian distribution (named for Carl
Friedrich Gauss) or the bell curve distribution.
The distribution covers the probability of real-valued events from many
different problem domains, making it a common and well-known distribution,
hence the name “normal.” A continuous random variable that has a normal
distribution is said to be “normal” or “normally distributed.”
Some examples of domains that have normally distributed events include heights of people, errors in measurement, and scores on a test.
In the study of probability theory, the central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution (also known as a "bell curve") as the sample size becomes larger, assuming that all samples are identical in size, and regardless of the population distribution shape.
• The central limit theorem (CLT) states that the distribution of sample
means approximates a normal distribution as the sample size gets larger.
• Sample sizes equal to or greater than 30 are considered sufficient for the
CLT to hold.
• A key aspect of the CLT is that the average of the sample means will equal the population mean, and the standard deviation of the sample means (the standard error) will equal the population standard deviation divided by the square root of the sample size.
• A sufficiently large sample size can predict the characteristics of a
population accurately.
The Central Limit Theorem exhibits a phenomenon where the average of the sample means equals the population mean, which is extremely useful in accurately predicting the characteristics of populations.
The Central Limit Theorem states that the sampling distribution of the sample
means approaches a normal distribution as the sample size gets larger — no
matter what the shape of the population distribution. This fact holds especially
true for sample sizes over 30.
All this is saying is that as you take more samples, especially large ones, your graph of the sample means will look more like a normal distribution.
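A quick simulation in R makes this concrete; the sample size and the skewed source distribution are arbitrary choices:

set.seed(3)
sample_means <- replicate(10000, mean(rexp(40, rate = 1)))  # skewed population
hist(sample_means, breaks = 50)   # roughly bell-shaped around 1
mean(sample_means)                # close to the population mean, 1
sd(sample_means)                  # close to sigma / sqrt(n) = 1 / sqrt(40)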
What is Discrete Distribution?
The Poisson distribution is a discrete distribution, meaning that the variable can
only take specific values in a (potentially infinite) list. Put differently, the
variable cannot take all values in any continuous range. For the Poisson
distribution (a discrete distribution), the variable can only take the values 0, 1,
2, 3, etc., with no fractions or decimals.
KEY TAKEAWAYS
• Named after the mathematician Siméon Denis Poisson, a Poisson distribution can be used to measure how many times an event is likely to occur within "X" period of time.
• Poisson distributions, therefore, are used when the factor of interest is a
discrete count variable.
• Many economic and financial data appear as count variables, such as
how many times a person becomes unemployed in a given year, thus
lending itself to analysis with a Poisson distribution.
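In R, Poisson probabilities come from dpois() and ppois(); for instance, with a hypothetical rate of 3 events per interval:

dpois(0:5, lambda = 3)   # P(X = 0), ..., P(X = 5)
ppois(2, lambda = 3)     # P(X <= 2), the chance of at most two events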
What is a Binomial Distribution?
A binomial distribution can be thought of as simply the probability of a
SUCCESS or FAILURE outcome in an experiment or survey that is repeated
multiple times. The binomial is a type of distribution that has two possible
outcomes (the prefix “bi” means two, or twice). For example, a coin toss has
only two possible outcomes, heads or tails, and taking a test could have two
possible outcomes: pass or fail.
· The first variable in the binomial formula, n, stands for the
number of times the experiment runs.
· The second variable, p, represents the probability of one
specific outcome.
Binomial distributions must also meet the following three criteria:
1. The number of observations or trials is fixed.
2. Each observation or trial is independent.
3. The probability of success is exactly the same from one trial to another.
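In R, binomial probabilities come from dbinom() and pbinom(); for example, for n = 10 coin flips with p = 0.5 per flip:

dbinom(6, size = 10, prob = 0.5)   # P(exactly 6 heads), about 0.205
pbinom(6, size = 10, prob = 0.5)   # P(at most 6 heads)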
Predictive Modeling
Predictive modeling is a process that uses data and statistics to predict outcomes with data models. These models can be used to predict anything from sports outcomes and TV ratings to technological advances and corporate earnings. Predictive modeling is also often referred to as:
• Predictive analytics
• Predictive analysis
• Machine learning
Concept of Multiple Linear Regression
Multiple linear regression is used to estimate the relationship between two or
more independent variables and one dependent variable. You can use
multiple linear regression when you want to know:
1. How strong the relationship is between two or more independent
variables and one dependent variable (e.g. how rainfall, temperature,
and amount of fertilizer added affect crop growth).
2. The value of the dependent variable at a certain value of the
independent variables (e.g. the expected yield of a crop at certain levels
of rainfall, temperature, and fertilizer addition).
In essence, multiple regression is the extension of ordinary least-squares
(OLS) regression because it involves more than one explanatory variable.
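A minimal R sketch of the crop example from above (all numbers are simulated for illustration):

set.seed(11)
crops <- data.frame(rainfall = runif(50, 20, 80),
                    temp     = runif(50, 10, 30),
                    fert     = runif(50, 0, 5))
crops$yield <- 2 + 0.05 * crops$rainfall + 0.3 * crops$temp +
  1.2 * crops$fert + rnorm(50)
fit <- lm(yield ~ rainfall + temp + fert, data = crops)
summary(fit)   # slope coefficients, R-squared, and the overall F-test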
KEY TAKEAWAYS
• Stepwise regression is a method that iteratively examines the statistical
significance of each independent variable in a linear regression model.
• The forward selection approach starts with nothing and adds each new
variable incrementally, testing for statistical significance.
• The backward elimination method begins with a full model loaded with
several variables and then removes one variable to test its importance
relative to overall results.
• Stepwise regression has its downsides, however, as it is an approach
that fits data into a model to achieve the desired result.
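Base R's step() function gives a flavor of this, though it selects on AIC rather than per-variable significance tests; a sketch reusing the hypothetical crops model from the previous example:

full <- lm(yield ~ rainfall + temp + fert, data = crops)
step(full, direction = "backward")   # backward elimination
# forward selection instead starts from an empty model and requires a
# 'scope' argument listing the candidate variables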
A dummy variable is a variable that takes values of 0 and 1, where the values
indicate the presence or absence of something (e.g., a 0 may indicate a
placebo and 1 may indicate a drug). Where a categorical variable has more
than two categories, it can be represented by a set of dummy variables, with
one variable for each category. Numeric variables can also be dummy
coded to explore nonlinear effects. Dummy variables are also known
as indicator variables, design variables, contrasts, one-hot coding, and binary
basis variables.
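In R, a factor is dummy coded automatically; model.matrix() shows the indicator columns that would enter a regression:

treatment <- factor(c("placebo", "drugA", "drugB", "placebo", "drugA"))
model.matrix(~ treatment)   # an intercept plus k - 1 = 2 indicator columns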
Logistic regression:
Logistic regression is a statistical model that in its basic form uses
a logistic function to model a binary dependent variable, although many more
complex extensions exist. In regression analysis, logistic regression (or logit
regression) estimates the parameters of a logistic model (a form of
binary regression).
Logistic regression is the appropriate regression analysis to conduct when the
dependent variable is dichotomous (binary). Like all regression analyses, the
logistic regression is a predictive analysis. Logistic regression is used to
describe data and to explain the relationship between one dependent binary
variable and one or more nominal, ordinal, interval or ratio-level independent
variables.
Odds are determined from probabilities and range between 0 and infinity.
Odds are defined as the ratio of the probability of success and the probability
of failure.
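A minimal R sketch with simulated data (the study-hours setup is invented for illustration):

set.seed(5)
hours <- runif(100, 0, 10)
pass  <- rbinom(100, 1, plogis(-2 + 0.5 * hours))  # simulate a binary outcome
fit   <- glm(pass ~ hours, family = binomial)
summary(fit)
exp(coef(fit))           # coefficients on the odds-ratio scale
p <- 0.75; p / (1 - p)   # odds of 3 from a probability of 0.75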
The Likelihood-Ratio test (sometimes called the likelihood-ratio chi-squared
test) is a hypothesis test that helps you choose the “best” model between
two nested models. “Nested models” means that one is a special case of the
other. For example, you might want to find out which of the following models
is the best fit:
· Model One has four predictor variables (height, weight, age, sex).
· Model Two has two predictor variables (age, sex). It is "nested" within Model One because it includes only a subset of Model One's predictor variables.
This theory can also be applied to matrices. For example, a scaled identity matrix is nested within a more complex compound symmetry matrix.
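In R, a likelihood-ratio test of two nested models can be run with anova(); a sketch reusing the hypothetical logistic fit from the previous example:

small <- glm(pass ~ 1,     family = binomial)   # nested: intercept only
large <- glm(pass ~ hours, family = binomial)   # adds one predictor
anova(small, large, test = "Chisq")             # likelihood-ratio chi-squared test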
When running an ordinary least squares (OLS) regression, one common metric to assess model fit is the R-squared (R²). The R² metric is calculated as follows:
· R² = 1 − [Σᵢ(yᵢ − ŷᵢ)²] / [Σᵢ(yᵢ − ȳ)²]
The dependent variable is y, the predicted value from the OLS regression is ŷ, and the average value of y across all observations is ȳ; the index i runs over all observations.
So then what is a pseudo R-squared? When running a logistic regression, many people would like a similar goodness-of-fit metric. An R-squared value does not exist, however, for logit regressions, since these regressions rely on maximum likelihood estimates arrived at through an iterative process; they are not calculated to minimize variance, so the OLS approach to goodness-of-fit does not apply. However, there are a few variations of a pseudo R-squared which are analogs to the OLS R-squared. For instance, McFadden's pseudo R-squared is one minus the ratio of the fitted model's log-likelihood to that of the intercept-only model.
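McFadden's version, for instance, can be computed directly from model log-likelihoods; continuing the hypothetical logistic models from the sketch above:

ll_full <- logLik(large)   # log-likelihood of the fitted model
ll_null <- logLik(small)   # log-likelihood of the intercept-only model
1 - as.numeric(ll_full) / as.numeric(ll_null)   # McFadden's pseudo R-squared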
The true positive rate is calculated as the number of true positives divided by the sum of the number of true positives and the number of false negatives. It describes how good the model is at predicting the positive class when the actual outcome is positive.
The false positive rate is calculated as the number of false positives divided by the sum of the number of false positives and the number of true negatives. It is also called the false alarm rate, as it summarizes how often a positive class is predicted when the actual outcome is negative.
Classification table:
The Classification Table (aka the Confusion Matrix) compares the predicted
number of successes to the number of successes actually observed and
similarly the predicted number of failures compared to the number actually
observed.
True Positives (TP) = the number of cases which were correctly classified to be
positive, i.e. were predicted to be a success and were actually observed to be a
success
False Positives (FP) = the number of cases which were incorrectly classified as
positive, i.e. were predicted to be a success but were actually observed to be a
failure
True Negatives (TN) = the number of cases which were correctly classified to
be negative, i.e. were predicted to be a failure and were actually observed to
be a failure
False Negatives (FN) = the number of cases which were incorrectly classified as
negative, i.e. were predicted to be a failure but were actually observed to be a
success
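A minimal R sketch computing these counts and the associated rates from made-up predictions:

actual    <- c(1, 1, 0, 0, 1, 0, 1, 0, 0, 1)   # hypothetical observed classes
predicted <- c(1, 0, 0, 1, 1, 0, 1, 0, 0, 0)   # hypothetical predicted classes
tp <- sum(predicted == 1 & actual == 1)
fp <- sum(predicted == 1 & actual == 0)
tn <- sum(predicted == 0 & actual == 0)
fn <- sum(predicted == 0 & actual == 1)
tp / (tp + fn)            # true positive rate
fp / (fp + tn)            # false positive rate (false alarm rate)
table(predicted, actual)  # the classification table itself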
Discriminant Function
Linear discriminant analysis (LDA), normal discriminant analysis (NDA),
or discriminant function analysis is a generalization of Fisher's linear
discriminant, a method used in statistics and other fields, to find a linear
combination of features that characterizes or separates two or more classes of
objects or events. The resulting combination may be used as a linear classifier,
or, more commonly, for dimensionality reduction before later classification.
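A minimal R sketch using lda() from the MASS package on the built-in iris data:

library(MASS)
fit  <- lda(Species ~ ., data = iris)
pred <- predict(fit)
table(pred$class, iris$Species)   # classification table for the three species
head(pred$x)                      # discriminant scores (reduced dimensions)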
The data for the time series is stored in an R object called the time-series object. It is also an R data object, like a vector or data frame. The time series object is created by using the ts() function.
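For example, a hypothetical monthly series starting in January 2020:

sales <- ts(round(runif(24, 100, 200)), start = c(2020, 1), frequency = 12)
sales   # printed with year and month labels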
The gap between the actual data and the trend line is known as the seasonal
variation. Seasonal variation can be described as the difference between
the trend of data and the actual figures for the period in question. A seasonal
variation can be a numerical value (additive) or a percentage (multiplicative).
Time-series analysis involves looking at what has happened in the recent past
to help predict what will happen in the near future.
Seasonal variation
A Seasonal Variation (SV) is a regularly repeating pattern over a fixed number of months.
Trend
A Trend (T) is a long-term movement in a consistent direction. Trends can be
hard to spot because of the confusing impact of the SV. The easiest way to
spot the Trend is to look at the months that hold the same position in each set
of three period patterns.
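In R, decompose() splits a series into trend, seasonal, and random components; a short sketch on the built-in AirPassengers data:

parts <- decompose(AirPassengers, type = "multiplicative")
plot(parts)            # observed, trend, seasonal, and random panels
parts$seasonal[1:12]   # the repeating seasonal pattern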
Review Questions
In one-way analysis of variance, the null hypothesis assumes that at least two of the population means are different. FALSE
One-way analysis of variance requires that the sample sizes for all levels be equal to one another. FALSE
All analysis of variance models require that the data measurement be at least categorical level.
FALSE
When calculating an ANOVA table for Two-Factor Analysis of Variance with Replication, we consider
the following sources of variation except: Between the blocks
When performing a Two-Factor Analysis of Variance with Replication, in order to measure the
interaction effect, the sample size for each combination of Factor A and Factor B must be greater
than or equal to 2
In order to develop a relative frequency distribution, each frequency count must be divided by the total number of observations.
7. The probability that an event will occur given that some other event has already happened is known as joint probability. FALSE. The probability that an event will occur given that some other event has already happened is known as conditional probability.
5. Find the critical z-value for a lower-tail hypothesis test at α = 5%. When the tail area under the standard normal curve is 5%, the critical z-value is −1.645.
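This critical value can be checked in R:

qnorm(0.05)   # -1.645, the z-value cutting off 5% in the lower tail
qnorm(0.95)   # +1.645 for an upper-tail test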
6. For a two-tailed hypothesis test at significance level α, the decision rule has two cut-offs: if the test statistic falls below the lower critical value or above the upper critical value, reject the null hypothesis.
12. The probability that a hypothesis test will reject the null hypothesis when the null hypothesis is false is called the power of the test.
Branch
A B C D
113 120 132 122
121 127 130 118
117 125 129 125
110 129 135 125
The results of a one-way ANOVA are reported below.
ANOVA
Source of variation   SS       df   MS         F
Between               544.25    3   181.4167   13.00
Within                167.5    12   13.95833
Total                 711.75   15
df1 = k − 1 = 4 − 1 = 3, df2 = n − k = 16 − 4 = 12; at α = 0.05 the critical value is F = 3.490. Since the calculated F ≈ 13.00 exceeds 3.490, the null hypothesis of equal means is rejected.
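The whole table can be reproduced in R from the branch data above:

value  <- c(113, 121, 117, 110,   # branch A
            120, 127, 125, 129,   # branch B
            132, 130, 129, 135,   # branch C
            122, 118, 125, 125)   # branch D
branch <- factor(rep(c("A", "B", "C", "D"), each = 4))
summary(aov(value ~ branch))   # reproduces SS, df, MS, and F of about 13.00
qf(0.95, df1 = 3, df2 = 12)    # critical F at alpha = 0.05, about 3.49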
7. A randomized block ANOVA was performed on the differences in price of a gallon of gasoline in three cities (A, B, and C), where blocks represent the type of gasoline (regular, special, extra, and super).
City
A B C
Regular 1.58 1.60 1.59
Special 1.84 1.68 1.90
Extra 1.44 1.45 1.50
Super 1.33 1.50 1.55
ANOVA
Source     SS         df   MS         F          P-value
Rows       0.238467    3   0.079489   13.108     0.00482
Columns    0.01835     2   0.009175    1.513055  0.2937
Error      0.036383    6   0.006064
Total      0.2932     11
Tukey-Kramer is used to determine where the population
differences occur for a one-way ANOVA design. Fisher’s Least
Significant Difference is used for a randomized block ANOVA
design.
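In R, the Tukey procedure is available via TukeyHSD() on a fitted one-way aov object; a sketch reusing the branch data defined in the earlier example:

fit <- aov(value ~ branch)   # 'value' and 'branch' from the earlier sketch
TukeyHSD(fit)                # pairwise mean differences with adjusted p-values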
The degrees of freedom are equal to k − 1, where k is the number of categories or observed cell frequencies.
A high correlation between two independent variables such that the two variables contribute redundant information to the model is known as multicollinearity.
Consider the following stepwise regression procedure. All variables are forced into the model to begin the process. Variables are removed one at a time until no more insignificant variables are found. Once a variable has been removed from the model, it cannot be reentered. This procedure is known as backward elimination.
An interaction exists when one independent variable affects the relationship between another independent variable and a dependent variable.
Which of the following methods is used to help assess whether the regression model meets the assumption of having normally distributed residuals? A normal probability plot of the residuals.
Which of the following statements is true?
Dummy variables are always assigned the value zero or one.
The number of dummy variables is always one fewer than the number of categories.
In a multiple regression model, the regression slope coefficients measure the average change in the dependent variable for a one-unit change in all the independent variables. FALSE. The regression slope coefficients measure the average change in the dependent variable for a one-unit change in the independent variable, while all other independent variables remain constant.
In a multiple regression model, the sample size must be at least one greater than the number of
independent variables. However, it is recommended that the sample size should be at least four
times the number of independent variables.
Correlation coefficients are the quantitative measure used to determine the strength of the linear
relationship between two variables.
The variance inflation factor is an indication of the significance of the regression model. FALSE. The variance inflation factor measures multicollinearity in the regression model; the analysis of variance F-test is a method for testing whether the overall model is significant.
The R-Squared value is a measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the sample size and the number of independent variables in the regression model. FALSE.
The Adjusted R-Squared value is a measure of the percentage of explained variation in the dependent variable that takes into account the relationship between the sample size and the number of independent variables in the regression model.
A complete polynomial model contains terms of all orders less than or equal to the pth order.
In the regression model, both the x and y variables are considered to be random variables.
The null hypothesis in a two-tailed significance test for the correlation is H0: ρ = 0.
Researchers tested to see if there is a correlation between drinking Beverage A and cancer. If they found the correlation to be 0.9, they can assume that drinking Beverage A causes cancer. FALSE. Correlation does not imply causation.
In a linear regression model, the actual y values for each level of x are uniformly distributed around the mean of y. FALSE. They are assumed to be normally distributed around the conditional mean of y for that level of x.
The residual in a regression model is defined as the difference between the actual value and the
predicted value of the dependent variable for a given level of the independent variable.
The chi-square goodness-of-fit test can be used to determine whether the sample data come from
a normally distributed population.
Correct Answer: 3
For every parameter that is estimated using sample data, you lose one additional degree of freedom. In this case both the mean and standard deviation need to be estimated from sample data, so the degrees of freedom are equal to k − 1 − 2, where k = 6 is the number of categories. The degrees of freedom are 3.
Rejecting the null hypothesis in a chi-square goodness-of-fit test leads to the conclusion that the population does not follow the hypothesized distribution.
Contingency analysis helps to make decisions when multiple proportions are involved.
To test whether having a laptop is independent of gender, the expected cell frequency for males who have a laptop is 106.67. (Each expected cell frequency is the row total times the column total divided by the grand total.)