
STATISTICS Review

for compre

STATISTICS

DESCRIPTIVE STATISTICS

Purpose: Descriptive statistics summarize and describe the main features of a dataset, providing simple
summaries of the sample and its measurements.

Key Concepts:

 Measures of Central Tendency: Mean, median, and mode.
 Measures of Dispersion: Range, variance, and standard deviation.
 Graphical Representations: Histograms, bar charts, pie charts, and box plots.

Application: Descriptive statistics are used to present the basic information about the data and to make it
easier to understand through summary measures and visualizations. They do not involve making predictions
or generalizations beyond the data at hand.

Example

Descriptive Statistics:

 Calculating the average score of students in a class.
 Creating a bar chart to show the distribution of scores.
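
The two descriptive tasks above can be sketched in a few lines of Python. The scores are hypothetical values chosen only for illustration, and the bar chart is drawn as plain text so that no plotting library is needed.

```python
from collections import Counter
import statistics

# Hypothetical scores for a class of seven students.
scores = [70, 80, 80, 90, 90, 90, 100]

# Descriptive summary: the average score of this class.
average = statistics.mean(scores)
print(f"average score = {average:.2f}")

# Text bar chart: one '#' per student who obtained each score.
counts = Counter(scores)
for score in sorted(counts):
    print(f"{score:>3} | {'#' * counts[score]}")
```

Both results describe only these seven scores; nothing is claimed about students outside the class.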

INFERENTIAL STATISTICS

Purpose: Inferential statistics use a sample of data to draw conclusions about the population it came from.
They support estimates, predictions, and decisions that extend beyond the observed data.

Key Concepts:

 Sampling: Selecting a representative subset from the population.
 Hypothesis Testing: Testing assumptions (null and alternative hypotheses) about the population.
 Confidence Intervals: Estimating the range within which a population parameter lies, with a certain
level of confidence.
 Regression Analysis: Understanding relationships between variables and making predictions.

Application: Inferential statistics are used to draw conclusions about a population based on sample data,
to test hypotheses, and to make predictions. This involves assessing the reliability and variability of the
sample before generalizing from it.

Example

Inferential Statistics:

 Using the average score of students in a sample to estimate the average score of all students in a
school.
 Conducting a hypothesis test to determine if there is a significant difference between the average
scores of two different classes.
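
The first example, estimating a school-wide average from a sample, might look like the sketch below. It assumes a normal approximation for the confidence interval (a t-based interval is usually preferred for a sample this small), and the scores and the function name mean_confidence_interval are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error uses the sample standard deviation (n - 1 denominator).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical scores from a random sample of ten students.
scores = [72, 85, 90, 66, 78, 88, 95, 70, 81, 75]
low, high = mean_confidence_interval(scores)
print(f"sample mean = {statistics.mean(scores):.1f}")
print(f"95% interval for the school mean: ({low:.1f}, {high:.1f})")
```

The interval is centered on the sample mean; a larger or less variable sample would make it narrower.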

POPULATION

Definition: A population is the entire set of individuals, objects, or events that share a common
characteristic and are the subject of a statistical study. It includes all possible observations or outcomes that
could be of interest.

Characteristics:

 Size: Generally very large or infinite.
 Parameters: Values that describe a characteristic of the population, such as the population mean (μ),
population variance (σ²), and population proportion (P).

Examples:

 All the students in a university.
 Every manufactured product from a specific production line.
 The entire population of a country.

Usage: In many cases, it is impractical or impossible to collect data from an entire population due to
constraints like time, cost, and accessibility. Therefore, researchers often rely on samples to make inferences
about the population.

SAMPLE

Definition: A sample is a subset of the population that is selected for the actual study. It should ideally be
representative of the population to provide accurate and reliable insights.

Characteristics:

 Size: Smaller and more manageable than the population.
 Statistics: Values that describe a characteristic of the sample, such as the sample mean (x̄),
sample variance (s²), and sample proportion (p).

Examples:

 A group of 100 students selected from the entire university for a survey.
 A batch of 50 products taken from a production line for quality testing.
 A survey of 1,000 households to estimate the average income in a city.

Usage: Samples are used to make inferences about the population. The accuracy and reliability of these
inferences depend on the sample size and how representative the sample is of the population.

Key Differences

1. Scope:
o Population: Includes all members of a defined group.
o Sample: Includes only a subset of the population.
2. Parameters and Statistics:
o Population: Described by parameters (e.g., μ, σ²).
o Sample: Described by statistics (e.g., x̄, s²).
3. Data Collection:
o Population: Often impractical to collect data from every member.
o Sample: Practical and cost-effective for data collection and analysis.
4. Inference:
o Population: Provides complete information if data from all members is available.
o Sample: Used to infer and make predictions about the population.

Example

Population: All the cars manufactured by a company in a year.

 Parameter: The average fuel efficiency of all cars produced.

Sample: 200 cars selected randomly from the production line.

 Statistic: The average fuel efficiency of the sampled cars.
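
The car example can be simulated to show how a statistic approximates a parameter. This is only a sketch: the population of 10,000 cars and its fuel-efficiency values (mean 15, standard deviation 1.5, in km/L) are made-up assumptions, generated with a fixed random seed.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: fuel efficiency (km/L) of 10,000 cars.
population = [random.gauss(15.0, 1.5) for _ in range(10_000)]
parameter = statistics.mean(population)   # population mean (μ)

# Simple random sample of 200 cars, drawn without replacement.
sample = random.sample(population, 200)
statistic = statistics.mean(sample)       # sample mean (x̄)

print(f"parameter (all cars)     = {parameter:.2f}")
print(f"statistic (200 sampled)  = {statistic:.2f}")
```

The statistic is close to the parameter but not identical; the gap is sampling error, which inferential methods try to quantify.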

INDEPENDENT VARIABLE:

 The independent variable is the variable that is manipulated or controlled by the researcher.
 It is considered the cause or input in an experiment.
 Changes in the independent variable are expected to cause changes in another variable.
 Example: If you are studying the effect of sunlight on plant growth, the amount of sunlight is the
independent variable.

DEPENDENT VARIABLE:

 The dependent variable is the variable that is measured or observed in an experiment.
 It is considered the effect or outcome that depends on the independent variable.
 Changes in the dependent variable are believed to result from changes in the independent variable.
 Example: In the same plant growth study, the growth of the plant (measured in height, number of
leaves, etc.) is the dependent variable.

Null Hypothesis (H₀):

o The null hypothesis is a statement that there is no effect or no difference, and it serves as the
default or starting assumption.
o It is a statement of no change or no association.
o The purpose of hypothesis testing is to determine whether there is enough evidence to reject
the null hypothesis.
o Example: If you are testing whether a new drug is effective in lowering blood pressure, the
null hypothesis might be that the drug has no effect on blood pressure (i.e., the mean blood
pressure before and after taking the drug is the same).

Alternative Hypothesis (H₁ or Ha):

o The alternative hypothesis is a statement that contradicts the null hypothesis. It suggests that
there is an effect or a difference.
o It represents what the researcher aims to support.
o Rejecting the null hypothesis in favor of the alternative hypothesis indicates that there is
enough evidence to suggest an effect or difference.
o Example: For the same drug test, the alternative hypothesis might be that the drug does have
an effect on blood pressure (i.e., the mean blood pressure after taking the drug is
different from the mean blood pressure before taking the drug). If only a decrease is of
interest, the alternative would instead state that the mean after taking the drug is lower.

Key Points

 Hypothesis Testing Process:
1. Formulate the null hypothesis (H₀) and the alternative hypothesis (H₁).
2. Collect and analyze the data.
3. Calculate a test statistic and the corresponding p-value.
4. Compare the p-value to a significance level α chosen in advance (commonly 0.05).
5. Decide whether to reject or fail to reject the null hypothesis.

 Decision Criteria:

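o If the p-value is less than or equal to the significance level, reject the null hypothesis in favor of the
alternative hypothesis.
o If the p-value is greater than the significance level, fail to reject the null hypothesis. This does not
prove the null hypothesis; it only means the evidence against it is insufficient.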
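
Steps 3 to 5 and the decision criteria can be sketched as a one-sample z-test. This assumes the population standard deviation is known, which is rarely true in practice (a t-test is the usual alternative). The function name two_sided_z_test, the blood pressure changes, and sigma = 5 are all hypothetical.

```python
import math
import statistics

def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """One-sample z-test of H0: mu = mu0 against H1: mu != mu0."""
    n = len(sample)
    # Step 3: test statistic and its two-sided p-value.
    z = (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    # Steps 4-5: compare with the significance level and decide.
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical changes in blood pressure (mmHg) for eight patients.
changes = [-6, -4, -9, -2, -7, -5, -3, -8]
z, p_value, decision = two_sided_z_test(changes, mu0=0, sigma=5)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

Here H₀ says the mean change is zero (no effect), so a small p-value counts as evidence that the drug changes blood pressure.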

Measures of Central Tendency

Measures of central tendency describe the center or typical value of a data set. They provide a single value
that represents the middle or average of the data. The main measures of central tendency are:

1. Mean (Arithmetic Average):
o The sum of all values divided by the number of values.
o Example: For the data set [2, 4, 6, 8, 10], the mean is (2 + 4 + 6 + 8 + 10) / 5 = 6.
2. Median:
o The middle value when the data set is ordered from least to greatest.
o If the data set has an even number of values, the median is the average of the two middle
values.
o Example: For the data set [2, 4, 6, 8, 10], the median is 6. For the data set [2, 4, 6, 8], the
median is (4 + 6) / 2 = 5.
3. Mode:
o The value that appears most frequently in a data set.
o A data set can have no mode, one mode, or multiple modes.
o Example: For the data set [2, 4, 4, 6, 8], the mode is 4.
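
The three examples above can be checked with Python's standard statistics module. A minimal sketch; the variable names are illustrative.

```python
import statistics

odd_data = [2, 4, 6, 8, 10]
even_data = [2, 4, 6, 8]
repeated_data = [2, 4, 4, 6, 8]

mean_value = statistics.mean(odd_data)          # (2 + 4 + 6 + 8 + 10) / 5 = 6
median_odd = statistics.median(odd_data)        # middle value: 6
median_even = statistics.median(even_data)      # (4 + 6) / 2 = 5.0
modes = statistics.multimode(repeated_data)     # most frequent value(s): [4]
two_modes = statistics.multimode([1, 1, 2, 2, 3])  # a tie gives [1, 2]

print(mean_value, median_odd, median_even, modes, two_modes)
```

multimode returns a list so that ties are visible. When every value occurs exactly once it returns all of the values, which many textbooks describe as having no mode.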

Measures of Dispersion

Measures of dispersion (also called measures of variability) describe the spread of a data set. They indicate
how much the values in the data set vary or deviate from the central value. The main measures of dispersion are:

1. Range:
o The difference between the highest and lowest values in the data set.
o Example: For the data set [2, 4, 6, 8, 10], the range is 10 - 2 = 8.
2. Variance:
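o The average of the squared differences between each value and the mean. Dividing by the number of
values n, as in the example below, gives the population variance (σ²); the sample variance (s²) divides
by n - 1 instead.
o It gives a measure of how far data points lie from the mean, in squared units.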
o Example: For the data set [2, 4, 6, 8, 10], the variance can be calculated as:
1. Calculate the mean: (2 + 4 + 6 + 8 + 10) / 5 = 6
2. Calculate each squared difference from the mean: (2-6)², (4-6)², (6-6)², (8-6)², (10-6)²
= 16, 4, 0, 4, 16
3. Average these squared differences: (16 + 4 + 0 + 4 + 16) / 5 = 8
3. Standard Deviation:
o The square root of the variance.
o It measures the typical distance of the values from the mean, in the same units as the data.
o Example: For the data set [2, 4, 6, 8, 10], the standard deviation is the square root of 8, which
is approximately 2.83.
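
The worked example can be reproduced with the standard statistics module, which offers both the population versions (dividing by n, as above) and the sample versions (dividing by n - 1). A short sketch:

```python
import statistics

data = [2, 4, 6, 8, 10]

data_range = max(data) - min(data)                 # 10 - 2 = 8

# Population formulas: divide the sum of squared differences (40) by n = 5.
population_variance = statistics.pvariance(data)   # 40 / 5 = 8
population_sd = statistics.pstdev(data)            # sqrt(8), about 2.83

# Sample formulas: divide by n - 1 = 4 instead.
sample_variance = statistics.variance(data)        # 40 / 4 = 10
sample_sd = statistics.stdev(data)                 # sqrt(10), about 3.16

print(data_range, population_variance, round(population_sd, 2))
print(sample_variance, round(sample_sd, 2))
```

Which pair to use depends on whether the five values are the whole population or a sample drawn from a larger one.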

Summary

 Measures of Central Tendency:
o Mean: Average value.
o Median: Middle value.
o Mode: Most frequent value.
 Measures of Dispersion:
o Range: Difference between the maximum and minimum values.
o Variance: Average of the squared differences from the mean.
o Standard Deviation: Square root of the variance, representing the typical deviation from the
mean.

Central tendency gives a single value to represent the data set, while measures of dispersion show how
spread out the data values are around this central value. Both are essential for understanding and
interpreting data.

One-Tailed Test

A one-tailed test, also known as a directional test, is used when the research hypothesis specifies a direction
of the effect or difference. You test whether the sample mean is significantly greater than the hypothesized
population mean (μ₀), or significantly less than it, in the one direction chosen in advance.

 Right-Tailed Test: Used when the alternative hypothesis (H₁) states that the parameter is greater
than the value specified in the null hypothesis (μ₀).
o Example: Testing if a new drug increases the average recovery rate more than the current
drug. H₀: μ ≤ μ₀ (no increase), H₁: μ > μ₀ (increase).
 Left-Tailed Test: Used when the alternative hypothesis (H₁) states that the parameter is less than the
value specified in the null hypothesis (μ₀).
o Example: Testing if a new process reduces the average defect rate. H₀: μ ≥ μ₀ (no reduction),
H₁: μ < μ₀ (reduction).

Two-Tailed Test

A two-tailed test, also known as a non-directional test, is used when the research hypothesis does not specify
the direction of the effect or difference. You test whether the sample mean is significantly different from the
hypothesized population mean (μ₀), in either direction (greater than or less than).

 Example: Testing if a new teaching method has a different effect on test scores compared to the
traditional method, regardless of whether it is an increase or decrease. H₀: μ = μ₀ (no difference),
H₁: μ ≠ μ₀ (difference).

Key Points

1. Hypotheses Formulation:
o One-Tailed Test:
 Right-tailed: H₀: μ ≤ μ₀ vs. H₁: μ > μ₀
 Left-tailed: H₀: μ ≥ μ₀ vs. H₁: μ < μ₀
o Two-Tailed Test:
 H₀: μ = μ₀ vs. H₁: μ ≠ μ₀
2. Critical Region:
o One-Tailed Test: The critical region is located in one tail of the distribution (either the right
tail or the left tail).
o Two-Tailed Test: The critical region is split between both tails of the distribution.
3. Significance Level (α):
o One-Tailed Test: The entire significance level (e.g., 0.05) is placed in one tail.
o Two-Tailed Test: The significance level is divided between the two tails (e.g., 0.025 in each
tail if α = 0.05).
4. Decision Rule:
o One-Tailed Test: Reject H₀ if the test statistic falls in the critical region of the specified tail.
o Two-Tailed Test: Reject H₀ if the test statistic falls in either of the critical regions of the two
tails.
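
Points 2 to 4 can be made concrete for a z-test, where the test statistic follows a standard normal distribution under H₀. The sketch below is an illustration under that assumption; the function names critical_values and rejects_h0 are hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha=0.05, tail="two"):
    """Critical z value(s) that bound the rejection region."""
    std_normal = NormalDist()
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)   # all of alpha in the right tail
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)       # all of alpha in the left tail
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2) # alpha / 2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'right', 'left', or 'two'")

def rejects_h0(z, alpha=0.05, tail="two"):
    """Decision rule: True if z falls in the critical region."""
    crit = critical_values(alpha, tail)
    if tail == "right":
        return z > crit[0]
    if tail == "left":
        return z < crit[0]
    return z < crit[0] or z > crit[1]

print(critical_values(0.05, "right"))  # about (1.645,)
print(critical_values(0.05, "two"))    # about (-1.960, 1.960)
print(rejects_h0(1.8, tail="right"), rejects_h0(1.8, tail="two"))
```

With α = 0.05, a test statistic of z = 1.8 lies in the right-tailed critical region but not in the two-tailed one, because the two-tailed test places only 0.025 in each tail.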

Examples

 One-Tailed Test: Suppose you are testing if a new fertilizer increases plant height. You would use a
right-tailed test because you are only interested in whether the fertilizer increases the height, not if it
decreases it.
o H₀: μ ≤ μ₀ (no increase in height)
o H₁: μ > μ₀ (increase in height)
 Two-Tailed Test: Suppose you are testing if a new teaching method changes student performance (it
could be better or worse). You would use a two-tailed test.
o H₀: μ = μ₀ (no change in performance)
o H₁: μ ≠ μ₀ (change in performance)
