BPCC 108 Notes


Concept and meaning of inferential statistics

Inferential statistics is a branch of statistics that allows researchers to draw conclusions, or inferences, about a population based on data collected from a sample. It involves using probability theory to estimate population parameters, test hypotheses, and make predictions.
Key Concepts of Inferential Statistics:
1. Population and Sample: The population refers to the entire group of individuals or
instances that researchers are interested in studying. A sample is a subset of the
population selected for analysis. Inferential statistics uses sample data to draw
conclusions about the larger population.
2. Estimation: Inferential statistics often involves estimating population parameters
(such as means or proportions) based on sample statistics. Point estimates provide a
single value estimate, while interval estimates (confidence intervals) provide a range
of values within which the population parameter is likely to fall.
3. Hypothesis Testing: This process involves making an assumption (the null hypothesis)
about a population parameter and then using sample data to determine whether
there is enough evidence to reject that assumption in favor of an alternative
hypothesis. Common tests include t-tests, chi-square tests, and ANOVA.
4. Significance Levels: Researchers set a significance level (commonly denoted as alpha,
α) to determine the threshold for rejecting the null hypothesis. A common
significance level is 0.05, meaning there is a 5% risk of concluding that a difference
exists when there is none (Type I error).
5. P-Values: The p-value indicates the probability of obtaining the observed results, or
more extreme results, if the null hypothesis is true. A low p-value (typically less
than the significance level) suggests that the null hypothesis may be rejected.
6. Generalization: One of the primary goals of inferential statistics is to generalize
findings from the sample to the broader population. This requires careful
consideration of sampling methods to ensure that the sample is representative of
the population.
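To make these concepts concrete, the sketch below ties together a point estimate, an interval estimate, and a hypothesis test. It assumes Python with numpy and scipy (the notes do not prescribe any software), and the exam scores are made-up illustrative data.

```python
import numpy as np
from scipy import stats

# Hypothetical exam scores for a sample of 10 students (illustrative only)
sample = np.array([72, 68, 75, 80, 66, 71, 77, 74, 69, 73])

# Point estimate of the population mean
point_estimate = sample.mean()

# Interval estimate: a 95% confidence interval around the sample mean
ci_low, ci_high = stats.t.interval(0.95, df=len(sample) - 1,
                                   loc=point_estimate, scale=stats.sem(sample))

# Hypothesis test: H0 says the population mean is 70
t_stat, p_value = stats.ttest_1samp(sample, popmean=70)

alpha = 0.05  # significance level: 5% Type I error risk
print(f"Point estimate: {point_estimate:.1f}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```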
Meaning of Inferential Statistics:
In essence, inferential statistics provides the tools and methods for making informed
decisions and predictions about a population based on limited data. It allows
researchers to draw conclusions, test theories, and make predictions while accounting
for uncertainty and variability inherent in sampling. This is crucial in fields such as
psychology, medicine, social sciences, and market research, where understanding trends
and making data-driven decisions are essential.

The procedure for conducting inferential statistics


1. Define the Research Question
Clearly articulate the research question or hypothesis you want to test. This will guide
the entire analysis.
2. Identify the Population and Sample
Determine the population of interest and select a representative sample from that
population. Ensure that the sampling method minimizes bias (e.g., random sampling).
3. Collect Data
Gather data from the sample using appropriate methods (e.g., surveys, experiments,
observational studies). Ensure that the data collection process is systematic and
reliable.
4. Choose the Appropriate Statistical Test
Based on the research question and the type of data collected, select the appropriate
inferential statistical test (e.g., t-test, chi-square test, ANOVA). Consider the
assumptions of each test (e.g., normality, homogeneity of variance).
5. Set the Significance Level
Determine the significance level (alpha, α), commonly set at 0.05. This level indicates
the threshold for rejecting the null hypothesis.
6. Conduct the Statistical Analysis
Use statistical software or tools to perform the chosen test on the collected data.
Calculate the test statistic and the corresponding p-value.
7. Interpret the Results
Compare the p-value to the significance level:
If the p-value is less than or equal to α, reject the null hypothesis, indicating that
there is significant evidence to support the alternative hypothesis.
If the p-value is greater than α, fail to reject the null hypothesis, suggesting
insufficient evidence to support the alternative hypothesis.
8. Report the Findings
Present the results in a clear and concise manner, including the test statistic, p-value,
confidence intervals, and any relevant graphs or tables. Discuss the implications of
the findings in relation to the research question.
9. Consider Limitations and Future Research
Acknowledge any limitations of the study, such as sample size or potential biases.
Suggest areas for future research to further explore the topic.
10. Draw Conclusions
Summarize the main findings and their relevance to the research question. Discuss
how the results contribute to the existing body of knowledge in the field.
By following these steps, researchers can effectively use inferential statistics to draw
meaningful conclusions from their data and make informed decisions based on their
findings.
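A minimal sketch of steps 4-7 in Python (assuming scipy; the notes do not name a specific tool), using an independent-samples t-test as the chosen test and made-up data for two groups:

```python
from scipy import stats

def run_two_group_test(group_a, group_b, alpha=0.05):
    """Steps 4-7 of the procedure: run the chosen test, then interpret the p-value.
    A simplified sketch; a real analysis would also verify the test's assumptions."""
    # Step 6: compute the test statistic and p-value
    t_stat, p_value = stats.ttest_ind(group_a, group_b)
    # Step 7: compare the p-value to the significance level
    if p_value <= alpha:
        decision = "reject H0 (significant difference)"
    else:
        decision = "fail to reject H0 (insufficient evidence)"
    return t_stat, p_value, decision

# Hypothetical measurements for two conditions (illustrative only)
control = [5.1, 4.8, 5.5, 5.0, 4.9, 5.3]
treatment = [5.9, 6.1, 5.7, 6.3, 5.8, 6.0]
print(run_two_group_test(control, treatment))
```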
Explain parametric statistics and non-parametric statistics, and describe their assumptions

Parametric statistics refers to a set of statistical techniques that assume the underlying data
follows a specific distribution, typically a normal distribution. These methods are used to make
inferences about population parameters based on sample statistics. Common parametric tests
include t-tests, ANOVA (Analysis of Variance), and linear regression.
Assumptions of Parametric Statistics:
1. Normality: The data should be approximately normally distributed. This assumption matters
most for small samples; with larger samples (typically n > 30), the Central Limit Theorem
makes these tests fairly robust to moderate departures from normality.
2. Homogeneity of Variance: The variances among the groups being compared should be
approximately equal. This is crucial for tests like ANOVA, where the assumption is that the
variability within each group is similar.
3. Independence: The observations should be independent of one another. This means that the
data collected from one participant or observation should not influence or be related to the
data collected from another.
4. Linearity: For certain parametric tests, such as linear regression, there should be a linear
relationship between the independent and dependent variables.
5. Interval or Ratio Scale: The data should be measured on an interval or ratio scale, which
means that the data should have meaningful numerical values that allow for the calculation
of means and standard deviations.
When these assumptions are met, parametric statistical methods can provide more powerful and
reliable results. However, if the assumptions are violated, non-parametric methods may be more
appropriate, as they do not rely on these specific distributional assumptions.
Non-parametric statistics refers to a set of statistical methods that do not assume a specific
distribution for the data. These methods are particularly useful when the assumptions required
for parametric tests (such as normality and homogeneity of variance) are not met. Non-
parametric tests are often used for ordinal data or when the sample size is small.
Assumptions of Non-Parametric Statistics:
1. No Assumption of Normality: Non-parametric tests do not require the data to follow a
normal distribution. This makes them suitable for data that is skewed or has outliers.
2. Ordinal or Nominal Data: Non-parametric methods can be used with ordinal data (data that
can be ranked but not measured) or nominal data (categorical data without a specific order).
This flexibility allows for a wider range of data types.
3. Independence: Similar to parametric tests, non-parametric tests assume that the
observations are independent of one another. Each observation should not influence or be
related to another.
4. Homogeneity of Variance (in some cases): While many non-parametric tests do not require
equal variances, some tests may still assume that the groups being compared have similar
variances, although this is less stringent than in parametric tests.
5. Random Sampling: The data should be collected through a process of random sampling to
ensure that the sample is representative of the population.
Non-parametric statistics are valuable tools in research, especially when dealing with non-
normal data or when the sample size is too small to reliably assess the distribution. Common
non-parametric tests include the Mann-Whitney U test, Wilcoxon signed-rank test, Kruskal-
Wallis test, and Chi-square test.
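Because the choice between the two families hinges on the assumptions above, the decision can be scripted directly. A hedged sketch in Python (scipy assumed; the rule is deliberately simplified and illustrative, not a complete assumption audit):

```python
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Choose a parametric or non-parametric test from a normality check.
    Illustrative decision rule only; in practice the choice also weighs
    sample size, measurement scale, and the research design."""
    normal_a = stats.shapiro(a).pvalue > alpha   # Shapiro-Wilk normality test
    normal_b = stats.shapiro(b).pvalue > alpha
    equal_var = stats.levene(a, b).pvalue > alpha  # homogeneity of variance

    if normal_a and normal_b:
        # Parametric assumptions plausible: independent-samples t-test
        stat, p = stats.ttest_ind(a, b, equal_var=equal_var)
        return "t-test", stat, p
    # Otherwise fall back to the non-parametric Mann-Whitney U test
    stat, p = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "Mann-Whitney U", stat, p

group1 = [12, 15, 14, 10, 13, 16, 11]   # hypothetical scores
group2 = [18, 21, 17, 22, 19, 25, 30]
print(compare_groups(group1, group2))
```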
Applications and Uses of Parametric Statistics
1. Experimental Research: Parametric statistics are commonly used in experimental
research where the data is expected to follow a normal distribution. For example, in
clinical trials, researchers often use t-tests or ANOVA to compare the effects of
different treatments on health outcomes.
2. Psychology and Social Sciences: In fields like psychology, parametric tests are
frequently employed to analyze data from surveys and experiments. For instance,
researchers might use regression analysis to understand the relationship between
variables such as stress and academic performance.
3. Quality Control: In manufacturing and quality control processes, parametric methods
can be used to analyze measurements and ensure that products meet specified
standards. Techniques like control charts often rely on normality assumptions.
4. Market Research: Businesses use parametric statistics to analyze consumer preferences
and behaviors. For example, they might use t-tests to compare the average
satisfaction ratings of two different products.
5. Finance: In finance, parametric methods are used to model and predict stock prices,
returns, and risks. Techniques such as linear regression are commonly applied to
understand relationships between financial variables.
Applications and Uses of Non-Parametric Statistics
1. Ordinal Data Analysis: Non-parametric statistics are particularly useful for analyzing
ordinal data, such as survey responses on a Likert scale (e.g., strongly agree to
strongly disagree). Tests like the Wilcoxon signed-rank test can be used to compare
medians.
2. Small Sample Sizes: When dealing with small sample sizes where normality cannot be
assumed, non-parametric tests provide a robust alternative. For example, the Mann-
Whitney U test can be used to compare two independent groups without assuming
normality.
3. Non-Normal Distributions: Non-parametric methods are ideal for data that is skewed
or has outliers. For instance, the Kruskal-Wallis test can be used to compare more
than two groups when the data does not meet the assumptions of ANOVA.
4. Categorical Data: Non-parametric statistics are often used for analyzing categorical
data. The Chi-square test is a common non-parametric method used to assess the
association between two categorical variables.
5. Environmental Studies: In environmental research, non-parametric methods are
frequently applied to analyze data that may not meet the assumptions of parametric
tests, such as species abundance data or pollution levels.
In summary, both parametric and non-parametric statistics have their specific
applications and are chosen based on the nature of the data, the research question, and
the underlying assumptions. Parametric statistics are powerful when assumptions are
met, while non-parametric statistics provide flexibility and robustness in various
situations.
Differences between parametric and non-parametric statistics:
1. Assumptions
Parametric Statistics: These methods assume that the data follows a specific
distribution, typically a normal distribution. They also assume homogeneity of
variance and that the data is measured on an interval or ratio scale.
Non-Parametric Statistics: These methods do not assume a specific distribution for the
data. They can be used with ordinal or nominal data and are more flexible regarding
the underlying data distribution.
2. Data Type
Parametric Statistics: Suitable for interval or ratio data, where the data can be
meaningfully averaged and has a consistent scale.
Non-Parametric Statistics: Can be used for ordinal data (ranked data) and nominal
data (categorical data), making them applicable in a wider range of situations.
3. Statistical Power
Parametric Statistics: Generally more powerful than non-parametric tests when the
assumptions are met. This means they are more likely to detect a true effect when one
exists.
Non-Parametric Statistics: Typically less powerful than parametric tests, especially
when the sample size is small. However, they are robust in situations where
parametric assumptions are violated (see the simulation sketch at the end of this section).
4. Examples of Tests
Parametric Statistics: Common tests include t-tests (for comparing means), ANOVA
(for comparing means across multiple groups), and linear regression.
Non-Parametric Statistics: Common tests include the Mann-Whitney U test (for
comparing two independent groups), Wilcoxon signed-rank test (for comparing two
related groups), Kruskal-Wallis test (for comparing more than two groups), and Chi-
square test (for categorical data).
5. Interpretation of Results
Parametric Statistics: Results are often interpreted in terms of population parameters
(e.g., means, variances) and can provide estimates of effect sizes.
Non-Parametric Statistics: Results are typically interpreted in terms of ranks or
medians rather than means, and they may not provide estimates of population
parameters.
6. Use Cases
Parametric Statistics: Preferred when the data meets the necessary assumptions, such
as in controlled experiments with normally distributed data.
Non-Parametric Statistics: Preferred when dealing with non-normal data, small
sample sizes, or when the data is ordinal or nominal.
In summary, the choice between parametric and non-parametric statistics depends on the
nature of the data, the research question, and whether the assumptions of parametric
tests are met.
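The power difference noted in point 3 can be seen in a small simulation: generate many pairs of samples with a true mean difference and count how often each test rejects H0. A rough sketch, assuming Python with numpy and scipy and normally distributed data (conditions that favor the t-test); the effect size and trial count are arbitrary illustrative choices:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, trials, n = 0.05, 2000, 20
t_rejections = mw_rejections = 0

for _ in range(trials):
    # Two normal samples with a true mean difference of 0.8 SD
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(0.8, 1.0, n)
    if stats.ttest_ind(a, b).pvalue <= alpha:
        t_rejections += 1
    if stats.mannwhitneyu(a, b, alternative="two-sided").pvalue <= alpha:
        mw_rejections += 1

# With normal data the t-test typically rejects slightly more often,
# illustrating its greater power when its assumptions hold.
print(f"t-test power ~ {t_rejections / trials:.2f}")
print(f"Mann-Whitney power ~ {mw_rejections / trials:.2f}")
```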
Explain the steps in computation of one sample median test with the help of a
suitable example.
The one-sample median test is a non-parametric statistical test used to determine whether
the median of a single sample differs from a specified value (a hypothesized population median).
Here are the steps involved in computing the one-sample median test, along with a suitable
example.
Steps in Computation of One-Sample Median Test


Step 1: State the Hypotheses
Null Hypothesis (H0): The median of the population is equal to a specified value (e.g., m0).
Alternative Hypothesis (H1): The median of the population is not equal to the specified value.
Example:
Suppose we want to test whether the median height of a group of students differs from 170 cm.
H0: The median height is 170 cm.
H1: The median height is not 170 cm.
Step 2: Collect the Sample Data
Gather the sample data that you will analyze.
Example:
Let's say we have the following heights (in cm) of 10 students:
165, 172, 168, 175, 160, 180, 170, 169, 173, 177
Step 3: Determine the Sample Size
Count the number of observations in the sample.
Sample Size (n): In this case, n = 10.
Step 4: Calculate the Median of the Sample
Sort the data and find the median.
Sorted heights: 160, 165, 168, 169, 170, 172, 173, 175, 177, 180
Median: Since n is even, the median is the average of the two middle (5th and 6th) values:
Median = (170 + 172) / 2 = 171
Step 5: Count the Number of Observations Above and Below the Specified Median
Count how many observations are above and below the specified median (170 cm).
Above 170 cm: 172, 175, 180, 173, 177 (5 observations)
Below 170 cm: 165, 168, 169, 160 (4 observations)
Equal to 170 cm: 170 (1 observation)
Step 6: Calculate the Test Statistic
The test statistic is based on the counts of observations above and below the specified median.
Let n1 be the number of observations above the median (5).
Let n2 be the number of observations below the median (4).
The test statistic T is the smaller of n1 and n2: T = min(n1, n2) = min(5, 4) = 4
Step 7: Determine the Critical Value
Using a binomial distribution, determine the critical value for the test statistic based on the sample
size and significance level (e.g., α = 0.05). Observations equal to the specified median are typically
dropped first, leaving n = 9 usable observations here. For a two-tailed test, the critical values can be
found in binomial tables or calculated using statistical software.
Step 8: Make a Decision
Compare the test statistic to the critical value.
If T is less than or equal to the critical value, reject the null hypothesis.
If T is greater than the critical value, do not reject the null hypothesis.
Example Conclusion:
Assuming the critical value for α = 0.05 is 3 (for a two-tailed test), since T = 4 is greater than 3,
we do not reject the null hypothesis. Therefore, we conclude that there is not enough evidence to say
that the median height of the students differs from 170 cm.
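Equivalently, the Step 8 decision can be reached without a critical-value table by computing an exact binomial p-value (the sign-test formulation of the median test). A minimal Python sketch of the same example, assuming scipy 1.7+ for binomtest:

```python
from scipy.stats import binomtest

heights = [165, 172, 168, 175, 160, 180, 170, 169, 173, 177]
m0 = 170  # hypothesized population median

above = sum(h > m0 for h in heights)   # 5 observations above m0
below = sum(h < m0 for h in heights)   # 4 observations below m0
n = above + below                      # ties with m0 are dropped -> n = 9

# Under H0, each remaining observation falls above m0 with probability 0.5,
# so the count above follows Binomial(n, 0.5); two-sided exact test
result = binomtest(above, n, p=0.5, alternative="two-sided")
print(f"above={above}, below={below}, p={result.pvalue:.3f}")
# Here p = 1.0, far above 0.05, so we fail to reject H0,
# matching the notes' table-based conclusion.
```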
Summary
The one-sample median test is a straightforward method for testing whether a sample median differs
from a specified value. By following these steps, researchers can effectively analyze their data and draw
conclusions based on the median.

Explain the fundamental concepts in determining the significance of the difference between means (Dec 2022)
Determining the significance of the difference between means is a key aspect of statistical
analysis, particularly in hypothesis testing. Here are the fundamental concepts involved:
1. Hypothesis Testing: This involves formulating two competing hypotheses:
Null Hypothesis (H0): Assumes no difference between the means of the groups
being compared.
Alternative Hypothesis (H1): Assumes there is a difference between the means.
2. Sample Means: Calculate the means of the samples from the groups being compared.
This provides the basis for comparison.
3. Standard Deviation and Variance: These measures indicate the variability within each
group. A smaller standard deviation suggests that the data points are closer to the
mean, while a larger standard deviation indicates more spread out data.
4. Standard Error (SE): This is the standard deviation of the sampling distribution of the
sample mean. It is calculated as the standard deviation divided by the square root of
the sample size (n). SE helps in understanding how much the sample mean is expected
to vary from the true population mean.
5. Test Statistic: This is a standardized value that is calculated from sample data during
a hypothesis test. Common test statistics for comparing means include:
t-test: Used when the sample sizes are small and/or the population standard
deviation is unknown.
z-test: Used when the sample sizes are large and the population standard
deviation is known.
6. Degrees of Freedom: This concept is important in determining the appropriate
distribution to use for the test statistic. For a t-test, degrees of freedom are typically
calculated as the total number of observations minus the number of groups.
7. P-value: This is the probability of observing the test results under the null hypothesis.
A low p-value (typically less than 0.05) indicates strong evidence against the null
hypothesis, leading to its rejection.
8. Confidence Intervals: These provide a range of values within which the true difference
between means is likely to fall. If a confidence interval does not include zero, it
suggests a statistically significant difference.
9. Effect Size: This measures the magnitude of the difference between means, providing
context beyond just statistical significance. Common measures include Cohen's d and
eta-squared.
10. Assumptions: Each statistical test has underlying assumptions (e.g., normality,
homogeneity of variance) that must be checked to ensure the validity of the results.
By understanding these concepts, researchers can effectively determine whether the
observed differences between means are statistically significant and meaningful in the
context of their study.
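A short Python sketch (scipy and numpy assumed, with made-up scores) that computes the main quantities above: the test statistic and p-value for the difference between two means, the degrees of freedom, and Cohen's d as an effect size:

```python
import numpy as np
from scipy import stats

a = np.array([23, 25, 28, 22, 26, 24, 27])  # hypothetical group A scores
b = np.array([30, 29, 33, 31, 28, 32, 34])  # hypothetical group B scores

# Test statistic and p-value for the difference between the two means
t_stat, p_value = stats.ttest_ind(a, b)
df = len(a) + len(b) - 2  # degrees of freedom for a two-sample t-test

# Effect size: Cohen's d using the pooled standard deviation
pooled_sd = np.sqrt(((len(a) - 1) * a.var(ddof=1) +
                     (len(b) - 1) * b.var(ddof=1)) / df)
cohens_d = (a.mean() - b.mean()) / pooled_sd

print(f"t({df}) = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```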

Applications of the chi-square test (Dec 2023)
The Chi-square test is a statistical method used to determine whether there is a
significant association between categorical variables. Here are some common
applications of the Chi-square test:
1. Goodness of Fit Test: This application assesses whether the observed frequencies of a
single categorical variable match the expected frequencies based on a specific
distribution. For example, it can be used to determine if a die is fair by comparing
the observed outcomes of rolls to the expected outcomes.
2. Test of Independence: This application evaluates whether two categorical variables
are independent of each other. For instance, it can be used to analyze survey data to
see if there is an association between gender and preference for a particular
product.
3. Contingency Tables: Chi-square tests are often applied to contingency tables, which
display the frequency distribution of variables. Researchers can use these tables to
analyze relationships between two or more categorical variables.
4. Market Research: Businesses use Chi-square tests to analyze consumer preferences
and behaviors. For example, they might examine whether customer satisfaction
ratings differ by demographic groups.
5. Epidemiology: In public health studies, Chi-square tests can be used to investigate
the relationship between exposure to a risk factor and the occurrence of a disease.
For example, researchers might study whether smoking status is associated with
lung cancer incidence.
6. Genetics: Chi-square tests are frequently used in genetics to compare observed
genetic ratios (e.g., Mendelian inheritance patterns) with expected ratios to
determine if the observed data fit the expected genetic model.
7. Social Sciences: Researchers in sociology and psychology often use Chi-square tests
to analyze survey data, exploring relationships between variables such as education
level and voting behavior.
8. Quality Control: In manufacturing, Chi-square tests can be used to assess whether
the distribution of defects in products is consistent with expected quality standards.
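As an illustration of the test of independence (point 2), a minimal Python sketch using scipy on a hypothetical gender-by-preference contingency table (the counts are invented for illustration):

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table: gender (rows) x product preference (columns)
observed = [[30, 20],   # e.g., male:   prefers A, prefers B
            [15, 35]]   # e.g., female: prefers A, prefers B

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.4f}")
print("Expected counts under independence:\n", expected)
# p <= 0.05 would suggest gender and preference are associated
```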

Applications of one-way ANOVA (parametric test)


One-way ANOVA (Analysis of Variance) is a parametric statistical test used to compare
the means of three or more independent groups to determine if at least one group mean
is significantly different from the others. Here are some common applications of one-way
ANOVA:
1. Comparing Treatment Effects: In clinical trials, researchers often use one-way ANOVA
to compare the effects of different treatments or interventions on a particular
outcome. For example, a study might compare the effectiveness of three different
medications on reducing blood pressure.
2. Agricultural Studies: One-way ANOVA can be used to evaluate the effects of different
fertilizers on crop yield. Researchers can compare the mean yields from plots treated
with different types of fertilizers to see if there are significant differences.
3. Education Research: In educational settings, one-way ANOVA can be applied to
compare the performance of students from different teaching methods or curricula.
For instance, researchers might assess whether students taught using traditional
methods perform differently than those taught using interactive methods.
4. Marketing Research: Businesses can use one-way ANOVA to analyze consumer
preferences across different product variations. For example, a company might test
three different packaging designs to see which one leads to higher customer
satisfaction ratings.
5. Psychology Experiments: In psychological research, one-way ANOVA can be used to
compare the effects of different stimuli or conditions on participants' responses. For
instance, researchers might examine how different levels of noise affect concentration
levels in a task.
6. Quality Control: In manufacturing, one-way ANOVA can be used to compare the
quality of products produced under different conditions or processes. For example, a
factory might compare the mean defect rates of products produced on different
machines.
7. Sports Science: Researchers can use one-way ANOVA to compare the performance of
athletes under different training regimens. For example, they might assess whether
athletes following three different training programs show significant differences in
performance metrics.
8. Environmental Studies: One-way ANOVA can be applied to compare the effects of
different environmental conditions on species diversity in ecological studies. For
instance, researchers might compare species richness in areas with varying levels of
pollution.
In all these applications, one-way ANOVA helps researchers determine whether the
differences in group means are statistically significant, providing insights into the effects
of different factors or treatments.
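A minimal Python sketch (scipy assumed) of the fertilizer comparison in point 2, with made-up yield data for three groups:

```python
from scipy.stats import f_oneway

# Hypothetical crop yields under three fertilizers (illustrative data)
fertilizer_a = [4.8, 5.1, 5.0, 4.9, 5.2]
fertilizer_b = [5.6, 5.8, 5.5, 5.9, 5.7]
fertilizer_c = [5.0, 4.9, 5.2, 5.1, 5.0]

f_stat, p_value = f_oneway(fertilizer_a, fertilizer_b, fertilizer_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value says at least one group mean differs; a post-hoc test
# (e.g., Tukey's HSD) would be needed to identify which groups differ.
```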
Mann-Whitney U-test
The Mann-Whitney U-test, also known as the Wilcoxon rank-sum test, is a non-
parametric statistical test used to compare the differences between two independent
groups when the data does not necessarily follow a normal distribution. Here are some
key points and applications of the Mann-Whitney U-test:
Key Points:
1. Non-parametric: The Mann-Whitney U-test does not assume that the data is normally
distributed, making it suitable for ordinal data or continuous data that do not meet the
assumptions of parametric tests.
2. Ranks: Instead of using the raw data, the test ranks all the observations from both
groups together. The U statistic is then calculated based on these ranks.
3. Hypotheses:
Null Hypothesis (H0): Assumes that the distributions of the two groups are equal.
Alternative Hypothesis (H1): Assumes that the distributions of the two groups are
not equal.
4. U Statistic: The test calculates the U statistic for each group, which reflects the number
of times a score from one group precedes a score from the other group in the ranked
list.
5. Interpretation: A significant U statistic indicates that there is a difference between the
two groups, but it does not specify the direction of the difference.
Applications:
1. Medical Research: The Mann-Whitney U-test is often used to compare the effectiveness
of two different treatments when the outcome measures are not normally distributed.
For example, comparing pain relief scores between two different pain management
therapies.
2. Psychology: Researchers may use the test to compare responses from two different
groups of participants, such as comparing stress levels between two different age
groups or genders.
3. Market Research: Businesses can apply the Mann-Whitney U-test to analyze customer
satisfaction ratings between two different products or services, especially when the
ratings are ordinal (e.g., on a scale of 1 to 5).
4. Education: In educational research, the test can be used to compare test scores
between two different teaching methods or curricula when the scores do not meet the
assumptions of normality.
5. Ecology: The Mann-Whitney U-test can be used to compare species diversity or
abundance between two different habitats or environmental conditions.
6. Quality Control: In manufacturing, the test can be applied to compare the quality
ratings of products from two different production lines when the ratings are not
normally distributed.
Overall, the Mann-Whitney U-test is a versatile tool for comparing two independent
groups, especially when the data does not meet the assumptions required for parametric
tests.
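A minimal Python sketch (scipy assumed) comparing hypothetical ordinal satisfaction ratings for two products, as in the market research application above:

```python
from scipy.stats import mannwhitneyu

# Hypothetical 1-5 satisfaction ratings for two products (ordinal data)
product_a = [3, 4, 2, 5, 4, 3, 4, 2]
product_b = [4, 5, 5, 3, 5, 4, 5, 4]

u_stat, p_value = mannwhitneyu(product_a, product_b, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
# A significant result indicates the two rating distributions differ,
# without assuming normality of the underlying scores.
```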
