What Is A Probability Distribution

Uploaded by

Srishti
Copyright
© All Rights Reserved

What Is a Probability Distribution?

• A probability distribution is a statistical function that describes all the possible values and
likelihoods that a random variable can take within a given range.

• A probability distribution depicts the expected outcomes of possible values for a given
data-generating process.

• Probability distributions come in many shapes with different characteristics, as defined by the
mean, standard deviation, skewness, and kurtosis.

Types of probability distribution

• Poisson distribution

• Normal distribution

• Binomial distribution

Binomial Distribution

• Describes discrete data, i.e. situations where each trial of an experiment can have only two results.

• For example: pass or fail, yes or no, heads or tails.

Conditions for applying binomial distribution

• Each observation falls into one of two categories, called success and failure.

• There is a fixed number of observations.

• The observations are all independent.

• The probability of success (p) is the same for each observation; it does not have to equal 0.5.
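
When the conditions above hold, the probability of exactly k successes in n observations is C(n, k) p^k (1 - p)^(n - k). The following is a minimal sketch of that formula; the function name `binomial_pmf` and the coin-toss numbers are illustrative assumptions, not part of the original notes.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 tosses of a fair coin (p = 0.5).
probs = [binomial_pmf(k, 10, 0.5) for k in range(11)]
print(round(binomial_pmf(5, 10, 0.5), 4))  # 0.2461
print(round(sum(probs), 6))                # 1.0, the probabilities cover all outcomes
```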

Poisson distribution

• Describes discrete data, i.e. situations where the random variable counts events and can take
non-negative integer values (0, 1, 2, ...).

• For example: the number of patients arriving at the physician’s office, or the number of cars
arriving at the toll booth, in a given period.
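
For a Poisson random variable with an average rate λ per interval, the probability of observing exactly k events is λ^k e^(-λ) / k!. The sketch below assumes a hypothetical rate of 3 patients per hour; the rate and the function name `poisson_pmf` are illustrative only.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the average rate is lam."""
    return lam**k * exp(-lam) / factorial(k)

# Hypothetical rate: on average 3 patients arrive per hour.
p_none = poisson_pmf(0, 3.0)
p_at_most_2 = sum(poisson_pmf(k, 3.0) for k in range(3))
print(round(p_none, 4))       # 0.0498
print(round(p_at_most_2, 4))  # 0.4232
```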

Normal Distribution

• Normal distribution, also known as the Gaussian distribution, is a probability distribution that is
symmetric about the mean, showing that data near the mean are more frequent in occurrence
than data far from the mean. In graph form, normal distribution will appear as a bell curve.

Properties of a Normal Distribution

• The normal curve is symmetrical about the mean μ.

• The mean is at the middle and divides the area into halves.

• The total area under the curve is equal to 1.

• It is completely determined by its mean μ and standard deviation σ (or variance σ²).

• The mean, median and mode are equal
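
These properties can be checked numerically with Python's standard-library `statistics.NormalDist`. This is a small sketch only; the mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # hypothetical mean and standard deviation

# Symmetry: the area 10 units below the mean matches the area 10 units above it.
left_tail = dist.cdf(100 - 10)
right_tail = 1 - dist.cdf(100 + 10)

# The mean divides the area under the curve into two halves.
area_below_mean = dist.cdf(dist.mean)

# Share of values within one standard deviation of the mean (about 68%).
within_one_sd = dist.cdf(115) - dist.cdf(85)

print(dist.mean == dist.median == dist.mode)  # True
print(area_below_mean)                        # 0.5
```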

What Is Hypothesis Testing?

• A statistical hypothesis is an assumption about a population which may or may not be true.

• Hypothesis testing is a set of formal procedures used by statisticians to either reject or fail to
reject statistical hypotheses.

Characteristics of Hypothesis

• Explains what it claims to explain.

• Testable within a reasonable time.

• Consistent with most known facts.

• Limited in scope and specific.

• States the relationship between variables.

• Capable of being tested.

• Clear and precise.

Types of hypothesis

• Null hypothesis (H0)

• The null hypothesis states that a population parameter (such as the mean, the standard
deviation, and so on) is equal to a hypothesized value. The null hypothesis is often an initial
claim that is based on previous analyses or specialized knowledge.

• Alternative Hypothesis (H1)

• The alternative hypothesis states that a population parameter is smaller, greater, or different
than the hypothesized value in the null hypothesis. The alternative hypothesis is what you might
believe to be true or hope to prove true.

One- and two-sided hypotheses

• One-sided

• Use a one-sided alternative hypothesis (also known as a directional hypothesis) to determine
whether the population parameter differs from the hypothesized value in a specific direction.
You can specify the direction to be either greater than or less than the hypothesized value. A
one-sided test has greater power than a two-sided test, but it cannot detect whether the
population parameter differs in the opposite direction.

Example of one-sided hypothesis

• A researcher has exam results for a sample of students who took a training course for a national
exam. The researcher wants to know if trained students score above the national average of
850. A one-sided alternative hypothesis (also known as a directional hypothesis) can be used
because the researcher is specifically hypothesizing that scores for trained students are greater
than the national average. (H0: μ = 850 vs. H1: μ > 850)

• Two-sided

• Use a two-sided alternative hypothesis (also known as a nondirectional hypothesis) to
determine whether the population parameter is either greater than or less than the
hypothesized value. A two-sided test can detect when the population parameter differs in either
direction, but has less power than a one-sided test.

Example of two-sided hypothesis

• A researcher has results for a sample of students who took a national exam at a high school. The
researcher wants to know if the scores at that school differ from the national average of 850. A
two-sided alternative hypothesis (also known as a nondirectional hypothesis) is appropriate
because the researcher is interested in determining whether the scores are either less than or
greater than the national average. (H0: μ = 850 vs. H1: μ ≠ 850)
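
The difference between the one-sided and two-sided alternatives can be sketched with a z-test against the national average of 850. The sample mean (868), population standard deviation (100) and sample size (100) below are hypothetical numbers, and a z-test is assumed to be appropriate; none of these come from the examples above.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Standardized distance of the sample mean from the hypothesized mean."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Hypothetical data: mean score 868, known sigma 100, 100 students.
z = z_statistic(sample_mean=868, mu0=850, sigma=100, n=100)

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
p_one_sided = 1 - std_normal.cdf(z)              # H1: mu > 850
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))   # H1: mu != 850

print(round(p_one_sided, 3))  # 0.036
print(round(p_two_sided, 3))  # 0.072
```

With these assumed numbers, the one-sided test would reject H0 at the 5% level while the two-sided test would not, which illustrates the power difference described above.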

Steps of Hypothesis Testing

• State the hypotheses - This step involves stating both the null and alternative hypotheses. The
hypotheses should be stated in such a way that they are mutually exclusive: if one is true, then
the other must be false.

• Formulate an analysis plan - The analysis plan describes how to use the sample data to
evaluate the null hypothesis. The evaluation process centers on a single test statistic.

• Analyze sample data - Find the value of the test statistic (using properties like mean score,
proportion, t statistic, z-score, etc.) stated in the analysis plan.

• Interpret results - Apply the decisions stated in the analysis plan. If the value of the test statistic
is very unlikely based on the null hypothesis, then reject the null hypothesis.
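
The four steps can be sketched as a small routine. This is an illustration under stated assumptions, not a definitive procedure: it assumes a one-sided z-test with a known population standard deviation, and the scores, the value sigma = 45 and the function name `one_sided_z_test` are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def one_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mu = mu0 against H1: mu > mu0, assuming sigma is known."""
    # Step 2 (analysis plan): use the z statistic and reject H0 when it
    # exceeds the critical value for the chosen significance level.
    critical_value = NormalDist().inv_cdf(1 - alpha)
    # Step 3 (analyze sample data): compute the test statistic.
    z = (fmean(sample) - mu0) / (sigma / sqrt(len(sample)))
    # Step 4 (interpret results): apply the decision rule from the plan.
    return {"z": z, "critical_value": critical_value, "reject_h0": z > critical_value}

# Step 1 (state the hypotheses): H0: mu = 850 vs. H1: mu > 850.
scores = [880, 910, 845, 870, 905, 860, 890, 900, 875]  # hypothetical scores
result = one_sided_z_test(scores, mu0=850, sigma=45)    # sigma = 45 is assumed
print(result["reject_h0"])  # True
```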

Purpose of Hypothesis testing

• It provides clarity to the research problem and research objectives

• It describes, explains or predicts the expected results or outcome of the research.

• It indicates the type of research design.

• It directs the research study process.

• It identifies the population of the research study that is to be investigated or examined.

• It facilitates data collection, data analysis and data interpretation

Large and small sample theory

• Large sample theory: If the sample size n is at least 30 (n ≥ 30), it is known as a large sample.
For large samples, the sampling distribution of a statistic is approximately normal (Z test). The
study of the sampling distributions of statistics for large samples is known as large sample theory.

• Small sample theory: If the sample size n is less than 30 (n < 30), it is known as a small sample.
For small samples, the sampling distributions used are the t, F and χ² distributions. The study
of sampling distributions for small samples is known as small sample theory.

Type I and Type II errors

• A Type I error occurs when the sample data appear to show a treatment effect when, in fact,
there is none.

• In this case the researcher will reject the null hypothesis and falsely conclude that the
treatment has an effect.

• Type I errors are caused by unusual, unrepresentative samples. Just by chance the researcher
selects an extreme sample with the result that the sample falls in the critical region even though
the treatment has no effect.

• A Type II error occurs when the sample does not appear to have been affected by the treatment
when, in fact, the treatment does have an effect.

• In this case, the researcher will fail to reject the null hypothesis and falsely conclude that the
treatment does not have an effect.

• Type II errors are commonly the result of a very small treatment effect. Although the treatment
does have an effect, it is not large enough to show up in the research study.
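
Both kinds of error can be illustrated with a small simulation. The sketch below assumes a one-sided z-test of H0: μ = 850 with a known standard deviation; the sample size, the "small effect" of 10 points, the number of trials and the function name `rejection_rate` are hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def rejection_rate(true_mean, mu0=850.0, sigma=100.0, n=25,
                   alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated samples for which H0: mu = mu0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    critical_value = NormalDist().inv_cdf(1 - alpha)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if z > critical_value:
            rejections += 1
    return rejections / trials

# H0 is true (no effect): every rejection is a Type I error.
type_1_rate = rejection_rate(true_mean=850.0)

# H0 is false, but the effect is small: every failure to reject is a Type II error.
type_2_rate = 1 - rejection_rate(true_mean=860.0)

print(type_1_rate, type_2_rate)
```

Under these assumptions the Type I rate should land close to the chosen alpha of 0.05, while the Type II rate stays high because the assumed effect is small relative to the sampling variability.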

What Are Sampling Methods or Sampling Techniques?
In Statistics, a sampling method or sampling technique is the process of studying a
population by gathering information from a subset of it (a sample) and analyzing that data. It is
the basis of data collection when the population is too large to study in full.

Types of Sampling Method


In Statistics, there are different sampling techniques available to get relevant results from the
population. The two different types of sampling methods are:

• Probability Sampling
• Non-probability Sampling

Probability Sampling Types


Probability Sampling methods are further classified into different types, such as simple random
sampling, systematic sampling, stratified sampling, and clustered sampling.

Simple Random Sampling


In the simple random sampling technique, every item in the population has an equal chance of
being selected in the sample. Since the item selection depends entirely on chance, this method
is known as the “Method of Chance Selection”. When the sample size is large and the items are chosen
randomly, it is also known as “Representative Sampling”.
Example:
Suppose we want to select a simple random sample of 200 students from a school of 500 students. We
can assign a number from 1 to 500 to every student in the school database and use a random number
generator to select a sample of 200 numbers.
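
The example above can be sketched with Python's standard `random` module. The seed value is an assumption added only to make the sketch repeatable.

```python
import random

rng = random.Random(42)       # fixed seed, used only for repeatability
student_ids = range(1, 501)   # every student is numbered from 1 to 500

# Draw 200 distinct IDs; each student has the same chance of being chosen.
sample = rng.sample(student_ids, 200)
print(len(sample), len(set(sample)))  # 200 200
```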

Systematic Sampling
In the systematic sampling method, the items are selected from the target population by choosing
a random starting point and then selecting the other items at a fixed sampling interval. The interval
is calculated by dividing the total population size by the desired sample size.
Example:
Suppose the names of 300 students of a school are sorted in reverse alphabetical order. To
select a systematic sample with an interval of 15, we randomly select a starting number, say 5.
From number 5 onwards, we select every 15th person from the sorted list (5, 20, 35, and so on).
We end up with a sample of 20 students.
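
As a rough sketch of this example: starting at position 5 and taking every 15th name from a list of 300 gives positions 5, 20, 35 and so on. The function name `systematic_sample` is hypothetical.

```python
def systematic_sample(population_size: int, interval: int, start: int) -> list[int]:
    """Positions selected by taking every `interval`-th item from `start`."""
    return list(range(start, population_size + 1, interval))

# Interval = population size / desired sample size, e.g. 300 / 20 = 15.
interval = 300 // 20
selected = systematic_sample(population_size=300, interval=interval, start=5)

print(len(selected))                # 20
print(selected[:3], selected[-1])   # [5, 20, 35] 290
```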

Stratified Sampling
In the stratified sampling method, the total population is divided into smaller groups (strata) to
complete the sampling process. The groups are formed based on a few characteristics of the population.
After separating the population into smaller groups, the statisticians randomly select the sample
from each group.
For example, there are three bags (A, B and C), each with different balls. Bag A has 50 balls, bag B
has 100 balls, and bag C has 200 balls. We have to choose a sample of balls from each bag
proportionally, for instance 10% from each: 5 balls from bag A, 10 balls from bag B and 20 balls from bag C.
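
The proportional allocation in the bag example can be sketched as follows. The total sample size of 35, the seed and the function name `proportional_allocation` are assumptions made for illustration.

```python
import random

def proportional_allocation(strata_sizes: dict[str, int], total_sample: int) -> dict[str, int]:
    """Split total_sample across strata in proportion to each stratum's size."""
    population = sum(strata_sizes.values())
    return {name: size * total_sample // population
            for name, size in strata_sizes.items()}

bags = {"A": 50, "B": 100, "C": 200}
allocation = proportional_allocation(bags, total_sample=35)
print(allocation)  # {'A': 5, 'B': 10, 'C': 20}

# Within each bag, the balls are then picked by simple random sampling.
rng = random.Random(0)  # fixed seed, used only for repeatability
sample = {name: rng.sample(range(1, bags[name] + 1), k)
          for name, k in allocation.items()}
```

Integer division is used here because the sizes divide evenly; with other numbers, the rounding rule would need to be chosen deliberately.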

Clustered Sampling
In the clustered sampling method, clusters or groups of people are formed from the population set.
The groups have similar characteristics, and each group has an equal chance of being a part
of the sample. This method applies simple random sampling to the clusters of the population.
Example:
An educational institution has ten branches across the country with almost the same number of students. If
we want to collect some data regarding facilities and other things, we can’t travel to every unit to
collect the required data. Hence, we can use random sampling to select three or four branches as
clusters.
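
A minimal sketch of the branch example: whole branches are drawn at random, and everyone in a chosen branch is surveyed. The branch names, the 40 students per branch and the seed are hypothetical.

```python
import random

# Hypothetical data: ten branches, each with 40 numbered students.
branches = {f"Branch {i}": [f"B{i}-S{j}" for j in range(1, 41)]
            for i in range(1, 11)}

rng = random.Random(7)  # fixed seed, used only for repeatability

# Randomly select three branches as clusters.
chosen_clusters = rng.sample(list(branches), 3)

# Every student in the chosen branches is included in the sample.
surveyed = [student for name in chosen_clusters for student in branches[name]]
print(len(chosen_clusters), len(surveyed))  # 3 120
```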
All these four methods can be understood in a better manner with the help of the figure given below.
The figure contains various examples of how samples will be taken from the population using
different techniques.

What is Non-Probability Sampling?


The non-probability sampling method is a technique in which the researcher selects the sample
based on subjective judgment rather than random selection. In this method, not all the members
of the population have a chance to participate in the study.

Non-Probability Sampling Types


Non-probability sampling methods are further classified into different types, such as convenience
sampling, consecutive sampling, quota sampling, judgmental sampling, and snowball sampling. Here, let
us discuss all these types of non-probability sampling in detail.

Convenience Sampling
In the convenience sampling method, the samples are selected from the population directly because
they are conveniently available to the researcher. The samples are easy to select, but the
researcher does not choose a sample that represents the entire population.
Example:
In researching customer support services in a particular region, we ask a few customers to
complete a survey on the products after the purchase. This is a convenient way to collect data. Still,
as we only surveyed customers who bought the same product, the sample is not
representative of all the customers in that area.

Consecutive Sampling
Consecutive sampling is similar to convenience sampling with a slight variation. The researcher
picks a single person or a group of people for sampling, studies them for a period of time,
analyzes the result, and then moves on to another group if needed.

Quota Sampling
In the quota sampling method, the researcher forms a sample of individuals chosen to
represent the population based on specific traits or qualities. The researcher chooses sample
subsets that yield a useful collection of data that generalizes to the entire population.

Purposive or Judgmental Sampling


In purposive sampling, the samples are selected based only on the researcher’s knowledge. As that
knowledge is instrumental in creating the samples, there is a chance of obtaining highly
accurate answers with a minimal margin of error. It is also known as judgmental sampling or
authoritative sampling.

Snowball Sampling
Snowball sampling is also known as a chain-referral sampling technique. In this method, the
samples have traits that are difficult to find. So, each identified member of a population is asked to
find the other sampling units. Those sampling units also belong to the same targeted population.
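
The chain-referral idea can be sketched as a breadth-first walk over a referral network. The network, the participant labels and the function name `snowball_sample` are hypothetical and serve only to illustrate the process.

```python
from collections import deque

def snowball_sample(referrals: dict[str, list[str]], seeds: list[str],
                    max_size: int) -> list[str]:
    """Recruit participants by following referrals from the initial seeds."""
    sampled: list[str] = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        # Each recruited person points the researcher to further members.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# Hypothetical referral network: who each participant can refer.
referrals = {"P1": ["P2", "P3"], "P2": ["P4"], "P3": ["P4", "P5"],
             "P4": [], "P5": ["P6"]}
print(snowball_sample(referrals, seeds=["P1"], max_size=5))
# ['P1', 'P2', 'P3', 'P4', 'P5']
```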

Probability sampling vs Non-probability Sampling Methods


The table below shows a few differences between probability sampling methods and non-probability
sampling methods.

| Probability Sampling Methods | Non-probability Sampling Methods |
|---|---|
| Probability sampling is a sampling technique in which samples taken from a larger population are chosen based on probability theory. | Non-probability sampling is a technique in which the researcher chooses samples based on subjective judgment rather than random selection. |
| These are also known as random sampling methods. | These are also called non-random sampling methods. |
| These are used for research which is conclusive. | These are used for research which is exploratory. |
| These involve a long time to get the data. | These are easy ways to collect the data quickly. |
| There is an underlying hypothesis in probability sampling before the study starts. Also, the objective of this method is to validate the defined hypothesis. | The hypothesis is derived later by conducting the research study in the case of non-probability sampling. |
