
Statistics Reviewer

The document outlines key concepts in statistics, including the distinction between population and sample, as well as descriptive and inferential statistics. It explains various sampling methods, highlighting the importance of random sampling for accurate representation of populations. Additionally, it contrasts parametric and non-parametric tests, emphasizing their reliance on statistical distributions.

Statistics

A. Distinguish population and sample

B. Descriptive stat and inferential stat

C. Parametric and non parametric analysis

D. Random and non random sampling

A. Distinguish population and sample

A population is the entire group that you want to draw conclusions about.

A sample is the specific group that you will collect data from. The sample is usually smaller than
the population; a study that collects data from every member of the population is a census.

In research, a population doesn’t always refer to people. It can mean a group containing elements of
anything you want to study, such as objects, events, organizations, countries, species, organisms, etc.

Population | Sample
Advertisements for IT jobs in the Netherlands | The top 50 search results for advertisements for IT jobs in the Netherlands on May 1, 2020
Songs from the Eurovision Song Contest | Winning songs from the Eurovision Song Contest that were performed in English
Undergraduate students in the Netherlands | 300 undergraduate students from three Dutch universities who volunteer for your psychology research study
All countries of the world | Countries with published data available on birth rates and GDP since 2000

What do you mean by population in statistics?


Population refers to all of the individuals that the study wants to describe. In a study where a sample
of college students describe their eating habits, the population of interest may be all college
students. Usually, the sample is some of the individuals who satisfy certain criteria, while the
population is all such individuals.

Why is a sample used instead of the population?


It is not always possible to observe or study every member of a population. Consider the problem of
polling every registered voter, or even more difficult, measuring every tuna in the oceans. For this
reason, researchers take samples that are representative of the population as a whole and measure
those instead.

What is meant by sample in statistics?
In statistics, the sample is only those individuals that give data. So, it is the people who actually
responded to a poll, the people who participated in the study, or the objects which we actually
measured. If no data is gathered from an individual, that individual is not in the sample.

How do you find the sample in statistics?


To find the sample in statistics, ask "which individuals was the data derived from?" In a survey, the
sample is those who responded (and not just those who were called). In an observational study, the
sample is the people or objects that are being observed.

What is the difference between the population and the sample?

The population is all individuals that the study concerns itself with. The sample is only those
individuals within the population from whom the data is taken. If the sample is the entire population,
then the study is called a census.

In some of the previous examples, it was not possible to study the entire population. This is often the
case in statistics, and for that reason, samples are used rather than populations. To understand what
a sample is in statistics, consider a local non-profit that wishes to describe all drivers in its city
but only measures 200 of them. These 200 drivers form the sample for this study. In general, the sample
in statistics consists of those individuals we get data from.

While there are many ways to take a sample, some sampling methods are better than others. Even when
working with samples, statisticians want to describe an entire population, so it is important to
ensure that the sample is representative of the population. There are two main ways to do this:
first, by taking probability (randomized) samples, and second, by repeating the study with many
independent samples. Both help to reduce the potential for sampling bias.

[Figure: A population vs. a sample]

How to find a sample?
There are various methods for choosing samples. Here are a few example sampling methods:

• Simple random sampling - similar to drawing names out of a hat, individuals within a
population are chosen at random and studied.
• Stratified random sampling - a population is split into two or more groups, and a random
sample is taken from each group.
• Cluster sampling - a population is split into many groups. Some of these groups are chosen
at random, and every member of each chosen group forms the sample.
• Convenience sampling - the easiest sampling method, convenience sampling describes
using just the individuals who are close at hand, such as asking the first ten people one
meets to take a survey.

Of these, simple random sampling is the most preferred option for obtaining a representative
sample, and convenience sampling is the method most likely to contain bias.
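
The four sampling methods listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 30 students in three classrooms, the function names, and the fixed random seed are all hypothetical and are not part of the original notes.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n distinct individuals, each equally likely to be chosen."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample of the same size from every group (stratum)."""
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, n_per_stratum))
    return chosen

def cluster_sample(clusters, n_clusters, rng):
    """Pick whole groups at random and keep every member of each chosen group."""
    picked = rng.sample(sorted(clusters), n_clusters)
    return [member for name in picked for member in clusters[name]]

def convenience_sample(population, n):
    """Take whoever is closest at hand: here, simply the first n individuals."""
    return population[:n]

# Hypothetical population: 30 students spread across three classrooms.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
classrooms = {
    "room_a": [f"a{i}" for i in range(10)],
    "room_b": [f"b{i}" for i in range(10)],
    "room_c": [f"c{i}" for i in range(10)],
}
students = [s for room in classrooms.values() for s in room]

srs = simple_random_sample(students, 6, rng)
strat = stratified_sample(classrooms, 2, rng)
clus = cluster_sample(classrooms, 1, rng)
conv = convenience_sample(students, 6)
```

Note that the convenience sample contains only students from the first classroom, which is the kind of bias the text warns about.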

B. Descriptive stat and inferential stat

Statistics is at the heart of data analytics. It is the branch of mathematics that helps us
spot trends and patterns in large volumes of numerical data. Statistical techniques can be
categorized as Descriptive Statistics and Inferential Statistics. This section explores
the differences between descriptive and inferential statistics and how they affect the field
of data analytics. Some of the measurement techniques are similar, but the objectives are
different.

What is Descriptive Statistics?

Descriptive Statistics describes the characteristics of a data set. It is a simple technique to
describe, show and summarize data in a meaningful way. You simply choose a group you’re
interested in, record data about the group, and then use summary statistics and graphs to describe
the group properties. There is no uncertainty involved because you’re just describing the people
or items that you actually measure. You’re not aiming to infer properties about a large data set.

Descriptive statistics involves taking a potentially sizeable number of data points in the sample
data and reducing them to certain meaningful summary values and graphs. The process allows
you to obtain insights and visualize the data rather than simply poring over sets of raw
numbers. With descriptive statistics, you can describe both an entire population and an individual
sample.

What is Inferential Statistics?

In Inferential Statistics, the focus is on making predictions about a large group of data based on a
representative sample of the population. A random sample is drawn from a population and used to
describe and make inferences about that population. This technique allows you to work with a
small sample rather than the whole population. Since inferential statistics make predictions
rather than stating facts, the results are often expressed as probabilities.

The accuracy of inferential statistics depends largely on the accuracy of the sample data and how
well it represents the larger population. This is best achieved by obtaining a random sample.
Results based on non-random samples are usually treated with caution, because they may not
generalize to the population. Random sampling, though not always straightforward, is extremely
important for carrying out inferential techniques.

Types of Descriptive Statistics

There are three major types of Descriptive Statistics.

1. Frequency Distribution

Frequency distribution is used to show how often a response is given for quantitative as well as
qualitative data. It shows the count, percent, or frequency of different outcomes occurring in a
given data set. Frequency distribution is usually represented in a table or graph. Bar charts,
histograms, pie charts, and line charts are commonly used to present frequency distribution. Each
entry in the graph or table is accompanied by how many times the value occurs in a specific
interval, range, or group.

These tables or graphs are a structured way to depict a summary of grouped data classified on the
basis of mutually exclusive classes and the frequency of occurrence in each respective class.
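
As a minimal sketch of a frequency distribution, the following Python snippet counts how often each response occurs and expresses it as a percentage. The survey responses are made-up illustrative data, not taken from the notes.

```python
from collections import Counter

# Hypothetical survey responses (qualitative data).
responses = ["yes", "no", "yes", "maybe", "yes", "no", "yes", "maybe"]

counts = Counter(responses)
total = len(responses)

# Build a frequency table: category, count, and percent of all responses.
frequency_table = [
    (category, count, 100 * count / total)
    for category, count in counts.most_common()
]

for category, count, percent in frequency_table:
    print(f"{category:<6} {count:>3} {percent:6.1f}%")
```

Each category appears exactly once in the table, which reflects the "mutually exclusive classes" requirement described above.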

2. Central Tendency

Central tendency is the descriptive summary of a data set using a single value that reflects
the center of the data distribution. It is used to show the average or most commonly indicated
response in a data set. Measures of central tendency, or measures of central location, include
the mean, median, and mode. The mean is the arithmetic average of the values in a data set, the
median is the middle score when the data set is arranged in increasing order, and the mode is
the most frequent value.
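
A small Python sketch of the three measures, using the standard `statistics` module on a made-up list of scores (the data are an assumption for illustration only):

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data set, already in increasing order

mean_score = statistics.mean(scores)      # sum of values divided by their count
median_score = statistics.median(scores)  # average of the two middle values (3 and 5)
mode_score = statistics.mode(scores)      # the value that occurs most often

print(mean_score, median_score, mode_score)
```

Here the three measures differ because the single large value (10) pulls the mean above the median.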
3. Variability or Dispersion

A measure of variability describes the spread of scores in a sample, most commonly through the
range, variance, and standard deviation. These measures denote the width of the distribution of
values in a data set and show how spread apart the data points are from the center.

The range shows the degree of dispersion as the difference between the highest and lowest
values within the data set. The variance refers to the degree of spread and is measured as an
average of the squared deviations from the mean. The standard deviation is the square root of the
variance and indicates how far, on average, the observed scores lie from the mean. These
descriptive statistics are useful when you want to show how spread out your data is around the mean.
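
The three measures of variability can be computed as follows. This is a sketch on made-up scores; it uses the sample variance (dividing by n - 1), which is one common convention and an assumption here, since the notes do not say which divisor is intended.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data set with mean 5

# Range: difference between the highest and lowest values.
data_range = max(scores) - min(scores)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = statistics.variance(scores)

# Standard deviation: square root of the variance, in the same units as the data.
sample_sd = statistics.stdev(scores)

print(data_range, sample_variance, round(sample_sd, 3))
```

For a full population rather than a sample, `statistics.pvariance` and `statistics.pstdev` divide by n instead.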

Descriptive Statistics is also used to determine measures of position, which describe how a
score ranks in relation to other scores. These statistics are used to compare scores against a
normalized scale, for example by determining percentile ranks and quartile ranks.
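
As an illustrative sketch of measures of position, the snippet below computes a percentile rank and the quartile cut points for made-up exam scores. The `percentile_rank` helper is hypothetical and uses the "percent of scores at or below" definition, which is one of several conventions.

```python
import statistics

exam_scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical data

def percentile_rank(data, score):
    """Percent of values in data that are less than or equal to score."""
    at_or_below = sum(1 for value in data if value <= score)
    return 100 * at_or_below / len(data)

rank_of_80 = percentile_rank(exam_scores, 80)

# Three cut points that split the sorted data into four equal-sized groups.
quartiles = statistics.quantiles(exam_scores, n=4)

print(rank_of_80, quartiles)
```

The middle cut point returned by `statistics.quantiles` is the median of the data.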

Types of Inferential Statistics

Inferential Statistics helps to draw conclusions and make predictions based on a data
set. It is done using several techniques, methods, and types of calculations. Some of
the most important types of inferential statistics calculations are:

1. Regression Analysis

Regression models show the relationship between a set of independent variables and a
dependent variable. This statistical method lets you predict the value of the dependent
variable based on different values of the independent variables. Hypothesis tests are
incorporated to determine whether the relationships observed in the sample data actually
exist in the population.
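
A minimal sketch of simple linear regression with one independent variable, fitted by ordinary least squares. The study-hours data and the function name `fit_line` are assumptions made for illustration; the notes do not specify a particular regression method.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: hours studied (independent) and exam marks (dependent).
hours = [1, 2, 3, 4, 5]
marks = [52, 54, 59, 61, 64]

slope, intercept = fit_line(hours, marks)

# Predict the dependent variable for a new value of the independent variable.
predicted_mark = intercept + slope * 6
print(slope, intercept, predicted_mark)
```

The fitted slope describes the sample only; whether the relationship holds in the population is what the accompanying hypothesis test addresses.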

2. Hypothesis Tests

Hypothesis testing is used to compare entire populations or assess relationships
between variables using samples. Hypotheses or predictions are tested using statistical
tests so as to draw valid inferences.
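
As one possible illustration, the sketch below computes the test statistic for a two-sample comparison of means (Welch's t statistic). The choice of test and the two groups of measurements are assumptions; the snippet stops at the statistic and does not compute a p-value, which would require the t distribution.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Difference in sample means divided by its estimated standard error."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error

# Hypothetical measurements from two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_statistic = welch_t(group_a, group_b)
print(t_statistic)
```

A large absolute value of the statistic suggests the difference between the sample means is unlikely to be due to sampling variation alone.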
3. Confidence Intervals

The main goal of inferential statistics is to estimate population parameters, which are
mostly unknown or unknowable values. A confidence interval uses the variability in
a statistic to produce an interval estimate for a parameter. Confidence intervals take
uncertainty and sampling error into account to create a range of values within which the
actual population value is estimated to fall.

Each confidence interval is associated with a confidence level, expressed as a percentage,
that indicates how often intervals constructed in this way would contain the true parameter
if you repeated the study many times.
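
A confidence interval for a mean can be sketched as below. This assumes a normal approximation (a z critical value), which is a simplification: for a small sample like this made-up one, a t critical value would normally be used instead. The function name and data are hypothetical.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, level=0.95):
    """Return (low, high) for the population mean using a normal approximation."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Critical value that leaves (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error

# Hypothetical sample of ten measurements.
sample = [48, 50, 52, 49, 51, 50, 53, 47, 50, 50]

low_95, high_95 = mean_confidence_interval(sample, 0.95)
low_99, high_99 = mean_confidence_interval(sample, 0.99)
print(low_95, high_95)
```

Raising the confidence level widens the interval, which is the trade-off between confidence and precision.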

C. Parametric and non-parametric analysis

The key difference between parametric and non-parametric tests is that parametric tests
assume the data follow a particular statistical distribution, whereas non-parametric tests
do not depend on a specific distribution. Non-parametric tests make fewer assumptions and
typically measure central tendency with the median. Examples of non-parametric tests
include the Mann-Whitney and Kruskal-Wallis tests.

A parametric test is a statistical test that assumes the parameters and the distribution of
the population are known. It uses the mean to measure central tendency. These tests are
widely used, which makes them a straightforward choice when their assumptions hold.

Definition of Parametric and Nonparametric Test


Parametric Test Definition

In statistics, a parametric test is a kind of hypothesis test that makes generalizations
about the mean of the original population. A common example is the t-test, which is
based on Student's t-statistic.

The t-test rests on the underlying assumption that the variable is normally distributed.
The population mean is known, or it is assumed to be known, and the population variance
is estimated from the sample. The variables of interest in the population are assumed to be
measured on an interval scale.

Non-Parametric Test Definition

The non-parametric test does not require the population distribution to be described by
specific parameters. It is also a kind of hypothesis test, but it is not based on
underlying distributional assumptions. In the case of the non-parametric test, the test is
based on differences in the median, so this kind of test is also called a distribution-free
test. The test variables are measured on the nominal or ordinal level. If the independent
variables are non-metric, the non-parametric test is usually performed.
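
To show why such a test is distribution-free, here is a sketch of the Mann-Whitney U statistic mentioned earlier. It depends only on how the values in one group are ordered relative to the other, not on their distribution. The function name and the data are illustrative assumptions, and the snippet computes the statistic only, not a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Count the pairs (a, b) where a beats b; a tie counts as half a win."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical ordinal scores from two independent groups.
group_a = [7, 8, 9]
group_b = [1, 2, 8]

u_a = mann_whitney_u(group_a, group_b)
u_b = mann_whitney_u(group_b, group_a)
print(u_a, u_b)
```

The two U values always add up to the number of pairs, so a value near zero or near the maximum indicates that one group tends to rank above the other.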

D. Random and non-random sampling

There are two main methods of sampling: random and non-random sampling. Random
sampling is a sampling technique in which each member of the population has an equal
probability of being chosen.

A sample that is chosen randomly is intended to be an unbiased representation of the total
population. If the sample chosen does not represent the population, the result is
sampling error.

Non-random sampling is a sampling technique where the sample selection is based on
factors other than just random chance. In other words, non-random sampling is prone to
bias.

Here, the sample will be selected based on the convenience, experience or judgment of
the researcher.
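
The contrast between the two approaches can be illustrated with a small simulation. Everything here is an assumption made for illustration: the population is simply the numbers 1 to 100, the convenience sample is the first ten of them, and the seed is fixed so the run is repeatable.

```python
import random
import statistics

# Hypothetical population: the values 1 to 100, listed in order.
population = list(range(1, 101))
population_mean = statistics.mean(population)

# Non-random (convenience) sample: the ten individuals closest at hand.
convenience = population[:10]
convenience_mean = statistics.mean(convenience)

# Random sampling, repeated many times with a fixed seed.
rng = random.Random(42)
random_means = [
    statistics.mean(rng.sample(population, 10)) for _ in range(2000)
]
average_random_mean = statistics.mean(random_means)

print(population_mean, convenience_mean, round(average_random_mean, 2))
```

Individual random samples still vary around the population mean, but they do not lean in one direction, whereas the convenience sample is far from it every time.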

Following are some of the points of difference between random sampling and
non-random sampling:

Basis | Random Sampling | Non-random Sampling
Definition | Random sampling is a sampling technique where each sample has an equal probability of getting selected | Non-random sampling is a sampling technique where the sample selected will be based on factors such as convenience, judgement and experience of the researcher and not on probability
Biases involved in sampling | Random sampling is unbiased in nature | Non-random sampling is biased in nature
Based on | Based on probability | Based on other factors such as convenience, judgement and experience of the researcher, but not on probability
Representation of population | Random sampling is representative of the entire population | Non-random sampling lacks the representation of the entire population
Chances of zero probability | Never | Zero probability can occur
Complexity | Random sampling is the simplest sampling technique | Non-random sampling is a somewhat complex sampling technique
