Statistics is the study of the collection, analysis, interpretation, presentation, and organization of
data. In other words, it is a mathematical discipline for collecting and summarizing data, and it is
generally regarded as a branch of applied mathematics. Two basic ideas run through statistics:
uncertainty and variation. Statistical analysis is how the uncertainty and variation in different fields
are quantified, and uncertainty is measured using probability, which plays an important role in
statistics.
What is Statistics?
Statistics is simply defined as the study and manipulation of data. As discussed in the introduction,
statistics deals with the analysis and computation of numerical data. Let us see more definitions of
statistics given by different authors here.
Basics of Statistics
The basics of statistics include measures of central tendency and measures of dispersion. The measures
of central tendency are the mean, median and mode; the measures of dispersion include the variance and
the standard deviation.
The mean is the average of the observations. The median is the central value when the observations are
arranged in order. The mode is the most frequent observation in a data set.
Variance is a measure of how spread out a collection of data is. Standard deviation is a measure of the
dispersion of the data from the mean. The square of the standard deviation is equal to the variance.
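As a rough illustration of the measures described above, the following sketch uses Python's standard
`statistics` module. The data set `observations` is made up for this example and is treated as a complete
population, which is why the population versions `pvariance` and `pstdev` are used; the names are
assumptions, not part of the original text.

```python
import statistics

# Hypothetical data set, invented for illustration only.
observations = [2, 4, 4, 4, 5, 5, 7, 9]

# Measures of central tendency.
mean_value = statistics.mean(observations)      # average of the observations
median_value = statistics.median(observations)  # central value once sorted
mode_value = statistics.mode(observations)      # most frequent observation

# Measures of dispersion, treating the data as a whole population.
variance_value = statistics.pvariance(observations)
std_dev_value = statistics.pstdev(observations)

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("variance:", variance_value)
print("standard deviation:", std_dev_value)
```

For this data set the standard deviation squared matches the variance, as stated above.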
Mathematical Statistics
Mathematical statistics is the application of mathematics to statistics, which was initially conceived as
the science of the state: the collection and analysis of facts about a country, such as its economy,
military, population, and so forth.
Mathematical techniques used for this include mathematical analysis, linear algebra, stochastic
analysis, differential equations and measure-theoretic probability theory.
Types of Statistics
Basically, there are two types of statistics.
Descriptive Statistics
Inferential Statistics
In descriptive statistics, the data or collection of data is described in summary. In inferential
statistics, that summary is used to explain the data and draw conclusions from it. Both types are used
on a large scale.
Descriptive Statistics
The data is summarised and explained in descriptive statistics. The summary is produced from a sample
of the population using measures such as the mean and standard deviation. Descriptive statistics is
a way of organising, representing, and explaining a set of data using charts, graphs, and summary
measures. Histograms, pie charts, bar charts, and scatter plots are common ways to summarise data and
present it in tables or graphs. Descriptive statistics are just that: descriptive. They do not need to be
generalised beyond the data collected.
Inferential Statistics
We use inferential statistics to interpret the meaning of descriptive statistics: once the data has been
collected, evaluated, and summarised, inferential statistics conveys what it means. Inferential
statistics uses probability to determine whether patterns found in a study sample can be extrapolated
to the wider population from which the sample was drawn. Inferential statistics are used to test
hypotheses and study correlations between variables, and they can also be used to predict population
sizes. In short, inferential statistics are used to derive conclusions and inferences from samples.
Statistics Formulas
The formulas that are commonly used in statistical analysis are given in the table below.
Methods in Statistics
The methods involve collecting, summarizing, analyzing, and interpreting variable numerical data. Some
of the methods are listed below.
Data collection
Data summarization
Statistical analysis
Types of Data
2. Continuous data: not fixed, but takes values within a range. It can be measured.
Representation of Data
There are different ways to represent data, such as graphs, charts or tables. The general
representations of statistical data are:
Bar Graph
Pie Chart
Line Graph
Pictograph
Histogram
Frequency Distribution
In Mathematics, statistics are used to describe the central tendencies of the grouped and ungrouped
data. The three measures of central tendency are:
Mean
Median
Mode
All three measures of central tendency are used to find the central value of the set of data.
Q1
Statistics is a branch that deals with the study of the collection, analysis, interpretation, organisation,
and presentation of data. Mathematically, statistics is defined as a set of equations used to analyse
data.
Q2
The two different types of statistics used for analyzing data are:
Descriptive Statistics: It summarizes the data from the sample using indexes.
Inferential Statistics: It draws conclusions from data that are subject to random variation.
Q3
Q4
Statistics is a part of applied mathematics that uses probability theory to generalize from collected
sample data. It helps to characterize the likelihood that generalizations from the data are accurate.
This is known as statistical inference.
Q5
Statistics teaches us to use a limited sample to reach accurate conclusions about a larger population.
Tables, charts, and graphs play a crucial part in presenting the data used to reach these
conclusions.
Q6
Statistics helps you use proper methods to collect data, apply the right analyses, and present the
results effectively. Statistics is an important process behind how we make discoveries in science,
make decisions based on data, and make predictions.
Here are five basic statistical analysis concepts and when you might use them:
1. Regression
Regression is a method for modelling the relationship between variables when one of them is independent
and the other, or the others, depend on that first variable. There are different methods for regression
depending on how many variables you're analyzing. Once you calculate the regression for a set of data,
you can predict future results based on values for the independent variable. Regression focuses on
trends, so it's important to combine a regression analysis with interrogation and analysis of any
outlying data points that are far from what you expect. The formula for a simple linear regression is:
Y = a + mx + e
Where:
Y = The dependent variable
a = The intercept, which is the value of Y when x is zero
m = The slope of the regression line
x = The independent variable
e = The error term, used when forecasting with the regression formula
Example: The Better Bakery is trying to forecast how many doughnuts they'll sell based on how many
they display. The independent variable is the number of doughnuts on display, and the number sold is
the dependent variable. They don't sell any doughnuts when none are displayed, so their 'a' value is
zero. On Thursday, they had 48 doughnuts on display and sold 36. On Friday, they had 60 doughnuts on
display and sold 45. Applying the formula to both days allows them to understand the slope of their
doughnut regression:
Thursday: 36 = 0 + (m x 48)
Friday: 45 = 0 + (m x 60)
In both these equations, m = 0.75, so the bakery can use that in the equation to project how many
doughnuts they may sell in the future.
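The bakery example can be sketched in code as follows. This is a minimal illustration that assumes an
ordinary least-squares fit, which the text does not name explicitly; the helper `fit_line` and the
forecast for 80 displayed doughnuts are hypothetical additions. The point (0, 0) reflects the statement
that no doughnuts are sold when none are displayed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of Y = a + m*x. Returns (a, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    a = mean_y - m * mean_x
    return a, m


# Doughnuts displayed (independent) and sold (dependent), from the example.
displayed = [0, 48, 60]
sold = [0, 36, 45]

a, m = fit_line(displayed, sold)

# Hypothetical forecast: expected sales if 80 doughnuts are displayed.
forecast = a + m * 80

print("intercept a:", round(a, 6))
print("slope m:", round(m, 6))
print("forecast for 80 displayed:", round(forecast, 6))
```

Because all three points lie on one line, the error term is zero here; with real sales data the fitted
line would only approximate the observations.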
2. Mean
The mean of a data set, also called the average, can be useful for understanding where the centre of
the data lies. It works best when trying to get a general idea of the size of a single transaction or
event. Combining the mean with other information, like the data set's mode and range, can help you
interpret the mean more completely. The formula for calculating the mean is:
(Sum of all data points in set) / (Number of data points in set) = Mean of the data set
Example: September Sales and Distributing made five sales within a day, of $3,000, $5,500, $2,000,
$4,000 and $6,500. To calculate their mean sale for that day, they add the sales together and divide by
five:
($3,000 + $5,500 + $2,000 + $4,000 + $6,500) / 5 = $21,000 / 5 = $4,200
3. Standard deviation
Standard deviation measures how widely data are spread around the mean. A data set with a large
standard deviation has data points spread over a wide area, while a data set with a small standard
deviation has most of its data clustered together. Standard deviation is most useful when the data has
a reasonable spread and you don't have too many outliers. There are two formulas to calculate standard
deviation, depending on whether you have just a sample of data or the complete data set for the whole
population. Here's the formula for a sample:
s = √( Σ (xi - x̄)² / (N - 1) )
Where:
s = Sample standard deviation
xi = Each observed value
x̄ = Mean of the observations
N = Number of observations
Example: Mouse Greenhouse is measuring how much their sales of fertilizer bags change over the 12
weeks of summer. They calculate the standard deviation from their weekly sales all summer by first
calculating the mean of their weekly sales totals.
Then, for each week's number of sales, xi in the formula, they subtract the mean from that week's total
and square the result. They take the sum of all these squares, and divide it by the number of
observations minus one, in this case, 11. They take the square root of that and find their standard
deviation, which for this sample is six. This means that for most weeks of summer, they can expect the
number of bags of fertilizer they sell to be within six of their mean weekly sales.
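The steps in this example can be sketched in code. The twelve weekly sales figures below are invented
for illustration, since the text does not list the greenhouse's actual numbers, so the result will not
be exactly six. The helper `sample_std_dev` is a hypothetical name; the last lines cross-check it
against `statistics.stdev` from Python's standard library, which also divides by N - 1.

```python
import math
import statistics


def sample_std_dev(values):
    """Sample standard deviation: divides by the number of observations minus one."""
    n = len(values)
    mean = sum(values) / n
    # Subtract the mean from each observation and square the result.
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Sum the squares, divide by N - 1, then take the square root.
    return math.sqrt(sum(squared_deviations) / (n - 1))


# Hypothetical weekly fertilizer bag sales for 12 weeks of summer.
weekly_sales = [40, 52, 47, 55, 43, 49, 58, 41, 50, 46, 53, 54]

std_dev = sample_std_dev(weekly_sales)
library_std_dev = statistics.stdev(weekly_sales)

print("mean weekly sales:", sum(weekly_sales) / len(weekly_sales))
print("sample standard deviation:", round(std_dev, 2))
print("matches statistics.stdev:", math.isclose(std_dev, library_std_dev))
```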
4. Sample size determination
Sample size determination is the process of choosing how much data to analyze out of a larger set.
A correctly chosen sample can give you nearly the same results as analyzing the whole population, but
it's more efficient since it involves less processing. Here are the factors to consider when calculating
your sample size:
Total population size: This is the maximum size of all possible data. If you've completed your research,
your total population size is the number of data points or responses you've gotten, while if you're
designing a study, the total population size is the maximum number of possible data points.
Margin of error: The margin of error determines how much error you're willing to accept in your study.
Confidence level: This is the percentage likelihood that your results, such as a calculated mean, fall
within the margin of error of the true mean of the entire data set. After you determine the necessary
confidence level, usually 90% or above, use a table to find the z-score that corresponds with your
chosen confidence level.
Standard deviation: This is the amount of variation you expect in your data.
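The text does not state which formula combines these factors, so the sketch below is only one possible
reading. It assumes Cochran's formula for estimating a proportion, with an optional finite-population
correction. The function name `sample_size`, the `Z_SCORES` table, and the parameter `proportion`
are illustrative assumptions; `proportion * (1 - proportion)` stands in for the expected variability,
and 0.5 is the most conservative choice.

```python
import math

# Standard two-sided z-scores for common confidence levels.
Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}


def sample_size(confidence_level, margin_of_error, proportion=0.5, population_size=None):
    """Estimate a sample size with Cochran's formula (an assumed choice of formula)."""
    z = Z_SCORES[confidence_level]
    # Sample size for a very large (effectively infinite) population.
    n = (z ** 2) * proportion * (1 - proportion) / (margin_of_error ** 2)
    # Optional correction when the total population size is known and finite.
    if population_size is not None:
        n = n / (1 + (n - 1) / population_size)
    # Round up, since a fraction of a respondent cannot be sampled.
    return math.ceil(n)


large_population = sample_size(0.95, 0.05)
small_population = sample_size(0.95, 0.05, population_size=1000)

print("95% confidence, 5% margin, large population:", large_population)
print("95% confidence, 5% margin, population of 1,000:", small_population)
```

A smaller margin of error or a higher confidence level increases the required sample size, while a small
total population reduces it.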
5. Hypothesis testing
Hypothesis testing is a process you can use to determine whether data supports a specific hypothesis.
You can perform hypothesis testing by first determining what specific result you expect to be true.
This expected result becomes your first hypothesis, or H1. The opposite result is the null hypothesis,
or H0. It's important to note that hypothesis testing formulas depend on what you're analyzing and
testing. For example, the hypotheses may be specific formulas relating two variables to each other,
so that some numerical results would mean that H1 is true, while others would mean that H0 is true.
H0: A ≠ B
H1: A = B
Where:
Example: Smooth Storage Solutions believes their customers use their largest rental trucks for moving
more than 100 miles, so this is their first hypothesis:
H1: Average miles > 100 per trip
H0: Average miles < 100 per trip or average miles = 100 per trip
They check odometers on their trucks before and after each truck rental and find that all trips were at
least 200 miles, so they have reason to believe their first hypothesis is true.
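The text does not name a specific test, so the following is only a sketch of how the truck example
could be checked more formally. It assumes a one-sided, one-sample t-test of the mean against 100
miles. The trip distances are invented for illustration, and the critical value 1.895 is the standard
t-table entry for 7 degrees of freedom at a 5% one-sided significance level. The names
`t_statistic` and `trip_miles` are hypothetical.

```python
import math
import statistics


def t_statistic(sample, hypothesized_mean):
    """One-sample t statistic: (sample mean - hypothesized mean) / standard error."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical odometer differences (miles per trip) for eight rentals.
trip_miles = [210, 250, 205, 300, 220, 240, 215, 260]

# H0: average miles <= 100 per trip; H1: average miles > 100 per trip.
t = t_statistic(trip_miles, 100)

# One-sided critical value for 8 - 1 = 7 degrees of freedom at the 5% level.
CRITICAL_VALUE = 1.895
reject_h0 = t > CRITICAL_VALUE

print("t statistic:", round(t, 2))
print("reject H0 in favour of H1:", reject_h0)
```

A t statistic well above the critical value means the sample would be very unlikely if H0 were true,
which supports H1.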