570 Asm 2

This document is an assignment submission sheet for a statistics unit. It includes the student's name and ID, the class details, date submitted, and a student declaration. The sheet also lists the grading criteria and provides space for assessor feedback and signatures. It aims to assess the student's understanding and application of key statistical concepts like descriptive statistics, inferential statistics, hypothesis testing, and correlation analysis. The student will use statistical methods for business applications such as quality management, inventory control, and capacity planning. Charts and tables must be used to communicate the findings.

Uploaded by

Pham Nguyen Anh (FGW HCM)


ASSIGNMENT 02 FRONT SHEET

Qualification BTEC Level 5 HND Diploma in Business

Unit number and title Unit 31: Statistics for management

Submission date 13/05/2022 Date received (1st Submission) 13/05/2022

Re-submission date Date received (2nd Submission)

Student Name Phạm Nguyên Anh Student ID GBS200417

Class No. GBS0908B Assessor Name VO MINH VINH

Student declaration
I certify that the assignment submission is entirely my own work and I fully understand the consequences of plagiarism.
I understand that making a false declaration is a form of malpractice.
Student Signature
Uyen

Grading grid

P3 P4 P5 M2 M3 M4 D1 D2 D3

Description of activity undertaken

Assessment & Grading criteria

How the activity meets the requirements of the criteria

Student Signature Date:

Assessor Signature Date:

Assessor name:

☐ Summative Feedback ☐ Resubmission Feedback

Grade: Assessor Signature: Date:

Internal Verifier’s Comments:

Signature & Date:

Table of Contents
Introduction
Differences between qualitative and quantitative raw data analysis
Quantitative data
Advantages
Disadvantages
Qualitative data
Advantages
Disadvantages
Descriptive statistics
Measures of Central Tendency: mean, mode, and median
Mean
Median
Mode
Measures of Variability: range, variance and standard deviation
Range
Variance
Standard Deviation
Inferential statistics
The differences between population and sample based on different sampling techniques and methods
One sample T-test: Estimation and Hypotheses testing
Estimation
Hypotheses testing
Two-tailed testing
One-tailed testing
Two sample t-test
Independent Sample T-test
Estimation
Hypothesis testing
Dependent Sample T-test
Estimation
Hypothesis testing
Measuring the association between two variables (from the dataset)
Correlation analysis
Regression
Simple linear regression
Multiple linear regression
Histogram of the Residual
Normal P-P Plot of Regression
Scatterplot
Apply a range of statistical methods used in business planning for quality, inventory and capacity management
Probability Distribution
Joint Probability
Conditional Probability
Applying a range of statistical methods used in business planning for quality, inventory, and capacity management
Measuring the variability in business processes or quality management
Measuring the probability by using probability distributions to business operations and processes
Normal Distribution
Poisson Distribution and Binomial Distribution
Comparison
Inference
Using appropriate charts and tables to communicate findings of given variables
Frequency table
Bar chart
Pie chart
Histogram
Scatter plot
The strengths and weaknesses of using different types of charts and tables
The most effective way of communicating the results of the analysis
Conclusion
References

Introduction
Statistical research in business enables managers to analyze past performance, predict future business conditions and lead organizations effectively. Statistics can describe markets, inform advertising, set prices and respond to changes in customer demand. This assignment presents key statistical definitions and specific applications of them.
Bayerische Motoren Werke AG, commonly known as Bavarian Motor Works, BMW or BMW AG, is a German automobile, motorcycle and engine manufacturer established in 1916. BMW has its headquarters in Munich, Bavaria. It also owns and produces Mini cars, and is the parent company of Rolls-Royce Motor Cars. BMW produces motorcycles under BMW Motorrad. In 2012, the BMW Group produced 1,845,186 cars and 117,109 motorcycles across all its brands. BMW is part of the “German Big 3” luxury automakers, alongside Audi and Mercedes-Benz, which are the three best-selling luxury automakers in the world.
However, technology is constantly changing. The company must therefore adopt the latest technologies that can be applied to its products and services in order to attract customers, raise brand awareness and compete in the market. The company also faces other difficulties, which are the reason for this report.
BMW intends to improve its data framework and decision-making process. As a research analyst, I am assigned to conduct a study by applying several statistical methods and to present the outputs to the BOD. This study matters because it gives the analyst an accurate and complete view of the numbers, data, and business context of the organization. For that reason, the data used in the study must itself be accurate and complete. The statistical results can then support the company's managers in making new decisions and strategies more effectively.
Secondary data is used in this study. Specifically, the data on used BMW cars was obtained from a published source via the website https://www.kaggle.com/. The dataset contains 9 variables, including price, transmission, mileage, fuel type, road tax, miles per gallon (mpg), and engine size, and more than 50 observations.
This report provides definitions related to statistics, three methods of analysis, and a critical evaluation of those methods. The statistical outputs for BMW Group are explained in detail in this assignment 2.

Differences between qualitative and quantitative raw data analysis
Quantitative data
Quantitative data refers to any data that can be quantified, that is, expressed in numbers. If it can be counted or measured, and given a numerical value, it is quantitative in nature. Think of it as a measuring stick. Quantitative variables can tell you “how many”, “how much”, or “how often”. (The fullstory education team, 2021)

Advantages
Can be tested and checked. Quantitative research requires careful experimental design and the ability for anyone to replicate both the test and the results. This makes the data you gather more reliable and less open to argument.
Straightforward analysis. When you collect quantitative data, the type of results will tell you which statistical tests are appropriate to use. As a result, interpreting your data and presenting those findings is straightforward and less open to error and subjectivity.
Prestige. Research that involves complex statistics and data analysis is considered valuable and impressive because many people do not understand the mathematics involved. Quantitative research is associated with technical advancements like computer modeling, stock selection, portfolio evaluation, and other data-based business decisions. The association of prestige and value with quantitative research can reflect well on your small business. (Devault, 2020)

Disadvantages
False focus on numbers. Quantitative research can be limited in its pursuit of concrete, statistical relationships, which can lead to researchers overlooking broader themes and relationships. By focusing solely on numbers, you risk missing surprising or big-picture information that can benefit your business.
Difficulty setting up a research model. When you conduct quantitative research, you need to carefully develop a hypothesis and set up a model for collecting and analyzing data. Any errors in your setup, bias on the part of the researcher, or mistakes in execution can invalidate all of your results. Even coming up with a hypothesis can be subjective, especially if you have a specific question that you already know you want to prove or disprove.
Can be misleading. Many people assume that because quantitative research is based on statistics it is more credible or scientific than observational, qualitative research. However, both kinds of research can be subjective and misleading. The opinions and biases of a researcher are just as likely to affect quantitative approaches to information gathering. In fact, the impact of this bias occurs earlier in the process of quantitative research than it does in qualitative research. (Devault, 2020)

Qualitative data
In contrast to qualitative data, which is descriptive rather than numerical, quantitative research is all about the numbers. It relies on the collection and interpretation of numeric data and focuses on measuring (using inferential statistics) and generalizing results.
In terms of digital experience data, it expresses everything in numbers (or discrete data), such as the number of users clicking a button, bounce rates, time on site, and more. (The fullstory education team, 2021)

Advantages
Qualitative research can capture changing attitudes within a target group, such as consumers of a product or service, or attitudes in the workplace.
Qualitative approaches to research are not bound by the limitations of quantitative methods. If responses do not fit the researcher’s expectation, that is equally useful qualitative data that adds context and may explain something that numbers alone cannot reveal.
Qualitative research provides a much more flexible approach. If useful insights are not being captured, researchers can quickly adapt questions, change the setting or any other variable to improve responses.
Qualitative data capture allows researchers to be far more speculative about which areas they choose to investigate and how to do so. It allows data capture to be prompted by a researcher’s instinct or ‘gut feel’ for where good information will be found. (Vaughan, 2021)

Disadvantages
The quality of qualitative data depends on the quality of the researchers. Researchers need industry experience and good interviewing skills to ask follow-up questions. They also need to build rapport with the participants to ensure the accuracy of the data. Consequently, if the researchers lack industry experience or interviewing skills, they will be unable to get good responses from the participants. (Rahman, 2021)
Gathering qualitative data is time-consuming. If each interview lasts between one and two hours, a maximum of three or four per day is often all that is possible (BPP Learning Media, 2013).
Some questions may be uncomfortable for participants to answer in a face-to-face interview, and so they may not give answers representative of their true feelings.

Figure. Distinctions between quantitative and qualitative data.

Descriptive statistics
Measures of Central Tendency: mean, mode, and median

A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data. The mean, median and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others.

Mean

The mean (or average) is the most popular and well-known measure of central tendency. It can be used with both discrete and continuous data, although it is most often used with continuous data (statistics.leard.com, 2021). The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. So, if we have n values in a data set and they have values x1, x2, …, xn, the sample mean, usually denoted by \(\bar{x}\) (pronounced "x bar"), is:

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} \]

This formula is usually written in a slightly different way using the Greek capital letter ∑, pronounced “sigma”, which means “sum of...”:

\[ \bar{x} = \frac{\sum x}{n} \]

Statistics
price (€)
N Valid 50
Missing 4793
Mean 17228.00

Median

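The value of the middlemost observation, obtained after arranging the data in ascending order, is called the median of the data. (statistics.leard.com, 2021)

\[ \text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}} \text{ observation} \quad \text{(if n is an odd number)} \]

\[ \text{Median} = \frac{\left(\frac{n}{2}\right)^{\text{th}} \text{ observation} + \left(\frac{n}{2}+1\right)^{\text{th}} \text{ observation}}{2} \quad \text{(if n is an even number)} \]
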
Statistics
engine_power
N Valid 50
Missing 4793
Median 132.50

Example 1: The following dataset has an odd number of observations that are organized in ascending
order.

In this case: n = 5

\[ \text{Median value} = \left(\frac{5+1}{2}\right)^{\text{th}} \text{ observation} = 3^{\text{rd}} \text{ observation} = 17 \]

Example 2: The following dataset has an even number of observations that are organized in ascending
order.

In this case: n = 6
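\[ \text{Median value} = \frac{\left(\frac{6}{2}\right)^{\text{th}} + \left(\frac{6}{2}+1\right)^{\text{th}}}{2} = \frac{3^{\text{rd}} + 4^{\text{th}}}{2} = \frac{9+17}{2} = 13 \]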
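
The odd and even rules above can be sketched in Python. This is only an illustration, not part of the original analysis: the two lists are hypothetical data chosen so that their middle values match the worked examples (3rd value 17; 3rd and 4th values 9 and 17), and `median_by_position` is a name introduced here.

```python
import statistics

def median_by_position(values):
    """Median via the position rule: the ((n+1)/2)th value for odd n,
    the mean of the (n/2)th and (n/2 + 1)th values for even n."""
    ordered = sorted(values)              # the rule assumes ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]  # 1-based position -> 0-based index
    lower = ordered[n // 2 - 1]           # (n/2)th value
    upper = ordered[n // 2]               # (n/2 + 1)th value
    return (lower + upper) / 2

# Hypothetical datasets (assumed values, not taken from the report).
odd_data = [4, 11, 17, 20, 23]            # n = 5, 3rd value is 17
even_data = [1, 5, 9, 17, 20, 23]         # n = 6, 3rd and 4th values are 9 and 17

print(median_by_position(odd_data))       # 17
print(median_by_position(even_data))      # 13.0
print(statistics.median(even_data))       # 13.0, the standard library agrees
```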

Mode

The mode is the most frequent score in the data set. There is no formula for the mode of a dataset, but it can be found by inspection. (statistics.leard.com, 2021)

Statistics
mileage
N Valid 50
Missing 4793
Mode 13131 (a)
(a) Multiple modes exist. The smallest value is shown.
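
When several values tie for the highest frequency, the table footnote reports only the smallest of them. A small sketch of that convention, assuming Python 3.8+ for `statistics.multimode`; the mileage values below are hypothetical and are not the BMW dataset.

```python
from statistics import multimode

# Hypothetical mileage values: two values tie for most frequent.
mileage = [13131, 52000, 13131, 87450, 52000, 140300]

modes = multimode(mileage)      # every value that shares the highest count
smallest_mode = min(modes)      # the convention described in the footnote

print(modes)                    # [13131, 52000]
print(smallest_mode)            # 13131
```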

Measures of Variability: range, variance and standard deviation

In statistics, variability, dispersion, and spread are synonyms that describe the width of the distribution. Just as there are several measures of central tendency, there are several measures of variability. Range, variance, and standard deviation are some of the most common measures of variability. (Frost, 2021)

Range
The range of a dataset is the difference between the largest and smallest values in that dataset. While the range is easy to understand, it is based on only the two most extreme values in the dataset, which makes it very susceptible to outliers. If one of those numbers is unusually high or low, it affects the entire range even if it is atypical. (Frost, 2021)
Statistics
price (€)
N Valid 50
Missing 4793
Range 67900
Minimum 1800
Maximum 69700
Example:

In this case:
Minimum value of the dataset = 1

Maximum value of the dataset = 23
⇒ Range of the dataset = Maximum value – Minimum value = 23 – 1 = 22

Variance
Variance is the average squared difference of the values from the mean. Unlike the range, the variance includes all values in the calculation by comparing each value to the mean. To calculate this statistic, you compute a set of squared differences between the data points and the mean, sum them, and then divide by the number of observations (or, for a sample, by one less than the number of observations). There are two formulas for the variance depending on whether you are calculating the variance for an entire population or using a sample to estimate the population variance. (Frost, 2021)

Sample variance:
\[ s^2 = \frac{\sum (X - M)^2}{N - 1} \]
s²: the sample variance
M: the sample mean
N − 1: in the denominator, corrects for the tendency of a sample to underestimate the population variance

Population variance:
\[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} \]
σ²: the population parameter for the variance
μ: the parameter for the population mean
N: the number of data points

Statistics
mileage
N Valid 50
Missing 4793
Mean 119943.82
Std. Deviation 71202.830
Variance 5069843006.844

Example: Consider the dataset 2, 9, 5, 3, 8 and calculate the variance.
In this case: N = 5
⇒ Use the formula for the sample variance:

\[ M = \frac{\sum X}{N} = \frac{2+9+5+3+8}{5} = 5.4 \]

\[ s^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{(2-5.4)^2 + (9-5.4)^2 + (5-5.4)^2 + (3-5.4)^2 + (8-5.4)^2}{5-1} = 9.3 \]
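
The same arithmetic can be checked with a few lines of Python. This is a minimal sketch written from the two variance formulas; the function names are introduced here for illustration only.

```python
def sum_of_squared_deviations(values):
    """Sum of (X - M)^2 over all data points, where M is the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

def sample_variance(values):
    # Divide by N - 1 when the data are a sample from a larger population.
    return sum_of_squared_deviations(values) / (len(values) - 1)

def population_variance(values):
    # Divide by N when the data are the entire population.
    return sum_of_squared_deviations(values) / len(values)

data = [2, 9, 5, 3, 8]
print(round(sample_variance(data), 2))      # 9.3
print(round(population_variance(data), 2))  # 7.44
```

Treating the same five values as a whole population gives a smaller figure (7.44), which shows the effect of the N − 1 denominator.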

Standard Deviation
The standard deviation is the standard or typical difference between each data point and the mean. When the values in a dataset are grouped closer together, you have a smaller standard deviation. On the other hand, when the values are spread out more, the standard deviation is larger because the standard distance is greater.
The standard deviation is just the square root of the variance. Recall that the variance is in squared units. Hence, the square root returns the value to the natural units. (Frost, 2021)

Sample standard deviation:
\[ s = \sqrt{s^2} \]
s²: the sample variance
s: the sample standard deviation

Population standard deviation:
\[ \sigma = \sqrt{\sigma^2} \]
σ²: the population variance
σ: the population standard deviation

Statistics
mileage
N Valid 50
Missing 4793
Mean 119943.82
Std. Deviation 71202.830
Variance 5069843006.844

Example: Consider the dataset 2, 9, 5, 3, 8 and calculate the standard deviation.
In this case: N = 5
⇒ Use the formula for the sample standard deviation:

\[ M = \frac{\sum X}{N} = \frac{2+9+5+3+8}{5} = 5.4 \]

\[ s^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{(2-5.4)^2 + (9-5.4)^2 + (5-5.4)^2 + (3-5.4)^2 + (8-5.4)^2}{5-1} = 9.3 \]

\[ s = \sqrt{s^2} = \sqrt{9.3} \approx 3.05 \]
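
For comparison, Python's standard `statistics` module provides both versions of the standard deviation directly. The sketch below uses the same five values and confirms that the sample standard deviation is the square root of the sample variance.

```python
import math
import statistics

data = [2, 9, 5, 3, 8]

s = statistics.stdev(data)        # sample standard deviation (divides by N - 1)
sigma = statistics.pstdev(data)   # population standard deviation (divides by N)

print(round(s, 2))       # 3.05
print(round(sigma, 2))   # 2.73
# The standard deviation is the square root of the variance.
print(math.isclose(s, math.sqrt(statistics.variance(data))))  # True
```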

Inferential statistics
Descriptive statistics describes data (for example, in a chart or graph) and inferential statistics allows you to make predictions (“inferences”) from that data. With inferential statistics, you take data from samples and make generalizations about a population. (statisticshowto.com, 2021)
There are two main areas of inferential statistics:
Estimating parameters. This means taking a statistic from your sample data (for example the sample mean) and using it to say something about a population parameter (i.e. the population mean). It is further divided into point estimates and interval estimates.
Hypothesis tests. This is where you can use sample data to answer research questions. Hypothesis testing in inferential statistics consists of one-tailed tests and two-tailed tests.

The differences between population and sample based on different sampling techniques and methods

In simple terms, population means the aggregate of all elements under study having one or more common characteristics. The population is not confined to people only; it may also include animals, events, objects, buildings, and so on, and it can be of any size.
By the term sample, we mean a part of the population chosen at random for participation in the study. The sample so selected should represent the population in all its characteristics, and it should be free from bias, so as to produce a miniature cross-section, because the sample observations are used to make generalizations about the population.
In spite of the above differences, it is also true that sample and population are related to each other: the sample is drawn from the population, so without a population, a sample cannot exist. Further, the primary objective of the sample is to make statistical inferences about the population, and these should be as accurate as possible. The greater the size of the sample, the higher the level of accuracy of generalization. (Surbhi, 2017)

One sample T-test: Estimation and Hypotheses testing

The one sample t-test is a statistical procedure used to determine whether a sample of observations could have been generated by a process with a specific mean (Statistics Solutions, 2021). In this part, estimation and hypothesis testing will be explained.

Estimation
In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample. There are two types of estimation in statistics: point estimates and interval estimates.
A point estimate is a value of a sample statistic that is used as a single estimate of a population parameter. No statements are made about the quality or precision of a point estimate. Statisticians prefer interval estimates because interval estimates are accompanied by a statement concerning the degree of confidence that the interval contains the population parameter being estimated. Interval estimates of population parameters are called confidence intervals. (Williams et al, 2020)

When σ is known

\[ \mu = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

µ: Population mean
\(\bar{x}\): Sample mean
α: Significance level
\(z_{\alpha/2}\): The z-value of the standard normal distribution
σ: Population standard deviation
n: Sample size

Example: Find the interval estimate of the population mean at the 10% significance level if a sample of 81 people has a mean of 65 kg and the population standard deviation is 20 kg.

In this case: α = 0.1; \(\bar{x}\) = 65; σ = 20; n = 81

Look up the z-value in the standard normal distribution table: \( z_{\alpha/2} = z_{0.1/2} = z_{0.05} = 1.645 \)
Apply the formula:

\[ \mu = \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 65 \pm 1.645 \cdot \frac{20}{\sqrt{81}} = 65 \pm 3.66 = (61.34;\ 68.66) \]

Conclusion: With 90% confidence, the population mean lies between 61.34 kg and 68.66 kg.
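
The interval for the known-σ case can be reproduced with Python's standard library, which provides the standard normal quantile through `statistics.NormalDist`. This is a sketch under the assumptions of the example; `z_confidence_interval` is a name introduced here, and because the code does not round z to 1.645, the result can differ from a hand calculation in the second decimal place.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, alpha):
    """Interval estimate of the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
    margin = z * sigma / sqrt(n)              # z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(sample_mean=65, sigma=20, n=81, alpha=0.10)
print(round(low, 2), round(high, 2))          # 61.34 68.66
```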

When σ is unknown

\[ \mu = \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} \]

µ: Population mean
\(\bar{x}\): Sample mean
α: Significance level
\(t_{\alpha/2,\,n-1}\): The value of the Student (t) distribution with (n – 1) degrees of freedom
s: Sample standard deviation
n: Sample size

Example: Find the interval estimate of the population mean income with 95% confidence if a sample of 25 people has a mean of $1000 and the sample standard deviation is $30.
In this case: \(\bar{x}\) = 1000; s = 30; n = 25

95% confidence ⇒ α = 100% – 95% = 5% = 0.05

Look up the t-value in the Student (t) distribution table: \( t_{\alpha/2,\,n-1} = t_{0.05/2,\,25-1} = t_{0.025,\,24} = 2.064 \)

Apply the formula:

\[ \mu = \bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} = 1000 \pm 2.064 \cdot \frac{30}{\sqrt{25}} = 1000 \pm 12.384 = (987.616;\ 1012.384) \]

Conclusion: With 95% confidence, the population mean income lies between $987.616 and $1012.384.
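
A short Python sketch of the unknown-σ interval follows. Python's standard library has no Student t quantile function, so the critical value 2.064 is passed in as read from the t-table (α/2 = 0.025, 24 degrees of freedom); `t_confidence_interval` is a name introduced here for illustration.

```python
from math import sqrt

def t_confidence_interval(sample_mean, s, n, t_critical):
    """Interval estimate of the population mean when sigma is unknown.
    t_critical is the table value for alpha/2 with n - 1 degrees of freedom."""
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = t_confidence_interval(sample_mean=1000, s=30, n=25, t_critical=2.064)
print(round(low, 3), round(high, 3))   # 987.616 1012.384
```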

Example calculated in Excel: Calculate the population mean of female salaries in the data with 95% confidence.

After applying the given data to the formula, the conclusion is that the population mean of female salaries is between $36,088.53 and $38,331.33.

Hypotheses testing
Hypothesis testing is a form of statistical inference that uses data from a sample to draw conclusions about a population parameter or a population probability distribution. Statistical analysts test a hypothesis by measuring and examining a random sample of the population being analyzed. All analysts use a random population sample to test two different hypotheses: the null hypothesis and the alternative hypothesis.
First, a tentative assumption is made about the parameter or distribution. This assumption is called the null hypothesis and is denoted by Ho. An alternative hypothesis (denoted Ha), which is the opposite of what is stated in the null hypothesis, is then defined. The hypothesis-testing procedure involves using sample data to determine whether or not Ho can be rejected. If Ho is rejected, the statistical conclusion is that the alternative hypothesis Ha is true. (Williams et al, 2020)
The null hypothesis is usually a hypothesis of equality between population parameters; e.g., a null hypothesis may state that the population mean return is equal to zero. The alternative hypothesis is effectively the opposite of a null hypothesis (e.g., the population mean return is not equal to zero). Thus, they are mutually exclusive, and only one can be true. However, one of the two hypotheses will always be true. (Majaski, 2021)
The following table shows the different hypotheses in their corresponding pairs. In particular, Ho is always the one that contains the equal (=) sign.

Ho | Ha
Equal (=) | Not equal (≠)
Greater than or equal to (≥) | Less than (<)
Less than or equal to (≤) | Greater than (>)

In general, a hypothesis test about the value of a population mean µ must take one of the following three forms (where µo is the hypothesized value of the population mean):

One-tailed (Lower-tail) | One-tailed (Upper-tail) | Two-tailed
Ho: µ ≥ µo | Ho: µ ≤ µo | Ho: µ = µo
Ha: µ < µo | Ha: µ > µo | Ha: µ ≠ µo
There are five steps in the hypothesis-testing process:

• Step 1: State the null and alternative hypotheses.
• Step 2: Specify the level of significance.
• Step 3: Collect the sample data and compute the value of the test statistic.
• Step 4: Determine the critical value and the rejection rule using the level of significance.
• Step 5: Decide whether or not to reject Ho by comparing the value of the test statistic with
the rejection rule.

Ho          Ha          Reject Ho if

µ = µo      µ ≠ µo      $|t| \ge t_{\alpha/2,\,n-1}$
µ ≤ µo      µ > µo      $t \ge t_{\alpha,\,n-1}$
µ ≥ µo      µ < µo      $t \le -t_{\alpha,\,n-1}$

Test statistic (the same in all three cases):
$$t = \frac{(\bar{x} - \mu_o)\sqrt{n}}{s}$$

µo: the hypothesized value (a constant)
$\bar{x}$: the mean of the sample data
n: sample size
s: sample standard deviation
$t_{\alpha/2,\,n-1}$, $t_{\alpha,\,n-1}$: critical values from the Student's t distribution table with degrees of freedom DF = n – 1
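The rejection rules above can be sketched in a few lines of Python. This is only an illustration: the function names `t_statistic` and `reject_h0` are made up for this sketch, the critical value still has to be read from a t table, and the numbers in the usage lines are hypothetical.

```python
import math

def t_statistic(sample_mean, mu0, s, n):
    """One-sample t statistic: (sample mean - hypothesized mean) * sqrt(n) / s."""
    return (sample_mean - mu0) * math.sqrt(n) / s

def reject_h0(t, critical, tail):
    """Apply the rejection rule for a two-tailed, upper-tail or lower-tail test.

    `critical` is the positive table value: t(alpha/2, n-1) for a two-tailed
    test, t(alpha, n-1) for a one-tailed test.
    """
    if tail == "two":
        return abs(t) >= critical
    if tail == "upper":
        return t >= critical
    if tail == "lower":
        return t <= -critical
    raise ValueError("tail must be 'two', 'upper' or 'lower'")

# Hypothetical data: sample mean 103, hypothesized mean 100, s = 6, n = 36.
t = t_statistic(103, 100, 6, 36)
# 2.030 is the table value t(0.025, 35) for a two-tailed test at alpha = 0.05.
print(t, reject_h0(t, 2.030, "two"))
```

Keeping the critical value as an input mirrors the table lookup used in the worked examples that follow.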

Two-tailed testing
Example 1: A clothing designer claims that the mean height of adult females is 155 cm. The evidence
we have is that a sample of 16 females had an average height of 153 cm and the population standard
deviation is known to be 9 cm. At the 1% significance level, is the clothing designer's claim right
or wrong?
Step 1:
Assuming Ho: the mean height of adult females is 155 cm, µ = µo = 155 cm.
Ha: the mean height of adult females is NOT 155 cm, µ ≠ µo = 155 cm.
Step 2: µo = 155 cm; n = 16; $\bar{x}$ = 153 cm; s = 9 cm
$$t = \frac{(\bar{x} - \mu_o)\sqrt{n}}{s} = \frac{(153 - 155)\sqrt{16}}{9} = -0.889$$

Step 3: Confidence = 1 – significance = 1 – 1% = 99%

α = 1%; n = 16
⇒ α/2 = 0.005; n – 1 = 15
⇒ $t_{\alpha/2,\,n-1} = t_{0.005,\,15} = 2.947$

Step 4: $|t| = 0.889 < t_{0.005,\,15} = 2.947$

Step 5: Do not reject Ho.
It means there is not enough evidence at the 1% significance level to reject the claim that the mean
height of adult females is 155 cm. (Strictly, a known population standard deviation calls for a z-test
with critical value 2.576; the decision is the same.)

Example 2: In the statistical task for the BMW company, the author compares the mean of the "Price"
variable in the attached Excel file with a test value of 17200 at the 99% level of
confidence.
Step 1: Assuming Ho: the average price of the BMW is €17200: µ = µo = 17200
Ha: the average price of the BMW is NOT €17200: µ ≠ µo = 17200
Step 2: Confidence coefficient is 99%
⇒ α = 1 – 99% = 1% = 0.01

Step 3:

One-Sample Statistics
             N     Mean        Std. Deviation    Std. Error Mean
price (€)    50    17228.00    12095.606         1710.577

One-Sample Test (Test Value = 17200)
                                                                 99% Confidence Interval of the Difference
             t       df    Sig. (2-tailed)    Mean Difference    Lower        Upper
price (€)    .016    49    .987               28.000             -4556.26     4612.26

From the One-Sample T Test table:

Sig. (2-tailed) = .987 > α = 0.01
⇒ Do not reject Ho

Conclusion: Ho is not rejected, which means there is no evidence that the average price of the BMW
differs from €17200.

One-tailed testing
Example 1: A sample of 16 people has an average weight of 55 kg and a standard deviation of 5 kg. Let's
test the claim that the population mean is less than or equal to 52 kg at the 90% level of confidence.
Step 1:
Assuming Ho: the mean weight of people is less than or equal to 52 kg, µ ≤ µo = 52 kg.
Ha: the mean weight of people is greater than 52 kg, µ > µo = 52 kg.

Step 2: µo = 52 kg; n = 16; $\bar{x}$ = 55 kg; s = 5 kg
$$t = \frac{(\bar{x} - \mu_o)\sqrt{n}}{s} = \frac{(55 - 52)\sqrt{16}}{5} = 2.4$$
Step 3: Confidence = 90%

⇒ Significance = 1 – confidence = 1 – 90% = 10% = 0.1

α = 10%; n = 16
⇒ $t_{\alpha,\,n-1} = t_{0.1,\,15} = 1.341$

Step 4: $t = 2.4 > t_{0.1,\,15} = 1.341$

Step 5: Reject Ho
It means the mean weight of people is greater than 52 kg, µ > µo = 52 kg.

Example 2: In the statistical task for the BMW company, the author tests whether the mean of the "Price"
variable in the attached Excel file is less than or equal to €20000 at the 95% level of confidence
(significance level = 0.05) and the 99% level of confidence (significance level = 0.01), using both the
t-value and the p-value of the test statistic.
Assuming Ho: the mean price of the BMW is less than or equal to €20000: µ ≤ µo = 20000
Ha: the mean price of the BMW is more than €20000: µ > µo = 20000

Findings:
The Excel analysis of Ho presented the following findings:

• t-multiple at 5% level of significance = -1.677
• t-multiple at 1% level of significance = -2.405
• t-value test = -1.621
• p-value test = 0.056

Discussion:
For this upper-tail test the sample mean (17228) lies below the hypothesized value, so the t-value is
negative and cannot fall in the upper rejection region.
By using t-value
At 5% level of significance: |t-value| = |−1.621| = 1.621 < |t-multiple| = |−1.677| = 1.677
⇒ Do not reject Ho

At 1% level of significance: |t-value| = |−1.621| = 1.621 < |t-multiple| = |−2.405| = 2.405

⇒ Do not reject Ho

By using p-value
At 5% level of significance: p-value = 0.056 > alpha = 0.05
⇒ Do not reject Ho

At 1% level of significance: p-value = 0.056 > alpha = 0.01

⇒ Do not reject Ho

Conclusion: Ho is not rejected at either significance level, which means there is no evidence that the
mean price of the BMW is more than €20000: µ ≤ µo = 20000.

Two sample t-test

The two-sample t-test (also called the independent samples t-test) is a method used to test
whether the unknown population means of two groups are equal. A two-sample t-test is often
used to analyze the results of A/B tests. You can use the test when your data values
are independent, are randomly sampled from two normal populations, and the two independent groups
have equal variances.
To conduct a valid test:

• Data values must be independent. Measurements for one observation do not affect
measurements for any other observation.
• Data in each group must be obtained through a random sample from the population.
• Data in each group are normally distributed.
• Data values are continuous.
• The variances for the two independent groups are equal.

Independent Sample T-test

Example 1: With 95% confidence and the parameters given below, let us estimate the difference between
the average Math grades of males and females.
                      Males    Females
Mean                  150      146
Standard Deviation    4        3.5
Sample Size           22       20

Estimation
Example 1:
Step 1: Degrees of freedom when the standard deviations are similar (4 : 3.5 = 1.1428 < 1.5)
$$df = n_1 + n_2 - 2 = 22 + 20 - 2 = 40$$
Step 2:
Confidence coefficient = 95%
⇒ Significance level = 1 – 95% = 5% = 0.05
⇒ α/2 = 0.05/2 = 0.025

Look up the t value in the Student's t distribution table:

$$t_{\alpha/2,\,df} = t_{0.025,\,40} = 2.021$$

Step 3: Pooled variance of the two samples

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(22 - 1)4^2 + (20 - 1)3.5^2}{22 + 20 - 2} = 14.21875$$

Step 4: Difference of population means:

$$\mu_1 - \mu_2 = \bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2,\,df}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$= 150 - 146 \pm 2.021\sqrt{14.21875\left(\frac{1}{22} + \frac{1}{20}\right)} = 4 \pm 2.355$$

Step 5: Conclusion: We are 95% confident that the difference between the average Math grades of
males and females is 4 ± 2.355 = (1.645; 6.355). In other words, on average, males score between
1.645 and 6.355 points higher in Math than females.

Hypothesis testing
Example 2:
Conditions to reject Ho
Ho               Ha               Reject Ho if
µ1 − µ2 = 0      µ1 − µ2 ≠ 0      $t \le -t_{\alpha/2,\,df}$ or $t \ge t_{\alpha/2,\,df}$

Test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

Step 1: Assuming
Ho: there is NO difference between the Math grades of males and females, µ1 − µ2 = 0.
Ha: there is a difference between the Math grades of males and females, µ1 − µ2 ≠ 0.
Step 2: Pooled variance of the two samples:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(22 - 1)4^2 + (20 - 1)3.5^2}{22 + 20 - 2} = 14.21875$$
Step 3:
Calculate t
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(150 - 146) - 0}{\sqrt{14.21875\left(\frac{1}{22} + \frac{1}{20}\right)}} = 3.4334$$

Calculate $t_{\alpha/2,\,df}$

Confidence coefficient = 95%

⇒ Significance level = 1 – 95% = 5% = 0.05
⇒ α/2 = 0.05/2 = 0.025
Look up the t value in the Student's t distribution table:

$$t_{\alpha/2,\,df} = t_{0.025,\,40} = 2.021$$

Step 4: Comparison

Have: $t = 3.4334 > t_{\alpha/2,\,df} = t_{0.025,\,40} = 2.021$

⇒ Reject Ho

Looking at the graph, we can see the Ho and Ha regions. The rejection region (Ha) lies in the two tails,
to the left of −2.021 and to the right of 2.021. The value $t = 3.4334 > t_{0.025,\,40} = 2.021$ falls in
the right-hand tail of the graph.

Step 5: Conclusion: there is a difference between the Math grades of males and females, µ1 − µ2 ≠ 0.
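The pooled-variance calculations used in the estimation and hypothesis-testing examples above can be re-checked with a short Python sketch. The helper name `pooled_variance` is made up for this illustration, and the critical value 2.021 is simply the table value quoted in the text.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """Pooled variance of two independent samples with similar variances."""
    return ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Math grade data from the example: males first, females second.
n1, mean1, s1 = 22, 150, 4
n2, mean2, s2 = 20, 146, 3.5
t_crit = 2.021  # t(0.025, 40), read from the t table

sp2 = pooled_variance(n1, s1, n2, s2)
std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
diff = mean1 - mean2

# 95% confidence interval for the difference of the population means
margin = t_crit * std_error
ci = (diff - margin, diff + margin)

# Two-tailed test of Ho: mu1 - mu2 = 0
t_stat = (diff - 0) / std_error
reject = abs(t_stat) >= t_crit

print(sp2, [round(v, 3) for v in ci], round(t_stat, 4), reject)
```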

Example 3:
Assuming
Ho: there is NO difference between the average salary of technical staff and non-technical staff:
µ1 − µ2 = 0
Ha: there is a difference between the average salary of technical staff and non-technical staff: µ1 − µ2 ≠ 0

Group Statistics
          PC Job    N      Mean        Std. Deviation    Std. Error Mean
Salary    Yes       19     40305.26    6026.467          1382.566
          No        189    39883.39    11662.429         848.317

Notice: Yes stands for technical staff
No stands for non-technical staff

Independent Samples Test
                                         Levene's Test for
                                         Equality of Variances    t-test
                                         F        Sig.            t       df        Sig. (2-tailed)
Salary    Equal variances assumed        2.737    .100            .155    206       .877
          Equal variances not assumed                             .260    33.648    .796

From the result above:

• Sig. of Levene's test = 0.100 > α = 0.05
⇒ The variances are similar
• Sig. of t-test (green) = 0.877 > α = 0.05
⇒ Do not reject Ho

Conclusion: Ho is not rejected, which means there is no evidence of a difference between the average
salary of technical staff and non-technical staff. In other words, the data are consistent with the
average salary of technical staff being equal to that of non-technical staff.

Dependent Sample T-test

Estimation
Example 1:
Use the information about the couples' housework time every week. With 95% confidence, estimate
the mean difference in housework time between wives and husbands.
Step 1: Define the differences D
Wife                  19    10    12    22    25
Husband               14    11    8     13    28
D = Wife − Husband    5     -1    4     9     -3
Step 2:
Calculate the standard deviation of the differences
Have: $n_D = 5$; $\bar{x}_D = 2.8$
$$s_D^2 = \frac{\sum (x_D - \bar{x}_D)^2}{n_D - 1} = \frac{(5 - 2.8)^2 + (-1 - 2.8)^2 + (4 - 2.8)^2 + (9 - 2.8)^2 + (-3 - 2.8)^2}{5 - 1} = 23.2$$

⇒ $s_D = 4.816$

Step 3:

DF = $n_D - 1$ = 5 − 1 = 4
With 95% confidence
⇒ Significance level = 1 – 95% = 0.05
⇒ α/2 = 0.05/2 = 0.025

Look up the t value in the Student's t distribution table:

$$t_{\alpha/2,\,DF} = t_{0.025,\,4} = 2.776$$
Step 4:
$$\mu_D = \bar{x}_D \pm t_{\alpha/2,\,DF}\frac{s_D}{\sqrt{n_D}} = 2.8 \pm 2.776\frac{4.816}{\sqrt{5}} = 2.8 \pm 5.98 = (-3.18; 8.78)$$
Step 5:
Interpretation: The mean difference could be as high as 8.78 (in favor of the wives) or as low as -3.18
(the "-" sign indicates that it could be in favor of the husbands).
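The paired-sample interval above can be reproduced with Python's standard `statistics` module. This is a minimal sketch: the variable names are illustrative, and the critical value 2.776 is the table value t(0.025, 4) used in the text.

```python
import math
import statistics

wife = [19, 10, 12, 22, 25]
husband = [14, 11, 8, 13, 28]

# Step 1: paired differences (wife minus husband)
diffs = [w - h for w, h in zip(wife, husband)]

# Step 2: mean and sample standard deviation of the differences
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)  # uses the n - 1 denominator

# Steps 3-4: 95% confidence interval with t(0.025, 4) from the t table
t_crit = 2.776
margin = t_crit * sd_d / math.sqrt(len(diffs))
ci = (mean_d - margin, mean_d + margin)

print(diffs, mean_d, round(sd_d, 3), [round(v, 2) for v in ci])
```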

Hypothesis testing
Example 2:
Testing the difference between the means of the Mile age variable and the Price variable based on a
data set of the BMW company at the 95% level of confidence (α = 0.05).
Step 1: Assuming
Ho: there is NO difference between the averages of Mile age and Price: µ1 − µ2 = 0
Ha: there is a difference between the averages of Mile age and Price: µ1 − µ2 ≠ 0

Paired Samples Statistics
                    Mean         N     Std. Deviation    Std. Error Mean
Pair 1    mileage   119943.82    50    71202.830         10069.601
          price     17228.00     50    12095.606         1710.577

Paired Samples Correlations
                            N     Correlation    Sig.
Pair 1    mileage & price   50    -.603          .000

From the data above:

Mean difference µD = 102715.820
Sample size: n1 = n2 = 50
Sig. = .000 < α = 0.05
⇒ Reject Ho. It means there is a difference between the averages of Mile age and Price:
µ1 − µ2 ≠ 0

In conclusion: There is a difference between the averages of Mile age and Price:
µ1 − µ2 ≠ 0

Measuring the association between two variables (from the dataset)
Correlation analysis
Correlation analysis in research is a statistical method used to measure the strength of the linear
relationship between two variables and compute their association. Put simply, correlation analysis
calculates the level of change in one variable due to the change in the other. A high correlation points to a
strong relationship between the two variables, while a low correlation means that the variables are weakly
related. (QuestionPro, 2021)

The possible range of values for the correlation coefficient is –1.0 to 1.0. In other words, the values
cannot exceed 1.0 or be less than –1.0. A correlation of –1.0 indicates a perfect negative correlation,
and a correlation of 1.0 indicates a perfect positive correlation. If the correlation coefficient
is greater than zero, it is a positive relationship. Conversely, if the value is less than zero, it is a
negative relationship. A value of zero indicates that there is no linear relationship between the two variables.
(Nickolas, 2021)

There are several types of correlation coefficient formulas. One of the most commonly used
is Pearson's correlation coefficient, which has a population form and a sample form.
Correlation coefficient of population:
$$\rho = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum (x_i - \mu_x)^2 \sum (y_i - \mu_y)^2}}$$
Correlation coefficient of sample:
$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}}$$
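As an illustration of the sample formula, the following Python sketch computes Pearson's r directly from the sums in the numerator and denominator. The function name `pearson_r` and the small data set are hypothetical and are not taken from the BMW file.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises (or falls) exactly linearly with x
xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 6, 8, 10]))   # perfect positive relationship
print(pearson_r(xs, [10, 8, 6, 4, 2]))   # perfect negative relationship
```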
The output below comes from running a Pearson correlation analysis on the BMW data in SPSS, so that
we can understand how the parameters in a correlation table are read and make the statistical analysis more
effective. The table shows the correlations among four variables:
Price, Engine power, Registration date and Mile age.

Correlations
                                             price      engine_power    registration_date    mile_age
price                Pearson Correlation     1          .355*           .702**               -.603**
                     Sig. (2-tailed)                    .011            .000                 .000
                     N                       50         50              50                   50
engine_power         Pearson Correlation     .355*      1               .067                 -.194
                     Sig. (2-tailed)         .011                       .643                 .177
                     N                       50         50              50                   50
registration_date    Pearson Correlation     .702**     .067            1                    -.623**
                     Sig. (2-tailed)         .000       .643                                 .000
                     N                       50         50              50                   50
mile_age             Pearson Correlation     -.603**    -.194           -.623**              1
                     Sig. (2-tailed)         .000       .177            .000
                     N                       50         50              50                   50

*. Correlation is significant at the 0.05 level (2-tailed).
**. Correlation is significant at the 0.01 level (2-tailed).
Looking at this table, each of Price, Engine power, Registration date and Mile age has a
perfect linear relationship with itself, so the diagonal entries are all r = 1. When reading the Correlations
table, we are interested in the Sig. and Pearson Correlation values. The Sig. value (in purple) must
be less than α = 0.05 for the correlation r to be statistically significant. For any Sig. value smaller than
0.05, we conclude that the two variables are linearly correlated; for a value higher than
0.05, there is no statistically significant linear correlation between them.
Specifically, the Sig. value for Registration date and Engine power is 0.643 (> α = 0.05), so
there is no significant linear correlation between these two variables. In contrast, there is a correlation
between Engine power and Price because the Sig. of 0.011 is smaller than 0.05 (< α = 0.05). The strength
of the correlation is given by the Pearson Correlation value (in green): the correlation coefficient
between Engine power and Price is 0.355, a positive but fairly weak relationship.

Regression
Regression analysis is a set of statistical methods used to estimate relationships
between a dependent variable and one or more independent variables. It can be used to assess the strength
of the relationship between variables and to model the future relationship between them. Regression
models describe the relationship between variables by fitting a line to the observed data. Linear regression
models use a straight line, while logistic and nonlinear regression models use a curved line.
Regression allows you to estimate how a dependent variable changes as the independent variable(s) change.
Regression analysis includes several variations, such as linear, multiple linear, and nonlinear.
The most common models are simple linear and multiple linear. Nonlinear regression
analysis is commonly used for more complicated data sets in which the dependent and independent variables
show a nonlinear relationship. (CFI Education, 2021)

Simple linear regression

Simple linear regression is used to model the relationship between two continuous variables. Often,
the objective is to predict the value of an output variable (or response) based on the value of an input (or
predictor) variable. (Bevans, 2020)
The formula for a simple linear regression is:
$$y = \beta_0 + \beta_1 x + \varepsilon$$
$y$: the predicted value of the dependent variable for any given value of the independent variable ($x$)
$\beta_0$: the intercept, the predicted value of $y$ when $x$ is 0
$\beta_1$: the regression coefficient – how much we expect $y$ to change as $x$ increases by one unit
$x$: the independent variable (the variable we expect is influencing $y$)
$\varepsilon$: the error of the estimate, or how much variation there is in our estimate of the regression coefficient

Example 1:
A part-time employee is paid a basic salary of 1 million VND per month. In addition, for each contract
he signs, he is paid an additional 200,000 VND. Let S be the total amount of money the employee receives
in a month and T the total number of contracts he signs; then S = 1,000,000 + 200,000T.
So, assuming this employee signs 3 contracts, the total amount he gets is
S = 1,000,000 + 200,000 × 3 = 1,600,000 VND.

Example 2:
Practice on SPSS
I choose the dependent variable Price and the independent variable Engine power to run linear
regression for BMW dataset. The results obtained are the following four tables:

Variables Entered/Removed(a)
Model    Variables Entered    Variables Removed    Method
1        engine_power(b)      .                    Enter
a. Dependent Variable: price
b. All requested variables entered.

Model Summary
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate
1        .355(a)    .126        .108                 11422.903
a. Predictors: (Constant), engine_power

ANOVA(a)
Model              Sum of Squares    df    Mean Square      F        Sig.
1    Regression    905710557.834     1     905710557.834    6.941    .011(b)
     Residual      6263170242.166    48    130482713.378
     Total         7168880800.000    49
a. Dependent Variable: price
b. Predictors: (Constant), engine_power

Coefficients(a)
                        Unstandardized Coefficients    Standardized Coefficients
Model                   B           Std. Error         Beta                         t        Sig.
1    (Constant)         4648.878    5040.431                                        .922     .361
     engine_power       88.176      33.468             .355                         2.635    .011
a. Dependent Variable: price

The ANOVA table shows that the Sig. of the F test is 0.011, which is smaller than 0.05. Therefore,
the regression model is statistically significant. Based on the Coefficients table, the
linear regression model is y = 4648.878 + 88.176x. More specifically, where y is Price and x is Engine
power, the equation is Price = 4648.878 + 88.176 × Engine power.
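The figures in the SPSS output can be cross-checked from the sums of squares in the ANOVA table, and the fitted equation can be used for prediction. The sketch below assumes only the values printed in the tables; the function name `predict_price` and the engine power of 100 are hypothetical.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table
ss_regression = 905710557.834
ss_residual = 6263170242.166
df_regression, df_residual = 1, 48

ss_total = ss_regression + ss_residual
r_squared = ss_regression / ss_total  # share of variance explained
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

def predict_price(engine_power):
    """Predicted price from the fitted simple regression line."""
    return 4648.878 + 88.176 * engine_power

print(round(r_squared, 3), round(f_stat, 3))
print(predict_price(100))  # hypothetical car with engine power 100
```

Rounded to three decimals, the recomputed R Square and F should agree with the Model Summary and ANOVA tables.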

Multiple linear regression

Multiple linear regression (MLR), also known simply as multiple regression, is a statistical
technique that uses several explanatory variables to predict the outcome of a response variable. The objective
of multiple linear regression is to model the linear relationship between the explanatory (independent)
variables and the response (dependent) variable. In essence, multiple regression is the extension of
ordinary least-squares (OLS) regression because it involves more than one explanatory variable. (Hayes,
2021)
Formula and Calculation of Multiple Linear Regression:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon$$
Where, for $i = n$ observations:
$y_i$: dependent variable
$x_i$: explanatory variables
$\beta_0$: y-intercept (constant term)
$\beta_p$: slope coefficients for each explanatory variable
$\varepsilon$: the model's error term (also known as the residuals)

When running multiple linear regression in SPSS, we are interested in parameters such as R-squared,
Adjusted R-squared and Durbin-Watson in the Model Summary, the Sig. value in the ANOVA table, and VIF and
B in the Coefficients table.
The "R" column represents the value of R, the multiple correlation coefficient. R can be considered
one measure of the quality of the prediction of the dependent variable.
The "R Square" column represents the R2 value (also called the coefficient of determination), which
is the proportion of variance in the dependent variable that can be explained by the independent variables
(technically, it is the proportion of variation accounted for by the regression model above and beyond the mean
model). (Laerd Statistics, 2021)
"Adjusted R Square" (adj. R2) is a modified version of R-squared that has been adjusted for the number
of predictors in the model. The adjusted R-squared increases when a new term improves the model
more than would be expected by chance. It decreases when a predictor improves the model
by less than expected. Typically, the adjusted R-squared is positive, not negative. It is never
higher than the R-squared. (The Investopedia Team, 2021)
The Durbin-Watson (DW) statistic is a test for autocorrelation in the residuals from a statistical
model or regression analysis. The Durbin-Watson statistic always has a value between 0
and 4. (Kenton, 2021)
• Values from 1.5 to 2.5 indicate that no autocorrelation is detected in the sample.
• A value close to 0 indicates positive autocorrelation.
• A value close to 4 indicates negative autocorrelation.
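For reference, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows that calculation; the function name `durbin_watson` and both residual series are hypothetical and chosen only to show the two extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that flip sign every step (negative autocorrelation)
alternating = [1, -1, 1, -1, 1, -1]
# Hypothetical residuals that drift slowly (positive autocorrelation)
trending = [3, 2, 1, -1, -2, -3]

print(round(durbin_watson(alternating), 3))  # well above 2.5
print(round(durbin_watson(trending), 3))     # well below 1.5
```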

Unstandardized coefficients indicate how much the dependent variable varies with an independent variable
when all other independent variables are held constant.
We can test for the statistical significance of each of the independent variables. This tests whether the
unstandardized (or standardized) coefficients are equal to 0 (zero) in the population. If p < .05, we
can conclude that the coefficients are statistically significantly different from 0 (zero). The t-value and
corresponding p-value are located in the "t" and "Sig." columns, respectively. (Laerd Statistics, 2021)
Variance inflation factor (VIF) is a measure of the amount of multicollinearity in a set of multiple
regression variables. Mathematically, the VIF for a regression model variable is equal to the
ratio of the overall model variance to the variance of a model that includes only that
single independent variable. This ratio is calculated for each independent variable. A high VIF
indicates that the associated independent variable is highly collinear with the other variables in the
model. (The Investopedia Team, 2021)
Exactly how large a VIF has to be before it causes issues is a subject of debate. What is known is
that the more your VIF increases, the less reliable your regression results are going to be. In general, a VIF
above 10 indicates high correlation and is cause for concern. Some authors suggest a more conservative level
of 2.5 or above. (Stephanie, 2015)
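In practice the VIF of a predictor is usually computed as 1 / (1 − R²), where R² comes from regressing that predictor on the other predictors; the quantity 1 − R² is what SPSS reports as Tolerance. The sketch below illustrates this under the assumption of exactly two predictors, in which case R² is simply their squared Pearson correlation. The function name is made up, and the correlation −0.194 between engine_power and mile_age is taken from the Correlations table earlier in this report.

```python
def vif_from_r_squared(r_squared_j):
    """VIF of predictor j, given R^2 from regressing it on the other predictors."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("R^2 must be in [0, 1)")
    return 1 / (1 - r_squared_j)

# With only two predictors, R^2 is the squared correlation between them.
r_engine_mileage = -0.194
tolerance = 1 - r_engine_mileage ** 2
vif = vif_from_r_squared(r_engine_mileage ** 2)

print(round(tolerance, 3), round(vif, 3))
```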
A standardized beta coefficient compares the strength of the effect of each individual independent
variable on the dependent variable. The higher the absolute value of the beta coefficient, the stronger
the effect. (Stephanie, 2016)

Example:
Practice on SPSS the BMW’s dataset

Model Summary(b)
Model    R          R Square    Adjusted R Square    Std. Error of the Estimate    Durbin-Watson
1        .650(a)    .422        .398                 9387.581                      1.737
a. Predictors: (Constant), engine_power, mile_age
b. Dependent Variable: price

Assuming Ho: R2 = 0
⇒ The model does not exist.
Ha: R2 ≠ 0
⇒ The model does exist.

From the Model Summary above, R2 = 0.422, so the independent variables explain 42.2% of the variability
of the dependent variable in the sample; whether Ho can be rejected is decided by the F-test in the ANOVA table.
Next, Adjusted R2 = 0.398 indicates that, after adjusting for the number of predictors, the independent
variables account for 39.8% of the variability of the dependent variable. The remaining 60.2% is accounted
for by variables outside the model and random error.
The last indicator to mention is Durbin-Watson. Its value of 1.737 lies within the range from
1.5 to 2.5, which indicates no autocorrelation in the selected data.

ANOVA(a)
Model              Sum of Squares    df    Mean Square       F         Sig.
1    Regression    3026927112.315    2     1513463556.157    17.174    .000(b)
     Residual      4141953687.685    47    88126674.206
     Total         7168880800.000    49
a. Dependent Variable: price
b. Predictors: (Constant), engine_power, mile_age

From the ANOVA table above, the Sig. of the F-test is 0.000 < α = 0.05, so we can conclude that the
regression model is statistically significant.
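The R Square, Adjusted R Square and F values reported by SPSS can be recomputed from the ANOVA sums of squares. The sketch below uses the standard adjustment 1 − (1 − R²)(n − 1)/(n − k − 1), with n = 50 observations and k = 2 predictors as in this example; the variable names are illustrative.

```python
n = 50  # number of cars in the sample
k = 2   # number of predictors (engine_power, mile_age)

# Sums of squares copied from the ANOVA table
ss_regression = 3026927112.315
ss_residual = 4141953687.685
ss_total = ss_regression + ss_residual

r_squared = ss_regression / ss_total
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))

print(round(r_squared, 3), round(adjusted_r_squared, 3), round(f_stat, 3))
```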

From the Coefficients table results, no independent variable has a VIF greater than 10;
in fact, all are less than 2 (1.039 < 2), so there is no multicollinearity.
Conversely, if any VIF is greater than 10, look back to identify which two independent variables are
highly correlated. Then remove each of those two variables in turn and compare the adjusted R square
of the two resulting models. Select the model with the higher adjusted R square.

Moving on to the Sig. column of the t-test, the Sig. values of all variables are less than 0.05 (0.000 < 0.05
and 0.033 < 0.05), which means the independent variables "Mile Age" and "Engine Power" both have a
significant effect on the dependent variable "Price".

Let's discuss the Coefficients table again, this time focusing on the Unstandardized
Coefficients column.
Using the B values in the Coefficients table, with y for "Price", x1 for "Mile Age", x2 for
"Engine Power", β0 = 19750.508, β1 = –0.094 and β2 = 61.512, the regression equation is:
$$y = 19750.508 - 0.094x_1 + 61.512x_2$$

To interpret the coefficients:

• The intercept is 19750.508: the predicted Price of a BMW vehicle is €19750.508 when Mile Age and
Engine Power are both 0.
• All other factors unchanged, when Mile Age increases by 1 unit, Price decreases by €0.094 on
average.
• All other factors unchanged, when Engine Power increases by 1 unit, Price rises by €61.512 on
average.
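The meaning of the unstandardized coefficients can be checked numerically: changing one predictor by one unit while holding the other fixed changes the predicted price by exactly that predictor's coefficient. In this sketch the function name `predict_price` and the example car (mileage 100000, engine power 120) are hypothetical.

```python
def predict_price(mile_age, engine_power):
    """Predicted price from the fitted multiple regression equation."""
    return 19750.508 - 0.094 * mile_age + 61.512 * engine_power

# Hypothetical car used as a baseline
base = predict_price(100000, 120)

# Change one predictor by one unit while holding the other fixed
delta_mile_age = predict_price(100001, 120) - base
delta_engine_power = predict_price(100000, 121) - base

print(round(delta_mile_age, 3), round(delta_engine_power, 3))
```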

Coefficients(a)
                        Unstandardized Coefficients    Standardized Coefficients                      Collinearity Statistics
Model                   B            Std. Error        Beta                         t         Sig.    Tolerance    VIF
1    (Constant)         19750.508    5160.786                                       3.827     .000
     mile_age           -.094        .019              -.554                        -4.906    .000    .962         1.039
     engine_power       61.512       28.037            .248                         2.194     .033    .962         1.039
a. Dependent Variable: price
Moving to the Standardized Coefficients column, these coefficients are better suited to comparing
variables and proposing solutions; the standardized regression equation is written in decreasing order of
the impact of the independent variables.
Price = −0.554 × Mile Age + 0.248 × Engine Power
To interpret the coefficients:
• All other factors unchanged, when Mile Age increases by 1 standard deviation, Price decreases by
0.554 standard deviations.
• All other factors unchanged, when Engine Power increases by 1 standard deviation, Price rises by
0.248 standard deviations.

Histogram of the Residual

The histogram of the residuals can be used to check whether the residuals are normally distributed. A
symmetric bell-shaped histogram that is evenly distributed around 0, with a standard deviation
close to 1, indicates that the normality assumption is likely to be true. If
the histogram indicates that the random error is not normally distributed, it suggests that the
model's underlying assumptions may have been violated.

From the data above, Mean = −6.77 × 10^−17 is approximately 0, and Standard Deviation = 0.979 is close to
1. Therefore, the residuals are approximately normally distributed.

Normal P-P Plot of Regression

In statistics, a P–P plot (probability–probability plot or percent–percent plot or P value plot) is a
probability plot for assessing how closely two data sets agree, which plots the two
cumulative distribution functions against each other. P-P plots are widely used to evaluate the
skewness of a distribution.

From the plot above: the points lie close to the diagonal line, so the residuals are approximately
normally distributed.

Scatterplot

Based on the scatterplot, the points are distributed randomly and gather around the 0 axis. So I can
conclude that the assumption of a linear relationship between the independent and dependent variables is
not violated.

Apply a range of statistical methods used in business planning for quality, inventory and capacity management
Process variability is the variation that occurs during the manufacturing process. When using statistics,
process variability can be demonstrated in two ways. The first is by calculating the variance with a
mathematical formula. The second is by finding the standard deviation from the variance and
using it to develop a histogram.

Probability Distribution
A probability distribution is a statistical function that describes all the possible values and
probabilities that a random variable can take within a given range. This range is bounded between
the minimum and maximum possible values, but exactly where a possible value is likely to
be plotted on the probability distribution depends on a number of factors. These factors include the
distribution's mean (average), standard deviation, skewness, and kurtosis. (Hayes, 2020)

Joint Probability
Joint probability is a statistical measure that calculates the likelihood of two events occurring together
and at the same point in time. Joint probability is the probability of event Y occurring at the same time
that event X occurs. (Kenton, 2021)
P(A and B) if A and B are independent events:
$$P(A \text{ and } B) = P(A) \times P(B)$$
The probability of A or B depends on whether the events are mutually exclusive (ones that cannot happen
at the same time) or not.
P(A or B) if A and B are mutually exclusive:
$$P(A \text{ or } B) = P(A) + P(B)$$
P(A or B) if A and B are NOT mutually exclusive:
$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$$

Conditional Probability
Conditional probability is defined as the likelihood of an event or outcome occurring, based on
the occurrence of a previous event or outcome. Conditional probability is calculated by multiplying the
probability of the preceding event by the updated probability of the succeeding, or conditional, event.
(Barone, 2021)
The revised probability that an event A has occurred, given the additional information that another
event B has definitely occurred on this trial of the experiment, is called the conditional
probability of A given B and is denoted by P(A|B).
$$P(A|B) = \frac{P(A \text{ and } B)}{P(B)}$$

Example: I draw a sample of 50 students in class X and classify them by gender (Male, Female) and by
subject (Math, Literature, English).

The corresponding table of JOINT and MARGINAL probabilities:

From the given data, we can see that:

• P(Female and English) = 0.3
• P(Male and Literature) = 0
⇒ The events "Male" and "Literature" are mutually exclusive.
• Assigning "Female" as A and "English" as B:
$$P(\text{Female}|\text{English}) = \frac{P(\text{Female and English})}{P(\text{English})} = \frac{0.3}{0.4} = 0.75$$
• Two events are independent only if $P(A|B) = \frac{P(A \text{ and } B)}{P(B)} = P(A)$. Here we have:
P(Female|English) = 0.75 ≠ P(Female) = 0.6
⇒ The two events "Female" and "English" are dependent events.
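The same probability rules can be applied in a few lines of Python. This sketch uses only the probabilities quoted in the example (0.3, 0.4 and 0.6); the function name `conditional` is made up for illustration.

```python
import math

def conditional(p_a_and_b, p_b):
    """P(A|B) = P(A and B) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_a_and_b / p_b

# Probabilities quoted in the example
p_female = 0.6
p_english = 0.4
p_female_and_english = 0.3

p_female_given_english = conditional(p_female_and_english, p_english)

# A and B are independent only when P(A|B) equals P(A)
independent = math.isclose(p_female_given_english, p_female)

# General addition rule for events that are not mutually exclusive
p_female_or_english = p_female + p_english - p_female_and_english

print(round(p_female_given_english, 2), independent, round(p_female_or_english, 2))
```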

Applying a range of statistical methods used in business planning for quality, inventory, and capacity management
Measuring the variability in business processes or quality management
All manufacturing and measurement processes exhibit variation. A central part of quality management involves
tracking, identifying and managing changes that occur in a system or process; these are variations.
When changes are planned and executed well, the results are usually good – products
improve, and processes become more efficient. When changes have not been planned,
variations in a system are almost always bad: products fall short of specification
requirements, processes become inefficient, and companies lose money. (Weedmark, 2021)

Measuring the probability by using probability distributions to business operations and processes
A probability distribution is a statistical model that shows the possible outcomes of a particular event or
course of action as well as the statistical likelihood of each event. For example, a company might have a
probability distribution for the change in sales given a particular marketing campaign. The values on the
"tails", or the left and right ends of the distribution, are much less likely to occur than those
in the middle of the curve. (Richards, n.d)

Normal Distribution
Normal distribution, also known as the Gaussian distribution, is a probability distribution that is
symmetric about the mean, showing that data near the mean are more frequent
in occurrence than data far from the mean. In graph form, the normal distribution appears
as a bell curve. (Chen, 2021)
Standard normal density function:
$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$$

The standard normal distribution has two parameters: the mean (of 0) and the standard deviation (of 1).
Formula:
$$z = \frac{x - \mu}{\sigma}$$

Example 1: The mean weight of female is normally distributed with 55kg and the standard deviation is
10kg. Let's calculate the probability of X ≤ 57.3kg?
Step 1: Apply to formula
𝑥−𝑥 57.3 − 55
𝑥= = = 0.23
𝑥 10

Step 2: Looking up the z distribution table

50
𝑥(𝑥 ≤ 57.3) = 𝑥(𝑥 < 0.23) = 0.5910 = 59.1%
Conclusion: The probability of X ≤ 57.3kg is 59.1%.
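The table lookup in Example 1 can be cross-checked numerically. The following is a minimal Python sketch that assumes only the figures given in the example (mean 55 kg, standard deviation 10 kg, x = 57.3 kg); the function name `normal_cdf` is illustrative and not part of the original report.

```python
import math


def normal_cdf(x, mean, std_dev):
    """Return P(X <= x) for a normal distribution, using the error function."""
    z = (x - mean) / std_dev  # standardize x to a z-score
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Figures from Example 1
z_score = (57.3 - 55) / 10
probability = normal_cdf(57.3, mean=55, std_dev=10)

print(f"z = {z_score:.2f}")
print(f"P(X <= 57.3) = {probability:.4f}")
```

Rounded to four decimals, the computed probability agrees with the z-table value of 0.5910 used above.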

Example 2:
Clients' expenditure is normally distributed with a mean of $45 and a standard deviation of $3.
• Calculate the probability that a randomly selected client spends less than $36.
• Calculate the probability that a randomly selected client spends between $13 and $33.
• Calculate the probability that a randomly selected client spends more than $12.
• Calculate the dollar amount such that 80% of all clients spend no more than this.

By using Excel
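The four quantities in Example 2 can also be cross-checked outside Excel. The following is a minimal Python sketch using the standard library's `statistics.NormalDist`; it assumes the spending is normally distributed with the stated mean and standard deviation, and the variable names are illustrative. In Excel, cumulative normal probabilities and their inverse are available through `NORM.DIST` and `NORM.INV`; which functions the original worksheet used is an assumption here.

```python
from statistics import NormalDist

# Client expenditure from Example 2: mean $45, standard deviation $3
spending = NormalDist(mu=45, sigma=3)

# P(X < 36)
p_less_than_36 = spending.cdf(36)
# P(13 < X < 33)
p_between_13_and_33 = spending.cdf(33) - spending.cdf(13)
# P(X > 12)
p_more_than_12 = 1 - spending.cdf(12)
# Amount a such that P(X <= a) = 0.80
amount_80_percent = spending.inv_cdf(0.80)

print(f"P(X < 36)       = {p_less_than_36:.6f}")
print(f"P(13 < X < 33)  = {p_between_13_and_33:.6f}")
print(f"P(X > 12)       = {p_more_than_12:.6f}")
print(f"80th percentile = ${amount_80_percent:.2f}")
```

Because $36, $33, $13 and $12 all lie at least three standard deviations below the mean, the first two probabilities come out very close to 0 and the third very close to 1.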

Poisson Distribution and Binomial Distribution
The Poisson distribution describes the probability of observing k events during a fixed time
interval. If a random variable X follows a Poisson distribution, then the probability that X = k
events can be calculated by the following equation, where λ is the mean number of events per interval:
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$

The binomial distribution describes the probability of obtaining k successes in n binomial trials.
If a random variable X follows a binomial distribution, then the probability that X = k
successes can be found by the following formula:

$$P(k \text{ successes out of } n) = {}_{n}C_{k}\, p^{k} (1 - p)^{n-k}$$

n: number of trials
k: the number of successes
p: probability of success on a given trial
$${}_{n}C_{k} = \frac{n!}{k!\,(n-k)!}$$
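Both probability formulas can be evaluated directly. The following is a minimal Python sketch; the function names and the example inputs are hypothetical and chosen only for illustration, and `rate` stands for the mean number of events per interval.

```python
import math


def poisson_pmf(k, rate):
    """P(X = k) for a Poisson distribution with mean `rate` events per interval."""
    return (rate ** k) * math.exp(-rate) / math.factorial(k)


def binomial_pmf(k, n, p):
    """P(X = k) successes in n independent trials with success probability p."""
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Hypothetical inputs chosen only for illustration
print(f"Poisson  P(X = 0 | rate = 2)       = {poisson_pmf(0, 2):.4f}")
print(f"Binomial P(X = 2 | n = 4, p = 0.5) = {binomial_pmf(2, 4, 0.5):.4f}")
```

`math.comb(n, k)` computes the number of combinations n!/(k!(n−k)!) exactly, which avoids evaluating large factorials separately.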

Comparison
There are several similarities between these two distributions: both are discrete theoretical
probability distributions. Further, depending on the values of the parameters, both can be unimodal or
bimodal. In addition, the binomial distribution can be approximated by the Poisson distribution if the
number of trials (n) tends to infinity and the success probability (p) tends to 0 such that the mean
m = np remains constant. (Surbhi, 2017)
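The Poisson approximation to the binomial distribution can be illustrated numerically. The following is a minimal Python sketch under assumed values (n = 1000 trials, p = 0.003, so m = np = 3); these numbers are hypothetical and do not come from the report's data.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return (mean ** k) * math.exp(-mean) / math.factorial(k)


def binomial_pmf(k, n, p):
    """P(X = k) successes in n independent trials with success probability p."""
    return math.comb(n, k) * (p ** k) * ((1 - p) ** (n - k))


# Hypothetical case: many trials, small success probability
n, p = 1000, 0.003
mean = n * p  # m = np
k = 2

binomial_value = binomial_pmf(k, n, p)
poisson_value = poisson_pmf(k, mean)

print(f"Binomial P(X = {k}) = {binomial_value:.5f}")
print(f"Poisson  P(X = {k}) = {poisson_value:.5f}")
print(f"Absolute difference = {abs(binomial_value - poisson_value):.5f}")
```

With a large n and a small p, the two probabilities differ only in the fourth decimal place, which is why the Poisson form is often used as a simpler stand-in in this regime.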
The differences between Binomial and Poisson distribution can be drawn clearly on the following chart:

Inference
Inference, in statistics, is the process of drawing conclusions about a parameter one is seeking to
measure or estimate. One principal approach to statistical inference is Bayesian estimation, which
incorporates reasonable expectations or prior judgments (perhaps based on previous studies) as well as
new observations or experimental results. Another method is the likelihood approach, in which “prior
probabilities” are eschewed in favor of calculating the value of the parameter that would be most
“likely” to produce the observed distribution of experimental outcomes.
In parametric inference, a particular mathematical form of the distribution function is assumed.
Nonparametric inference avoids this assumption and is used to estimate parameter values of an unknown
distribution having an unknown functional form.
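The likelihood approach described above can be sketched with a small example. The following minimal Python sketch assumes a hypothetical sample of 7 successes in 20 binomial trials (not data from the report) and searches a grid of candidate values for the success probability p that makes the observed result most likely.

```python
import math

# Hypothetical sample: 7 successes observed in 20 trials
successes, trials = 7, 20


def log_likelihood(p):
    """Binomial log-likelihood of p, dropping the constant combinatorial term."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


# Evaluate candidate values 0.001, 0.002, ..., 0.999 and keep the most likely one
candidates = [i / 1000 for i in range(1, 1000)]
best_p = max(candidates, key=log_likelihood)

print(f"Most likely value of p: {best_p:.3f}")
```

The grid search lands on the sample proportion 7/20, which is the value the likelihood approach selects for a binomial parameter; no prior probability enters the calculation.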

Using appropriate charts and tables to communicate findings of given variables
Frequency table
Frequency refers to the number of times an event or a value occurs. A frequency table is a table that
lists items and shows the number of times each item occurs. Frequency is denoted by the letter ‘f’.
A table that presents the frequency of different outcomes in a sample is called a frequency
distribution table. (Mastin, 2020)
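A frequency distribution table can be built by counting how often each value occurs. The following is a minimal Python sketch; the list of responses is hypothetical and used only for illustration.

```python
from collections import Counter

# Hypothetical survey responses, used only for illustration
responses = ["A", "B", "A", "C", "B", "A", "A", "C"]

# Count how many times each value occurs (the frequency f)
frequency = Counter(responses)
total = sum(frequency.values())

# Print the frequency table with the relative frequency of each value
print(f"{'Value':<8}{'f':>4}{'Relative f':>12}")
for value, count in sorted(frequency.items()):
    print(f"{value:<8}{count:>4}{count / total:>12.2%}")
```

The relative-frequency column is an optional addition; it expresses each count as a share of the total, which is the same quantity a pie chart displays.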

Bar chart
A bar chart is a graph that plots data using rectangular bars or columns (called bins) that
represent the total number of observations in the data for that category. Bar charts can be displayed
with vertical columns, horizontal bars, comparative bars (multiple bars to show a comparison between
values), or stacked bars (bars containing multiple kinds of data). (Mitchell, 2021)

Pie chart
A pie chart is a type of graph that displays data in a circular form. The pieces of the graph are
proportional to the fraction of the whole in each category. In other words, each slice of the pie is
relative to the size of that category in the group as a whole. The entire “pie” represents 100% of a
whole, while the pie “slices” represent portions of the whole. (Statistics How To, 2021)

Histogram
A histogram is a graphical representation that organizes a group of data points into user-specified
ranges. Similar in appearance to a bar chart, the histogram condenses a data series into an easily
interpreted visual by taking many data points and grouping them into logical ranges or bins.
(Chen, 2021)

Scatter plot
Scatter plots (also called scatter graphs) are similar to line graphs. A line graph uses a line on an
X-Y axis to plot a continuous function, while a scatter plot uses dots to represent individual pieces
of data. In statistics, these plots are useful for checking whether two variables are related to each
other. For example, a scatter plot can suggest a linear relationship (i.e. a straight line).
(Statistics How To, 2021)
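The strength of the linear relationship that a scatter plot suggests can be summarized with the Pearson correlation coefficient. The following is a minimal Python sketch; the paired observations are hypothetical, and the function name `pearson_r` is illustrative rather than taken from the report.

```python
import math

# Hypothetical paired observations, used only for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in xs))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ys))
    return covariation / (spread_x * spread_y)


r = pearson_r(x, y)
print(f"Pearson r = {r:.4f}")
```

A value of r close to +1 or −1 corresponds to points lying near a straight line on the scatter plot, while a value near 0 indicates no linear relationship.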

The strengths and weaknesses of using different types of charts and tables

The most effective way of communicating the results of the analysis
Accounting information is often presented as tables of numbers, sometimes simply as a printout from
a spreadsheet or as reports from an accounting software package. While this style of presentation
gives detailed figures, it may not always be the best way to present and communicate information.
Some key information may need to be highlighted, relationships between particular figures may need to
be emphasized, or trends identified. Appropriate presentation of data as graphs or charts can be a
useful analysis tool, and if the information is then adequately interpreted, this can facilitate the
decision-making process.
First, bar charts are one of the most common data visualizations. You can use them to quickly compare
data across categories, highlight differences, show trends and outliers, and reveal historical highs
and lows at a glance. Bar charts are especially effective when you have data that can be split into
multiple categories. Second, pie charts are useful for adding detail to other visualizations. Alone, a
pie chart does not give the viewer a way to compare information quickly and accurately. Since the
viewer has to create context on their own, key points from your data are missed. Instead of making a
pie chart the focus of your dashboard, try using it to drill down on other visualizations. Finally,
scatter plots are an effective way to investigate the relationship between different variables,
showing whether one variable is a good predictor of another or whether they tend to change
independently. A scatter plot presents many distinct data points on a single chart. The chart can then
be enhanced with analytics such as cluster analysis or trend lines. (Rodgers, n.d)

Conclusion
To conclude, this report has presented findings and recommendations to support decision-making and
business planning processes at BMW. To that end, I analyzed and evaluated qualitative and quantitative
raw business data from a range of examples using appropriate statistical methods. I also applied a
range of statistical methods used in business planning for quality, inventory and capacity management.
Lastly, I used appropriate charts and tables to communicate the findings for the given variables. Of
the tools I used, SPSS and Excel were particularly useful for presenting data clearly.

References
Barone, A., 2021. Conditional Probability. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/c/conditional_probability.asp#:~:text=Conditional%20probability%20is%20defined%20as,succeeding%2C%20or%20conditional%2C%20event>
[Accessed: 13 May 2022].
Bevans, R., 2020. An introduction to simple linear regression. [Online] Scribbr. Available at:
<https://www.scribbr.com/statistics/simple-linear-regression/> [Accessed: 13 May 2022].
BPP Learning Media, 2013. Business Essential Marketing Intelligence and Planning. 3rd ed. London: BPP
Learning Media. [Accessed: 13 May 2022].
CFI Education, 2021. Regression Analysis – The estimation of relationships between a dependent
variable and one or more independent variables. [Online] Available at:
<https://corporatefinanceinstitute.com/resources/knowledge/finance/regression-analysis/>
[Accessed: 13 May 2022].
Chen, J., 2021. Histogram. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/h/histogram.asp> [Accessed: 13 May 2022].
Chen, J., 2021. Normal Distribution. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/n/normaldistribution.asp> [Accessed: 13 May 2022].
Devault, G., 2021. Advantages and Disadvantages of Quantitative Research. [Online] The Balance Small
Business. Available at:
<https://www.thebalancesmb.com/quantitative-research-advantages-and-disadvantages-2296728>
[Accessed: 13 May 2022].
Frost, J., 2021. Measures of Variability: Range, Interquartile Range, Variance, and Standard Deviation.
[Online] Statistics By Jim – Making statistics intuitive. Available at:
<https://statisticsbyjim.com/basics/variability-range-interquartile-variance-standard-deviation/>
[Accessed: 13 May 2022].
Hayes, A., 2020. Probability Distribution. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/p/probabilitydistribution.asp#:~:text=A%20probability%20distribution%20is%20a,take%20within%20a%20given%20range.&text=These%20factors%20include%20the%20distribution's,deviation%2C%20skewness%2C%20and%20kurtosis>
[Accessed: 13 May 2022].
Hayes, A., 2021. Multiple Linear Regression (MLR). [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/m/mlr.asp> [Accessed: 13 May 2022].
Kenton, W., 2021. Durbin Watson Statistic Definition. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/d/durbin-watson-statistic.asp> [Accessed: 13 May 2022].

Kenton, W., 2021. Joint Probability Definition. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/j/jointprobability.asp#:~:text=Joint%20probability%20is%20a%20statistical,time%20that%20event%20X%20occurs>
[Accessed: 13 May 2022].
Laerd Statistics, 2021. Multiple Regression Analysis using SPSS Statistics. [Online] Available at:
<https://statistics.laerd.com/spss-tutorials/multiple-regression-using-spss-statistics.php>
[Accessed: 13 May 2022].
Majaski, C., 2021. Hypothesis Testing. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/h/hypothesistesting.asp> [Accessed: 13 May 2022].
Mastin, L., 2020. Frequency Statistic – Explanation & Examples. [Online] The Story of Mathematics.
Available at: <https://www.storyofmathematics.com/frequency-statistic> [Accessed: 13 May 2022].
Mitchell, C., 2021. Bar chart. [Online] Investopedia. Available at:
<https://www.investopedia.com/terms/b/bar-graph.asp> [Accessed: 13 May 2022].
Nickolas, S., 2021. What Do Correlation Coefficients Positive, Negative, and Zero Mean? [Online]
Investopedia. Available at:
<https://www.investopedia.com/ask/answers/032515/what-does-it-mean-if-correlation-coefficient-positive-negative-or-zero.asp>
[Accessed: 13 May 2022].
QuestionPro, 2021. Correlation analysis – Using correlation analysis to identify linear relationships
between two variables. [Online] Available at:
<https://www.questionpro.com/features/correlation-analysis.html> [Accessed: 13 May 2022].
Rahman, M., 2021. Advantages and disadvantages of qualitative research. [Online] Howandwhat.
Available at: <https://howandwhat.net/advantages-disadvantages-qualitative-research/>
[Accessed: 13 May 2022].
Reid, A., 2018. Advantages & Disadvantages of a Frequency Table. [Online] Sciencing. Available at:
<https://sciencing.com/do-calculate-class-width-8516043.html> [Accessed: 13 May 2022].
Richards, L., n.d. The Role of Probability Distribution in Business Management. [Online] Chron.
Available at:
<https://smallbusiness.chron.com/role-probability-distribution-business-management-26268.html>
[Accessed: 13 May 2022].
Rodgers, T., n.d. Which Type of Chart or Graph is Right for You? [Online] Available at:
<https://www.tableau.com/learn/whitepapers/which-chart-or-graph-is-right-for-you>
[Accessed: 13 May 2022].
ROM Knowledgeware, 2011. Advantages and disadvantages of different types of graphs. [Online]
Available at: <http://www.kmrom.com/Site-En/Articles/ViewArticle.aspx?ArticleID=416>
[Accessed: 13 May 2022].

Statistics How To, 2021. Pie Chart: Definition, Examples, Make one in Excel/SPSS. [Online] Available
at: <https://www.statisticshowto.com/probability-and-statistics/descriptive-statistics/pie-chart/>
[Accessed: 13 May 2022].
Statistics How To, 2021. Scatter Plot / Scatter Chart: Definition, Examples, Excel/TI-83/TI-89/SPSS.
[Online] Available at:
<https://www.statisticshowto.com/probability-and-statistics/regression-analysis/scatter-plot-chart/>
[Accessed: 13 May 2022].
Statistics Solutions, 2021. One Sample T-Test. [Online] Complete Dissertation by Statistics Solutions.
Available at:
<https://www.statisticssolutions.com/free-resources/directory-of-statistical-analyses/one-sample-t-test/>
[Accessed: 13 May 2022].
statistics.laerd.com, 2021. Measures of Central Tendency. [Online] Available at:
<https://statistics.laerd.com/statistical-guides/measures-central-tendency-mean-mode-median.php>
[Accessed: 13 May 2022].
statisticshowto.com, 2021. Inferential Statistics: Definition, Uses. [Online] Available at:
<https://www.statisticshowto.com/probability-and-statistics/statistics-definitions/inferential-statistics/>
[Accessed: 13 May 2022].
Stephanie, 2015. Variance Inflation Factor. [Online] Statistics How To. Available at:
<https://www.statisticshowto.com/variance-inflation-factor/> [Accessed: 25 December 2021].
Stephanie, 2016. Standardized Beta Coefficient: Definition & Example. [Online] Statistics How To.
Available at: <https://www.statisticshowto.com/standardized-beta-coefficient/>
[Accessed: 25 December 2021].
Surbhi, S., 2017. Difference Between Binomial and Poisson Distribution. [Online] Available at:
<https://keydifferences.com/difference-between-binomial-and-poisson-distribution.html>
[Accessed: 12 December 2021].
Surbhi, S., 2017. Difference Between Population and Sample. [Online] Key Differences. Available at:
<https://keydifferences.com/difference-between-population-and-sample.html>
[Accessed: 06 December 2021].
The fullstory education team, 2021. Qualitative vs. quantitative data: what's the difference? [Online]
Available at: <https://www.fullstory.com/blog/qualitative-vs-quantitative-data/>
[Accessed: 06 December 2021].
The Investopedia Team, 2021. R-Squared vs. Adjusted R-Squared: What's the Difference? [Online]
Available at:
<https://www.investopedia.com/ask/answers/012615/whats-difference-between-rsquared-and-adjusted-rsquared.asp>
[Accessed: 25 December 2021].

The Investopedia Team, 2021. Variance Inflation Factor (VIF). [Online] Available at:
<https://www.investopedia.com/terms/v/variance-inflation-factor.asp> [Accessed: 13 May 2022].
Vaughan, T., 2021. 10 Advantages and Disadvantages of Qualitative Research. [Online] Poppulo.
Available at: <https://www.poppulo.com/blog/10-advantages-and-disadvantages-of-qualitative-research>
[Accessed: 13 May 2022].
weebly, n.d. Pros and Cons of Histograms. [Online] Available at:
<https://histogramsdennard.weebly.com/pros-and-cons-of-histograms.html> [Accessed: 13 May 2022].
Weedmark, D., 2021. Importance of Variation in Total Quality Management. [Online] Chron. Available
at: <https://smallbusiness.chron.com/importance-variation-total-quality-management-52234.html>
[Accessed: 13 May 2022].
Williams, T. A., Anderson, D. R. and Sweeney, D. J., 2020. Statistics. [Online] Encyclopedia
Britannica. Available at: <https://www.britannica.com/science/statistics> [Accessed: 13 May 2022].

