DAFM


Data collection and application of suitable statistical techniques

INTRODUCTION
We collected data on laptop selection through a survey. We created a Google Form containing the survey questions and recorded the responses. The questions asked were:
• Age
• Gender
• What is your income range?
• How much is your income?
• Which company laptop do you prefer?
• What are the things that you look for while buying a new laptop?
• For what purpose will you want to buy a new laptop?
• A price range that you're ready to invest if you plan on getting a new laptop (INR)
• Do you compare laptops of a specific price range?
• Does the current COVID-19 situation affect your decision to purchase a laptop?
One Sample T-test
In our survey, we used age for the one-sample t-test analysis. The details are below:

• Age ranged from 16 to 59 years
• We claimed the population mean to be greater than 21 years
• H0: µ <= 21
• H1: µ > 21
• Alpha = 0.05
• We kept Hypothesized Mean = 21

t-Test: One-Sample
                        Age
Mean                    25.95
Variance                90.99246231
Observations            200
Hypothesized Mean       21
df                      199
t Stat                  7.338672179
P(T<=t) one-tail        2.68584E-12
t Critical one-tail     1.652546746
P(T<=t) two-tail        5.37168E-12
t Critical two-tail     1.971956544
INTERPRETATIONS

The area to the right of t Stat = 7.34 is the one-tail p-value (labelled P(T<=t) one-tail in the output) = 2.68584E-12, which is negligible and almost equal to zero. The two-tail p-value, P(T<=t) two-tail = 5.37168E-12, is likewise almost equal to zero. Since t Stat (7.34) > t Critical one-tail (1.65), H0 is rejected: the data do not support an average population age less than or equal to 21 years.

Hence our claim that the average population age is greater than 21 years is supported at alpha = 0.05.
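
The t Stat reported in the table can be rechecked from the summary statistics alone. The following is a minimal sketch, assuming only the values printed in the t-Test: One-Sample output; the variable names are illustrative, and the critical value is copied from the table rather than computed.

```python
import math

# Summary statistics copied from the "t-Test: One-Sample" output.
mean_age = 25.95
variance = 90.99246231
n = 200
hypothesized_mean = 21
t_critical_one_tail = 1.652546746  # taken from the table (alpha = 0.05, df = 199)

# Standard error of the sample mean, then the one-sample t statistic.
standard_error = math.sqrt(variance / n)
t_stat = (mean_age - hypothesized_mean) / standard_error

# Right-tailed decision rule for H1: mean > 21.
reject_h0 = t_stat > t_critical_one_tail

print(f"t Stat = {t_stat:.4f}, reject H0: {reject_h0}")
```

The sketch works from the summary values only, so it does not need the 200 individual responses.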
One Sample Proportion Test
Here, we consider the survey response on whether the current COVID-19 situation affected people's decision to purchase a laptop. Splitting the responses from the survey (sample size n = 200) gave the counts below.

Count of Yes    Count of No
125             75

Test
We claim that more than 50% of the population's decisions are affected.
• H0: P <= 0.5
• H1: P > 0.5
• Alpha = 0.05
• Hypothesized proportion = 0.5

Sample Size (n)                          200.000
Yes Count (X)                            125.000
Sample proportion                        0.625
Alpha                                    0.050
Hypothesized population proportion (p)   0.500
Standard error of proportion             0.035
Z-test                                   3.536

Before further computation, two conditions have to be passed to make sure that the sampling distribution is approximately normal:

• Condition 1: Count of Yes, X > 5
• Condition 2: Count of No, n - X > 5

Only if both conditions are satisfied is it a normal sampling distribution. Below is the observation for our sample (1 = true, 0 = false).

Condition 1                     1
Condition 2                     1
Normal sampling distribution?   Yes

Z-test = 3.536 is compared with the Z-critical values for the left tail, right tail and two tails to test the H0 and H1 stated earlier.

             p-value            Z-critical
Left tail    0.99979688265325   -1.645
Right tail   0.00020311734675   1.645
Two tail     0.00040623469351   -1.960, 1.960
INTERPRETATIONS

• For the right-tail test, Z-test = 3.536 > Z-critical = 1.645 (for alpha = 0.05), and the p-value = 0.0002 is considerably lower than alpha = 0.05.
• The left-tail p-value = 0.9998 is the complement of the right-tail p-value. It belongs to the opposite alternative (P < 0.5): Z-test = 3.536 is not below Z-critical = -1.645, so the data give no support for P < 0.5.
• For the two-tailed test, Z-test = 3.536 lies beyond the positive Z-critical = 1.96, and the p-value = 0.0004 is very much less than alpha = 0.05.

Thus the right-tail test, which matches our hypotheses, supports rejecting H0: P <= 0.5, and the two-tailed test agrees. Hence our claim H1: P > 0.5, that the decision to buy a laptop is affected by the COVID-19 situation for more than 50% of the population, is supported.
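
The Z-test value, the two normality conditions and the tail p-values can be reproduced from the counts alone. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative assumptions, and the inputs are the survey counts reported for this test (125 Yes out of 200).

```python
import math
from statistics import NormalDist

# Inputs reported for the one-sample proportion test.
n = 200
yes_count = 125
p0 = 0.5      # hypothesized proportion
alpha = 0.05

# Conditions for the normal approximation: both counts must exceed 5.
normal_ok = yes_count > 5 and (n - yes_count) > 5

# Sample proportion, standard error under H0, and the Z statistic.
p_hat = yes_count / n
standard_error = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Tail probabilities from the standard normal distribution.
std_normal = NormalDist()
p_left = std_normal.cdf(z)
p_right = 1 - p_left
p_two = 2 * min(p_left, p_right)

# Right-tail critical value for H1: P > 0.5.
z_critical_right = std_normal.inv_cdf(1 - alpha)
reject_h0 = z > z_critical_right

print(f"Z = {z:.3f}, right-tail p = {p_right:.6f}, reject H0: {reject_h0}")
```

Only the right-tail comparison is used for the decision here, because H1 is P > 0.5; the left-tail and two-tail values are computed for comparison with the table.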
Two Sample T Test

Our claim for the test is that the average number of people who cross check for a specific price range is greater than the average number of people who don't cross check specifications. That is,

• H0: Mean of people responding No = Mean of people responding Yes, µ(Yes) = µ(No)
• H1: Mean of people responding Yes > Mean of people responding No, µ(Yes) > µ(No)
• Alpha = 0.05

                     No       Yes
Mean                 15.50    34.50
Standard Error       1.85     14.82
Median               15.50    28.00
Mode                 #N/A     #N/A
Standard Deviation   3.70     29.64
Sample Variance      13.67    878.33
Kurtosis             1.14     -0.34
Skewness             0.00     0.94
Range                9.00     66.00
Minimum              11.00    8.00
Maximum              20.00    74.00
Sum                  62.00    138.00
Count                4.00     4.00

t-Test: Two-Sample
                               No       Yes
Mean                           15.50    34.50
Variance                       13.67    878.33
Observations                   4.00     4.00
Pooled Variance                446.00
Hypothesized Mean Difference   0.00
df                             6.00
t Stat                         -1.27
P(T<=t) one-tail               0.13
t Critical one-tail            1.94
P(T<=t) two-tail               0.25
t Critical two-tail            2.45
INTERPRETATIONS

• The number of people who cross check for a specific price range (M = 34.50, SD = 29.64, n = 4) was hypothesized to be greater than the number of people who don't cross check specifications (M = 15.50, SD = 3.70, n = 4).

• This difference was not significant: t(6) = -1.27 (computed as No minus Yes), p = 0.13 (1 tail), which is greater than alpha = 0.05, and |t| = 1.27 is below t Critical one-tail = 1.94. Hence the null hypothesis H0, µ(Yes) = µ(No), cannot be rejected, and the data do not support our claim H1, µ(Yes) > µ(No).
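
The pooled variance and t Stat in the two-sample output can be rechecked from the group means, variances and counts. The following is a minimal sketch under the equal-variance (pooled) assumption that the Pooled Variance row implies; the variable names are illustrative, and the critical value is copied from the table.

```python
import math

# Summary statistics copied from the "t-Test: Two-Sample" output.
mean_no, var_no, n_no = 15.50, 13.67, 4
mean_yes, var_yes, n_yes = 34.50, 878.33, 4
t_critical_one_tail = 1.94  # taken from the table (alpha = 0.05, df = 6)

# Pooled variance weights each sample variance by its degrees of freedom.
df = n_no + n_yes - 2
pooled_variance = ((n_no - 1) * var_no + (n_yes - 1) * var_yes) / df

# t statistic for the difference Yes minus No (the table reports No minus Yes,
# which has the same magnitude and the opposite sign).
standard_error = math.sqrt(pooled_variance * (1 / n_no + 1 / n_yes))
t_stat = (mean_yes - mean_no) / standard_error

# Right-tailed decision rule for H1: mean(Yes) > mean(No).
reject_h0 = t_stat > t_critical_one_tail

print(f"pooled variance = {pooled_variance:.2f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Taking the difference as Yes minus No keeps the sign of the statistic aligned with the direction of H1, so the comparison with the positive critical value can be read directly.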
ANOVA

In our survey, we used a one-way ANOVA test. The details are as follows:

• The independent variable is the brand of laptop. We collected data on Apple, Dell, HP and Lenovo to find out the difference in the income of people buying these laptops.
• For the input, we took the income of people and grouped them in columns by brand name: Apple, Dell, HP and Lenovo.
• Alpha = 0.05
• We claimed that the mean incomes of people buying the four different laptop brands are not all equal.
• H0: the means are all equal (μ1 = μ2 = μ3 ...)
• H1: the means are not all equal

Groups   Count   Sum       Average    Variance
Apple    40      2418545   60463.63   1014996900
Dell     46      1533545   33337.93   393594085.1
HP       67      2439500   36410.45   568787313.4
Lenovo   30      1115000   37166.67   560747126.4

ANOVA
Source of Variation   SS         df    MS         F            P-value      F crit
Between Groups        1.97E+10   3     6.57E+09   10.5832894   0.00000193   2.655074
Within Groups         1.11E+11   179   6.21E+08
Total                 1.31E+11   182

INTERPRETATIONS

The p-value is 0.00000193, which is far below alpha = 0.05, meaning there is a significant variation in mean income between the different laptop brands. The critical value of F is 2.655074, and since F = 10.5832894 is greater than the critical value, H0 is rejected.

F = 10.5832894 > F crit = 2.655074, so H0 is rejected.

We have statistically significant evidence at α = 0.05 that the mean income of people buying the different laptop brands is not the same for all brands.
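
The F value in the ANOVA table can be rebuilt from the per-brand count, average and variance in the summary table. This is a minimal sketch, not the spreadsheet's own procedure; the dictionary layout and variable names are illustrative assumptions, and the averages are the rounded values shown in the table, so the result agrees with the table only to a few decimal places.

```python
# Per-brand (count, average income, variance) copied from the summary table.
groups = {
    "Apple": (40, 60463.63, 1014996900.0),
    "Dell": (46, 33337.93, 393594085.1),
    "HP": (67, 36410.45, 568787313.4),
    "Lenovo": (30, 37166.67, 560747126.4),
}
f_critical = 2.655074  # taken from the table (alpha = 0.05, df = 3 and 179)

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

# Between-groups sum of squares: spread of group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-groups sum of squares: recovered from each group's sample variance.
ss_within = sum((n - 1) * var for n, _, var in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

reject_h0 = f_stat > f_critical

print(f"F = {f_stat:.4f} (df {df_between}, {df_within}), reject H0: {reject_h0}")
```

A rejection only says that the brand means are not all equal; it does not say which brands differ from each other.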
 
Chi-Square Test

In our data, we conducted a chi-square test of the association between gender and laptop brand preferred.

Laptop Brands   Female   Male   Grand Total   Percentage
Apple           19       21     40            0.2
Dell            20       26     46            0.23
HP              32       35     67            0.335
Lenovo          18       12     30            0.15
Others          11       6      17            0.085
Grand Total     100      100    200

Expected Range   Female   Male
Apple            20       20
Dell             23       23
HP               33.5     33.5
Lenovo           15       15
Others           8.5      8.5

Chi Test (p-value)   0.449943


INTERPRETATIONS

• If p is less than 0.05, we have enough evidence to show that the laptop preferences of males and females in the target population are different (gender and brand are associated).

• If p is greater than 0.05, we do not have enough evidence to show that the laptop preferences of males and females in the target population are different (no evidence of association).

• Here, the chi test p-value is 0.44994291, which is greater than 0.05 (0.44994291 > 0.05), so we do not have enough evidence to show that males and females differ in the laptop brand they prefer. The data are therefore consistent with gender and laptop brand being independent.
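
The expected counts and the chi test value can be reproduced from the observed Female and Male counts. The following is a minimal sketch using only the standard library; the helper `chi_square_sf_even_df` is a hypothetical name introduced here, and it relies on the closed-form upper-tail probability that holds only when the degrees of freedom are even (here df = 4).

```python
import math

# Observed (Female, Male) counts per brand, copied from the table.
observed = {
    "Apple": (19, 21),
    "Dell": (20, 26),
    "HP": (32, 35),
    "Lenovo": (18, 12),
    "Others": (11, 6),
}

column_totals = [sum(row[i] for row in observed.values()) for i in range(2)]
grand_total = sum(column_totals)

# Expected count = row total * column total / grand total; accumulate the statistic.
expected = {}
chi_square = 0.0
for brand, row in observed.items():
    row_total = sum(row)
    expected[brand] = tuple(row_total * c / grand_total for c in column_totals)
    for count, exp_count in zip(row, expected[brand]):
        chi_square += (count - exp_count) ** 2 / exp_count

df = (len(observed) - 1) * (len(column_totals) - 1)


def chi_square_sf_even_df(x, df):
    """Upper-tail chi-square probability; valid only for even df (hypothetical helper)."""
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


p_value = chi_square_sf_even_df(chi_square, df)
evidence_of_association = p_value < 0.05

print(f"chi-square = {chi_square:.4f}, df = {df}, p = {p_value:.6f}")
```

The value compared with 0.05 is the p-value, not the chi-square statistic itself, which is why both are printed.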
