Notes

The document provides examples of statistical analyses that can be performed on data including: 1) Calculating numerical summaries such as averages, standard deviations, and quartiles for variables. 2) Performing contingency summaries like two-way frequency tables and chi-squared tests. 3) Conducting hypothesis tests like one-sample and paired t-tests to compare means. Procedures and R code are demonstrated for conducting these analyses on various datasets involving sales data, prices, and performance.

Uploaded by

Narender Reddy

15 November 2016

CSV, XLS and SAV are some of the file formats in which data is stored.
Convert the data into the required format before carrying out any operations on it.
The most widely used format is the Excel (XLS) format.
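
As a rough sketch of the conversion step, a CSV file can be read straight into an R data frame with base R. The file and its contents here are made up purely so the example runs. Reading XLS or SAV files usually needs add-on packages such as readxl or foreign, which are assumed to be installed and are shown only as comments.

```r
# Hypothetical data written to a temporary CSV file, only so the example runs.
csv_path <- tempfile(fileext = ".csv")
example <- data.frame(company = c("A", "B", "C"), sales = c(120, 150, 90))
write.csv(example, csv_path, row.names = FALSE)

# Read the CSV back into a data frame and inspect its structure.
Dataset <- read.csv(csv_path)
str(Dataset)

# Other formats need add-on packages (assumed installed, not run here;
# the file names are placeholders):
# Dataset <- readxl::read_excel("data.xls")
# Dataset <- foreign::read.spss("data.sav", to.data.frame = TRUE)
```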
Compute the following for the data given in the file:
1. Average and standard deviation of sales for all companies
2. Sample standard deviation for population, industry-wise
3. Frequency distribution of companies
4. Bivariate (cross-classification) table for company and industry
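
The four tasks above could be approached in base R roughly as follows. The data file is not reproduced in these notes, so the data frame firms, its column names (company, industry, sales) and all values are assumptions for illustration only. Task 3 is read here as the number of companies per industry.

```r
# Hypothetical data; the real file and its column names are not shown in the notes.
firms <- data.frame(
  company  = c("A", "B", "C", "D", "E", "F"),
  industry = c("Steel", "Steel", "Textile", "Textile", "Cement", "Cement"),
  sales    = c(120, 150, 90, 110, 200, 180)
)

# 1. Average and (sample) standard deviation of sales for all companies
avg_sales <- mean(firms$sales)
sd_sales  <- sd(firms$sales)

# 2. Sample standard deviation within each industry
sd_by_industry <- tapply(firms$sales, firms$industry, sd)

# 3. Frequency distribution: number of companies per industry
companies_per_industry <- table(firms$industry)

# 4. Cross-classification table of company by industry
cross_tab <- table(firms$company, firms$industry)

print(sd_by_industry)
print(cross_tab)
```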

Solution procedure:
1. Problem
2. Data
3. Program

23 November 2016
Statistical operations
Calculate numerical summaries such as
1.1.1. average
1.1.2. standard deviation and
1.1.3. third quartile for all numerical variables

> numSummary(Dataset[,c("Total", "Unit.Cost", "Units")], statistics=c("mean", "sd", "cv"), quantiles=c(0,.25,.5,.75,1))
               mean        sd        cv  n
Total     456.46233 447.02210 0.9793187 43
Unit.Cost  20.30860  47.34512 2.3312836 43
Units      49.32558  30.07825 0.6097900 43

> numSummary(Dataset[,c("Total", "Unit.Cost", "Units")], statistics=c("quantiles"), quantiles=c(.25,.5,.75,1.0))
             25%    50%    75%    100%  n
Total     144.59 299.40 600.18 1879.06 43
Unit.Cost   3.99   4.99  17.99  275.00 43
Units      27.50  53.00  74.50   96.00 43
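
The cv column in the numSummary output is the coefficient of variation, i.e. the standard deviation divided by the mean. The sketch below recomputes it from the reported means and standard deviations. It then shows base-R equivalents of the summaries on a small vector, units_example, which is hypothetical and not taken from the dataset.

```r
# Means and standard deviations copied from the numSummary output.
reported <- data.frame(
  mean = c(456.46233, 20.30860, 49.32558),
  sd   = c(447.02210, 47.34512, 30.07825),
  row.names = c("Total", "Unit.Cost", "Units")
)

# Coefficient of variation = standard deviation / mean.
reported$cv <- reported$sd / reported$mean
print(round(reported, 4))

# Base-R equivalents on a hypothetical vector (not from the dataset).
units_example <- c(10, 20, 30, 40, 50)
avg_example <- mean(units_example)                    # average
sd_example  <- sd(units_example)                      # sample standard deviation
q3_example  <- quantile(units_example, probs = 0.75)  # third quartile
```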

Contingency summaries
Two-way frequency table (rows: Item, columns: Region)
Region: Central, East, West
Item: Binder, Desk, Pen, Pen Set, Pencil
Counts (partial): 3, 4, 1, 3, 0, 2

Pearson's Chi-squared test
data: .Table
X-squared = 7.326, df = 8, p-value = 0.5019
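
For a two-way table with r rows and c columns, the chi-squared test has (r - 1)(c - 1) degrees of freedom, which gives (5 - 1)(3 - 1) = 8 for five items and three regions, matching the output. The sketch below illustrates this with chisq.test(). The cell counts are hypothetical, since the full table is not reproduced in these notes, so the test statistic and p-value it prints will not match the ones reported.

```r
# Hypothetical counts; the original Item-by-Region counts are not fully shown.
counts <- matrix(
  c(12,  9,  7,
     6,  8,  5,
    10, 11,  9,
     7,  6,  8,
     9, 12, 10),
  nrow = 5, byrow = TRUE,
  dimnames = list(Item   = c("Binder", "Desk", "Pen", "Pen Set", "Pencil"),
                  Region = c("Central", "East", "West"))
)

# Pearson's chi-squared test of independence between Item and Region.
result <- chisq.test(counts)

# Degrees of freedom by formula: (rows - 1) * (columns - 1).
df_by_formula <- (nrow(counts) - 1) * (ncol(counts) - 1)

print(result)
# Independence is rejected at the 5% level only if the p-value is below 0.05.
reject_independence <- result$p.value < 0.05
```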

The prices of shares of a company on different days in a month were found to be 66, 65, 69, 70, 69, 71, 63, 70, 64 and 68. Test at the 5% level of significance whether the average share price is Rs. 65.

> with(company, (t.test(price, alternative='two.sided', mu=65, conf.level=.95)))
One Sample t-test
data: price
t = 2.8247, df = 9, p-value = 0.01989
alternative hypothesis: true mean is not equal to 65
95 percent confidence interval:
65.49785 69.50215
sample estimates:
mean of x
67.5
Decision rule: if p < alpha, reject the hypothesis; if p > alpha, accept it.
Here p-value = 0.01989 < 0.05, therefore the hypothesis that the average share price is Rs. 65 is rejected.
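
To see where the t value comes from, the statistic can be computed by hand as (sample mean - hypothesised mean) / (sample sd / sqrt(n)) and compared with t.test(). This is only a sketch on the ten share prices given in the problem. The variable names are chosen for illustration.

```r
# Share prices from the problem statement.
price <- c(66, 65, 69, 70, 69, 71, 63, 70, 64, 68)
mu0 <- 65               # hypothesised mean
n <- length(price)

# t statistic by hand: (sample mean - mu0) / standard error of the mean.
t_manual <- (mean(price) - mu0) / (sd(price) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)

# The same test using the built-in function.
result <- t.test(price, alternative = "two.sided", mu = mu0, conf.level = 0.95)

# Reject the hypothesis at the 5% level when the p-value is below 0.05.
reject_h0 <- result$p.value < 0.05
print(c(t = t_manual, p = p_manual))
```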

Performance of three salesmen A, B and C over a period of time is shown in the following data.

quantity (for salesmen A, B and C): 300, 400, 300, 500, 600, 300, 300, 400, 700, 300, 400, 600, 100, 50

Test whether the average performance of these three salesmen differs significantly.
> AnovaModel.2 <- aov(quantity ~ salesmen, data=salesmen)
> summary(AnovaModel.2)
            Df Sum Sq Mean Sq F value Pr(>F)
salesmen     2  23750   11875   0.319  0.734
Residuals   11 410000   37273

> with(salesmen, numSummary(quantity, groups=salesmen, statistics=c("mean", "sd")))
  mean       sd data:n
A  320 148.3240
B  400 141.4214
C  410 255.9297
p-value = 0.734 > 0.05, therefore the hypothesis that the three salesmen have the same average performance is accepted.
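
The one-way ANOVA can be reproduced in base R as sketched below. The assignment of the 14 quantities to salesmen is an assumption: the layout of the data table is not preserved in these notes, so the grouping used here (5 values for A, 4 for B, 5 for C) was chosen because it reproduces the reported group means, standard deviations and sums of squares. Other groupings may do so as well.

```r
# ASSUMED grouping of the 14 quantities; the notes do not show which
# value belongs to which salesman.
salesmen <- data.frame(
  salesmen = factor(rep(c("A", "B", "C"), times = c(5, 4, 5))),
  quantity = c(300, 400, 300, 500, 100,   # A (assumed)
               300, 400, 300, 600,        # B (assumed)
               700, 600, 400, 300, 50)    # C (assumed)
)

# One-way analysis of variance: does mean quantity differ by salesman?
fit <- aov(quantity ~ salesmen, data = salesmen)
anova_table <- summary(fit)[[1]]
print(anova_table)

# Group means and standard deviations, as in the numSummary call.
group_means <- tapply(salesmen$quantity, salesmen$salesmen, mean)
group_sds   <- tapply(salesmen$quantity, salesmen$salesmen, sd)

# Means differ significantly at the 5% level only if Pr(>F) < 0.05.
means_differ <- anova_table[1, "Pr(>F)"] < 0.05
```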

30 November 2016
One-sample t-test:

From the following 10 observations, test whether the population mean is 45.
Sample: 16 46 55 41 49 51 50 44 47 42
Sol:

H0: μ = 45
> with(ttest, (t.test(X, alternative='two.sided', mu=45, conf.level=.95)))
One Sample t-test
data: X
t = -0.26464, df = 9, p-value = 0.7972
alternative hypothesis: true mean is not equal to 45
95 percent confidence interval:
36.40682 51.79318
sample estimates:
mean of x
44.1
p-value = 0.7972 > 0.05, therefore the hypothesis is accepted, i.e., μ = 45.
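
The 95 percent confidence interval in the output can be rebuilt by hand as mean ± t(0.975, n - 1) × sd / sqrt(n). A rough sketch on the ten observations given in the problem follows. The hypothesised mean 45 lies inside the interval, which agrees with accepting the hypothesis at the 5% level.

```r
# Observations from the problem statement.
X <- c(16, 46, 55, 41, 49, 51, 50, 44, 47, 42)
n <- length(X)

# Standard error of the mean and the two-sided 5% critical value of t.
se <- sd(X) / sqrt(n)
t_crit <- qt(0.975, df = n - 1)

# 95% confidence interval for the population mean.
ci <- mean(X) + c(-1, 1) * t_crit * se
print(ci)

# The hypothesis mu = 45 is accepted at the 5% level if 45 lies inside the interval.
contains_45 <- ci[1] <= 45 && 45 <= ci[2]
```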

Paired t-test for two dependent samples

The sales performance of 8 salesmen is recorded before and after training.
no  before  after
1   75      77
2   90      101
3   94      93
4   95      92
5   100     105
6   90      88
7   70      76
8   64      68

> with(sales, (t.test(before, after, alternative='two.sided', conf.level=.95, paired=TRUE)))
Paired t-test
data: before and after
t = -1.6503, df = 7, p-value = 0.1429
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.690337 1.190337
sample estimates:
mean of the differences
-2.75
p-value = 0.1429 > 0.05, therefore the hypothesis of no difference in mean sales before and after training is accepted.
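
A paired t-test is equivalent to a one-sample t-test on the within-pair differences (before - after) against a mean of 0. The sketch below runs both on the eight pairs from the table and shows that they give the same statistic. The variable names are chosen for illustration.

```r
# Sales of the 8 salesmen before and after training (from the table).
before <- c(75, 90, 94, 95, 100, 90, 70, 64)
after  <- c(77, 101, 93, 92, 105, 88, 76, 68)

# Paired t-test on the two measurements.
paired_result <- t.test(before, after, alternative = "two.sided",
                        conf.level = 0.95, paired = TRUE)

# Equivalent one-sample t-test on the differences.
differences <- before - after
diff_result <- t.test(differences, mu = 0)

print(paired_result)
# Training has a significant effect at the 5% level only if p-value < 0.05.
significant <- paired_result$p.value < 0.05
```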
