Cb161 Lab Manual

1. The document provides details about the Statistical Methods Lab course for second semester students, including course objectives, outcomes, content, textbooks, prerequisites and evaluation methods. 2. The lab course content is divided into three cycles covering topics such as measures of central tendency and dispersion, correlation and regression, curve fitting, ANOVA, time series analysis, goodness of fit, and parametric tests. 3. Solutions to sample problems are provided for each lab experiment covering the various statistical and data analysis techniques in R.

Uploaded by

k.nagaraju.

DEPARTMENT

COMPUTER SCIENCE & BUSINESS SYSTEMS

CB – 161(R-18)
STATISTICAL METHODS LAB with R MANUAL
I/IV B.Tech - CSBS (2nd – Semester)

RVR&JC COLLEGE OF ENGINEERING(AUTONOMOUS)

(Sponsored by Nagarjuna Education Society)
(Affiliated to Acharya Nagarjuna University :: Approved by AICTE)
CHANDRAMOULIPURAM:: CHOWDAVARAM
GUNTUR – 522 019 :: ANDHRA PRADESH

LABORATORY MANUAL

PROFORMA FOR LABORATORY – BASED COURSE DESCRIPTION

Course Code : CB – 161

Course Title : STATISTICAL METHODS LAB with R
Year & Semester : I/IV B – Tech (2nd– Semester)
Periods/Week : 03
Nature of the Course : Required /Elective (Engineering Core)
Name of the Instructor : Dr.B.Srinivasa Rao
Designation : Associate Professor
E – Mail : [email protected]

1. Course Objectives:

The student who successfully completes this course will have:

1. The knowledge to use R for statistical programming, computation, modelling and graphics.
2. The skill to write functions and use R in an efficient way.
3. The ability to fit some basic types of statistical models using R.
4. The idea to expand the knowledge of R on their own.

2. Course Outcomes:
On completion of this course, students will be able to:

1. Write the programs in R to solve the statistical problems.

2. Apply various built in functions in R to solve the computational and modelling problems.
3. Interpret the statistical data by various functions of graphical representation.
4. Understand reading, writing, working with and manipulating data in various data frames.

3. LAB – COURSE CONTENT

Introduction to R
Functions
Control flow and Loops
Working with Vectors and Matrices
Reading in Data
Writing Data
Working with Data
Manipulating Data
Simulation
Linear model
Data Frame
Graphics in R

4. TEXT BOOKS :

1. Hands-On Programming with R, Garrett Grolemund, O'Reilly.

2. R for Everyone: Advanced Analytics and Graphics, Jared P. Lander, Addison-Wesley.

5. PRE – REQUISITES

CB151– C Programming.

6. LAB – COURSE PLAN&DELIVERY :

LAB CYCLE-I
1. Measures of central tendency (3 periods)
a) Mean b) Median c) Mode d) Geometric Mean e) Harmonic Mean

2. Measures of dispersion (3 periods)
a) Range b) Quartile deviation c) Mean deviation d) Standard deviation e) Coefficient of Variation

3. Correlation & Regression (6 periods)
a) Correlation coefficient b) Regression lines c) Rank correlation d) Multiple correlation coefficient e) Multiple linear regression

LAB CYCLE-II
4. Curve fitting (6 periods)
a) Straight line b) Parabola c) Y = aX^b d) Y = ab^X e) Y = ae^(bX)

5. ANOVA (3 periods)
a) One-way classification b) Two-way classification

6. Time series (3 periods)
a) Moving averages b) ARIMA

LAB CYCLE-III
7. Goodness of fit (6 periods)
a) Binomial b) Poisson c) Normal

8. Parametric tests (6 periods)
a) t-test for one mean b) t-test for two means c) Paired t-test d) F-test

9. Graphical representation of data (6 periods)
a) Bar plot b) Frequency polygon c) Histogram d) Pie chart e) Scatter plot

7. EVALUATION METHODS :

Internal Lab Exam : 40 Marks

Final Lab Exam : 60 Marks

8. TOPICS COVERED BEYOND THE CURRICULUM:

Statistical concepts regarding testing of hypothesis

Differences between C and R Programming

9. SEMESTER END OBSERVATIONS FOR FUTURE GUIDANCE:

Case studies to be explained are revised.

Identified new problems to be assigned for the next academic year students.

10. CO-PO Mapping

PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 3 3 3
CO2 2 2 2
CO3 3 3 2
CO4 3 2 3

11. CO-PSO Mapping

PSO1 PSO2 PSO3

CO1 3 3 1
CO2 2 2 3
CO3 3 3 2
CO4 3 2 2

SOLUTIONS

Experiment No. 1: Measures of central tendency

a) Mean b)Median c)Mode d)Geometric Mean e)Harmonic Mean
SOLUTION:
# Mean
print("Enter values in an array")
x<- scan()
total<-0
# loop over the vector that was actually read in (x)
for(i in 1:length(x))
{
total=total+x[i]
}
meanvalue<-total/length(x)
print("average value of the given values")
print(meanvalue)
print("mean value using built in function")
print(mean(x))

#median
print("Enter values in a vector")
x<-scan()
n <- length(x)
for(i in 1:(n-1))
for(j in (i+1):n)
if (x[i] > x[j])
{
temp <- x[i]
x[i] <- x[j]
x[j] <- temp
}
print(x)
if(n%%2==0){
med =(x[n%/%2]+x[(n%/%2)+1])/2.0
} else
med =x[(n+1)%/%2]
print(med)

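#mode
print("Enter values in a vector")
x<-scan()
n<-length(x)
# count[i] = number of times x[i] occurs from position i onwards
count<-rep(1,n)
for(i in seq_len(n-1))
for(j in (i+1):n)
if(x[i]==x[j]) count[i]<-count[i]+1
print(count)
big=count[1]
pos=1
for(i in seq_len(n))
if(big<count[i])
{
big=count[i]
pos=i
}
# among values tied for the highest count, keep the largest value
for(i in seq_len(n))
if(big==count[i])
{
if(x[pos]<x[i])
pos=i
}
print("The MODE using program is")
print(x[pos])
print("The MODE using built in function is")
# table() counts each distinct value; ties go to the largest value, as above
tab<-table(x)
print(max(as.numeric(names(tab)[tab==max(tab)])))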

# Geometric Mean
print("Enter values in an array")
x<- scan()
n<-length(x)
prod<-1
for(i in 1:n)
prod<-prod*x[i]
gm<-prod^(1/n)
print("Geometric Mean")
print(gm)
print("GM using built in function")
print(exp(mean(log(x))))

# Harmonic Mean
print("Enter values in an array")
x<- scan()
n<-length(x)
# "total" avoids masking the built-in sum() that is used below
total<-0
for(i in 1:n)
{
total=total+(1/x[i])
}
hm<-n/total
print("Harmonic mean of the given values")
print(hm)
print("Harmonic mean value using built in function")
print(n/sum(1/x))
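
The five loop-based programs above can also be written with R's vectorised functions. The following is a minimal sketch, not part of the original manual: the helper name `central_tendency` and the sample vector `marks` are hypothetical, and the mode follows the same tie rule as the program above (largest value among the most frequent ones). Geometric and harmonic means are assumed to be applied to strictly positive data.

```r
# Hypothetical helper (not from the manual): vectorised versions of the
# five measures of central tendency.
central_tendency <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0)
  counts <- table(x)
  # Most frequent values; ties are resolved in favour of the largest value.
  modes <- as.numeric(names(counts)[counts == max(counts)])
  list(
    mean      = sum(x) / length(x),
    median    = median(x),
    mode      = max(modes),
    geometric = exp(mean(log(x))),      # assumes x > 0
    harmonic  = length(x) / sum(1 / x)  # assumes x != 0
  )
}

marks <- c(2, 4, 4, 8)  # illustrative data only
res <- central_tendency(marks)
print(res)
```

Returning a named list keeps all five values together, so they can be compared with the loop-based results one by one.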

Experiment No. 2: Measures of dispersion
a)Range b)Quartile deviation c)Mean deviation d)Standard deviation e)Coeff. of Variation

SOLUTION :

#Range
print("Enter values in a vector")
x<-scan()
n <- length(x)
for(i in 1:(n-1))
for(j in (i+1):n)
if (x[i] > x[j])
{
temp <- x[i]
x[i] <- x[j]
x[j] <- temp
}
print(x)
rng<-x[n]-x[1]
print(" range of the given data")
print(rng)
print(" range of the given data using built in ")
# range() returns c(min, max); diff() turns that into max - min
print(diff(range(x)))

# another way
big<-x[1]
small<-x[1]
for(i in 2:n)
{
if (big<x[i]) big<-x[i]
if (small>x[i]) small<-x[i]
}
print(big)
print(small)
rang<-big-small
print(rang)

#Quartile deviation
print("Enter values in a vector")
x<-scan()
n <- length(x)
for(i in 1:(n-1))
for(j in (i+1):n)
if (x[i] > x[j])
{
temp <- x[i]
x[i] <- x[j]
x[j] <- temp
}
print(x)
if(n%%2==0){
q3<-x[(3*n)%/%4]
print(q3)

q1<-x[n%/%4]
print(q1)
qd<-(q3-q1)/2
} else if((n+1)%%4==0) {
q3<-x[(3*(n+1))%/%4]
print(q3)
q1<-x[(n+1)%/%4]
print(q1)
qd<-(q3-q1)/2
} else {
q3<-(x[((3*(n+1))%/%4)]+x[((3*(n+1))%/%4)+1])/2
print(q3)
q1<-(x[((n+1)%/%4)]+x[((n+1)%/%4)+1])/2
print(q1)
qd<-(q3-q1)/2
}
print(" Quartile deviation using program:")
print(qd)

quart<-function(x)
{
x <- sort(x)
n <- length(x)
m <- (n+1)/2
if (floor(m) != m) {
l <- m-1/2; u <- m+1/2
} else {
l <- m-1; u <- m+1
}
qrt3<-median(x[u:n])
print(qrt3)
qrt1<-median(x[1:l])
print(qrt1)
quartdev<-(qrt3-qrt1)/2
}
print(" Quartile deviation using built ins:")
print(quart(x))

# Mean deviation
print("Enter values in an array")
x<- scan()
n<-length(x)
sumx<-0
sumdev<-0
for(i in 1:n)
sumx=sumx+x[i]
avg<-sumx/n
print(avg)
for(i in 1:n)
sumdev=sumdev+abs(x[i]-avg)
print(sumdev)
md=sumdev/n
print("Mean deviation")
print(md)
print("md value using built in function")
# mad() is the median absolute deviation, not the mean deviation,
# so the check is built from mean() and abs() instead
result<-mean(abs(x-mean(x)))
print(result)

# Standard deviation
print("Enter values in an array")
x<- scan()
n<-length(x)
sumx<-0
sumxx<-0
for(i in 1:n)
{
sumx=sumx+x[i]
sumxx=sumxx+(x[i]*x[i])
}
avg<-sumx/n
standev=sqrt((sumxx/n)-(avg*avg))
print("Standard deviation")
print(standev)
print("SD value using built in function")
# sd() divides by n-1; rescale it to match the divisor-n value above
print(sd(x)*sqrt((n-1)/n))

# Coefficient of Variation
print("Enter values in an array")
x<- scan()
n<-length(x)
sumx<-0
sumsqdev<-0
for(i in 1:n)
sumx=sumx+x[i]
avg<-sumx/n
for(i in 1:n)
sumsqdev<-sumsqdev+((x[i]-avg)^2)
stdev<-sqrt(sumsqdev/n)
print("Standard deviation is")
print(stdev)
print("mean value is ")
print(avg)
cv=(stdev/avg)*100
print("Coefficient of variation")
print(cv)
print("Coefficient of variation using built-ins")
# sd() divides by n-1; rescale it to match the divisor-n value above
print((sd(x)*sqrt((n-1)/n)/mean(x))*100)
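
The dispersion programs above compute the standard deviation with divisor n, whereas R's `sd()` uses n-1. The sketch below gathers the measures in one place and makes that difference visible. It is only an illustration: the helper name `dispersion` and the vector `obs` are hypothetical, and the quartile deviation here uses R's default quantile rule (via `IQR()`), which can differ from the positional rule used in the manual's program.

```r
# Hypothetical helper (not from the manual): measures of dispersion.
dispersion <- function(x) {
  stopifnot(is.numeric(x), length(x) > 1)
  avg <- mean(x)
  pop_sd <- sqrt(mean((x - avg)^2))     # divisor n, as in the loop programs
  list(
    range              = diff(range(x)),
    quartile_deviation = IQR(x) / 2,    # R's default quantile rule (type 7)
    mean_deviation     = mean(abs(x - avg)),
    pop_sd             = pop_sd,
    sample_sd          = sd(x),         # divisor n - 1
    cv_percent         = 100 * pop_sd / avg
  )
}

obs <- c(2, 4, 4, 4, 5, 5, 7, 9)  # illustrative data only
d <- dispersion(obs)
print(d)
```

The two standard deviations are linked by `pop_sd = sample_sd * sqrt((n - 1) / n)`, which is why a loop-based result and `sd(x)` disagree slightly on small samples.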

Experiment No. 3: Correlation & Regression
a)Correlation coefficient b)Regression lines c)Rank Correlation
d)Multiple correlation coefficient e)Multiple linear regression
SOLUTION:

#corcof
x<-c(7,9,4,10,6,7,8,8,5,6)
y<-c(6,8,6,10,8,5,10,7,7,8)
n<-length(x)
xy<-x*y
xx<-x*x
yy<-y*y
mydata<- data.frame(x,y,xy,xx,yy)
#print(mydata)
sums<-list(sum(x),sum(y),sum(x*y),sum(x*x),sum(y*y))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
meanx<-sum(x)/n
print(meanx)
meany<-sum(y)/n
print(meany)
cov<-(sum(x*y)/n)-(meanx*meany)
print(cov)
sdx<-sqrt((sum(x*x)/n)-(meanx^2))
print(sdx)
sdy<-sqrt((sum(y*y)/n)-(meany^2))
print(sdy)
corcof<-cov/sdx/sdy
print(" Correlation coefficient using program")
print(round(corcof,digits=4))
print(" Correlation coefficient using built in")
print(cor(x,y))
plot(x,y)

#Regression Lines
x<-c(7,9,4,10,6,7,8,8,5,6)
y<-c(6,8,6,10,8,5,10,7,7,8)
n<-length(x)
xy<-x*y
xx<-x*x
yy<-y*y
mydata<- data.frame(x,y,xy,xx,yy)
#print(mydata)
sums<-list(sum(x),sum(y),sum(x*y),sum(x*x),sum(y*y))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
print("regression line x on y")
result1<-lm(x~y)
print(result1)
print("regression line y on x")
result2<-lm(y~x)
print(result2)

# to find the value of x when y=23
x<-coef(result1)[1] + coef(result1)[2]*23
print(x)
# to find the value of y when x=45
y<-coef(result2)[1] + coef(result2)[2]*45
print(y)
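
The two predictions above are built by hand from `coef()`. As a cross-check, the same values can be obtained with `predict()`. This is a small sketch using the manual's data; the variable names `fit_y_on_x`, `fit_x_on_y`, `y_at_45` and `x_at_23` are introduced here for illustration only.

```r
x <- c(7, 9, 4, 10, 6, 7, 8, 8, 5, 6)
y <- c(6, 8, 6, 10, 8, 5, 10, 7, 7, 8)

fit_y_on_x <- lm(y ~ x)  # regression line of y on x
fit_x_on_y <- lm(x ~ y)  # regression line of x on y

# predict() needs a data frame whose column name matches the predictor
y_at_45 <- predict(fit_y_on_x, newdata = data.frame(x = 45))
x_at_23 <- predict(fit_x_on_y, newdata = data.frame(y = 23))
print(y_at_45)
print(x_at_23)

# The product of the two regression slopes equals r^2
slope_product <- unname(coef(fit_y_on_x)[2] * coef(fit_x_on_y)[2])
print(slope_product)
print(cor(x, y)^2)
```

The last two printed values agree, which is a quick consistency check between the regression lines and the correlation coefficient.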

#Rank correlation for Unrepeated observations

print("Enter the values of X")
x<-scan()
print("Enter the values of Y")
y<-scan()
a<-rank(-x)
print(a)
b<-rank(-y)
print(b)
d<-a-b
print(d)
ssqd=sum(d*d)
print(ssqd)
n<-length(x)
# Spearman's formula: 1 - 6*sum(d^2) / (n*(n^2-1))
rankcor<-1-((6*ssqd)/(n*((n*n)-1)))
print("rank correlation coefficient is:")
print(rankcor)
corr<-cor.test(x,y,method="spearman")
print("rank correlation coefficient using built in function is :")
print(corr)
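
For observations without ties, Spearman's formula 1 - 6*sum(d^2) / (n*(n^2 - 1)) agrees with `cor(..., method = "spearman")`. The sketch below checks this on a small made-up data set; the function name `spearman_rho` and the vectors `xs`, `ys` are hypothetical, and the formula is assumed to be applied only to unrepeated observations.

```r
# Hypothetical helper: Spearman's rank correlation for data without ties.
spearman_rho <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) > 1)
  n <- length(x)
  d <- rank(x) - rank(y)          # rank differences
  1 - (6 * sum(d^2)) / (n * (n^2 - 1))
}

xs <- c(10, 20, 30, 40, 50)       # illustrative data only
ys <- c(12, 25, 22, 48, 41)
rho_formula <- spearman_rho(xs, ys)
rho_builtin <- cor(xs, ys, method = "spearman")
print(rho_formula)
print(rho_builtin)
```

Ranking in ascending order (as here) or descending order (as in the program above) gives the same coefficient, because both sets of ranks are reversed together.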

#multicorcof
x<-c(7,9,4,10,6,7,8,8,5,6)
y<-c(6,8,6,10,8,5,10,7,7,8)
z<-c(1,2,3,4,5,6,9,7,8,9)
n<-length(x)
corcof<- function(x,y)
{
xy<-x*y
xx<-x*x
yy<-y*y
mydata<- data.frame(x,y,xy,xx,yy)
sums<-list(sum(x),sum(y),sum(x*y),sum(x*x),sum(y*y))
mydata<-rbind(mydata,sums)
cat("\n")
print(mydata,row.names=FALSE)
meanx<-sum(x)/n
meany<-sum(y)/n
cov<-(sum(x*y)/n)-(meanx*meany)
sdx<-sqrt((sum(x*x)/n)-(meanx^2))
sdy<-sqrt((sum(y*y)/n)-(meany^2))
r<-cov/sdx/sdy
print(" Correlation coefficient using program")
print(round(r,digits=4))
# return the coefficient explicitly so that it can be stored in r12, r23, r13
r
}
r12<-corcof(x,y)
r23<-corcof(y,z)
r13<-corcof(x,z)
cat("\n\n\n")
print("partial correlation coefficient")
pcof<-(r12-(r13*r23))/(sqrt(1-(r13^2))*sqrt(1-(r23^2)))
print(pcof)
print("multiple correlation coefficient")
mcof<-sqrt((r12^2+r13^2-2*r12*r13*r23)/(1-r23^2))
print(mcof)
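
The partial and multiple correlation formulas above can be checked against linear models. The following sketch uses the same three vectors and assumes that variable 1 is x, variable 2 is y and variable 3 is z, as in the program; the names `R_formula`, `R_lm`, `p_formula` and `p_resid` are introduced here for illustration.

```r
x <- c(7, 9, 4, 10, 6, 7, 8, 8, 5, 6)
y <- c(6, 8, 6, 10, 8, 5, 10, 7, 7, 8)
z <- c(1, 2, 3, 4, 5, 6, 9, 7, 8, 9)

r12 <- cor(x, y); r13 <- cor(x, z); r23 <- cor(y, z)

# Multiple correlation of x on (y, z): formula versus sqrt(R^2) from lm()
R_formula <- sqrt((r12^2 + r13^2 - 2 * r12 * r13 * r23) / (1 - r23^2))
R_lm <- sqrt(summary(lm(x ~ y + z))$r.squared)

# Partial correlation of x and y given z: formula versus the correlation
# of the residuals after regressing each variable on z
p_formula <- (r12 - r13 * r23) / (sqrt(1 - r13^2) * sqrt(1 - r23^2))
p_resid <- cor(resid(lm(x ~ z)), resid(lm(y ~ z)))

print(c(R_formula, R_lm))
print(c(p_formula, p_resid))
```

Each pair of printed values should match up to floating-point error, which confirms that the closed-form expressions and the model-based definitions describe the same quantities.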

#multipleregline
# alternative data set:
#x1<-c(3,5,6,8,12,14)
#x2<-c(16,10,7,4,3,2)
#x3<-c(90,72,54,42,30,12)
x1<-c(37,45,38,42,31)
x2<-c(4,0,5,2,4)
x3<-c(71200,66800,75000,70300,65400)
n<-length(x1)
x1x2<-x1*x2
x1x3<-x1*x3
x2x3<-x2*x3
x1x1<-x1*x1
x2x2<-x2*x2
mydata<- data.frame(x1,x2,x3,x1x2,x1x3,x2x3,x1x1,x2x2)
print(mydata)
sums<-list(sum(x1),sum(x2),sum(x3),sum(x1*x2),sum(x1*x3),sum(x2*x3),sum(x1*x1),sum(x2*x2))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
# regression of x3 on x1 and x2
result1<-lm(x3~x1+x2)
print(result1)

Experiment No. 4: Curve fitting
a) Straight line b) Parabola c) Y = aX^b d) Y = ab^X e) Y = ae^(bX)

SOLUTION:

#Straight line fit

x<-c(1,2,3,4,6,8)
y<-c(2.4,3,3.6,4,5,6)
n<-length(x)
xy<-x*y
xx<-x*x
mydata<- data.frame(x,y,xy,xx)
print(mydata)
sums<-list(sum(x),sum(y),sum(x*y),sum(x*x))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
stline<-lm(y~x)
print(stline)
summary(stline)
plot(x,y)
abline(stline,col="BLUE")

#Parabola fit
x<-c(0,1,2,3,4)
y<-c(1,1.8,1.3,2.5,6.3)
n<-length(x)
xy<-x*y
xx<-x*x
xxx<-x^3
xxxx<-x^4
xxy<-x^2*y
mydata<- data.frame(x,y,xy,xx,xxx,xxxx,xxy)
print(mydata)
sums<-list(sum(x),sum(y),sum(x*y),sum(x*x),sum(x^3),sum(x^4),sum(x^2*y))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
parabola <- lm(y ~ x+I(x^2))
print(parabola)
f<-coef(parabola)[1]+((coef(parabola)[2])*x)+((coef(parabola)[3])*x*x)
print(f)
plot(x,y)
# draw the fitted parabola from the first to the last x value
curve(coef(parabola)[1]+(coef(parabola)[2]*x)+(coef(parabola)[3]*x*x),from=x[1],to=x[n],add=T)
curve(predict(parabola,newdata=data.frame(x)),add=T)

#Power curve fit (a*x^b)
x<-c(1,2,3,4,6,8)
y<-c(2.4,3,3.6,4,5,6)
n<-length(x)
logx<-round(log10(x),digits=4)
logy<-round(log10(y),digits=4)
logxlogy<-round(logx*logy,digits=4)
logxlogx<-round(logx*logx,digits=4)
mydata<-data.frame(logx,logy,logxlogy,logxlogx)
colnames(mydata)=c("X=logx","Y=logy","XY","XX")
print(mydata)
sums<-
list(sum(logx),sum(logy),round(sum(logx*logy),digits=4),round(sum(logx*log
x),digits=4))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
power<-lm(log10(y)~log10(x))
print(power)
alpha<-10^(coef(power)[1])
beta<-coef(power)[2]
print(round(alpha,digits=4))
print(round(beta,digits=4))
f<-alpha*(x^beta)
print(f)
plot(x,y)
curve(alpha*(x^beta),from=x[1],to=x[n],add=T)

#Exponential fit (a*b^x)
x<-c(1,1.5,2,2.5,3,3.5,4)
y<-c(1,1.3,1.6,2,2.7,3.4,4.1)
n<-length(x)
logy<-round(log10(y),digits=4)
xlogy<-round(x*logy,digits=4)
xx<-x*x
mydata<-data.frame(x,logy,xlogy,xx)
colnames(mydata)=c("X=x","Y=logy","XY","XX")
#print(mydata)
sums<-list(sum(x),sum(logy),sum(x*logy),sum(x*x))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
power<-lm(log10(y)~x)
print(power)
alpha<-10^(coef(power)[1])
beta<-10^(coef(power)[2])
print(alpha)
print(beta)
f<-alpha*(beta^x)
print(f)
plot(x,y)
curve(alpha*(beta^x),from=x[1],to=x[n],add=T)

#Exponential fit (a*e^(b*x))
x<-c(1,2,3,4,5)
y<-c(1.8,5.1,8.9,14.1,19.8)
n<-length(x)
logy<-round(log10(y),digits=4)
xlogy<-round(x*logy,digits=4)
xx<-x*x
mydata<-data.frame(x,logy,xlogy,xx)
colnames(mydata)=c("X=x","Y=logy","XY","XX")
print(mydata)
sums<-list(sum(x),sum(logy),sum(x*logy),sum(x*x))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
power<-lm(log10(y)~x)
print(power)
alpha<-10^(coef(power)[1])
# log10(y) = log10(a) + b*log10(e)*x, where log10(e) = 0.4343
beta<-coef(power)[2]/log10(exp(1))
print(alpha)
print(beta)
# fitted curve is a*e^(b*x)
f<-alpha*exp(beta*x)
print(f)
plot(x,y)
curve(alpha*exp(beta*x),from=x[1],to=x[n],add=T)
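
The exponential model Y = a*e^(bX) becomes linear after taking logarithms, so it can also be fitted directly with natural logs, which avoids the 0.4343 conversion factor. The sketch below uses the manual's data and compares both routes; the names `fit_ln`, `fit_log10`, `a_ln`, `b_ln`, `a_10` and `b_10` are introduced here for illustration.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(1.8, 5.1, 8.9, 14.1, 19.8)

# Route 1: natural logs, ln(y) = ln(a) + b*x
fit_ln <- lm(log(y) ~ x)
a_ln <- unname(exp(coef(fit_ln)[1]))
b_ln <- unname(coef(fit_ln)[2])

# Route 2: base-10 logs, log10(y) = log10(a) + b*log10(e)*x
fit_log10 <- lm(log10(y) ~ x)
a_10 <- unname(10^coef(fit_log10)[1])
b_10 <- unname(coef(fit_log10)[2] / log10(exp(1)))

fitted_y <- a_ln * exp(b_ln * x)  # values on the fitted curve
print(c(a = a_ln, b = b_ln))
print(fitted_y)
```

Both routes give the same a and b. Note that this is a least-squares fit on the log scale, which is not identical to minimising squared errors on the original scale.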

Experiment No. 5: ANOVA
a)one-way classification b)two-way classification

SOLUTION:

#ANOVA one-way (Formula)

data<-c(13,10,8,11,8,13,11,14,14,4,1,3,4,2,4)
formula<-c("A","A","A","A","A","B","B","B","B","C","C","C","C","C","C")
anova1way<-aov(data~formula)
print(summary(anova1way))

#ANOVA two-way (Treatments-blocks)

data<-c(13,7,9,3,6,6,3,1,11,5,15,5)
treatments<-c("t1","t1","t1","t1","t2","t2","t2","t2","t3","t3","t3","t3")
block<-c("b1","b2","b3","b4","b1","b2","b3","b4","b1","b2","b3","b4")
anova2way<-aov(data~treatments+block)
print(summary(anova2way))
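
To see where the F value reported by `aov()` comes from, the one-way table can be rebuilt from sums of squares. This is a minimal sketch on the one-way data above; the variable names (`values`, `group`, `ss_between`, `ss_within`, `f_manual`, `f_aov`) are chosen here for illustration.

```r
values <- c(13, 10, 8, 11, 8, 13, 11, 14, 14, 4, 1, 3, 4, 2, 4)
group <- factor(c(rep("A", 5), rep("B", 4), rep("C", 6)))

grand_mean  <- mean(values)
group_means <- tapply(values, group, mean)
group_sizes <- tapply(values, group, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)
ss_within  <- sum((values - ave(values, group))^2)

df_between <- nlevels(group) - 1
df_within  <- length(values) - nlevels(group)

f_manual <- (ss_between / df_between) / (ss_within / df_within)
f_aov <- summary(aov(values ~ group))[[1]][["F value"]][1]
print(c(f_manual, f_aov))
```

The total sum of squares splits exactly into the between and within parts, which is the identity that the ANOVA table is built on.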

Experiment No. 6: Time series
a)Moving averages b)ARIMA

SOLUTION:

#Moving averages

#Five yearly moving average (alternative data set)
#year<-c(1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982)
#timeseries<-c(19.3,20.9,17.8,16.1,17.6,17.8,18.3,17.3,21.4,19.3,18.1,19.5,19.2,22.2,20.9,21.5,21.9)
#Four yearly moving average
year<-c(1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982)
timeseries<-c(2204,2500,2360,2680,2424,2634,2904,3098,3172,2952,3248,3172)
print("enter the length of moving average")
n<-scan()
# stats::filter() is named explicitly in case another package masks filter()
ma<-stats::filter(timeseries,rep(1/n,n), sides=2)
if(n%%2==0){
# even length: average each value with the previous one to centre it on a year
adjma<-stats::filter(ma,rep(1/2,2), sides=1)
mydata<-data.frame(year,timeseries,ma,adjma)
print(mydata)
y1<-adjma
} else{
mydata<-data.frame(year,timeseries,ma)
print(mydata)
y1<-ma
}
# for the five yearly data set use xlim=c(1965,1983) and ylim=c(16,23)
plot(year,timeseries, type = "l", col = 1, xlim=c(1970,1983), ylim = c(2000,3500),
main = "Moving averages", xlab = "year", ylab = "values")
lines(year,as.numeric(y1), col="blue")

#Moving averages and lagged series (ARIMA)

#Five yearly moving average (alternative data set)
#year<-c(1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982)
#timeseries<-c(19.3,20.9,17.8,16.1,17.6,17.8,18.3,17.3,21.4,19.3,18.1,19.5,19.2,22.2,20.9,21.5,21.9)
#Four yearly moving average
year<-c(1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982)
timeseries<-c(2204,2500,2360,2680,2424,2634,2904,3098,3172,2952,3248,3172)
print("enter the length of moving average")
n<-scan()
# stats::filter() is named explicitly in case another package masks filter()
ma<-stats::filter(timeseries,rep(1/n,n), sides=2)
if(n%%2==0){
# even length: average each value with the previous one to centre it on a year
adjma<-stats::filter(ma,rep(1/2,2), sides=1)
mydata<-data.frame(year,timeseries,ma,adjma)
y1<-adjma
} else{
mydata<-data.frame(year,timeseries,ma)
y1<-ma
}
print(mydata)
# u(t-1) and u(t): the smoothed series and its one-period lag
u<-y1[!is.na(y1)]
l<-length(u)
utminus1<-u[1:(l-1)]
print(utminus1)
ut<-u[2:l]
print(ut)
# for the five yearly data set use xlim=c(1965,1983) and ylim=c(16,23)
plot(year,timeseries, type = "l", col = 1, xlim=c(1970,1983), ylim = c(2000,3500),
main = "Moving averages", xlab = "year", ylab = "values")
lines(year,as.numeric(y1), col="blue")
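
The lagged pairs u(t-1), u(t) built above are the ingredients of a first-order autoregression. The manual does not state which model order is intended, so the following is only a sketch under the assumption of an AR(1) model: it regresses u(t) on u(t-1) with `lm()` and, separately, fits ARIMA(1,0,0) to the raw five-yearly series with `stats::arima()`. The names `series`, `ma5`, `ar_lm`, `ar_fit` and `fc` are hypothetical.

```r
timeseries <- c(19.3, 20.9, 17.8, 16.1, 17.6, 17.8, 18.3, 17.3, 21.4,
                19.3, 18.1, 19.5, 19.2, 22.2, 20.9, 21.5, 21.9)
series <- ts(timeseries, start = 1966)

# Five-yearly centred moving average and its lagged pairs
ma5 <- stats::filter(series, rep(1 / 5, 5), sides = 2)
u <- as.numeric(ma5[!is.na(ma5)])
l <- length(u)
ut <- u[2:l]
utminus1 <- u[1:(l - 1)]

# AR(1) by ordinary least squares on the smoothed series: u(t) = a + b*u(t-1)
ar_lm <- lm(ut ~ utminus1)
print(coef(ar_lm))

# ARIMA(1,0,0) fitted by maximum likelihood to the raw series (assumed order)
ar_fit <- arima(series, order = c(1, 0, 0), method = "ML")
print(ar_fit)

# Forecast the next three years
fc <- predict(ar_fit, n.ahead = 3)$pred
print(fc)
```

With only 17 observations the estimates are rough; the point of the sketch is the workflow (smooth, lag, fit, forecast), not the particular numbers.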

Experiment No. 7: Goodness of fit
a)Binomial b)Poisson c)Normal d)Contingency table
SOLUTION:

#Binomial fit and goodness of fit

x<-c(0,1,2,3,4,5,6,7)
f<-c(7,6,19,35,30,23,7,1)
fx<-f*x
sumfx<-sum(f*x)
print(sumfx)
sumf<-sum(f)
print(sumf)
s<-length(x)-1
p=sumfx/sumf/s
print(p)
pr<-dbinom(0:7,size=s,prob=p)
pr<-round(pr,digits=5)
fee<-(pr*sumf)
fe<-round(fee,digits=0)
mydata<- data.frame(x,f,fx,pr,fee,fe)
print(mydata)
sums<-list(NA,sum(f),sum(f*x),NA,NA,sum(fe))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
result<-chisq.test(f,p=pr,rescale.p=TRUE)
print(result)

#Poisson fit and goodness of fit

x<-c(0,1,2,3,4,5,6,7,8)
f<-c(103,143,98,42,8,4,2,0,0)
fx<-f*x
sumfx<-sum(f*x)
print(sumfx)
sumf<-sum(f)
print(sumf)
s<-length(x)-1
p=sumfx/sumf
print(p)
pr<-dpois(0:8,lambda=p)
pr<-round(pr,digits=5)
fee<-(pr*sumf)
fe<-round(fee,digits=0)
mydata<- data.frame(x,f,fx,pr,fee,fe)
print(mydata)
sums<-list(NA,sum(f),sum(f*x),NA,NA,sum(fe))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
result<-chisq.test(f,p=pr,rescale.p=TRUE)
print(result)
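
`chisq.test()` with a supplied `p` uses k - 1 degrees of freedom and does not know that the Poisson mean was estimated from the same data. A common textbook adjustment, assumed here, is to subtract one further degree of freedom for the estimated parameter. The sketch below computes the statistic by hand on the Poisson data above; the last class is treated as "8 or more" so that the probabilities sum to 1, and small expected counts are not pooled. All variable names are illustrative.

```r
x <- 0:8
f <- c(103, 143, 98, 42, 8, 4, 2, 0, 0)
N <- sum(f)
lambda <- sum(f * x) / N              # estimated Poisson mean

p <- dpois(x, lambda)
p[length(p)] <- 1 - sum(p[-length(p)])  # last class = "8 or more"
expected <- N * p

chi_sq <- sum((f - expected)^2 / expected)
df_adj <- length(x) - 1 - 1           # one extra df lost for estimating lambda
p_value <- pchisq(chi_sq, df = df_adj, lower.tail = FALSE)
print(c(chi_sq = chi_sq, df = df_adj, p_value = p_value))

# Same statistic from the built-in test (which uses k - 1 degrees of freedom)
builtin <- suppressWarnings(chisq.test(f, p = p))
print(builtin$statistic)
```

In practice, classes with very small expected frequencies are usually merged before the test, which changes both the statistic and the degrees of freedom.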

# Goodness of fit for ND
x<-c(60,65,70,75,80,85,90,95,100)
y<-c(65,70,75,80,85,90,95,100,105)
f<-c(0,3,21,150,335,326,135,26,4)
xi<-(x+y)/2
fx<-f*xi
fxx<-f*xi*xi
sumfx<-sum(f*xi)
print(sumfx)
sumfxx<-sum(f*xi*xi)
sumf<-sum(f)
print(sumf)
m<-sumfx/sumf
print(m)
sd<-(sumfxx/sumf)-(m*m)
sd<-sqrt(sd)
print(sd)
u<-pnorm(y,m,sd)
l<-pnorm(x,m,sd)
pr<-u-l
u<-round(u,digits=5)
l<-round(l,digits=5)
pr<-round(pr,digits=5)
fee<-(pr*sumf)
fe<-round(fee,digits=0)
mydata<- data.frame(x,y,xi,f,fx,fxx,u,l,pr,fee,fe)
print(mydata)
sums<-list(NA,NA,NA,sum(f),sum(f*xi),sum(f*xi*xi),NA,NA,NA,NA,sum(fe))
mydata<-rbind(mydata,sums)
print(mydata,row.names=FALSE)
result<-chisq.test(f,p=pr,rescale.p=TRUE)
print(result)

# Goodness of fit contingency table
m<-as.table(rbind(c(190,243,197),c(82,44,44),c(23,78,34),c(5,12,8)))
dimnames(m)=list(Empcategory=c("Labour","Clerks","Technicians","Executives"),
BonusSchemes=c("Type1","Type2","Type3"))
print(m)
csum<-colSums (m)
rsum<-rowSums (m)
mytable<-(rbind(m,csum))
print(mytable)
mytable<-(cbind(m,rsum))
print(mytable)
test<-chisq.test(m)
print(test)
print(test$expected,3)
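
The expected frequencies printed by `chisq.test()` are (row total * column total) / grand total. The sketch below rebuilds them and the chi-square statistic by hand for the same table; the names `tab`, `expected`, `chi_sq`, `df` and `builtin` are chosen here for illustration.

```r
tab <- rbind(c(190, 243, 197), c(82, 44, 44), c(23, 78, 34), c(5, 12, 8))

# Expected count in each cell under independence
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

chi_sq <- sum((tab - expected)^2 / expected)
df <- (nrow(tab) - 1) * (ncol(tab) - 1)
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
print(round(expected, 3))
print(c(chi_sq = chi_sq, df = df, p_value = p_value))

builtin <- suppressWarnings(chisq.test(tab))
print(builtin$statistic)
```

For tables larger than 2 x 2, `chisq.test()` applies no continuity correction, so the hand-computed statistic matches it exactly.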

Experiment No. 8: Parametric tests
a) t-test for one-mean b) t-test for two means c) paired t-test d) F-test

SOLUTION:

#one mean- LST

print("Enter the number of observations given")
n<-scan()
print("Enter the mean value given")
mean<-scan()
print("Enter the standard deviation value given")
std<-scan()
print("Enter the mu value to be tested:")
mu<-scan()
print("enter the level of significance")
alpha<-scan()
zvalue<-(mean-mu)/(std/sqrt(n))
print(zvalue)
tab<-qnorm(alpha,lower.tail=TRUE)
print(tab)
tab<-qnorm(alpha,lower.tail=FALSE)
print(tab)

#two-means- SST
#x<-c(8260, 8130, 8350, 8070, 8340)
#y<-c(7950, 7890, 7900, 8140, 7920, 7840)
x<-c(59, 68, 44, 71, 63, 46, 69, 54, 48)
y<-c(50, 36, 62, 52, 70, 41)
print("enter the level of significance")
alpha<-scan()
n1<-length(x)
n2<-length(y)
sd=sqrt((((n1-1)*sd(x)^2)+(n2-1)*sd(y)^2)/(n1+n2-2))
tvalue=(mean(x)-mean(y))/(sd*sqrt((1/n1)+(1/n2)))
print("mean of x:")
print(mean(x))
print("mean of y:")
print(mean(y))
print("Combined SD")
print(sd)
print("Calculated value of t:")
print(round(tvalue,digits=4))
#t<-t.test(x,y)
#print(t)
print("Table value for two-tailed test:")
tablevalue<-qt(1-alpha/2, df=n1+n2-2)
print(round(tablevalue,digits=3))
print("Table value for one-tailed test:")
tablevalue<-qt(1-alpha, df=n1+n2-2)
print(round(tablevalue,digits=3))
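
The pooled-variance t value computed above can be checked against `t.test()` with `var.equal = TRUE`. The sketch below does this on the same data, with the level of significance fixed at 0.05 as an assumption instead of being read with `scan()`; the names `pooled_sd`, `t_manual`, `res` and `reject` are illustrative.

```r
x <- c(59, 68, 44, 71, 63, 46, 69, 54, 48)
y <- c(50, 36, 62, 52, 70, 41)
alpha <- 0.05                       # assumed level of significance

n1 <- length(x); n2 <- length(y)
df <- n1 + n2 - 2
pooled_sd <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / df)
t_manual <- (mean(x) - mean(y)) / (pooled_sd * sqrt(1 / n1 + 1 / n2))

# Built-in two-sample t-test assuming equal variances
res <- t.test(x, y, var.equal = TRUE)

# Two-tailed decision: reject H0 when |t| exceeds the table value
reject <- abs(t_manual) > qt(1 - alpha / 2, df = df)
print(c(t_manual = t_manual, t_builtin = unname(res$statistic)))
print(reject)
```

Without `var.equal = TRUE`, `t.test()` runs Welch's test, whose statistic and degrees of freedom generally differ from the pooled version.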

#paired t-test for two-means- SST
x<-c(45, 73, 46, 124, 33, 57, 83, 34, 26, 17)
y<-c(36, 60, 44, 119, 35, 51, 77, 29, 24, 11)
print("enter the level of significance")
alpha<-scan()
d<-x-y
n<-length(d)
dbar<-mean(d)
std<-sd(d)
values<-c(dbar,std)
print("values of dbar and standard deviation of differences")
print(round(values,digits=4))
t<-t.test(x,y,paired=TRUE)
print(t)
print("Table value for two-tailed test:")
tablevalue<-qt(1-alpha/2, df=n-1)
print(round(tablevalue,digits=4))
print("Table value for one-tailed test:")
tablevalue<-qt(1-alpha, df=n-1)
print(round(tablevalue,digits=3))

#F-test for equality of two variances

#x<-c(16, 26, 27, 23, 24, 22)
#y<-c(33, 42, 35, 32, 28, 31)
#x<-c(9, 11, 13, 11, 15, 9 ,12, 14)
#y<-c(10, 12, 10, 14, 9, 8, 10)
x<-c(20,16,26,27,23,22,18,24,25,19)
y<-c(17,23,32,25,22,24,28,16,31,33,20,27)
print("enter the level of significance")
alpha<-scan()
n1<-length(x)
n2<-length(y)
sd1<-sd(x)
sd2<-sd(y)
if (sd1>sd2) { fvalue<-sd1^2/sd2^2
} else fvalue<-sd2^2/sd1^2
xbar<-mean(x)
ybar<-mean(y)
print("mean of x, mean of y, SD of x , SD of y")
values<-c(xbar,ybar,sd1,sd2)
print(round(values,digits=4))
print("F value using program")
print(round(fvalue,digits=4))
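# var.test() reports var(x)/var(y), which is the reciprocal of the
# program's ratio when y has the larger variance
ftest<-var.test(x,y)
print("F value using built in ")
print(ftest)
# qf(1-alpha, ...) is the upper-tail (one-tailed) table value
print("Table value for one-tailed test:")
if (sd1>sd2) { ftab<-qf(1-alpha,n1-1,n2-1)
} else ftab<-qf(1-alpha,n2-1,n1-1)
print(round(ftab,digits=3))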
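
The relation between the hand-computed variance ratio and `var.test()` can be made explicit. In the sketch below (same data as above; the names `ratio`, `vt`, `lower` and `p_two_sided` are illustrative), the built-in statistic is var(x)/var(y) and its two-sided p-value is twice the smaller tail probability of the F distribution.

```r
x <- c(20, 16, 26, 27, 23, 22, 18, 24, 25, 19)
y <- c(17, 23, 32, 25, 22, 24, 28, 16, 31, 33, 20, 27)

df1 <- length(x) - 1
df2 <- length(y) - 1
ratio <- var(x) / var(y)            # numerator is always x in var.test()

vt <- var.test(x, y)

# Two-sided p-value: twice the smaller tail area
lower <- pf(ratio, df1, df2)
p_two_sided <- 2 * min(lower, 1 - lower)

print(c(ratio = ratio, builtin = unname(vt$statistic)))
print(c(p_manual = p_two_sided, p_builtin = vt$p.value))
```

Putting the larger variance in the numerator, as the program above does, gives a ratio of at least 1 and lets a single upper-tail table value be used.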

Experiment No. 9: Graphical representation of data
a) Bar plot b)Frequency polygon c)Histogram d)Pie chart e) scatter plot

SOLUTION

#Bar plot
H <- c(5,15,17,18,16,15)
M <- c(1980,1981,1982,1983,1984,1985)
barplot(H,xlab="Year",ylab="Profit",ylim=c(0,20),
col=rainbow(6),names.arg=M, main="RVRJC PHARMACEUTICAL FIRM",border="red")

#Frequency polygon
v <- c(15, 35, 20, 10, 5 ,15, 20, 15, 12, 13)
plot(v,type="o",xlab="Year",ylab="Profit",xlim=c(1,10),ylim=c(0,40),
col="green", main="RVRJC PHARMACEUTICAL FIRM")

#Histogram
v<-c(3,5,6,19,9,18,23,67,11,10,44,45,54,37,26,8,5,1)
# ylim is raised to 10 because the tallest bar (values up to 10) has 8 observations
hist(v,main="STUDENTS MARKS", xlab ="Weight",xlim=c(0,70), ylab="no. of branches",
ylim=c(0,10),col=rainbow(10))

#Pie chart
x<-c(5086,3179,1429,152,257,69)
lbls<-c("Total Income", "Interest Paid", "Salaries", "Rent",
"Others","Profit")
pie(x, labels = lbls,col=rainbow(6), main="Income and Expenditure of a Bank")

#Scatter plot
x<-c(65,68,69,70,75,65,87,56,54)
y<-c(75,76,68,79,77,69,80,60,65)
plot(x,y,xlab="Weight",ylab="Height",main="Weight vs Height")
