
Rstudio Divya

The document is a research methodology guide for a BCOM course, focusing on R programming and RStudio. It includes instructions on installation, layout explanation, and various functions and packages in R, along with examples and commands for data manipulation and visualization. Additionally, it covers hypothesis testing using t-tests and provides a framework for analyzing data sets.

VIVEKANANDA INSTITUTE OF

PROFESSIONAL STUDIES-
TECHNICAL CAMPUS

Vivekananda School of Business Studies

Research Methodology Of Commerce

BCOM 2023-2026
RStudio

Student Name: DIVYA SINGH


Enrolment Number: 03329888823
Section- B.COM(Hons.)-3D

Submitted to: Deepika Chhikara


Assistant Professor, VIPS

INTRODUCTION TO R

➢ How to install RStudio? What is the latest version of R? Give details.
We must install R before we can install RStudio. Here are the instructions for installing R:
1. Visit CRAN, choose "Download R for Windows", then click "base" to get the most recent R installer.
2. Right-click the installer file and, from the pop-up menu that appears, choose "Run as administrator".
3. Choose the language to be used for the installation. This does not change R itself: R continues to use English for all messages and help files.
4. Follow the installer's instructions. You may keep clicking Next until R installs; the default settings can be used without risk.
Once R is installed, we can install RStudio. The steps are as follows:
1. Run the RStudio setup. Keep the installation parameters set to their default values.
2. Open RStudio.
3. To add the R Commander package, select "Install Packages" under the "Packages" tab.
4. Type "Rcmdr" until it appears in the list of results, then select it.
5. Wait until the R Commander package has been installed in its entirety.
Released on October 17, 2023, the most recent version of RStudio at the time of writing is "2023.09.1", known as "Desert Sunflower".
➢ RStudio layout with snapshot. Explain the purpose of all panes.
There are four panes (also known as windows) in RStudio. These panes are:

1. Source -
This is the part of the window where we write our code. The code is not evaluated until we run it in the console.

2. Console -
This is the part of the window where our code from the source is evaluated by R. We can also use the console to perform quick calculations that we don't need to save.

3. Environment/History -
This is the part of the window where we can see what objects are in our working space.

4. Files/Plots/Packages/Help -
This is the part of the window where we can see file directories, view plots, see our packages and access R help.

➢ Who designed and developed the R language?

Ross Ihaka and Robert Gentleman designed the R language for statistical analysis. The language is developed by the R Development Core Team. It is distributed in many formats, including precompiled binary files for Windows, Macintosh, Linux, and UNIX, as well as source code written largely in C.

➢ Are variables ‘H’ and ‘h’ the same in the R language?

R is a case-sensitive language, hence ‘H’ and ‘h’ are two different variables.
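
A minimal sketch of the point above; the values 10 and 20 are arbitrary and chosen only for illustration.

```r
H <- 10   # upper-case name
h <- 20   # lower-case name: a completely separate object

print(H)
print(h)
print(identical(H, h))   # FALSE here, because H and h hold different values
```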

➢ What is a package? What are two major parts of R language?
A package is a collection of functions, examples, and documentation. The R language offers many packages that are useful for a wide range of purposes. Packages such as utils, stats, and graphics are included with the base installation.
The two main components of the R language software are the base system and additional packages. Along with other essential features, the "base" package is the most significant component of the base system.

➢ How is a package installed and accessed?

To install a package, pass its name in quotes to the install.packages() command. Once installed, the package is accessed by loading it into the session with the library() command.
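
As a hedged illustration, the install step is shown as a comment because it needs an internet connection; "dplyr" is only an example package name. The load step uses "tools", a package that ships with R but is not attached by default.

```r
# Install once per machine (requires internet access):
# install.packages("dplyr")

# Load an installed package into the current session
library(tools)

# Check that the package is now attached to the search path
print("package:tools" %in% search())
```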

➢ What is CRAN?
The Comprehensive R Archive Network (CRAN) is the official repository for R: a global network of web and file servers coordinated by the R community. In order for a package to be listed on CRAN, it must pass a number of tests to make sure that it complies with CRAN guidelines.

➢ What do you mean by Object Assignment? Elucidate the difference between Left-side and Right-side Assignment with output. What is the Assignment operator in RStudio?
When values are assigned to a name using the assignment operator, a new named object is created. This process is known as object assignment. Once the new named object is formed, it may be used again in subsequent calculations without duplicating the values.
There are three parts to the assignment operation. Starting from the left:
1. The object name, for example x, which identifies the new object.
2. The assignment operator <-, which consists of the less-than sign < followed immediately by the minus sign -.
3. The value or values to be assigned to the name.

The symbol <- assigns a value to a variable from right to left (left-side assignment).

The symbol -> assigns a value to a variable from left to right (right-side assignment).

To assign a value, use the assignment operator. For example, we may use the <- assignment operator to set the variable x's value to 3. By simply entering x at the command line, which will return the value of x, we may evaluate the variable.
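
The two directions of assignment can be sketched as follows; the names x, y and z and their values are illustrative only.

```r
x <- 3        # left-side assignment: the value on the right goes into x
5 -> y        # right-side assignment: the value on the left goes into y

print(x)      # [1] 3
print(y)      # [1] 5

z <- x + y    # a named object can be reused in later calculations
print(z)      # [1] 8
```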

➢ Explain the c( ) function. Write a sample command for the same.
The c() function in R is used to perform three common tasks:
1. Create a Vector.
2. Concatenate multiple vectors.
3. Create columns in a data frame.
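
A short sketch of the three uses listed above; all names and values are made up for illustration.

```r
# 1. Create a vector
marks <- c(78, 85, 92)

# 2. Concatenate multiple vectors
more_marks <- c(marks, c(66, 71))
print(more_marks)

# 3. Create columns in a data frame
students <- data.frame(
  name  = c("A", "B", "C"),
  marks = c(78, 85, 92)
)
print(students)
```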

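➢ What is paste( ) function used for? Write a sample command for the same.
The paste() function takes elements from one or more vectors, converts them to character strings, and concatenates corresponding elements into a single string. With the collapse argument, all elements are joined into one string.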
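
A few sample commands, using illustrative strings only:

```r
# Join two strings (the default separator is a single space)
full_name <- paste("Divya", "Singh")
print(full_name)                      # [1] "Divya Singh"

# Element-wise pasting with a custom separator
labels <- paste("Group", 1:3, sep = "_")
print(labels)                         # [1] "Group_1" "Group_2" "Group_3"

# Collapse a whole vector into one string
sentence <- paste(c("R", "is", "fun"), collapse = " ")
print(sentence)                       # [1] "R is fun"
```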
➢ What is %>% operator used for? Write a sample command for the same.
%>% is called the forward pipe operator in R. It comes from the magrittr package (and is also made available by dplyr) and provides a mechanism for chaining commands: the operator forwards a value, or the result of an expression, into the next function call or expression.
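
A minimal sketch, assuming the magrittr package is installed; the guard simply skips the example otherwise. The input vector is arbitrary.

```r
if (requireNamespace("magrittr", quietly = TRUE)) {
  library(magrittr)

  # Piped form: take the vector, then sqrt(), then sum()
  piped <- c(1, 4, 9) %>% sqrt() %>% sum()

  # Equivalent nested form, read from the inside out
  nested <- sum(sqrt(c(1, 4, 9)))

  print(piped)             # [1] 6
  print(piped == nested)   # [1] TRUE
}
```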

➢ What is meant by “>”, “+” and [1] in R console?

In the R console, each line of output begins with an index such as [1], which gives the position of the first element shown on that line. It appears because R treats even a single number as a vector of length one. “>” is the primary prompt, where R waits for a new command, whereas “+” is a secondary prompt that appears when an instruction or command is not finished. In this case, either complete the code or press “Esc” to get back to the primary prompt, i.e., “>”.

➢ Write the code to identify odd or even numbers using IF statement.
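
One possible answer, sketched with an arbitrary number; the variable names are illustrative.

```r
num <- 7   # hypothetical input number

# A number is even when the remainder after dividing by 2 is zero
if (num %% 2 == 0) {
  result <- "even"
} else {
  result <- "odd"
}

print(paste(num, "is", result))
```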
➢ Write the code to identify minimum number among three numbers using Nested IF statement.
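
A sketch of one way to do this with nested if statements; the three numbers are arbitrary examples.

```r
n1 <- 12; n2 <- 7; n3 <- 9   # hypothetical inputs

if (n1 <= n2) {
  # n1 beats n2, so compare n1 with n3
  if (n1 <= n3) {
    smallest <- n1
  } else {
    smallest <- n3
  }
} else {
  # n2 beats n1, so compare n2 with n3
  if (n2 <= n3) {
    smallest <- n2
  } else {
    smallest <- n3
  }
}

print(paste("Minimum number is", smallest))
```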

➢ Display grade of student using nested if command for following criterion (customize student name)
Output example: Divya has scored “A” Grade

IF PERCENTAGE        THEN GRADE
>= 90                A+
>= 75 and < 90       A
>= 50 and < 75       B
< 50                 F
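
The grading criterion above can be sketched with nested if statements as follows; the percentage of 82 is a made-up value.

```r
student    <- "Divya"   # student name (customise as needed)
percentage <- 82        # hypothetical percentage

if (percentage >= 90) {
  grade <- "A+"
} else {
  if (percentage >= 75) {
    grade <- "A"
  } else {
    if (percentage >= 50) {
      grade <- "B"
    } else {
      grade <- "F"
    }
  }
}

cat(student, " has scored \"", grade, "\" Grade\n", sep = "")
```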

➢ How to import a data sheet from Excel?

Step 1:- Select the Import Dataset option in the Environment window of RStudio.
Step 2:- Select “From Excel” under the Import Dataset option, since the file to be imported is an Excel file.
Step 3:- Click Browse and select the Excel file to be imported into R.
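
The same import can also be done in code. This is a sketch that assumes the readxl package is installed and uses the example workbook bundled with readxl in place of a real file path.

```r
if (requireNamespace("readxl", quietly = TRUE)) {
  library(readxl)

  # Example workbook that ships with readxl; replace with your own file path
  path <- readxl_example("datasets.xlsx")

  # Read the first sheet into a data frame (tibble)
  sheet1 <- read_excel(path, sheet = 1)
  print(head(sheet1))
}
```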
➢ What are Data Frames, Matrices, Vectors?
Data presented in a table is called a data frame. Several kinds of data can be contained in a data frame: for example, the first column can be character-based while the second and third columns can be numerical or logical. Within any one column, however, all values should be of the same type.
A two-dimensional data collection having rows and columns is called a matrix. A row is a horizontal representation of data, whereas a column is a vertical representation. The matrix() function can be used to construct a matrix.
A vector is simply a collection of objects that share the same type. To create a vector, use the c() function and separate each item in the list with a comma.

➢ Write a command to create Data Frames, Matrices, Vectors.
Command to Create A Data Frame: -
Command to Create A Matrix: -

Command to Create A Vector: -
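
Sample commands for all three structures, using made-up names and values:

```r
# Vector: items of the same type combined with c()
marks <- c(45, 67, 89)

# Matrix: 2 rows and 3 columns, filled column by column
m <- matrix(1:6, nrow = 2, ncol = 3)

# Data frame: columns may differ in type from one another
df <- data.frame(
  name   = c("Asha", "Ravi", "Meena"),
  marks  = marks,
  passed = marks >= 50
)

print(marks)
print(m)
print(df)
```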


➢ Name some built-in functions with their description.

FUNCTION        DESCRIPTION
mean(x)         Mean of x
median(x)       Median of x
var(x)          Variance of x
sd(x)           Standard deviation of x
scale(x)        Standard scores (z-scores) of x
quantile(x)     The quartiles of x
summary(x)      Summary of x: mean, min, max, etc.
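
The functions in the table can be tried on a small vector; the values below are arbitrary.

```r
x <- c(2, 4, 4, 6, 9)   # hypothetical data

print(mean(x))               # 5
print(median(x))             # 4
print(var(x))                # 7
print(sd(x))                 # square root of the variance
print(as.numeric(scale(x)))  # z-scores: (x - mean) / sd
print(quantile(x))           # minimum, quartiles and maximum
print(summary(x))
```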

➢ Create a function for multiplication but no return value
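
A possible sketch: every R function technically returns something, so here the product is only printed and invisible(NULL) is used so that no usable value comes back. The function name multiply is illustrative.

```r
multiply <- function(a, b) {
  # Show the product on screen instead of returning it
  cat("Product:", a * b, "\n")
  invisible(NULL)
}

out <- multiply(6, 7)   # prints the product
print(is.null(out))     # TRUE: nothing useful was returned
```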
➢ Write a command for Accessing Rows and Columns.
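
Rows and columns are accessed with square brackets as [row, column]. The sketch below uses the built-in mtcars dataset as a stand-in for any data frame.

```r
first_row  <- mtcars[1, ]        # row 1, all columns
second_col <- mtcars[, 2]        # all rows, column 2
mpg_col    <- mtcars$mpg         # a column by name
one_cell   <- mtcars[3, "hp"]    # row 3 of the column named "hp"

print(first_row)
print(head(second_col))
print(one_cell)
```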

➢ Create a data frame by your surname of 12 rows and 8 columns.
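
One way to build such a data frame, sketched with randomly generated numbers; the name singh and the column names are placeholders.

```r
set.seed(123)   # makes the random numbers reproducible

# 96 random values arranged as 12 rows x 8 columns
singh <- as.data.frame(
  matrix(sample(1:100, 12 * 8, replace = TRUE), nrow = 12, ncol = 8)
)
names(singh) <- paste0("Col", 1:8)

print(dim(singh))   # 12 rows, 8 columns
print(head(singh))
```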
➢ Write a command to access non-consecutive rows or columns using ‘c()’. For example, obtain rows 1 to 5, 7 and 11 and columns 3 to 4 and 7.
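
A sketch of the selection described above, using a simple 12 x 8 data frame filled with the numbers 1 to 96 as a stand-in.

```r
singh <- as.data.frame(matrix(1:96, nrow = 12, ncol = 8))

# Rows 1 to 5, 7 and 11; columns 3 to 4 and 7
subset_df <- singh[c(1:5, 7, 11), c(3:4, 7)]

print(subset_df)
print(dim(subset_df))   # 7 rows, 3 columns
```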

➢ Add one new column and drop two existing columns 4 and 5
➢ Drop rows 1, 3 and 4
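
Adding a column and dropping columns and rows can be sketched like this; the stand-in data frame and the new column name Total are illustrative.

```r
singh <- as.data.frame(matrix(1:96, nrow = 12, ncol = 8))

# Add one new column (here: the row-wise total)
singh$Total <- rowSums(singh)

# Drop existing columns 4 and 5 using negative indices
singh <- singh[, -c(4, 5)]

# Drop rows 1, 3 and 4
singh <- singh[-c(1, 3, 4), ]

print(dim(singh))     # 9 rows, 7 columns
print(names(singh))
```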

➢ Write a command to calculate the number of columns and number of rows.
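
A short sketch using the built-in mtcars dataset as the example data frame:

```r
n_rows <- nrow(mtcars)   # number of rows
n_cols <- ncol(mtcars)   # number of columns

print(n_rows)
print(n_cols)
print(dim(mtcars))       # both at once: rows, then columns
```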
➢ What is the command to access built-in datasets? What is the command to get the description of a built-in dataset?
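
A hedged sketch: data() lists the available datasets, and help() (or ?) opens the description of one. The help call is left as a comment because it opens a help page rather than returning a value.

```r
# List the datasets that ship with the "datasets" package
available <- data(package = "datasets")$results[, "Item"]
print(head(available))

# Load one built-in dataset by name and look at it
data(mtcars)
print(head(mtcars))

# Description of a built-in dataset (opens the help page):
# help("mtcars")   # or: ?mtcars
```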

➢ Access Titanic dataset and execute commands to evaluate whether the evacuation strategy was fair or not. If biased, state which gender, age group and class was most favored. Analyze using cross tabulations.
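
One possible analysis, sketched with the built-in Titanic table, whose dimensions are Class, Sex, Age and Survived. Each cross tabulation is converted to row proportions so the survival rates can be compared.

```r
data(Titanic)

# Cross tabulations of each factor against survival
by_class <- margin.table(Titanic, c(1, 4))
by_sex   <- margin.table(Titanic, c(2, 4))
by_age   <- margin.table(Titanic, c(3, 4))

# Row proportions: share of each group that did / did not survive
print(round(prop.table(by_class, 1), 3))
print(round(prop.table(by_sex, 1), 3))
print(round(prop.table(by_age, 1), 3))
```

In this dataset the survival proportion is highest for females, for children and for 1st class passengers, which suggests the evacuation was not equally favourable to all groups.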

➢ Calculate correlation by importing data from excel. Determine whether there is a positive or a negative correlation in Advertisement in month and Sales in crores.
Advertisement in month Sales in crores
112
32 5
54 10
67 15
65 20
98 24
34
62 24
34 34
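
A minimal sketch of the correlation step. The two vectors below are hypothetical values, not the data from the table above; in practice they would be the two columns of the imported Excel sheet.

```r
# Hypothetical values for illustration only
advertisement <- c(10, 20, 30, 40, 50)
sales         <- c(3, 5, 8, 9, 14)

r <- cor(advertisement, sales)   # Pearson correlation coefficient
print(r)

# The sign of r gives the direction of the relationship
if (r > 0) {
  print("Positive correlation")
} else if (r < 0) {
  print("Negative correlation")
} else {
  print("No linear correlation")
}
```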

Packages in R Programming
The “tidyr” Package
➢ Apply Important Functions (gather, separate, unite, spread, fill, full_seq, drop_na, and replace_na) in “tidyr” Package for following dataset

S.No. Group 1 Group 2 Group 3


1 23 117 29
2 345 89 101
3 76 66 239
4 212 334 289
5 88 90 176
6 199 101 320
7 72 178 89
8 35 233 109
9 90 45 199
10 265 200 56

Gather ()
Separate ()

Unite ()
Spread ()

Fill in Missing Values ()


Full Sequence ()

Drop na ()
Replace na ()
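
The eight functions can be sketched as below, assuming the tidyr package is installed. The column names Group1 to Group3 stand in for the table headers, and the small with_gaps data frame is made up to show the missing-value functions.

```r
if (requireNamespace("tidyr", quietly = TRUE)) {
  library(tidyr)

  scores <- data.frame(
    S.No   = 1:10,
    Group1 = c(23, 345, 76, 212, 88, 199, 72, 35, 90, 265),
    Group2 = c(117, 89, 66, 334, 90, 101, 178, 233, 45, 200),
    Group3 = c(29, 101, 239, 289, 176, 320, 89, 109, 199, 56)
  )

  # gather(): wide to long; spread(): long back to wide
  long <- gather(scores, key = "Group", value = "Value", Group1:Group3)
  wide <- spread(long, key = Group, value = Value)

  # unite(): merge two columns; separate(): split them again
  united    <- unite(scores, "G1_G2", Group1, Group2, sep = "-")
  separated <- separate(united, G1_G2, into = c("Group1", "Group2"), sep = "-")

  # Hypothetical data with missing values
  with_gaps <- data.frame(S.No = 1:4, Value = c(23, NA, NA, 212))
  filled    <- fill(with_gaps, Value)                   # carry last value down
  dropped   <- drop_na(with_gaps)                       # remove rows with NA
  replaced  <- replace_na(with_gaps, list(Value = 0))   # NA becomes 0

  # full_seq(): complete a sequence with a fixed step
  full <- full_seq(c(1, 2, 4, 7), period = 1)

  print(head(long))
  print(filled)
  print(full)
}
```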

The “dplyr” Package


➢ Apply Important Functions (filter, arrange, select, rename, mutate and transmute, sample_n and sample_frac) for following column heads with 5 data rows:
NAME | SEMESTER 1 MARKS | SEMESTER 2 MARKS | SUBJECT MATHS IN CLASS 12TH (YES/NO)
Filter ()

Arrange ()
Select ()

Rename ()
Mutate ()

Transmute ()
Sample_n ()

Sample_frac ()
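
A sketch of the eight functions, assuming the dplyr package is installed. The five data rows and the shortened column names are hypothetical.

```r
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)

  # Hypothetical data with 5 rows
  marks <- data.frame(
    NAME       = c("Asha", "Ravi", "Meena", "Karan", "Divya"),
    SEM1_MARKS = c(72, 65, 88, 54, 91),
    SEM2_MARKS = c(78, 60, 84, 59, 95),
    MATHS_12TH = c("YES", "NO", "YES", "NO", "YES")
  )

  maths_only <- filter(marks, MATHS_12TH == "YES")      # keep matching rows
  sorted     <- arrange(marks, desc(SEM2_MARKS))        # sort, highest first
  two_cols   <- select(marks, NAME, SEM1_MARKS)         # keep chosen columns
  renamed    <- rename(marks, STUDENT = NAME)           # new_name = old_name
  with_total <- mutate(marks, TOTAL = SEM1_MARKS + SEM2_MARKS)   # add a column
  only_avg   <- transmute(marks, NAME,
                          AVERAGE = (SEM1_MARKS + SEM2_MARKS) / 2)  # keep only these
  two_rows   <- sample_n(marks, 2)                      # 2 random rows
  forty_pct  <- sample_frac(marks, 0.4)                 # 40% of the rows

  print(maths_only)
  print(sorted)
  print(with_total)
}
```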
Data Visualization in R Studio
Quick plot with ggplot2
➢ Generate BCOM marks data containing the sections and overall percentage (5 sections ranging from A to E), with 60 students in each section
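
One way to generate such data, as a sketch: percentages are drawn at random between 35 and 100, a range chosen here only for illustration.

```r
set.seed(2023)   # reproducible random numbers

bcom <- data.frame(
  Section    = rep(c("A", "B", "C", "D", "E"), each = 60),
  Percentage = round(runif(300, min = 35, max = 100), 1)
)

print(head(bcom))
print(table(bcom$Section))   # 60 students in each section
```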

Plot
➢ Create following Quick plots with customized labels (with your name and DOB) for both the axis and Main title of the chart
• Histogram plot
• Histogram fill color by group (Section)
• Basic density plot
• Density plot line color by group (Section) and change line type
• Draw a plot using data from numeric vectors where X contains values ranging from 10 to 20 and Y is square of X
• Add to the dot plot for X & Y
HISTOGRAM PLOT:
DENSITY PLOT:

SCATTER DOT PLOT:
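
A sketch of the quick plots using qplot() from ggplot2, assuming the package is installed (recent ggplot2 versions mark qplot() as deprecated but still provide it). The marks data is regenerated at random, and the label text is a placeholder to be replaced with your own name and DOB.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  set.seed(2023)
  bcom <- data.frame(
    Section    = rep(c("A", "B", "C", "D", "E"), each = 60),
    Percentage = runif(300, min = 35, max = 100)
  )

  # Histogram with fill colour by Section and customised labels
  p_hist <- qplot(Percentage, data = bcom, geom = "histogram",
                  fill = Section, bins = 20,
                  main = "BCOM marks (your name, DOB)",
                  xlab = "Percentage (your name)",
                  ylab = "Count (your DOB)")

  # Density plot with line colour and line type by Section
  p_dens <- qplot(Percentage, data = bcom, geom = "density",
                  colour = Section, linetype = Section)

  # X from 10 to 20, Y is the square of X; points joined by a line
  x <- 10:20
  y <- x^2
  p_xy <- qplot(x, y, geom = c("point", "line"))

  # print(p_hist); print(p_dens); print(p_xy)   # draw the plots
}
```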


➢ Activate Motor Trend Car Road Tests dataset. Using the given data set prepare following quick plots:
• Scatter plots with smoothed line for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis
• Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line by groups (Number of cylinders)
• Scatter plots with colors for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis
• Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with colors by groups (Number of gears)
• Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line and colors by groups (Number of gears)
• Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line and the point shape by groups (Number of gears)
PLOT1: Scatter plots with smoothed line for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis

PLOT2: Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line

PLOT3: Scatter plots with colors for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis

PLOT4: Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with colors by groups (Number of gears)

PLOT5: Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line and colors by groups (Number of gears)

PLOT6: Scatter plots (for Miles/(US) gallon on y axis and Weight (lb/1000) on x axis) with Smoothed line and the point shape by groups (Number of gears)
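
The six plots can be sketched with qplot() on the built-in mtcars dataset, assuming ggplot2 is installed. The colour "blue" in the third plot is an arbitrary choice.

```r
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # 1. Scatter plot with smoothed line
  p1 <- qplot(wt, mpg, data = mtcars, geom = c("point", "smooth"))

  # 2. Smoothed line by groups (number of cylinders)
  p2 <- qplot(wt, mpg, data = mtcars, geom = c("point", "smooth"),
              colour = factor(cyl))

  # 3. Scatter plot with a fixed colour
  p3 <- qplot(wt, mpg, data = mtcars, colour = I("blue"))

  # 4. Colours by groups (number of gears)
  p4 <- qplot(wt, mpg, data = mtcars, colour = factor(gear))

  # 5. Smoothed line and colours by groups (number of gears)
  p5 <- qplot(wt, mpg, data = mtcars, geom = c("point", "smooth"),
              colour = factor(gear))

  # 6. Smoothed line with point shape by groups (number of gears)
  p6 <- qplot(wt, mpg, data = mtcars, geom = c("point", "smooth"),
              shape = factor(gear))

  # print(p1)   # print any of p1..p6 to draw it
}
```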
➢ Provide 5 commands for Descriptive statistics
DATA:

STATISTICAL COMMANDS:
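
Five descriptive statistics commands, sketched on the mpg column of the built-in mtcars dataset as example data:

```r
mpg <- mtcars$mpg   # example data: miles per gallon

print(mean(mpg))     # arithmetic mean
print(median(mpg))   # middle value
print(sd(mpg))       # standard deviation
print(range(mpg))    # minimum and maximum
print(IQR(mpg))      # interquartile range
```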
➢ Q36. Provide summary statistics for the MTCARS dataset while displaying a count summary of categorical variables.
Summary ()

Count Summary
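
A sketch of both steps. Treating cyl, gear and am as the categorical variables is an assumption made here; converting them to factors makes summary() report counts for them.

```r
# Summary statistics for every column
print(summary(mtcars))

# Count summary of categorical variables
cyl_counts  <- table(mtcars$cyl)
gear_counts <- table(mtcars$gear)
print(cyl_counts)
print(gear_counts)

# Alternative: store categorical columns as factors so summary() counts them
cars <- mtcars
cars$cyl  <- factor(cars$cyl)
cars$gear <- factor(cars$gear)
cars$am   <- factor(cars$am, labels = c("automatic", "manual"))
print(summary(cars[, c("cyl", "gear", "am")]))
```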
HYPOTHESIS TESTING using R studio
For all tests, import an Excel file containing the given data, saved as yourname_test.

T-TEST
➢ A. One Sample t- Test using dummy (One- Tailed)
File name example: divya_ttest

Problem 1:
To determine whether the population mean of age is equal to 30 at α=0.05.

Age
18
24
56
78
67
24
65
89
25
23
45
65
78
55
32
33
44
26
56
89
44
34
3
4
56
56
76

SOLUTION: -
H0: Population mean of age is equal to 30
H0: µ = 30
H1: Population mean of age is less than 30
H1: µ < 30

STEPS: -
1. In the File Tab, Click on the Import Dataset then click From Excel-
2. Click on Browse, select file, select sheet and import-

3. Click on R script for new file-


4. Write the coding in source, click on run and analyse the output from console-

CODING:
library(readxl)
Divya_ttest <- read_excel("C:/Users/mrvai/Desktop/divya singh/college/Divya_ttest.xlsx")
View(Divya_ttest)
data<-Divya_ttest[,c("Age")]
null_mean<-30
t_test_result<-t.test(data$Age,mu=null_mean,alternative = "less")
print(t_test_result)

RESULT-
data: data$Age
t = 3.5836, df = 26, p-value = 0.9993
alternative hypothesis: true mean is less than 30
95 percent confidence interval:
-Inf 54.87245
sample estimates:
mean of x
46.85185

DECISION RULE:-
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.9993) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:-
There is not sufficient evidence that the population mean of age is less than 30 at α = 0.05.
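
For readers without the Excel file, the same test can be reproduced by typing the ages in directly. This is a sketch that mirrors the settings of the code above (mu = 30, alternative = "less"); the vector name age is illustrative.

```r
age <- c(18, 24, 56, 78, 67, 24, 65, 89, 25, 23, 45, 65, 78, 55,
         32, 33, 44, 26, 56, 89, 44, 34, 3, 4, 56, 56, 76)

# One-sample, one-tailed t-test against a hypothesised mean of 30
result <- t.test(age, mu = 30, alternative = "less")
print(result)

# Decision at the 5% level
alpha <- 0.05
if (result$p.value < alpha) {
  print("Reject Null Hypothesis")
} else {
  print("Do not reject Null Hypothesis")
}
```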

➢ B. Two Sample t- Test

Problem 1:
To analyse whether the time spent by full time students in studying statistics is different from the time spent by part time students.
Full Time Part Time
3.2 3.1
1.5 3.4
6.5 4.6
0.2 2.8
3.7 2.3
3.3 1.5
1.7 3.8
3.6 9.5
3.8 4.3
5.3 2.7
6.9 3.4
3.6 1.6
1.7 3.2
2.2 4.2
7.2 3.9
3.9 1.2
1.9
5.3

SOLUTION:
Hypothesis Testing
H0: The time spent by full time students in studying statistics is not different from the time spent by part time students.
H0: µF = µP
H0: µF - µP = 0
H1: The time spent by full time students in studying statistics is different from the time spent by part time students.
H1: µF ≠ µP
H1: µF - µP ≠ 0
Steps-
1. In the File Tab, click on Import Dataset then click From Excel-

2. Click on Browse, select file, select sheet and import-


3. Click on R script for new file-

4. Write the coding in source, click on run and analyse the output from
console-
CODING:
library(readxl)
Divya_ttest2 <- read_excel("C:/Users/mrvai/Desktop/divya singh/college/Divya_ttest2.xlsx")
View(Divya_ttest2)
t_test_result<-t.test(Divya_ttest2$'Full Time',Divya_ttest2$`Part Time`)
print(t_test_result)

RESULT:
Welch Two Sample t-test

data: Divya_ttest2$"Full Time" and Divya_ttest2$`Part Time`


t = 0.2555, df = 31.779, p-value = 0.8
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.186656 1.526934
sample estimates:
mean of x mean of y
3.638889 3.468750

DECISION RULE:
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.8) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
The time spent by full time students in studying statistics is not significantly different from the time spent by part time students at α = 0.05.
➢ C. Two Sample t- Test
Problem 1:
Is there sufficient evidence to suggest that the
mean time to exhaustion is greater after chocolate
milk than after carbohydrate replacement drink?
Use a significance level of 0.05. (Use µCM-µCD in
hypothesis statements)
CYCLIST | CHOCOLATE MILK | CARBOHYDRATED REPLACEMENT MILK
1 50.46 32.9
2 47.08 20.1
3 57.51 41.67
4 46.6 32.69
5 49.1 46.33
6 27.5 31.63
7 23.87 50.61
8 28.65 14.99
9 35.37 20.11

SOLUTION:
HYPOTHESIS TESTING-
H0: The mean time to exhaustion is not greater after chocolate milk than after carbohydrate replacement drink.
H0: µCM ≤ µCD
H0: µCM - µCD ≤ 0

H1: The mean time to exhaustion is greater after chocolate milk than after carbohydrate replacement drink.
H1: µCM > µCD
H1: µCM - µCD > 0
Steps:
1. In the file tab, Click on Import Dataset then click From Excel:

2. Click on browse, select file, select sheet and import


3. Click on R script for new file:

4. Write the coding in source, click on run and analyse the output from
console:
CODING:
library(readxl)
Divya_ttest3 <- read_excel("Divya_ttest3.xlsx")
View(Divya_ttest3)
t_test_result<-t.test(Divya_ttest3$`CHOCOLATE MILK`,Divya_ttest3$`CARBOHYDRATED REPLACEMENT MILK`,paired = TRUE,alternative = "greater")
print(t_test_result)

RESULT:
Paired t-test

data: Divya_ttest3$`CHOCOLATE MILK` and Divya_ttest3$`CARBOHYDRATED REPLACEMENT MILK`
t = 1.5783, df = 8, p-value = 0.07657
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
-1.487055 Inf
sample estimates:
mean difference
8.345556

Decision Rule:
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.07657) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
There is not sufficient evidence that the mean time to exhaustion is greater after chocolate milk than after carbohydrate replacement drink at α = 0.05.
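
The test can be reproduced without the Excel file by entering the two columns directly. This sketch uses the values from the problem table; the vector names are illustrative.

```r
chocolate_milk <- c(50.46, 47.08, 57.51, 46.6, 49.1, 27.5, 23.87, 28.65, 35.37)
carb_drink     <- c(32.9, 20.1, 41.67, 32.69, 46.33, 31.63, 50.61, 14.99, 20.11)

# Paired, one-tailed test of H1: mean(chocolate_milk - carb_drink) > 0
result <- t.test(chocolate_milk, carb_drink,
                 paired = TRUE, alternative = "greater")
print(result)

# Decision at the 5% level
alpha <- 0.05
decision <- if (result$p.value < alpha) "Reject H0" else "Do not reject H0"
print(decision)
```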
➢ D. Paired t- Test
Problem 1:
Coaching was given to students for Statistical software after their result was evaluated in January in order to improve their performance in April exams. Determine if the coaching was successful. (α = 0.05)
JAN MAY
45 56
54 57
44 32
56 67
34 44
45 34
34 34
56 76
45 56
54 45
67 55
56 87
56 66
56 65
76 45
45 76

SOLUTION:
HYPOTHESIS TESTING
H0: The coaching was not successful
H0: µMAY ≤ µJAN
H0: µMAY - µJAN ≤ 0

H1: The coaching was successful
H1: µMAY > µJAN
H1: µMAY - µJAN > 0

Steps:-
1. In the File Tab, click Import Dataset, then click From Excel-

2. Click on Browse, select file, select sheet and import-


3. Click R script for new file-

4. Write the coding in source, click on run and analyse the output from
console-
CODING:-
library(readxl)
Divya_ttest4 <- read_excel("C:/Users/mrvai/Desktop/divya singh/college/Divya_ttest4.xlsx")
View(Divya_ttest4)
# H1 is µMAY - µJAN > 0, so MAY is passed first with alternative = "greater"
t_test_result<-t.test(Divya_ttest4$MAY,Divya_ttest4$JAN,paired = TRUE,alternative = "greater")
print(t_test_result)

Result:-
Paired t-test

data: Divya_ttest4$MAY and Divya_ttest4$JAN
t = 1.0885, df = 15, p-value = 0.1468
alternative hypothesis: true mean difference is greater than 0
95 percent confidence interval:
-2.74747 Inf
sample estimates:
mean difference
4.5

Decision Rule:
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.1468) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
The coaching was not successful.
➢ E. Two Sample t Test
Problem 1:
To analyse whether there is a significant difference between the marks scored by class groups A & B in mathematics at α=10%.
GROUP A GROUP B
76 95
55 97
76 87
76 89
89 56
65 98
76 76
88 56
78 76
87 56
87 87
65 76
76 87
89 88
65 76
78 66
69 45
65 76
89 77

SOLUTION:
HYPOTHESIS TESTING
H0: There is no significant difference between the marks scored by class group A and B in mathematics.
H0: µA = µB
H0: µA - µB = 0
H1: There is a significant difference between the marks scored by class group A and B in mathematics.
H1: µA ≠ µB
H1: µA - µB ≠ 0

Steps:-
1. In the File Tab, click Import Dataset, then click From Excel-

2. Click on Browse, select file, select sheet and import-


3. Click R script for new file-

4. Write the coding in source, click on run and analyse the output from
console-
CODING:-
library(readxl)
Divya_ttest <- read_excel("C:/Users/mrvai/Desktop/divya singh/college/Divya_ttest.xlsx",
sheet = "Sheet2")
View(Divya_ttest)
t_test_result<-t.test(Divya_ttest$`GROUP A`,Divya_ttest$`GROUP B`,conf.level = 0.95)
print(t_test_result)

Result:-
Welch Two Sample t-test

data: Divya_ttest$`GROUP A` and Divya_ttest$`GROUP B`


t = -0.18729, df = 31.388, p-value = 0.8526
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-9.382128 7.803181
sample estimates:
mean of x mean of y
76.26316 77.05263

Decision Rule:
If the p-value is less than α, reject Null Hypothesis.
Inference:

Since p (0.8526) is greater than α (0.10), do not reject Null Hypothesis.

Conclusion:
There is no significant difference between the marks scored by class group A and B in mathematics at α = 10%.
➢ F. F Test
Problem 1:
Determine whether or not there is a significant difference between variances of two data sets
GROUP1 GROUP2
150 170
125 165
160 130
130 155
160 125
125 150

SOLUTION:
HYPOTHESIS TESTING
H0: There is no significant difference between variances of two data sets.
H0: σ1² = σ2²
H0: σ1² / σ2² = 1

H1: There is a significant difference between variances of two data sets.
H1: σ1² ≠ σ2²
H1: σ1² / σ2² ≠ 1
STEPS:
1. In the File Tab, click Import Dataset, then click From Excel-

2. Click on Browse, select file, select sheet and import-


3. Click R script for new file-

4. Write the coding in source, click on run and analyse the output from
console-
CODING:-
library(readxl)
Divya_ttest <- read_excel("C:/Users/mrvai/Desktop/Divya singh/college/Divya_ttest.xlsx",
sheet = "Sheet3")
View(Divya_ttest)
var.test(Divya_ttest$GROUP1,Divya_ttest$GROUP2)

RESULT:-
F test to compare two variances

data: Divya_ttest$GROUP1 and Divya_ttest$GROUP2


F = 0.85786, num df = 5, denom df = 5, p-value = 0.8705
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.1200405 6.1305620
sample estimates:
ratio of variances
0.8578554

DECISION RULE:-
If the p-value is less than α, reject Null Hypothesis.
Inference:-
Since p (0.8705) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
There is no significant difference between variances of two data sets.
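
The F test can also be run on the two groups entered by hand, which avoids the Excel import. This sketch uses the values from the problem table; α = 0.05 is assumed since the problem does not state a level.

```r
group1 <- c(150, 125, 160, 130, 160, 125)
group2 <- c(170, 165, 130, 155, 125, 150)

# F test comparing the two variances (H0: ratio of variances = 1)
result <- var.test(group1, group2)
print(result)

# The F statistic is simply the ratio of the sample variances
print(var(group1) / var(group2))

alpha <- 0.05   # assumed significance level
decision <- if (result$p.value < alpha) "Reject H0" else "Do not reject H0"
print(decision)
```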
➢ G. One Way Anova
Problem 1:
The marks for 3 different groups in Economics, Science, History are given. Determine whether there is a significant difference between the means of population.

ECONOMICS SCIENCE HISTORY


45 69 75
53 54 20
54 58 45
53 64 42
43 64 50
44 55 39
56 45 55
52 30
20 20

SOLUTION:
HYPOTHESIS TESTING:
H0: There is no significant difference between the means of the populations.
H0: µ1 = µ2 = µ3
H1: There is a significant difference between the means of the populations.
H1: At least one of the means is different.
STEPS:
1. In the File Tab, click Import Dataset, then click From Excel-

2. Click on Browse, select file, select sheet and import-


3. Click R script for new file-

4. Write the coding in source, click on run and analyse the output from
console-
CODING:
library(readxl)
singh_ttest <- read_excel("C:/Users/mrvai/Desktop/singh/college/singh_ttest.xlsx",
sheet = "Sheet4")
View(singh_ttest)
# one vector of marks per subject
group1=c(singh_ttest$ECONOMICS)
group2=c(singh_ttest$SCIENCE)
group3=c(singh_ttest$HISTORY)
combined_group=data.frame(cbind(group1,group2,group3))
summary(combined_group)
# stack() puts all marks in one column (values) with the group label in another (ind)
stacked_group=stack(combined_group)
anova_result=aov(values~ind,data=stacked_group)
print(summary(anova_result))

RESULT:
Df Sum Sq Mean Sq F value Pr(>F)
ind 2 1125 562.4 3.238 0.0585 .
Residuals 22 3821 173.7

Decision Rule:
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.0585) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
There is no significant difference between the means of the populations.
➢ H. Chi Square Test
Problem 1:
Determine whether brand preference is independent of age group.
AGE/BRAND BRAND 1 BRAND 2 BRAND 3
15-25 75 56 72
26-35 60 40 64
36-45 45 52 50
46-55 55 35 45

SOLUTION:
HYPOTHESIS TESTING:
H0: There is no association between brand preference and age group.
H1: There is an association between brand preference and age group.

STEPS:-
1. In the File Tab, click Import Dataset, then click From Excel-
2. Click on Browse, select file, select sheet and import-

3. Click R script for new file-


4. Write the coding in source, click on run and analyse the output from
console-

CODING:
library(readxl)
Divya_ttest <- read_excel("C:/Users/mrvai/Desktop/divya singh/college/Divya_ttest.xlsx",
sheet = "Sheet5")
View(Divya_ttest)
chi_square_result<-chisq.test(Divya_ttest[,c("BRAND 1","BRAND 2","BRAND 3")])
print(chi_square_result)

RESULT:
Pearson's Chi-squared test

data: Divya_ttest[, c("BRAND 1", "BRAND 2", "BRAND 3")]


X-squared = 6.7163, df = 6, p-value = 0.3479
Decision Rule:
If the p-value is less than α, reject Null Hypothesis.
Inference:
Since p (0.3479) is greater than α (0.05), do not reject Null Hypothesis.

Conclusion:
There is no association between brand preference and age group.
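
The chi-square test can be reproduced by entering the table as a matrix instead of importing it. This is a sketch; the row and column labels are illustrative and α = 0.05 is assumed.

```r
# Observed counts: rows are age groups, columns are brands
brand <- matrix(
  c(75, 56, 72,
    60, 40, 64,
    45, 52, 50,
    55, 35, 45),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    Age   = c("Group 1", "Group 2", "Group 3", "Group 4"),
    Brand = c("BRAND 1", "BRAND 2", "BRAND 3")
  )
)

# Pearson's chi-squared test of independence
result <- chisq.test(brand)
print(result)

# Expected counts under independence, for comparison with the observed table
print(round(result$expected, 2))

alpha <- 0.05   # assumed significance level
decision <- if (result$p.value < alpha) "Reject H0" else "Do not reject H0"
print(decision)
```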
