Example Report

This document describes using descriptive statistics and analysis of variance to analyze a dataset. Various descriptive statistics are calculated for variables by group including mean, median, standard deviation, and through boxplots. Normality and homogeneity of variance are also checked before conducting one-way ANOVA tests to determine significance of differences between groups.

Uploaded by

Trần Thảo

Question 1: Produce descriptive statistics to summarize the data.

You are expected to generate as many relevant descriptive statistics as possible using ALL the
relevant tools introduced in the labs of this course. Remember to provide appropriate
interpretations for the descriptive statistics. Try not to include unnecessary or irrelevant
descriptive statistics.

We compute the descriptive statistics in RStudio. First, we import the data file "Dataset4.csv"
into R for further analysis:

➢ Dataset4 <- read.table("Dataset4.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)

The data set contains 180 observations, so we first inspect the first few rows with the head()
function to get a feel for the data:

Figure 1: Some first observations of the data set

From the previous output, we can see that there are 180 observations of 3 variables: roa, own,
and province. Because own and province are not yet stored as factors, we convert them with the
following R code:

➢ Dataset4$own <- factor(Dataset4$own, levels = c("state-owned", "private-owned"))

➢ Dataset4$province <- factor(Dataset4$province, levels = c("1", "2", "3"), labels = c("Hanoi", "Danang", "Hochiminh"))

After that, we use the str() function again to obtain the new structure of the data, with "own"
and "province" converted into factors:

➢ str(Dataset4)
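
As a small illustration of how factor() maps raw values to levels, the sketch below uses made-up vectors rather than the report's data. The variable names and the mismatched level names ("public", "private") are hypothetical; the point is that levels which do not occur in the data silently turn every value into NA.

```r
# Hypothetical raw values: integer province codes and text ownership labels
province_raw <- c(1, 2, 3, 1, 2, 3)
own_raw <- c("state-owned", "private-owned", "state-owned",
             "private-owned", "state-owned", "private-owned")

# levels must match the values actually present; labels rename them
province <- factor(province_raw, levels = c(1, 2, 3),
                   labels = c("Hanoi", "Danang", "Hochiminh"))
own <- factor(own_raw, levels = c("state-owned", "private-owned"))

# Levels that never occur in the data produce NA for every observation
own_bad <- factor(own_raw, levels = c("public", "private"))

print(table(own, province))
print(sum(is.na(own_bad)))
```

Checking for NA values right after the conversion is a cheap way to catch a mistyped level name.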

Figure 2: Structure of the data when factors have been converted

The following table() function: tableName <- table(row variable, column variable) can be used
to generate a frequency table to determine the sample size of each treatment group:

➢ table(Dataset4$own,Dataset4$province)

Figure 3: Frequency table of sample size

All six treatment groups have the same sample size of 30, so the design is balanced, which is
the ideal case for a two-way ANOVA test. Next, we use the by() function to obtain descriptive
statistics (mean, standard deviation, and a summary) for each treatment group defined by the
two factors:

➢ by(Dataset4$roa,list(Dataset4$own,Dataset4$province),mean)

Figure 5: Mean of the data set

➢ by(Dataset4$roa,list(Dataset4$own,Dataset4$province),sd)

Figure 6: Standard deviation of the data set

➢ by(Dataset4$roa,list(Dataset4$own,Dataset4$province),summary)
Figure 7: Summary of the data set

Each command returns a specific descriptive statistic of the outcome variable for every
treatment group defined by own and province. In particular, summary() reports six values for
each group: the minimum, first quartile, median, mean, third quartile, and maximum.
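
The grouped statistics can also be collected into a single table. The following is a minimal sketch on simulated data (not the report's dataset); the data frame `sim` and its values are assumptions made only so the example runs on its own.

```r
set.seed(1)  # simulated data, NOT the report's dataset
sim <- expand.grid(rep = 1:30,
                   own = c("state-owned", "private-owned"),
                   province = c("Hanoi", "Danang", "Hochiminh"))
sim$roa <- rnorm(nrow(sim), mean = 0.03, sd = 0.1)

# Balanced-design check: every own x province cell should hold 30 rows
cell_n <- table(sim$own, sim$province)
print(cell_n)

# One row per treatment group with mean, median and SD side by side
stats <- aggregate(roa ~ own + province, data = sim,
                   FUN = function(x) c(mean = mean(x), median = median(x), sd = sd(x)))
print(stats)
```

Compared with three separate by() calls, aggregate() keeps the six groups aligned in one data frame, which makes the statistics easier to compare.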

Next, we draw a boxplot and a mean plot to examine the findings more closely:

➢ boxplot(roa ~ interaction(own, province), data = Dataset4, xlab = "Ownership and Province", ylab = "ROA", col = c("red", "blue", "yellow", "pink", "gray", "purple"))

Figure 8 : Boxplot
The box plots in Figure 8 show, for each of the six treatment groups, the minimum and maximum
values, the quartiles, the median, and any outliers of ROA, so both the spread and the location
of each group can be compared. The boxes differ noticeably in shape and position. As shown by
the black horizontal line in each box, state-owned firms in Ho Chi Minh City have the highest
median ROA, whereas privately owned firms in Ho Chi Minh City have the lowest median ROA.

The mean plot can be used to determine the mean value as well as the mean comparison between
treatment groups:

➢ install.packages("gplots")

➢ library(gplots)

➢ plotmeans(roa ~ interaction(own, province), data = Dataset4, xlab = "Ownership and Province", ylab = "ROA", main = "Mean Plot with 95% CI")

Figure 9 : Mean plot with 95% CI

Each of the six groups in the mean plot is drawn with its 95% confidence interval. The plotted
means are the same values returned by the by() call for means. The figure suggests a marked
disparity between the mean ROA of firms in Ho Chi Minh City and those in the other two cities.
Although the means of all groups lie between 0.01 and 0.05, privately owned enterprises in Ho
Chi Minh City have the lowest mean in the sample. The six groups clearly have different sample
means; whether these differences are statistically significant, and whether the ANOVA
assumptions hold, is examined in the following questions.
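
To make the error bars concrete, here is a minimal sketch of a 95% confidence interval for one group mean, assuming the usual t-based interval computed separately for each group. The values are simulated for a single hypothetical group of 30 firms.

```r
set.seed(2)  # simulated ROA values for one hypothetical group of 30 firms
roa <- rnorm(30, mean = 0.03, sd = 0.08)

n <- length(roa)
m <- mean(roa)
se <- sd(roa) / sqrt(n)
half_width <- qt(0.975, df = n - 1) * se   # t critical value times standard error
ci <- c(lower = m - half_width, upper = m + half_width)
print(ci)

# Cross-check against the interval reported by a one-sample t-test
ci_ttest <- t.test(roa)$conf.int
print(ci_ttest)
```

With only 30 observations per group, the t critical value is slightly larger than the normal value 1.96, so the intervals are slightly wider than a normal approximation would give.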

Question 2: Use analysis of variance to test for any significant differences due to province. Use a
.05 level of significance, and for now, ignore the effect of types of ownership. Check all the
assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

Step 1: Hypothesis

- Ho: All population means are equal (μ1 = μ2 = ... = μk)

- Ha: At least two population means are different

Step 2: Test statistics

Before conducting a one-way ANOVA, we must check to ensure that three assumptions are met.

Assumption 1: Samples are independent, simple random samples

Firstly, we divide the total sample size into three group factors that are not influenced by the
other. Furthermore, the responders were also recorded separately. As a result, we can conclude
that samples are independent and randomly selected.

Assumption 2: All populations in question are normally distributed

We check this assumption with a Q-Q plot of the residuals, produced by the qqPlot() function
from the {car} package:

➢ question2 <- read.table("Datasets.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)

➢ str(question2)

➢ question2$province <- factor(question2$province, levels = c("1", "2", "3"), labels = c("HaNoi", "DaNang", "HoChiMinh"))

➢ question2$province

➢ library(car)

➢ qqPlot(lm(roa ~ province, data = question2), simulate = TRUE, main = "Q-Q Plot", labels = FALSE)

With a sample size of 180, the normality of the residuals can be assessed with a normal Q-Q
plot, which compares the sample quantiles with those of a perfectly normal distribution. Almost
all points lie close to the straight line, so we treat the normality assumption as reasonably
satisfied.
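
The visual check can be complemented by a number. The sketch below, on simulated data rather than the report's dataset, extracts the coordinates a Q-Q plot would draw and computes their correlation; a value close to 1 means the points lie close to a straight line. Using this correlation as a summary is an illustrative choice, not part of the original analysis.

```r
set.seed(3)  # simulated data, not the report's dataset
sim <- data.frame(
  province = factor(rep(c("Hanoi", "Danang", "Hochiminh"), each = 60)),
  roa = rnorm(180, mean = 0.03, sd = 0.1)
)

fit <- lm(roa ~ province, data = sim)
res <- residuals(fit)

# Theoretical vs. sample quantiles, i.e. the coordinates a Q-Q plot would draw
qq <- qqnorm(res, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)   # close to 1 when the points hug a straight line
print(qq_cor)
```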

Assumption 3: All populations have the same standard deviation

⮚ by(dataset4$roa, dataset4$province, sd)

dataset4$province: Hanoi
[1] 0.06334152

dataset4$province: Danang
[1] 0.08371056

dataset4$province: Hochiminh
[1] 0.2478101
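
The rule-of-thumb ratio of the largest to the smallest standard deviation can be reproduced directly from the three values printed above. This is only a worked check of the arithmetic; the vector name `sds` is an assumption.

```r
# Group standard deviations as reported in the by() output
sds <- c(Hanoi = 0.06334152, Danang = 0.08371056, Hochiminh = 0.2478101)

ratio <- max(sds) / min(sds)
print(ratio)

# Rule of thumb: SDs are treated as similar when the ratio is below 2
similar_sds <- ratio < 2
print(similar_sds)
```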

The ratio Largest SD / Smallest SD = 0.2478101/0.06334152 = 3.912285 is greater than 2, so the
rule of thumb for equal standard deviations is not met and we check the third assumption
formally using Levene’s test. This test has the following hypotheses:

- Ho: The variance among the groups is equal.

- Ha: The variance among the groups is not equal.

If the p-value from the test is less than our chosen significance level 𝛼= .05, we can reject the
null hypothesis and conclude that we have enough evidence to state that the variance among the
groups is not equal. In R, this test can be performed thanks to the leveneTest() function from the
{car} package we have installed earlier.

⮚ leveneTest(question2$roa, question2$province, center = median)

Levene's Test for Homogeneity of Variance (center = median)

       Df F value  Pr(>F)
group   2  5.8767 0.00338 **
      177

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
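The p-value of the test is 0.00338, which is smaller than our significance level of 0.05, so we
reject the null hypothesis of equal variances: strictly speaking, the third assumption is not
satisfied. For the purposes of this analysis we nevertheless proceed as if the group variances
were equal, so the ANOVA result below should be interpreted with caution.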
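
For readers who want to see what the median-centered Levene test computes, here is a minimal base-R sketch on simulated data with deliberately unequal spread. It relies on the fact that this version of the test is a one-way ANOVA on the absolute deviations from each group's median; the data frame `sim` and the column `abs_dev` are assumptions for illustration.

```r
set.seed(4)  # simulated data with deliberately unequal spread
sim <- data.frame(
  province = factor(rep(c("Hanoi", "Danang", "Hochiminh"), each = 60)),
  roa = c(rnorm(60, 0.03, 0.06), rnorm(60, 0.03, 0.08), rnorm(60, 0.03, 0.25))
)

# Absolute deviation of each observation from its own group median
group_median <- ave(sim$roa, sim$province, FUN = median)
sim$abs_dev <- abs(sim$roa - group_median)

# A one-way ANOVA on those deviations gives the Levene F statistic and p-value
lev <- anova(lm(abs_dev ~ province, data = sim))
print(lev)
lev_p <- lev[["Pr(>F)"]][1]
```

Centering on the median rather than the mean makes the test less sensitive to skewed data and outliers.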

The ANOVA test

Having checked the three assumptions, and keeping in mind the caveat about unequal variances,
we now run the ANOVA test. The ANOVA command is as follows:

⮚ aov1 <- aov(roa ~ province, data = question2)

⮚ summary(aov1)

             Df Sum Sq Mean Sq F value Pr(>F)
province      2  0.166 0.08303   3.439 0.0343 *
Residuals   177  4.273 0.02414

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step 3: Level of significance

The level of significance: α=0.05

Step 4: Decision rule

We will reject Ho if p-value ≤ α.

Step 5: Value of test statistic

For one-way ANOVA, the decision rule is that the null hypothesis is rejected if the p-value
does not exceed alpha. From summary(aov1), the test statistic is F = 3.439 with a p-value of
0.0343, which is smaller than alpha = 0.05. Therefore, we reject the null hypothesis.
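
The F statistic and p-value can be recomputed from the mean squares in the ANOVA table. The sketch below copies the rounded values from the output, so the results agree with the table only up to rounding.

```r
# Values copied from the ANOVA table for province
ms_between <- 0.08303   # Mean Sq for province
ms_within  <- 0.02414   # Mean Sq for residuals
df1 <- 2
df2 <- 177

f_stat <- ms_between / ms_within
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)   # upper-tail area of F(2, 177)
print(c(F = f_stat, p = p_value))

reject_h0 <- p_value <= 0.05
print(reject_h0)
```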

Step 6: Conclusion
A one-way analysis of variance was performed to compare the effect of province on the
profitability (ROA) of businesses. Given the result in Step 5, we have enough evidence to
conclude that mean profitability differs significantly between at least two provinces.

Question 3: Use analysis of variance to test for any significant differences due to types of
ownership. Use a .05 level of significance, and for now, ignore the effect of province. Check all
the assumptions of the inference technique you use. Are the assumptions satisfied? Explain.

Step 1: Hypothesis
- Ho: All population means are equal (μ1 = μ2 = ... = μk)
- Ha: At least two population means are different
Step 2: Assumptions
Based on the insights and knowledge we learned in the BES course, the One-Way Analysis of
Variance is the best inference approach to resolving this question in our case study. Before we
can conduct a one-way ANOVA, we must first check to make sure that three assumptions are
met.
1. Samples are independent, simple random samples of data from the population.
2. The dependent variables for each group are normally distributed.
3. The variances of the populations that the samples come from are equal.
The first assumption can only be satisfied if a randomized design is carried out. As we know,
this survey was conducted on more than 2 million enterprises in all regions of the country in
2004. The questionnaire contains many parts, each related to a different aspect of business.
From that, we conclude that the first assumption is valid.
To check the second assumption, we use Q-Q plots. First, we import the CSV data file into
RStudio and assign it to question3:
⮚ question3 <- read.table("datasets.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
⮚ str(question3)
'data.frame': 180 obs. of 3 variables:
 $ roa     : num 9.75e-03 4.36e-05 1.81e-03 9.01e-04 7.36e-03 ...
 $ own     : chr "state-owned" "state-owned" "state-owned" "state-owned" ...
 $ province: int 1 1 1 1 1 1 1 1 1 1 ...
With the use of the Q-Q plot, we can graphically confirm the normality of the data. For each
sample, a unique Q-Q plot can be generated, allowing us to evaluate whether or not they are all
normally distributed. As an alternative, we can develop the following plot to analyze the
residuals' normality. To access the Q-Q plot function, install the {car} package.
⮚ install.packages("car")
⮚ library(car)
⮚ qqPlot(lm(roa ~ own, data = question3), simulate = TRUE, labels = FALSE)
Below is the graphical output after we run the code in R:

In a Q-Q plot, if the data points align along a straight diagonal line, the dataset appears to follow
a normal distribution. We can observe that, with only a few minor deviations along each of the
tails, the points are mainly located along the straight diagonal line. We can confidently assume
that this data set is normally distributed based on the plot.
Next, we compare the standard deviations of the two ownership groups:
⮚ by(Dataset4$roa, Dataset4$own, sd)
Dataset4$own: state-owned
[1] 0.08038714

Dataset4$own: private-owned
[1] 0.2049527
Because Largest SD / Smallest SD = 0.2049527/0.08038714 = 2.549571 is greater than 2, we check
the third assumption using Levene’s test.
This test has the following hypotheses:
- Ho: The variance among the groups is equal.
- Ha: The variance among the groups is not equal.
If the p-value from the test is less than our chosen significance level 𝛼= .05, we can reject the
null hypothesis and conclude that we have enough evidence to state that the variance among the
groups is not equal. In R, this test can be performed thanks to the leveneTest() function from the
{car} package we have installed earlier.
⮚ leveneTest(roa ~ own, question3)
The output is shown below:
Levene's Test for Homogeneity of Variance (center = median)
       Df F value Pr(>F)
group   1  1.2044 0.2739
      178
The p-value of the test is 0.2739, which is higher than our significance level of 0.05. We
therefore do not have enough evidence to reject the null hypothesis of equal variances, and we
treat the third assumption as reasonable.
The ANOVA test
After all three assumptions are satisfied, we now run the ANOVA test. The ANOVA command is
as follows:
⮚ aov(roa ~ own, data = question3)
where ‘roa’ is the dependent variable and ‘own’ is the independent variable. The final argument
is the name of the data frame being analyzed.
Call:
   aov(formula = roa ~ own, data = question3)
Terms:
                     own Residuals
Sum of Squares  0.125775  4.313627
Deg. of Freedom        1       178
Residual standard error: 0.1556723
Estimated effects may be unbalanced
We then fit the one-way ANOVA with roa as the outcome variable and own as the factor, store it
in aov1, and view the results with the summary() command.
⮚ aov1 <- aov(roa ~ own, data = question3)
⮚ summary(aov1)
             Df Sum Sq Mean Sq F value Pr(>F)
own           1  0.126 0.12577    5.19 0.0239 *
Residuals   178  4.314 0.02423
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step 3: Level of significance

The level of significance: α=0.05

Step 4: Decision rule

We will reject Ho if p-value ≤ α.

Step 5: Value of test statistic

The p-value is 0.023898. We can confirm the p-value = 0.023898 < 𝛼 =0.05 and state that the
null hypothesis is rejected.
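
Because ownership has only two levels, the one-way ANOVA here is equivalent to a pooled-variance two-sample t-test: the F statistic equals the squared t statistic and the p-values coincide. The sketch below checks this on simulated data (not the report's dataset); the data frame `sim` is an assumption.

```r
set.seed(5)  # simulated data, not the report's dataset
sim <- data.frame(
  own = factor(rep(c("state-owned", "private-owned"), each = 90)),
  roa = c(rnorm(90, 0.05, 0.1), rnorm(90, 0.02, 0.1))
)

anova_tab <- anova(aov(roa ~ own, data = sim))
f_stat <- anova_tab[["F value"]][1]
p_anova <- anova_tab[["Pr(>F)"]][1]

tt <- t.test(roa ~ own, data = sim, var.equal = TRUE)   # pooled-variance t-test
t_sq <- unname(tt$statistic)^2

print(c(F = f_stat, t_squared = t_sq, p_anova = p_anova, p_t = tt$p.value))
```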

Step 6: Conclusion
A one-way analysis of variance was performed to compare the effect of type of ownership on the
profitability of businesses. It revealed a statistically significant difference in mean roa
between the two groups (F(1, 178) = 5.1906, p = 0.023898). There is enough evidence to conclude
that profitability differs significantly between types of ownership.
Question 4: At the .05 level of significance test for any significant differences due to province,
types of ownership, and interaction. Check all the assumptions of the inference technique you
use. Are the assumptions satisfied? Explain.

I-Assumptions:
(1) Samples are independent, simple random samples of size n
(2) All populations have the same standard deviation
(3) All populations are normally distributed
II- Checking process
(1) Samples are independent, simple random samples of size n

(2) All populations have the same standard deviation

⮚ by(Dataset$roa, list(Dataset$own, Dataset$province), sd)

: private-owned
: province 1
[1] 0.01813046

: state-owned
: province 1
[1] 0.08496648

: private-owned
: province 2
[1] 0.1124882

: state-owned
: province 2
[1] 0.03961233

: private-owned
: province 3
[1] 0.323886

: state-owned
: province 3
[1] 0.1048471

Largest SD / Smallest SD = 0.323886/0.01813046 = 17.86419098, which is far greater than 2, so
pooling the variances is questionable. We therefore apply Levene’s test:

⮚ leveneTest(Dataset$roa, interaction(Dataset$own, Dataset$province), center = median)

Levene's Test for Homogeneity of Variance (center = median)

       Df F value   Pr(>F)
group   5  3.2052 0.008563 **
      174
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
The p-value is 0.008563 < 0.05, so we reject Ho: the standard deviations are not equal and the
second assumption is not satisfied. As in Question 2, we nevertheless proceed as if the group
variances were equal, and interpret the results with caution.
(3) All populations are normally distributed

- The points lie mostly along the straight diagonal line, with some minor deviations in each of
the tails, so we can reasonably assume that the data are normally distributed.
- The residuals appear normally distributed apart from a single outlier (point 165); with that
outlier removed, the normality of the residuals can be assumed.

III- Perform the inference technique

First, we test for a significant interaction, because if the interaction effect is significant,
the main effects should not be interpreted on their own. We use the two-way ANOVA test
mentioned in Question 1, with a significance level of 0.05.
Step 1: Identify Hypothesis Test
Hypothesis testing for interaction factor:
- Ho: There is not a significant interaction between the types of ownership and the Province.
- Ha: There is a significant interaction between the types of ownership and the Province.
Step 2: Test statistic
Check assumptions:
(1) Samples are independent, simple random samples of size n
(2) All populations have the same standard deviation
(3) All populations are normally distributed
Test statistic and p-value:
We used RStudio to compute the test and obtained the following output:

⮚ Dataset.result <- aov(roa ~ own*province, data = Dataset)

⮚ summary(Dataset.result)

              Df Sum Sq Mean Sq F value Pr(>F)
own            1  0.126 0.12577   5.482 0.0203 *
province       2  0.166 0.08303   3.619 0.0289 *
own:province   2  0.155 0.07763   3.383 0.0362 *
Residuals    174  3.992 0.02294
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Step 3: Level of significance

The level of significance: α=0.05
Step 4: Decision rule
We will reject Ho if p-value ≤ α.
Step 5: Value of test statistic
To test the interaction between the types of ownership and the Province, we got: p-value=0.0362
< α=0.05 => Reject Ho
Step 6: Conclusion
There is enough evidence to conclude that the interaction between the types of ownership and the
Province is significant.
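
The structure of the two-way ANOVA table can be reproduced on simulated data. The sketch below builds a balanced 2 x 3 design with 30 observations per cell (the data are simulated, not the report's) and shows how to pull out the interaction p-value and confirm that the sequential sums of squares add up to the total sum of squares.

```r
set.seed(6)  # simulated balanced design: 2 ownership types x 3 provinces x 30 firms
sim <- expand.grid(rep = 1:30,
                   own = c("state-owned", "private-owned"),
                   province = c("1", "2", "3"))
sim$roa <- rnorm(nrow(sim), mean = 0.03, sd = 0.1)

fit <- aov(roa ~ own * province, data = sim)
tab <- anova(fit)   # rows: own, province, own:province, Residuals
print(tab)

# The sequential sums of squares add up to the total sum of squares
ss_total <- sum((sim$roa - mean(sim$roa))^2)
p_interaction <- tab["own:province", "Pr(>F)"]
print(p_interaction)
```

The degrees of freedom (1, 2, 2 and 174) match those in the report's output because the design has the same shape; the sums of squares and p-values differ because the data are simulated.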

Question 5: Draw an interaction plot and interpret the plot. Is the plot consistent with the
conclusions made in Question 4?

An interaction plot is a graphical way of determining whether a two-factor interaction is
present. It is created with the interaction.plot() function as follows:

⮚ interaction.plot(Dataset$province, Dataset$own, Dataset$roa, type = "b", col = c("red", "blue"), pch = c(16, 18), main = "Interaction between own and province")
As can be observed from the plot, the two lines are clearly not parallel, which points to an
interaction between "province" and "own". In other words, the difference in ROA between the two
types of ownership depends on the province, so it is important to choose the right type of
ownership for the location. State-owned enterprises show fairly stable profitability across
locations, plausibly due to government or local support. In contrast, the ROA of private
businesses varies greatly, because it depends on whether the location offers favorable
conditions such as tax policy, government incentives, and culture.

In general, the ROA of state-owned enterprises tends to be stable regardless of province,
whereas the ROA of private enterprises fluctuates widely. In province 1, there is a difference
in ROA between the two types of ownership: the ROA of state-owned firms is the highest among
the three provinces, and that of private firms is not too low. In province 2, the ROA of
private firms is the highest of the three provinces, suggesting that private businesses there,
besides managing profit efficiently, also benefit from favorable local conditions. This
province also has the smallest gap in ROA between the two types of ownership, which may
indicate that it is the most economically developed of the three areas. In province 3, the ROA
of private firms drops sharply to its lowest level, which signals poor profit management and
few conditions supporting the development of private enterprises.
From the graph, we can see that the plot is consistent with the conclusion made in Question 4,
namely that the interaction between "own" and "province" is significant.
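
The points joined in an interaction plot are simply the cell means, so the "non-parallel lines" reading can be checked numerically. The sketch below uses simulated data with a built-in interaction; the cell means in `cell_mu` are hypothetical and are not the report's values.

```r
set.seed(7)  # simulated data with a built-in interaction, not the report's dataset
sim <- expand.grid(rep = 1:30,
                   own = c("state-owned", "private-owned"),
                   province = c("1", "2", "3"))
# Hypothetical cell means: the ownership gap changes sign across provinces
cell_mu <- c(0.06, 0.03, 0.04, 0.07, 0.05, -0.05)
sim$roa <- rnorm(nrow(sim), mean = rep(cell_mu, each = 30), sd = 0.02)

# The points an interaction plot connects: mean ROA per own x province cell
cell_means <- tapply(sim$roa, list(sim$own, sim$province), mean)
print(round(cell_means, 3))

# Parallel lines would give the same ownership gap in every province
gap <- cell_means["state-owned", ] - cell_means["private-owned", ]
print(round(gap, 3))
```

If the gap between the two ownership types is roughly constant across provinces, the lines are close to parallel; a gap that changes size or sign is what an interaction looks like.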
