Computer LAB Week#3 - Questions

This document analyzes descriptive statistics and correlations between variables such as CEO salary, return on equity (roe), return on sales (ros), and sales for a sample of firms. EViews matrix operations and statistical functions are used to calculate means, variances, correlations, and other summary statistics.


matrix(7,4) stats ‘declares a 7x4 matrix named stats: one row per statistic (mean, variance, max, min, skewness, kurtosis, obs) and one column per variable (salary, roe, ros, sales)

stats(1,1)=@mean(salary)
stats(1,2)=@mean(roe)
stats(1,3)=@mean(ros)
stats(1,4)=@mean(sales)

stats(2,1)=@var(salary)
stats(2,2)=@var(roe)
stats(2,3)=@var(ros)
stats(2,4)=@var(sales)

stats(3,1)=@max(salary)
stats(3,2)=@max(roe)
stats(3,3)=@max(ros)
stats(3,4)=@max(sales)

stats(4,1)=@min(salary)
stats(4,2)=@min(roe)
stats(4,3)=@min(ros)
stats(4,4)=@min(sales)

stats(5,1)=@skew(salary)
stats(5,2)=@skew(roe)
stats(5,3)=@skew(ros)
stats(5,4)=@skew(sales)

stats(6,1)=@kurt(salary)
stats(6,2)=@kurt(roe)
stats(6,3)=@kurt(ros)
stats(6,4)=@kurt(sales)

For all four random variables, the skewness coefficient is larger than 0 (positive), which suggests a long right tail. For instance, for the variable salary the positive skewness arises from a few firms that pay very large CEO salaries.
For all four random variables, the kurtosis coefficient exceeds 3, so the tails are thicker than those of the Normal distribution, particularly for salary (K = 60.54).

stats(7,1)=@obs(salary)
stats(7,2)=@obs(roe)
stats(7,3)=@obs(ros)
stats(7,4)=@obs(sales)

The number of observations for salary and roe is 209 (one observation per firm), but for ros it is 207 (two observations missing), and for sales it is 208 (one observation missing).

The EViews code for missing data is NA.
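Since @obs counts only the non-missing observations, the number of NA values in each series can be recovered by simple differencing; a minimal sketch (salary is complete, so @obs(salary) equals the number of firms):

scalar ros_missing = @obs(salary) - @obs(ros) ‘209 - 207 = 2 missing ros values
scalar sales_missing = @obs(salary) - @obs(sales) ‘209 - 208 = 1 missing sales value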


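Assuming the matrix has been filled as above, it can be displayed on screen with the show command:

show stats ‘opens the stats matrix in a window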

The entire matrix(7,4) of descriptive statistics:

            salary      roe         ros         sales
Mean        1281.120    17.18421    61.44928    6951.252
Variance    1874320.    72.21779    4657.136    1.13E+08
Maximum     14822.00    56.30000    418.0000    97649.90
Minimum     223.0000    0.500000   -58.00000    175.2000
Skewness    6.854923    1.560820    2.092535    4.990589
Kurtosis    60.54128    6.678555    9.406267    35.17689
Obs         209.0000    209.0000    207.0000    208.0000
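Below we also use roed, which is roe re-expressed in decimal rather than percentage form; a minimal sketch, assuming roe is stored in percentage points:

series roed = roe/100 ‘return on equity in decimals (e.g., 0.17 instead of 17)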

We create four scalar objects (cov1, cov2, corr1, corr2) to store the results:

scalar cov1=@cov(salary, roe) ‘the result is 1336.11
scalar cov2=@cov(salary, roed) ‘the result is 13.36
scalar corr1=@cor(salary, roe) ‘the result is 0.1148
scalar corr2=@cor(salary, roed) ‘the result is 0.1148

The covariance cannot be interpreted in isolation because it is neither bounded nor unit-free: the same pair of variables expressed in different units produces different covariances. We cannot conclude that the dependence between salary and roe is stronger than the dependence between salary and roed, since there is only one underlying random variable (return on equity); the measures roe and roed simply state it in percentage and decimal format, respectively. The correlation is a rescaled version of the covariance that is bounded between -1 and 1 and is unit-free. The degree of dependence between CEO salary and roe (the firm's return on equity) is low at 11.48%.
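The rescaling is corr(X,Y) = cov(X,Y)/(sd(X)·sd(Y)). As a quick sketch, this can be verified directly, assuming both series are complete so that @cov and @var are computed over the same sample with the same divisor:

scalar corr_check = @cov(salary, roe)/(@sqrt(@var(salary))*@sqrt(@var(roe))) ‘equals corr1 = 0.1148 up to rounding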

group variables salary roe ros sales ‘creates a group object named variables
variables.cor corr stat prob ‘the group command cor generates the statistics
                             ‘specified in the subsequent keywords (corr = correlations;
                             ‘stat = significance t-statistics; prob = p-values); the
                             ‘keyword cov (instead of corr) would generate covariances instead

Correlation
t-Statistic
Probability      SALARY      ROE         ROS         SALES
SALARY           1.000000
                 -----
                 -----

ROE              0.118609    1.000000
                 1.706114    -----
                 0.0895      -----

ROS             -0.034413    0.274472    1.000000
                -0.491812    4.076813    -----
                 0.6234      0.0001      -----

SALES            0.119209   -0.050573   -0.137164    1.000000
                 1.714867   -0.723257   -1.977785    -----
                 0.0879      0.4704      0.0493      -----

Alternatively, the following code declares a 4x4 matrix object named corrmat and then stores the correlation matrix in this object:

matrix(4,4) corrmat ‘constructs a matrix object named corrmat
corrmat=@cor(@convert(variables)) ‘calculates the correlations using the @cor function; since
                                  ‘the inputs of this function are column vectors, we turn the
                                  ‘series in the group variables into vectors using @convert
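Individual correlations can then be read off by indexing the matrix; a sketch, assuming the column ordering salary, roe, ros, sales used above:

scalar rho_salary_roe = corrmat(2,1) ‘correlation between salary and roe, about 0.1186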

Let us use the symbol ρ to denote the correlation ( ρ is a parameter).

The null and alternative hypotheses are:

H0: ρ_salary,roe = 0   versus   HA: ρ_salary,roe ≠ 0

The p-value 0.0895 (8.95%) is smaller than the 10% significance level, so H0 is rejected, and we conclude that salary and roe are significantly positively correlated.

By contrast, the p-value 0.6234 (62.34%) for salary and ros is larger than the 10% significance level, so we cannot reject H0 and conclude that salary and ros are not significantly related.
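The t-statistics reported in the table follow the standard formula t = r·√(n−2)/√(1−r²), where n is the number of complete observations in the group's common sample. A minimal sketch, assuming the two missing ros values and the one missing sales value occur in different firms so that n = 209 − 3 = 206:

scalar rho_hat = 0.118609 ‘sample correlation between salary and roe
scalar nobs = 206 ‘complete observations in the common sample
scalar tstat = rho_hat*@sqrt(nobs-2)/@sqrt(1-rho_hat^2) ‘about 1.706, matching the table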

variables.distplot hist ‘plots a histogram for each series in the group variables
About 40 of the 209 firms achieved an roe larger than 15% and smaller than 17% (that is, 40/209 ≈ 19% of the firms).

The histogram of roe shows that the distribution is right-skewed; that is, a few firms have very large roe values.

variables.distplot hist kernel theory(dist=norm) ‘overlays a kernel density estimate and a fitted Normal density on each histogram


Drawing a line around the histogram, the graphs clearly show that the kernel density estimate is a "smoothed" version of this line. In the limit, the histogram becomes the empirical distribution: as N gets larger and larger, say 1,000 or 2,000 firms, the line around the histogram gets closer and closer to the empirical distribution.

Now we plot, for the variable salary, the empirical distribution alongside the Normal and Student's t theoretical distributions:

salary.distplot kernel theory(dist=norm) ‘kernel density with a fitted Normal overlay
salary.distplot kernel theory(dist=tdist) ‘kernel density with a fitted Student's t overlay

Unsurprisingly, the Student’s t distribution provides a better approximation to the true probability
distribution of the random variable salary (the right graph shows a theoretical distribution that lies
closer to the empirical distribution) because, as shown earlier, the kurtosis of salary at K=60.54 is
well in excess of 3, which implies much thicker tails than those of the Normal distribution.

A scatterplot graph plots the observations of one variable against another. In the case of salary and roe, the first point in the scatterplot is the salary of the CEO of firm number one in the sample plotted against the roe of firm number one; the second point pertains to firm number two, and so on.

Using the group object variables that contains salary roe ros sales, we can use this code:

variables.scat(m) ‘displays the scatterplots in multiple frames (roe versus salary,
                  ‘ros versus salary, sales versus salary)

variables.scat(m) linefit ‘displays the same scatterplots in multiple frames alongside the
                          ‘OLS regression line

Alternatively, we can use the following command:

scat(m) salary roe ros sales ‘displays the same scatterplots as the first command above, but this
                             ‘code does not allow adding the OLS regression line

The scatterplot and regression line for salary versus roe are positively sloped, which suggests a positive relation between salary and roe; that is, firms with a higher roe pay higher salaries to their CEO. However, the suggested relationship might not be statistically significant. The graph for salary versus ros instead suggests a negative relationship. The scatterplot does not tell us whether a relationship is statistically significant or not; we need to conduct a formal test for that.

equation model1.ls salary c roe ‘estimates a linear regression of salary on roe with
                                ‘an intercept, and creates an equation object called
                                ‘model1 where all the estimation output is stored; the
                                ‘intercept is saved in C(1) and the slope in C(2)
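Once the equation object exists, its stored output can be pulled into scalar objects through the equation data members (such as @coefs and @r2); a minimal sketch:

scalar b1 = model1.@coefs(1) ‘estimated intercept, the same value as C(1)
scalar b2 = model1.@coefs(2) ‘estimated slope on roe, the same value as C(2)
scalar r2 = model1.@r2 ‘R-squared of the regression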

OR

coef(2) B ‘constructs a 2x1 coefficient vector called B

equation model1.ls salary=B(1)+B(2)*roe ‘this is an alternative way to estimate the
                                        ‘regression by specifying the equation directly;
                                        ‘the coefficients are stored in B
(NOTE: This approach of specifying the model equation explicitly is useful when one wants to specify a nonlinear-in-parameters model, e.g. salary = β1 + β2·roe^β3 + ε, written as salary=B(1)+B(2)*roe^B(3) with the coefficient vector declared as coef(3) B, or a restricted model, that is, a model where a specific parameter is restricted to attain a specific value.)
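For instance, a restricted model can be estimated by hard-coding the restricted value into the equation; a minimal sketch, using the purely hypothetical restriction that the slope on roe equals 20:

coef(1) BR ‘only one free parameter remains: the intercept
equation modelr.ls salary = BR(1) + 20*roe ‘slope on roe restricted to equal 20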

model1.results ‘shows the estimation results on the screen

equation model2.ls salary roe ‘estimates another regression without intercept



Hint: create two scalar objects to store each of the two predictions.
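A minimal sketch of this hint, assuming the exercise asks for the predicted salary at two illustrative roe values (here 10 and 20 percent; substitute the values given in the question):

scalar pred1 = model1.@coefs(1) + model1.@coefs(2)*10 ‘predicted salary when roe = 10
scalar pred2 = model1.@coefs(1) + model1.@coefs(2)*20 ‘predicted salary when roe = 20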

equation model1.ls salary c roe


model1.results

equation model2.ls salary c roe ros


model2.results

equation model3.ls salary c roe ros sales


model3.results

model3.wald c(2)=0, c(3)=0, c(4)=0 ‘joint test that all three slope coefficients are zero

model3.cinterval 0.95 ‘obtain 95% confidence intervals for the model parameters

model3.wald c(4)=0 ‘conducts a test for the null hypothesis C(4)=0



equation model4.ls salary c roe sales


model4.results

equation model5.ls salary c log(roe) log(sales)


model5.results

equation model6.ls salary c roe^2 sales^2


model6.results

model5.wald c(2)=c(3) ‘conducts a test of the specified null hypothesis, reporting exact
                      ‘tests (t-test/F-test) and asymptotic tests (Wald test)

model4.wald c(2)=20, c(3)=0.5 ‘test several restrictions (joint/multiple hypotheses test)

model4.resids(g) ‘graphs the actual Y, the fitted values Ŷ, and the residuals ε̂ = Y − Ŷ


model4.resids(t) ‘tabulates actual, predicted and residuals
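The residuals and fitted values can also be saved as series for further analysis; a sketch using the equation procs makeresids and fit (the series names ehat4 and yhat4 are arbitrary):

model4.makeresids ehat4 ‘saves the residuals of model4 in a series named ehat4
model4.fit yhat4 ‘saves the fitted values of model4 in a series named yhat4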

You might also like