Ecotrix
Average (mean): can't be used for nominal and ordinal data; used for interval scale data
and ratio scale data (numeric data, usually continuous)
Population: unobservable, unknown
Sample: observable, known
Look up: free ration scheme by govt, India's budget size
Mode: used for qualitative and discrete data
Median:
BLUE (best linear unbiased estimator): among mode, median and mean, the mean is BLUE if unbiased
Expected value: mean of sampling distribution
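A small sketch of "expected value = mean of the sampling distribution", assuming a made-up population of four values (not from the lecture). It lists every possible sample of size 2, takes each sample mean, and checks that the average of those sample means equals the population mean, which is what unbiasedness of the sample mean amounts to.

```r
# Hypothetical toy population (illustrative values only)
population <- c(2, 4, 6, 8)
pop_mean <- mean(population)

# Every possible sample of size 2 (without replacement) and its sample mean
sample_means <- combn(population, 2, FUN = mean)

# Expected value of the sample mean = mean of its sampling distribution
expected_value <- mean(sample_means)

print(sample_means)    # the six possible sample means
print(expected_value)  # equals the population mean
print(pop_mean)
```

With a real population the samples cannot all be listed like this, which is why the population mean stays unknown and has to be estimated from one observed sample.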
Type 1 error: rejecting the null hypothesis (about the population mean) when it is true
Type 2 error: failing to reject (accepting) the null hypothesis (about the population mean) when it is not true
Covariance and correlation: measure the linear relationship between two variables
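A short illustration of why covariance and correlation capture only the linear part of a relationship, using invented vectors (x, y_linear and y_curved are hypothetical names and values). The second relationship is exact but U-shaped, and its correlation with x still comes out as zero.

```r
x <- c(1, 2, 3, 4, 5)
y_linear <- 2 * x        # exact linear relationship
y_curved <- (x - 3)^2    # exact but non-linear (U-shaped) relationship

cov_linear <- cov(x, y_linear)   # positive covariance
cor_linear <- cor(x, y_linear)   # correlation of 1
cor_curved <- cor(x, y_curved)   # correlation of 0 despite the exact relationship

print(cov_linear)
print(cor_linear)
print(cor_curved)
```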
Degrees of freedom meaning***
Sample mean is different than population mean ⇒ may indicate sampling bias
Why GLS?
Independent/ explanatory variables, dependent variables, predictor, exogenous
variables, endogenous variables
Regression analysis: estimation using estimators, elasticity, hypothesis testing,
forecasting/ prediction/ simulation (taking the trend ahead according to the model)
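A minimal sketch of the estimation and forecasting steps with R's built-in lm(), assuming a tiny invented data set (the x and y values are illustrative, not from the lecture). It estimates the intercept and slope by OLS and then extends the fitted line to a new x value.

```r
# Hypothetical data: one explanatory variable x and one dependent variable y
x <- c(1, 2, 3, 4, 5)
y <- c(3.1, 4.9, 7.2, 8.8, 11.0)

# Estimation: ordinary least squares fit of y on x
fit <- lm(y ~ x)
coefs <- coef(fit)
print(coefs)

# Forecasting: take the fitted trend ahead to x = 6
forecast <- unname(predict(fit, newdata = data.frame(x = 6)))
print(forecast)
```

Hypothesis tests on the coefficients would normally be read from summary(fit); that step is left out here to keep the sketch short.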
Econometric models:
1. Deterministic part of model:
a. linear or other relationship among variables
b. variables to be included: dependent on theory being studied, control
for dependent variables influencing explanatory variable, expected
relationship, taking paradoxes into account
c. Causal relationship between variables: does the independent variable
cause the dependent variable?
d. How is the error term to be included?: additive, multiplicative,
probability distribution followed by error term. Why should the error
terms be normally distributed?
2. Data:
a. Experimental vs observational data (sample surveys- NSSO, NFHS)
b. Cross sectional data (census, surveys by government)
c. Time series data
d. Panel data (cross sectional data repeated for the same ID or household
at different points in time)
e. RCTs: to examine CAUSATION, treatment and control group (blind and
double-blind experiments)
f. Pooled cross sectional data: different IDs, same parameters, different
points of time (e.g., NSSO / NFHS over the years)
5. Assessment of Validity:
a. Internal
b. External
R by Neeraj Hatekar
Matrix:
Array: used for multiple explanatory variables, e.g. education wrt age, and gender wrt
age
e.g. emp_id=c(100:104)
emp_name=c("john","henry","adam","ron","gary")
dept=c("sales","finance","marketing","HR","R&d")
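The three vectors above have the same length, so one natural next step is to combine them into a data frame. This is an assumption about where the example was heading, and the object name employees is hypothetical.

```r
emp_id <- c(100:104)
emp_name <- c("john", "henry", "adam", "ron", "gary")
dept <- c("sales", "finance", "marketing", "HR", "R&d")

# Combine the equal-length vectors column-wise into one table
employees <- data.frame(emp_id, emp_name, dept, stringsAsFactors = FALSE)
print(employees)

# Look up one value: name of the employee in the finance department
finance_name <- employees[employees$dept == "finance", "emp_name"]
print(finance_name)
```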
Arithmetic Operators:
1. Addition: a+b
2. Subtraction: a-b
3. Multiplication: a*b
4. Division: a/b
5. Modulus: a%%b (remainder)
6. Exponent: a^b
7. Floor Division: a%/%b (quotient)
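A quick check of the operators listed above with two arbitrary numbers (a and b are illustrative values). The last line shows how modulus and floor division fit together: quotient times divisor plus remainder gives back the original number.

```r
a <- 7
b <- 2

print(a + b)    # addition
print(a - b)    # subtraction
print(a * b)    # multiplication
print(a / b)    # division
print(a %% b)   # modulus: remainder of a divided by b
print(a ^ b)    # exponent
print(a %/% b)  # floor division: quotient of a divided by b

# quotient * divisor + remainder reconstructs a
rebuilt <- (a %/% b) * b + (a %% b)
print(rebuilt)
```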
Relational Operators:
1. Equal to: a==b (not an assignment operator)
2. Not equal to: a!=b
3. Greater than: a>b
4. Less than: a<b
5. Greater than or equal to: a>=b
6. Less than or equal to: a<=b
Logical Operators:
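1. a&b: true if both elements are true (compared element by element)
2. a|b: true if at least one of the elements is true (compared element by element)
3. !a: gives the opposite logical value
4. &&, ||: work on single (length-one) logical values; older versions of R used only the
first elements of the vectors, while R 4.3.0 and later give an error for longer vectors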
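A brief sketch contrasting the element-wise operators with the single-value ones, using invented logical vectors u and v. The && and || forms are applied here to length-one comparisons only, which is how they are typically used inside if conditions.

```r
u <- c(TRUE, FALSE, TRUE)
v <- c(TRUE, TRUE, FALSE)

and_result <- u & v   # element-wise AND
or_result  <- u | v   # element-wise OR
not_result <- !u      # element-wise NOT
print(and_result)
print(or_result)
print(not_result)

# && and || combine single TRUE/FALSE values built from relational operators
a <- 7
b <- 7
both   <- (a >= b) && (a <= b)   # both comparisons hold when a equals b
either <- (a > b) || (a != b)    # neither comparison holds when a equals b
print(both)
print(either)
```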
Conditional Statements:
1. If
2. Else if
3. Else
e.g. a=7
b=7
if (a>b){
print("a is greater than b")
} else if (a<b){
print("a is less than b")
} else {
print("both numbers are equal")
}
Loops:
1. Repeat loop: repeats a statement or group of statements until a break statement is
reached. It is an exit-controlled loop: the code is first executed, and then a condition
(an if inside the loop) is checked to determine whether control should stay inside the
loop or exit from it.
2. While loop: repeats a statement or group of statements while a given condition is true.
It is an entry-controlled loop: the condition is checked first, and only if the
condition is satisfied is control passed inside the loop to execute the code.
3. For loop: repeats a statement or group of statements a fixed number of times, once for
each element of a sequence or vector that we supply. Here we know beforehand the number
of times the code needs to be executed. The execution is otherwise similar to
the while loop.
e.g.
1) FOR LOOP
for (x in 1:10){
print(x)
}
2) FOR LOOP
data<-c(1,2,3,4,5)
for(x in data){
print(x)
}
3) REPEAT LOOP
x=2
repeat {
x=x^2
print(x)
if (x>100){
break
}
}
4) REPEAT LOOP
x=2
repeat {
x=x^2
if (x>100){
print(x)
break
}
}
5) WHILE LOOP (Fibonacci series)
num=1
sumn=0
n=1
print(sumn)
print(num)
while(n<11){
nxt=sumn+num  # next term = sum of the previous two terms
print(nxt)
sumn=num
num=nxt
n=n+1
}
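The difference between an entry-controlled and an exit-controlled loop shows up when the condition is already false at the start. In this sketch (x and the two counters are illustrative names), the while body never runs, while the repeat body runs once before the exit check is reached.

```r
x <- 200   # the condition "x < 100" is false from the start

# Entry-controlled: condition checked before the body, so the body is skipped
while_runs <- 0
while (x < 100) {
  while_runs <- while_runs + 1
  x <- x^2
}

# Exit-controlled: body executes first, exit condition is checked afterwards
repeat_runs <- 0
repeat {
  repeat_runs <- repeat_runs + 1
  if (x >= 100) {
    break
  }
}

print(while_runs)   # number of times the while body ran
print(repeat_runs)  # number of times the repeat body ran
```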
1. MEAN
# Version 1: divide the running total by length(v)
v<-c(1,2,3,4,5,6,7,8,9,10)
add=0
for(x in v){
add=add+x
}
mean=add/length(v)
print(mean)
# Version 2: count the elements inside the loop
v<-c(1,2,3,4,5,6,7,8,9,10)
add=0
n=0
for(x in v){
add=add+x
n=n+1
}
mean=add/n
print(mean)
2. MEDIAN
Sort the data in ascending order. If the number of observations is even, the median is
the average of the two middle values; if it is odd, the median is the middle value.
data<-c(1,2,3,5,4,6,7,8,9,10)
data=sort(data)
if(length(data)%%2==0){
median=(data[length(data)/2]+data[(length(data)/2)+1])/2
}else{
median= data[(length(data)+1)/2]
}
print(median)
3. MODE
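Calculate the frequency of each value, then pick the value(s) with the maximum
frequency (table() already lists the values in sorted order).
data<-c(5,10,15,5,7,10)
y=table(data)  # frequency of each distinct value
y
names(y)[which(y==max(y))]  # value(s) with the highest frequency, returned as text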
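The hand-written mean and median can be cross-checked against R's built-in mean() and median(). Base R has no built-in function for the statistical mode, so the helper find_modes below is a hypothetical name wrapping the same table() idea used above.

```r
v <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
builtin_mean <- mean(v)
builtin_median <- median(c(1, 2, 3, 5, 4, 6, 7, 8, 9, 10))  # sorts internally

# Hypothetical helper: returns every value that has the highest frequency
find_modes <- function(values) {
  freq <- table(values)
  as.numeric(names(freq)[freq == max(freq)])
}
modes <- find_modes(c(5, 10, 15, 5, 7, 10))

print(builtin_mean)
print(builtin_median)
print(modes)   # more than one value when there is a tie
```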
e.g. user-defined function (product of the elements of a vector)
productVect<-function(a){
res<-1
for (e in a){
res<-res*e
}
return(res)
}
A<-c(1:5)
print(productVect(A))
B<-c(1:10)
print(productVect(B))
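As a cross-check for a loop-based product, base R's prod() multiplies all elements of a vector directly. The variable names here are illustrative.

```r
A <- c(1:5)
B <- c(1:10)

product_A <- prod(A)   # 1*2*3*4*5
product_B <- prod(B)   # 1*2*...*10

print(product_A)
print(product_B)
```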