Bioestadistica: Clara Carner 2023-05-29
Clara Carner
2023-05-29
set.seed(2)
X1<-c(rep(1,500),rep(0,500))
X2<-c(rep(0,250),rep(1,500),rep(0,250)) #all the combinations of 1 and 0
z<-0.1+0.5*X1+0.7*X2
p<-exp(z)/(1+exp(z)) # logistic function: probability of having the disease given X1 and X2
Y<-rbinom(1000,1,p)
output<-glm(Y~X1+X2, family=binomial) # glm is used to fit generalized linear models
summary(output)
##
## Call:
## glm(formula = Y ~ X1 + X2, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.7555 -1.2216 0.6943 0.9345 1.1338
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.1034 0.1125 0.919 0.358210
## X1 0.4989 0.1369 3.645 0.000268 ***
## X2 0.6976 0.1373 5.079 3.79e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1279.4 on 999 degrees of freedom
## Residual deviance: 1240.0 on 997 degrees of freedom
## AIC: 1246
##
## Number of Fisher Scoring iterations: 4
glm fits generalized linear models, specified by giving a symbolic description of the linear predictor and a
description of the error distribution. Given the y's and the x's, we look for the betas.
- The Estimate column gives the estimated values of the betas.
• The z value column is the Wald test statistic. For example, to test B2=0: estimate of b2 / std. error of b2 = z value.
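The Wald statistic described above can be reproduced by hand from the coefficient table. The following is a minimal sketch that refits the same simulated model; the names `coefs`, `wald_z` and `wald_p` are introduced here for illustration and are not part of the original notes.

```r
# Re-create the simulated data and the fit from the code above
set.seed(2)
X1 <- c(rep(1, 500), rep(0, 500))
X2 <- c(rep(0, 250), rep(1, 500), rep(0, 250))
z <- 0.1 + 0.5 * X1 + 0.7 * X2
p <- exp(z) / (1 + exp(z))
Y <- rbinom(1000, 1, p)
output <- glm(Y ~ X1 + X2, family = binomial)

# Coefficient table with columns Estimate, Std. Error, z value, Pr(>|z|)
coefs <- coef(summary(output))

# Wald statistic for H0: beta_j = 0 is estimate / standard error
wald_z <- coefs[, "Estimate"] / coefs[, "Std. Error"]

# Two-sided p-value from the standard normal distribution
wald_p <- 2 * pnorm(-abs(wald_z))

# Hand-computed values next to the ones reported by summary()
cbind(wald_z, reported_z = coefs[, "z value"],
      wald_p, reported_p = coefs[, "Pr(>|z|)"])
```

The hand-computed columns should agree with the `z value` and `Pr(>|z|)` columns, which is a quick way to see that the table reports one Wald test per coefficient.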
# EXERCISES
## Exercise 5 ##
# (d) Test the null hypothesis of HWE using R (see lecture)
# 150 GG, 40 GA, 10 AA
# Install the library if it is not already available
if(!require(HardyWeinberg)){ install.packages("HardyWeinberg");require(HardyWeinberg)}
# Vector of genotype counts
x<-c(GG=150, GA=40, AA=10)
# Perform the test
HW.test<-HWChisq(x,cc=0, verbose=TRUE) # did not run for me
# H0 is rejected at the 5% level
# The HW equilibrium does not hold
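Since the package call did not run in the original notes, the conclusion can also be checked in base R. The sketch below computes the Pearson chi-square statistic for Hardy-Weinberg equilibrium by hand, without continuity correction (matching `cc=0`) and with 1 degree of freedom. The names `p_G`, `q_A`, `expected`, `chi2` and `p_value` are illustrative and not from the notes.

```r
# Observed genotype counts from the exercise
x <- c(GG = 150, GA = 40, AA = 10)
n <- sum(x)

# Estimated frequency of allele G: each GG carries two copies, each GA one
p_G <- unname((2 * x["GG"] + x["GA"]) / (2 * n))
q_A <- 1 - p_G

# Expected genotype counts under Hardy-Weinberg equilibrium
expected <- n * c(GG = p_G^2, GA = 2 * p_G * q_A, AA = q_A^2)

# Pearson chi-square statistic without continuity correction, 1 df
chi2 <- sum((x - expected)^2 / expected)
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

chi2
p_value
```

The statistic comes out at about 9.3, well above the 5% critical value of 3.84 for 1 degree of freedom, so this check is consistent with rejecting H0.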
## Exercise 4 ##
# Clean the R environment
rm(list=ls())
# Exposure probability
p_exp<-0.25
# Disease probability given the exposure
p_d_exp<-0.25
# Disease probability given the subject is not exposed
p_d_notexp<-0.5
(a) Give the odds ratio that D will occur for E versus non E in this
population.
odds_ratio<-(p_d_exp/(1-p_d_exp))/(p_d_notexp/(1-p_d_notexp))
(b) Compute the probability of the disease in this population.
# Use the law of total probability
# P(D) = P(D|E)P(E) + P(D|not E)P(not E)
p_d<-p_d_exp*p_exp+p_d_notexp*(1-p_exp)
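As a sanity check on parts (a) and (b), the exact values can be compared with a simulated population. This is only a sketch: the seed, the simulation size `n_sim`, and the names `exposed`, `disease`, `p_d_hat`, `tab` and `or_hat` are assumptions introduced here.

```r
set.seed(1)  # arbitrary seed, chosen only for reproducibility

# Parameters from the exercise
p_exp <- 0.25
p_d_exp <- 0.25
p_d_notexp <- 0.5

# Exact odds ratio and disease probability
odds_ratio <- (p_d_exp / (1 - p_d_exp)) / (p_d_notexp / (1 - p_d_notexp))
p_d <- p_d_exp * p_exp + p_d_notexp * (1 - p_exp)

# Simulate exposure, then disease status conditional on exposure
n_sim <- 1e5
exposed <- rbinom(n_sim, 1, p_exp)
disease <- rbinom(n_sim, 1, ifelse(exposed == 1, p_d_exp, p_d_notexp))

# Empirical disease probability and odds ratio from the 2x2 table
p_d_hat <- mean(disease)
tab <- table(exposed, disease)
or_hat <- (tab["1", "1"] / tab["1", "0"]) / (tab["0", "1"] / tab["0", "0"])

c(exact = p_d, simulated = p_d_hat)
c(exact = odds_ratio, simulated = or_hat)
```

The exact values are an odds ratio of 1/3 and P(D) = 0.4375; the simulated estimates should land close to both.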
Try out the functions rbinom() and rnorm() for the binomial and
the
ones and zeros of size 1000 with a probability of a one of 0.3 and
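A minimal sketch of the part of this task that is legible: a sample of 1000 ones and zeros with probability 0.3 of a one. The rest of the exercise text is cut off, so the `rnorm()` arguments below (standard normal, 1000 draws), the seed, and the names `b` and `g` are assumptions for illustration only.

```r
set.seed(3)  # arbitrary seed, chosen only for reproducibility

# 1000 Bernoulli draws (ones and zeros) with P(one) = 0.3
b <- rbinom(1000, size = 1, prob = 0.3)
table(b)
mean(b)  # sample proportion of ones, expected to be near 0.3

# Normal draws; mean = 0 and sd = 1 are assumed, not taken from the exercise
g <- rnorm(1000, mean = 0, sd = 1)
mean(g)
sd(g)
```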
# Estimated probability of exposure among the cases
prob_cases<-mean(data$exposure[1:100])
prob_cases
print(prob_cases-p1)
# Estimated probability of exposure among the controls
prob_controls<-mean(data$exposure[101:200])
prob_controls
print(prob_controls-p2)
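The definitions of `data`, `p1` and `p2` are not shown in this excerpt. The sketch below shows one way they could be set up, assuming `p1` and `p2` are the exposure probabilities among cases and controls obtained from the Exercise 4 values by Bayes' theorem, and that `data` holds 100 cases followed by 100 controls. All of this is hypothetical and may differ from the original exercise.

```r
set.seed(4)  # arbitrary seed, chosen only for reproducibility

# Exercise 4 parameters
p_exp <- 0.25
p_d_exp <- 0.25
p_d_notexp <- 0.5
p_d <- p_d_exp * p_exp + p_d_notexp * (1 - p_exp)

# Assumed definitions (hypothetical): exposure probabilities via Bayes' theorem
p1 <- p_d_exp * p_exp / p_d              # P(E | D), among cases
p2 <- (1 - p_d_exp) * p_exp / (1 - p_d)  # P(E | not D), among controls

# Assumed layout (hypothetical): rows 1-100 are cases, rows 101-200 are controls
data <- data.frame(
  disease  = c(rep(1, 100), rep(0, 100)),
  exposure = c(rbinom(100, 1, p1), rbinom(100, 1, p2))
)

# Estimated exposure probabilities and their deviation from the true values
prob_cases <- mean(data$exposure[1:100])
prob_controls <- mean(data$exposure[101:200])
print(prob_cases - p1)
print(prob_controls - p2)
```

Under these assumptions p1 = 1/7 and p2 = 1/3, and the exposure odds ratio (p1/(1-p1))/(p2/(1-p2)) equals 1/3, the same as the disease odds ratio from part (a).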