Cheat Sheet: Optimal Stratification
CC BY SA Giulio Barcaroli • [email protected] • Learn more at https://barcaroli.github.io/SamplingStrata/ • package version 1.5 • Updated: 2020-01
C. Method "spatial"

When units in the sampling frame are geo-referenced and spatially correlated, the "spatial" method can be applied to optimize the frame stratification.

The steps are:
1. perform a preliminary spatial analysis and fit spatial models on the target variables;
2. define the sampling frame and add predicted values, prediction errors and coordinates;
3. set precision constraints;
4. run the optimization;
5. select the sample.

Spatial analysis

We make use of the «Meuse river» datasets, reporting measures of the concentration of 4 metals.

    library(sp)
    # locations (155 observed points)
    data("meuse")
    # grid of points (3103)
    data("meuse.grid")
    meuse.grid$id <- c(1:nrow(meuse.grid))
    coordinates(meuse) <- c('x','y')
    coordinates(meuse.grid) <- c('x','y')

[Figure: grid of the Meuse river and sample of observed values]

Analysis and fitting

    library(gstat)
    library(automap)
    v <- variogram(lead~dist+soil, data=meuse)
    fit.vgm.lead <- autofitVariogram(
        lead~dist+soil, meuse, model="Exp")
    plot(v, fit.vgm.lead$var_model)

Prediction

    lead.kr <- krige(lead~dist+soil,
        meuse, meuse.grid,
        model=fit.vgm.lead$var_model)
    # negative predictions and variances are set to 0
    lead.pred <- ifelse(lead.kr[1]$var1.pred < 0,
        0, lead.kr[1]$var1.pred)
    lead.var <- ifelse(lead.kr[2]$var1.var < 0,
        0, lead.kr[2]$var1.var)

Sampling frame

    df <- as.data.frame(list(
        dom=rep(1,nrow(meuse.grid)),
        lead.pred=lead.pred,
        lead.var=lead.var,
        lon=meuse.grid$x,
        lat=meuse.grid$y,
        id=c(1:nrow(meuse.grid))))

    frame <- buildFrameSpatial(df=df,
        id="id",
        X=c("lead.pred"),
        Y=c("lead.pred"),
        variance=c("lead.var"),
        lon="lon",
        lat="lat",
        domainvalue="dom")

Precision constraints

    cv2 <- as.data.frame(list(
        DOM=rep("DOM1",1),
        CV1=rep(0.05,1),
        domainvalue=c(1:1)))

Optimization

    solution <- optimStrata(method="spatial",
        errors=cv2, framesamp=frame, iter=25,
        nStrata=5, fitting=1, kappa=1,
        range=fit.vgm.lead$var_model$range[2])

    framenew <- solution$framenew
    outstrata <- solution$aggr_strata
    frameres <- SpatialPixelsDataFrame(
        points=framenew[c("LON","LAT")],
        data=framenew)
    frameres$LABEL <- as.factor(frameres$LABEL)
    spplot(frameres, c("LABEL"),
        col.regions=bpy.colors(5))

[Figure: optimal stratification of meuse.grid]

Use of models

Sampling frames usually contain values of co-variates only, not of the target variables. In order to calculate correctly the variance of the target variables in the strata, we can make use of models. When applying the methods ‘atomic’ and ‘continuous’, it is possible to declare linear or log-linear models linking each target variable to one co-variate available in the sampling frame.

Consider the ‘swissmunicipalities’ dataset. Suppose that for all units we only have values for POPTOT and HApoly, while the values for Surfacesbois and Airind are available only on a subset of 500 units.

We fit the following models:

    k <- sample(c(1:2896),500)
    s <- swissmunicipalities[k,]
    Airind_POPTOT <- lm(Airind~POPTOT, data=s)
    Bois_HApoly <- lm(Surfacesbois~HApoly, data=s)

For both models we calculate heteroscedasticity indexes and variance:

    airind <- computeGamma(Airind_POPTOT$residuals,
        s$POPTOT, nbins=14)
    airind
    #      gamma      sigma   r.square
    # 0.59235109 0.06794055 0.87070106
    bois <- computeGamma(Bois_HApoly$residuals,
        s$HApoly, nbins=14)
    bois
    #     gamma     sigma  r.square
    # 0.8547931 0.4483606 0.9732122

We can now instantiate the values in the ‘model’ dataframe:

    model <- NULL
    model$beta[1] <- Airind_POPTOT$coefficients[2]
    model$sig2[1] <- airind[2]^2
    model$type[1] <- "linear"
    model$gamma[1] <- airind[1]
    model$beta[2] <- Bois_HApoly$coefficients[2]
    model$sig2[2] <- bois[2]^2
    model$type[2] <- "linear"
    model$gamma[2] <- bois[1]
    model <- as.data.frame(model)
    model
    #       beta      sig2   type     gamma
    # 0.01109583 0.1708807 linear 0.4703953
    # 0.26068155 0.2010272 linear 0.8547931

Sampling frame

    # co-variates are used as both X's and Y's
    frame <- buildFrameDF(
        df=swissmunicipalities,
        id="id",
        X=c("POPTOT","HApoly"),
        Y=c("POPTOT","HApoly"),
        domainvalue="REG")

    frame$airind <- swissmunicipalities$Airind
    frame$surfacesbois <- swissmunicipalities$Surfacesbois

Optimization

With the same precision constraints of 10% for both target variables, we run the optimization step:

    solution <- optimStrata(
        method = "continuous",
        errors = cv,
        framesamp = frame,
        model = model,    # ‘model’ dataframe previously defined
        nStrata = rep(5,7),
        iter = 50,
        pops = 10)

Evaluation

    framenew <- solution$framenew
    outstrata <- solution$aggr_strata
    framenew$Y3 <- framenew$AIRIND
    framenew$Y4 <- framenew$SURFACESBOIS
    val <- evalSolution(framenew, outstrata)
    val$coeff_var
    #    CV1    CV2    CV3    CV4  dom
    # 0.0107 0.0706 0.0316 0.0603 DOM1
    # 0.0073 0.0364 0.0220 0.0426 DOM2
    # 0.0062 0.0252 0.0253 0.0332 DOM3
    # 0.0071 0.0328 0.0303 0.0572 DOM4
    # 0.0055 0.0646 0.0171 0.0541 DOM5
    # 0.0037 0.0745 0.0173 0.0606 DOM6
    # 0.0036 0.0753 0.0145 0.0541 DOM7

Notice that both the CVs of the co-variates (CV1 and CV2) and the CVs of the real target variables (CV3 and CV4) comply with the 10% precision constraints.
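
To make the meaning of a precision constraint concrete, the following is a minimal base-R sketch of the expected coefficient of variation of an estimated total under stratified simple random sampling without replacement. It does not reproduce the package's evalSolution; the toy frame, the helper name expected_cv and the sample sizes are all hypothetical and chosen only for illustration.

```r
# Toy stratified frame: 3 strata with different sizes and variability.
# All numbers are made up for illustration.
set.seed(1)
frame_toy <- data.frame(
  stratum = rep(1:3, times = c(500, 300, 200)),
  y = c(rnorm(500, mean = 10,  sd = 2),
        rnorm(300, mean = 50,  sd = 10),
        rnorm(200, mean = 200, sd = 40))
)

# Expected CV of the estimated total, given the sample size n_h
# allocated to each stratum (hypothetical helper, not a package function).
expected_cv <- function(y, stratum, n_h) {
  N_h  <- as.numeric(tapply(y, stratum, length))  # stratum sizes
  S2_h <- as.numeric(tapply(y, stratum, var))     # stratum variances
  var_total <- sum(N_h^2 * (1 - n_h / N_h) * S2_h / n_h)
  sqrt(var_total) / sum(y)
}

n_h <- c(10, 10, 10)
cv_small <- expected_cv(frame_toy$y, frame_toy$stratum, n_h)
cv_large <- expected_cv(frame_toy$y, frame_toy$stratum, 4 * n_h)

# A constraint such as CV <= 0.10 is met when the value is below 0.10.
c(cv_small = cv_small, cv_large = cv_large)
```

Increasing the sample size in every stratum lowers the expected CV, which is the trade-off the optimization step balances against the total sample size.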
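
As a rough illustration of what a heteroscedasticity index such as the gamma in the ‘model’ dataframe captures, the sketch below simulates data whose residual variance grows with the co-variate and recovers the exponent with an auxiliary regression. Two things here are assumptions and not taken from the cheat sheet: the variance form Var(eps) = sig2 * x^gamma, and the estimation by regressing log squared residuals on log(x). The package's computeGamma works on binned residuals (nbins) and may differ from this; all simulated values are hypothetical.

```r
# Simulate a linear model with heteroscedastic errors.
# ASSUMPTION: Var(eps) = sig2 * x^gamma (illustrative form only).
set.seed(42)
n <- 5000
x <- runif(n, min = 1, max = 1000)
beta <- 0.5
sig2 <- 0.04
gamma_true <- 1.2
eps <- rnorm(n, mean = 0, sd = sqrt(sig2 * x^gamma_true))
y <- beta * x + eps

# Fit the linear model linking the target to the co-variate.
fit <- lm(y ~ x)

# Auxiliary regression: log(residual^2) on log(x).
# Under the assumed variance form, the slope approximates gamma.
aux <- lm(log(residuals(fit)^2) ~ log(x))
gamma_hat <- unname(coef(aux)[2])
gamma_hat
```

A gamma near 0 would indicate roughly constant residual variance, while larger values indicate that the spread of the target variable increases with the co-variate.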