Ratio Regression R
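\[ B = \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U} \]
\[ \hat{B} = \frac{\bar{y}}{\bar{x}} = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}, \qquad \hat{V}(\hat{B}) = \frac{N-n}{N}\,\frac{s_e^2}{\bar{x}^2\, n} \tag{47} \]
where
\[ s_e^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \hat{B}x_i)^2 = \frac{1}{n-1}\left( \sum_{i=1}^{n} y_i^2 + \hat{B}^2 \sum_{i=1}^{n} x_i^2 - 2\hat{B} \sum_{i=1}^{n} x_i y_i \right) \]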
• If Y ≈ BX, then $y_i \approx \hat{B}x_i$. Thus, $\hat{B}x_i$ can be considered the predicted value of $y_i$ from a line through the origin (with intercept = 0 and slope = $\hat{B}$).
• The distribution of $\hat{B}$ is very complicated. For small samples, $\hat{B}$ is likely to be skewed and is biased for B. For large samples, the bias is negligible and the distribution of $\hat{B}$ tends to be approximately normal.
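As a rough illustration of equation (47), the sketch below computes $\hat{B}$, $s_e^2$ (in both its residual form and its expanded form), and $\hat{V}(\hat{B})$ in base R. The function name `ratio_B` and the three data points are hypothetical and are not part of these notes.

```r
# Minimal sketch of equation (47); the data values are hypothetical.
ratio_B <- function(x, y, N) {
  n    <- length(x)
  Bhat <- sum(y) / sum(x)                         # B-hat = ybar / xbar
  s2e  <- sum((y - Bhat * x)^2) / (n - 1)         # residuals about the line through the origin
  # Expanded form of s_e^2; algebraically identical to the line above
  s2e_expanded <- (sum(y^2) + Bhat^2 * sum(x^2) - 2 * Bhat * sum(x * y)) / (n - 1)
  var_B <- ((N - n) / N) * s2e / (mean(x)^2 * n)  # estimated variance of B-hat
  list(Bhat = Bhat, s2e = s2e, s2e_expanded = s2e_expanded,
       var_B = var_B, se_B = sqrt(var_B))
}

fit <- ratio_B(x = c(2, 4, 6), y = c(1, 2, 4), N = 10)
print(fit)
```

The two forms of $s_e^2$ agree up to floating-point error, so the expanded form is convenient when only the sums $\sum y_i^2$, $\sum x_i^2$, and $\sum x_i y_i$ are available.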
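\[ \hat{\bar{y}}_r = \hat{B}\,\bar{x}_U \tag{48} \]
\[ \hat{V}(\hat{\bar{y}}_r) = \hat{V}(\hat{B}\bar{x}_U) = \bar{x}_U^2\,\hat{V}(\hat{B}) = \frac{N-n}{N}\,\frac{\bar{x}_U^2}{\bar{x}^2}\,\frac{s_e^2}{n} \]
\[ \hat{t}_{yr} = \hat{B}\,t_x \tag{49} \]
\[ \hat{V}(\hat{t}_{yr}) = N(N-n)\,\frac{\bar{x}_U^2}{\bar{x}^2}\,\frac{s_e^2}{n} = \frac{N-n}{N}\,\frac{t_x^2}{\bar{x}^2}\,\frac{s_e^2}{n} \]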
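• $\hat{t}_{yr}$ is called the ratio estimator of the population total.
• When $t_x$ and $\bar{x}_U$ are unknown, it is common to replace $t_x$ with $N\bar{x}$ or replace $\bar{x}_U$ with $\bar{x}$. This will yield:
\[ \hat{V}(\hat{\bar{y}}_r) \approx \frac{N-n}{N}\,\frac{s_e^2}{n}, \qquad \hat{V}(\hat{t}_{yr}) = N(N-n)\,\frac{s_e^2}{n} \]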
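The estimators in (48) and (49) can be sketched in base R as follows. This is only an illustration: the function name `ratio_mean_total`, the data, and the value `tx = 45` are hypothetical.

```r
# Sketch of equations (48) and (49) when t_x is known; data are hypothetical.
ratio_mean_total <- function(x, y, N, tx) {
  n     <- length(x)
  xbarU <- tx / N                                # known population mean of x
  Bhat  <- sum(y) / sum(x)
  s2e   <- sum((y - Bhat * x)^2) / (n - 1)
  adj   <- (xbarU / mean(x))^2                   # (xbar_U / xbar)^2 adjustment
  v_mean  <- ((N - n) / N) * adj * s2e / n       # V-hat of the ratio estimator of the mean
  v_total <- N * (N - n) * adj * s2e / n         # V-hat of the ratio estimator of the total
  c(ybar_r = Bhat * xbarU, se_ybar_r = sqrt(v_mean),
    t_yr   = Bhat * tx,    se_t_yr   = sqrt(v_total))
}

est <- ratio_mean_total(x = c(2, 4, 6), y = c(1, 2, 4), N = 10, tx = 45)
print(est)
```

Because $\hat{t}_{yr} = N\hat{\bar{y}}_r$, the standard error of the total is N times the standard error of the mean.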
Example: Demonstration of Bias in Ratio Estimation
Unit          1    2    3    4
x_i value    67   63   66   69
y_i value    68   62   64   70
• Let n = 2. The total abundances are $t_x = 265$ for X and $t_y = 264$ for Y. Therefore, B = 264/265 = 0.9962. Also, $S_x^2 = 6.250$ and $S_y^2 \approx 13.3$.
Sample   Units   t̂_yr        V̂(t̂_yr)    t̂_SRS   V̂(t̂_SRS)
1        1,2     265           8.0000      260      72
2        1,3     263.0075     18.0903      264      32
3        1,4     268.8971      0.0017      276       8
4        2,3     258.8372      1.7307      252       8
5        2,4     265           8.0000      264     128
6        3,4     263.0370     18.2677      268      72
$E(\hat{t}_{yr}) = 263.963$     $E(\hat{V}(\hat{t}_{yr})) = 9.0151$
$V(\hat{t}_{yr}) = 10.981$
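The table above can be reproduced by enumerating all six samples of size 2. The sketch below is one way to do that in base R; it assumes the variance column uses the simplified form $N(N-n)s_e^2/n$, which matches the tabled values.

```r
# Enumerate every SRS of size n = 2 from the 4-unit population in the example.
x <- c(67, 63, 66, 69)
y <- c(68, 62, 64, 70)
N <- length(x); n <- 2
tx <- sum(x)                                   # known population total of x (265)

samples <- combn(N, n)                         # columns: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
res <- t(apply(samples, 2, function(s) {
  Bhat <- sum(y[s]) / sum(x[s])
  s2e  <- sum((y[s] - Bhat * x[s])^2) / (n - 1)
  c(t_yr  = Bhat * tx,                         # ratio estimate of the total
    v_yr  = N * (N - n) * s2e / n,             # simplified variance estimate
    t_srs = N * mean(y[s]),                    # SRS estimate of the total
    v_srs = N * (N - n) * var(y[s]) / n)       # SRS variance estimate
}))
print(round(res, 4))

# Averages over the six equally likely samples
print(colMeans(res))
# Bias of the ratio estimator: E(t_yr) - t_y
print(mean(res[, "t_yr"]) - sum(y))
```

The SRS estimator averages exactly to $t_y = 264$, while the ratio estimator averages to about 263.963, which shows the small negative bias.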
• When appropriately used, the reduction in variance from using the ratio estimator will offset the presence of bias. Also, for large samples, the estimators $\hat{t}_{yr}$ and $\hat{\bar{y}}_r$ will be approximately normally distributed.
• Assuming a small bias, both the variance and MSE can be approximated by
• Thus, the MSE and variance of $\hat{B}$ will be small if
Example of Ratio Estimation: A manager at a mill wants to estimate the total weight of
dry wood (ty ) for a certain number of truckloads of 5-foot bundles of pulpwood (wood from
recently cut trees). The process begins by
3. Recording the weight of the pulpwood (xi ) for each of the n bundles.
4. Removing the bark and drying the wood from the n bundles.
5. Recording the weight of the dry wood (yi ) for each of the n bundles.
\[ \sum_{i=1}^{30} x_i = 3316, \qquad \sum_{i=1}^{30} y_i = 1802, \qquad \sum_{i=1}^{30} x_i y_i = 214{,}738 \]
\[ \sum_{i=1}^{30} x_i^2 = 392{,}440, \qquad \sum_{i=1}^{30} y_i^2 = 118{,}360, \qquad t_x = \sum_{i=1}^{800} x_i = 89420 \]
Use ratio estimation to estimate the total amount of dry wood ($t_y$) and the mean amount of dry wood per bundle ($\bar{y}_U$). Also calculate the standard errors of these estimates.
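One way to set up the calculation from the summary sums is sketched below in base R. It takes n = 30 and N = 800 from the summation limits above and applies (47) to (49) directly; the variable names are illustrative only.

```r
# Pulpwood example: ratio estimates from the summary sums given above.
n <- 30; N <- 800
sum_x  <- 3316;   sum_y  <- 1802;   sum_xy <- 214738
sum_x2 <- 392440; sum_y2 <- 118360
tx <- 89420                                    # known total pulpwood weight

Bhat  <- sum_y / sum_x                         # estimated ratio of dry wood to pulpwood
s2e   <- (sum_y2 + Bhat^2 * sum_x2 - 2 * Bhat * sum_xy) / (n - 1)
xbar  <- sum_x / n
xbarU <- tx / N

t_yr    <- Bhat * tx                           # estimated total dry wood, eq. (49)
ybar_r  <- Bhat * xbarU                        # estimated mean dry wood per bundle, eq. (48)
se_t    <- sqrt(N * (N - n) * (xbarU / xbar)^2 * s2e / n)
se_ybar <- sqrt(((N - n) / N) * (xbarU / xbar)^2 * s2e / n)

print(c(Bhat = Bhat, s2e = s2e))
print(c(t_yr = t_yr, se_t = se_t, ybar_r = ybar_r, se_ybar = se_ybar))
```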
5.1.3 Confidence Intervals for B, $\bar{y}_U$, and $t_y$
• For large samples, approximate 100(1 − α)% confidence intervals for B, $\bar{y}_U$, and $t_y$ are:
\[ \hat{B} \pm z^* \sqrt{\hat{V}(\hat{B})}, \qquad \hat{\bar{y}}_r \pm z^* \sqrt{\hat{V}(\hat{\bar{y}}_r)}, \qquad \hat{t}_{yr} \pm z^* \sqrt{\hat{V}(\hat{t}_{yr})} \tag{50} \]
where $z^*$ is the upper α/2 critical value from the standard normal distribution.
• For smaller samples, approximate 100(1 − α)% confidence intervals for B, $\bar{y}_U$, and $t_y$ are:
\[ \hat{B} \pm t^* \sqrt{\hat{V}(\hat{B})}, \qquad \hat{\bar{y}}_r \pm t^* \sqrt{\hat{V}(\hat{\bar{y}}_r)}, \qquad \hat{t}_{yr} \pm t^* \sqrt{\hat{V}(\hat{t}_{yr})} \tag{51} \]
where $t^*$ is the upper α/2 critical value from the t(n − 1) distribution.
• General rule: a normal approximation can be used if (i) n ≥ 30, (ii) the sampling fraction n/N ≤ .25, and (iii) the coefficients of variation $C_X = S_x/\bar{x}_U$ and $C_Y = S_y/\bar{y}_U$ are $< .10/\sqrt{n}$.
• Example: Find 95% confidence intervals for ty , y U , and B for the pulpwood and drywood
example.
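A possible setup for these intervals is sketched below. It uses the t-based form (51) with n − 1 = 29 degrees of freedom, which is an assumption on my part; passing `df = Inf` to the hypothetical helper `ci_t` gives the normal-based form (50) instead.

```r
# Helper for intervals of the form estimate +/- critical value * standard error.
ci_t <- function(est, se, df, level = 0.95) {
  crit <- qt(1 - (1 - level) / 2, df)          # upper alpha/2 critical value
  c(lower = est - crit * se, upper = est + crit * se)
}

# Pulpwood summary sums (n = 30 sampled bundles, N = 800 bundles in total)
n <- 30; N <- 800; tx <- 89420
sum_x  <- 3316;   sum_y  <- 1802;   sum_xy <- 214738
sum_x2 <- 392440; sum_y2 <- 118360

Bhat  <- sum_y / sum_x
s2e   <- (sum_y2 + Bhat^2 * sum_x2 - 2 * Bhat * sum_xy) / (n - 1)
xbar  <- sum_x / n
xbarU <- tx / N
se_B  <- sqrt(((N - n) / N) * s2e / (xbar^2 * n))

ci_B     <- ci_t(Bhat,         se_B,         df = n - 1)
ci_mean  <- ci_t(Bhat * xbarU, xbarU * se_B, df = n - 1)  # SE of ybar_r is xbar_U * SE(B-hat)
ci_total <- ci_t(Bhat * tx,    tx * se_B,    df = n - 1)  # SE of t_yr is t_x * SE(B-hat)
print(rbind(B = ci_B, mean = ci_mean, total = ci_total))
```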
ACRES92, ACRES87, and ACRES82 are the numbers of acres devoted to farms in 1992, 1987, and 1982 for that county. (1 acre ≈ 4047 m²)
F92, F87, and F82 are the numbers of farms in 1992, 1987, and 1982 for that county.
LF92, LF87, and LF82 are the numbers of large farms (≥ 1000 acres) in 1992, 1987, and 1982, respectively, for that county.
SF92, SF87, and SF82 are the numbers of small farms (≤ 9 acres) in 1992, 1987, and 1982, respectively, for that county.
Region represents one of four assigned geographical regions of the United States (W=West, S=South, NE=Northeast, NC=North Central).
• The following table is a summary of the number of counties (Ni ) in each state (i).
State             N_i  Region   State              N_i  Region   State              N_i  Region
Alaska       AK     5  W        Louisiana      LA   64  S        Ohio           OH   88  NC
Alabama      AL    67  S        Massachusetts  MA   14  NE       Oklahoma       OK   77  S
Arkansas     AR    75  S        Maryland       MD   23  S        Oregon         OR   36  W
Arizona      AZ    15  W        Maine          ME   16  NE       Pennsylvania   PA   67  NE
California   CA    58  W        Michigan       MI   83  NC       Rhode Island   RI    5  NE
Colorado     CO    63  W        Minnesota      MN   87  NC       South Carolina SC   46  S
Connecticut  CT     8  NE       Missouri       MO  114  NC       South Dakota   SD   66  NC
Delaware     DE     3  NE       Mississippi    MS   82  S        Tennessee      TN   95  S
Florida      FL    67  S        Montana        MT   56  W        Texas          TX  254  S
Georgia      GA   159  S        North Carolina NC  100  S        Utah           UT   29  W
Hawaii       HI     4  W        North Dakota   ND   53  NC       Virginia       VA   98  S
Iowa         IA    99  NC       Nebraska       NE   93  NC       Vermont        VT   14  NE
Idaho        ID    44  W        New Hampshire  NH   10  NE       Washington     WA   39  W
Illinois     IL   102  NC       New Jersey     NJ   21  NE       Wisconsin      WI   72  NC
Indiana      IN    92  NC       New Mexico     NM   33  W        West Virginia  WV   55  S
Kansas       KS   105  NC       Nevada         NV   17  W        Wyoming        WY   23  W
Kentucky     KY   120  S        New York       NY   62  NE       Total          50 3078
• The graph is a scatterplot of Acres92 (y) vs Acres87 (x). The plot suggests a proportional relationship between y and x (a positive linear relationship with 0 intercept). Therefore, ratio estimation should be a useful procedure for estimating $\bar{y}_U$ or $t_y$.
• It is known that $t_x$ = 964,470,625 total farm acres in the United States in the year 1987. Therefore, $\bar{x}_U = t_x/3078 \approx 313343.283$ farm acres per county.
5.2.1 Ratio Estimation Using R
• CASE 1: Estimating B.
• CASE 2: Estimating $\bar{y}_U$ when $t_x$ and $\bar{x}_U$ are known.
• CASE 3: Estimating $t_y$ when $t_x$ and $\bar{x}_U$ are known.
• CASE 4: Estimating $\bar{y}_U$ when $t_x$ and $\bar{x}_U$ are unknown.
• CASE 5: Estimating $t_y$ when $t_x$ and $\bar{x}_U$ are unknown.
R code for ratio estimation
library(survey)
source("c:/courses/st446/rcode/confintt.r")
# In Excel, save your spreadsheet as a text tab-delimited file
# If variable names are in row 1, then use header=T)
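The listing above stops after its setup lines. As a rough stand-in, the base-R sketch below computes point estimates and standard errors for the five cases using the formulas in (47) to (49) and their unknown-$t_x$ simplifications. It does not use the survey package, and the simulated sample, `N`, `n`, and `tx` are all hypothetical.

```r
# Hypothetical sample (not the agsrs data): x = auxiliary variable, y = study variable
set.seed(446)
N <- 1000; n <- 50
x <- runif(n, 100, 500)
y <- 0.95 * x + rnorm(n, sd = 10)
tx    <- 300 * N                               # assumed known population total of x
xbarU <- tx / N

Bhat <- mean(y) / mean(x)
s2e  <- sum((y - Bhat * x)^2) / (n - 1)
fpc  <- (N - n) / N
adj  <- (xbarU / mean(x))^2

cases <- rbind(
  case1_B             = c(est = Bhat,               se = sqrt(fpc * s2e / (n * mean(x)^2))),
  case2_mean_known    = c(Bhat * xbarU,             sqrt(fpc * adj * s2e / n)),
  case3_total_known   = c(Bhat * tx,                sqrt(N * (N - n) * adj * s2e / n)),
  case4_mean_unknown  = c(Bhat * mean(x),           sqrt(fpc * s2e / n)),
  case5_total_unknown = c(Bhat * N * mean(x),       sqrt(N * (N - n) * s2e / n))
)
print(cases)
```

In Cases 4 and 5 the point estimates reduce to $\bar{y}$ and $N\bar{y}$, but the standard errors are still based on $s_e^2$ rather than on the sample variance of y.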
R output for ratio estimation
> # Estimation of the ratio CASE 1
-------------------------------------------------------------------
mean( ACRES92/ACRES87 ) = 0.98657
SE( ACRES92/ACRES87 ) = 0.00575
Two-Tailed CI for ACRES92/ACRES87 where alpha = 0.05 with 299 df
2.5 % 97.5 %
0.97525 0.99788
-------------------------------------------------------------------
5.2.2 Bootstrapping Ratio Estimates Using R
• Bootstrapping can only be used to estimate ty and y U when tx and xU are known.
• Why? Replacing tx and xU with N x and x in each bootstrap sample is equivalent to the
SRS bootstrap of y and N y. Thus, bootstrap estimates of ty and y U ignore all information
about x.
• The following R code is for estimation of B and for estimation of $t_y$ and $\bar{y}_U$ when $t_x$ and $\bar{x}_U$ are known.
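The "Why?" bullet can be checked numerically. In the sketch below (hypothetical simulated data, base R only), a single bootstrap resample is drawn and the ratio estimate of the total with $t_x$ replaced by $N\bar{x}^*$ is compared with the SRS estimate $N\bar{y}^*$ from the same resample.

```r
# One bootstrap resample from a hypothetical sample of size 20
set.seed(1)
N <- 200
x <- runif(20, 50, 150)
y <- 0.9 * x + rnorm(20, sd = 5)

i <- sample(length(x), replace = TRUE)         # bootstrap indices
Bstar <- mean(y[i]) / mean(x[i])               # bootstrap ratio

t_unknown <- Bstar * N * mean(x[i])            # t_x replaced by N * xbar*
t_srs     <- N * mean(y[i])                    # SRS bootstrap estimate of the total

print(c(t_unknown = t_unknown, t_srs = t_srs))
print(all.equal(t_unknown, t_srs))
```

The $\bar{x}^*$ terms cancel, so the two values are identical up to floating-point error and the resampled x values contribute nothing.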
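R code for bootstrapping ratio estimates
library(boot)
source("c:/courses/st446/rcode/confintt.r")
indata <- read.table("c://courses/st446/Rcode/agsrs.txt",header=TRUE)
# x = 1987 acreage (auxiliary variable), y = 1992 acreage (study variable)
ratio <- data.frame(x=indata$ACRES87, y=indata$ACRES92)
N <- 3078          # population size (number of counties)
Brep <- 20000      # number of bootstrap replicates
tx <- 964470625    # X population total if known
mux <- tx/N        # X population mean if known
# Bootstrap the sample ratio (resampled ybar divided by resampled xbar)
sampratio <- function(d,i) mean(d$y[i])/mean(d$x[i])
bootratio <- boot(data=ratio,statistic=sampratio,R=Brep)
bootratio
boot.ci(bootratio,conf=.95,type=c("norm","perc"))
par(mfrow=c(2,1))
hist(bootratio$t,main="Bootstrap Sample Ratios")
plot(ecdf(bootratio$t),main="Empirical CDF of Bootstrap Ratios")
# Bootstrap the ratio estimate of the y mean (x mean known)
sampmean <- function(d,i) mux*mean(d$y[i])/mean(d$x[i])
bootmean <- boot(data=ratio,statistic=sampmean,R=Brep)
bootmean
boot.ci(bootmean,conf=.95,type=c("norm","perc"))
# Bootstrap the ratio estimate of the y total (x total known)
samptotal <- function(d,i) tx*mean(d$y[i])/mean(d$x[i])
boottotal <- boot(data=ratio,statistic=samptotal,R=Brep)
boottotal
boot.ci(boottotal,conf=.95,type=c("norm","perc"))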
R output for bootstrapping ratio estimates
Bootstrap Statistics :
     original        bias      std. error
t1* 0.9865652   -2.797441e-06  0.005980109    <--- for ratio B
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 20000 bootstrap replicates
Intervals :
Level      Normal                Percentile
95%   ( 0.9748,  0.9983 )   ( 0.9746,  0.9981 )

(The y-mean results below were generated with mux = tx/3068; with N = 3078 every y-mean value scales by 3068/3078, giving a point estimate of about 309,134.)
Bootstrap Statistics :
     original      bias     std. error
t1*  310141.2   -4.233312    1908.739    <-- for y mean
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 20000 bootstrap replicates
Intervals :
Level      Normal                Percentile
95%   (306404, 313886 )     (306354, 313862 )

Bootstrap Statistics :
     original      bias     std. error
t1* 951513191   54457.72     5772867     <-- for y total
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 20000 bootstrap replicates
Intervals :
Level      Normal                       Percentile
95%   (940144121, 962773345 )     (940039313, 962538574 )
[Figure: histogram of the bootstrap sample ratios (bootratio$t) and empirical CDF of the bootstrap ratios]
5.2.3 Ratio Estimation Using SAS Proc Surveymeans (Supplemental)
• CASE 1: Because $\hat{B}$ and s.e.($\hat{B}$) do not depend on knowing any population values, the default output for estimating the population ratio B is correct. The analysis is produced by the first block of code.
DATA ratioest;
INFILE 'C:\COURSES\st446\SASsurv\agsrs.dat';
FORMAT county $char14.;
INPUT i county $ st $ acres92 acres87 acres82 F92 F87 F82
LF92 LF87 LF82 SF92 SF87 SF82 region $ @@;
KEEP acres92 acres87 acres82;
*** assumed definitions (this step is not shown in the source listing) ***;
DATA ratioest; SET ratioest;
y = acres92; x = acres87; flag = 1;
N = 3078; taux = 964470625; mux = taux/N;
*** calculate mean of sample x values (if taux and mux are unknown) ***;
PROC MEANS DATA = ratioest MEAN NOPRINT;
VAR x; OUTPUT OUT = sset MEAN = xbar;
DATA sset; SET sset; flag=1; KEEP flag xbar;
DATA ratioest; MERGE ratioest sset; BY flag;
*** scale the y values for estimation of a total or mean of y ***;
y_total = acres92*taux; *** known case ***;
y_mean = acres92*mux;
y_utotal = acres92*N*xbar; *** unknown case ***;
y_umean = acres92*xbar;
*** Case 1 ***;
PROC SURVEYMEANS data=ratioest total=3078 ratio clm alpha=.05;
var y x;
ratio y / x;
title 'Ratio Estimation of the Ratio B';
*** Case 2 ***;
PROC SURVEYMEANS data=ratioest total=3078 ratio clm alpha=.05;
var x y;
ratio y_mean / x;
title 'Ratio Estimation of the y Mean --- known x mean and total';
*** Case 3 ***;
PROC SURVEYMEANS data=ratioest total=3078 ratio clm alpha=.05;
var x y;
ratio y_total / x;
title 'Ratio Estimation of the y Total --- known x mean and total';
*** Case 4 ***;
PROC SURVEYMEANS data=ratioest total=3078 ratio clm alpha=.05;
var x y;
ratio y_umean / x;
title 'Ratio Estimation of the y Mean --- unknown x mean and total';
*** Case 5 ***;
PROC SURVEYMEANS data=ratioest total=3078 ratio clm alpha=.05;
var x y;
ratio y_utotal / x;
title 'Ratio Estimation of the y Total --- unknown x mean and total';
RUN;
Ratio Analysis
Numerator Denominator Ratio Std Err 95% CL for Ratio
-----------------------------------------------------------------------
y x 0.986565 0.005750 0.97524871 0.99788176 <--
-----------------------------------------------------------------------
Ratio Analysis
Numerator Denominator Ratio Std Err 95% CL for Ratio
-----------------------------------------------------------------------
y_total x 951513191 5546162 940598734 962427648 <--
-----------------------------------------------------------------------
(OUTPUT FOR CASE 4)
Ratio Estimation of the y Mean --- unknown x mean and total
The SURVEYMEANS Procedure
Data Summary
Number of Observations 300
Statistics
Std Error
Variable Mean of Mean 95% CL for Mean
-----------------------------------------------------------------
x 301954 18914 264733 339174
y 297897 18898 260706 335088
y_umean 89951122411 5706452641 7.87212E10 1.01181E11
-----------------------------------------------------------------
Ratio Analysis
Numerator Denominator Ratio Std Err 95% CL for Ratio
------------------------------------------------------------------------
y_umean x 297897 1736.376641 294479.980 301314.114 <--
------------------------------------------------------------------------
5.3 $\hat{\bar{y}}_U$ vs $\hat{\bar{y}}_r$ or $\hat{t}$ vs $\hat{t}_{yr}$: Which is better? SRS or Ratio Estimation?
• Let $S_x$ and $S_y$ be the population standard deviations of X and Y. Let $S_{xy}$ be the population covariance between X and Y. The population correlation coefficient is
\[ R = \frac{S_{xy}}{S_y S_x} \quad \text{where} \quad S_{xy} = \frac{\sum_{i=1}^{N} (x_i - \bar{x}_U)(y_i - \bar{y}_U)}{N-1}. \]
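• It can be shown that approximations for the true population variances and MSEs of $\hat{t}_{yr}$ and $\hat{\bar{y}}_r$ are
\[ MSE(\hat{t}_{yr}) \approx V(\hat{t}_{yr}) \approx \frac{N(N-n)}{n}\left(S_y^2 - 2BRS_yS_x + B^2S_x^2\right) \]
\[ MSE(\hat{\bar{y}}_r) \approx V(\hat{\bar{y}}_r) \approx \frac{N-n}{Nn}\left(S_y^2 - 2BRS_yS_x + B^2S_x^2\right) \]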
• Thus, these variances will be smaller as R approaches 1. Or, equivalently, the stronger the
positive correlation, the smaller the variance.
• If the researcher wants to estimate $t_y$ or $\bar{y}_U$, the main sampling question is ‘When is it worth the additional effort and expense to collect information about X instead of just using an SRS estimator $\hat{\bar{y}}_U$ or $\hat{t}$, which does not require knowledge about X?’
• The answer requires looking at the coefficients of variation for both X and Y:
\[ C_X = \frac{S_x}{\bar{x}_U}, \qquad C_Y = \frac{S_y}{\bar{y}_U} \]
• It can be shown that if $R > \frac{1}{2}\,\frac{C_X}{C_Y}$, then the variance of the ratio estimator is smaller than the variance of the SRS estimator.
• Because the maximum value of R is 1, if we have CX > 2CY , then the variance of the ratio
estimator must be larger than the variance of the SRS estimator. Thus, when CX > 2CY ,
the SRS estimator is better (more efficient) than the ratio estimator.
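The condition $R > \frac{1}{2}\,C_X/C_Y$ follows from comparing the approximate variance of the ratio estimator with the SRS variance. A short derivation, assuming $B > 0$ and $S_x > 0$ and using $V(\hat{t}_{SRS}) = \frac{N(N-n)}{n}S_y^2$:

```latex
\begin{align*}
V(\hat{t}_{yr}) < V(\hat{t}_{SRS})
  &\iff S_y^2 - 2BRS_yS_x + B^2S_x^2 < S_y^2 \\
  &\iff B^2S_x^2 < 2BRS_yS_x \\
  &\iff R > \frac{B\,S_x}{2\,S_y}
        = \frac{1}{2}\cdot\frac{S_x/\bar{x}_U}{S_y/\bar{y}_U}
        = \frac{1}{2}\,\frac{C_X}{C_Y},
  \qquad \text{since } B = \frac{\bar{y}_U}{\bar{x}_U}.
\end{align*}
```

The same steps apply to $\hat{\bar{y}}_r$, because its approximate variance differs from that of $\hat{t}_{yr}$ only by the factor $1/N^2$.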
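• Because $C_X$ and $C_Y$ are unknown, we would calculate the sample (Pearson) correlation coefficient r and the sample coefficients of variation ($\hat{C}_x$ and $\hat{C}_y$) to check if these conditions are met. The formulas are:
\[ \hat{C}_x = \frac{s_x}{\bar{x}}, \qquad \hat{C}_y = \frac{s_y}{\bar{y}} \]
\[ r = \frac{1}{n-1}\sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right) = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{(n-1)\,s_x s_y} \]
where $s_x$ and $s_y$ are the sample standard deviations of the x and y observations.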
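As an illustration, the check can be carried out from summary sums alone. The sketch below applies the formulas to the pulpwood sums given earlier in these notes (n = 30); the variable names are mine.

```r
# Sample correlation and coefficients of variation from summary sums (pulpwood example)
n <- 30
sum_x  <- 3316;   sum_y  <- 1802;   sum_xy <- 214738
sum_x2 <- 392440; sum_y2 <- 118360

xbar <- sum_x / n
ybar <- sum_y / n
s_x  <- sqrt((sum_x2 - sum_x^2 / n) / (n - 1))   # sample standard deviation of x
s_y  <- sqrt((sum_y2 - sum_y^2 / n) / (n - 1))   # sample standard deviation of y

r   <- (sum_xy - sum_x * sum_y / n) / ((n - 1) * s_x * s_y)
C_x <- s_x / xbar
C_y <- s_y / ybar

threshold       <- 0.5 * C_x / C_y               # ratio estimator preferred when r exceeds this
ratio_preferred <- r > threshold
print(c(r = r, C_x = C_x, C_y = C_y, threshold = threshold))
print(ratio_preferred)
```

For these sums r is roughly 0.96, well above the threshold, so the sample evidence favors the ratio estimator over the SRS estimator.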
• In summary, if the following conditions hold, using ratio estimators can provide a substantial improvement over the SRS estimators:
1. You must be able to simultaneously observe X and Y values that are ‘roughly proportional’ to each other. That is, there is a strong positive linear relationship between Y and X that passes through the origin (zero intercept).
2. The coefficient of variation for X should not be substantially larger than the coefficient
of variation for Y .
3. The population total tx or population mean xU should be known.
• If there is a linear relationship between Y and X and the intercept is not zero or the
correlation between X and Y is negative, then a regression estimator should be considered.
5.4 Estimation in Domains (or Subpopulations)
• It is common to want estimates of a mean or total for subpopulations. The subpopulations
are called domains.
• For example, in the previous example, we may want estimates for each of the four regions
(W, S, NE, and NC). Each region is an example of a domain (or subpopulation).
• Let $U_d$ be the set of population units in domain d and let $N_d$ be the number of population units in domain d. The domain total and domain mean for domain d are
\[ t_{yd} = \sum_{i \in U_d} y_i, \qquad \bar{y}_{Ud} = t_{yd}/N_d = \Big(\sum_{i \in U_d} y_i\Big)/N_d \]
• Let $S_d$ be the set of sample units in domain d and let $n_d$ be the number of sample units in domain d. Natural estimators for the domain mean $\bar{y}_{Ud}$ and the domain total $t_{yd}$ are
\[ \hat{\bar{y}}_{Ud} = \Big(\sum_{i \in S_d} y_i\Big)/n_d = \bar{y}_d, \qquad \hat{t}_{yd} = \frac{N_d}{n_d}\sum_{i \in S_d} y_i = N_d\,\bar{y}_d \]
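Let $x_i = 1$ if unit i is in domain d and $x_i = 0$ otherwise, and let $u_i = y_i x_i$. Then
\begin{align*}
B_d = \frac{\bar{u}_{Ud}}{\bar{x}_{Ud}}
  &= \frac{\big(\sum_{i=1}^{N} u_i\big)/N}{\big(\sum_{i=1}^{N} x_i\big)/N}
   = \frac{\sum_{i=1}^{N} u_i}{\sum_{i=1}^{N} x_i} \\
  &= \frac{\sum_{i \in U_d} u_i + \sum_{i \notin U_d} u_i}{\sum_{i \in U_d} x_i + \sum_{i \notin U_d} x_i} \\
  &= \frac{\sum_{i \in U_d} u_i + \sum_{i \notin U_d} 0}{\sum_{i \in U_d} 1 + \sum_{i \notin U_d} 0} \\
  &= \frac{\sum_{i \in U_d} u_i}{\sum_{i \in U_d} 1}
   = \frac{\sum_{i \in U_d} y_i}{N_d} = \bar{y}_{Ud}
\end{align*}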
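• Let S be the set of n SRS units, and let $S_d \subset S$ be the set of $n_d$ SRS units in domain d. We can estimate the domain ratio $B_d = \bar{y}_{Ud}$ with the ratio of domain sample means:
\begin{align*}
\hat{B}_d = \frac{\bar{u}_d}{\bar{x}_d}
  &= \frac{\big(\sum_{i \in S} u_i\big)/n}{\big(\sum_{i \in S} x_i\big)/n}
   = \frac{\sum_{i \in S} u_i}{\sum_{i \in S} x_i} \\
  &= \frac{\sum_{i \in S_d} u_i + \sum_{i \notin S_d} u_i}{\sum_{i \in S_d} x_i + \sum_{i \notin S_d} x_i} \\
  &= \frac{\sum_{i \in S_d} y_i + \sum_{i \notin S_d} 0}{\sum_{i \in S_d} 1 + \sum_{i \notin S_d} 0} \\
  &= \frac{\sum_{i \in S_d} y_i}{\sum_{i \in S_d} 1}
   = \frac{\sum_{i \in S_d} y_i}{n_d} = \bar{y}_d
\end{align*}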
EXAMPLE of Domain Estimation: Suppose we are interested in estimating the mean farm acreage per county in each region. The regions are the domains (or subpopulations). The table contains summary values for the proportion of the sample ($\bar{x}_d = n_d/300$) from domain d and the proportion of population units ($\bar{x}_{Ud} = N_d/3078$) in domain d:
d        n_d    Σ y_i (i ∈ S_d)    ȳ_d       x̄_d     N_d     x̄_Ud
NC       107    37,481,245         350292    .356     1054    .3424
NE        24     1,727,300          71971    .08       220    .0715
S        130    26,812,026         206246    .43      1382    .4490
W         39    23,348,543         598681    .13       422    .1371
Total    n = 300                                      N = 3078
Note that the proportion of the sample from domain d is close to the actual proportion of population units in domain d ($\bar{x}_d \approx \bar{x}_{Ud}$).
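The point estimates implied by this table can be computed directly. The sketch below uses only the tabled values; the column names are mine. It shows the domain mean, the domain total when $N_d$ is known ($N_d\bar{y}_d$), and the total when $N_d$ is unknown ($(N/n)\sum_{i \in S_d} y_i$), which is the form that matches the domain totals in the R output later in this section.

```r
N <- 3078; n <- 300
dom <- data.frame(
  d     = c("NC", "NE", "S", "W"),
  n_d   = c(107, 24, 130, 39),
  sum_y = c(37481245, 1727300, 26812026, 23348543),
  N_d   = c(1054, 220, 1382, 422)
)

dom$ybar_d       <- dom$sum_y / dom$n_d        # estimated domain mean
dom$xbar_d       <- dom$n_d / n                # sample proportion in the domain
dom$xbar_Ud      <- dom$N_d / N                # population proportion in the domain
dom$t_Nd_known   <- dom$N_d * dom$ybar_d       # domain total using the known N_d
dom$t_Nd_unknown <- (N / n) * dom$sum_y        # domain total when N_d is not used

print(dom, digits = 9)
```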
5.4.1 Using R to Perform a Domain Analysis
R code for Domain Analysis
library(survey)
source("c:/courses/st446/rcode/confintt.r")
domain <- read.table("c://courses/st446/Rcode/agsrs.txt",header=T)
N=3078 # population size
n=300 # sample size
fpc <- c(rep(N,n))
domaindt <- cbind(domain,fpc)
domaindat <- data.frame(domaindt)
#domaindat
# Create the sampling design
domain_dsgn <- svydesign(data=domaindat, id=~1, fpc=~fpc )
domain_dsgn
# Estimation of domain totals
# Domain = NC
esttotal <- svytotal(~ACRES92,subset(domain_dsgn,REGION=="NC"))
esttotal
confint(esttotal,df=n-1)
# Domain = NE
esttotal <- svytotal(~ACRES92,subset(domain_dsgn,REGION=="NE"))
esttotal
confint(esttotal,df=n-1)
# Domain = S
esttotal <- svytotal(~ACRES92,subset(domain_dsgn,REGION=="S"))
esttotal
confint(esttotal,df=n-1)
# Domain = W
esttotal <- svytotal(~ACRES92,subset(domain_dsgn,REGION=="W"))
esttotal
confint(esttotal,df=n-1)
# Estimation of domain means
# Domain = NC
estmean <- svymean(~ACRES92,subset(domain_dsgn,REGION=="NC"))
estmean
confint(estmean,df=n-1)
# Domain = NE
estmean <- svymean(~ACRES92,subset(domain_dsgn,REGION=="NE"))
estmean
confint(estmean,df=n-1)
# Domain = S
estmean <- svymean(~ACRES92,subset(domain_dsgn,REGION=="S"))
estmean
confint(estmean,df=n-1)
# Domain = W
estmean <- svymean(~ACRES92,subset(domain_dsgn,REGION=="W"))
estmean
confint(estmean,df=n-1)
R output for Domain Analysis
> # Estimation of domain totals
> # Domain = NC
            total        SE       2.5 %      97.5 %
ACRES92 384557574  41022160   303828848   465286299
> # Domain = NE
            total        SE       2.5 %      97.5 %
ACRES92  17722098   4490614     8884885    26559311
> # Domain = S
            total        SE       2.5 %      97.5 %
ACRES92 275091387  35287421   205648224   344534549
> # Domain = W
            total        SE       2.5 %      97.5 %
ACRES92 239556051  46090457   148853274   330258829
• The top section of the output ‘Statistics’ contains the SRS analysis of variable y =acres92
for estimating y U and ty .
DATA agsrs;
INFILE 'C:\COURSES\THAI\SASPSM\agsrs.dat';
FORMAT county $char14.;
INPUT i county $ st $ acres92 acres87 acres82 F92 F87 F82
LF92 LF87 LF82 SF92 SF87 SF82 region $ @@;
DATA agsrs; SET agsrs;
utwgt = 3078/300; *** assumed sampling weight N/n (its definition is not shown in the source listing) ***;
PROC SURVEYMEANS data=agsrs total=3078 nobs mean clm sum clsum df;
var acres92;
weight utwgt;
domain region;
title1 'Domain Estimation of ybar_Ud and t_d -- Acreage 1992 --- Nd unknown';
RUN;
Data Summary
Statistics
Std Error
Variable N DF Mean of Mean 95% CL for Mean
--------------------------------------------------------------------
acres92 300 299 297897 18898 260706.257 335087.836
--------------------------------------------------------------------
Std Error
region Variable N DF Mean of Mean 95% CL for Mean
------------------------------------------------------------------------
NC acres92 107 299 350292 26985 297186.692 403397.326
NE acres92 24 299 71971 12360 47646.954 96294.713
S acres92 130 299 206246 23066 160854.596 251638.111
W acres92 39 299 598681 77637 445897.252 751463.927
------------------------------------------------------------------------