Uncertainty Estimation
Uncertainty in the areal estimate, Atotal, is due to contributions from four sources: (1) uncertainty in the logistic regression model parameter estimates, (2) uncertainty in the pixel-level forest/nonforest classifications, given a set of parameter estimates, (3) uncertainty in the interpolated pixel-level ash tree count-per-hectare estimates, and (4) spatial correlation in forest/nonforest and ash tree observations not accommodated in the logistic regression model predictions and the ash tree count-per-hectare interpolated estimates.
The spatial correlation contribution to uncertainty in Atotal results from two phenomena. First, forest areas tend to be clustered rather than independently and randomly distributed throughout the landscape. Thus, to mimic natural conditions, forest/nonforest realizations generated from the logistic regression model predictions of forest probability should exhibit clustering comparable to that observed among the FIA subplot observations of forest and nonforest. This feature requires that the random numbers used to generate the forest/nonforest realization in step A3 be drawn from a correlated uniform [0,1] distribution. Second, the errors obtained as the differences between the interpolated ash tree count-per-hectare estimates and their true values were expected to be spatially correlated; i.e., if an interpolation overestimates its true value, other interpolations in close spatial proximity would be expected to overestimate their true values also. However, for this investigation, the range of spatial correlation for the interpolation errors was only slightly more than the 30-m pixel width, was regarded as negligible, and was ignored.
To generate random numbers from an appropriately correlated uniform [0,1] distribution as required to accommodate spatial correlation, an 8-step procedure designated Procedure B was used:
B1. construct an empirical variogram,
$$\hat{\gamma}(d) = \frac{1}{2\,|n(d)|} \sum_{(F_i, F_j) \in n(d)} (F_i - F_j)^2,$$
where F is the numerical designation for forest or nonforest subplot observations, n(d) denotes the collection of pairs, (Fi, Fj), whose Euclidean distances in geographic space are within a given neighborhood of d, and |n(d)| is the number of such pairs;
B2. fit an exponential variogram model to the empirical variogram from step B1, and estimate the range of spatial correlation from the fitted model;
B3. construct a spatial correlation matrix by assigning to each pixel pair, (i, j), a correlation, ρij, calculated from the distance, dij, between the ith and jth pixel centers and a parameter whose initial value is the estimate from step B2;
B4. generate a vector of random numbers, one for each pixel, from a multivariate Gaussian distribution with the correlation structure constructed in step B3 using the technique described by Kennedy and Gentle (1980, pp. 228-231);
B5. convert the Gaussian random numbers from step B4 to Gaussian cumulative frequencies, resulting in a correlated uniform [0, 1] distribution;
B6. generate a forest/nonforest realization using Procedure A with the correlated uniform [0, 1] distribution from step B5;
B7. construct an empirical variogram of the forest/nonforest realization from step B6; fit an exponential variogram model; and estimate the range of spatial correlation as in step B2;
B8. repeat steps B3-B7, adjusting the parameter in step B3 each iteration until the range of spatial correlation from step B7 is close to that obtained in step B2.
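Steps B1, B2, and B7 all rest on the same two computations: an empirical variogram over distance bins and a fitted exponential model. The following is a minimal sketch, not the authors' implementation. The function names, the binning by `bin_edges`, and the nugget/partial-sill/scale parameterization of the exponential model are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def empirical_variogram(coords, values, bin_edges):
    """Half the mean squared difference of paired values, per distance bin."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    sq = (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)  # count each pair once
    dist, sq = dist[iu], sq[iu]
    centers, gamma = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (dist >= lo) & (dist < hi)
        if in_bin.any():
            centers.append(0.5 * (lo + hi))
            gamma.append(0.5 * sq[in_bin].mean())
    return np.array(centers), np.array(gamma)


def exp_variogram(d, nugget, psill, scale):
    # Assumed parameterization of the exponential variogram model.
    return nugget + psill * (1.0 - np.exp(-d / scale))


def fit_exp_variogram(centers, gamma):
    """Least-squares fit; returns (nugget, psill, scale)."""
    p0 = [0.1 * gamma.max(), 0.9 * gamma.max(), centers.mean()]
    bounds = ([0.0, 0.0, 1e-6], [np.inf, np.inf, np.inf])
    params, _ = curve_fit(exp_variogram, centers, gamma, p0=p0, bounds=bounds)
    return params
```

Under this assumed parameterization, the distance at which the model reaches about 95 percent of its sill is roughly three times `scale`; how the range is defined in the original analysis is not recoverable from the text.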
The exponential variogram model was used in step B2 because of its simplicity and the adequacy of the fit to the data. Construction of the multivariate Gaussian distribution in step B4 requires the Cholesky decomposition of a covariance matrix corresponding to the correlation matrix constructed in step B3. For a square region, n pixels on a side, the correlation matrix will be n² by n². Thus, the 30-km by 30-km study area, which has 1,000 TM pixels on a side, would require decomposition of a 10⁶ by 10⁶ matrix. To accommodate personal computer space and processing limitations, analyses involving spatial correlations were constrained to a 2-km by 2-km region, which is approximately 67 pixels on a side, requires decomposition of a 4,489 by 4,489 matrix, and comprises approximately 400 ha (Figures 2 and 3).
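The Cholesky-based generation of correlated uniform numbers (steps B3 through B5) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of Kennedy and Gentle as implemented by the authors: the correlation function exp(-d/alpha), the function name, and the argument names are all hypothetical. Stored as 8-byte floats, a 4,489 by 4,489 matrix occupies about 161 MB, whereas a 10⁶ by 10⁶ matrix would occupy about 8 TB, which illustrates the size constraint described above.

```python
import numpy as np
from scipy.linalg import cholesky
from scipy.spatial.distance import cdist
from scipy.stats import norm


def correlated_uniform(n_side, pixel_width, alpha, rng):
    """Return an n_side x n_side grid of spatially correlated uniform [0, 1] numbers."""
    rows, cols = np.meshgrid(np.arange(n_side), np.arange(n_side), indexing="ij")
    centers = np.column_stack([rows.ravel(), cols.ravel()]) * pixel_width
    dist = cdist(centers, centers)           # n^2 by n^2 distance matrix
    corr = np.exp(-dist / alpha)             # assumed exponential correlation
    lower = cholesky(corr, lower=True)       # corr = lower @ lower.T
    z = lower @ rng.standard_normal(n_side * n_side)  # correlated Gaussian vector
    u = norm.cdf(z)                          # Gaussian cumulative frequencies
    return u.reshape(n_side, n_side)
```

For example, `correlated_uniform(67, 30.0, 60.0, np.random.default_rng(1))` would produce one grid for a region 67 pixels on a side with 30-m pixels; the value 60.0 for `alpha` is arbitrary and would be tuned iteratively as in step B8.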
Uncertainty in Atotal for the 2-km by 2-km region was estimated using a 6-step Monte Carlo simulation procedure designated Procedure C:
C1. generate random numbers from a multivariate Gaussian distribution with mean 0 and the variance matrix from equation 2 (described at Ash Tree Distribution Layer); add these random numbers to the logistic regression model parameter estimates to obtain simulated parameter estimates; calculate the probability of forest for each pixel using the simulated parameter estimates with equation 1 (described at Ash Tree Distribution Layer);
C2. for each pixel, generate a random number, r, from a correlated uniform [0,1] distribution using Procedure B; if r does not exceed the probability of forest from step C1, designate the pixel as forest; otherwise, designate the pixel as nonforest;
C3. calculate the total forest area, Ftotal, as the product of the number of forest pixels from step C2 and the 0.09-ha pixel area;
C4. for each pixel, generate a random number from a normal distribution with mean 0 and the variance from equation 4 (described at Ash Tree Distribution Layer); add the random number to the interpolated estimate of ash tree count per hectare to obtain a simulated ash tree count per hectare; multiply the simulated ash tree count per hectare and the 0.09-ha pixel area to obtain a simulated ash tree count for the pixel;
C5. estimate Atotal as the sum of the simulated ash tree counts from step C4 for forest pixels from step C2;
C6. repeat steps C1-C5 a large number of times; calculate the mean and variance of Ftotal and Atotal over all repetitions; estimate the uncertainties in Ftotal and Atotal from their respective variances.
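The steps of Procedure C can be sketched as a single Monte Carlo loop. This is a minimal illustration, not the authors' code. It assumes that equation 1 is a standard logistic function of a linear predictor, that `X`, `beta_hat`, `cov_beta`, `ash_hat`, and `ash_var` (all hypothetical names) hold the per-pixel predictors, parameter estimates, parameter covariance matrix, interpolated counts per hectare, and interpolation variances, and that `draw_uniform` is a caller-supplied function returning one uniform number per pixel. When `draw_uniform` is omitted, independent uniforms are used, which corresponds to ignoring spatial correlation.

```python
import numpy as np

PIXEL_HA = 0.09  # pixel area in hectares, as stated in the text


def simulate_totals(X, beta_hat, cov_beta, ash_hat, ash_var, n_rep, rng,
                    draw_uniform=None):
    """Return mean and variance of Ftotal and Atotal over n_rep repetitions."""
    n_pix = X.shape[0]
    if draw_uniform is None:
        # Fallback: uncorrelated uniforms (no spatial correlation).
        draw_uniform = lambda: rng.uniform(size=n_pix)
    f_total = np.empty(n_rep)
    a_total = np.empty(n_rep)
    for k in range(n_rep):
        # C1: perturb the parameter estimates and compute forest probabilities.
        noise = rng.multivariate_normal(np.zeros(len(beta_hat)), cov_beta)
        p = 1.0 / (1.0 + np.exp(-(X @ (beta_hat + noise))))  # assumed logistic form
        # C2: forest where the uniform number does not exceed the probability.
        forest = draw_uniform() <= p
        # C3: total forest area in hectares.
        f_total[k] = forest.sum() * PIXEL_HA
        # C4: simulated per-pixel ash tree counts (may be negative for large noise).
        ash = (ash_hat + rng.normal(0.0, np.sqrt(ash_var))) * PIXEL_HA
        # C5: total ash tree count over forest pixels.
        a_total[k] = ash[forest].sum()
    # C6: summarize over repetitions.
    return f_total.mean(), f_total.var(ddof=1), a_total.mean(), a_total.var(ddof=1)
```

Passing a zero covariance matrix or zero interpolation variances switches off the corresponding source of uncertainty, which is one way to isolate the contribution of each source.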
In Procedure C, the contribution of uncertainty due to the logistic regression model parameter estimates may be excluded by not adding uncertainty in step C1; the contribution of uncertainty in the classification, given the parameter estimates, may be excluded by skipping step C2 and comparing the probabilities generated in step C1 to 0.5 using Procedure A; the contribution of uncertainty due to spatial correlation may be excluded by generating an uncorrelated uniform [0,1] distribution in step C2; and the contribution of uncertainty due to the interpolated ash tree counts per hectare may be excluded by not adding uncertainty in step C4. The magnitude of the contributions of individual sources of uncertainty may be estimated by considering the uncertainty estimates for Ftotal and Atotal obtained by including contributions from the sources individually and in combinations.
Encyclopedia ID: p3430



