Building the database

Determining the amount of upstream surface mining ultimately requires only two spatial data sets: one delimiting the spatial extent of surface mining and another describing terrain elevation.

The extent of surface mining is provided by SkyTruth and includes areas of surface mining as seen in remotely sensed images taken in 1976, 1985, 1995, and 2005. The area included in this analysis covers 59 counties in West Virginia, Virginia, Kentucky, and Tennessee.

The terrain elevation dataset was obtained from the National Hydrography Dataset Plus (NHD+). Elevation is used to derive a flow direction data set, one in which each cell’s value represents the direction in which its contents will flow. (Actually, flow direction is also supplied as a component of the NHD+ data set, and that is what I used.) From this, flow accumulation can be calculated. Flow accumulation, in its simplest form, is a tally of all the cells upstream of a given location. Alternatively, flow accumulation can be weighted, in which case the result is the sum of all cell values in a supplied weight raster that are upstream of a given cell (rather than a tally of upstream cells). Thus, if we set the weight raster to be a data set where surface mining cells have a value of 1 and all other cells have a value of 0, the flow accumulation result will be a tally of mining cells upstream. Multiply this tally by the area of a cell (30 x 30 m, or 0.09 hectares), and you get the amount of surface mining found upstream of each cell!
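To make the idea concrete, here is a minimal sketch of weighted flow accumulation on a tiny 3 x 3 grid. It is an illustration only, not the GIS tool used for the analysis. It assumes the flow direction raster uses the ESRI-style D8 codes (1 = east, 2 = southeast, 4 = south, and so on), and the function and variable names are made up for this example. As in the description above, a cell's own value is not counted in its own total.

```python
from collections import deque
import numpy as np

# Assumed D8 encoding (ESRI-style): code -> (row offset, column offset)
D8_OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
              16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def downstream(flow_dir, r, c):
    """Return the cell that (r, c) drains to, or None if flow leaves the grid."""
    step = D8_OFFSETS.get(int(flow_dir[r, c]))
    if step is None:
        return None
    rr, cc = r + step[0], c + step[1]
    if 0 <= rr < flow_dir.shape[0] and 0 <= cc < flow_dir.shape[1]:
        return rr, cc
    return None

def flow_accumulation(flow_dir, weight=None):
    """Sum the weights of all cells upstream of each cell (own cell excluded)."""
    rows, cols = flow_dir.shape
    if weight is None:
        weight = np.ones((rows, cols))  # unweighted: a plain tally of upstream cells
    acc = np.zeros((rows, cols))
    # Count how many neighbors drain directly into each cell.
    n_in = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            d = downstream(flow_dir, r, c)
            if d is not None:
                n_in[d] += 1
    # Start at cells with nothing upstream and work downstream.
    # (Cells caught in a flow loop are never reached; fine for a sketch.)
    queue = deque((r, c) for r in range(rows) for c in range(cols) if n_in[r, c] == 0)
    while queue:
        r, c = queue.popleft()
        d = downstream(flow_dir, r, c)
        if d is None:
            continue
        acc[d] += acc[r, c] + weight[r, c]
        n_in[d] -= 1
        if n_in[d] == 0:
            queue.append(d)
    return acc

# Every cell drains toward the center column, then south out of the grid.
flow_dir = np.array([[2, 4, 8],
                     [1, 4, 16],
                     [1, 4, 16]])
# Weight raster: 1 where a cell is surface mined, 0 elsewhere.
mined = np.array([[1, 0, 0],
                  [0, 0, 1],
                  [0, 0, 0]])

acc_cells = flow_accumulation(flow_dir)         # tally of upstream cells
acc_mined = flow_accumulation(flow_dir, mined)  # tally of upstream mined cells
CELL_AREA_HA = 30 * 30 / 10_000                 # 900 square meters = 0.09 ha
mined_ha = acc_mined * CELL_AREA_HA             # upstream mined area in hectares
```

In this toy grid the bottom-center cell is the outlet: it has 8 cells upstream, 2 of them mined, so its upstream mined area is 0.18 hectares.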

Of course, it’s not quite that easy. One major catch is that we don’t know where mining might occur outside the 59-county SkyTruth study area. There may be a large mine just outside this area. If this mine drains into one of these 59 counties, it wouldn’t appear in our flow accumulation result. Consequently, we need to exclude any areas within the 59 counties that accept flow from outside this area. This figure <> shows these excluded areas.

How was this determined?

To identify all areas that accept flow from outside the 59-county area, I created a raster data set extending just beyond the extent of the 59 counties, setting the values of cells within the 59-county area to ‘0’ and those outside to ‘1’. I then computed flow accumulation using this as a weight raster, and voilà! Any cell with a value > 0 accumulated flow from outside the 59-county study area and needed to be excluded from further analysis.
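The exclusion mask can be sketched in a few lines as well. The version below swaps in a simpler technique than a full weighted flow accumulation: it walks downstream from every outside cell and flags each cell it passes through. The result is the same set of cells, because a cell has an outside-weighted accumulation > 0 exactly when some outside cell lies upstream of it. The D8 codes are again assumed to be ESRI-style, and the names are hypothetical.

```python
import numpy as np

# Assumed D8 encoding (ESRI-style): code -> (row offset, column offset)
D8_OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
              16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def receives_outside_flow(flow_dir, inside):
    """Flag study-area cells that have at least one outside cell upstream."""
    rows, cols = flow_dir.shape
    tainted = np.zeros((rows, cols), dtype=bool)
    for r, c in zip(*np.nonzero(~inside)):
        # Follow the flow path from this outside cell until it leaves the
        # grid or reaches a cell whose downstream path is already flagged.
        while True:
            step = D8_OFFSETS.get(int(flow_dir[r, c]))
            if step is None:
                break
            r, c = r + step[0], c + step[1]
            if not (0 <= r < rows and 0 <= c < cols) or tainted[r, c]:
                break
            tainted[r, c] = True
    return tainted & inside

# Toy example: the left column lies outside the study area.
inside = np.array([[False, True, True],
                   [False, True, True],
                   [False, True, True]])
# Top row flows east (into the study area); the other rows flow west (out of it).
flow_dir = np.array([[1, 1, 1],
                     [16, 16, 16],
                     [16, 16, 16]])

exclude = receives_outside_flow(flow_dir, inside)
```

Only the two study-area cells in the top row are flagged, since they are the only ones downstream of an outside cell. The cells in the lower rows drain out of the study area, so they are kept.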

Next up: A technical description of what I did to get the database all set up.