Creating the database – technical details

Data Extent

The 59 counties for which SkyTruth data are available were subset from ESRI’s county dataset file, projected to the Albers 1983 coordinate system (to match the elevation dataset), and then converted into a raster dataset.

SkyTruth Mining Data

Data were obtained from SkyTruth as a single shapefile in UTM Zone 17. Features are classified as either mountaintop removal (MTR) or other surface mining (SM), and each is labeled with the temporal extent of the observed MTR/SM activity:

Label        1976  1985  1995  2005
1976          X     -     -     -
1976-1985     X     X     -     -
1976-1995     X     X     X     -
1976-2005     X     X     X     X
1985          -     X     -     -
1985-1995     -     X     X     -
1985-2005     -     X     X     X
1995          -     -     X     -
1995-2005     -     -     X     X
2005          -     -     -     X

  • The shapefile was imported into the MTR SDE database in its native UTM projection and then reprojected into the Albers projection used by the NHD+ and MRLC data.
  • The reprojected Albers features were converted into a raster dataset using the same 30 m cell size as the other datasets.
  • A Python script was used to combine the MTR and SM feature categories and produce a new raster dataset for each date range.
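
The label-to-year logic behind that last step can be sketched as follows. This is a minimal illustration, not the script actually used: it assumes one output per survey year (as the list of masked datasets further down suggests), and the `(feature_id, category, label)` tuple layout and function names are hypothetical rather than taken from the SkyTruth attribute table.

```python
# Survey years that appear in the SkyTruth labels (see the table above).
SURVEY_YEARS = [1976, 1985, 1995, 2005]

def years_for_label(label):
    """Return the survey years in which a feature with this label was observed."""
    parts = [int(p) for p in label.split("-")]
    first, last = parts[0], parts[-1]
    return [y for y in SURVEY_YEARS if first <= y <= last]

def features_by_year(features):
    """Group feature ids by survey year, ignoring the MTR/SM category.

    `features` is a list of (feature_id, category, label) tuples; this
    layout is hypothetical, not the SkyTruth attribute schema.
    """
    by_year = {y: [] for y in SURVEY_YEARS}
    for feature_id, _category, label in features:
        for year in years_for_label(label):
            by_year[year].append(feature_id)
    return by_year

features = [(1, "MTR", "1976-1995"), (2, "SM", "1985"), (3, "SM", "1995-2005")]
result = features_by_year(features)
```

Each per-year group would then be rasterized as a binary mining-extent raster for that year.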


The elevation (terrain) and flow direction data were obtained from the NHD+ collection. The 59-county SkyTruth area overlaps four NHD+ data sections: 5a, 5b, 5c, and 6a. The elevation and flow direction data for these sections were downloaded, and each was mosaicked into its own single raster dataset.

Excluding areas with upstream extent outside the study area

I converted the SkyTruth county raster dataset (from above) into a binary raster, with values of 1 assigned to areas outside the 59-county region and values of 0 assigned to areas within it. I then calculated flow accumulation, using this binary raster as the weight raster and the NHD+ data as the flow direction source. Any cell in the flow accumulation result with a value > 0 has part of its drainage area outside the study extent and was reclassified as NoData. However, because some cells outside the study area also have a flow accumulation value of 0, I used the county raster as a mask when reclassifying. The resulting raster was used as a mask to extract the following datasets:

  • NHD+ DEM
  • NHD+ FlowDirection
  • NHD+ FlowAccumulation
  • NLCD 2001
  • SkyTruth Mining Extent (for 1976, 1985, 1995, and 2005)
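
The exclusion logic described above can be illustrated on a toy grid. The following is a pure-Python sketch and not the ArcGIS workflow itself: it assumes ESRI-style D8 direction codes, assumes that a cell's own weight is not counted in its accumulation, and uses hypothetical function names; `None` stands in for NoData.

```python
from collections import deque

# ESRI-style D8 direction codes mapped to (row, col) offsets (assumed encoding).
D8_OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
              16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}

def downstream(r, c, flow_dir):
    """Return the (row, col) a cell drains into, or None if it leaves the grid."""
    offset = D8_OFFSETS.get(flow_dir[r][c])
    if offset is None:
        return None
    nr, nc = r + offset[0], c + offset[1]
    if 0 <= nr < len(flow_dir) and 0 <= nc < len(flow_dir[0]):
        return nr, nc
    return None

def weighted_flow_accumulation(flow_dir, weights):
    """Sum the weights of all cells upstream of each cell (the cell itself excluded)."""
    rows, cols = len(flow_dir), len(flow_dir[0])
    acc = [[0] * cols for _ in range(rows)]
    inflows = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c, flow_dir)
            if target is not None:
                inflows[target[0]][target[1]] += 1
    # Visit cells from headwaters downstream (topological order).
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if inflows[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        target = downstream(r, c, flow_dir)
        if target is None:
            continue
        tr, tc = target
        acc[tr][tc] += acc[r][c] + weights[r][c]
        inflows[tr][tc] -= 1
        if inflows[tr][tc] == 0:
            queue.append((tr, tc))
    return acc

def study_mask(flow_dir, inside):
    """Return 1 for cells inside the study area with no outside drainage, else None."""
    outside = [[0 if v else 1 for v in row] for row in inside]
    acc = weighted_flow_accumulation(flow_dir, outside)
    return [[1 if inside[r][c] and acc[r][c] == 0 else None
             for c in range(len(inside[0]))] for r in range(len(inside))]

# Toy 3x3 grid: every cell flows east; only the top-left cell is outside the counties.
flow_dir = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
inside = [[0, 1, 1], [1, 1, 1], [1, 1, 1]]
mask = study_mask(flow_dir, inside)
```

In this toy case the whole top row is dropped: the top-left cell is outside the study area, and the two cells to its east receive drainage from it.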

Calculating upstream mining (and land cover) maps

I constructed a tool that accepts an input raster (of any extent, as long as it covers the entire SkyTruth area of analysis), reclassifies it into a binary raster in which 1 marks the values whose upstream area is to be tabulated, and then calculates flow accumulation using that binary raster as the weight.

I then used this tool to create accumulated upstream areas of NLCD development (classes 21, 22, 23, and 24) and NLCD forested areas (values 41, 42, 43, and 60). Upstream mining rasters were created simply by using the mining rasters directly as weights in a flow accumulation calculation, as they were already in binary format.
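
The reclassification step of the tool can be sketched in a few lines. This is an illustration only, using nested lists in place of a raster; `to_binary` and the sample grid are hypothetical, and the developed class codes are the ones listed above.

```python
DEVELOPED = {21, 22, 23, 24}  # NLCD developed classes named in the text

def to_binary(grid, target_values):
    """Return a grid with 1 where the cell value is in target_values, else 0."""
    return [[1 if value in target_values else 0 for value in row] for row in grid]

# Made-up 3x3 patch of NLCD class codes.
nlcd = [[21, 41, 11],
        [42, 23, 24],
        [81, 22, 43]]
developed = to_binary(nlcd, DEVELOPED)
developed_cells = sum(sum(row) for row in developed)
```

The binary grid would then serve as the weight raster in the flow accumulation calculation, so that each cell's accumulated value counts the developed cells upstream of it.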

Creating the flow line path

User points that do not fall directly on a major flow path (i.e., a stream) need to be snapped to one for upstream calculations to be accurate. Because the “snap pour point” tool does not guarantee that a point will be snapped to a stream cell (only to the cell with the largest accumulated flow value), I devised another method of adjusting each point’s location so that it lies on a stream.

This method involved creating a flow path vector layer from the flow accumulation raster and then using the snapping tool to snap user points to this feature class. Creating the flow path vector layer involved (1) thresholding the flow accumulation raster so that every cell accumulating 1000 or more cells is identified as a stream, and (2) converting this stream raster to a stream feature class using the “stream to feature” tool.

The result is similar, but not identical, to the NHD+ flowline dataset.
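
The thresholding and snapping idea can be sketched as follows. Note that this is a simplified raster-based stand-in: it snaps a point to the nearest stream cell by straight-line distance in grid coordinates, whereas the method described above snaps to a vector feature class. The function names and the toy grid are hypothetical; only the threshold of 1000 cells comes from the text.

```python
import math

STREAM_THRESHOLD = 1000  # cells, as stated in the text

def stream_cells(flow_acc, threshold=STREAM_THRESHOLD):
    """Return (row, col) for every cell whose accumulation meets the threshold."""
    return [(r, c) for r, row in enumerate(flow_acc)
            for c, value in enumerate(row) if value >= threshold]

def snap_to_stream(point, streams):
    """Return the stream cell nearest to point (row, col), by straight-line distance."""
    return min(streams, key=lambda cell: math.dist(cell, point))

# Made-up 3x3 flow accumulation values.
flow_acc = [[0, 5, 1200],
            [3, 40, 1500],
            [0, 2000, 900]]
streams = stream_cells(flow_acc)
snapped = snap_to_stream((0, 0), streams)
```

Unlike a pour-point snap, which moves the point to the cell with the largest accumulation within the search distance, this approach always lands on a cell that meets the stream threshold.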