Nolan, B.T., Hitt, K.J., and Ruddy, B.C.
2002
Probability of nitrate contamination of recently recharged ground waters in the conterminous United States
Edition 1
Raster digital data
Environmental Science and Technology
Volume 36, Number 10, Pages 2138-2145
Reston, Virginia, USA
U.S. Geological Survey
https://water.usgs.gov/lookup/getspatial?gwrisk
This data set is a national map of predicted probability of
nitrate contamination of shallow ground waters based on a
logistic regression (LR) model. The LR model was used to
predict the probability of nitrate contamination exceeding 4
mg/L in predominantly shallow, recently recharged ground waters
of the United States.
The model contains variables representing (1) nitrogen (N)
fertilizer loading, (2) percent cropland-pasture, (3) natural
log of human population density, (4) percent well-drained soils,
(5) depth to the seasonally high water table, and (6) presence
or absence of unconsolidated sand and gravel aquifers. Observed
and average predicted probabilities associated with deciles of
risk are well correlated (r2 = 0.875), indicating that the LR
model fits the data well. The likelihood of nitrate
contamination is greater in areas with high N loading and
well-drained surficial soils over unconsolidated sand and
gravels. The LR model correctly predicted the status of nitrate
contamination in 75 percent of wells in a validation data set
from the National Water-Quality Assessment (NAWQA) program.
Considering all wells used in both calibration and validation,
observed median nitrate concentration increased from 0.24 to
8.30 mg/L as the mapped probability of nitrate exceeding 4 mg/L
increased from less than or equal to 0.17 to greater than 0.83.
Determining where shallow ground water is at risk of nitrate
contamination can help managers allocate scarce resources for
prevention, monitoring, and clean-up measures.
Related_Spatial_and_Tabular_Data_Sets:
The data set is provided as a GeoTIFF image (gwrisk.tif) as well
as native ArcInfo Workstation GRID format (gwrisk.tgz).
The following table describes the display of mapped probabilities in the
GeoTIFF image:
>Probability that nitrate                    Index     RGB Values
>concentration exceeds 4 mg/L  Color         Value       R    G    B
>-----------------------------------------------------------------
> 0 to .17                     Light yellow      1     255  255  235
> Greater than .17 to .33      Orange            2     255  214  153
> Greater than .33 to .50      Green             3     209  255   51
> Greater than .50 to .67      Blue              4     102  214  255
> Greater than .67 to .83      Purple            5     204  102  255
> Greater than .83 to 1        Red               6     255    0    0
> No data                      Gray             99     230  230  230
The gwrisk.clr (colormap) file specifies the colors to be used for
displaying GRID cell values. This is an ascii file containing
the index numbers followed by RGB values:
>Index    red  green   blue
>----------------------------
>    0      0      0      0
>    1    255    255    235
>    2    255    214    153
>    3    209    255     51
>    4    102    214    255
>    5    204    102    255
>    6    255      0      0
>  ...
>   99    230    230    230
>  ...
>  255      0      0      0
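The colormap layout described above (an index followed by three RGB values per line) can be read with a few lines of Python. This is a minimal sketch, not part of the distributed data set: the function name parse_clr is hypothetical, and it assumes the file is whitespace-delimited with exactly four integers on each data line, as in the excerpt above. Lines that do not match, such as "...", are skipped.

```python
def parse_clr(text):
    """Return {index: (red, green, blue)} from colormap text.

    Assumes whitespace-delimited lines of four integers; any other
    line (headers, rules, "...") is ignored.
    """
    colors = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        try:
            index, red, green, blue = (int(p) for p in parts)
        except ValueError:
            continue  # not a data line
        colors[index] = (red, green, blue)
    return colors


# Sample rows copied from the colormap excerpt in this record.
SAMPLE_CLR = """\
0 0 0 0
1 255 255 235
2 255 214 153
3 209 255 51
4 102 214 255
5 204 102 255
6 255 0 0
...
99 230 230 230
...
255 0 0 0
"""

colormap = parse_clr(SAMPLE_CLR)
```

With the sample rows, colormap[6] is the red used for probabilities greater than .83, and colormap[99] is the gray used for "no data."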
The .aux (auxiliary) file stores coordinate system information
for the .tif file and is recognized by ArcGIS.
To extract the ArcInfo Workstation GRID from the .tgz archive
file use TARARC or the following commands:
>gunzip gwrisk.tgz
>tar xvof gwrisk.tar
The GRID "gwrisk" will be in a subdirectory called arctar00000.
>arctar00000/
>arctar00000/gwrisk/
>arctar00000/gwrisk/dblbnd.adf
>arctar00000/gwrisk/hdr.adf
>arctar00000/gwrisk/log
>arctar00000/gwrisk/metadata.xml
>arctar00000/gwrisk/prj.adf
>arctar00000/gwrisk/sta.adf
>arctar00000/gwrisk/vat.adf
>arctar00000/gwrisk/w001001.adf
>arctar00000/gwrisk/w001001x.adf
>arctar00000/info/
>arctar00000/info/arc.dir
>arctar00000/info/arc0000.dat
>arctar00000/info/arc0000.nit
>arctar00000/info/arc0001.dat
>arctar00000/info/arc0001.nit
>arctar00000/info/arc0002.dat
>arctar00000/info/arc0002.nit
>arctar00000/info/arc0002r.001
>arctar00000/log
2002
The calibration data set was from 1230 wells sampled as part of
the National Water-Quality Assessment (NAWQA) program from
1992-1995 and the validation data set was from 736 NAWQA wells
sampled from 1996-1999. The N fertilizer input data represents
1992 and the population density is from 1990.
Complete
The map could be updated as necessary as better data sets become
available and to reflect any improvements in the procedures to
define the logistic regression model.
Bounding coordinates (west, east, north, south): -129.839567, -64.113097, 51.784295, 21.217696
None
Ground water, Ground water contamination, Ground water pollution, Ground water susceptibility, Nutrients, Nitrate, National Water-Quality Assessment Program, NAWQA
inlandWaters
None
Conterminous United States
None
None
None
2002
None.
The national probability map is intended for regional use only. Areas
of high probability on the map have high potential for nitrate
contamination but are not necessarily contaminated. The map should
not be used for local applications.
Please acknowledge the U.S. Geological Survey in products derived
from these data.
Kerie J. Hitt
U.S. Geological Survey
mailing address
12201 Sunrise Valley Drive, MS-413
Reston, VA 20192
USA
1-888-275-8747
703-648-6693
kjhitt@usgs.gov
9:00am-5:30pm EST
Please contact through email
http://water.usgs.gov/nawqa/nutrients/pubs/est_v36_no10/
See figure 3 of the published report
(PDF format) for a map of the probability that nitrate exceeds 4 mg/L
in shallow ground waters of the United States, based on the LR model.
PDF
https://water.usgs.gov/GIS/browse/gwrisk.gif
Map of the probability that nitrate exceeds 4 mg/L
in shallow ground waters of the United States, based on the LR model.
GIF
Microsoft Windows 2000 Version 5.0
(Build 2195 Service Pack 3); ArcInfo Workstation version 8.3
Nolan, B.T., and Ruddy, B.C.
1996
Nitrate in ground waters of the United States--Assessing
the risk
Edition 1
USGS Fact Sheet 092-96
http://water.usgs.gov/nawqa/FS-092-96.html

Nolan, B.T., Ruddy, B.C., Hitt, K.J., and Helsel, D.R.
1997
Risk of nitrate contamination in groundwaters of the United States--
A national perspective
Edition 1
Environmental Science and Technology
Volume 31, Number 8, Pages 2229-2236
This data set is the one that previously was distributed as "GWRISK" and
now is being replaced.

Nolan, B.T., Ruddy, B.C., Hitt, K.J., and Helsel, D.R.
1998
A national look at nitrate contamination of groundwater
Edition 1
Water conditioning and purification
Volume 39, Number 12, Pages 76-79
http://water.usgs.gov/nawqa/wcp/

Nolan, B.T.
2001
Relating nitrogen sources and aquifer susceptibility to
nitrate in shallow ground waters of the United States
Edition 1
Ground Water
Volume 39, Number 2, Pages 290-299
http://water.usgs.gov/nawqa/nutrients/pubs/gw_v39_no2/

Nolan, B.T., Hitt, K.J., and Ruddy, B.C.
2002
Probability of nitrate contamination of recently recharged groundwaters
in the conterminous United States
Edition 1
Raster digital data
Environmental Science & Technology
Volume 36, Number 10, Pages 2138-2145
http://water.usgs.gov/nawqa/nutrients/pubs/est_v36_no10/

The LR model results were checked using standard USGS review
procedures.
Not applicable for raster data.
The data spans the conterminous United States.
An output cell is assigned a value of "no data" if any of the
corresponding input data cells at that location is "no data."
U.S. Geological Survey
19990601
National land cover data (NLCD)
Version 1.0
raster digital data
Sioux Falls, SD
U.S. Geological Survey
http://edc.usgs.gov/products/landcover/nlcd.html
On-line
1986
1993
early to mid 1990s
NLCD
The source data consists of land cover data at 30-m resolution.
The 30-m data was aggregated at 1-km resolution to indicate the percent of each
land cover type in the cell.
Curtis Price
Rick Clawges
1999
Population density of the conterminous United States
Edition 1.0
digital raster data
Open-File Report 99-78
Rapid City, SD
U.S. Geological Survey
https://water.usgs.gov/lookup/getspatial?ofr99-78_popdeng
On-line
1990
1990
POPDENG
1-km grid of 1990 population of the conterminous
United States, converted from an ASCII file obtained from the
Consortium for International Earth Science Information Network (CIESIN).
U.S. Department of Agriculture,
Natural Resources Conservation Service
1994
State Soil Geographic (STATSGO) Data Base
for the United States and Puerto Rico
map
Miscellaneous Pub. 1492
Fort Worth, TX
U.S. Department of Agriculture
http://soils.usda.gov/survey/geography/statsgo/
CDROM
1994
Publication date of STATSGO data
STATSGO
Weighted averages for percent well-drained
soils and depth to seasonally high water table.
James A. Miller (compiler)
19980701
The Principal Aquifers of the 48 Conterminous United States,
in National Atlas of the United States
Version 1.0
map
Madison, Wisconsin, USA
U.S. Geological Survey
http://www.nationalatlas.gov/mld/aquifrp.html
On-line
1998
Publication date of principal aquifers data
USAQV1
Surface outcrop or near-surface locations
of unconsolidated sand and gravel aquifers (excluding
glaciated sediments and alluvial aquifers along major rivers).
David R. Soller and P.H. Packard
1998
Digital representation of a map showing the thickness and
character of Quaternary sediments in the glaciated United
States east of the Rocky Mountains: surficial Quaternary
sediments
Edition 1
map
USGS Digital Data Series 38
Reston, Virginia
U.S. Geological Survey
https://pubs.usgs.gov/dds/dds38/
On-line
1998
Data compiled prior to 1987
QSURF
Location of coarse-grained surficial Quaternary sediments.
Preliminary version of the ground water risk map published in
USGS fact sheet FS-092-96.
1996
Revised version of the ground water risk map published in
Environmental Science and Technology, volume 31, number 8,
pages 2229-2236.
1997
Characteristics of nitrogen loading and aquifer susceptibility
to contamination were evaluated to determine their influence
on contamination of shallow ground water by nitrate. A set of
13 explanatory variables was derived from these
characteristics, and variables that have a significant
influence were identified using logistic regression (LR)
(Nolan, 2001).
The model was developed using data collected from more than
1200 wells by the first 20 NAWQA study units that began in
1991 and contained these variables: 1) N fertilizer loading to
the land surface; 2) percent cropland-pasture; 3) natural log
of human population density; 4) percent well-drained soils; 5)
depth to seasonally high water table; and 6) presence or
absence of a fracture zone within a surficial aquifer.
The model was used to identify variables that significantly
influence nitrate contamination of shallow ground water, but
was not validated with an independent data set and was not
used in prediction.
200103
The previous LR model was refined (Nolan and others, 2002).
The new model was recalibrated with three updated variables
that represent improved sources of data and was validated
using data collected from over 700 wells by a different set of
NAWQA study units that began in 1994.
Three variables were updated: N loading, percent
cropland-pasture, and presence or absence of rock fractures.
The current model uses separate estimates of farm and nonfarm
fertilizer loading (USGS, unpublished data, 2001). The
county-level fertilizer estimates were allocated to NLCD land
cover classes instead of being allocated equally to farm and
nonfarm uses. Farm fertilizer N was allocated to NLCD
classes orchards/vineyards, row crops and small grains.
Nonfarm fertilizer N was allocated to NLCD classes
low-intensity residential and urban/recreational grasses.
Cropland-pasture is represented by the aggregation of NLCD
classes pasture/hay, row crops, small grains, and fallow land.
In the previous model, cropland-pasture was derived from
Anderson land use/land cover data updated with 1990 population
data.
A binary indicator variable representing the presence or
absence of unconsolidated sand and gravel aquifers was
substituted for the fracture zone variable.
The other variables remain the same: natural log of human
population density, percent well-drained soils, and depth to
seasonally high water table.
Each of the variables in the LR model (farm-nonfarm N
fertilizer loading, percent cropland-pasture, natural log of
population density, percent well-drained soils, depth to the
seasonally high water table, and presence or absence of
unconsolidated sand and gravel aquifers) was compiled within
1-km grid cells for prediction of nitrate contamination at the
national scale.
All of the grid layers line up with a template representing
the conterminous U.S. consisting of 3070 rows x 4855 columns,
about 14.9 million total cells. About 7.6 million cells
contain data; the rest are "no data." Each cell measures 1-km
x 1-km.
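The template dimensions can be cross-checked against the processing window given in the GRID setwindow command later in this record. The sketch below is only an arithmetic check; the variable names are hypothetical, and the extent and cell size are copied from the setwindow and setcell commands.

```python
# Processing window (xmin, ymin, xmax, ymax) in meters, from the
# setwindow command in this record; cell size from setcell.
XMIN, YMIN, XMAX, YMAX = -2505700.250, 122020.555, 2349299.750, 3192020.555
CELL_SIZE_M = 1000.0

# Number of 1-km cells that fit across and down the window.
columns = round((XMAX - XMIN) / CELL_SIZE_M)
rows = round((YMAX - YMIN) / CELL_SIZE_M)
total_cells = rows * columns

print(rows, columns, total_cells)
```

The window yields 3070 rows and 4855 columns, for 14,904,850 cells, which agrees with the "about 14.9 million" stated above.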
The original 30-m NLCD were aggregated to give the percent of
each category of land use within 1-km cells. The area (square
km) of each type of land use in each cell is the percent
divided by 100. The aggregated 1-km NLCD data were used to
compute farm-nonfarm N fertilizer loading and percent
cropland-pasture.
Farm-nonfarm N was determined by combining NLCD areas of farm
and nonfarm land use with fertilizer application data by
county. The farm N application rate was calculated by
dividing the total farm N applied (kg) in each county by the
area of farm land within the county. Farm N was applied to
appropriate farm land classification areas within each 1-km
cell by multiplying the application rate by the farm land area
within the cell. The procedure was duplicated for nonfarm
land use and nonfarm fertilizer data by county. Total
farm-nonfarm N is the sum of farm N plus nonfarm N. [Grid
fertn92g.]
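The allocation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used to build grid fertn92g: the function names are hypothetical, land area within a 1-km cell is taken as percent / 100 square km (as described for the aggregated NLCD data), and a county with no farm or nonfarm land is assumed to get a rate of zero. The result is kg of N per cell; any conversion to the kg/ha units listed for the model variable is not covered here.

```python
def application_rate(total_n_kg, land_area_km2):
    """County N application rate in kg per square km of farm (or nonfarm) land.

    Returning 0.0 when the county has no such land is an assumption.
    """
    if land_area_km2 <= 0:
        return 0.0
    return total_n_kg / land_area_km2


def cell_total_n(percent_farm, farm_rate, percent_nonfarm, nonfarm_rate):
    """Total farm plus nonfarm N (kg) for one 1-km cell."""
    farm_area_km2 = percent_farm / 100.0        # area of farm classes in the cell
    nonfarm_area_km2 = percent_nonfarm / 100.0  # area of nonfarm classes in the cell
    farm_n = farm_rate * farm_area_km2
    nonfarm_n = nonfarm_rate * nonfarm_area_km2
    return farm_n + nonfarm_n


# Hypothetical county: 1000 kg farm N on 50 km2 of farm land,
# 100 kg nonfarm N on 20 km2 of nonfarm land.
farm_rate = application_rate(1000.0, 50.0)
nonfarm_rate = application_rate(100.0, 20.0)
example_n = cell_total_n(50.0, farm_rate, 10.0, nonfarm_rate)
```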
The NLCD percent cropland-pasture is the sum of the percent of
all categories of cropland-pasture land use in each 1-km
cell. [Grid tcroppas.]
The population data was derived from 1990 Census block group
data representing people per square km. The log is calculated
as ln(pop90g + .00001) for each 1-km cell. [Grid logpop90.]
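The small constant added before taking the log keeps the result defined for cells with zero population, since ln(0) is undefined. A minimal Python sketch of the same expression follows; the function name is hypothetical.

```python
import math

POP_OFFSET = 0.00001  # constant from the expression ln(pop90g + .00001)


def log_population_density(people_per_km2):
    """Natural log of 1990 population density with the small offset applied."""
    return math.log(people_per_km2 + POP_OFFSET)


unpopulated = log_population_density(0.0)   # finite, strongly negative
populated = log_population_density(100.0)   # close to ln(100)
```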
The STATSGO soil mapping units were gridded at 1-km
resolution. Each mapping unit has multiple components. For
well-drained soils, each cell contains the total percent of
hydrologic groups A and B, computed for all soil components
within the mapping unit. [Grid welldrg.] For depth to
seasonally high water table, each 1-km cell contains the
weighted average value of the depth to water table, based on
the relative percents of all soil components within the
mapping unit. [Grid wtdepavg.]
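The two STATSGO summaries can be illustrated with a short Python sketch. The component layout used here, a tuple of (percent of mapping unit, hydrologic group, depth to water table), is hypothetical and is not the STATSGO table structure. Dividing by the sum of the component percents is also an assumption about how the weighted average was normalized.

```python
def percent_well_drained(components):
    """Total percent of the mapping unit in hydrologic groups A and B."""
    return sum(pct for pct, group, _depth in components if group in ("A", "B"))


def weighted_depth_to_water(components):
    """Depth to seasonally high water table, weighted by component percent."""
    total_pct = sum(pct for pct, _group, _depth in components)
    if total_pct == 0:
        return None  # no components to average (assumed handling)
    return sum(pct * depth for pct, _group, depth in components) / total_pct


# Hypothetical mapping unit with three soil components.
unit = [
    (60.0, "A", 1.0),
    (30.0, "C", 2.0),
    (10.0, "B", 0.5),
]
welldrg_value = percent_well_drained(unit)
wtdepavg_value = weighted_depth_to_water(unit)
```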
The binary unconsolidated sand and gravel aquifer indicator
variable is a combination of unconsolidated sand and gravel
aquifers (Miller, 1998) and coarse-grained stratified
sediments (Soller and Packard, 1998). The principal aquifer
boundaries were converted to 1-km grid format with all
aquifers having a rock name of "Unconsolidated sand and gravel
aquifers" coded 1. All other aquifers were coded 0. [Grid
bsgaqg2.] Soller's ALLQSURF map was converted to a 1-km grid
having "coarse-grained sediments" (code 101) coded as 1 and
everything else coded as 0. [Grid qcgsed2.] The two input
grids are combined so that any cell having either sand and
gravel = 1 or coarse-grained = 1 is coded as 1. [Grid
newbaqg.]
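The combination rule for the two input grids is a cell-by-cell logical OR. A minimal Python sketch for a single cell follows; the function name is hypothetical and "no data" handling is left out.

```python
def combine_aquifer_indicator(sand_gravel, coarse_grained):
    """Return 1 if either input cell is coded 1, otherwise 0.

    sand_gravel stands for a cell of grid bsgaqg2 and coarse_grained
    for a cell of grid qcgsed2; the result corresponds to newbaqg.
    """
    return 1 if (sand_gravel == 1 or coarse_grained == 1) else 0
```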
Explanatory variables in the logistic regression model are:
>                                           Estimated     Wald
>Variable                                   coefficient   p value
>-----------------------------------------  -----------   -------
>constant                                       -5.541    <0.001
>1992 fertilizer N (kg/ha)                       0.004    <0.001
>NLCD cropland-pasture (%)                       0.016    <0.001
>ln(1990 population density)                     0.229    <0.001
>  (ln(people/km2))
>well-drained soils (%)                          0.025    <0.001
>depth to seasonally high water table (m)        1.088    <0.001
>presence or absence of unconsolidated           0.424     0.002
>  sand and gravel aquifers
To make the national probability map, the values from the 1-km
grid cells are entered into the LR equation to calculate for each
cell the probability that nitrate in shallow ground water
exceeds 4 mg/L. An output cell is assigned a value of "no
data" if any of the corresponding input data cells at that
location is "no data."
The following ArcInfo Workstation GRID commands produced the
probability grid:
>/*Probability grid with revised binary aquifer variable (using Soller's
>/* coarse grained stratified sediments)
>/*
>/*snap to window specified by xmin,ymin, xmax,ymax (geolg grid)
>setwindow -2505700.250 122020.555 2349299.750 3192020.555 d:/ancill/wolock/geol/geolg
>setcell 1000
>&if [exists pgridrev -grid] &then kill pgridrev
>/*This plugs the values for the input variables into the LR equation:
>pgridrev = (exp (-5.5408 + (.00373 * fertn92g) + (.0162 * tcroppas) ~
> + (.2286 * logpop90) + ~
>(.0249 * welldrg) + (1.0879 * wtdepavg) + (.4245 * newbaqg) )) / ~
>(1 + (exp (-5.5408 + (.00373 * fertn92g) + (.0162 * tcroppas) + (.2286 * logpop90) + ~
>(.0249 * welldrg) + (1.0879 * wtdepavg) + (.4245 * newbaqg) )))
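For readers without ArcInfo Workstation, the same equation can be evaluated for a single cell in Python, which may be useful for spot-checking grid values. This is a sketch, not the production code: the function name is hypothetical, a cell is represented as a dictionary keyed by the grid names above, and "no data" is represented as None (an assumption). The coefficients are copied from the GRID command. The form 1 / (1 + exp(-z)) is algebraically equal to exp(z) / (1 + exp(z)).

```python
import math

INTERCEPT = -5.5408
COEFFICIENTS = {
    "fertn92g": 0.00373,   # farm-nonfarm N fertilizer loading
    "tcroppas": 0.0162,    # percent cropland-pasture
    "logpop90": 0.2286,    # ln of 1990 population density
    "welldrg": 0.0249,     # percent well-drained soils
    "wtdepavg": 1.0879,    # depth to seasonally high water table
    "newbaqg": 0.4245,     # sand and gravel aquifer indicator (0 or 1)
}


def nitrate_probability(cell):
    """Probability that nitrate exceeds 4 mg/L for one cell.

    Returns None if any input is missing or None, mirroring the rule
    that an output cell is "no data" when any input cell is "no data".
    """
    if any(cell.get(name) is None for name in COEFFICIENTS):
        return None
    z = INTERCEPT + sum(coef * cell[name] for name, coef in COEFFICIENTS.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical cell with every input set to zero: only the constant applies.
zero_cell = {name: 0.0 for name in COEFFICIENTS}
baseline = nitrate_probability(zero_cell)
```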
GRID cells were randomly selected from this dataset and
checked by hand to ensure the correct probability values were
calculated using the logistic regression equation and the
input data values.
In most NAWQA study units, the predicted exceedance
probability reasonably approximates the observed proportion of
wells with nitrate exceeding 4 mg/L. This indicates that the
model adequately simulates regional N loading and aquifer
susceptibility in these areas. The model inaccurately
predicts the probability in some areas, such as the Rio Grande
Valley.
200111
The LR values were grouped by category. The resulting grid has 6 categorical
values corresponding to the range of probabilities (1-6) and 1 value for
"no data" (99).
200111
The GRID was converted to a GeoTIFF image for distribution.
>gridimage gwrisk gwrisk.clr gwrisk tiff
20030904
Received the following comments from Curtis Price and
responded to them.
1. The TIFF actually has your index values stored with an
embedded colormap. So I'd add the index values to your table
of concentrations in the supp info section, as you did in
the attribute overview. [done]
2. I think you can and should skip the world file, as the
georeferencing should be stored in the TIFF header, accessible
to AV3, ArcGIS and any other software that recognizes
GeoTIFF. But you may want to *add* the AUX file generated by
ArcCatalog for the TIFF as that would make the coordinate
system ("projection" in workstation parlance) of the TIFF
available to ArcGIS. [done]
3. Suggest using > instead of ! to format tables [either would work]
4. ARC/INFO = ArcInfo or ArcInfo Workstation.
I use the terminology [ArcInfo Workstation] GRID as the software;
"grid" is a data set. You seem to be using this convention some of the time
in your narratives, but not all the time. The downside of this is that reviewers
may want you to define GRID because it looks like an acronym. It isn't.
[done]
5. Suggest specifying processing window in your AML as x1 y1 x2 y2
rather than referring to the grid "geolg" and its snapping
which was projected from Soller's map?? [done]
6. Suggest instead of "using this aml" just say "using the following ArcInfo
Workstation GRID commands" [done]
20030908
Raster
Grid Cell
Rows: 3070; columns: 4855; vertical count: 1
Albers Conical Equal Area
Standard parallels: 29.5, 45.5
Longitude of central meridian: -96
Latitude of projection origin: 23
False easting: 0.00000
False northing: 0.00000
coordinate pair
Abscissa resolution: 1000.0; ordinate resolution: 1000.0
METERS
North American Datum of 1983
GRS80
Semi-major axis: 6378137.000000
Denominator of flattening ratio: 294.257222
GWRISK.VAT
Value attribute table for the grid
ArcInfo Workstation GRID software
VALUE
Categorical value of probability that nitrate
exceeds 4 mg/L or no data
USGS
1
0 to .17 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
2
more than .17 to .33 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
3
more than .33 to .50 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
4
more than .50 to .67 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
5
more than .67 to .83 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
6
more than .83 to 1 probability that nitrate
exceeds 4 mg/L based on LR model
USGS
99
no data
USGS
COUNT
Count of cells in each VALUE category
Calculated by ArcInfo Workstation GRID software
188207276548
GWRISK.STA
Statistics table that accompanies the grid
ArcInfo Workstation GRID software
MIN
Minimum GRID cell VALUE
Computed
1
Computed
This is the minimum value
of the categorical values (1,2,3,4,5,6,99) in the grid.
MAX
Maximum GRID cell VALUE
Computed
99
Computed
This is the maximum value
of the categorical values (1,2,3,4,5,6,99) in the grid.
MEAN
Mean GRID cell VALUE
Computed
49.152
Computed automatically by GRID software
The MEAN is meaningless for the
categorical values.
STDV
Standard Deviation of GRID cell VALUEs
Computed
48.691
Computed automatically by ArcInfo Workstation GRID software
The STDV is meaningless for the
categorical values.
Each 1-km grid cell stores the value of the category of
predicted probability that nitrate concentration in shallow
ground water exceeds 4 mg/L, as calculated by the multi-variate
LR model. No data is indicated by a value of 99. The values of
the categories are:
> Grid    Probability that nitrate exceeds 4 mg/L
> cell    in shallow ground waters of the
> value   United States, based on the LR model
>--------------------------------------------------
>   1     0 to .17
>   2     More than .17 to .33
>   3     More than .33 to .50
>   4     More than .50 to .67
>   5     More than .67 to .83
>   6     More than .83 to 1
>  99     No data
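The category boundaries in the table can be expressed as a small lookup. The Python sketch below is illustrative only: the function name is hypothetical, "no data" is represented as None (an assumption), and each upper bound is treated as inclusive, following the "More than ... to ..." wording of the table.

```python
NO_DATA_VALUE = 99

# (inclusive upper bound, grid cell value), in ascending order.
CATEGORY_BOUNDS = [
    (0.17, 1),
    (0.33, 2),
    (0.50, 3),
    (0.67, 4),
    (0.83, 5),
    (1.00, 6),
]


def probability_category(probability):
    """Map an LR probability (0 to 1) to the grid cell value 1-6, or 99."""
    if probability is None:
        return NO_DATA_VALUE
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    for upper_bound, value in CATEGORY_BOUNDS:
        if probability <= upper_bound:
            return value
    return CATEGORY_BOUNDS[-1][1]  # not reached for inputs in [0, 1]
```

For example, a probability of exactly .17 falls in category 1, while any value just above .17 falls in category 2.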
To obtain an ArcInfo grid containing the actual probabilities calculated
in the LR model, contact the NAWQA Nutrients Synthesis team by
emailing kjhitt@usgs.gov.
None.
U.S. Geological Survey
Ask USGS - Water Webserver Team
mailing
445 National Center
Reston, VA 20192
1-888-275-8747 (1-888-ASK-USGS)
https://water.usgs.gov/user_feedback_form.html
Although this data set has been used by the U.S. Geological
Survey, U.S. Department of the Interior, no warranty expressed or
implied is made by the U.S. Geological Survey as to the accuracy
of the data and related materials. The act of distribution shall not
constitute any such warranty, and no responsibility is assumed by
the U.S. Geological Survey in the use of this data, software, or
related materials.
Any use of trade, product, or firm names is for descriptive
purposes only and does not imply endorsement by the U.S.
Government.
Other; Arc Info grid; zipped; 2.709; https://water.usgs.gov/GIS/dsdl/gwrisk.zip
Other; Arc Info grid; zipped; 1; https://water.usgs.gov/GIS/dsdl/gwrisk.tgz
Other; Color map file for grid; 1; https://water.usgs.gov/GIS/dsdl/gwrisk.clr
Other; Geo TIFF image; zipped; 1; https://water.usgs.gov/GIS/dsdl/gwrisk.tar.gz
None. This dataset is provided by USGS as a public service.
20041108
U.S. Geological Survey
Ask USGS -- Water Webserver Team
mailing
445 National Center
Reston, VA 20192
1-888-275-8747 (1-888-ASK-USGS)
https://answers.usgs.gov/cgi-bin/gsanswers?pemail=h2oteam&subject=GIS+Dataset+gwrisk
FGDC Content Standards for Digital Geospatial Metadata
FGDC-STD-001-1998