National Geochemical Atlas: The geochemical landscape of the conterminous United States derived from stream sediment and other solid sample media analyzed by the National Uranium Resource Evaluation (NURE) Program

A subset of the National Uranium Resource Evaluation (NURE)
Hydrogeochemical and Stream Sediment Reconnaissance (HSSR) data
was used to produce a set of map images depicting the general
geochemistry of the conterminous US.

Approximately 260,000 samples from the continental US were
analyzed. These consisted of solid samples, including stream,
lake, pond, spring, and playa sediments, and soils. Data for
eleven elements were analyzed and included on this release of
the National Geochemical Atlas: Na, Ti, Fe, Cu, Zn, As, Ce, Hf,
Pb, Th, and U.

Purpose:

The National Uranium Resource Evaluation (NURE) program of the
Department of Energy (DOE) collected a vast amount of chemical
data on sediment, soil, and water from the United States in the
late 1970's and early 1980's. This element of the NURE program
was known as the Hydrogeochemical and Stream Sediment
Reconnaissance (HSSR). The NURE HSSR data have long been
available to the public in a variety of formats, ranging from the
original paper reports produced by the DOE (Averett, 1984),
to comprehensive digital releases on CD-ROM by the U.S.
Geological Survey in the last few years (Hoffman and Buttleman,
1994; 1996), to digital releases on the Internet of reformatted
and cleaned data (Smith, 1998). While these publications remain
the best sources of the complete, primary data, and are
accompanied by documentation of the sampling protocols, sample
characteristics, and analytical methods, they are difficult to
use for geochemical research, especially when the study area
covers a large part of the United States. This publication is
intended to allow the rapid visualization of the geochemical
landscape of the United States using the NURE HSSR data. Here,
the user is relieved of the responsibility of selecting and
processing the raw data.

Supplemental_Information:

Because the NURE HSSR data have been processed by the author for
the production of these images, the user must use a degree of
caution in interpreting the maps produced here. One must
understand the methods used in deriving the data in order to
judge the significance of any particular map feature. For
reference, the raw data used to produce these images are
available in digital form (Hoffman and Buttleman, 1996), for
examination by sophisticated users.

Issues relating to the analytical methods are best discussed
in some of the source materials generated by the NURE program.
In addition, a number of problems relating to the variability
of the lower limit of detection were encountered during the
processing of the data for these images. To some extent the
categorization of element concentration into classes covered
up some of these discrepancies.

Logical_Consistency_Report:

A variety of data problems were identified during the
processing that led to the production of these map images.
Most of these have satisfactory solutions, but the user
must review the processing steps to judge whether they are
appropriate for new uses.

Completeness_Report:

Since the images are based on data from surveyed quadrangles,
holes both large and small occur in the images. Large holes
are typically the size and shape of groups of quadrangles, and
the data were either unavailable in the source information or
were discarded for reasons having to do with errors or
inconsistencies found during processing. None of the maps
shows continuous data for the entire area.

National Geochemical Data Base:
1. National Uranium Resource Evaluation (NURE) Hydrogeochemical and Stream Sediment Reconnaissance (HSSR) data for Alaska, formatted for GSSEARCH data base software;
2. NURE HSSR Data formatted as dBASE files for Alaska and the conterminous United States;
3. NURE HSSR Data for Alaska and the conterminous United States as originally compiled by the Department of Energy

308 quadrangle files covering the continental US from Hoffman
and Buttleman (1996) contained data for stream, lake, or spring
sediments, and a subset of 43 of these files also contained
data for soils. Records covering these sample media were used
to produce these images.

Source_Citation_Abbreviation: DDS-47
Source_Contribution: Information from the PLUTO database

Process_Step:

Process_Description:

These images were made by examining a series of dBase (DBF)
files, each containing the point data for a single element in a
set of solid (sediment) samples from the NURE HSSR program.

The starting point for the data processing that yielded these
images is the set of quadrangle-by-quadrangle DBF files of NURE
HSSR data found in Hoffman and Buttleman (1996). Note that
these files are not the raw NURE data, but are themselves
processed from the original digital files (on tape) produced by
DOE. Indeed, the DOE tapes are also not the true raw data from
the program, as there was a manual data-processing step to
transfer data from paper reports. 308 quadrangle files
(covering the continental U.S.) from Hoffman and Buttleman
(1996) contained data for stream, lake, or spring sediments, and
a subset of 43 of these files also contained data for soils.
Records covering these sample media were used to produce these
images.

Initial data processing and clean-up

Most of the selection of records from the original DBF files
and other primary data extraction tasks were done with the
Paradox database program. The steps in this procedure were as
follows:

1.1 Record selection

Records were extracted from the quadrangle DBF files for the
appropriate sample media using one or more of the following
field codes. (See Hoffman and Buttleman, 1994, for explanation
of codes.)

After surveying each file (through a series of Paradox
queries), a new query was constructed that extracted all
records for stream sediments (wet and dry), lake and pond
sediments (including dry lakes), spring sediments, and soils.

Most chemical data in the quadrangle DBF files are stored in
parts-per-billion (ppb). Paradox was used to convert each field
into a more appropriate unit: parts-per-million (ppm) for trace
elements, and weight percent for major elements (Al, Ca, Fe, K,
Mg, and Na).
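
The unit conversion described above can be sketched in Python as
follows. This is only an illustration, not the Paradox procedure
actually used; the function name convert_from_ppb and the constant
names are hypothetical. It relies on 1 ppm = 1,000 ppb and
1 weight percent = 10,000 ppm.

```python
# Major elements are reported in weight percent, all others in ppm,
# following the element list given in the text.
MAJOR_ELEMENTS = {"Al", "Ca", "Fe", "K", "Mg", "Na"}

PPB_PER_PPM = 1_000.0
PPB_PER_WT_PERCENT = 10_000_000.0  # 1 wt% = 10,000 ppm = 1e7 ppb


def convert_from_ppb(element, value_ppb):
    """Return (value, unit) for a concentration stored in ppb.

    Hypothetical helper. The sign is preserved, so a negative value
    (an upper limit in the DBF files) stays negative.
    """
    if element in MAJOR_ELEMENTS:
        return value_ppb / PPB_PER_WT_PERCENT, "wt%"
    return value_ppb / PPB_PER_PPM, "ppm"
```

For example, convert_from_ppb("Cu", 15000.0) gives 15.0 ppm.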

1.4 Record consolidation

Many samples were analyzed by more than one laboratory, or by
more than one method. In these cases, there are multiple
records in the quadrangle DBF files for an individual sample
location, each with analyses for different elements. These
records were found and combined into a single record.

Paradox was used to sort the records by latitude and longitude.
A temporary DBF file was generated, and read by a DOS FORTRAN
program, ECLEAN, written by the author (unpublished). This
program searched for consecutive records that had identical or
nearly identical geographic coordinates (within 0.0005 degrees,
or ~50 m, of each other). These were assumed to be the same
sample, as round-off errors sometimes affected the 4th decimal
place. ECLEAN then combined these records, element by element,
into a single new record. In the few cases where data for the
same element were present in two or more records, the highest
value was arbitrarily chosen. This process also had the effect
of consolidating samples actually collected as duplicates at a
single location into single records. ECLEAN also eliminated
records with no chemical data (and there were many of these).
The program then created a new DBF file with the consolidated
data.
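
The consolidation logic described above might look roughly like
the following Python sketch. It is not the ECLEAN source, which is
unpublished; the record layout (dictionaries with "lat", "lon",
and element keys, with None for a missing value) and the function
name consolidate are assumptions made for illustration. Each
record is compared with the first record of the current group,
which is one plausible reading of "consecutive records".

```python
TOLERANCE_DEG = 0.0005  # tolerance stated in the text, roughly 50 m


def consolidate(records):
    """Merge records with nearly identical coordinates.

    `records` must already be sorted by latitude and longitude.
    Records with no chemical data are dropped. Where an element
    appears in more than one record, the highest value is kept.
    """
    merged = []
    for rec in records:
        # Keep only element fields that actually hold a value.
        chem = {k: v for k, v in rec.items()
                if k not in ("lat", "lon") and v is not None}
        if not chem:
            continue  # no chemical data: eliminate the record
        last = merged[-1] if merged else None
        if (last is not None
                and abs(rec["lat"] - last["lat"]) <= TOLERANCE_DEG
                and abs(rec["lon"] - last["lon"]) <= TOLERANCE_DEG):
            # Same sample: combine element by element.
            for elem, value in chem.items():
                if elem not in last or value > last[elem]:
                    last[elem] = value
        else:
            merged.append({"lat": rec["lat"], "lon": rec["lon"], **chem})
    return merged
```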

Secondary data processing

At the beginning of this processing stage, the 308 original
quadrangle DBF files have been reduced to 308 new DBF files
containing only the geographic and chemical-element fields of
the sediment and soil data, without any duplicate or blank
records. Major systematic problems, as discussed above, have
been corrected. The following processing steps were used to
find and correct additional problems in the datasets, to search
for regional inconsistencies in the data, and to establish the
usefulness of data reported as upper limits (for example, <10
ppm).

2.1 Data surveying

The reduced DBF files were surveyed with a DOS FORTRAN program,
also written by the author, called GRIDPLOT. This program
reads in multiple DBF files, and produces a simple, color,
gridded map of the data for one element on the computer screen.
Systematic errors that were not found during primary data
processing could be seen visually, as discontinuities in the
colored map. In some cases, these could be traced to
systematic errors in the quadrangle DBF files, especially
errors in the position of decimal points. These were corrected
by repeating the primary processing for the affected
quadrangle. Other discontinuities are caused by analytical
errors, and were handled through the data leveling procedure
described next.

2.2 Data leveling

In some areas, generally in the western U.S., one or more
quadrangles, or parts of quadrangles, would appear to be
discontinuous with adjacent quadrangles for a given element,
when viewed with GRIDPLOT. In many such instances, a good case
can be made that there is a systematic analytical error (that is,
an accuracy problem, probably due to different analytical
methods or interlaboratory calibration problems) across the
discontinuity. The best argument for the occurrence of this
type of error is that regional chemical trends are seen on both
sides of the discontinuity, and the application of a simple
correction factor can make the data appear continuous. In
these cases, a correction factor is supplied to GRIDPLOT for
the affected areas, and the factor is adjusted until the
gridded map appears smooth and continuous. In other cases,
either no correction factor can correct the discontinuity, or
regional trends are absent in certain quadrangles and the data
appear to be random. Such data were discarded and not used to
produce these images.
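
Applying a correction factor of this kind is a simple
multiplication, as in the Python sketch below. In the procedure
described above the factor was tuned by eye in GRIDPLOT; the
median_ratio function is a hypothetical way to obtain a starting
value and is not part of the original method. Both function names
are assumptions.

```python
from statistics import median


def apply_factor(values, factor):
    """Scale every concentration in an affected area by one factor."""
    return [v * factor for v in values]


def median_ratio(affected, neighbors):
    """Hypothetical starting estimate for a correction factor.

    Ratio of the median concentration in adjacent, trusted
    quadrangles to the median in the affected area. Not the
    author's method, which relied on visual adjustment.
    """
    return median(neighbors) / median(affected)
```

A factor found this way would still need the visual check
described above, since a real geochemical boundary can also
produce a step between quadrangles.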

2.3 Data below detection limits

A negative concentration of an element in the quadrangle DBF
files indicates that the value is an upper limit (for example
-10 means <10). These values present a special problem in
creating map coverages of geochemical data. The philosophy
adopted here is simple: steps were taken to ensure that all
such upper limits fall within the lowest interval in the final
map legend, and thus are known to be correctly categorized.

First, two histograms were prepared for each element, one
showing the concentration range of unqualified data, the other
showing only upper limits. For most elements, the vast majority
of the data fell in the first histogram, and markers were
inserted into this plot showing the values of every 5th
percentile (for reference). The second histogram was displayed
below the first and compared visually. The strategy was to
select a cutoff value below which upper limits are to be
retained, such that they do not affect the accuracy of the map.
Above this cutoff value, upper limits are deleted from the final
dataset. The graphical result of deletions of this type is
small holes in the map where grid cells could not be assigned
real values.
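
The cutoff rule can be expressed as a short filter. The Python
sketch below assumes the convention described above (a negative
value is an upper limit) and uses a hypothetical function name and
a cutoff supplied by the caller; in the original work each cutoff
was chosen by visual comparison of the two histograms.

```python
def filter_upper_limits(values, cutoff):
    """Drop upper limits that exceed the chosen cutoff.

    Negative values are upper limits (-10 means <10). An upper
    limit is kept only if its magnitude is at or below `cutoff`,
    so that it is known to fall in the lowest map interval.
    Unqualified (positive) values are always kept.
    """
    kept = []
    for v in values:
        if v < 0 and abs(v) > cutoff:
            continue  # upper limit too high to classify reliably
        kept.append(v)
    return kept
```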

2.4 Data extraction

Once the data were leveled, upper limit cutoffs were
established, and areas of bad data were identified, the
GRIDPLOT program was run again to extract values for a single
element from all 308 processed quadrangle DBF files. For the
special case of uranium, GRIDPLOT was programmed to make
choices about which data field to use for the final value.
Uranium is typically stored in one of five fields in the
original quadrangle DBF files: one labeled as CONU, the others
as CONCN01, CONCN02, CONCN05, and CONUDN. The CONCN05 field was
given priority over the CONU field if both were filled, and
data in the CONCN01 and CONCN02 fields were used only when both
of those fields were empty. The CONUDN field (U by delayed
neutron) was coded in only a few percent of the samples (in
only 9 quadrangles), and these data were not used here. The
output from this data processing step is a series of elemental
DBF files of usable NURE data.
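
The field-priority rule for uranium can be sketched in Python as
below. The field names come from the text; the function name and
the dictionary record layout are assumptions. The relative order
of CONCN01 and CONCN02 is not stated in the text, so the order
used here is arbitrary.

```python
# CONCN05 outranks CONU; CONCN01 and CONCN02 are fallbacks.
# CONUDN is deliberately left out because those data were not used.
URANIUM_PRIORITY = ("CONCN05", "CONU", "CONCN01", "CONCN02")


def pick_uranium(record):
    """Return the first filled uranium value in priority order, or None."""
    for field in URANIUM_PRIORITY:
        value = record.get(field)
        if value is not None:
            return value
    return None
```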

Major errors corrected

Several major errors in the NURE HSSR data were identified and
corrected during the above data-processing steps. These errors
are present in the original DBF files and composite database of
Hoffman and Buttleman (1994; 1996). The errors will be
corrected in a new database (Smith, 1998), but at this time
that database covers only a small part of the United States.

3.1 Miscoded samples

The data survey conducted for each quadrangle DBF file in step
1.1 uncovered a block of stream-sediment samples miscoded as
stream water in seven quadrangles in the northeastern U.S.
(Boston, Glens Falls, Lake Champlain, Lewiston, Newark, Scranton,
and Williamsport). These records were altered to give them the
correct coding prior to any data processing.

3.2 Data in incorrect units

In about 30,000 samples collected and analyzed by Oak Ridge
Gaseous Diffusion Plant (ORGDP) and tabulated in the quadrangle
DBF files, major elements (Al, Ca, Fe, K, Mg, and Na) plus As
and Se were all tabulated incorrectly, in units other than ppb.
Over 70 quadrangles contain data affected by this problem.
These records can be identified from the lack of coding in the
SAMPTYP field, and a value of 4 coded in the SAMPMDC field.
These problems were corrected as a group.
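
A record test based on the two field codes named above might look
like the following Python sketch. The function names and the
dictionary record layout are hypothetical, and the stored type of
SAMPMDC (text or integer) is an assumption; the sketch accepts
either "4" or 4.

```python
def _text(value):
    """Normalize a DBF field value to stripped text ('' if missing)."""
    return "" if value is None else str(value).strip()


def is_orgdp_unit_problem(record):
    """True when SAMPTYP is uncoded and SAMPMDC is coded as 4."""
    return _text(record.get("SAMPTYP")) == "" and _text(record.get("SAMPMDC")) == "4"
```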

About 15,000 records found in several dozen quadrangles in the
western U.S. (samples analyzed but not collected by ORGDP) also
contain major element data in ppm instead of ppb, although
trace elements are all coded correctly. Most of these are
coded as soils (SAMPTYP=59), talus (SAMPTYP=62), or uncoded
in this field (SAMPTYP=blank), and all have a value of M coded
in the LTYPC field, which stands for sediment. These were
also corrected by special handling.

Data products

Themes with names of the form Grid: Cu are elemental
concentration maps, produced from a gridded version of the
point data. These bitmap files (TIFF) are based on grids made
with the MINC program of Webring (1981), which employs a
minimum curvature interpolation of the point data to create a
smooth surface. The grid-cells used were 2 km on each side.

Following the gridding operation, the program GCLR written by
R.W. Simpson of USGS in Menlo Park, California was used to
produce a color shaded relief map. The color scheme of these
maps is similar to that used in the point data themes, as it is
based upon the distribution of the underlying point data. Here,
seven intervals were used, corresponding to the 0-40th, the
40th-80th, the 80th-90th, the 90th-95th, the 95th-98th, the
98th-99th, and the 99th-100th percentiles. The legends for all
these maps are collected in a special image called
Scalebar.tif, which shows the actual concentration values of
each element corresponding to each color interval in the
gridded elemental maps.
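
The seven percentile classes can be illustrated with the Python
sketch below. It is not the GCLR program; the function names are
hypothetical, and the nearest-rank percentile definition is an
assumption, since the text does not say how percentiles were
computed.

```python
from bisect import bisect_left

# Upper bounds of the first six classes; the seventh runs to 100.
PERCENTILE_BREAKS = (40, 80, 90, 95, 98, 99)


def class_breaks(values):
    """Concentration at each percentile break (nearest-rank method)."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for p in PERCENTILE_BREAKS:
        rank = (p * n + 99) // 100  # integer ceiling of p*n/100
        breaks.append(ordered[max(rank, 1) - 1])
    return breaks


def classify(value, breaks):
    """Return the class index, 0 (lowest 40 percent) through 6 (top 1 percent)."""
    return bisect_left(breaks, value)
```

With this definition a value equal to a break falls in the lower
of the two adjoining classes.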

The U.S. Geological Survey (USGS) provides this raster data as is. The USGS makes no
guarantee or warranty concerning the accuracy of information contained in the raster data.
The USGS further makes no warranties, either expressed or implied as to any other matter
whatsoever, including, without limitation, the condition of the product, or its fitness
for any particular purpose. The burden for determining fitness for use lies entirely
with the user. Although this data has been processed successfully on computers at the
USGS, no warranty, expressed or implied, is made by the USGS regarding the use of this
data on any other system, nor does the fact of distribution constitute or imply any such
warranty.