US Environmental Protection Agency
20090213
EPA statistical predictions of Ozone for 2002 at 36km grid
Research Triangle Park, NC
U.S. EPA Office of Research & Development (ORD) - National Exposure Research Laboratory (NERL)
The O3Surface_36km_2002 file is the output data file from a hierarchical Bayesian model (HBM) that combines ozone monitoring data from National Air Monitoring Stations/State and Local Air Monitoring Stations (NAMS/SLAMS) and simulated ozone data from the deterministic prediction model, Models-3/Community Multiscale Air Quality (CMAQ). The file contains estimates of the mean and standard error for 36 km x 36 km grids across the contiguous US for each day of the modeling year.
The data are intended for use by professionals comparing air quality and health outcomes, through techniques such as case crossover analysis. Other uses may be developed at a later time. The standard errors of the estimates should be taken into account when using the results.
January 2, 2002
December 30, 2002
ground condition
Complete
None planned
-128.09
-65.47
51.46
23.10
ISO 19115 Topic Category
environment
EPA GIS Keyword Thesaurus
Air
None
United States
Data Are Restricted to Internal EPA and CDC Personnel, State Partners and Academic Partners.
The data are intended for use by professionals comparing air quality and health outcomes, through techniques such as case crossover analysis. Other uses may be developed in collaboration with EPA. The standard errors of the estimates should be considered when using the predicted means. Please confirm that you are using the most recent copy of both data and metadata. Acknowledgement of the EPA would be appreciated.
Branch Chief
mailing and physical address
USEPA/NERL/HEASD D201-03
Research Triangle Park
NC
27711
(919) 541-5537
Dimmick.Fred@epa.gov
http://www.epa.gov/
U.S. Environmental Protection Agency, NERL, CDC IAG Team
Fred Dimmick
FIPS Pub 199
No Confidentiality
Standard Technical Controls
Acknowledgement of EPA as the source of these estimates would be appreciated.
The HB data are created in a Windows XP Professional environment using a compiled C-code implementation of the HB Model. The model produces comma separated values (CSV). For CDC, the CSV files are transformed to XML files using SAS, and then the data XML file and this metadata file are compressed using WinZip into a single file to allow efficient transfer to CDC via EPA's CDX service.
None
Features represented have not been tested for completeness
The grid structure is defined a priori. The alignment of air quality measurements to the grid structure is done by using the horizontal position associated with the measurement monitors.
12728
The grid is 36km by 36km by design.
Monitors within a grid are assigned to the centroid of the grid and may be as far as 12.728 km from the centroid.
Essentially all monitors are placed within 3 meters of the surface of the earth.
3
Qualitative estimate
Calculate the predictions
20090121
Metadata produced.
20090211
Metadata imported
Unknown
0.000001
0.000001
Decimal degrees
Lambert conformal with spherical earth
See Attributes
See Attributes
See Attributes
meters
The O3Surface_36km_2002 file is the output data file from a hierarchical Bayesian model (HBM) that combines ozone monitoring data from National Air Monitoring Stations/State and Local Air Monitoring Stations (NAMS/SLAMS) and simulated ozone data from the deterministic prediction model, Models-3/Community Multiscale Air Quality (CMAQ). The file contains the posterior means and standard errors of the estimated space-time surface, which is made up of contiguous 36 km x 36 km grids covering the contiguous United States. The time frame of interest is January 2, 2002 through December 30, 2002. The file includes the following variables: Date, Latitude, Longitude, Grid Row (row), Grid Column (col), Posterior mean estimated ozone concentration in ppm (Ozone_pred), and Standard error of the estimated ozone concentration (Ozone_stdd).
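As a minimal sketch of how the variables listed above might be used, the Python example below parses CSV text and attaches an approximate interval (mean plus or minus 1.96 standard errors) to each grid cell. The exact header spelling, the date format, and the sample values are assumptions made for illustration; they are not taken from the dataset, and the normal-theory interval is only one simple way of taking the standard errors into account.

```python
import csv
import io

# Hypothetical two-row sample using the variable names listed above.
# Header spelling, date format, and values are illustrative assumptions.
SAMPLE = """Date,Latitude,Longitude,row,col,Ozone_pred,Ozone_stdd
2002-01-02,35.0,-80.0,40,100,0.040,0.005
2002-01-02,35.3,-79.6,41,101,0.050,0.010
"""


def approx_interval(mean, std_err, z=1.96):
    """Return an approximate normal-theory interval: mean +/- z * std_err."""
    return mean - z * std_err, mean + z * std_err


def load_rows(text):
    """Parse CSV text into dicts holding the ozone estimate and its interval."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        mean = float(rec["Ozone_pred"])   # posterior mean, ppm
        se = float(rec["Ozone_stdd"])     # standard error, ppm
        low, high = approx_interval(mean, se)
        rows.append({
            "date": rec["Date"],
            "row": int(rec["row"]),
            "col": int(rec["col"]),
            "mean": mean,
            "se": se,
            "low": low,
            "high": high,
        })
    return rows


rows = load_rows(SAMPLE)
for r in rows:
    print(r["date"], r["row"], r["col"], round(r["low"], 4), round(r["high"], 4))
```

Carrying the interval alongside the mean keeps the uncertainty visible in any downstream comparison with health outcomes.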
Input data
The air quality monitoring data from the NAMS/SLAMS network were downloaded from the Air Quality System (AQS) database. Only Federal Reference Method (FRM) samplers were included in the dataset. Data from the first Pollutant Occurrence Code (POC) were used. The downloaded data cover January 1, 2002 through December 31, 2002.
The CMAQ data were created from version 4.6 of the model using the CBIV mechanism and cover January 1, 2002 through December 31, 2002. These model results were developed in July 2007 and delivered to the HBM team for inclusion in this part of the process in October 2007. The data are daily maximum 8-hour ozone concentrations calculated on a 36 km x 36 km grid across the contiguous 48 States. These CMAQ results are based on (1) the emissions data from the EPA's National Emissions Inventory (NEI) 2002 (developed using the mobile emissions model Mobile 6) and (2) daily continuous emissions monitoring (CEM) data for the major NOx point sources. In addition, the meteorological data used for these model results are from Mesoscale Model 5 (MM5) version 3.6.3 simulations (FDDA, Pleim-Xu lsm).
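To illustrate the "daily maximum 8-hour ozone concentration" metric, the Python sketch below takes 24 hourly values for one day, forms every 8-hour running mean that fits inside that day, and returns the largest. This is a simplified illustration under stated assumptions: the function name and the sample values are hypothetical, and official procedures may treat window boundaries and missing hours differently.

```python
def daily_max_8hr(hourly):
    """Largest 8-hour running mean over one day's 24 hourly values (ppm).

    Simplified: only windows that lie entirely within the day are used,
    and every hour is assumed to be present.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    window = 8
    means = [
        sum(hourly[start:start + window]) / window
        for start in range(len(hourly) - window + 1)
    ]
    return max(means)


# Hypothetical day: low overnight values with an 8-hour afternoon plateau.
example_day = [0.02] * 10 + [0.06] * 8 + [0.02] * 6
print(round(daily_max_8hr(example_day), 4))
```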
The Hierarchical Bayesian Model (HBM)
The HBM combines the actual monitoring data (NAMS/SLAMS) and the estimated ozone concentration surface (CMAQ) to predict ozone through space and time. The model assumes that both the actual monitoring data and the CMAQ data provide good information about the same underlying pollutant surface, but with different measurement error structures. It gives more weight to the accurate monitoring data in areas where monitoring data exist and relies on bias-adjusted CMAQ data in areas where no monitoring data are available. The modeling is divided into hierarchical components, where each level of the hierarchy is modeled conditional on the preceding levels. To fit the model, custom-designed Markov chain Monte Carlo (MCMC) software was used. Priors for the model and simulation parameters are specified for each run of the model.
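The idea that two sources describing the same surface are weighted according to their error can be illustrated with a toy calculation. The Python sketch below is not the HBM and is not the authors' implementation: it combines two independent normal estimates of a single value by inverse-variance (precision) weighting, so the source with the smaller error dominates. The function name and all numbers are hypothetical, and the actual model additionally involves space-time structure, bias adjustment of CMAQ, priors, and MCMC fitting.

```python
def combine_estimates(monitor, monitor_sd, model, model_sd):
    """Precision-weighted combination of two independent normal estimates.

    Toy illustration only: each source is weighted by 1 / sd**2, so the
    more precise source pulls the combined mean toward itself.
    """
    w_monitor = 1.0 / monitor_sd ** 2
    w_model = 1.0 / model_sd ** 2
    mean = (w_monitor * monitor + w_model * model) / (w_monitor + w_model)
    sd = (w_monitor + w_model) ** -0.5
    return mean, sd


# Hypothetical values in ppm: a precise monitor and a less precise model.
mean, sd = combine_estimates(0.050, 0.002, 0.060, 0.008)
print(round(mean, 5), round(sd, 5))
```

In this toy setting the combined mean lies close to the monitor value, and the combined standard deviation is smaller than either input.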
The projections for the grid structure are as follows:
Projection information for lat/lon contained in the raw HB datasets
Horizontal coordinate system
Geographic coordinate system name: GCS_WGS_1984
Details
Geographic Coordinate System
Latitude Resolution: 0.000001
Longitude Resolution: 0.000001
Geographic Coordinate Units: Decimal degrees
Prime Meridian: Greenwich
Angular unit: Degree
Radians/unit: 0.0174532925199433
Geodetic Model
Horizontal Datum Name: D_WGS_1984
Ellipsoid Name: WGS_1984
Semi-major Axis: 6378137.000000
Denominator of Flattening Ratio: 298.257224
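The geodetic parameters above are related by simple formulas. As a minimal sketch, the Python snippet below derives the flattening and the semi-minor axis from the listed semi-major axis and denominator of the flattening ratio, and checks the radians-per-degree constant. The variable names are illustrative, and the semi-major axis is taken to be in meters.

```python
import math

SEMI_MAJOR_M = 6378137.0      # semi-major axis from the listing, in meters
INV_FLATTENING = 298.257224   # denominator of the flattening ratio

# Flattening f = 1 / (inverse flattening); semi-minor axis b = a * (1 - f).
flattening = 1.0 / INV_FLATTENING
semi_minor_m = SEMI_MAJOR_M * (1.0 - flattening)

# One degree expressed in radians, matching the "Radians/unit" entry.
rad_per_degree = math.pi / 180.0

print(round(semi_minor_m, 3))
print(rad_per_degree)
```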
McMillan, N., Holland, D. M., Morara, M., and Feng, J. (2010). Environmetrics 21, 48-65.
Dimmick, F., Hall, E., and Tikvart, J. (2010). Hierarchical Bayesian Model (HBM)-Derived Estimates of Air Quality for 2002 - Annual Report. U.S. Environmental Protection Agency, Washington, DC, EPA/600/R-10/017.
Branch Chief
mailing and physical address
USEPA/NERL/HEASD D201-03
Research Triangle Park
NC
27711
(919) 541-5537
Dimmick.Fred@epa.gov
http://www.epa.gov/
U.S. Environmental Protection Agency, NERL, CDC IAG Team
Fred Dimmick
Offline Data
Although these data have been processed successfully on a computer system at the Environmental Protection Agency, no warranty expressed or implied is made regarding the accuracy or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any such warranty. It is also strongly recommended that careful attention be paid to the contents of the metadata file associated with these data to evaluate data set limitations, restrictions or intended use. The U.S. Environmental Protection Agency shall not be held liable for improper or incorrect use of the data described and/or contained herein.
20090213
20130213
Branch Chief
mailing and physical address
USEPA/NERL/HEASD D201-03
Research Triangle Park
NC
27711
(919) 541-5537
Dimmick.Fred@epa.gov
http://www.epa.gov/
U.S. Environmental Protection Agency, NERL, CDC IAG Team
Fred Dimmick
FGDC Content Standard for Digital Geospatial Metadata
FGDC-STD-001-1998
201003060918320020080925
Point