Project ID: 2008OH64B
Title: Competitive Learning to Develop a Biomarker Forecasting Tool for Classifying Recreational Water Quality
Project Type: Research
Start Date: 7/01/2008
End Date: 6/30/2009
Congressional District: 1
Focus Categories: Water Quality, Surface Water, Non Point Pollution
Keywords: water quality, recreational management, non point pollution, neural networks, microbial
Principal Investigator: Boccelli, Dominic L
Federal Funds: $28,248
Non-Federal Matching Funds: $56,496

Abstract: Recreational users of urban-influenced surface waters, such as in the Cincinnati region, can be exposed to unhealthy levels of microbial contamination due to increased runoff resulting from greater impervious surface area and direct discharge of combined sewer and/or storm water into surface waters. Unfortunately, laboratory testing requires, at a minimum, 24 hours for analysis and reporting, thereby eliminating the possibility of alerting the public to potentially unsafe conditions in a timely fashion. The objective of this study is to develop a classification tool that accurately identifies microbial outbreaks using readily available data to provide engineers, managers, regulators, and public health officials an opportunity to inform the populace regarding the public health status of recreational waters in an almost real-time environment.

Learning Vector Quantization (LVQ) is proposed as the basis for a water quality classification tool for use in a regional Recreation Management Program. Unlike most data-driven tools, which first predict a concentration and then use that prediction to classify the water quality, LVQ is a statistical classification approach intended to classify the water quality directly from readily available hydrologic and meteorologic data. Eliminating the need to predict microbial concentrations also eliminates the uncertainties associated with those predictions.
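As an illustration of classifying directly from explanatory data, the following is a minimal sketch of the basic LVQ1 update rule in Python with NumPy. It is not the project's implementation: the function names, the single prototype per class, the class-mean initialization, and the linearly decaying learning rate are all assumptions made for the example.

```python
import numpy as np

def train_lvq1(X, y, n_epochs=30, lr=0.1, seed=0):
    """Train one prototype per class with the basic LVQ1 rule (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # Assumption: start each prototype at the mean of its own class.
    prototypes = np.array([X[y == c].mean(axis=0) for c in classes])
    for epoch in range(n_epochs):
        step = lr * (1.0 - epoch / n_epochs)  # linearly decaying learning rate
        for i in rng.permutation(len(X)):
            # The "winner" is the prototype closest to the current sample.
            winner = np.argmin(np.linalg.norm(prototypes - X[i], axis=1))
            # Pull the winner toward the sample if the labels match,
            # otherwise push it away.
            sign = 1.0 if classes[winner] == y[i] else -1.0
            prototypes[winner] += sign * step * (X[i] - prototypes[winner])
    return prototypes, classes

def predict_lvq(X, prototypes, classes):
    """Assign each sample the class of its nearest prototype."""
    X = np.asarray(X, dtype=float)
    dists = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]
```

In this sketch each row of `X` would hold the explanatory variables for one sample and `y` would hold a hypothetical class label such as 0 for acceptable and 1 for unsafe; no concentration is predicted at any step.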

The LVQ algorithm will be developed using water quality, meteorologic, and hydrologic data associated with the Ohio River and three of its tributaries located in the Cincinnati region. Water quality measurements were collected at eighteen locations in the system over one and a half recreational seasons (May to October) and will be paired with available hydrologic and meteorologic data. The explanatory variables to be considered include precipitation event characteristics (duration, intensity, and total volume), number of preceding dry weather days, and total previous rainfall. Initial data exploration will be performed to identify the pertinent explanatory variables.
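Two of the explanatory variables listed above, the number of preceding dry weather days and the total previous rainfall, could be derived from a daily rainfall record roughly as follows. This is a sketch under stated assumptions: the function name, the daily time step, the three-day window, and the wet-day threshold are hypothetical choices, not values taken from the study.

```python
import numpy as np

def antecedent_features(daily_rain_mm, sample_day, window_days=3, wet_threshold_mm=0.0):
    """Return (preceding dry days, rainfall total over the preceding window).

    daily_rain_mm: rainfall depth per day; sample_day: index of the sampling day.
    Assumes window_days >= 1. All defaults are illustrative assumptions.
    """
    rain = np.asarray(daily_rain_mm, dtype=float)
    before = rain[:sample_day]  # days strictly before the sampling day
    # Count consecutive dry days immediately preceding the sampling day.
    dry_days = 0
    for amount in before[::-1]:
        if amount > wet_threshold_mm:
            break
        dry_days += 1
    # Sum the rainfall over the preceding window.
    previous_rain = float(before[-window_days:].sum())
    return dry_days, previous_rain
```

For a record of `[0, 5, 2, 0, 0, 1]` and a sample taken on day index 5, the two days before sampling are dry and the preceding three-day total is 2.0.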

The classification performance of the LVQ algorithm will be compared to that of more typical approaches (multivariate linear regression and neural networks) using the same data set. Since water quality classification is the most important aspect of these algorithms, the evaluation will focus on comparing correct classifications, with particular emphasis on the true-positive and false-negative rates. The best-performing algorithm will be developed into a tool capable of near-real-time use in predicting water quality associated with the local riverine system.
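The comparison metrics named above can be computed from paired observed and predicted classes. The sketch below assumes a binary labeling in which 1 marks an unsafe sample (the positive class); the function name and label encoding are hypothetical.

```python
import numpy as np

def classification_rates(y_true, y_pred, positive=1):
    """Return accuracy, true-positive rate, and false-negative rate."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    is_pos = y_true == positive
    tp = int(np.sum(is_pos & (y_pred == positive)))  # unsafe, flagged as unsafe
    fn = int(np.sum(is_pos & (y_pred != positive)))  # unsafe, missed
    n_pos = tp + fn
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        # Rates are undefined when no positive samples were observed.
        "true_positive_rate": tp / n_pos if n_pos else float("nan"),
        "false_negative_rate": fn / n_pos if n_pos else float("nan"),
    }
```

The false-negative rate is the share of unsafe samples that a classifier would have reported as acceptable, which is the error of most concern for a public health advisory.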