**MODELS PREDICT ENSEMBLE AVERAGES**

Venkatram, A. (1979): The expected deviation of observed concentrations from predicted ensemble means. Atmos. Environ. (11):1547-1549.

“…we expect the 1-h averaged concentration to deviate from the ensemble mean by more than 100%. …This analysis shows that under unstable conditions, poor comparison of observations with predictions should be expected. …Our discussion brings up the question of model validation. It is clear that the expected deviation can be reduced by averaging several observations under similar conditions. Then, for adequate validation, the predicted concentration should be compared against an average derived from an ensemble of measured concentrations….”
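The effect of averaging described in this passage can be sketched numerically. The example below is illustrative only: it assumes the observations are independent draws about the same ensemble mean, uses a coefficient of variation of 1 (a fluctuation comparable to the mean), and picks a lognormal distribution for the simulation. None of these choices, nor the function names, are taken from Venkatram (1979).

```python
import math
import random
import statistics


def relative_std_of_average(cv, n):
    """Relative standard deviation of the mean of n independent observations.

    cv is the coefficient of variation (std / mean) of a single observation.
    Assumes independence, so the variance of the average falls as 1/n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return cv / math.sqrt(n)


def simulate_relative_std(cv, n, trials=20000, seed=0):
    """Monte Carlo check using a lognormal with ensemble mean 1 (assumed shape)."""
    rng = random.Random(seed)
    sigma2 = math.log(1.0 + cv ** 2)   # log-space variance giving the requested cv
    mu = -0.5 * sigma2                 # log-space mean giving an ensemble mean of 1
    sigma = math.sqrt(sigma2)
    averages = [
        statistics.fmean([rng.lognormvariate(mu, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    # The ensemble mean is 1, so the spread of the averages is already relative.
    return statistics.pstdev(averages)


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, relative_std_of_average(1.0, n), round(simulate_relative_std(1.0, n), 3))
```

Under these assumptions, a single observation scatters about the ensemble mean by roughly 100%, while an average of 16 observations scatters by roughly 25%, which is why a group average is the fairer quantity to compare against a model prediction.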

Fox, D.G. (1984): Uncertainty in air quality modeling. Bull. Amer. Meteoro. Soc. (65):27-36.

“…There is agreement in the meteorological community that air quality modeling results contain various types of uncertainty (although not stated explicitly) such that they represent no more than an estimate within the distribution of possible values. Generally, turbulence must be averaged in time, space, or over a number of realizations of the flow pattern, in order to elicit meaningful information. In doing this, parameters such as the dispersion coefficients are defined by their mean or average values without consideration of the variation around that mean. … The details of atmospheric motion fields are not predictable without uncertainty, nor is the concentration of a pollutant released into any turbulent fluid predictable without uncertainty. In studies of turbulence, it is convenient to introduce the notion of an ensemble, namely a number of repeats of the same ‘experiment,’ holding external conditions (boundary and initial conditions) fixed….”

Weil, J.C., R.I. Sykes, and A. Venkatram (1992): Evaluating air-quality models: review and outlook. Journal of Applied Meteoro. (31):1121-1145.

“…Air-quality models predict the mean concentration for a given set of conditions (i.e., an ensemble), whereas observations are individual realizations drawn from the ensemble. The natural variability is the random concentration fluctuation about the mean and is large (of the order of the mean; section 2). The steering committee considered the natural variability to be very significant in hampering the performance evaluation. …The natural variability, also called the inherent uncertainty (Fox 1984; Venkatram 1982), is caused by PBL turbulence. It arises because the details of the velocity field are not the same in each realization of a turbulent flow….”

**MODEL EVALUATION IDEAS**

Presentation at the Guideline on Air Quality Models conference, March 19-21, 2013, at the Sheraton Raleigh Hotel, Raleigh, N.C. Provided are the PowerPoint presentation, extended notes, and handout notes.

ASTM D6589 Standard Guide for Statistical Evaluation of Atmospheric Dispersion Model Performance. This Standard Guide provides concepts that are useful for the comparison of modeled air concentrations with observed field data. In the Annex of this Standard Guide a procedure is outlined that compares observed average centerline concentration values with average modeled centerline concentration values. Provided are links to 1) the Standard Guide, 2) FORTRAN code that implements the procedure, and 3) data sets used to demonstrate the procedure.
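The idea of comparing averages rather than individual values can be sketched in a few lines. The example below is not the D6589 Annex procedure or its FORTRAN implementation; it is a minimal illustration that assumes the centerline concentrations have already been selected and tagged with a regime (here a hypothetical stability label and downwind distance). The function names and sample values are invented for the illustration. Fractional bias is used as the summary statistic because it is a common choice in dispersion model evaluation.

```python
from collections import defaultdict
from statistics import fmean


def regime_averages(records):
    """Average observed and modeled values within each regime.

    records: iterable of (regime_key, observed, modeled) tuples.
    Returns {regime_key: (observed_mean, modeled_mean, count)}.
    """
    groups = defaultdict(lambda: ([], []))
    for key, observed, modeled in records:
        groups[key][0].append(observed)
        groups[key][1].append(modeled)
    return {key: (fmean(obs), fmean(mod), len(obs)) for key, (obs, mod) in groups.items()}


def fractional_bias(observed_mean, modeled_mean):
    """Fractional bias of the regime averages; 0 means no bias in the mean."""
    return 2.0 * (observed_mean - modeled_mean) / (observed_mean + modeled_mean)


if __name__ == "__main__":
    # Hypothetical data: (stability label, downwind distance in m), observed, modeled.
    sample = [
        (("unstable", 100), 12.0, 10.0),
        (("unstable", 100), 8.0, 10.0),
        (("unstable", 100), 10.0, 10.0),
        (("stable", 100), 30.0, 20.0),
        (("stable", 100), 20.0, 20.0),
    ]
    for key, (obs_mean, mod_mean, n) in regime_averages(sample).items():
        print(key, n, obs_mean, mod_mean, round(fractional_bias(obs_mean, mod_mean), 3))
```

In the hypothetical unstable regime, the individual observations differ from the modeled value by up to 20%, yet the regime average matches it exactly, so the model shows no bias when judged on the quantity it is meant to predict.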

**DISCUSSION**

It is not a given that the concepts espoused in my March 2013 presentation or in D6589 are universally accepted. There are some who still insist that it is sufficient to compare the arc-maxima concentration values observed at each downwind distance with those simulated by the model. I am hoping that my March 2013 presentation will discourage such analyses.

The Harmonization Initiative (which was begun in 1991) is a continuing effort to reach consensus on best practices in air quality characterization. As part of the Harmonization Initiative, conferences are held every 18 months to stimulate development of ideas and, ultimately, consensus on best practices. As part of this effort, a model validation kit was created to stimulate discussion on how best to evaluate model performance.