Ensemble averaging based assessment of spatiotemporal variations in ambient PM2.5 concentrations over Delhi, India, during 2010–2016

GeoHealth Hub India Team

Research output: Contribution to journal › Article › peer-review

32 Scopus citations


Elevated levels of ambient air pollution have been implicated as a major risk factor for morbidity and premature mortality in India, with particularly high concentrations of particulate matter in the Indo-Gangetic plain. High-resolution spatiotemporal estimates of such exposures are critical for assessing health effects at the individual level. This article retrospectively assesses daily average PM2.5 exposure on a 1 km × 1 km grid over Delhi, India, from 2010 to 2016, using multiple data sources and ensemble averaging approaches. We used a multi-stage modeling exercise involving satellite data, land use variables, reanalysis-based meteorological variables and population density. A calibration regression was used to model PM2.5:PM10 to counter the sparsity of ground monitoring data. The relationship between PM2.5 and its spatiotemporal predictors was modeled using six learners: generalized additive models, elastic net, support vector regressions, random forests, neural networks and extreme gradient boosting. These predictions were then combined under a generalized additive model framework using tensor-product-based spatial smoothing. Overall cross-validated prediction accuracy of the model was 80% over the study period, with high spatial model accuracy and predicted annual average concentrations ranging from 87 to 138 μg/m3. Annual average root mean squared errors for the ensemble-averaged predictions were in the range 39.7–62.7 μg/m3, with prediction bias ranging between 4.6 and 11.2 μg/m3. In addition, tree-based learners such as random forests and extreme gradient boosting outperformed the other algorithms. Our findings indicate important seasonal and geographical differences in particulate matter concentrations within Delhi over a significant period of time, along with meteorological and land use features that discriminate between the most and least polluted regions. This exposure assessment can be used to estimate dose-response relationships more accurately over a wide range of particulate matter concentrations.
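
The ensemble step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the article combines the six learners under a generalized additive model with tensor-product spatial smoothing, whereas this sketch substitutes a plain least-squares combination with an intercept. The data are synthetic, and the function names (`fit_stacking_weights`, `combine`, `rmse`, `bias`) are hypothetical, chosen only for illustration.

```python
import numpy as np


def fit_stacking_weights(base_preds, observed):
    """Least-squares intercept and weights for combining base-learner predictions.

    Assumption: a linear combiner stands in for the GAM-based combination
    described in the abstract.
    """
    design = np.column_stack([np.ones(len(observed)), base_preds])
    coef, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coef


def combine(base_preds, coef):
    """Apply the fitted intercept and weights to base-learner predictions."""
    return coef[0] + base_preds @ coef[1:]


def rmse(pred, obs):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def bias(pred, obs):
    """Mean signed error (prediction minus observation)."""
    return float(np.mean(pred - obs))


# Synthetic stand-ins: "observed" PM2.5 and three noisy base learners.
rng = np.random.default_rng(0)
n = 500
truth = rng.uniform(30, 300, size=n)
base_preds = np.column_stack([
    truth + rng.normal(0, 40, n),
    0.8 * truth + 20 + rng.normal(0, 30, n),
    truth + rng.normal(10, 50, n),
])

# Fit the combiner on one part of the data and evaluate on the held-out rest.
train, test = slice(0, 400), slice(400, None)
coef = fit_stacking_weights(base_preds[train], truth[train])
ensemble = combine(base_preds[test], coef)

print("held-out RMSE:", rmse(ensemble, truth[test]))
print("held-out bias:", bias(ensemble, truth[test]))
```

Because each base learner is itself a special case of the linear combination, the fitted ensemble can never have a higher RMSE than any single learner on the data it was fitted to; held-out performance, as reported through cross-validation in the abstract, has no such guarantee and must be checked empirically.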

Original language: English
Article number: 117309
Journal: Atmospheric Environment
State: Published - 1 Mar 2020


  • Hybrid models
  • Machine learning
  • Particulate matter
  • Pollution exposure
  • Satellite observations

ASJC Scopus subject areas

  • General Environmental Science
  • Atmospheric Science

