Authors: Shariq Nawaj Khan and Upasana Yadav
Abstract: Urban air pollution is one of the most urgent environmental and public health issues of the twenty-first century, especially in rapidly urbanizing cities in developing nations such as India. The complex, nonlinear behavior of air contaminants typically limits the effectiveness of conventional statistical modeling. To address this, the present work uses ground-based monitoring data from Lucknow, India, to examine machine learning (ML) approaches for modeling and forecasting urban air quality. City-level aggregated concentrations of six major pollutants (PM₂.₅, NO₂, O₃, NO, SO₂, and CO) were obtained from several air quality monitoring sites for June to September 2025. Two frequently used ML models, Extreme Gradient Boosting (XGBoost) and Artificial Neural Networks (ANNs), were developed and tested using an 80:20 train–test split combined with 10-fold cross-validation. Model performance was evaluated using the coefficient of determination (R²) and mean squared error (MSE). The findings suggest that pollutants with relatively stable emission patterns, such as PM₂.₅ and NO₂, are predicted well, whereas pollutants with highly variable and localized emission features, such as CO and NO, are predicted less accurately. The results highlight that no single model is always the best and that ML effectiveness varies by pollutant. Overall, this study shows how ML-based methods can enhance urban air quality assessment and support evidence-based environmental planning in mid-sized Indian cities.
Keywords: Air Pollution, Machine Learning, XGBoost, Artificial Neural Networks, Urban Air Quality
DOI: https://doi.org/10.66095/ijair.2026.v2.S1.20
Pages: 197-215