Improving Prediction of Air Quality Using Machine Learning Techniques
Abstract
Fine particulate matter (PM2.5) is a major contributor to air pollution in India, and both short- and long-term exposure pose serious health risks. Accurate prediction of PM2.5 levels is essential for devising effective strategies to reduce emissions and manage air quality. This study explores the use of machine learning models to predict daily PM2.5 concentrations. Five models (Linear Regression, Gradient Boosting Regression, K-Nearest Neighbors Regression, Decision Tree Regression, and Random Forest Regression) were developed using features such as season, date, year, weekday, hour, and other relevant factors. A comprehensive dataset spanning six years (2017–2022) was used to train and test these models. The Decision Tree Regressor demonstrated the highest accuracy, achieving an R-squared value of 0.95 and a Mean Absolute Error (MAE) of 2.88. These findings suggest that, of the five models compared, the Decision Tree Regressor is the most suitable for forecasting PM2.5 concentrations and can support air quality improvement initiatives.
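For readers unfamiliar with the two evaluation metrics reported above, the following is a minimal sketch of how R-squared and MAE are conventionally computed from observed and predicted values. The function names and the sample numbers are illustrative assumptions only; they are not the study's code or data.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observations are equal")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up PM2.5 readings and predictions, for illustration only.
    observed = [40.0, 55.0, 62.0, 48.0, 70.0]
    predicted = [42.0, 53.0, 60.0, 50.0, 69.0]
    print("MAE:", mean_absolute_error(observed, predicted))
    print("R-squared:", r_squared(observed, predicted))
```

An R-squared close to 1 indicates that the predictions explain most of the variance in the observations, while MAE expresses the typical prediction error in the same units as the target variable.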