Prediction and Forecasting of Air Quality Index in Chennai Using Regression and ARIMA Time Series Models
Abstract
Air is one of the most fundamental constituents for the sustenance of life on Earth. Meteorological and traffic factors, consumption of non-renewable energy sources, and industrial activity are steadily increasing air pollution. Because these factors affect the welfare and prosperity of life on Earth, air quality needs to be monitored continuously. The Air Quality Index (AQI), which indicates the quality of air, is influenced by the concentrations of several individual pollutants: NO2, CO, O3, PM2.5, SO2, and PM10. This paper aims to predict and forecast the AQI using two Machine Learning (ML) techniques, linear regression and time series analysis. First, a Multiple Linear Regression (MLR) model, a supervised machine learning method, is developed to predict the AQI. Sensor readings of NO2, ozone (O3), PM2.5, and SO2 collected from the Central Pollution Control Board (CPCB) for the Chennai region, India, serve as input features, and the optimized AQI calculated from the sensor outputs serves as the target for training the regression model. The fitted model parameters are validated on new, unseen sensor data, and the performance is analyzed using several quantitative indices. Second, an Auto Regressive Integrated Moving Average (ARIMA) time series model is applied to forecast the AQI at future times. The fitted model parameters are again validated on data from a time period not used for fitting. The results indicate that both models predict and forecast the AQI accurately.
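The two modelling steps outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins rather than CPCB readings, the coefficients and noise levels are invented for illustration, and the function names (`fit_mlr`, `predict_mlr`, `fit_arima_110`, `forecast_arima_110`) are hypothetical. The time series part assumes a fixed ARIMA(1,1,0) model with drift, fitted by conditional least squares on the differenced series; the abstract does not state the model order or the estimation method actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: multiple linear regression (MLR) on synthetic data.
# The four columns stand in for NO2, O3, PM2.5 and SO2 readings (assumed).
n = 200
X = rng.uniform(0.0, 100.0, size=(n, 4))
true_w = np.array([0.5, 0.3, 1.2, 0.2])  # invented coefficients
y = 10.0 + X @ true_w + rng.normal(0.0, 1.0, size=n)  # synthetic "AQI"

# Hold out the last 20% as unseen validation data.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]


def fit_mlr(features, target):
    """Ordinary least squares with an intercept; returns [b0, b1, ..., bk]."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def predict_mlr(coef, features):
    """Apply the fitted intercept and slopes to new feature rows."""
    return coef[0] + features @ coef[1:]


coef = fit_mlr(X_train, y_train)
pred = predict_mlr(coef, X_test)

# Two common quantitative indices: RMSE and the coefficient of determination.
rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
ss_res = float(np.sum((y_test - pred) ** 2))
ss_tot = float(np.sum((y_test - y_test.mean()) ** 2))
r2 = 1.0 - ss_res / ss_tot
print("MLR  RMSE:", rmse, " R^2:", r2)

# Part 2: ARIMA(1,1,0) with drift, fitted by conditional least squares.
# Synthetic series: first differences follow an AR(1) process with phi = 0.6.
m = 500
eps = rng.normal(0.0, 1.0, size=m)
diffs = np.zeros(m)
for t in range(1, m):
    diffs[t] = 0.6 * diffs[t - 1] + eps[t]
aqi_series = 100.0 + np.cumsum(diffs)


def fit_arima_110(series):
    """Difference once (d=1), then regress each difference on the previous one."""
    d = np.diff(series)
    design = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    (c, phi), *_ = np.linalg.lstsq(design, d[1:], rcond=None)
    return float(c), float(phi)


def forecast_arima_110(series, c, phi, steps):
    """Iterate the AR(1) recursion on differences and integrate back to levels."""
    level = series[-1]
    last_diff = series[-1] - series[-2]
    out = []
    for _ in range(steps):
        last_diff = c + phi * last_diff
        level = level + last_diff
        out.append(level)
    return np.array(out)


# Fit on all but the last 10 points, then forecast those 10 held-out steps.
history, future = aqi_series[:-10], aqi_series[-10:]
c_hat, phi_hat = fit_arima_110(history)
fc = forecast_arima_110(history, c_hat, phi_hat, steps=10)
forecast_rmse = float(np.sqrt(np.mean((future - fc) ** 2)))
print("ARIMA(1,1,0)  phi:", phi_hat, " forecast RMSE:", forecast_rmse)
```

In both parts the model is fitted on one portion of the data and scored on a held-out portion, mirroring the validation on new and unseen data described above. In practice the ARIMA order would be selected from the data, and a dedicated time series library would normally replace the hand-written least-squares fit.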