Enhanced Sales Forecasting Using XGBoost and Feature Engineering: A Comparative Evaluation
Abstract
Sales forecasting is essential for supply chain planning, inventory management, pricing, customer satisfaction, and strategic business decisions. Traditional forecasting methods, however, often miss nonlinear trends and complex influences in changing markets. This study examines a sales forecasting approach that combines extreme gradient boosting (XGBoost) with systematic feature engineering on structured retail data. The engineered features include day, week, and month indicators, seasonal trends, lag variables, rolling-window statistics, promotional indicators, and holiday effects. We evaluate the proposed approach against several standard models (ARIMA, linear regression, and random forests) using RMSE, MAE, and MAPE. XGBoost outperforms all baseline models, with average error reductions of 15–22% and greater stability across product categories. In a direct comparison with a basic random regressor, XGBoost supplemented with customized time-related and categorical features also performs better, achieving a reduced root mean squared error (RMSE) of 945.18 and an R² score of 0.81. These results show that XGBoost with structured feature engineering substantially improves forecasting accuracy and addresses real-world sales prediction challenges, providing a flexible analytics solution for retail and business planning. The findings support the adoption of machine-learning-driven forecasting systems to increase efficiency and inform strategic decisions.
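As an illustration of the kind of features and error measures named above, the following is a minimal sketch, not the pipeline used in this study. It assumes a daily sales table for a single product with hypothetical columns `date` and `sales`; the lag offsets (1 and 7 days) and the 7-day rolling window are illustrative choices, and the function names are assumptions introduced here.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add calendar, lag, and rolling-window features to daily sales data.

    Assumes hypothetical columns `date` (datetime64) and `sales` (numeric),
    with one row per day for a single product.
    """
    out = df.sort_values("date").reset_index(drop=True).copy()

    # Calendar features derived from the date column.
    out["day_of_week"] = out["date"].dt.dayofweek  # Monday = 0
    out["week"] = out["date"].dt.isocalendar().week.astype(int)
    out["month"] = out["date"].dt.month

    # Lag features: sales observed 1 and 7 days earlier (assumed offsets).
    out["lag_1"] = out["sales"].shift(1)
    out["lag_7"] = out["sales"].shift(7)

    # Rolling mean over the previous 7 days. The shift(1) excludes the
    # current day so the feature does not leak the target value.
    out["roll_mean_7"] = out["sales"].shift(1).rolling(window=7).mean()
    return out


def rmse(y_true, y_pred) -> float:
    """Root mean squared error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred) -> float:
    """Mean absolute error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def mape(y_true, y_pred) -> float:
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)
```

Rows at the start of the series have missing lag and rolling values because no earlier observations exist. Gradient boosting libraries such as XGBoost can typically handle missing values directly, while models such as linear regression would need those rows dropped or imputed.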