The prediction of product return rates with ensemble machine learning algorithms

  • Ergun Eroglu


Up-to-date data are not always available for planning, so plans for future periods usually rely on predicted data. Errors in these predictions cause many products to enter the reverse logistics network before completing their shelf life. In the textile sector in particular, which is fashion-dependent, accurate estimates are the most important point of planning, because they avoid unnecessary resource utilization and keep costs to a minimum. Real-life prediction problems are difficult to model mathematically because they have a multivariate structure and unknown parameters. Most studies in the literature are based on time series prediction, but changing fashions and consumer demands lead to significant differences between demand forecasts and real data. For problems with unknown parameters and a multivariate structure, Ensemble Machine Learning (EML) methods have therefore been preferred in recent years, because they give more accurate results than other prediction methods.

Unlike other studies, this paper is the first to predict the product return rate in the textile sector with the Bagging, Random Subspace (RSS), Stacking and Vote algorithms from Ensemble Machine Learning methods. The study concentrates on the returns of products sold according to customer preferences and aims to predict these returns more accurately. The consumer information obtained from the analyses can then support more accurate planning: avoiding unnecessary production, transportation and storage activities, and reducing costs, resource utilization and environmental pollution. A further main aim of the study is to contribute to the literature by determining the parameters that can be used to predict return rates.

High performance results were obtained with the Stacking and Vote algorithms. The results are reported comparatively, and a correlation coefficient of 83.89% was reached. Even at the lowest product price level of $6.6, the 0.138 increase in prediction performance means avoiding an average cost loss of $2,865,124.8 per year.
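
As a rough illustration of two of the ensemble schemes named above (Bagging and Vote) and of scoring predictions with a correlation coefficient, the following minimal sketch may help. It is not the study's implementation: the base learner (a one-feature least-squares line), the helper names fit_line, bagging, vote and pearson, and the synthetic "discount rate" data are all assumptions made for this example, and the number it prints has no relation to the results reported in the paper.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Base learner for this sketch: least-squares line y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Fall back to a flat line if every x in the sample is identical.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def bagging(xs, ys, n_models=25, seed=0):
    """Bagging: fit one base learner per bootstrap resample, then average."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: mean(m(x) for m in models)


def vote(models):
    """Vote (regression form): average the predictions of the given models."""
    return lambda x: mean(m(x) for m in models)


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


# Synthetic, hypothetical data: x is a made-up discount rate,
# y is a made-up return rate. Neither comes from the study.
rng = random.Random(42)
xs = [rng.uniform(0.0, 0.6) for _ in range(200)]
ys = [0.05 + 0.4 * x + rng.gauss(0.0, 0.03) for x in xs]
train_x, train_y, test_x, test_y = xs[:140], ys[:140], xs[140:], ys[140:]

single = fit_line(train_x, train_y)   # one base learner
bagged = bagging(train_x, train_y)    # bootstrap-aggregated learners
voted = vote([single, bagged])        # average of the two models

r = pearson([voted(x) for x in test_x], test_y)
print(f"correlation between predicted and actual return rates: {r:.3f}")
```

Random Subspace and Stacking are not shown here. The former differs from Bagging in that each base learner is trained on a random subset of the input features rather than a resample of the rows, and the latter trains a second-level model on the predictions of the base learners instead of simply averaging them.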

Industrial Engineering