A movie box office revenues prediction algorithm based on human-machine collaboration feature processing

  • Dongqi Wang Department of Data Science and Management Engineering, School of Management, Zhejiang University, Hangzhou, China.
  • Yanqing Wu Meta-intelligence and Decision Lab., Hangzhou OR Cloud Co., Ltd., Hangzhou, China.
  • Chenmin Gu Ningbo High School, Ningbo, China.
  • Yiqin Wang Ningbo High School, Ningbo, China.
  • Xingyu Zhu Ningbo High School, Ningbo, China.
  • Weihua Zhou Department of Data Science and Management Engineering, School of Management, Zhejiang University, Hangzhou, China.
  • Xin (Maxwell) Lin Department of Data Science and Management Engineering, School of Management, Zhejiang University, Hangzhou, China.

Abstract

Improving the accuracy of box office revenue forecasts helps stimulate creation, market investment, infrastructure construction, and the rational allocation of public resources in the film market, and promotes social welfare and cultural prosperity. Because existing box office revenue prediction algorithms do not consider the structure of the film industry, their prediction accuracy is unsatisfactory. This paper first builds a two-stage human-machine collaborative feature processing framework. In the first stage, a regression decision tree algorithm is applied to the box office data to process all box office features preliminarily and to delete unimportant features automatically. In the second stage, feature processing is coupled with the constructed Artificial Neural Network (ANN): the features retained by the machine are manually classified and divided into multiple mutually incompatible feature sets, and the neural network is then pruned with a purpose-designed incompatible-set network pruning algorithm. We construct a data set of 7098 movies crawled from four online platforms. Numerical experiments show that the two-stage algorithm achieves a significantly lower Mean Absolute Error (MAE) than the baseline model. The approach effectively reduces the noise introduced by directly encoding incompatible features together, improves the prediction accuracy of the ANN, accelerates its forward inference, and reduces the consumption of computing resources.
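
The abstract does not give the details of the pruning algorithm, so the following is only a minimal sketch of one way the second stage could be realized, not the authors' implementation. It assumes that pruning amounts to removing first-layer connections so that each block of hidden units receives input from a single feature set only. The function names, the two example feature groups, and the layer sizes are all hypothetical.

```python
import numpy as np

def build_incompatibility_mask(feature_groups, hidden_per_group):
    """Return a 0/1 mask of shape (n_features, n_hidden).

    Assumption: each manually defined feature group gets its own block of
    hidden units, and connections across groups are pruned (set to 0).
    """
    n_features = sum(len(group) for group in feature_groups)
    n_hidden = hidden_per_group * len(feature_groups)
    mask = np.zeros((n_features, n_hidden))
    for k, group in enumerate(feature_groups):
        block = slice(k * hidden_per_group, (k + 1) * hidden_per_group)
        mask[group, block] = 1.0  # keep only within-group connections
    return mask

def hidden_layer(x, w1, b1, mask):
    """ReLU hidden layer whose weights are pruned by the mask."""
    return np.maximum(0.0, x @ (w1 * mask) + b1)

def predict(x, w1, b1, w2, b2, mask):
    """Forward pass of a one-hidden-layer regression network."""
    return hidden_layer(x, w1, b1, mask) @ w2 + b2

# Hypothetical grouping: features 0-1 form one set, features 2-4 another.
feature_groups = [[0, 1], [2, 3, 4]]
mask = build_incompatibility_mask(feature_groups, hidden_per_group=2)

rng = np.random.default_rng(0)
w1 = rng.normal(size=mask.shape)            # untrained weights, illustration only
b1 = np.zeros(mask.shape[1])
w2 = rng.normal(size=(mask.shape[1], 1))
b2 = np.zeros(1)

x = rng.normal(size=(3, 5))                 # three synthetic movies, five features
y_hat = predict(x, w1, b1, w2, b2, mask)
print(y_hat.shape)                          # (3, 1)
print(int(mask.sum()), "of", mask.size, "first-layer weights kept")
```

In this toy configuration half of the first-layer weights are removed, which illustrates why such pruning can reduce both cross-set encoding noise and inference cost. A real implementation would also have to keep the pruned weights at zero during training, for example by reapplying the mask after each update.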

Published
2022-11-23