Data Size Requirement for Forecasting Daily Crude Oil Price with Neural Networks


  • Serkan Aras
  • Manel Hamdi



Keywords - neural networks, forecasting, data size, structural break, crude oil price


A recurring issue in the literature on neural network applications is how much training data should be used when modelling a time series. This paper determines the size of the training data for a forecasting model by means of a multiple-breakpoint test and compares its performance with two common approaches: using all available data and using just two years of data. The importance of the selection of the final neural network model is also investigated in detail. The results obtained from daily crude oil prices indicate that using the data since the last structural change leads to simpler neural network architectures and has an advantage in forecast accuracy as measured by MAE. In addition, statistical tests show a statistically significant interaction between data size and stopping rule.
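The three training-set choices compared above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 252-trading-days-per-year figure, and the reading of "two years" as the most recent two years are all assumptions made for illustration, and the break positions are taken as given (for example, from a multiple-breakpoint test run elsewhere) rather than estimated here.

```python
from typing import Dict, List, Sequence

# Assumption: roughly 252 trading days per year; the source does not state this.
TRADING_DAYS_PER_YEAR = 252


def training_windows(series: Sequence[float],
                     break_indices: Sequence[int],
                     years: int = 2) -> Dict[str, List[float]]:
    """Build the three candidate training sets for a daily price series.

    break_indices holds positions where a new regime is assumed to start,
    as reported by a breakpoint test that is not implemented here.
    """
    data = list(series)
    recent = years * TRADING_DAYS_PER_YEAR
    # With no detected break, the "since last break" window is the full series.
    last_break = max(break_indices) if break_indices else 0
    return {
        "all_data": data,
        "recent_years": data[-recent:],
        "since_last_break": data[last_break:],
    }


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error between actual values and forecasts."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

Each window would then be used to train a separate network, and the resulting out-of-sample forecasts compared with `mae` on the same test period.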

JEL Codes - Q47; C45; C53






How to Cite

Aras, S., & Hamdi, M. (2019). Data Size Requirement for Forecasting Daily Crude Oil Price with Neural Networks. Scientific Annals of Economics and Business, 66(3), 363 –.