Are Historical Data Prior to 2000 Obsolete for Developing Trading Systems?

Some quants have claimed that data prior to 2000 is obsolete for the purpose of trading system development.

Is historical data prior to 2000 obsolete? As posed, this question is too broad. After all, data is data, and in principle more data should be better when developing trading systems, because the systems are exposed to a wider spectrum of market conditions. However, not everyone agrees with this idea, and different traders have different objectives. It could be the case that for some systems, price series dynamics have changed to the extent that using old data makes proper development impossible. Typically, those who argue that data from before 2000 is no longer relevant list the following reasons:

  • Decimalization of stocks in April of 2001
  • Regulation NMS in 2007
  • Changes in uptick rules
  • Emergence of HFT
  • Quantitative easing after the global financial crisis (GFC)

The above may be valid reasons for not using data before 2000 or 2009 for certain classes of trading strategies. It is quite possible that many intraday strategies, including scalping, pairs trading, and short-term momentum, as well as some other arbitrage strategies on both intraday and daily data, have been affected by the changes listed above to the extent that data prior to 2000 or 2009 are obsolete. However, the spectrum of trading strategies is quite broad. Those who argue that changes in price data after 2009 have rendered older data obsolete may be right, but only within their own domain of application.

This adds another important dimension to the already difficult task of trading system development: developers must analyze their systems for susceptibility to changing market conditions and make sure that the proper price series are used in backtests. This problem also makes results from machine learning programs more dubious, because all systems are tested on the same in-sample and out-of-sample data regardless of whether or not that is appropriate. Thus, in addition to a large data-mining bias, there may be bias due to changes in price series dynamics and regime changes.
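One simple way to start checking a system for this kind of susceptibility is to compare its backtest returns before and after a suspected break date. The following is only a minimal sketch in Python, not a definitive test: the function names `regime_split` and `summarize`, the cutoff date, and the yearly returns are all hypothetical and made up for illustration.

```python
from datetime import date
from statistics import mean, stdev


def regime_split(dates, returns, cutoff):
    """Split per-period returns into samples before and from the cutoff date."""
    before = [r for d, r in zip(dates, returns) if d < cutoff]
    after = [r for d, r in zip(dates, returns) if d >= cutoff]
    return before, after


def summarize(sample):
    """Return count, mean, standard deviation, and mean/std ratio of a sample."""
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    m, s = mean(sample), stdev(sample)
    ratio = m / s if s > 0 else float("nan")
    return {"n": len(sample), "mean": m, "std": s, "ratio": ratio}


# Made-up yearly returns of a hypothetical system (assumption, not real results).
dates = [date(year, 12, 31) for year in range(1994, 2006)]
returns = [0.12, 0.09, 0.15, 0.11, 0.10, 0.13,
           0.02, -0.01, 0.03, 0.00, 0.01, -0.02]

# Hypothetical break date; in practice it depends on the strategy class.
before, after = regime_split(dates, returns, cutoff=date(2000, 1, 1))
pre, post = summarize(before), summarize(after)

print("before cutoff:", pre)
print("after cutoff: ", post)
```

A large gap between the two summaries does not prove that the older data are obsolete, since it may also reflect curve-fitting or plain chance, but it does flag a system whose backtest period deserves a closer look.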

© 2015 Michael Harris. All Rights Reserved. We grant a revocable permission to create a hyperlink to this blog subject to certain terms and conditions. Any unauthorized copy, reproduction, distribution, publication, display, modification, or transmission of any part of this blog is strictly prohibited without prior written permission.