Inferences From Backtest Results Are False Until Proven True

The validity of backtest results should never be assumed, no matter how credible their source appears to be. Many factors can contribute to incorrect backtest results, especially in the case of more complex strategies such as relative strength ETF rotation and asset allocation.

Impressive backtest results are frequently reported in academic papers, trading magazines and financial blogs. However, there is a maxim that everyone should always respect:

Any inferences made from backtest results are false until proven true

In my new book I go one step further to argue that the primary task of quantitative traders should be to prove specific backtest results worthless, rather than proving them useful. Thus, I have introduced the following principle:

There are no edges in backtest results, except in a very small set of them

In past articles I have discussed several times the main factors that impact the validity of backtest results. They include:

  1. Data-mining and data snooping bias
  2. Use of non-tradable instruments
  3. Unrealistic accounting of frictional effects
  4. Use of the market close to enter positions instead of the more realistic open
  5. Use of dubious risk and money management methods
  6. Lack of effect on actual prices

I will not elaborate on the above in this article, but some details can be found in earlier articles on this blog.

I would only like to offer a reminder about (2) and (6), two factors that affect the reliability of long backtests based on non-tradable indexes. Such backtests are usually presented in an effort to convince the audience of the potential of momentum strategies based on relative or absolute strength. Note that such strategies became popular only in the last 20 years and very popular in the last 10 years. Therefore, it is only recently that they have had an impact on price action so that backtests can be reliable. This is something that the promoters of those strategies seem either not to understand or even to try to obscure: when the trade gets crowded, any underlying edge is arbitraged out. This leads to another important empirical principle:

Widely used strategies lose any edge they might have had in the past

As a result of this empirical principle, long-term backtests of popular strategies based on non-tradable indexes may not reflect true potential. They may impress someone unfamiliar with the markets and quantitative trading, but they amuse those who understand them.

Next, I offer an example of the impact of (4), i.e., the use of the market close to enter positions instead of the open. Although for some simple strategies it is possible to use projections of indicator/score values to establish positions at the close in advance of the signal, in general this is not possible or practical. See this article about moving average projections for more details.
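To make the distinction concrete, here is a minimal sketch in Python of how the same signal leads to different fills depending on the entry assumption. The bar values, the function names and the two mode labels are hypothetical and only serve as an illustration; this is not the code used for the results in this article.

```python
# Hypothetical monthly bars; the prices are made up for illustration only.
bars = [
    {"open": 100.0, "close": 102.0},  # bar 0: signal is computed from this close
    {"open": 103.0, "close": 105.0},  # bar 1
    {"open": 105.5, "close": 104.0},  # bar 2: position is exited at this close
]

def entry_price(bars, signal_index, mode):
    """Fill price for a signal generated on bar `signal_index`.

    mode "close":     fill at the close of the signal bar itself, which
                      assumes the signal was known before that close.
    mode "next_open": fill at the open of the following bar.
    """
    if mode == "close":
        return bars[signal_index]["close"]
    if mode == "next_open":
        return bars[signal_index + 1]["open"]
    raise ValueError("mode must be 'close' or 'next_open'")

def trade_return(bars, signal_index, exit_index, mode):
    """Simple return from the entry fill to the close of the exit bar."""
    entry = entry_price(bars, signal_index, mode)
    return bars[exit_index]["close"] / entry - 1.0

ret_close = trade_return(bars, 0, 2, "close")
ret_open = trade_return(bars, 0, 2, "next_open")
print(f"entry at close: {ret_close:.4%}, entry at next open: {ret_open:.4%}")
```

In this made-up case the gap between the signal close and the next open reduces the return of the trade. With real data the difference can go either way on any single trade, so only a full backtest shows the net effect.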

The example is about an ETF rotation strategy with the following details:

Monthly Data: 01/2007 – 10/31/2015
ETFs: SPY, EEM, TLT, GLD, DBC
Score: 3-month ROC
Two open positions only
50% allocation
Long-only
No rebalancing
No commissions
$100K starting capital
Adjusted data from Yahoo Finance
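For readers unfamiliar with this type of system, the scoring and selection step alone can be sketched as follows. This is not a backtest and not the code behind the results in this article; the monthly closes are made up, the function names are hypothetical, and the sketch assumes that the two highest-scoring ETFs are held regardless of the sign of their score, which is an assumption on my part.

```python
def roc(closes, lookback=3):
    """Rate of change in percent of the latest close over `lookback` bars."""
    if len(closes) <= lookback:
        raise ValueError("not enough data for the requested lookback")
    return (closes[-1] / closes[-1 - lookback] - 1.0) * 100.0

def select_top(monthly_closes, lookback=3, n_positions=2):
    """Rank symbols by ROC and give equal weight to the top `n_positions`.

    Assumption: the top-ranked symbols are selected even when their
    score is negative (no absolute momentum filter).
    """
    scores = {sym: roc(c, lookback) for sym, c in monthly_closes.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    weight = 1.0 / n_positions
    return {sym: weight for sym in ranked[:n_positions]}, scores

# Made-up monthly closes, oldest first; not actual market data.
monthly_closes = {
    "SPY": [100.0, 101.0, 103.0, 106.0],
    "EEM": [40.0, 39.0, 38.0, 37.0],
    "TLT": [90.0, 92.0, 93.0, 94.5],
    "GLD": [120.0, 118.0, 119.0, 121.2],
    "DBC": [25.0, 25.5, 25.0, 24.5],
}

weights, scores = select_top(monthly_closes)
print(weights)
```

With these made-up numbers the two selected symbols are SPY and TLT at 50% each. How and when those positions are actually filled is a separate assumption that this sketch leaves out.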

(Edit: results for open updated on 11/23/2015 to correct for coding error) Below are Amibroker backtest results for the open and close for position entry. The table also includes results from a widely used online backtester, Portfolio Visualizer:

Parameter   Trade on Close   Trade on Open   Portfolio Visualizer
CAR         11.82%           9.65%           14.52%
Max. DD     -19.08%          -19.70%         -20.45%
Sharpe      0.60             0.51            0.91

The wide variations, especially in annualized return (CAR), are evident. I have no idea about the trade entry point used in Portfolio Visualizer but I suspect it is the close.
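When comparing numbers across tools, it also helps to know how each metric is computed. The sketch below shows one common way to calculate the three metrics in the table from a monthly equity curve. The formulas are textbook versions under my own assumptions (monthly data, zero risk-free rate, sample standard deviation); Amibroker and Portfolio Visualizer may define them differently, the Sharpe ratio in particular. The equity values are made up.

```python
import math

def car(equity, periods_per_year=12):
    """Compound annual return from an equity curve sampled at equal intervals."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0

def max_drawdown(equity):
    """Largest peak-to-valley decline, returned as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst

def sharpe(equity, periods_per_year=12, risk_free=0.0):
    """Annualized Sharpe ratio from periodic returns.

    Assumptions: `risk_free` is an annual rate split evenly across
    periods, and the sample standard deviation (n - 1) is used.
    """
    rets = [equity[i] / equity[i - 1] - 1.0 for i in range(1, len(equity))]
    mean = sum(rets) / len(rets)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1)
    excess = mean - risk_free / periods_per_year
    return excess / math.sqrt(var) * math.sqrt(periods_per_year)

# Made-up month-end equity values, not results from any backtest.
equity = [100000.0, 110000.0, 99000.0, 121000.0]

print(f"CAR {car(equity):.2%}, Max. DD {max_drawdown(equity):.2%}, "
      f"Sharpe {sharpe(equity):.2f}")
```

Even with identical trades, a different risk-free rate or annualization convention changes the reported Sharpe ratio, so differences in metric definitions should be ruled out before attributing a gap between backtesters to the strategy logic.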

The challenge

This is a little challenge for quantitative traders who would like to participate:

Backtest the above ETF rotation system and report your results on Twitter. If you wish, you may add @mikeharrisNY to your tweet so that I will get a notification of your post. Note that one of the reasons I do not provide code to backtest the strategy is that I prefer that others do their own homework free of any bias from someone else’s code.

Regardless of the results of this challenge, the point is that backtest results can vary widely depending on even practical assumptions, such as the entry point.

I summarize below the three principles mentioned above:

Any inferences made from backtest results are false until proven true

There are no edges in backtest results, except in a very small set of them

Widely used strategies lose any edge they might have had in the past


Charting program: Amibroker


© 2015 Michael Harris. All Rights Reserved. We grant a revocable permission to create a hyperlink to this blog subject to certain terms and conditions. Any unauthorized copy, reproduction, distribution, publication, display, modification, or transmission of any part of this blog is strictly prohibited without prior written permission. 
