Can curve-fitting be avoided, and if so, how? Here are some thoughts on this very important issue in trading algo development.
If a strategy results from a process of testing several independent hypotheses, then curve-fitting is no longer the main issue; data-snooping bias is. If one tests enough independent hypotheses, the probability that at least one of them produces a significant p-value purely by chance is high, if not close to 1.
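To put a number on this, here is a minimal sketch. It assumes the tests are truly independent and that each one uses the same significance level; the 5% level and the function name are my choices for illustration only.

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of `num_tests` independent tests of worthless
    hypotheses comes out significant at level `alpha` by luck alone."""
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(f"{m:>3} hypotheses -> {prob_at_least_one_false_positive(m):.3f}")
```

With a 5% level, the probability already exceeds one half at 20 hypotheses and is above 99% at 100.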
Therefore, we cannot really say whether a trading algo with one parameter tested on 5 major assets is significant unless we know much more than that. Nor can we judge the validity of results for trading algos, momentum portfolios and asset allocation strategies published in books or papers, because it is hard to know how they were developed, i.e., whether any relentless data-mining was involved.
Curve-fitting is unavoidable in trading algo development. It can be shown that all trading algos are curve-fitted with respect to some arbitrary objective function. Only random systems are not curve-fitted. However, as the number of parameters in the algo increases, so does the probability that it will fail when market conditions change, because performance is usually fitted on past data. In this particular sense, curve-fitting is bad, but that relates more to the number of free parameters than to a deliberate effort to tune their values so that performance is maximized. Therefore, claims of no optimization are basically irrelevant in the case of trading algos with several parameters. The real issues are:
- How the algo was discovered: analysis (unique) vs. synthesis (data-mining)
- How changing market conditions will affect its performance
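The synthesis (data-mining) route in the first point can be illustrated with a small simulation. This is only a sketch under my own assumptions: every "strategy" is pure noise (zero-mean Gaussian daily returns), and the strategy count, sample length, volatility and seed are all hypothetical values.

```python
import random
from statistics import mean


def mine_random_strategies(num_strategies=200, num_days=250, seed=0):
    """Return (in-sample mean, out-of-sample mean) daily return for each of
    `num_strategies` strategies whose returns are pure zero-mean noise."""
    rng = random.Random(seed)
    results = []
    for _ in range(num_strategies):
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(num_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(num_days)]
        results.append((mean(in_sample), mean(out_sample)))
    return results


if __name__ == "__main__":
    results = mine_random_strategies()
    # "Discover" the strategy by picking the best in-sample performer.
    best_in, best_out = max(results, key=lambda r: r[0])
    print(f"best in-sample mean daily return:  {best_in:+.5f}")
    print(f"same strategy, out-of-sample mean: {best_out:+.5f}")
```

None of these strategies has an edge by construction, yet the selected one looks good on the data used to select it. Its out-of-sample figure is just another draw around zero.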
Note that when there are several parameters in a trading algo, or when even a few parameters are optimized, performance may be robust to small variations of their values. Therefore, robustness tests based on small variations of parameters cannot prove significance.
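A rough way to see why such robustness proves little is to optimize a parameter on data that has no edge at all. The sketch below assumes a Gaussian random walk as the price series and a simple hypothetical momentum rule (long if the price is above its value `lookback` bars ago, otherwise short). None of this comes from a real system.

```python
import random


def random_walk(num_steps=2000, seed=1):
    """Price series with no exploitable structure: cumulative Gaussian steps."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(num_steps):
        prices.append(prices[-1] + rng.gauss(0.0, 1.0))
    return prices


def momentum_pnl(prices, lookback, start):
    """Total P&L in price points from bar `start` onward: long when the price
    is above its value `lookback` bars ago, short otherwise."""
    pnl = 0.0
    for t in range(start, len(prices) - 1):
        position = 1 if prices[t] > prices[t - lookback] else -1
        pnl += position * (prices[t + 1] - prices[t])
    return pnl


if __name__ == "__main__":
    prices = random_walk()
    max_lookback = 100
    # Evaluate every lookback over the same bars so the results are comparable.
    pnl_by_lookback = {
        n: momentum_pnl(prices, n, start=max_lookback)
        for n in range(5, max_lookback + 1)
    }
    best = max(pnl_by_lookback, key=pnl_by_lookback.get)
    for n in range(best - 2, best + 3):
        if n in pnl_by_lookback:
            print(f"lookback {n:>3}: P&L {pnl_by_lookback[n]:+.1f}")
```

Because the signals for neighboring lookbacks overlap heavily, the P&L around the optimized value tends to change gradually. The result can therefore look stable under small variations even though the series is random.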
Finally, note that high significance is not equivalent to future profitability because of Type I errors (false positives). Minimizing the Type I error rate requires significance levels that can only be reached through optimization, which creates a vicious circle. This is also one reason why trading algo development is both a hard and an interesting endeavor. Both the analysis and synthesis of trading algos are presented in more detail in my book, Fooled By Technical Analysis.
Enjoy the holiday!
New books by Michael Harris: