Can curve-fitting be avoided and how? Here are some thoughts because this is a very important issue in trading algo development.
Alex Bernal (@InterestRateArb) suggested the following in an email earlier today:
By only using 1 variable, and testing on 5 major asset classes seems to me
to prevent the possibility of curve fitting.
The short answer is… Maybe. If this involves a test of a single unique hypothesis on 5 major assets, then the significance level is not affected and we can reject the null when the p-value is less than 0.05. This may also hold if a number of dependent hypotheses are tested on these 5 major assets. We assume that the assets were selected in advance and not post hoc, although I have seen that practice too, unfortunately.
However, if the algo resulted from a process of testing several independent hypotheses, then curve-fitting specifically is no longer the main issue but data-snooping bias is. Obviously, if one tests enough independent hypotheses, the probability that at least one of them yields a significant p-value purely by chance is high, if not close to 1.
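To put a rough number on this, here is a minimal sketch of the standard family-wise error calculation. It assumes the tests are fully independent and that every null hypothesis is true; the function name `prob_false_discovery` is made up for illustration and is not taken from any library or from the book.

```python
def prob_false_discovery(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests of a
    true null hypothesis shows a p-value below alpha."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    # Each test avoids a false positive with probability (1 - alpha);
    # under independence, all of them do so with (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for n in (1, 5, 20, 100):
        print(f"{n:>3} hypotheses tested: {prob_false_discovery(n):.3f}")
```

Under these assumptions, the chance of at least one spurious "significant" result at the 0.05 level is about 0.23 after 5 tests, about 0.64 after 20 and above 0.99 after 100. Real hypotheses in algo development are rarely fully independent, so treat this as an illustration of the trend rather than an exact figure.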
Therefore, we cannot really say whether a trading algo with one parameter tested on 5 major assets is significant unless we know much more than that. We also cannot judge the validity of results of trading algos, momentum portfolios and asset allocation strategies published in books or papers because it is hard to know how they were developed, i.e., whether there was any relentless data-mining.
Curve-fitting is unavoidable in trading algo development. It can be shown that all trading algos are curve-fitted in the sense of some arbitrary objective function. Only random systems are not curve-fitted. However, as the number of parameters in the algo increases, so does the probability that it will fail due to changing market conditions, because performance is usually fitted on past data. In this particular sense, curve-fitting is bad, but that relates more to the number of free parameters than to a deliberate effort to tune their values so that performance is maximized. Therefore, claims of no optimization are basically irrelevant in the case of trading algos with several parameters. The real issues are:
- How the algo was discovered: analysis (unique) vs. synthesis (data-mining)
- How changing market conditions will affect its performance
Note that when there are several parameters in a trading algo, or when even a few parameters are optimized, performance may be robust to small variations of their values. Therefore, robustness tests based on small variations of parameters cannot prove significance.
Finally, note that high significance is not equivalent to future profitability because of Type I errors. Minimization of Type I error requires significance levels that are only obtained via optimization, so you see the vicious circle here. And this is also a reason that trading algo development is both a hard and an interesting endeavor. Both the analysis and synthesis of trading algos are presented in more detail in my book, Fooled By Technical Analysis.
Enjoy the holiday!
Charting and backtesting program: Amibroker