Trading strategy development usually involves optimization and curve-fitting of some sort. There are different and even conflicting views about the meaning of optimization and curve-fitting and their impact on strategy performance. Some claim that optimization and curve-fitting are unavoidable and even necessary, while others insist that any trading strategy that is optimized or curve-fitted will eventually fail.
What is curve-fitting?
In mathematics, curve fitting is the process of finding the curve that best fits a collection of data points, in the sense that some objective function, possibly subject to constraints, is maximized (or minimized). For example, least squares is a curve-fitting method that minimizes the sum of squared residuals, where a residual is the difference between an actual and a fitted value. Note that a “best fit” is defined only relative to the chosen objective and that curve fitting is essentially the result of optimization.
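As a minimal sketch of the least-squares idea, the following fits a straight line to a handful of made-up data points using the closed-form solution and checks that nudging the fitted line can only increase the sum of squared residuals. The function names and the data are illustrative assumptions, not something taken from a particular library.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Objective function: sum of squared (actual - fitted) values."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data lying roughly on y = 1 + 2x (an assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)

# Any other line gives an objective value at least as large as the fitted one.
worse_intercept = sum_squared_residuals(xs, ys, a + 0.1, b)
worse_slope = sum_squared_residuals(xs, ys, a, b + 0.1)
print(a, b, best, worse_intercept, worse_slope)
```

The “best” line here is best only with respect to the sum of squared residuals; a different objective, such as the sum of absolute residuals, would in general select a different line.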
How does the notion of curve fitting apply to trading strategy development?
A trading strategy is a process that generates a collection of entry and exit signals. Usually, the algorithm or model of a trading strategy involves some parameters. The values of the parameters must be selected so that the strategy performs best during actual trading. It is common practice to set these parameters by back-testing the system on historical data so that some objective function is maximized (or minimized). For example, one can set the parameters so that the net profit is maximized or the maximum drawdown is minimized, just to mention two possibilities. There are more complex objective functions, for example the Sharpe, Sortino or MAR ratios. In addition, strategy developers can define composite objective functions that combine several performance metrics, usually in the form of linear weighted functions. In genetic algorithms and related machine learning methods these objectives are often called “fitness functions”.
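The objectives mentioned above can be sketched in a few lines. The example below assumes the back-test has already produced a list of per-trade profits and losses in points; the simplified Sharpe ratio (zero risk-free rate, no annualization), the weights of the composite objective and all function names are assumptions made for illustration.

```python
import statistics


def net_profit(pnl):
    """Sum of all per-trade profits and losses."""
    return sum(pnl)


def max_drawdown(pnl):
    """Largest peak-to-trough decline of cumulative P&L, as a positive number."""
    equity = peak = worst = 0.0
    for p in pnl:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


def sharpe_ratio(pnl):
    """Mean P&L divided by its sample standard deviation (simplified form)."""
    sd = statistics.stdev(pnl)
    return statistics.mean(pnl) / sd if sd > 0 else 0.0


def composite(pnl, w_profit=1.0, w_drawdown=1.0):
    """Hypothetical linear weighted objective: reward profit, penalize drawdown."""
    return w_profit * net_profit(pnl) - w_drawdown * max_drawdown(pnl)


# Made-up per-trade results in points.
pnl = [10.0, -4.0, 6.0, -12.0, 8.0, 5.0]
print(net_profit(pnl), max_drawdown(pnl), sharpe_ratio(pnl), composite(pnl))
```

Different objectives can rank the same parameter values differently, which is one reason the choice of objective function is itself part of the optimization.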
Curve-fitting and optimization
When one adopts the definition that trading strategies are processes that generate collections of entry and exit signals, one realizes what is essentially done when parameters are adjusted via back-testing: the timing of the signals is varied so that they fit the historical data in a way that optimizes some objective function. This is not curve-fitting in the usual sense, because one is not merely trying to find a curve that best fits the historical data but rather the collection of entry signals that, in conjunction with the exit signals, maximizes some objective. This process is much more involved than simple curve-fitting. It involves the selection, or timing, of entry and exit signals so that an objective function related to performance is optimized. This is an optimization problem rather than just a curve-fitting problem. As already mentioned, curve-fitting may involve optimization, but the latter is a process with a much broader scope and includes many more possibilities than the former. Therefore, it is better to refer to optimized systems than to curve-fitted systems, although this turns out to be more of a semantic issue for those who understand the process in depth.
For example, let us consider a simple moving average crossover system that generates long entry signals when SMA(t1) > SMA(t2), where t1 and t2 are the periods with t2 > t1, and short entry signals when SMA(t1) < SMA(t2). In its simplest form, this is a stop-and-reverse system, i.e., when an opposite signal is generated the previous position is closed and reversed. This system cannot be used in practice unless the values of t1 and t2 are selected. One can select those values by optimizing performance via back-testing on historical data. It is a widespread belief that this process results in systems that fail in actual trading because they are “curve fitted”. Is this belief true?
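The crossover system and the selection of t1 and t2 by back-testing can be sketched as follows. This is a simplified illustration under stated assumptions: prices are a synthetic random walk, positions are taken at the close of the signal bar, profit is measured in points with no costs, and the objective is net profit. All function names are hypothetical.

```python
import random


def sma(prices, n):
    """Simple moving average; None until n prices are available."""
    out = [None] * len(prices)
    window_sum = 0.0
    for i, p in enumerate(prices):
        window_sum += p
        if i >= n:
            window_sum -= prices[i - n]
        if i >= n - 1:
            out[i] = window_sum / n
    return out


def crossover_positions(prices, t1, t2):
    """Stop-and-reverse: +1 when SMA(t1) > SMA(t2), -1 when SMA(t1) < SMA(t2).

    The position is 0 until both averages exist and is carried over on ties.
    """
    fast, slow = sma(prices, t1), sma(prices, t2)
    pos, out = 0, []
    for f, s in zip(fast, slow):
        if f is not None and s is not None:
            if f > s:
                pos = 1
            elif f < s:
                pos = -1
        out.append(pos)
    return out


def backtest_net_profit(prices, positions):
    """Points earned: the position held at bar i earns the move to bar i + 1."""
    return sum(positions[i] * (prices[i + 1] - prices[i])
               for i in range(len(prices) - 1))


def optimize(prices, max_period=30):
    """Grid search over t1 < t2 for the pair with the highest net profit."""
    best = None
    for t1 in range(2, max_period):
        for t2 in range(t1 + 1, max_period + 1):
            profit = backtest_net_profit(
                prices, crossover_positions(prices, t1, t2))
            if best is None or profit > best[0]:
                best = (profit, t1, t2)
    return best


# Synthetic random-walk prices (an assumption; no real market data is used).
rng = random.Random(0)
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + rng.gauss(0.0, 1.0))

profit, t1, t2 = optimize(prices)
print(f"best in-sample pair: t1={t1}, t2={t2}, net profit={profit:.2f} points")
```

Note that the search always returns some “best” pair, even on a random walk where no pair has any edge; the result by itself says nothing about future performance.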
Actually, no one has ever proven mathematically that the failures of optimized strategies, which are well documented, are primarily due to the optimization, or what is commonly referred to as curve-fitting. It may be the case that the failures are merely due to the nature of the strategies and their inability to adapt to changing market conditions. As a matter of fact, this is more probable, since most indicators lag price. Thus, it is more probable that optimized trading strategies will fail at some point for any values of their parameters. It is the nature of the strategy and not the optimization that causes the failure. The large class of trading strategies based on technical analysis indicators has a high probability of failure, but in my experience this has been wrongly attributed to the optimization process for setting parameters. It does not even matter whether the parameters are set so that small changes in their values result in stable performance. This is not an issue of the integrity of the optimization method used but of the nature of these trading strategies.
In my paper “Limitations of Quantitative Claims About Trading Strategy Evaluation”, I give an example that shows how changing market conditions affect strategy performance and that the selection of parameters is irrelevant.
However, any optimization that causes selection of entry and exit collections is in general a problematic process because it may introduce survivorship bias. Selecting collections that performed best in the past overlooks the fact that many other similar collections failed.
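The effect of selecting the best past performer can be illustrated with a small simulation. The sketch below assumes a driftless random walk, on which no collection of signals has any edge, and a set of hypothetical stop-and-reverse signal collections generated at random; splitting the data into a first and a second half is an additional assumption made only to show what happens to the selected collection afterwards.

```python
import random


def random_walk_increments(n, rng):
    """Price changes of a driftless random walk: no strategy has an edge."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_signal_collection(n, rng, flip_prob=0.05):
    """Hypothetical stop-and-reverse collection: starts long or short at
    random and reverses with probability flip_prob on each bar."""
    pos = rng.choice([-1, 1])
    out = []
    for _ in range(n):
        if rng.random() < flip_prob:
            pos = -pos
        out.append(pos)
    return out


def profit(positions, increments):
    """Points earned by holding positions[i] over increments[i]."""
    return sum(p * d for p, d in zip(positions, increments))


rng = random.Random(42)
n_bars, n_candidates = 500, 200
half = n_bars // 2

increments = random_walk_increments(n_bars, rng)
candidates = [random_signal_collection(n_bars, rng)
              for _ in range(n_candidates)]

first_half = [profit(c[:half], increments[:half]) for c in candidates]
second_half = [profit(c[half:], increments[half:]) for c in candidates]

# Select the collection that performed best on the first half of the data.
best = max(range(n_candidates), key=lambda k: first_half[k])
average_first_half = sum(first_half) / n_candidates

print("best first-half profit:         ", round(first_half[best], 2))
print("same collection, second half:   ", round(second_half[best], 2))
print("average first-half profit (all):", round(average_first_half, 2))
```

The selected collection looks profitable on the data used to select it only because it is the maximum of many chance outcomes; the many similar collections that lost money are simply not reported.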
Going back to the simple moving average crossover strategy, it is easy to understand that given a specific historical data series, changing the values of t1 and t2 will cause a change in the timing of the entry and exit signals. In this case, selecting any collection of entry and exit signals that results from specific values of the parameters such that some objective function is maximized introduces bias. This is because it may be due to chance that the specific collection survived in specific market conditions. In the simple example, each collection is completely different from the others in the sense that both the entry and the exit points are different. What can we do to minimize the bias so that the integrity of the optimization process is not compromised? This question can be answered if we first understand how different types of strategies are affected by optimization of their parameters.
A three-level classification of optimized trading strategies
We can distinguish three types of strategies according to how optimization affects their collections of entry and exit points:
Type-I curve-fit: When the parameters of Type-I strategies are adjusted, both the entry and the exit signals are affected, as for example in the simple moving average crossover strategy considered before. In this case, optimization and curve fitting result in collections of entry and exit signals that differ, and selecting the one that performs best introduces selection bias. These systems have the highest probability of failure.
Type-II curve-fit: When the parameters of Type-II strategies are adjusted, only the entry signals are affected. In this case, optimization and curve fitting result in collections of entry and exit signals that differ only in their entry part. Selection introduces less bias than with Type-I systems. These strategies have a lower probability of failure than Type-I strategies. Example: Enter long if SMA(t1) > SMA(t2) and Price < P, and exit long at P1 or P2, where P1 and P2 are fixed prices (profit price and stop price).
Type-III curve-fit: When the parameters of Type-III strategies are adjusted, only the exit signals are affected. In this case, optimization and curve fitting result in collections of entry and exit signals that differ only in their exit part. Selection introduces less bias than in the case of Type-I or Type-II. These strategies have the lowest probability of failure because the timing of entry signals is not affected by optimization. Example: Enter long if Close of today > Close of 2 days ago, and exit long at entry price + x points or at entry price – y points, where x and y are the parameters to optimize (profit target and stop-loss).
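The Type-III example can be sketched to show that changing x and y moves the exits but leaves the entries where they are. The sketch makes several simplifying assumptions: each entry signal is treated as a separate trade, targets and stops are checked on closing prices only, an open trade is closed on the last bar, and the closing prices are made up. The function names are hypothetical.

```python
def entry_bars(closes):
    """Bars where Close of today > Close of 2 days ago. No parameters involved.

    The last bar is skipped so that every entry has a later bar to exit on.
    """
    return [i for i in range(2, len(closes) - 1) if closes[i] > closes[i - 2]]


def exit_bar(closes, entry, x, y):
    """First later bar whose close reaches entry price + x (profit target)
    or entry price - y (stop-loss); falls back to the last bar."""
    entry_price = closes[entry]
    for j in range(entry + 1, len(closes)):
        if closes[j] >= entry_price + x or closes[j] <= entry_price - y:
            return j
    return len(closes) - 1


def trades(closes, x, y):
    """List of (entry bar, exit bar) pairs for given exit parameters."""
    return [(i, exit_bar(closes, i, x, y)) for i in entry_bars(closes)]


# Made-up closing prices.
closes = [100, 101, 103, 102, 104, 107, 105, 103, 106, 108]

tight = trades(closes, x=2, y=2)
wide = trades(closes, x=5, y=5)

# The entry part of the collection is identical; only the exit part differs.
print([entry for entry, _ in tight] == [entry for entry, _ in wide])
print(tight)
print(wide)
```

In a Type-I strategy such as the moving average crossover, by contrast, changing the parameters changes both lists of bars.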
In general, strategies that include indicators involve Type-I curve-fit. Type-II curve-fit is rarely present in practice. Type-III curve-fit includes the broad class of strategies based on parameter-less price patterns.
Most software programs that automatically discover trading strategies generate Type-I strategies. It is irrelevant how many statistical tests they perform to measure the significance of the results, as these strategies have a high probability of failure during actual trading because of their nature and changing market conditions. Note that not all Type-II strategies make sense. For example, trying to discover such strategies without a guiding model is an exercise in futility, since there are billions of combinations of price action features that can result in this type of strategy and selection bias is extremely high. However, it appears that in short time frames these strategies can be effective if designed properly.
The important issue is not whether a strategy is optimized, because all strategies are in one way or another, but to what degree optimization impacts the probability of failure that stems from the nature of the strategy and changing market conditions. Strategies can fail for all sorts of other reasons, but in this article we dealt only with optimization and curve-fitting. Type-III curve-fit systems, as defined above, appear to have the lowest probability of failure if properly designed.