High accuracy predictions result in an acceptable profit factor over a trade sample if, and only if, the payoff ratio is sufficiently high. Machine learning applications that focus on high prediction accuracy without constraining the payoff ratio are in many cases unprofitable due to a low profit factor.
I first presented the Profitability Rule in my book Stock Trading Techniques Based on Price Patterns (p. 52, Traders Press, 2000) and in articles published in trading magazines. A detailed derivation of the rule can be found in my book Profitability and Systematic Trading (p. 46, Wiley, 2008), which can be downloaded for free from the Traders Education section of this blog. The rule is as follows:
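For reference, the relation can be recovered from the usual definitions. This is a sketch that assumes Pf is gross profit divided by gross loss and RWL is the average winning trade divided by the average losing trade; the symbols N (number of trades), W (average win) and L (average loss) are introduced here only for illustration:

```latex
P_f \;=\; \frac{N \, P \, \overline{W}}{N \, (1 - P) \, \overline{L}}
    \;=\; \frac{P \, R_{WL}}{1 - P}
\qquad\Longrightarrow\qquad
P \;=\; \frac{P_f}{P_f + R_{WL}}
```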
The above expression is exact, meaning that for a given profit factor Pf, if P is known, RWL is also known, and vice versa, as follows:
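Assuming the usual definitions (Pf as gross profit over gross loss, RWL as average win over average loss), the identity Pf = P·RWL / (1 − P) can be solved for the payoff ratio:

```latex
R_{WL} \;=\; \frac{P_f \, (1 - P)}{P}
```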
The above equation also implies that if RWL is not constrained or specified, the win fraction may be high while the profit factor is less than 1. For example, if RWL is 0.2 and Pf is 0.9, then the win fraction P is about 0.82, or 82%, but the strategy is unprofitable. In other words, a high win rate without a sufficiently large payoff ratio is not sufficient for profitability.
Focusing on high accuracy without support from a high payoff ratio is a typical mistake of novice developers applying machine learning libraries in Python or R to trading strategy development. A typical example can be found in this article.
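The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch that assumes the relation P = Pf / (Pf + RWL); the function names are hypothetical and chosen only for illustration.

```python
def win_fraction(profit_factor: float, payoff_ratio: float) -> float:
    """Win fraction P implied by profit factor Pf and payoff ratio RWL.

    Assumes P = Pf / (Pf + RWL).
    """
    if profit_factor < 0 or payoff_ratio <= 0:
        raise ValueError("profit_factor must be >= 0 and payoff_ratio > 0")
    return profit_factor / (profit_factor + payoff_ratio)


def required_payoff_ratio(profit_factor: float, p: float) -> float:
    """Payoff ratio RWL implied by profit factor Pf and win fraction P."""
    if not 0 < p < 1:
        raise ValueError("win fraction must be strictly between 0 and 1")
    return profit_factor * (1.0 - p) / p


# Values from the text: RWL = 0.2, Pf = 0.9
p = win_fraction(0.9, 0.2)
print(f"win fraction: {p:.4f}")  # win fraction: 0.8182

# Round trip: the same P and Pf give back the payoff ratio
print(f"payoff ratio: {required_payoff_ratio(0.9, p):.2f}")  # payoff ratio: 0.20
```

A win rate near 82% looks impressive in isolation, yet the profit factor of 0.9 means the strategy loses money.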
Developing trading strategies is a process much more involved than just applying machine learning, especially when the features are extracted from the data.
We use DLPAL S software in this example to develop a strategy for trading the SPY ETF, with in-sample daily data from 01/03/2000 to 12/31/2010. For a low payoff ratio we use a profit target of 1% and a stop-loss of 4%. These choices will result in high win fraction strategies with win rate greater than 90%, since the approximate payoff ratio is 0.25. All other settings are shown on the workspace below.
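To see what a 1% target and 4% stop imply, the sketch below computes the approximate payoff ratio, the breakeven win rate, and the profit factor implied at a 90% win rate. It assumes every trade exits exactly at the target or the stop and ignores commissions and slippage; the function names are hypothetical and this is not how DLPAL S computes its statistics.

```python
def approx_payoff_ratio(target_pct: float, stop_pct: float) -> float:
    """Approximate payoff ratio RWL when all exits hit the target or the stop."""
    if target_pct <= 0 or stop_pct <= 0:
        raise ValueError("target and stop must be positive")
    return target_pct / stop_pct


def breakeven_win_rate(payoff_ratio: float) -> float:
    """Win rate at which the profit factor equals 1 (P = 1 / (1 + RWL))."""
    return 1.0 / (1.0 + payoff_ratio)


def implied_profit_factor(win_rate: float, payoff_ratio: float) -> float:
    """Profit factor implied by a win rate and payoff ratio: Pf = P*RWL/(1-P)."""
    if not 0 <= win_rate < 1:
        raise ValueError("win_rate must be in [0, 1)")
    return win_rate * payoff_ratio / (1.0 - win_rate)


rwl = approx_payoff_ratio(1.0, 4.0)
print(f"payoff ratio:       {rwl:.2f}")                               # 0.25
print(f"breakeven win rate: {breakeven_win_rate(rwl):.0%}")           # 80%
print(f"Pf at 90% win rate: {implied_profit_factor(0.90, rwl):.2f}")  # 2.25
```

Under these assumptions a strategy needs to win more than 80% of the time just to break even, which is why only very high win rates are of interest at this payoff ratio.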
The results are shown below.
Each line in the results is a strategy that satisfies the performance parameters specified on the search workspace. Index and Index Date are used internally. Trade on is the entry point, in this case the Open of the next bar. P is the success rate of the strategy, PF is the profit factor, Trades is the number of historical trades, CL is the maximum number of consecutive losers, Type is LONG for long strategies and SHORT for short strategies, Target is the profit target, Stop is the stop-loss, and C indicates whether the exits are specified in % or in points; in this case it is %. Last Date and First Date are the last and first dates in the data file.
DLPAL S identified 82 distinct strategies with profit factor greater than 1, specifically 49 long and 33 short, which satisfied the performance parameters defined on the workspace. Note that the minimum win rate is 90% for both long and short strategies.
DLPAL S generated AFL system code for all strategies in the results (combined with the OR Boolean operator), and the results for the out-of-sample period from 01/03/2011 to 03/22/2019 are shown below.
The result is positive but not acceptable because absolute and risk-adjusted performance are lower than the buy and hold performance in the out-of-sample period. MAR (CAGR/max. DD) for the system is 0.32 versus 0.64 for buy and hold. High accuracy does not translate directly into high performance.
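For readers who want to compute MAR themselves, here is a minimal sketch following the definition given above (CAGR divided by maximum drawdown). The equity values are hypothetical and are not the backtest results discussed in this article.

```python
from typing import Sequence


def cagr(equity: Sequence[float], years: float) -> float:
    """Compound annual growth rate from the first to the last equity value."""
    if years <= 0 or equity[0] <= 0:
        raise ValueError("years and starting equity must be positive")
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def mar_ratio(equity: Sequence[float], years: float) -> float:
    """MAR = CAGR / maximum drawdown."""
    drawdown = max_drawdown(equity)
    if drawdown == 0:
        raise ValueError("MAR is undefined when there is no drawdown")
    return cagr(equity, years) / drawdown


# Hypothetical equity curve spanning two years
equity = [100.0, 120.0, 90.0, 130.0, 121.0]
print(f"CAGR:   {cagr(equity, 2.0):.1%}")       # 10.0%
print(f"Max DD: {max_drawdown(equity):.1%}")    # 25.0%
print(f"MAR:    {mar_ratio(equity, 2.0):.2f}")  # 0.40
```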
Next, we lower the accuracy requirement to 67% for long and 80% for short strategies (we often require short strategies to have higher in-sample accuracy since they are more fragile). The profit target and stop-loss are both set to 3% for an approximate payoff ratio of 1. The parameters are set as shown below.
The results are shown next.
DLPAL S identified 7 distinct strategies with profit factor greater than 1.5, all long, which satisfied all performance parameters defined on the workspace. Note that the maximum win rate in this case is 71.06%.
The out-of-sample backtest is shown below.
In this case risk-adjusted performance is much higher; MAR for the system is 0.69 versus 0.64 for buy and hold.
Prediction accuracy must be constrained by a sufficiently high payoff ratio and also profit factor. This is almost never done in naive machine learning applications because they apply classification based on a set of features with unknown impact on payoff ratio and profit factor. The main problem is that most novice developers use available machine learning libraries where adding constraints is not a straightforward task. DLPAL S (and also DLPAL DQ and DLPAL LS) have algorithms that implement the constraints on key parameters and identify predictors that satisfy them.
It is highly unlikely that any “simple” application of machine learning will ever be profitable, for the reasons stated above among others. Furthermore, retraining repeatedly until a desired equity curve is obtained substantially increases data-mining bias due to data-snooping, and the results are in most cases random and over-fitted to the price series. This is one reason that most naive applications of machine learning reproduce the results of simple lagging indicators such as moving averages, which are unprofitable although the accuracy appears to be very high. Trading strategy development is much more involved than machine learning application, and many hedge funds will discover this too late in the game.
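As a rough illustration of what constraining accuracy by payoff ratio and profit factor can look like, the sketch below computes the three statistics from a list of per-trade returns and rejects candidates that miss any threshold. It is not the DLPAL algorithm; the names `TradeStats`, `trade_stats` and `passes_constraints`, the thresholds, and the trade data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TradeStats:
    win_rate: float
    payoff_ratio: float
    profit_factor: float


def trade_stats(returns: Sequence[float]) -> TradeStats:
    """Win rate, payoff ratio and profit factor from per-trade returns.

    Trades with a return of exactly zero are ignored.
    """
    wins = [r for r in returns if r > 0]
    losses = [-r for r in returns if r < 0]
    if not wins or not losses:
        raise ValueError("need at least one winning and one losing trade")
    return TradeStats(
        win_rate=len(wins) / (len(wins) + len(losses)),
        payoff_ratio=(sum(wins) / len(wins)) / (sum(losses) / len(losses)),
        profit_factor=sum(wins) / sum(losses),
    )


def passes_constraints(stats: TradeStats, min_win_rate: float,
                       min_payoff_ratio: float, min_profit_factor: float) -> bool:
    """Accept a candidate only if all three constraints hold together."""
    return (stats.win_rate >= min_win_rate
            and stats.payoff_ratio >= min_payoff_ratio
            and stats.profit_factor >= min_profit_factor)


# Hypothetical candidates: A is very accurate but pays little per win,
# B is less accurate with symmetric wins and losses.
candidate_a = [1.0] * 9 + [-10.0]       # 90% wins, payoff ratio 0.1
candidate_b = [3.0] * 7 + [-3.0] * 3    # 70% wins, payoff ratio 1.0

for name, trades in (("A", candidate_a), ("B", candidate_b)):
    s = trade_stats(trades)
    ok = passes_constraints(s, min_win_rate=0.67,
                            min_payoff_ratio=1.0, min_profit_factor=1.5)
    print(f"{name}: P={s.win_rate:.2f} RWL={s.payoff_ratio:.2f} "
          f"PF={s.profit_factor:.2f} accepted={ok}")
```

Candidate A has the higher accuracy but a profit factor of 0.9, so it is rejected, while candidate B passes all three constraints with a lower win rate.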
If you have any questions or comments, happy to connect on Twitter: @priceactionlab
Disclaimer: No part of the analysis in this blog constitutes a trade recommendation. The past performance of any trading system or methodology is not necessarily indicative of future results. Read the full disclaimer here.