The BlueTrend fund returned 45.78% in 2008, a year in which the S&P 500 index fell 37% on a total-return basis. However, in recent years the performance of the fund has deteriorated. I first present an analysis of the fund's performance and then discuss comments made by Leda Braga about trend-following and data-mining.
The BlueTrend fund, which invests based on systematic trend-following, justified its role as an alternative investment vehicle in 2008, when it delivered returns that more than hedged losses in the equity markets. However, performance was nearly flat in 2011 and 2012 and turned negative in 2013 at -11.4%. The deterioration can be attributed to the degraded performance of trend-following algorithms in choppy, trendless markets. As shown on the performance graph below, the fund experienced a drawdown on the order of 20% during 2013, and performance has been volatile since 2010. Net performance was 16.44% for 2014 and +8.52% for January 2015:
Bootstrap of monthly returns
A bootstrap of monthly returns generated a p-value of 0.003866 (500,000 samples) for a mean return of 1.0247% in the period Nov. 2005 to Jan. 2015. This low p-value means that if the null hypothesis is true and the returns are random, the chance of obtaining a mean return of 1.0247% or higher is about 1 in 260. Although the bootstrap test has certain shortcomings, the results indicate that there is underlying order in the generation of the returns that can be attributed to an intelligent system.
However, if the same bootstrap test is performed with monthly data from 01/2010 to 01/2015, the result is a p-value of 0.158 with a mean return of 0.49%. This result means that there is no evidence against the null hypothesis that the returns were random in that period (or that the returns were sampled from a distribution with zero mean). What do these results mean in the context of systematic trend-following algorithms? Has the system stopped working, or was the performance deterioration temporary and due to adverse market conditions?
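For readers who want to see how such a test works, here is a minimal sketch of a one-sided bootstrap test of the mean in Python. The monthly returns below are synthetic (drawn from a normal distribution with positive drift), not the fund's actual track record; the number of resamples is also kept small for speed.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_pvalue(returns, n_samples=20_000, rng=rng):
    """One-sided bootstrap test of H0: the true mean return is zero.

    Center the returns at zero to impose the null, resample with
    replacement, and count how often a resampled mean is at least
    as large as the observed mean.
    """
    returns = np.asarray(returns, dtype=float)
    observed_mean = returns.mean()
    centered = returns - observed_mean  # impose the null (zero mean)
    idx = rng.integers(0, len(centered), size=(n_samples, len(centered)))
    boot_means = centered[idx].mean(axis=1)
    return (boot_means >= observed_mean).mean()

# Illustrative data only: 111 months (~Nov 2005 to Jan 2015) of
# hypothetical returns with a built-in positive drift.
monthly = rng.normal(loc=0.01, scale=0.03, size=111)
p = bootstrap_pvalue(monthly)
print(f"mean = {monthly.mean():.4%}, bootstrap p-value = {p:.4f}")
```

Because the synthetic series has a genuine positive drift, the p-value comes out small; feeding in a zero-drift series instead would typically produce a large p-value, mirroring the 2010-2015 result discussed above.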
I have written several blog posts about the risks involved in trend-following. The fundamental problem of most trend-following systems is that they try to deliver a high payoff ratio but at a low win rate. The formula that describes this trade-off was discussed in another blog post. A specific example of what I call "the trend-follower's nightmare" was offered in a recent post. In a nutshell, it is hard to design trend-following systems with a high win rate because if one tries to increase the fraction of winning trades, the payoff ratio decreases. As a result, most trend-following systems have a win rate below 50%, and that causes high drawdowns during whipsaw periods. An approach that may mitigate these problems involves combining a trend-following system with a short-term trading system. I suspect that Systematica Investments' new BlueMatrix fund, which trades equities, may aim at accomplishing exactly that, i.e., compensating for negative trend-following performance when markets do not offer trends. However, note that depending on the trading algos or discretionary styles employed, the performance of short-term trading could deteriorate during sharp, low-volatility uptrends. I think this is what happened during 2014 to equity traders who used mean-reversion algorithms; the only method that worked was buying the dips along the trend. Therefore, I think the probability is high that trend-followers will face problems in the future similar to those faced in the 2010-2012 period of choppy markets.
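The win rate/payoff trade-off mentioned above can be made concrete with the standard break-even relation: if R is the payoff ratio (average win divided by average loss), expectancy is zero when w·R − (1 − w) = 0, i.e., when the win rate w equals 1/(1 + R). A short sketch, ignoring commissions and slippage:

```python
def breakeven_win_rate(payoff_ratio: float) -> float:
    """Win rate at which expectancy is zero (costs ignored):
    w * R - (1 - w) * 1 = 0  =>  w = 1 / (1 + R)."""
    return 1.0 / (1.0 + payoff_ratio)

# A trend-following system with a payoff ratio of 3 only needs to win
# 25% of the time to break even, but a mean-reversion system with a
# payoff ratio of 0.5 needs to win about 67% of the time.
for R in (0.5, 1.0, 2.0, 3.0):
    print(f"payoff ratio {R:.1f} -> break-even win rate "
          f"{breakeven_win_rate(R):.1%}")
```

This is why a low win rate is tolerable for trend-followers in trending markets, yet the same low win rate produces long losing streaks, and hence deep drawdowns, when markets whipsaw.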
Leda Braga has also said that "Systematic trading takes the emotion out of trading," but that does not apply when a trader decides to stop a system out of fear of losses. If a system is proven, it is important to have a procedure in place so that it is stopped only when a maximum cumulative loss is reached.
Leda Braga said in an interview the following:
“There’s a creative moment when you think of a hypothesis, maybe it’s that interest rate data drives currency rates. So we think about that first before mining the data. We don’t mine the data to come up with ideas.”
This statement goes to the heart of the problem: data-mining bias. However, I am still not sure that it resolves it. One might think that coming up with a hypothesis to test allows running classical statistical significance tests because there is no issue of the multiple comparisons that arise when one mines the data to come up with hypotheses. I agree, but this is true only if the same data are not used again and the same analyst does not use the knowledge obtained from a failed test to come up with another idea. Otherwise, there is no difference between an algorithm that mines data and a human mind that mines the same data, other than the speed at which this is done.
For example, let us assume that your team thinks of an idea today that when X and Y and Z happen, stocks make a top, providing an entry for a short position. The team tests this but finds that the results are not significant. The next day, the same team comes up with another idea: if X and Z and V happen, then stocks start another uptrend. If the same data are used to evaluate the new hypothesis, data-mining bias increases and must be compensated for in statistical significance tests by accounting for all the hypotheses that were tested and rejected. But there is another, more serious problem with the idea of single hypothesis testing: how can we know that a certain hypothesis was not generated by a data-mining process in the past, and that we are not simply part of a universal data-mining process?
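A small simulation illustrates why the rejected hypotheses must be counted. Below, 50 "strategies" that are pure noise are each tested for a positive mean return; for simplicity the tests are treated as independent, and a z-approximation replaces a full t-test. The best raw p-value looks impressive, but a family-wise calculation (and a Bonferroni adjustment) shows it is exactly what luck predicts.

```python
import math
import random

random.seed(7)

def norm_sf(z):
    """Survival function of the standard normal (1 - CDF)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def pvalue_of_noise_strategy(n=120):
    """P-value of a 'strategy' that is pure noise: n monthly returns
    with zero true mean, tested for mean > 0 via a z-approximation."""
    rets = [random.gauss(0.0, 0.03) for _ in range(n)]
    mean = sum(rets) / n
    sd = (sum((r - mean) ** 2 for r in rets) / (n - 1)) ** 0.5
    return norm_sf(mean / (sd / n ** 0.5))

k = 50  # hypotheses tried on the same data period
pvals = [pvalue_of_noise_strategy() for _ in range(k)]
best = min(pvals)
# Chance that at least one of k independent null tests shows p < 0.05:
family_wise = 1 - 0.95 ** k
print(f"best raw p-value among {k} noise strategies: {best:.4f}")
print(f"P(at least one p < 0.05 by luck alone) = {family_wise:.2%}")
print(f"Bonferroni-adjusted best p-value: {min(1.0, best * k):.4f}")
```

With 50 tries, the probability of at least one nominally "significant" result from noise alone exceeds 90%, which is exactly the trap a team falls into when it recycles the same data across successive failed ideas.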
One answer to the above question has to do with "uniqueness". One may avoid being fooled by randomness by making sure that the hypothesis to test is unique and could not have been obtained by a computerized data-mining process. This is the key: proving that is half the battle, and also half of any potential trading edge.
Disclosure: no relevant positions.
Charting program: Amibroker