Price Action Lab Blog

Why You Should Avoid Purchasing a Trading System

You should avoid purchasing a trading system because there is usually no way of knowing how it was developed. There are many “trading system outfits” that use data-mining to generate systems they claim are robust. However, depending on how this is done, robustness tests can be part of a data-mining process with high bias.

There are many websites that sell trading systems or their signals. More are appearing every month, disguised as quant outfits. Do you know why there is such a proliferation? You can also buy software that tests or even generates trading systems. Are these systems worth anything? In 99.99% of cases (add as many 9s as you want) these systems are worth nothing. Examples:

  • Day and swing trading systems based on indicators
  • Trend-following systems based on indicators and/or fundamentals
  • ETF and equity rotation schemes based on any method

If you do not know how the systems were generated exactly, then you may want to avoid considering them because they can be the result of p-hacking. This is what I mean by that:

Even if the developers claim that the systems were developed on in-sample data, then tested on out-of-sample data and even tested on forward data, the truth of the matter is that anyone can find a system that satisfies these requirements by letting an algo data-mine historical data until the desired result is generated. I have talked about this fact several times in this blog. The system that performed well is selected from a large, often vast, number of candidates, and although its performance appears significant on out-of-sample data, the low p-value has essentially been hacked. This is what is meant by p-hacking. The system may have a Sharpe ratio large enough that, when multiplied by the square root of the number of years in the out-of-sample period, it produces a high t-statistic, as in:

t-statistic = Sharpe ratio × √number of years

However, this statistical test only applies when a single, and often unique, hypothesis was involved. If multiple hypotheses were involved, then its value must be adjusted by a number that turns out to be large. As a result, significance is no longer present.
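As a rough illustration of why the adjustment matters, here is a minimal sketch that computes the t-statistic from the formula above and then applies a Bonferroni correction to the resulting p-value. The Bonferroni correction, the normal approximation, the function names, and the example numbers (a Sharpe ratio of 1.5 over 4 years) are all my assumptions for illustration; the post does not say which adjustment should be used.

```python
import math

def t_statistic(sharpe_annual, years):
    # t-statistic = Sharpe ratio x sqrt(number of years), as in the formula above
    return sharpe_annual * math.sqrt(years)

def one_sided_p_value(t):
    # Normal approximation to the t distribution (an assumption of this sketch)
    return 0.5 * math.erfc(t / math.sqrt(2))

def bonferroni_p_value(p, num_trials):
    # Bonferroni: multiply the single-test p-value by the number of systems tried
    return min(1.0, p * num_trials)

# Hypothetical numbers, not taken from any real system
sharpe, years = 1.5, 4
t = t_statistic(sharpe, years)
p_single = one_sided_p_value(t)

for num_trials in (1, 100, 10_000):
    p_adj = bonferroni_p_value(p_single, num_trials)
    print(f"systems tried: {num_trials:>6}  adjusted p-value: {p_adj:.4f}")
```

With a single hypothesis the result looks highly significant. Once the same result is treated as the best of 100 candidates, the adjusted p-value is already above the usual 0.05 threshold, and a buyer normally has no way of knowing how many candidates were actually tried.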

Even if the developers did not use one of those genetic programming applications that generate trading systems by the tons, you can never know how many parameters they optimized, directly and indirectly. Curve-fitting has many ugly aspects to it. There are different types of curve-fitted systems and different ways of arriving at them, as follows:

  • Direct parameter optimization
  • Indirect parameter optimization

In direct parameter optimization (DPO), the developer, or a genetic algo, directly adjusts the parameters of indicators so that some performance metric is optimized. This is the most dangerous mode of developing trading systems because it involves relentless optimization and the results are fitted to historical data by design. In indirect parameter optimization (IPO), the markets, specific securities, in-sample, out-of-sample and forward samples are selected and reselected until the desired result is obtained, so good out-of-sample performance may be due to chance. In most cases this is not done intentionally to defraud anyone; it is the outcome of ignorance of the impact of data-snooping and multiple selections on the significance of the final result. Although the perils of direct optimization are now understood by most quants, IPO remains a dark area.
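The effect of repeated selection can be sketched with a small simulation. Every candidate below is pure noise with zero expected return, yet the best of many candidates shows an impressive out-of-sample Sharpe ratio. The return model (independent Gaussian daily returns with 1% volatility), the two-year sample length, and the function names are assumptions of this sketch, not a description of how any vendor works.

```python
import math
import random
import statistics

def annualized_sharpe(daily_returns, periods_per_year=252):
    # Mean over standard deviation of daily returns, scaled to one year
    mean = statistics.fmean(daily_returns)
    sd = statistics.stdev(daily_returns)
    return mean / sd * math.sqrt(periods_per_year)

def best_out_of_sample_sharpe(num_candidates, num_days, rng):
    # Each candidate is a zero-edge "system": its daily returns are random noise.
    # We keep only the best Sharpe ratio, which is what selection does.
    best = -math.inf
    for _ in range(num_candidates):
        returns = [rng.gauss(0.0, 0.01) for _ in range(num_days)]
        best = max(best, annualized_sharpe(returns))
    return best

rng = random.Random(42)
one_try = best_out_of_sample_sharpe(1, 504, rng)      # a single honest test
many_tries = best_out_of_sample_sharpe(500, 504, rng)  # best of 500 attempts

print(f"Sharpe of one random system:       {one_try:.2f}")
print(f"Best Sharpe of 500 random systems: {many_tries:.2f}")
```

Changing the market, the security, or the sample boundaries until the numbers look good plays the same role as `num_candidates` here, even when no indicator parameter is ever optimized directly.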

Some tests you could do

Let us assume that you are offered a system to trade SPY and you are told that it was profitable in the out-of-sample. Unless you can do the following, maybe it is not a good idea to consider the system due to the issues discussed above:

(1) Test the system on the whole available data history. Do you like the performance of the system? Is the Sharpe ratio high enough? Do you see a higher maximum drawdown than the one reported?

(2) Check the number of parameters in the system. Are there more than two parameters? It is easy to fit any system that has many parameters so that it finally performs well in all samples, in and out.

(3) Consider one security that is highly correlated with SPY and test the system in that market. Does the performance deteriorate significantly or even become negative? If it does, this could mean that the system exploits specific dynamics of this market and is not based on a genuine hypothesis about trading the markets. Although there is nothing wrong in principle with exploiting market-specific dynamics, when those change the systems implode. I have an example below.
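Tests (1) and (3) come down to computing the same two metrics on different return series. Below is a minimal sketch, assuming you can obtain the system's per-period returns as a plain list of fractions (0.01 for +1%). The helper names and the tiny placeholder series are hypothetical and are not real SPY results.

```python
import math
import statistics

def annualized_sharpe(returns, periods_per_year=252):
    # Mean over standard deviation, scaled to one year; 0.0 if returns never vary
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.fmean(returns) / sd * math.sqrt(periods_per_year)

def max_drawdown(returns):
    # Largest peak-to-trough decline of the compounded equity curve, as a fraction
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def summarize(returns):
    return {
        "sharpe": annualized_sharpe(returns),
        "max_drawdown": max_drawdown(returns),
    }

# Placeholder daily returns of the same system on two markets (made-up numbers)
system_on_spy = [0.004, -0.002, 0.006, 0.001, -0.003, 0.005]
system_on_correlated_market = [0.001, -0.006, 0.002, -0.004, 0.003, -0.005]

for name, rets in [("SPY", system_on_spy),
                   ("correlated market", system_on_correlated_market)]:
    stats = summarize(rets)
    print(f"{name}: Sharpe {stats['sharpe']:.2f}, "
          f"max drawdown {stats['max_drawdown']:.2%}")
```

In practice the lists would hold the full return history of the system, and the figures of interest are how far the full-history drawdown exceeds the reported one and how much the Sharpe ratio drops on the correlated market.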

If you cannot do any tests

In this case, maybe it is not a good idea to buy a system, and you must do your own analysis and homework. This includes cases where the code of the system is not available or you have no capability of doing the tests. Note that on some websites that sell trading signals there are systems that show unique performance in real time. That was also the case with a forex system before the SNB removed the peg to the euro. The system showed exceptional performance in real trading but it exploited a market-specific property, i.e., the peg to the euro. Profits of several months or even years vanished in one hour, turning the most profitable system into the most unprofitable. One could have detected the dependency on the specific market property with a test on a similar forex pair.

Finally, note that the message of this post is not that all trading systems for sale are worthless but that you have limited ways of knowing how they were developed. For example, they may be artifacts of relentless data mining. Ignore any claims that the data-mining bias was measured and corrected for. These usually are the result of a gross misconception of what data-mining bias is. Data-mining bias cannot be measured because it refers to the quality of a very complex process rather than to a specific quantity.

© 2015 Michael Harris. All Rights Reserved. We grant a revocable permission to create a hyperlink to this blog subject to certain terms and conditions. Any unauthorized copy, reproduction, distribution, publication, display, modification, or transmission of any part of this blog is strictly prohibited without prior written permission. 


  1. ninjatrader

    Yeah!!! Very, very nice post. I am very glad to see such a detailed description of why I should avoid purchasing a trading system. I think it is a helpful post. Thanks a lot for this.

  2. ExpertAdvisor

    I totally agree. Like you, I posted an article on my blog telling people to avoid buying a trading system/EA.

    I prefer to tell them to work hard to find their own trading system instead.

  3. steve


    I pay a monthly subscription to ETFReplay which allows for backtesting of ETF portfolios using several parameters. Are these all optimized and bogus? Am I wasting my money?


    • Hi Steve,

      ETFreplay is a great website. I don't think they sell trading systems although I haven't visited the website recently. I think they offer tools for traders to do their own homework and in fact they seemed to me very generous about that in terms of free information they provide. This is good as long as the user of those tools knows their limitations and, in general, the limitation of backtesting. Backtesting can become a dangerous practice due to data-mining bias.

    • dan

      If you test an idea out and it seems robust and works across different markets, you may have something, assuming there is a good rationale for the relationships. But if you run ETFReplay many times with many tweaks, the odds that you have something real vs. data-mined change dramatically. I prefer to follow systems that have live date-forward performance and that aren't constantly tweaked. Backtesting doesn't tell us as much about the future as we'd hope. Depending on the system, near-term results are often more applicable than how it did years or decades back.

      • Dan, you made some excellent points. About the last one I think distant results are often applicable to daily timeframes when market conditions change.
