The Fundamental Problem of Backtesting and Why It Has Not Helped Technical Traders Much

Trying to discover an edge by randomly backtesting ideas is equivalent to looking for a needle in a haystack. More than 25 years have passed since backtesting software became available to retail traders and the difficulty in finding an edge persists because of various issues that involve, amongst other things, data-mining and process pitfalls.

The fundamental problem of backtesting for the purpose of finding an edge in the markets is that it introduces a dangerous form of data-mining bias caused by the reuse of data to test many different hypotheses. When one finally discovers what appears to be an edge that even validates on out-of-sample data, this could be the result of curve-fitting in the in-sample and accidental good performance in the out-of-sample. Things become a lot worse when many combinations of indicators, exit strategies and performance metrics are used, because the data-mining bias increases rapidly as a function of their number. Even with a few combinations of indicators and exit strategies, the probability of finding an algo that passes all validation tests in the in-sample and the out-of-sample after continuous use of backtesting is, for all practical purposes, one. Thus, system developers who use backtesting programs may think they are improving their chances of finding an edge by repeatedly trying new ideas on historical data, when in fact they are increasing the data-mining bias with each subsequent trial and their chances of true success are diminishing. After a few years of using such programs, the probability that all systems found are worthless is, for all practical purposes, one, unless there is a drastic paradigm shift in the way the user tries to solve the problem, founded on a solid understanding of the issues involved and how one can deal with them effectively.
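The arithmetic behind this claim can be sketched in a few lines. This is only an illustration under strong assumptions: every trial is treated as an independent test of a worthless system, and the 1% chance of passing both in-sample and out-of-sample validation by luck is a hypothetical number, not a measured one. In practice trials reuse the same data and are correlated, so the exact figures will differ, but the direction of the effect is the same.

```python
def prob_at_least_one_false_pass(p_single: float, n_trials: int) -> float:
    """Chance that at least one of n independent worthless systems
    passes validation purely by luck: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_single) ** n_trials


# Hypothetical assumption: a system with no edge passes all
# validation tests by chance 1% of the time.
P_SINGLE = 0.01

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, round(prob_at_least_one_false_pass(P_SINGLE, n), 4))
```

Under these assumptions the chance of at least one lucky pass is about 63% after 100 trials and above 99.99% after 1,000 trials, which is why a developer who keeps testing new ideas on the same data should expect to find something that looks like an edge.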

The process described above, which basically involves a human developer and backtesting software, has been automated in recent years using genetic programming and various types of evolutionary algorithms. Essentially, this amounts to software that combines indicators with exit signals according to evolutionary principles and does so repeatedly until some performance metrics are optimized. However, instead of taking years to increase the data-mining bias, these developments achieve that in a few minutes, and the end result is the same: repeated use of historical data with many combinations of indicators, exit strategies and performance metrics guarantees that in the end some system(s) will be found that pass all validation tests but could reflect spurious correlations that disappear when market conditions change. In other words, these systems do not possess any intelligence when generating entry and exit signals; they are just artifacts of curve-fitting on in-sample data that were lucky on the out-of-sample data. Developers of such products even claim that they can test trillions of combinations of systems until they find one that meets the search objectives, while underestimating or being ignorant of the impact of data-mining bias. Traders who use such products without understanding these realities of repeated backtesting with many degrees of freedom become the victims of their own ignorance. They have no chance of ever finding a true edge unless they get to the bottom of this, but that requires another edge in the form of an understanding of what must be done to avoid the pitfalls of such processes.
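A toy simulation makes the point concrete. It is a sketch under stated assumptions and does not reproduce how any commercial product works. The "market" is zero-mean random noise, so no system can have a real edge. Each "system" is a random long/short position per day, standing in for an arbitrary combination of indicators and exits. The pass rule (a t-statistic above 1.645 in each half of the data) is a hypothetical validation test chosen for illustration.

```python
import math
import random


def t_stat(xs):
    """One-sample t-statistic of the mean of xs against zero."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    return mean / math.sqrt(var / n)


def count_lucky_survivors(n_candidates=4000, n_days=250, threshold=1.645, seed=0):
    """Count random systems that pass both in-sample and out-of-sample
    tests on pure-noise returns, where no real edge can exist."""
    rng = random.Random(seed)
    # Zero-mean Gaussian daily returns: unpredictable by construction.
    returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    split = n_days // 2
    survivors = 0
    for _ in range(n_candidates):
        # A random +1/-1 position per day stands in for one candidate system.
        pnl = [(1 if rng.random() < 0.5 else -1) * r for r in returns]
        passes_in_sample = t_stat(pnl[:split]) > threshold
        passes_out_of_sample = t_stat(pnl[split:]) > threshold
        if passes_in_sample and passes_out_of_sample:
            survivors += 1
    return survivors


if __name__ == "__main__":
    print("systems passing both tests on pure noise:", count_lucky_survivors())
```

With roughly a 5% chance of passing each half by luck, about 0.25% of candidates are expected to pass both, so a search over thousands of candidates should turn up several "validated" systems even though the data contain nothing to find. Scaling the search to millions or trillions of combinations only increases the number of such survivors.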

In the mid-1990s, when I was investigating automated methods for searching for an edge, I was aware of some of these pitfalls and also of some of the problems in commercial backtesting programs of the time that probably ended up costing their users fortunes due to fallacious results, mainly caused by code limitations. Also, many users of such programs mistook algorithms that look forward in historical data for intelligent prediction systems. After exhaustive tests and many months of work, I decided that the system I would build to search for edges would deal only with pure price action, simple exit conditions and a few fundamental performance metrics. This is how Price Action Lab™ was born, with Occam’s razor as a guiding principle.


Price Action Lab™ (PAL) does not use genetic programming or neural networks like some other programs because these are mainly curve-fitting methods. PAL does not search for any predefined pattern formations and it does not perform permutations or combinations of rules and indicators. The algorithmic engine used for the pattern search, scan and P-Indicator functions, as shown in the above outline of the program, was designed carefully to minimize data-mining bias. More importantly, PAL produces the same output each time it encounters the same conditions, and this determinism is in compliance with the standards of scientific testing and analysis.

Price Action Lab™ (PAL) discovers price patterns in any market and timeframe of choice by analyzing market price action based on user-defined performance statistics and risk/reward parameters. These are price patterns that are present in the data and not some artificial mix of indicators. Also, these are not the kind of price patterns that some genetic algorithms find but a special class that minimizes curve-fitting.

PAL has three main functions: search for price patterns, scan for price patterns and P-Indicator calculations, as shown in the above chart. Several validation methods are available, including out-of-sample testing, robustness tests and portfolio backtests. Programming is not required to use any of the functions of PAL. In fact, PAL will generate code for system developers, who can use the patterns it discovers as building blocks in their trading systems. PAL can be used by both systematic and discretionary quant traders.

PAL has many other functions including tools for performance randomization and price series analysis. For a list of what one can do with the software you may read this article.

Use of PAL does not guarantee profitability. See the full disclaimer here.
