Fooled by Randomness Through Selection Bias

There are programs available to traders that mix and match indicators with exit conditions in order to design systems that fulfill desired performance criteria and risk/reward objectives. Because the algorithms of these programs are inherently random, their output varies each time they run. How can one distinguish between the random systems generated by such programs and systems that may possess some intelligence in timing their entries?

Suppose that you have such a program and you let it run to develop a system for SPY. After a number of iterations you get a relatively nice equity curve, both in-sample and out-of-sample, with a total of about 1000 trades (horizontal axis):

Before obtaining the above equity curve, the system generated a large number of equity curves that were not acceptable. You think this is a good equity curve but you also suspect this system may be random due to selection bias. You are absolutely correct. But before going into this, let us talk about the case where a trader thinks that by increasing the number of trades, randomness due to selection bias is minimized:

In this system the number of trades was increased by two orders of magnitude to about 100,000, and both the in-sample and out-of-sample performance look acceptable. Does this mean that the system is less likely to be random?

The answer is no. Both of the above equity curves were generated by tossing a coin with a payout equal to +1 for heads and -1 for tails. In fact, the second equity curve was generated after only a few simulations. Both curves are random and only appear acceptable. You can try the simulation yourself and see how successive random runs can at some point generate nice-looking equity curves.
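The coin toss experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the acceptance rule (`min_profit`, `max_dd`), the seed, and all function names are hypothetical choices made here, not the procedure used to produce the curves in this post.

```python
import random


def coin_toss_equity(n_trades, rng):
    """Cumulative P&L of n_trades fair coin tosses: +1 for heads, -1 for tails."""
    equity, curve = 0, []
    for _ in range(n_trades):
        equity += 1 if rng.random() < 0.5 else -1
        curve.append(equity)
    return curve


def max_drawdown(curve):
    """Largest peak-to-trough drop, treating the starting equity of 0 as the first peak."""
    peak, worst = 0, 0
    for value in curve:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def looks_acceptable(curve, min_profit, max_dd):
    # Hypothetical acceptance rule (an assumption, not from the article):
    # enough final profit and a bounded drawdown.
    return curve[-1] >= min_profit and max_drawdown(curve) <= max_dd


def search_for_nice_curve(n_trades=1000, min_profit=60, max_dd=30,
                          max_runs=5000, seed=1):
    """Repeat random runs until one passes the rule; return (run number, curve)."""
    rng = random.Random(seed)
    for run in range(1, max_runs + 1):
        curve = coin_toss_equity(n_trades, rng)
        if looks_acceptable(curve, min_profit, max_dd):
            return run, curve
    return None, None


if __name__ == "__main__":
    run, curve = search_for_nice_curve()
    if curve is None:
        print("No acceptable curve found within the run limit.")
    else:
        print(f"Run {run}: final equity {curve[-1]}, "
              f"max drawdown {max_drawdown(curve)}")
```

Every curve that this loop rejects is thrown away silently, which is exactly the selection step that makes the surviving curve look better than it is.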

So the logic is that random processes can generate nice-looking equity curves. But how can we know whether a nice-looking equity curve selected from a bunch of not-so-nice-looking curves is just the outcome of a random process? This inverse question is much more difficult to answer. Here are some criteria to use:

(1) When selection bias is present, is the underlying process that generates the equity curve deterministic, or does it involve randomness? If randomness is involved and the system with the best equity curve is structurally different each time the process runs, then the probability that it is a random system is extremely high.
(2) If you remove the exit logic of the system and you specify a small profit target and stop-loss just outside the range of ATR(14), does the system remain profitable? If not, then the probability that the system possesses no intelligence in timing entries is very high.
(3) Does the generated system involve parameters that were selected to get the final equity curve through the maximization of some objective(s)? If there are such parameters, then the number of alternatives exposed to selection bias grows exponentially with the number of parameters, making it extremely unlikely that the system possesses intelligence; it may just be fitted to the data.
(4) This is the most important part: Do you get only one run of the system that is repeatable and deterministic, or multiple runs that may or may not converge to acceptable performance? If many runs occur, this introduces data-mining bias. In this case, the in-sample data are used multiple times with many combinations of indicators and heuristic rules until an acceptable equity curve is obtained. Some of the systems that generate good equity curves in the in-sample may also generate good performance in the out-of-sample by chance alone. Thus, data-mining bias facilitates selection bias. Cross-validation on out-of-sample data may not be enough to rule out randomness because, as already pointed out, some systems may survive in the out-of-sample just by chance.
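The point in (4) about random systems surviving the out-of-sample by chance can be illustrated with the same coin toss model. The sketch below is a simplification under stated assumptions: each "system" is just a sequence of independent fair coin tosses, and the trade counts, thresholds, seed, and function names are hypothetical values chosen for illustration.

```python
import random


def random_system_pnl(n_trades, rng):
    """Total P&L of a system whose trades are fair coin tosses (+1 or -1)."""
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(n_trades))


def mine_random_systems(n_systems=2000, n_in=700, n_out=300,
                        in_threshold=40, out_threshold=15, seed=7):
    """Count random systems that pass in-sample, and those that also pass out-of-sample."""
    rng = random.Random(seed)
    passed_in = passed_both = 0
    for _ in range(n_systems):
        in_pnl = random_system_pnl(n_in, rng)
        out_pnl = random_system_pnl(n_out, rng)
        if in_pnl >= in_threshold:
            passed_in += 1
            if out_pnl >= out_threshold:
                passed_both += 1
    return passed_in, passed_both


def prob_at_least_one(p_pass, n_systems):
    """Chance that at least one of n independent random systems passes,
    given that each passes with probability p_pass."""
    return 1.0 - (1.0 - p_pass) ** n_systems


if __name__ == "__main__":
    passed_in, passed_both = mine_random_systems()
    print(f"Passed in-sample: {passed_in}")
    print(f"Also passed out-of-sample: {passed_both}")
```

Because the in-sample and out-of-sample tosses are independent, passing the in-sample filter says nothing about the out-of-sample result. The more systems are generated, the closer `prob_at_least_one` gets to 1, even when each individual system has only a small chance of passing both tests.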

The coin toss experiment provides an indication that when one comes across a process that generates many system alternatives with many equity curves, some acceptable and some unacceptable, one may get fooled by randomness. Minimizing data-mining and selection bias is a very involved process that is, for the most part, outside the capabilities of the average user of such processes.
