After an amazing blog post by Howard Lindzon last night mentioning probability investing, I decided to write a few things about it. The topic is vast and I will have to come back to it. All quant traders should be aware of these issues with probability.

Discretionary trading or investing based on probabilities is fraught with many difficulties. Most of the difficulties arise from the very nature of probability.

Let us start with the wrong claim that one often sees in financial blogs:

**Claim:** If the probability of success is high and I can make more than I can lose, then the opportunity I am considering has high expectation.

**The claim is wrong**. Some of those who make it did not pay enough attention to their probability and statistics teacher. Let us assume that the win ratio is w > 0.50. Actually, let us assume that it is 0.7, or that the trader is right 70% of the time. Also, let us assume that the profit potential is W and the loss potential is L, with W >> L. Actually, let us assume that W = 3 x L. Usually, this is how the claim is expressed mathematically:

Expectation = w × W - (1 - w) × L = 0.7 × (3L) - (1 - 0.7) × L = 1.8L   (1)

Therefore, the claim is that the expectation from the next investment or trade is 1.8 times the potential loss. This is not only wrong, **it is flat wrong**.

For those that paid attention to their professors, the expectation is also the **mean** of a probability distribution, usually denoted as E[X] or μ. In this case, it is the mean of the population of the values a random variable X takes with some probability p. Now this is the most important part:

**Averages converge to the expectation only in the presence of sufficient samples**

What does the above statement mean? Let us first look at the theoretical expectation that is given as the sum of all possible values the random variable X can take times their associated probability:

E[X] = ∑ᵢ xᵢ pᵢ   (2)

Do equations (1) and (2) have anything in common? No, because the win rate and the average win and loss in equation (1) are estimates taken from a sample of trades, but equation (1) will converge to equation (2) in the presence of sufficient samples. In statistics, they say that the **sample mean targets the population mean**. But this is true only for sufficient samples.

So what is a sufficient sample? In finance, nobody knows the answer but it is a large sample, possibly consisting of hundreds or even thousands of trades.
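To get a feel for how slowly the average trade settles near the expectation, here is a minimal simulation sketch. It assumes independent trades that win 3L with probability 0.7 and lose L otherwise, which is an idealization; the function names `simulate_trades` and `sample_mean_by_size` and the seed are illustrative choices, not something from the original analysis.

```python
import random
import statistics

def simulate_trades(n, win_rate=0.7, win=3.0, loss=1.0, rng=None):
    """Return n simulated trade results in units of L (assumes independent trades)."""
    if rng is None:
        rng = random.Random()
    return [win if rng.random() < win_rate else -loss for _ in range(n)]

def sample_mean_by_size(sizes, seed=42):
    """Average trade for each sample size, drawn from one seeded random stream."""
    rng = random.Random(seed)
    return {n: statistics.fmean(simulate_trades(n, rng=rng)) for n in sizes}

# Small samples can land far from 1.8L; large samples settle close to it.
for n, mean in sample_mean_by_size([5, 20, 100, 1000, 100000]).items():
    print(f"n={n:>6}: average trade = {mean:+.3f} L (expectation = +1.800 L)")
```

Changing the seed changes the small-sample averages noticeably while the large-sample average barely moves, which is the point of the bold statement above.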

What this boils down to is that equation (1) is not a measure of the expectation of the next trade, not even close. It only says that over the longer term, an unspecified time interval, an investor or trader with win rate w = 0.7, average profit W and average loss L, where W = 3L, will realize an average trade equal to 1.8L. The total profit will be equal to the sample size times the expectation (less any friction).
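As a rough illustration of the total-profit statement, the sketch below computes the expected total over n trades together with its standard deviation. It assumes independent trades of fixed size and a constant per-trade friction, all in units of L; `trade_stats`, `total_profit_band` and the `friction` parameter are hypothetical names used only for this example.

```python
import math

def trade_stats(win_rate, win, loss):
    """Mean and standard deviation of a single trade (two-outcome model)."""
    mean = win_rate * win - (1 - win_rate) * loss
    second_moment = win_rate * win**2 + (1 - win_rate) * loss**2
    return mean, math.sqrt(second_moment - mean**2)

def total_profit_band(n, win_rate=0.7, win=3.0, loss=1.0, friction=0.0):
    """Expected total profit over n trades and its standard deviation.

    Assumes independent trades, so the variance of the total is n times
    the variance of one trade.
    """
    mean, sd = trade_stats(win_rate, win, loss)
    return n * (mean - friction), math.sqrt(n) * sd

for n in (10, 100, 1000):
    expected, sd = total_profit_band(n)
    print(f"n={n:>4}: expected total = {expected:.1f} L, std dev = {sd:.1f} L")
```

The standard deviation of the total grows with the square root of n while the expected total grows with n, so the relative uncertainty shrinks only slowly as trades accumulate.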

Here is the danger discretionary but also systematic traders face: consecutive losers and associated clusters.

Let us go back to the original claim. The win rate of the investor is 70%. This is like a biased coin. However, the next time the coin is tossed, tails can show up instead of heads. Thus, the trade can be a loss. This can happen again on the next toss. The probability of one loss is 30%. The probability of two consecutive losers is 30% × 30% = 9%. This is a high probability in finance and also in statistics. According to the rare event rule, if the probability of an event is small, it will probably not occur:

**Rare event rule:** if p(event) is small, then the event will probably not occur

What is low probability in finance? Remember that the 1987 crash was a 6-sigma event, a highly unlikely event but it did happen. However, if we structure our lives around extreme events, or left fat tails, we may accomplish nothing or little. We have to set a “reasonable” probability of rare events. Some think that p = 0.01 is a reasonable value in finance. But note that in the case of the original claim, three consecutive losers have probability 0.027, or 2.7%. This means that a drawdown caused by three losers in a row could occur and it is not a rare event.
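The streak arithmetic above can be sketched in a few lines. The first function reproduces the 0.3, 0.09, 0.027 sequence; the second, which goes one step beyond the text, estimates the chance of seeing at least one streak of k losers somewhere in a run of n trades. Both assume independent trades with a fixed loss probability, and the function names are illustrative.

```python
def streak_probability(p_loss, k):
    """Probability that the next k trades are all losers (independent trades)."""
    return p_loss ** k

def prob_streak_within(n, k, p_loss):
    """Probability of at least one run of k consecutive losers in n trades.

    state[j] holds the probability that no such run has occurred yet and
    the current trailing streak of losers has length j (0 <= j < k).
    """
    state = [1.0] + [0.0] * (k - 1)
    hit = 0.0
    for _ in range(n):
        new_state = [0.0] * k
        for j, prob in enumerate(state):
            new_state[0] += prob * (1 - p_loss)      # a winner resets the streak
            if j + 1 == k:
                hit += prob * p_loss                 # streak of k reached
            else:
                new_state[j + 1] += prob * p_loss    # streak grows by one
        state = new_state
    return hit

for k in (1, 2, 3):
    print(f"{k} consecutive losers: {streak_probability(0.3, k):.3f}")
print(f"at least one 3-loss streak in 100 trades: {prob_streak_within(100, 3, 0.3):.3f}")
```

Although any particular three-trade window produces three losers only 2.7% of the time, the chance of meeting such a streak somewhere in a long sequence of trades is much higher.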

**There are more problems with the naive expectation formula**

One may get a streak of consecutive losers, then a winner or two, and then another streak of consecutive losers. Although the method of investing or trading may be sound, the longer-term convergence to the expectation, or mean of the population, may take longer than expected and one may suffer significant losses or even face ruin. In a nutshell, trading or investing with probabilities can be tricky and even dangerous. There are ways of dealing with the uncertainties that are covered in my new book, Fooled by Technical Analysis: the perils of charting, backtesting and data-mining. But one thing is for sure as Howard wrote in his post:

Unless you are an insider, if you want to trade or invest you will have to deal with probabilities and you must understand how they work and the dangers they hide. I hope this blog post will provide an initial direction. The first step is to understand the correct notion of expectation and not get fooled by self-proclaimed experts and wannabe quants that lack understanding of it. When you average a small number of trades you are not calculating the expectation. This is only a random sample mean. You cannot make inferences about the population mean unless you have a large representative sample or you are willing to accept a large error. On top of this, non-stationarity can affect the results and render the investing method ineffective. For example, the U.S. equity markets switched from momentum-driven to mean-reversion-driven in the 1990s and this destroyed a large class of strategies used by traders and investors while causing massive wealth transfer. Always be prepared for a change in market conditions that will invalidate representative samples of the past. This is the name of the game and it can be played profitably but only if some misconceptions are first cleared and one stays away from hype and naive claims.
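To illustrate how a method with positive expectation can still lead to ruin, here is a small Monte Carlo sketch. It is a toy model and not one of the examples from the book: it assumes independent trades of fixed size (win 3L with probability 0.7, lose L otherwise), a constant per-trade friction, and ruin defined as capital falling to zero or below. The name `ruin_probability` and all parameter defaults are hypothetical.

```python
import random

def ruin_probability(start_capital, n_trades=200, n_paths=2000,
                     win_rate=0.7, win=3.0, loss=1.0, friction=0.0, seed=7):
    """Fraction of simulated paths that hit zero capital within n_trades.

    Capital, win, loss and friction are all expressed in units of L.
    """
    rng = random.Random(seed)
    ruined = 0
    for _ in range(n_paths):
        capital = start_capital
        for _ in range(n_trades):
            result = win if rng.random() < win_rate else -loss
            capital += result - friction
            if capital <= 0:
                ruined += 1
                break
    return ruined / n_paths

for capital in (1.0, 3.0, 5.0):
    print(f"start capital {capital:.0f} L: estimated ruin probability = "
          f"{ruin_probability(capital):.3f}")
```

With a starting capital of only one loss unit, a single early loser ends the game regardless of the edge, while a deeper cushion makes ruin far less likely in this model.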

© 2015 Michael Harris. All Rights Reserved. We grant a revocable permission to create a hyperlink to this blog subject to certain terms and conditions. Any unauthorized copy, reproduction, distribution, publication, display, modification, or transmission of any part of this blog is strictly prohibited without prior written permission.

## Ryan

From Wikipedia: In probability theory, the expected value of a random variable is the long-run average value of repetitions of the experiment it represents.

I understand your point but I'm not sure your claim regarding the definition of expected value is true. After a small number of trades your sample mean might not equal the expectation, but that doesn't mean the expectation is not 1.8L.

## Ryan

A more formal conclusion might be that positive expectation bets can have negative utility to a risk-averse investor, or that profitable strategies have a chance of ruin if sufficiently unlucky.

## Michael Harris

Hello Ryan,

The claim that the expectation is positive must be accepted by rejecting the null hypothesis that the expectation is 0. This requires sufficient samples. However, I agree that

"profitable strategies have a chance of ruin if sufficiently unlucky"

I think this is a good statement but I will also add to it:

"or insufficiently capitalized." In my book I have examples of strategies with positive expectation that produce 100% ruin depending on initial capital and trading friction. An example is buying the overnight changes in SPY. Depending on starting capital and commission rate, the strategy can generate high returns or lead to ruin.

## Michael Harris

Hello Ryan,

Especially in the case of probability and statistics I never use a public source of information such as Wikipedia. The correct interpretation of the expectation can be found in academic texts, for example in my graduate studies text by Papoulis, A., Probability, Random Variables, and Stochastic Processes, McGraw-Hill, p. 138, as follows:

"If n is sufficiently large, then the average is approximately equal to the expected value".

You claim that:

"After a small number of trades your sample mean might not equal the expectation, but that doesn't mean the expectation is not 1.8L."

This is exactly the main point. Even if you know the win rate, something that is questionable, the average will be equal to the expectation only after a sufficient sample. Note that a trader is not evaluated based on theoretical expectation but based on the average of executed trades.

Now, one can even assert that the expectation is not known if the sample is not sufficient. This is because the value of the win rate represents a claim. This is a circular problem. You know the true win rate only after a sufficient sample is generated. The claim (null hypothesis) must be tested against the alternative:

H0: win rate = w (or greater)

H1: win rate < w

This is a left-tail hypothesis test. There are complications regarding the applicability of this test in trading that are discussed in my new book. In a nutshell, all hypothesis tests in finance are conditioned on historical data.
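For readers who want to try the left-tail test on a record of trades, a minimal sketch using the exact binomial distribution follows. It assumes independent trades, which is one of the complications in practice, and the function name `left_tail_p_value` and the sample numbers are purely illustrative.

```python
from math import comb

def left_tail_p_value(wins, n, w0):
    """Exact left-tail p-value for H0: win rate >= w0 against H1: win rate < w0.

    Returns P(X <= wins) where X ~ Binomial(n, w0), assuming independent trades.
    """
    return sum(comb(n, k) * w0**k * (1 - w0)**(n - k) for k in range(wins + 1))

# Hypothetical record: 12 winners in 20 trades, claimed win rate 0.7.
p_value = left_tail_p_value(12, 20, 0.7)
print(f"p-value = {p_value:.4f}")
print("reject H0 at the 5% level" if p_value < 0.05 else "cannot reject H0 at the 5% level")
```

A small p-value is evidence against the claimed win rate; a large one only means that the sample is compatible with the claim, not that the claim has been confirmed.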

In general, the expectation is known only if you assume that the win rate (a proportion in statistics) is known with a high confidence. In trading you need a very large sample to have confidence at the 5% or 1% significance level of the claim that the win rate is equal to or greater than a certain value.

Therefore, in light of the above, I argue that the expectation is more often than not unknown. However, for 95% of traders it can be assumed that their sample of trades comes from a distribution with E[x] = 0, or 0 expectation, before trading friction is accounted for.
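As a rough guide to how large a sample is needed to pin down a win rate, the sketch below uses the standard normal-approximation formula for the sample size of a proportion, n = z² w(1 - w) / m², where m is the acceptable margin of error. It assumes independent trades and a stationary win rate, both questionable in markets; `trades_needed` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def trades_needed(win_rate, margin, confidence=0.95):
    """Trades required so that the estimated win rate lies within +/- margin
    of the true value at the given confidence (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z**2 * win_rate * (1 - win_rate) / margin**2)

for margin in (0.10, 0.05, 0.02):
    print(f"margin +/-{margin:.2f}: about {trades_needed(0.7, margin)} trades")
```

Under these assumptions, tightening the margin from five to two percentage points moves the requirement from hundreds to thousands of trades.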

## hughesfleming

Excellent article!

## sameer

Very nice explanation. I was also getting bogged down by the simple expectation formula and was looking for a more explicit answer.

Regards

Sameer