Descriptive Statistics In The Financial Blogosphere

Anyone using Excel can generate descriptive statistics from market data, such as the mean, standard deviation, kurtosis and skewness of returns, or even the percentage of winning trading days after specific holidays. Such descriptive statistics are for the most part useless for trading the markets, and this follows from mathematics, not only from common sense. Anyone who relies on descriptive statistics for trading the markets is doomed to fail.

Descriptive statistics are a way of describing data, usually a population under study. In trading, they cannot be used to reach any conclusion about the profitability of a single trade, only about a sample of trades that is representative of the population. For example, we know that in the last 52 years the market was up 37 times on the Friday before Labor Day, a win rate of 100*37/52 = 71.15%. Can we use this information to make money?

The above information is not useful. Obviously, the next trade can be a loser. This is similar to tossing a coin biased towards heads: tails may still show up on the next toss. The bias becomes apparent only after a sufficient number of tosses. For example, this year the market dropped on the Friday before Labor Day. As a result, the win rate dropped from 100*37/52 = 71.15% to 100*37/53 = 69.81%. Three winners in a row are now required to reach 100*40/56 = 71.43%, which slightly exceeds the previous win rate. If the bias is a property of the population, then eventually the win rate will return to its expected level. However, this is of no consolation to someone who relied on that statistic to bet money. In fact, a streak of losers can wipe the speculator out before the win rate returns to its expected level. These scenarios are intuitive and common sense, but financial bloggers who present useless statistics overlook them.
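The win-rate arithmetic above can be checked with a few lines of Python. This is only a sketch: the function and variable names are illustrative, and the counts (37 winners in 52 years, followed by one loser) are the ones quoted in the text.

```python
def win_rate(wins, trials):
    """Win rate in percent."""
    return 100.0 * wins / trials


def winners_needed(prev_wins, prev_trials, wins, trials):
    """Smallest number of consecutive winners that lifts wins/trials
    strictly above prev_wins/prev_trials (integer cross-multiplication
    avoids floating-point comparison issues)."""
    k = 0
    while (wins + k) * prev_trials <= prev_wins * (trials + k):
        k += 1
    return k


before = win_rate(37, 52)          # record before this year's loser
after = win_rate(37, 53)           # record after this year's loser
k = winners_needed(37, 52, 37, 53)
recovered = win_rate(37 + k, 53 + k)

print(f"before: {before:.2f}%  after: {after:.2f}%")
print(f"consecutive winners needed: {k}  ->  {recovered:.2f}%")
```

A single loser costs more than one percentage point, and it takes three consecutive winners, that is, three more years, just to get back above the starting level.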

The next important consideration relies on mathematics. Suppose that the true bias in the Labor Day example is not 71.15% but only a little better than 50%, and that the sample so far has not been large enough to reflect the correct value of the parameter. In other words, a random sequence of events happened to generate 37 winners in 52 trials, a success rate of 71.15%, but as the number of trials gets large the success rate will converge to just above 50%. How many trials, or years in this example, are required so that the probability of choosing the wrong side of the trade is small, let us say 1%?
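The convergence described above can be illustrated with a small simulation. This is a sketch under stated assumptions: the true win probability of 0.51, the checkpoint sizes and the seed are all arbitrary choices made for illustration, and the trials are assumed to be independent.

```python
import random


def running_win_rate(p, n_trials, checkpoints, seed=0):
    """Simulate independent trials with win probability p and record
    the observed win rate at each checkpoint."""
    rng = random.Random(seed)      # fixed seed so the run is repeatable
    wins = 0
    rates = {}
    for i in range(1, n_trials + 1):
        if rng.random() < p:
            wins += 1
        if i in checkpoints:
            rates[i] = wins / i
    return rates


checkpoints = {52, 1_000, 10_000, 100_000}
rates = running_win_rate(p=0.51, n_trials=100_000, checkpoints=checkpoints)

for n in sorted(rates):
    print(f"after {n:>7} trials: observed win rate {100 * rates[n]:.2f}%")
```

With only 52 trials the observed rate can sit well away from 51%, and changing the seed moves it around noticeably. Only at the larger checkpoints does it settle near the true value.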

The Chernoff bound gives an estimate of the minimum number n of independent trials, each with probability p > 1/2, required so that the probability of choosing the wrong side is at most ε. The formula is shown below:

$$n \geq \frac{1}{(p - 1/2)^2} \ln \frac{1}{\sqrt{\varepsilon}}$$
When p gets close to 0.5, the number of required trials clearly becomes very large. For example, for p = 0.51 (51%) and ε = 0.01, the required number of trials is close to 23,000. Therefore, with a small bias of 51%, only after 23,000 years would the probability of choosing the wrong side fall to 1% at most. Even if we can live with a 10% error, the required sample size is about 11,500. This is still too large for financial price series.
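The figures above can be reproduced with a short calculation. The sketch below assumes the bound takes the commonly quoted form n ≥ ln(1/√ε) / (p − 1/2)², which matches the numbers in the text. The function name is illustrative.

```python
import math


def chernoff_min_trials(p, eps):
    """Minimum number of independent trials so that the probability of
    picking the wrong side is at most eps, assuming the bound
    n >= ln(1 / sqrt(eps)) / (p - 1/2)**2 (an assumed form, see above)."""
    if not 0.5 < p <= 1.0:
        raise ValueError("p must be in (0.5, 1]")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must be in (0, 1)")
    return math.ceil(math.log(1.0 / math.sqrt(eps)) / (p - 0.5) ** 2)


n_1pct = chernoff_min_trials(0.51, 0.01)        # 1% error, 51% bias
n_10pct = chernoff_min_trials(0.51, 0.10)       # 10% error, 51% bias
n_observed = chernoff_min_trials(37 / 52, 0.01)  # bias taken at face value

print(f"p = 0.51,  eps = 0.01: n >= {n_1pct}")
print(f"p = 0.51,  eps = 0.10: n >= {n_10pct}")
print(f"p = 37/52, eps = 0.01: n >= {n_observed}")
```

The last line shows the other side of the argument. If the observed 71.15% is taken as the true bias, the bound is satisfied by a sample of only a few dozen years. The required sample is large only when the true bias is small, and that is exactly what the data cannot tell us.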

The problem is that we do not know how much our coin is biased or, in the Labor Day example, whether the market is actually biased to generate more gains than losses on the Friday before the holiday. If we assume that a large bias is present, then the number of trials necessary to choose the wrong side with a small probability is also small, but such an assumption is not justified. This is true in general when using descriptive statistics based on limited samples. If the sample is due to a random sequence and the assumed bias is a fluke, then the statistics are useless, something that is the case 99% of the time. Relying on statistical flukes and small samples leads to losses.
