How long a track record is required to determine whether a fund manager or trader is skilled? The answer may surprise you.

In a recent Meb Faber podcast with professor Kenneth French, there was an intriguing discussion about skill vs. luck. French argued that investors underestimate the number of years required to obtain a statistically significant measure of a manager’s skill. He gave the following example:

A manager has an average alpha of 5% with the same volatility as the market, which is about 20%. How long a track record is needed for a t-statistic of 2? The answer is 64 years.

This short article will first examine the mathematics of the answer and then discuss the limitations of t-statistics to measure skill.

The mathematics

For the problem mentioned, we will use a simple measure of alpha:

Alpha = realized returns minus market returns.

We assume that the market has generated an average annualized return of 10% with 20% volatility. (The SPY ETF annualized return since inception has been 10.4%, with 18.7% volatility, so this is a good assumption for the market.)

T-statistic = (mean of sample – assumed mean)/SE

where SE = volatility/square root of number of years.

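If we plug in our numbers, with the manager's mean return at 15%, the assumed mean (the market) at 10%, and volatility at 20%, we get the following:

T-statistic = (15% – 10%) × √n / 20% = 2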

We solve the above equation for n and we get:

n = (40/5)^2 = 64

Therefore, as professor Kenneth French revealed in the podcast, a track record of 64 years, or about three investment lifetimes, is required for a t-statistic equal to 2.

The required length falls with the square of alpha. To reduce the track record’s length to 16 years, an alpha of 10% is required, since n = (40/10)^2 = 16!

Y-axis: number of years. X-axis: alpha. Generated with WolframAlpha
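The relationship behind the chart can be sketched in a few lines of Python. This is only an illustration of the formula above, not the WolframAlpha input used for the figure; the function name years_required and its parameters are my own, and the defaults assume 20% volatility and a target t-statistic of 2.

```python
def years_required(alpha_pct, vol_pct=20.0, t_target=2.0):
    """Solve alpha * sqrt(n) / vol = t_target for n (in years)."""
    if alpha_pct <= 0:
        raise ValueError("alpha must be positive")
    return (t_target * vol_pct / alpha_pct) ** 2


# Required track record for a few alpha levels (in percent).
for alpha in (2.5, 5.0, 10.0, 20.0, 40.0):
    print(f"alpha = {alpha:4.1f}%  ->  {years_required(alpha):6.1f} years")
```

Halving the alpha quadruples the required track record: a 5% alpha needs 64 years, while a 2.5% alpha needs 256.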

Issues with the T-statistic

Assume that a fund manager generates a 40% alpha in the first year after launching a fund. Is it possible? Yes, just look at the performance of some CTAs. In that case, with n = 1 and 20% volatility, the t-statistic will be 40/20 = 2. However, one year is not a long enough track record: the threshold is reached only because the alpha is enormous, and for smaller alphas the test has low power. The effect size is important. For a sample of 16 years, below is a chart of how the power of a test varies as a function of the effect size.
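A rough version of the power calculation can be sketched as follows. This is a simplification and not necessarily how the chart was produced: it assumes the 20% volatility is known (so the statistic is treated as normal rather than Student's t), a one-sided test, and the same threshold of 2 used above. The function name power is my own.

```python
from statistics import NormalDist


def power(alpha_pct, years=16, vol_pct=20.0, t_crit=2.0):
    """Probability that a manager with the given true alpha reaches t >= t_crit.

    Assumes known volatility, so the test statistic is normal with
    mean alpha * sqrt(years) / vol and unit variance.
    """
    shift = alpha_pct * years ** 0.5 / vol_pct
    return 1.0 - NormalDist().cdf(t_crit - shift)


# Power as a function of the effect size (true alpha, in percent).
for alpha in (2.5, 5.0, 10.0, 15.0, 20.0):
    print(f"alpha = {alpha:4.1f}%  ->  power = {power(alpha):.3f}")
```

Under these assumptions, a manager with a true 5% alpha has only about a 16% chance of showing a t-statistic of 2 after 16 years, and even a true 10% alpha gives a 50% chance.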

More importantly, the t-statistic assumes normally distributed data. Was the fund manager’s record “normal” enough, or did the alpha come from a couple of years of unusual, probably random, returns? In that scenario, the t-statistic is not an appropriate measure.
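One simple way to see how much a record depends on a couple of unusual years is to recompute the t-statistic without them. The sketch below uses a made-up 16-year record of annual alphas (the numbers are hypothetical, chosen only for illustration) and tests the mean alpha against zero using the sample standard deviation.

```python
from statistics import mean, stdev

# Hypothetical annual alphas in percent; two outlier years dominate the record.
ALPHAS = [2, -3, 1, 45, -1, 0, 3, -2, 38, 1, -4, 2, 0, -1, 3, -2]


def t_stat(sample, assumed_mean=0.0):
    """(sample mean - assumed mean) / (sample standard deviation / sqrt(n))."""
    se = stdev(sample) / len(sample) ** 0.5
    return (mean(sample) - assumed_mean) / se


full = t_stat(ALPHAS)
trimmed = t_stat(sorted(ALPHAS)[:-2])  # drop the two best years

print(f"mean alpha, all years:       {mean(ALPHAS):.2f}%")
print(f"t-statistic, all years:      {full:.2f}")
print(f"t-statistic, without top 2:  {trimmed:.2f}")
```

In this made-up record the average alpha is about 5%, yet removing the two best years turns the average slightly negative. The outliers also inflate the sample volatility, so the full-record t-statistic stays below 2 despite the high average.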

This is also a problem with CTAs. Because they chase outliers in the first place, their gains come from outliers, and no statistical method can determine if there is skill, regardless of the alpha. This is one of the reasons allocators exercise extreme caution when investing in CTAs, as they recognize the issue of false positives.

False negatives are another problem. If you wait for a long enough track record before selecting managers, you will miss some stellar performers.
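Both error types can be illustrated with a small simulation. This is a toy sketch under strong assumptions that real track records violate: annual alphas are independent and normally distributed with 20% volatility, the skilled managers have a true alpha of 5%, and each record is 16 years long. The function share_passing and its parameters are hypothetical names used only here.

```python
import random


def share_passing(true_alpha, years=16, vol=20.0, t_crit=2.0,
                  managers=5000, seed=1):
    """Fraction of simulated managers whose t-statistic reaches t_crit."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(managers):
        record = [rng.gauss(true_alpha, vol) for _ in range(years)]
        m = sum(record) / years
        var = sum((x - m) ** 2 for x in record) / (years - 1)
        t = m / (var ** 0.5 / years ** 0.5)  # test against zero alpha
        if t >= t_crit:
            passed += 1
    return passed / managers


false_positive_rate = share_passing(true_alpha=0.0)        # no skill, but passes
false_negative_rate = 1.0 - share_passing(true_alpha=5.0)  # skill, but fails

print(f"no-skill managers with t >= 2: {false_positive_rate:.1%}")
print(f"skilled managers with t < 2:   {false_negative_rate:.1%}")
```

Under these assumptions, a few percent of managers with no skill clear the threshold by luck, while the large majority of managers with a genuine 5% alpha fail it after 16 years.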

Bob Elliott, Co-Founder, CEO, and Chief Investment Officer at Unlimited, argued on X that the process of selecting a manager is qualitative. I agree that selecting a manager should be qualitative. In my opinion, there is so much randomness that skill is essentially a result of luck. There is no robust statistical method that will reveal true skill.

Finally, and most importantly, a t-statistic might tell you whether past performance was due to luck or skill, but it cannot tell you whether the skill will persist. Skill can disappear suddenly, especially in the case of fund managers who exploit market anomalies that abruptly vanish due to overcrowding or a change in market microstructure. For more details, see my article on SSRN, Limitations of Quantitative Claims About Trading Strategy Evaluation. I think that article makes a compelling case about the limitations of statistical methods.

Have a Great Labor Day Weekend!