Expecting Unreasonable Confidence

By Mark Roulo

Last Updated: 25-April-2020


People often want more certainty than one can attain from statistical analysis of available data. This essay illustrates how that unreasonable certainty gets manufactured and why it can be illusory.


People often want more certainty than one can attain from statistical analysis of available data. A good example of this is a common belief that the "Trinity Study" [1] of investment returns implies with high confidence that one can sustain 30 years of retirement spending while drawing down savings at 4% per year of the original savings amount, adjusting for inflation (up or down) each year.

The Trinity Study looked at "70 years of capital market returns generated from 1926 to 1995" and calculated "the average terminal value for all 51 20-year periods from 1926 to 1995."

It included tables that showed, as an example, that in 98% of the time periods examined, a 4% inflation-adjusted withdrawal rate from a 75% stock, 25% bond portfolio lasted for at least 30 years. A 100% stock portfolio had a 95% chance of lasting 30 years (though the average end value of the portfolio was higher).

Two separate quotes from an internet financial forum illustrate how these results are often interpreted:

    ... the Trinity study ... arrived at much the same conclusion, i.e. that 4% SWR has always worked in the past for a 30 year period as long as you held a reasonable %age of equities.
    :
    :
    :
    ... the 4% is not a "rule", but simply the observation that backtesting a 60/40 portfolio, 4% real withdrawal did not deplete completely the capital over 30 years in 95% (iirc) of the cases.

(underlining added by me)

The fundamental problem with any strong conclusions drawn from the Trinity Study is that 70 years of stock market returns provide fewer than four independent 20-year periods and fewer than three independent 30-year periods.
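
The window arithmetic can be made concrete with a minimal sketch. The function names here are my own, not anything from the study; the only inputs are the 70 years of data and the window lengths mentioned above.

```python
def rolling_windows(total_years, window):
    """Count overlapping windows, each starting one year after the previous one."""
    return max(total_years - window + 1, 0)


def independent_windows(total_years, window):
    """Count complete windows that share no years with one another."""
    return total_years // window


# 1926 through 1995 inclusive is 70 years of annual returns.
YEARS = 1995 - 1926 + 1

for window in (20, 30):
    print(window, rolling_windows(YEARS, window), independent_windows(YEARS, window))
# prints:
# 20 51 3
# 30 41 2
```

The 51 rolling 20-year periods match the count quoted from the study, yet only three of them can be chosen without sharing years, and only two complete 30-year periods fit without overlap.
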
So ... to illustrate (rather than prove) the problem with treating these "rolling time periods" the same as independent time periods, I will present the San Francisco Giants' results from the start of the 2010 baseball season.

A baseball game is (usually) nine innings and a "series" is a short set of games, usually three, against one opponent. Teams, fans and commentators speak of a team "winning a series", which means that the team has a winning record for the games in that series. In this case, to win a series would mean winning two or three of the games against the opposing team.

So ... the results for the 2010 San Francisco Giants' first three series were:

Series 1 (San Francisco vs Houston)

G#  Team  Inning                      Final
          1  2  3  4  5  6  7  8  9
 1  SFG   0  3  0  0  0  0  1  1  0       5  W
    HOU   0  0  0  0  0  0  0  0  2       2
 2  SFG   0  0  0  0  0  3  0  0  0       3  W
    HOU   0  0  0  0  0  0  0  0  0       0
 3  SFG   0  2  1  0  0  0  1  2  4      10  W
    HOU   0  0  0  1  0  0  3  0  0       4

Series 2 (San Francisco vs Atlanta)

G#  Team  Inning                      Final
          1  2  3  4  5  6  7  8  9
 4  SFG   0  0  0  0  0  0  2  0  3       5  W
    ATL   0  0  2  0  1  0  0  1  0       4
 5  SFG   0  0  0  1  0  0  0  0  1       2  L
    ATL   0  0  0  0  0  1  3  1  2       7
 6  SFG   0  0  0  1  0  2  0  3  0       6  W
    ATL   2  0  0  0  0  0  0  0  1       3

Series 3 (San Francisco vs Pittsburgh)

G#  Team  Inning                      Final
          1  2  3  4  5  6  7  8  9
 7  SFG   3  0  2  1  0  1  0  2  0       9  W
    PIT   0  1  0  0  1  0  1  0  0       3
 8  SFG   0  0  1  0  1  1  0  0  2       5  L
    PIT   1  0  2  0  0  0  0  1  2       6
 9  SFG   0  4  0  0  2  0  0  0  0       6  W
    PIT   0  0  0  0  0  0  0  0  0       0

  • Extra-inning games (game 4) have all the scoring from the extra innings added to the 9th inning.
  • Unplayed bottom innings (e.g. the 9th inning when the home team is ahead) are treated as zero-run innings.

The SF Giants won all three series and seven of the nine games (78% of games).
For reference, the US stock market had positive real returns in 8 of the most recent 11 decades [2] (73% of decades).

Converting the 81 innings to 73 "rolling" 9-inning games, the SF Giants went 53-13 (with 7 ties) in these theoretical rolling games. Converting these theoretical games to theoretical 3-game series (27 consecutive innings per series), the SF Giants went 50-3 (94% success rate!) in series play (with two series tied).
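
For readers who want to check the conversion, here is a minimal Python sketch of it. The inning-by-inning runs are transcribed from the box scores above. The scoring rules are my assumption rather than anything stated explicitly: a rolling game is won by whoever scores more runs over 9 consecutive innings, and a rolling series is won by whoever wins more of its three back-to-back rolling games (so one win, one loss and one tie counts as a tied series). Under those assumptions the sketch reproduces the records quoted above.

```python
# Runs per inning for games 1-9, transcribed from the box scores.
SFG = [
    [0, 3, 0, 0, 0, 0, 1, 1, 0], [0, 0, 0, 0, 0, 3, 0, 0, 0], [0, 2, 1, 0, 0, 0, 1, 2, 4],
    [0, 0, 0, 0, 0, 0, 2, 0, 3], [0, 0, 0, 1, 0, 0, 0, 0, 1], [0, 0, 0, 1, 0, 2, 0, 3, 0],
    [3, 0, 2, 1, 0, 1, 0, 2, 0], [0, 0, 1, 0, 1, 1, 0, 0, 2], [0, 4, 0, 0, 2, 0, 0, 0, 0],
]
OPP = [  # Houston (games 1-3), Atlanta (4-6), Pittsburgh (7-9)
    [0, 0, 0, 0, 0, 0, 0, 0, 2], [0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 1, 0, 0, 3, 0, 0],
    [0, 0, 2, 0, 1, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1, 3, 1, 2], [2, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0, 1, 0, 0], [1, 0, 2, 0, 0, 0, 0, 1, 2], [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

INNINGS_PER_GAME = 9
GAMES_PER_SERIES = 3


def sign(x):
    """Return +1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def flatten(games):
    """Join per-game inning lists into one continuous list of innings."""
    return [runs for game in games for runs in game]


def rolling_games(sfg, opp, length=INNINGS_PER_GAME):
    """Result (+1 win, 0 tie, -1 loss) for every run of `length` consecutive innings."""
    diffs = [s - o for s, o in zip(sfg, opp)]
    return [sign(sum(diffs[i:i + length])) for i in range(len(diffs) - length + 1)]


def rolling_series(games, length=INNINGS_PER_GAME, n=GAMES_PER_SERIES):
    """Result for every series of n back-to-back rolling games (assumed rule: wins vs losses)."""
    starts = len(games) - (n - 1) * length
    return [sign(sum(games[i + k * length] for k in range(n))) for i in range(starts)]


def record(results):
    """Summarize results as (wins, losses, ties)."""
    return results.count(1), results.count(-1), results.count(0)


games = rolling_games(flatten(SFG), flatten(OPP))
series = rolling_series(games)
print(record(games))   # (53, 13, 7)
print(record(series))  # (50, 3, 2)
```

Every one of the 55 rolling series is built from the same 81 innings, so neighboring series share almost all of their data and rise or fall together.
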

The fundamental problem with the 50-3-2 theoretical series result is that it is constructed out of only three independent samples!

We just can't have a lot of confidence about the results of the next series because we've only seen three series so far.

Nevertheless, people WANT confidence, and the rolling 30-year analysis appears to provide it for stock results. People think they can say that stock market returns over a thirty-year period "always" or "almost always" cleared a given minimum, or that some other desirable property held 95% of the time. But this is done by slicing and dicing at most two or three independent 30-year periods. Those two or three periods had portfolio returns with a given property; the future cannot be predicted with any certainty from so few samples.

When the same technique is applied to baseball results it becomes obvious why one can't be terribly certain about how the SF Giants would perform in their next series. (And, in fact, the SF Giants lost their next two series, then won the series after that before eventually winning the 2010 World Series ...).

  1. "The Trinity Study" is a common shorthand for a 1998 paper by Philip L. Cooley, Carl M. Hubbard and Daniel T. Walz from Trinity University (the private liberal arts university in San Antonio, TX, not Trinity College, the part of Cambridge University that housed Isaac Newton). The 1998 paper was titled:

    "Retirement Savings: Choosing a Withdrawal Rate That Is Sustainable"

    This paper used US stock market data from 1926 to 1995.

    The authors have a related paper, "Sustainable Withdrawal Rates From Your Retirement Portfolio", which includes two more years of data (1926 - 1997) and, interestingly, updates the success rates of the tables in a way that is inconsistent with the first paper. A 75% stock, 25% bond allocation now succeeds in 100% of the 53 (up from 51 because of the two extra years of data) rolling 30-year time periods considered.

    The basic conclusion of both papers is the same, however: A 75% stock, 25% bond portfolio almost always lasted 30 years when 4% of the original money was withdrawn each year, after adjusting for inflation.

  2. The decades with negative real returns were: the 1910s, 1970s and the 2000s. Interestingly, the 1930s had a slight positive real return partially because the first part of the 1929 stock market crash took place in the 1920s and partially because deflation increased real returns.