Standard Deviations

By Mark Roulo

Last Updated: August-2019

Introduction

Not all populations have a standard deviation.

You can always calculate a standard deviation for a (finite) sample from a given population, but if the population itself does not have a standard deviation, then you are fooling yourself if you think the standard deviation calculated from the sample has anything to do with the population.

An Example

The simplest way to establish that not all populations have standard deviations is to provide one that doesn't. Consider the following population with specified elements and the fraction of the population for each element:

Element	Fraction
1	$\frac{1}{2}$
2	$\frac{1}{4}$
4	$\frac{1}{8}$
8	$\frac{1}{16}$
16	$\frac{1}{32}$
32	$\frac{1}{64}$
:	:

So, ½ the population has a value of 1, ¼ the population has a value of 2, ⅛ the population has a value of 4, etc.

The first thing to notice is that the population has no average value!

The weighted value for each element (the element multiplied by its fraction of the population) is ½, so the average of the population is:

$\frac{1}{2} \cdot 1 + \frac{1}{4} \cdot 2 + \frac{1}{8} \cdot 4 + \frac{1}{16} \cdot 8 + \cdots$

Which is:

½ + ½ + ½ + ½ + ...

Which is either undefined or infinity, depending on which of these two makes you happier. In either case, there is no average. Which makes calculating a standard deviation troublesome.
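The divergence can be checked numerically. The following is a minimal sketch, assuming we index the elements as 2**k with fraction 1 / 2**(k + 1) for k = 0, 1, 2, ... (the function name `partial_mean` is ours, for illustration only). It uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction

def partial_mean(n_terms):
    """Sum the first n_terms weighted values of the population.

    Term k (k = 0, 1, 2, ...) is the element 2**k multiplied by its
    fraction of the population, 1 / 2**(k + 1).
    """
    total = Fraction(0)
    for k in range(n_terms):
        element = 2 ** k
        fraction = Fraction(1, 2 ** (k + 1))
        total += fraction * element  # every term contributes exactly 1/2
    return total

for n in (1, 10, 100, 1000):
    print(n, partial_mean(n))
```

The partial sum after n terms is exactly n/2, so it never settles on a value no matter how many terms we include.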

We can change the population a bit to provide it with an average, while still arranging to have no standard deviation. Consider this population:

Element	Fraction
1	$\frac{1}{4}$
-1	$\frac{1}{4}$
2	$\frac{1}{8}$
-2	$\frac{1}{8}$
4	$\frac{1}{16}$
-4	$\frac{1}{16}$
8	$\frac{1}{32}$
-8	$\frac{1}{32}$
16	$\frac{1}{64}$
-16	$\frac{1}{64}$
32	$\frac{1}{128}$
-32	$\frac{1}{128}$
:	:

Now we have an average for the population: 0.

And we can try to calculate the standard deviation. First we'll try to calculate the variance. If we can do that, we can take the square root to get the standard deviation. The variance is:

$\frac{1}{4} \cdot {(1 - 0)}^{2} + \frac{1}{4} \cdot {(-1 - 0)}^{2} + \frac{1}{8} \cdot {(2 - 0)}^{2} + \frac{1}{8} \cdot {(-2 - 0)}^{2} + \frac{1}{16} \cdot {(4 - 0)}^{2} + \cdots$

Which is:

$\frac{1}{4} + \frac{1}{4} + \frac{4}{8} + \frac{4}{8} + \frac{16}{16} + \cdots$

Which gets arbitrarily large as we add more terms. The population as a whole has an undefined variance and thus an undefined standard deviation.
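Both halves of this claim can be sketched in code: the partial sums of the variance grow without bound, and a standard deviation computed from a sample does not settle down as the sample grows. This is an illustrative sketch only. The function names, the seed, and the sample sizes are our own choices, and the sampling trick (count coin flips until the first "stop" to pick the magnitude) is just one convenient way to draw from this population.

```python
import random
import statistics
from fractions import Fraction

def partial_variance(n_pairs):
    """Sum the variance contributions of the first n_pairs pairs (+2**k, -2**k)."""
    total = Fraction(0)
    for k in range(n_pairs):
        fraction = Fraction(1, 2 ** (k + 2))       # share of +2**k, and also of -2**k
        total += 2 * fraction * (2 ** k - 0) ** 2  # the population mean is 0
    return total

def draw(rng):
    """Draw one element: magnitude 2**k with probability 1 / 2**(k + 1), random sign."""
    k = 0
    while rng.random() < 0.5:
        k += 1
    sign = 1 if rng.random() < 0.5 else -1
    return sign * 2 ** k

for n in (1, 2, 3, 10, 20):
    print("pairs:", n, "partial variance:", partial_variance(n))

rng = random.Random(2019)  # arbitrary seed, only for repeatability
sample = [draw(rng) for _ in range(100_000)]
for size in (100, 10_000, 100_000):
    print("sample size:", size, "sample std dev:", statistics.pstdev(sample[:size]))
```

The partial variance after n pairs is (2**n - 1) / 2, which roughly doubles with each added pair. The sample standard deviations depend on the seed and need not rise at every step, but they are dominated by whichever rare, huge element happens to land in the sample, so they tell us nothing stable about the population.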

We should be able to keep going:

Population with mean and variance (thus standard deviation), but no defined skew.
Population with mean, variance and skew, but no defined kurtosis.
And so on ...
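Here is one candidate for the first item on that list. It is our own construction, offered as an assumption rather than something established above: let element 2**k make up (7/8) / 8**k of the population, for k = 0, 1, 2, ... The fractions sum to 1, the mean converges to 7/6, and the mean of the squares converges to 7/4 (so the variance is 7/4 - (7/6)² = 7/18). The mean of the cubes, which the skew depends on, adds 7/8 with every term and so diverges. The sketch below checks the partial sums with exact fractions.

```python
from fractions import Fraction

def partial_moment(power, n_terms):
    """Partial sum of element**power times fraction over the first n_terms elements.

    Hypothetical population: element 2**k makes up (7/8) / 8**k of the
    population, for k = 0, 1, 2, ...
    """
    total = Fraction(0)
    for k in range(n_terms):
        fraction = Fraction(7, 8) / 8 ** k
        total += fraction * (2 ** k) ** power
    return total

for n in (5, 20, 80):
    print(
        n,
        float(partial_moment(1, n)),  # settles near 7/6
        float(partial_moment(2, n)),  # settles near 7/4
        float(partial_moment(3, n)),  # equals 7n/8, grows without bound
    )
```

Raising the 8 in the fraction to 16 would, by the same reasoning, give finite third moments and a divergent fourth moment, which is the pattern needed for the second item on the list.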

Does This Matter?

This is theoretically interesting (or not ...), but does any of it matter? Are there "real world" distributions that do not have standard deviations? The answer may be, "yes."

The following is part of an interview from Jack D. Schwager's "The New Market Wizards." William Eckhardt is a mathematician who is (or was) also a commodities trader:

William Eckhardt:	A robust statistical estimator is one that is not perturbed much by mistaken assumptions about the nature of the distribution.
Jack D. Schwager:	Why do you feel such techniques are more appropriate for trading system analysis?
William Eckhardt:	Because I believe that price distributions are pathological.
Jack D. Schwager:	In what way?
William Eckhardt:	As one example, price distributions have more variance [a statistical measure of the variability in the data] than one would expect on the basis of normal distribution theory. Benoit Mandelbrot, the originator of the concept of fractional dimension, has conjectured that price change distributions actually have infinite variance. The sample variance [i.e., the implied variability in prices] just gets larger and larger as you add more data. If this is true, then most standard statistical techniques are invalid for price data applications.
Jack D. Schwager:	I don't understand. How can the variance be infinite?
William Eckhardt:	A simple example can illustrate how a distribution can have an infinite mean. (By the way, a variance is a mean — it's the mean of the squares of the deviations from another mean.) Consider a simple, one-dimensional random walk generated, say, by the tosses of a fair coin. We are interested in the average waiting time between successive equalizations of heads and tails — that is, the average number of tosses between successive ties in the totals for heads and tails. Typically, if we sample this process, we find that the waiting time between ties tends to be short. This is hardly surprising. Since we always start from a tie situation in measuring the waiting time, another tie is usually not far away. However, sometimes, either heads or tails gets far ahead, albeit rarely, and then we may have to wait an enormous amount of time for another tie, especially since additional tosses are just as likely to increase this discrepancy as to lessen it. Thus, our sample will tend to consist of a lot of relatively short waiting times and a few disquietingly large outliers. What's the average? Remarkably, this distribution has no average, or you can say the average is infinite. At any given stage, your sample average will be finite, of course, but as you gather more sample data, the average will creep up inexorably. If you draw enough sample data, you can make the average in your sample as large as you want.
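Eckhardt's coin-tossing example is easy to simulate. The sketch below is our own illustration, not anything from the interview. Because a single wait can be enormously long, it gives up after `max_tosses` tosses and records the cap instead. That cap, the seed, and the number of waits are assumptions made only to keep the run short, and capping means the printed averages understate the true ones.

```python
import random

def wait_for_tie(rng, max_tosses):
    """Toss a fair coin until heads and tails are tied again.

    Returns the number of tosses used, or max_tosses if no tie occurred
    by then (the cap is an assumption that keeps the run finite).
    """
    lead = 0  # heads minus tails
    for tosses in range(1, max_tosses + 1):
        lead += 1 if rng.random() < 0.5 else -1
        if lead == 0:
            return tosses
    return max_tosses

rng = random.Random(1)  # arbitrary seed, only for repeatability
waits = [wait_for_tie(rng, max_tosses=100_000) for _ in range(2_000)]

ordered = sorted(waits)
print("median wait:", ordered[len(ordered) // 2])
print("longest wait:", ordered[-1])
for size in (20, 200, 2_000):
    print("waits:", size, "average:", sum(waits[:size]) / size)
```

Half of all waits end after just two tosses, so the median stays tiny, while the average is driven by a handful of very long waits. Raising `max_tosses` lets those outliers grow, which is the "creeping up" behavior Eckhardt describes.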