Imagine being tasked with examining the tattered carcases of bombers returning from missions in Germany during World War 2.  Your analysis will inform the military commanders where to best place additional armor on the planes before sending them on their next missions.  Let’s say that you found the most bullet holes around the tail gunner and in the wings.  Where would you recommend the extra armor be placed?

Around the tail gunner and on the wings, right? Because that’s obviously the most likely spot for planes to get hit.

This was my reflexive response, too.  But it turns out that’s almost completely backward.



You see, most people don’t intuitively reason in terms of conditional probability. The natural instinct, even among people with some training in statistics, is to tally where damage occurs most frequently and address the issue in those spots. But this ignores a critical piece of the puzzle.

Specifically, what we are observing is not a random sample of planes, but a sample of planes which made it back safely: a classic example of survivorship bias. This turns the problem on its head, and that distinction makes all the difference in the world.

Thank goodness the military had the good sense to consult a statistician. That statistician, Abraham Wald, worked with the Statistical Research Group at Columbia University, which advised the US military during the war. Wald understood that many of the bombers that flew missions over Germany simply didn’t return, which meant, of course, that he only got to examine the bombers that survived. The surviving bombers tended to have bullet holes in certain areas of the plane, and few or none in others. Wald reasoned that damage in the areas where returning planes had been hit was survivable; after all, the planes returned despite that damage. On the other hand, areas in which surviving planes showed little damage were probably more critical, as planes which were hit in those areas rarely returned. As a result, he recommended bulking up armor in these areas, a recommendation credited with saving many lives over the course of the war.

Wald knew that in order to make an accurate analysis and develop a reliable conclusion, he had to account for all the planes, including the ones he never got to examine, not just the survivors.
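
To see the effect in numbers, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the four section names, the per-hit loss probabilities, the three hits per plane, and the idea that hits land uniformly at random. None of it is historical data, and `simulate_missions` is a hypothetical helper, not Wald's actual method.

```python
import random

# Hypothetical chance that a single hit to each section downs the plane.
# These values are invented for illustration; they are not historical estimates.
LOSS_PROB_PER_HIT = {"engine": 0.60, "cockpit": 0.40, "wings": 0.05, "tail": 0.05}


def simulate_missions(n_planes=10_000, hits_per_plane=3, seed=0):
    """Return (hits on all planes, hits on returning planes, number returned)."""
    rng = random.Random(seed)
    sections = list(LOSS_PROB_PER_HIT)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)
    survivors = 0
    for _ in range(n_planes):
        # Assumption: hits land uniformly at random across the four sections.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane is lost if any one of its hits proves fatal.
        downed = any(rng.random() < LOSS_PROB_PER_HIT[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if not downed:
                survivor_hits[s] += 1
        if not downed:
            survivors += 1
    return all_hits, survivor_hits, survivors


if __name__ == "__main__":
    all_hits, survivor_hits, survivors = simulate_missions()
    print(f"planes returned: {survivors}")
    for section in LOSS_PROB_PER_HIT:
        print(f"{section:8s} all planes: {all_hits[section]:6d}   "
              f"returning planes: {survivor_hits[section]:6d}")
```

Under these made-up numbers, every section is hit about equally often across the whole fleet, yet the returning planes show far fewer engine and cockpit hits than wing and tail hits. Someone counting holes only on the survivors would conclude the wings and tail are the problem, which is exactly the mistake described above.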

To this day, survivorship bias still plagues researchers and statisticians.  Where elements of a dataset are not tracked throughout the entire measurement period, conclusions drawn from that data are likely to be skewed.  This is particularly true in financial research where, as a matter of capitalism, companies are constantly being created and destroyed.

Of course, creation and destruction occur at many levels of the financial world.  Mutual fund companies, for example, often merge underperforming funds with better performing brethren.  A Vanguard study measuring the period from 1997 to 2011 found that only 54% of funds at the beginning of the period remained at the end without being merged or liquidated.  Of those funds that remained, only 35% outperformed their benchmarks.  That 35% number is pretty bad, but here is where survivorship bias comes into play:

If we count all the funds that existed at the start of the period, including those that were later liquidated or merged, the proportion that both survived and outperformed was 0.54 * 0.35 = 0.189, or just 19%.
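
The same arithmetic as a small Python sketch. The two input figures are the ones quoted above from the Vanguard study; the variable and function names are hypothetical, and the calculation assumes that merged or liquidated funds count as not having outperformed.

```python
# Figures quoted above for the 1997 to 2011 period.
SURVIVAL_RATE = 0.54               # share of starting funds still around at the end
OUTPERFORM_AMONG_SURVIVORS = 0.35  # share of those survivors that beat their benchmark


def outperform_share_of_all(survival_rate, outperform_given_survival):
    """Share of ALL starting funds that both survived and outperformed.

    Assumption: a fund that was merged or liquidated is treated as a
    non-outperformer, so only survivors can contribute to the total.
    """
    return survival_rate * outperform_given_survival


share = outperform_share_of_all(SURVIVAL_RATE, OUTPERFORM_AMONG_SURVIVORS)
print(f"Survivors only:     {OUTPERFORM_AMONG_SURVIVORS:.0%}")
print(f"All starting funds: {share:.1%}")
```

Looking only at the survivors nearly doubles the apparent success rate compared with the full starting population.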

The point here isn’t to bash discretionary fund managers.  We. Do. That. Often. The point of this article is simply to demonstrate that when you put garbage data into an analysis – whether it’s World War 2 bombers or your investment portfolio – you’re going to get garbage conclusions.