Prospect Theory, Bias, and Chalk: Our 2017 March Madness Wrap
Congrats to the First Place Loser
Let’s start off in the obvious place: Mike Philbrick, the poor-man’s Gronkowski, went wire-to-wire in last place.
That makes us happy, and so first and foremost, we come to bury him.
It’s entirely his fault. We know because the scoring rules were such that, assuming public betting markets are reasonably good proxies for the true odds of a team winning a particular game, every bracket from the most sophisticated strategies to the purely random should have had the exact same total expected return, and hence an equal chance of victory.
The rules were set this way in order to maximize the odds that skill would emerge, and so to whatever extent we were successful in doing that, Mike has none.
Sample Size is Small No Matter What
If Mike can scramble for one dignity-sparing life preserver, it would be in this message that Adam Butler posted to our internal chat system after the second round table was posted:
LOL that Dan Adams (last year’s winner) and MP (last year’s 2nd or 3rd place right?) are in a dead heat for this year’s last place. If there was ever a signal that this whole concept is a racket, this is it… hilarious.
It’s worth mentioning that Adam didn’t even submit a bracket this year due to a taupe-colored and cardboard-flavored aversion to whimsy. As he posed to me during the editing of this very article, “Why not flip a coin 63 times and have people guess the sequence?”
Ridiculous though it may seem, he does have a point: while our rules were designed to maximize sample size, the largest possible sample size is capped at 63 (the total number of games in the tournament). Unfortunately, even in our system, legacy errors still exist, and as we’ll discuss below, your entries still displayed remarkable bias, making the effective sample size considerably less than 63. Even under ideal conditions – which our bracket challenge did not achieve – the sample size for March Madness is small enough that the outcome in any single year will appear lottery-like.
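To put a rough number on how noisy 63 games can be, here is a minimal sketch using Adam’s coin-flip framing. It assumes every game is an independent 50/50 guess, which is a simplification and not a model of our actual scoring; the function name is ours, invented for illustration.

```python
import math

def win_fraction_std(n_games: int, p: float = 0.5) -> float:
    """Standard deviation of the fraction of correct guesses over n_games
    independent games, each guessed correctly with probability p."""
    return math.sqrt(p * (1.0 - p) / n_games)

if __name__ == "__main__":
    # 63 is the number of games in the tournament; the larger counts are
    # arbitrary comparison points, not figures from our challenge.
    for n in (63, 630, 6300):
        print(f"{n:>5} games: std of hit rate = {win_fraction_std(n):.3f}")
```

With 63 games, a pure guesser’s hit rate wobbles by roughly six percentage points either way, and it takes a hundred times as many games to shrink that wobble by a factor of ten.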
Investing is different because we have the opportunity to apply strategies repeatedly over a long period of time. While the outcome of any short-term investment sub-period may have characteristics of randomness, over many repetitions patterns emerge. This opportunity isn’t available in March Madness, but contrary to Adam’s assertion that this makes the entire thing a “racket,” the less sports-averse among us took it as the fundamental reason to seek the largest sample size possible. After all, the larger the sample size, the less random the outcome.
This is why we equalized the expected points per team. In our pre-tournament post, we wrote:
If we must endure legacy errors – and we really can’t see any (unoppressively demanding) way around it – then the next best option is to create parity between the total expected values of every team. And the best way we know how to do that is to award points in inverse proportion to a team’s likelihood of advancing to a certain point in the tournament.
We then asked a very simple question:
…if the expected value of every team is the same, how are you going to make your picks?
And your collective answer was to get out a big ol’ crate of chalk.
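The parity idea quoted above can be sketched in a few lines. This is our illustration rather than the exact scoring code: the function names and the probabilities are hypothetical, and it assumes the market-implied probability is the true one.

```python
def points_for(prob_advancing: float, scale: float = 1.0) -> float:
    """Points awarded if a team advances: inversely proportional to its
    (market-implied) probability of getting that far."""
    if not 0.0 < prob_advancing <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return scale / prob_advancing

def expected_points(prob_advancing: float, scale: float = 1.0) -> float:
    """Expected points from picking the team = probability x payoff."""
    return prob_advancing * points_for(prob_advancing, scale)

if __name__ == "__main__":
    # Hypothetical probabilities: a heavy favorite, a coin flip, and a
    # long shot. None of these are real 2017 odds.
    for p in (0.95, 0.50, 0.05):
        print(f"p={p:.2f}  payoff={points_for(p):6.2f}  "
              f"expected={expected_points(p):.2f}")
```

Under these assumptions every pick carries the same expected value, so the only thing that differs between a favorite and a long shot is how widely the payoff swings.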
Our March Madness Entries Were Still Irrational
When we set the total expected values equally for every team, the goal was to elicit a meaningful uptick in the number of upsets chosen. In fact, in a perfectly rational world where betting markets perfectly reflect each team’s true odds of winning, the distribution of picks in the first round would have approached 50% for every team, regardless of seed.
Of course, that didn’t happen.
Judging by the divergence of our first round picks from Yahoo’s, where they use “standard” scoring rules, ReSolve’s scoring method barely made a dent in your approach to selecting winners:
Figure 1. First Round Pick Distribution for ReSolve and Yahoo March Madness Bracket Challenges, Yahoo Consensus Favorites, 2017
|Overall Seed||Team||Yahoo Public Picks||ReSolve Picks||Difference|
In only five cases (highlighted above) did our rules shift your picks by more than 15% while also bringing the overall pick percentage closer to 50%.
The overall pick bias was remarkably consistent, too:
Figure 2. Frequency of ReSolve First Round Picks by Seed
Don’t be confused by the bump in the 10 seed, either. That was largely driven by Wichita State, a team that was an underdog by seed, but a 70% favorite in Vegas.
Also, don’t think that this phenomenon was isolated to the first round. Bias towards higher-seeded teams flowed deeply through most brackets, as evidenced by the distribution of our champion picks:
Figure 3. Frequency of ReSolve National Champion Picks
87% of our entrants chose a 3 seed or higher to win it all. And sure, the plurality pick and 1-seed North Carolina won the National Championship, but you have to go all the way down to our #7 finisher to find the top bracket to have accurately predicted that. That bracket – submitted by “Hinch” – finished 20 points off the leader, and even then, North Carolina accounted for a meager 12% of Hinch’s points.
Around here, chalk doesn’t pay.
Why Did People Still Pick Favorites?
There are four possible reasons that people were biased towards favorites:
- An incomplete understanding of our scoring rules, which caused people to adhere to standard methods.
- Assumption that the number of correct picks would ultimately correlate strongly with total points.
- Failure to fully grasp the distribution in points that could be earned by a low seed scoring a huge upset or a mid-seed making a Cinderella run.
- A genuine belief that, based on a perceived informational edge, favorites would earn outsized points this year.
Let’s dissect these one at a time.
Possibility #1: Lack of Understanding of Scoring Rules
On this point, there’s not much to say. While we did our best to clearly explain the rules, we know that at least one person slightly misunderstood the scoring method. Hopefully that was isolated, but with rules as unique as ours, it is likely that a few entries didn’t know or understand how their brackets would be scored.
Possibility #2: False Assumption that Number of Correct Picks Would Correlate with Total Points
The following chart says it all:
Figure 4. Total Score and Number of Correct March Madness Picks, 2017
60 out of our 70 entries correctly picked between 32 and 42 games. Furthermore, once a bracket hit that range, the marginal value of an additional correct pick didn’t mean much.
As one (admittedly cherry-picked) example, both our 5th place and 63rd place entries chose 32 games correctly. Zooming the lens out, for our top 30 brackets by score, the correlation between the number of correct picks and total points was indistinguishable from zero.
This is exactly what we’d hope to see: the quality of the picks proved much more important than the quantity of winners.
Possibility #3: Failure to Grasp Distribution of Points by Team
Awarding points based on the inverse cumulative odds of a team accomplishing something remarkable inevitably leads to a skewed distribution of total points. This is no different than traditional rules where getting the national champion correct is worth 32x as many points as getting a first round game right. We have a similar skew, but in our tournament, we’re rewarding teams that outperform expectations.
Figure 5. Points by Team, 2017
Because of this, our distribution of points by team skewed heavily towards South Carolina, Xavier, and Oregon, who combined accounted for 59% of the total points available. Adding in North Carolina only bumps that to 64%, emphasizing the importance of underdogs and Cinderellas relative to the National Champion. It should come as no surprise that the top of our leaderboard had significant exposure to South Carolina and Xavier specifically.
Possibility #4: Genuine Belief that Top Seeds Would Outperform Expectations
This is the only rational justification for choosing so many favorites. However, there’s little data supporting the notion that your collective bias was information-driven.
How Prospect Theory Ruined Your Bracket
Ultimately, we believe that the biggest reason for your chalky brackets was deeply-rooted cognitive biases.
According to Daniel Kahneman, who won the Nobel Prize in Economics for work on the behavioural economics concept of Prospect Theory that he developed with Amos Tversky, fear of loss is about 2.5 times more powerful than lust for gains. That is, in the absence of a strong informational advantage, you tended to stick with the consensus picks and Vegas favorites.
In addition, we must consider the notions of absolute versus relative success and failure. Specifically, while entrants in our bracket challenge may have been anxious about their absolute performance, they would have been far more comfortable with a poor result so long as others followed similar strategies and performed equally. After all, failing together is more “comfortable” than failing alone.
But by using a consensus strategy, it also becomes very difficult to win. The contrarian behaviors that may lead to relative failure are also required to achieve relative success. And because prospect theory dictates that failing alone is the worst possible outcome one can experience, in the absence of an informational edge, most people prefer the safety of the herd.
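One way to see why the herd wins is a toy loss-averse value function. The sketch below is a piecewise-linear simplification that uses the 2.5x figure cited above as the loss-aversion multiplier. The function names are ours, and the full Prospect Theory value function also includes curvature and probability weighting, which are left out here.

```python
LOSS_AVERSION = 2.5  # the "about 2.5 times" figure cited in the text

def felt_value(outcome: float, loss_aversion: float = LOSS_AVERSION) -> float:
    """Subjective value of a gain or loss relative to a reference point.
    Losses are scaled up by the loss-aversion multiplier."""
    return outcome if outcome >= 0 else loss_aversion * outcome

def felt_value_of_gamble(gain: float, loss: float, p_gain: float = 0.5) -> float:
    """Probability-weighted subjective value of a two-outcome gamble.
    `loss` is passed as a positive amount."""
    return p_gain * felt_value(gain) + (1.0 - p_gain) * felt_value(-loss)

if __name__ == "__main__":
    # A fair coin flip for +1/-1 has zero expected value but feels negative.
    print(felt_value_of_gamble(gain=1.0, loss=1.0))   # -0.75
    # The gain has to be 2.5x the loss before the flip feels neutral.
    print(felt_value_of_gamble(gain=2.5, loss=1.0))   # 0.0
```

If the herd’s result is taken as the reference point, a contrarian pick that is perfectly fair on paper can still feel like a bad deal under this simplified model.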
How Prospect Theory Ruins Your Portfolio
If the connection to investing isn’t yet clear, consider the plight of thoughtful, globally-diversified investors over the past couple years. Sure, these investors have made money, but not nearly as much as their friends and neighbors with portfolios concentrated in US stocks.
Making things worse, a potent cocktail of home bias and equity concentration makes US stock portfolios more the norm than the aberration. As such, despite the fact that on an absolute basis everyone is winning, on a relative basis diversification has felt like a loser.
Times like these test the perseverance of investors who zig when everyone else is zagging. But if the process is thoughtful and backed by solid data, then the short term results should be a far smaller consideration than the weight of the evidence.
This is true in your portfolio, and it’s true in March Madness.
How We’re Going to Use Prospect Theory in Our 2018 Bracket Challenge
People respond to incentives. The problem this year was that our incentive to pry you away from the herd wasn’t strong enough. So that’s what we’re going to change next year. Though we reserve the right to change our rules between now and then, it makes sense that we would make the payoff for correctly choosing an underdog equivalent not for the rational strategist, but for the irrational mind.
That is, what if, after awarding points in inverse proportion to the cumulative odds (as we did this year), we further multiply the underdog’s points by a factor of 2 or 3?
We’ll spend our off-season thinking on it, but rest assured, we will never end our search for the incentive structure that meaningfully eliminates your pick biases.
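As a rough sketch of the idea floated above, the snippet below applies a hypothetical underdog multiplier on top of inverse-odds points for a single game. The function name, the 50% cutoff used to define an underdog, and the probabilities are all assumptions for illustration; the actual 2018 rules have not been decided.

```python
def boosted_expected_points(prob: float, multiplier: float = 2.0,
                            underdog_cutoff: float = 0.5) -> float:
    """Expected points for a pick when inverse-odds points are multiplied
    for underdogs (teams whose win probability is below the cutoff)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    base_points = 1.0 / prob          # inverse-odds payoff, as in 2017
    if prob < underdog_cutoff:
        base_points *= multiplier     # hypothetical underdog bonus
    return prob * base_points

if __name__ == "__main__":
    for p in (0.80, 0.20):            # made-up favorite and underdog
        for k in (1.0, 2.0, 3.0):
            print(f"p={p:.2f} multiplier={k:.0f} "
                  f"expected={boosted_expected_points(p, k):.2f}")
```

Under these assumptions the favorite’s expected value stays at 1 while the underdog’s rises to the multiplier, which is the kind of lopsided incentive a loss-averse picker might need before leaving the herd.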
And Lastly, the Final Results for ReSolve’s 2017 March Madness Bracket Challenge
Congratulations to colonel717, who scored almost exactly 50% of the total available points. Look upon this bracket, all ye losers, and imagine what could have been.