Prospect Theory, Bias, and Chalk: Our 2017 March Madness Wrap

Congrats to the First Place Loser

Let’s start off in the obvious place: Mike Philbrick, the poor-man’s Gronkowski, went wire-to-wire in last place.

That makes us happy, and so first and foremost, we come to bury him.

It’s entirely his fault.  We know because the scoring rules were such that, assuming public betting markets are reasonably good proxies for the true odds of a team winning a particular game, every bracket from the most sophisticated strategies to the purely random should have had the exact same total expected return, and hence an equal chance of victory.

The rules were set this way in order to maximize the odds that skill would emerge, and so to whatever extent we were successful in doing that, Mike has none.

Source: mysticalpha

Sample Size is Small No Matter What

If Mike can scramble for one dignity-sparing life preserver, it would be in this message that Adam Butler posted to our internal chat system after the second round table was posted:

LOL that Dan Adams (last year’s winner) and MP (last year’s 2nd or 3rd place right?) are in a dead heat for this year’s last place. If there was ever a signal that this whole concept is a racket, this is it… hilarious.
It’s worth mentioning that Adam didn’t even submit a bracket this year due to a taupe-colored and cardboard-flavored aversion to whimsy. As he posed to me during the editing of this very article, “Why not flip a coin 63 times and have people guess the sequence?”

Ridiculous though it may seem, he does have a point: while our rules were designed to maximize sample size, the sample size is capped at 63 (the total number of games in the tournament). Unfortunately, even in our system, legacy errors still exist, and as we’ll discuss below, your entries still displayed remarkable bias, making the effective sample size considerably less than 63. Even under ideal conditions, which our bracket challenge did not achieve, the sample size for March Madness is small enough that the outcome in any single year will appear lottery-like.

Investing is different because we have the opportunity to apply strategies repeatedly over a long period of time. While the outcome of any short-term investment sub-period may have characteristics of randomness, over many repetitions patterns emerge. This opportunity isn’t available in March Madness. But rather than conclude, as Adam did, that this makes the entire thing a “racket,” the less sports-averse among us took it as the fundamental reason to pursue the largest sample size possible. After all, the larger the sample size, the less random the outcome.
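To put rough numbers on that last point, here is a minimal sketch. It assumes (our simplification, not anything drawn from the contest data) that every game is an independent pick with the same hit rate, and the helper name `share_std` is made up for illustration. Under those assumptions, the spread in the share of correct picks shrinks with the square root of the number of games.

```python
import math


def share_std(hit_rate: float, n_games: int) -> float:
    """Standard deviation of the share of correct picks over n_games.

    Assumes independent games with an identical hit rate (a simplification).
    """
    return math.sqrt(hit_rate * (1.0 - hit_rate) / n_games)


# 63 games is one tournament; the larger counts are hypothetical repetitions.
for n_games in (63, 630, 6300):
    spread = share_std(0.5, n_games)
    print(f"{n_games:>5} games: +/- {spread:.1%} around the true hit rate")
```

With only 63 games the noise is several percentage points wide, which is plenty to reshuffle a leaderboard. It takes 100 times as many games to cut that noise by a factor of 10.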

This is why we equalized the expected points per team.  In our pre-tournament post, we wrote:

If we must endure legacy errors – and we really can’t see any (unoppressively demanding) way around it – then the next best option is to create parity between the total expected values of every team.  And the best way we know how to do that is to award points in inverse proportion to a team’s likelihood of advancing to a certain point in the tournament.
We then asked a very simple question:

if the expected value of every team is the same, how are you going to make your picks?
And your collective answer was to get out a big ol’ crate of chalk.
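For anyone who wants to see why the parity idea quoted above works, here is a stripped-down sketch. It is not our actual scoring code: it looks at a single outcome rather than cumulative odds across rounds, the probabilities are made up, and the function names are hypothetical. The point is only that paying in inverse proportion to the odds makes the expected payoff identical for every pick.

```python
def points_for_correct_pick(prob_advance: float, scale: float = 1.0) -> float:
    """Points paid when a pick comes through, inversely proportional to its odds."""
    return scale / prob_advance


def expected_points(prob_advance: float, scale: float = 1.0) -> float:
    """Probability of being right times the payoff for being right."""
    return prob_advance * points_for_correct_pick(prob_advance, scale)


# Illustrative (made-up) probabilities for three kinds of picks.
for label, prob in [("heavy favorite", 0.95), ("coin flip", 0.50), ("long shot", 0.05)]:
    payoff = points_for_correct_pick(prob)
    print(f"{label:>14}: pays {payoff:5.2f} if right, expected {expected_points(prob):.2f}")
```

The long shot pays far more when it hits, but it hits far less often, so the expected points come out the same across the board. What differs is the variance, which is where the rest of this post is headed.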

Our March Madness Entries Were Still Irrational

When we set the total expected values equal for every team, the goal was to elicit a meaningful uptick in the number of upsets chosen. In fact, in a perfectly rational world where betting markets perfectly reflect each team’s true odds of winning, the distribution of picks in the first round would have approached 50% for every team, regardless of seed.

Of course, that didn’t happen.

Judging by the divergence of our first round picks from Yahoo’s, where they use “standard” scoring rules, ReSolve’s scoring method barely made a dent in your approach to selecting winners:

Figure 1. First Round Pick Distribution for ReSolve and Yahoo March Madness Bracket Challenges, Yahoo Consensus Favorites, 2017

Overall Seed Team Yahoo Public Picks ReSolve Picks Difference
1 Villanova 99% 93% 6%
3 Duke 99% 91% 7%
2 Kansas 99% 91% 7%
4 North Carolina 99% 96% 3%
5 Gonzaga 98% 93% 6%
6 Kentucky 98% 96% 3%
7 Arizona 98% 96% 2%
8 Louisville 98% 93% 5%
9 UCLA 97% 93% 4%
10 Oregon 95% 86% 10%
11 Baylor 94% 94% -1%
12 Butler 93% 70% 23%
13 West Virginia 91% 84% 7%
14 Purdue 89% 76% 13%
15 Florida State 86% 80% 5%
16 Notre Dame 85% 83% 2%
17 Florida 85% 69% 17%
18 SMU 79% 69% 11%
19 Iowa State 79% 79% 0%
20 Virginia 79% 39% 40%
21 Michigan 76% 67% 9%
22 Cincinnati 75% 71% 4%
23 Wisconsin 72% 39% 33%
24 Wichita State 71% 67% 4%
25 Marquette 64% 79% -15%
26 Michigan State 61% 66% -5%
27 Maryland 58% 44% 13%
28 Creighton 58% 43% 15%
29 Seton Hall 56% 30% 26%
30 Minnesota 55% 57% -2%
31 Saint Mary’s 55% 57% -2%
32 Vanderbilt 51% 53% -2%

In only five cases (highlighted above) did our rules shift your picks by more than 15 percentage points while also bringing the overall pick percentage closer to 50%.

The overall pick bias was remarkably consistent, too:

Figure 2. Frequency of ReSolve First Round Picks by Seed

Don’t be confused by the bump in the 10 seed, either. That was largely driven by Wichita State, a team that was an underdog by seed, but a 70% favorite in Vegas.

Also, don’t think that this phenomenon was isolated to the first round. Bias towards higher-seeded teams flowed deeply through most brackets, as evidenced by the distribution of our champion picks:

Figure 3. Frequency of ReSolve National Champion Picks

Team Seed Frequency
North Carolina 1 11
Gonzaga 1 10
Kansas 1 10
Villanova 1 8
Oregon 3 7
Kentucky 2 6
Duke 2 5
Arizona 2 4
Michigan 7 2
Maryland 6 1
Notre Dame 5 1
Cincinnati 6 1
Creighton 6 1
Virginia Tech 9 1
Florida 4 1
Marquette 10 1

87% of our entrants chose a 3 seed or higher to win it all.  And sure, the plurality pick and 1-seed North Carolina won the National Championship, but you have to go all the way down to our #7 finisher to find the top bracket to have accurately predicted that.  That bracket – submitted by “Hinch” – finished 20 points off the leader, and even then, North Carolina accounted for a meager 12% of Hinch’s points.

Around here, chalk doesn’t pay.
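For the skeptics, the 87% figure can be checked directly against Figure 3. The short sketch below simply re-keys that table (team, seed, number of entries picking them as champion) and counts the entries that backed a 1, 2, or 3 seed. The variable names are ours and purely illustrative.

```python
# (team, seed, number of entries picking that team as champion), from Figure 3.
champion_picks = [
    ("North Carolina", 1, 11), ("Gonzaga", 1, 10), ("Kansas", 1, 10),
    ("Villanova", 1, 8), ("Oregon", 3, 7), ("Kentucky", 2, 6),
    ("Duke", 2, 5), ("Arizona", 2, 4), ("Michigan", 7, 2),
    ("Maryland", 6, 1), ("Notre Dame", 5, 1), ("Cincinnati", 6, 1),
    ("Creighton", 6, 1), ("Virginia Tech", 9, 1), ("Florida", 4, 1),
    ("Marquette", 10, 1),
]

total_entries = sum(count for _, _, count in champion_picks)
chalk_entries = sum(count for _, seed, count in champion_picks if seed <= 3)
chalk_share = chalk_entries / total_entries

print(f"{chalk_entries} of {total_entries} entries ({chalk_share:.0%}) picked a 1, 2, or 3 seed")
```

That works out to 61 of our 70 entries.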

Why Did People Still Pick Favorites?

There are four possible reasons that people were biased towards favorites:

  • An incomplete understanding of our scoring rules, which caused people to adhere to standard methods.
  • Assumption that the number of correct picks would ultimately correlate strongly with total points.
  • Failure to fully grasp the distribution in points that could be earned by a low seed scoring a huge upset or a mid-seed making a Cinderella run.
  • A genuine belief that, based on a perceived informational edge, favorites would earn outsized points this year.

Let’s dissect these one at a time.

Possibility #1: Lack of Understanding of Scoring Rules

On this point, there’s not much to say. While we did our best to clearly explain the rules, we know that at least one person slightly misunderstood the scoring method. Hopefully that was isolated, but with rules as unique as ours, it is likely that a few entrants didn’t know or understand how their brackets would be scored.

Possibility #2: False Assumption that Number of Correct Picks Would Correlate with Total Points

The following chart says it all:

Figure 4. Total Score and Number of Correct March Madness Picks, 2017

60 out of our 70 entries correctly picked between 32 and 42 games.  Furthermore, once a bracket hit that range, the marginal value of an additional correct pick didn’t mean much.

As one (admittedly cherry-picked) example, both our 5th place and 63rd place entries chose 32 games correctly.  Zooming the lens out, for our top 30 brackets by score, the correlation between the number of correct picks and total points was indistinguishable from zero.

This is exactly what we’d hope to see: the quality of the picks proved much more important than the quantity of winners.

Possibility #3: Failure to Grasp Distribution of Points by Team

Awarding points based on the inverse cumulative odds of a team accomplishing something remarkable inevitably leads to a skewed distribution of total points.  This is no different than traditional rules where getting the national champion correct is worth 32x as many points as getting a first round game right.  We have a similar skew, but in our tournament, we’re rewarding teams that outperform expectations.

Figure 5.  Points by Team, 2017

Because of this, our distribution of points by team skewed heavily towards South Carolina, Xavier, and Oregon, who combined accounted for 59% of the total points available. Adding in North Carolina only bumps that to 64%, emphasizing the importance of underdogs and Cinderellas relative to the National Champion. It should come as no surprise that the top of our leaderboard had significant exposure to South Carolina and Xavier specifically.

Possibility #4: Genuine Belief that Top Seeds Would Outperform Expectations

This is the only rational justification for choosing so many favorites.  However, there’s little data supporting the notion that your collective bias was information-driven.

How Prospect Theory Ruined Your Bracket

Ultimately, we believe that the biggest reason for your chalky brackets was deeply rooted cognitive bias.

According to Prospect Theory, the behavioural economics framework developed by Daniel Kahneman and Amos Tversky (and for which Kahneman won the Nobel Prize in Economics), fear of loss is about 2.5 times more powerful than lust for gains. As a result, in the absence of a strong informational advantage, you tended to stick with the consensus picks and Vegas favorites.

In addition, we must consider the notions of absolute versus relative success and failure. Specifically, while entrants in our bracket challenge may have been anxious about their absolute performance, they would have been far more comfortable so long as others followed similar strategies and performed equally. After all, failing together is more “comfortable” than failing alone.

But by using a consensus strategy, it also becomes very difficult to win.  The contrarian behaviors that may lead to relative failure are also required to achieve relative success.  And because prospect theory dictates that failing alone is the worst possible outcome one can experience, in the absence of an informational edge, most people prefer the safety of the herd.
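To make the loss-aversion arithmetic concrete, here is a deliberately simplified sketch. It uses a straight-line value function in which losses are weighted 2.5 times as heavily as gains (the figure cited above). That is our simplification: the full Prospect Theory value function also includes curvature and probability weighting, which we ignore here. The function names are hypothetical.

```python
LOSS_AVERSION = 2.5  # losses loom about 2.5x larger than gains, per the figure above


def felt_value(outcome: float) -> float:
    """How an outcome 'feels': gains at face value, losses amplified."""
    return outcome if outcome >= 0 else LOSS_AVERSION * outcome


def felt_value_of_bet(prob_win: float, gain: float, loss: float) -> float:
    """Probability-weighted felt value of winning `gain` or losing `loss`."""
    return prob_win * felt_value(gain) + (1.0 - prob_win) * felt_value(-loss)


# A fair coin flip to win or lose one unit has zero expected value,
# but it feels like a losing proposition to a loss-averse mind.
print(felt_value_of_bet(0.5, gain=1.0, loss=1.0))

# The same flip only feels neutral once the upside is 2.5x the downside.
print(felt_value_of_bet(0.5, gain=2.5, loss=1.0))
```

In other words, a bet that is perfectly fair on paper still feels like a bad deal, which is roughly the position an upset pick put you in under our rules.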

How Prospect Theory Ruins Your Portfolio

If the connection to investing isn’t yet clear, consider the plight of thoughtful, globally-diversified investors over the past couple of years. Sure, these investors have made money, but not nearly as much as their friends and neighbors with portfolios concentrated in US stocks.

Making things worse, a potent cocktail of home bias and equity concentration makes US stock portfolios more the norm than the aberration. As such, despite the fact that on an absolute basis everyone is winning, on a relative basis diversification has felt like a loser.

Times like these test the perseverance of investors who zig when everyone else is zagging.  But if the process is thoughtful and backed by solid data, then the short term results should be a far smaller consideration than the weight of the evidence.

This is true in your portfolio, and it’s true in March Madness.

How We’re Going to Use Prospect Theory In our 2018 Bracket Challenge

People respond to incentives.  The problem this year was that our incentive to pry you away from the herd wasn’t strong enough.  So that’s what we’re going to change next year.  Though we reserve the right to change our rules between now and then, it makes sense that we would make the payoff for correctly choosing an underdog equivalent not for the rational strategist, but for the irrational mind.

That is, what if, after awarding points in inverse proportion to the cumulative odds (as we did this year), we further multiply the underdog’s points by a factor of 2 or 3?

We’ll spend our off-season thinking on it, but rest assured, we will never end our search for the incentive structure that meaningfully eliminates your pick biases.
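As a rough illustration of what such a multiplier would do, consider the sketch below. Nothing here is a committed rule: it uses a single-game simplification of inverse-odds scoring, it treats any team with less than a 50% chance as the underdog (our assumption), and the probabilities and function name are invented for the example.

```python
def expected_points(prob_win: float, underdog_multiplier: float = 1.0) -> float:
    """Expected points for picking a team under inverse-odds scoring.

    Teams with less than a 50% chance (an assumed definition of 'underdog')
    have their payoff scaled by underdog_multiplier.
    """
    points_if_right = 1.0 / prob_win
    if prob_win < 0.5:
        points_if_right *= underdog_multiplier
    return prob_win * points_if_right


for multiplier in (1.0, 2.0, 3.0):
    favorite = expected_points(0.8, multiplier)
    underdog = expected_points(0.2, multiplier)
    print(f"multiplier {multiplier:.0f}x: favorite {favorite:.1f}, underdog {underdog:.1f}")
```

With no multiplier the two picks are worth the same on average, as they were this year. With a 2x or 3x multiplier the underdog becomes the better pick in expectation, which is precisely the nudge a loss-averse picker might need.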

And Lastly, the Final Results for ReSolve’s 2017 March Madness Bracket Challenge

Congratulations to colonel717, who scored almost exactly 50% of the total available points. Look upon this bracket, all ye losers, and imagine what could have been.

Final Rank Name Points
1 colonel717 81.52
2 0761fabera 77.75
3 bshibe 76.63
4 DavefromBarrie 69.22
5 Purvinator 65.71
6 etfmike 65.04
7 Hinch 61.78
8 MrsBrick 61.57
9 mike.king 61.30
10 PatBolland 59.81
11 Mwkohler 57.30
12 geoff313 54.44
13 jd1218 53.41
14 acmeinvest 51.25
15 mackchiu 50.53
16 Sallese 49.64
17 DCSorley 49.16
18 Dragana 48.95
20 archer 47.89
21 Oilerman44 47.67
22 bviveiros 47.42
23 trentd 47.16
24 shawkins 47.06
25 baylasdad 46.97
26 les sherman 46.51
27 glaschowski 46.48
28 mlederman 45.39
29 snadkarni 44.69
30 eharari 44.17
31 MrDuMass 44.10
32 Nick1 43.96
33 bigdawg24 42.92
34 PunterJP 42.78
35 Fenders10 42.64
36 jwood3010 42.63
37 HungryHungryHippos 42.43
38 CBurnell4 42.04
39 Pvolpe 41.84
40 CMRAM 41.67
41 cotts23 41.57
42 teamcatherine 41.28
43 KPeg15 41.00
44 ukbsktbll 40.82
45 csorley1 40.75
46 robbiep 40.73
47 RCMAlts 40.45
48 Crawling Chaos 40.23
49 pkatevatis 40.16
50 TheHotDogVendor 39.55
51 Walkush 39.19
52 Mnoack 38.60
53 mySphere 38.45
54 mattzerker 38.05
55 jcrich 37.32
56 Prewbee 37.05
57 brianm317 36.71
58 CANESPEED 36.64
59 sjsivak 36.41
60 Jgunter742 36.26
61 brennantim 35.29
62 jmkeil 34.89
63 scott.pountney 34.56
64 cosggg 28.99
65 Benc100 28.30
66 Cozzsmack 28.17
67 resolvetest 27.95
68 LDB 27.78
69 DanAdams 19.80
70 Dead-Brick 19.80