Episode Two – Decision Making
Philip Tetlock’s research on the qualities of Super-forecasters explains why forecasting is hard and how diverse models and systematic thinking can help close the gap.
- Improving predictions by regularly updating data
- The Brier Score and why it really matters
- Why larger samples permit more accurate short term predictions
- The difficulties with the “Long Now” narrative
Hosted by Adam Butler, Mike Philbrick and Rodrigo Gordillo of ReSolve Global.
The ReSolve Master Class – Episode Two
Adam:00:00:00Okay, welcome to Episode Two. We are going to focus on decision making, and we’re going to start really broad and discuss some of the research on decision making in general, especially decision making in complex fields. Then we’re going to tie that back to some examples within financial markets just to crystallize it and make it real for those who are listening to this podcast, and then we’re going to discuss some ways that you can improve your decision making process by drawing on some of the more recent research in the field.
Mike:00:00:38Let’s get started with your favorite person in the world, your hero, your…I don’t know what we would call him. Your …
Adam:00:00:45That’s right. Referring, of course, to none other than Philip Tetlock, who starting in 1985 ran a 20 year experiment on decision making that led to some pretty astonishing conclusions. Just to break it down, in 1985, he recruited over 200 experts, with an average education level of a Master’s degree and an average of over 16 years in their respective fields. He asked all of them a series of questions with a very specific structure. He asked them to make a forecast about what would happen and to assign a probability that their forecast would come to pass. And so over a 20 year horizon, he solicited over 82,000 forecasts and then he compiled the results in a book called Expert Political Judgment, which I think we would all highly recommend you go and read for yourself, but we can distill some of the conclusions. For example, we learned that individual experts are unable on average to deliver forecasts with greater than 50% calibration. In other words, you’re better off tossing a coin than listening to the forecasts of typical experts.
Some other surprising results: experts were able to deliver better calibrated forecasts outside of their field of interest. That largely had to do with the fact that within their field of interest, their confidence levels were too high. So they were overly confident in their ability to make precise forecasts within their own field of experience. Those experts who were featured in the media or cited most frequently in academia were on average less well calibrated than the experts who were more likely to toil in obscurity. And groups of experts, so an ensemble of experts taking the average of the forecasts, delivered a much higher level of calibration than trying to choose any particular individual expert to make a specific forecast.
Mike:00:02:59Those have to be independent though, right?
Adam:00:03:02Well, yeah, that’s a separate body of research. All of these forecasts were solicited independently but that’s a really good point because a lot of decision making theory is often applied to markets, but the decision making research is done in an environment that allows all of the forecasts to be made in isolation. So you avoid group think by design, whereas in markets, virtually every facet of markets drives toward group think.
Rodrigo:00:03:34The other interesting thing was the fact that experts outside of their field of expertise had a more accurate prediction than those in their field of expertise, which I also always thought was this idea of anchoring to your past predictions. When you have your ego involved it’s a problem. If you’re outside that area, you have no expectations of being right or wrong, you just have a prediction that’s more or less…
Mike:00:03:55And also less susceptible to group think. If you’re in an area of expertise, you are likely to have colleagues that are in that area of expertise who you are bouncing your ideas off of. In fact, often research and peer reviewed research and citations of research..
Adam:00:04:12… on papers. So you’re married to a certain framework, you’re part of a tribe. Every academic field has several different tribes and you belong to a certain tribe that thinks in a certain way and writes papers in a certain direction based on certain frameworks…
Rodrigo:00:04:30It’s painful to move outside of that, and there are the repercussions of being the odd man out. No, but what’s interesting about the Tetlock story is, during that period prior to reading Tetlock, you were in a completely different mindset when it came to trying to manage money. We’re talking about, do you remember Don Coxe?
Adam:00:04:47Of course, yeah. I remember back in the mid-2000s becoming extremely involved and engaged with this idea of a commodity super cycle and my primary go to guru was a guy named Don Coxe, who was an extraordinarily well educated and erudite individual. Every month he wrote a 20, 30, 40 page report that reinforced this long term commodity super cycle view. I spent three, four years digging into every nook and cranny of evidence in favor of this viewpoint. What that meant is that I became completely embedded in this point of view and married to it and I had the blinders on, I wasn’t able to see when the tectonic plates shifted in mid-2008. That they’d shifted, that this narrative wasn’t a self-fulfilling prophecy and that I needed to be open minded to different outcomes.
Mike:00:05:51Let’s crystallize that with some of the CXO stuff. Do that but then let’s come back to Tetlock’s second book and Superforecasters and some of the things that it takes, so that we can crystallize this initial point for people, and then come back to what did Tetlock find that actually does work, and how might we harness that? I’ll turn it back to you guys.
Adam:00:06:15Yeah, let’s connect the concepts from Tetlock to what we’re here to discuss, which is investing in financial markets. Let’s point to some examples. One really good example is a research website called CXO Analytics. From 1998 until the mid, I think it was maybe 2014, 2015, they tracked forecasts, guru forecasts, from newsletters and major investment firms about market direction. So whenever there was a report, somebody came out and said that they were bullish or bearish on markets, they documented that and they tracked it. Importantly, they only ever took note of the direction of the call.
So, they weren’t grading them on the magnitude if markets went up 8% or 12%, or whatever, it’s just, did they go up or did they go down and what did you expect? So over that 15, 17 year timeframe they collected about 6500 forecasts, from dozens and dozens of different gurus. And in 2014, he published a report on his observations. What he concluded is that the accuracy rate across all of these gurus was just 47%. There weren’t any gurus that demonstrated any meaningful ability to make forecasts above random guesses. The fact is as you pointed out earlier Mike, he decided to just stop tracking it at that time, because it was so demonstrable that these gurus had absolutely no ability to make this call.
Mike:00:07:47Agreed. The one that always fascinates me too is the history of forecasting and the Fed. I think it was around Greenspan’s era that that was done. Really what they were looking at is can the Fed who decides what interest rates are going to be in their notes, when they were looking six months forward, what were their predictions like? Keep in mind, they make the decision and they didn’t have any accuracy or edge either. Complex dynamic systems have feedback loops that change. Now, they did obviously update their viewpoints and made changes. They didn’t say, well, six months from now this is what it’s going to be so that’s what we’re going to do. They didn’t say we have a narrative with a certainty of outcome. What they did was, they had probabilistic thinking, they looked at the data, they updated regularly what they were looking at in the economy, and maybe less so anchored on what the prior was, and didn’t really have to explain it.
If you look at what they did, and how incongruent it was with what they prepared to do six months earlier, I think it shows a flexibility of mindset. That is actually one of the requirements when you’re thinking about any kind of narrative that you’re going to create. Then if that narrative has certainty of outcome between point A and point B in the future six months from now without any updating, that’s something that you should be on the lookout for.
Rodrigo:00:09:19Well, the question for Tetlock was, how does he create something that’s passionless, a system that he would use as a benchmark to see how that system would do against the forecasters? And what it was, was simple things like the trend: if the Fed increases rates by 25 basis points, they’re likely to increase rates by 25 basis points next quarter. That type of simple trend, simple system ended up beating the analysts and doing better than a coin toss. So that was the first thread where he’s like, huh, this dispassionate prediction, reversion to the mean…
Adam:00:09:51Over longer time horizons. So trend in short term horizons and reversion to the mean over the long term.
Rodrigo:00:09:58That was his first book, and then what he then questioned is, are there people that use these very unique ways of thinking, simple heuristics, simple approaches to try to analyze any situation to predict things in a local short term approach? So you’re asking the predictor to be dispassionate, to give updates of their prediction with a probability, to have those probabilities change over time, and then you assess who is good at predicting and who is not good at predicting. The second book is called Superforecasting, on the journey of finding the superforecasters. One particular metric was the Brier Score. And the Brier Score was just the likelihood that the people that are predicting are going to have a high level of accuracy, it’s not huge.
Adam:00:10:48A high level of calibration. Again, you make a prediction and then you’re required to give a probability that your prediction will come to pass. So it’s easy to make a binary guess one way or another by forcing somebody to give you a probability. It forces their brain to think in probability space. What tends to happen is you tend to get more nuanced, less binary type forecasts. And that is a useful measure of whether somebody is likely to be a super forecaster or have better than average forecasting abilities.
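To make the scoring idea concrete, here is a minimal sketch of the Brier Score for yes/no events: the average squared gap between the probability a forecaster stated and what actually happened (1 if the event occurred, 0 if it did not). Lower is better. The function name and the sample forecasts are illustrative assumptions, not data from Tetlock's studies, and this is the common single-outcome form that runs from 0 to 1 (Brier's original formulation sums over every outcome category, so it runs from 0 to 2 for binary events).

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical example: four events, of which the first and third occurred.
outcomes = [1, 0, 1, 0]

coin_flipper = [0.5, 0.5, 0.5, 0.5]    # never commits either way
overconfident = [1.0, 1.0, 1.0, 0.0]   # certain every time, wrong once
nuanced = [0.8, 0.3, 0.7, 0.2]         # leans the right way, with hedged odds

coin_score = brier_score(coin_flipper, outcomes)
overconfident_score = brier_score(overconfident, outcomes)
nuanced_score = brier_score(nuanced, outcomes)

print(coin_score, overconfident_score, nuanced_score)
```

In this made-up example the forecaster who is certain every time and wrong once scores no better than the coin flipper (both 0.25), while the nuanced forecaster scores about 0.065. That is the sense in which the score rewards calibration and not just picking a direction.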
Rodrigo:00:11:25Remember that there was a hedgehog and the fox, right? And so even within the Superforecasters, there are hedgehogs and foxes and correct me if I’m wrong, but I believe the hedgehog is the one that’s more cautious.
Adam:00:11:34No. The hedgehog knows one thing, and the fox knows many things.
Rodrigo:00:11:38So there were instances that the hedgehog, there were people that would have a 90% expectation, but also the best hedgehog forecasters were willing to then go to 40%. Large swings in their predictions, but they were not anchored to the priors, which I think is a crucial thing that we see in our world in finding good forecasting.
Mike:00:11:58Forecasting or narrative that’s probability based, that updates regularly without anchoring to the prior, and is allowed to do so. Encouraged or required to do so. And thus allowing to have that short term memory loss, right? What was your prediction, why it doesn’t really matter? Here’s what it is now, because the facts have changed. So my probability weighted outcomes change alongside of that.
Adam:00:12:27Critically, what is necessary in order to create a dispassionate view that can be regularly updated is to avoid explanations and defenses of your point of view. The minute you write something down, or you say something publicly, your brain becomes accountable to that outcome, your ego gets married to that outcome. And it becomes incredibly difficult to step away, rethink it, and come back with an alternative conclusion. So just tying this back to the investment industry, what do we gravitate to as humans? Our brains are wired to gravitate to narratives. We love good stories, we’re taught in school how to write persuasive essays: present a thesis, write three to five paragraphs that defend that thesis, maybe one throwaway paragraph that presents alternative evidence and then dismisses it, and then a final conclusion. This is how we’re taught to think; we’re not taught to think stochastically or probabilistically. And investors pay for newsletters, investors pay for sell side researchers to write long reports with many facts that describe why a certain outcome is virtually inevitable. All of those qualities run counter to the qualities that are likely to create good forecasters, who are able to regularly update their views and to deliver their views with strong calibration by thinking probabilistically.
Rodrigo:00:14:07So, this idea of the Brier Score is important here, because does anybody know the Brier Score of Don Coxe? Did anybody measure it? Were we even able to measure his predictions, or did he not give enough predictive narratives to warrant enough data to have an opinion on whether it was effective or not? Like you said, we’re human beings, we like the narrative and we’re not saying that there aren’t individuals out there that have this ability. You just need, as an investor and the person that’s consuming these narratives, to understand your own human biases, understand that there are tools out there for you to be able to calibrate and better understand, and what you’ll find is that ultimately there’s nothing out there on the narrative side to…
Mike:00:14:44You should be careful of narratives that are very certain. I know amongst us when we say something like, this is going to happen or that we’ll do this like even setting up the microphones and like that, thing’s not going to fall out of there.
Mike:00:15:02So we’re very sensitive to that language and I think that that is a good thing. I think that when you hear narratives that are very compelling but very certain, and the forecaster isn’t actually giving all the caveats and all the other potential outcomes along with that, alarm bells should start to ring.
Rodrigo:00:15:24Let’s say you do identify somebody, and you do a Brier Score on them and it’s pretty high. Even the superforecaster process that Tetlock follows is an ensemble of forecasters. You do better with the ensemble than you do with any individual.
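The ensemble point can be sketched with a small simulation. Everything here is an assumption made for illustration: the number of events and forecasters, the noise level, and the idea that each forecaster sees the true probability plus independent error. It is not a reconstruction of Tetlock's data or method. The forecasts are scored with the single-outcome Brier Score (mean squared gap between stated probability and the 0/1 outcome).

```python
import random


def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


random.seed(0)  # fixed seed so the run is repeatable
n_events, n_forecasters = 500, 25

# Hypothetical world: each event has a true probability and an outcome drawn from it.
true_probs = [random.uniform(0.1, 0.9) for _ in range(n_events)]
outcomes = [1 if random.random() < p else 0 for p in true_probs]


def noisy_forecast(p):
    """True probability plus independent noise, clipped to the range [0, 1]."""
    return min(1.0, max(0.0, p + random.gauss(0, 0.25)))


# One row of forecasts per forecaster, each with its own independent errors.
forecasts = [[noisy_forecast(p) for p in true_probs] for _ in range(n_forecasters)]

individual_scores = [brier_score(f, outcomes) for f in forecasts]
mean_individual_score = sum(individual_scores) / n_forecasters

# The ensemble forecast is the simple average across forecasters for each event.
ensemble = [sum(f[i] for f in forecasts) / n_forecasters for i in range(n_events)]
ensemble_score = brier_score(ensemble, outcomes)

print(round(mean_individual_score, 4), round(ensemble_score, 4))
```

Because squared error is convex, the averaged forecast can never score worse than the average of the individual scores. How much better it does depends on the errors being independent, which is the caveat Mike raised earlier: if everyone shares the same bias, averaging cannot remove it.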
Mike:00:15:37Let’s go back to the hurricanes. In our first episode, we talked about building the house and you watch the hurricane forecast if you’re in the Caribbean, and what will you see, you’ll see the European model, you’ll see the American model, you’ll see the spaghetti strings of the different areas, the cones of future potential possibilities that could occur.
Adam:00:15:59You’ll see hundreds of lines on the map. All of them sort of vectors of the forecasted trajectory of the hurricane based on a single model. And what do you look for? You look for where a large number of models converge. And what does a large number of models allow you to do? It allows you to think probabilistically. If you’ve got 1000 models, 600 of which are suggesting that a hurricane is going to follow a certain trajectory, you can say at the moment, we believe there’s a 60% chance that the hurricane is going to go in this direction, and there’s a 15% chance that this hurricane is going to go in that direction, etc. But it allows you to think probabilistically, rather than state something definitively.
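The arithmetic behind that statement is just counting how many models land on each outcome. A minimal sketch, assuming every model run gets equal weight and using made-up track labels and counts (real hurricane ensembles are weighted and far more detailed):

```python
from collections import Counter


def scenario_probabilities(model_calls):
    """Share of model runs that point to each scenario."""
    counts = Counter(model_calls)
    total = sum(counts.values())
    return {scenario: n / total for scenario, n in counts.items()}


# Hypothetical ensemble of 1,000 model runs, each labeled with its predicted track.
model_calls = ["track_A"] * 600 + ["track_B"] * 150 + ["track_C"] * 250

probs = scenario_probabilities(model_calls)
for scenario, p in sorted(probs.items()):
    print(f"{scenario}: {p:.0%}")
```

The same counting works for any set of independent models voting on discrete outcomes, which is what lets a forecaster say "60%" instead of "this will happen".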
Mike:00:16:38Now, do we want to jump into the computational irreducibility of some of these, you know the things that form storms. The temperature of the water, the direction of the wind, the storm that came from Africa and went across the Atlantic Ocean, or if it germinated in the lower Caribbean as an example. And just because we know the forces, we still can’t calculate for certain the outcome.
Rodrigo:00:17:05Especially the Euro. It’s such a complex and abstract series of interactions between those variables. And the longer out you go, the more possibilities, and it becomes computationally irreducible.
Adam:00:17:19It’s like the three body problem, you’ve got three bodies in a gravitational field, all interacting, you know their exact position and momentum at a point in time. You can compute with extreme levels of precision what their position and velocity is going to be in the very near future, in a small number of increments forward. But it’s extraordinarily difficult if not impossible to analytically solve for what their positions are going to be once you go out beyond a certain threshold, you just have to walk it forward and then observe and update your models and act accordingly.
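A full three body simulation is too long to show here, so the sketch below swaps in the logistic map, a standard one-line chaotic system, as a stand-in. It is not the three body problem; it only illustrates the same behaviour: two starting points that differ by a tiny amount stay close for a few steps and then bear no resemblance to each other. The starting value, the size of the nudge, and the step counts are arbitrary choices for illustration.

```python
def logistic_path(x0, steps, r=4.0):
    """Walk the logistic map x -> r * x * (1 - x) forward one step at a time."""
    path = [x0]
    for _ in range(steps):
        path.append(r * path[-1] * (1.0 - path[-1]))
    return path


# Two runs whose starting points differ by one part in ten billion.
a = logistic_path(0.2, 100)
b = logistic_path(0.2 + 1e-10, 100)

gaps = [abs(x - y) for x, y in zip(a, b)]
near_term_gap = max(gaps[:6])   # largest gap over the first five steps
long_run_gap = max(gaps)        # largest gap anywhere in 100 steps

print(near_term_gap, long_run_gap)
```

Over the first few steps the two paths agree to many decimal places, so a short-horizon forecast from imperfect measurements is still useful. Further out the gap grows to the size of the whole range, and the only option is the one described above: walk it forward, observe, and update.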
Rodrigo:00:17:57So this is where Stephen Wolfram has come up with these ways of thinking about what we can predict and what we can’t. And there are these local pockets of reducible predictions. And these are exactly what you said, what is the next step within the three body problem? But can you predict it 12 months later where they’re going to be? No, it’s impossible. It’s an irreducible problem. So when we think about it from our perspective, we used to run a piece called Bold, Confident and Wrong, and it was grabbing all the 12-month-out reports from the big banks from the previous year and seeing how well they did. Is there anything more dynamic than managing money and all the variables that go into it? Did we really think we were going to be able to accurately and persistently have an edge a year out in these predictions? Maybe there was a part of my career where I did, but certainly I’ve come to be disabused of that.
Mike:00:18:46And then the group think that falls into that.
Adam:00:18:48There’s always a story that accompanies the forecast.
Mike:00:18:51And the story on story from the storytellers that en masse are reading each other’s stories.
Adam:00:18:55For sure. But on the other end of the spectrum, of course, are the high frequency players. Demonstrably, you can forecast with unbelievable accuracy, especially if you take a large number of forecasts, a large number of markets, a large number of models at the microsecond, second, maybe even hourly, and maybe daily horizon in markets.
Mike:00:19:18Elaborate on N in that context?
Adam:00:19:20Well, yeah, the great thing about high frequency is that you have a very large number of observations that happen within time horizons that are meaningful for investors. A high frequency trader might trade hundreds of thousands of times a day. Well, it doesn’t take very many days for you to generate a large enough sample to be able to draw extremely powerful statistical conclusions about the types of strategies, patterns, forecasting tools that are likely to be effective in that domain. Once you move to a daily, weekly, monthly, annual horizon, there’s not enough calendar months or certainly calendar years for you to be able to generate that level of power, statistical power, to determine whether the framework, model or forecasting methods that you’re using are effective.
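A back-of-the-envelope version of that sample size argument: how many independent right-or-wrong calls does it take before a small edge stands roughly two standard errors clear of a coin toss? The sketch uses the normal approximation to a binomial hit rate. The 52% hit rate, the two standard error threshold, and the observation rates are all assumptions chosen for illustration, and real trades or forecasts are rarely fully independent.

```python
import math


def observations_needed(hit_rate, z=2.0):
    """Approximate number of independent calls for hit_rate to sit z standard errors above 50%."""
    edge = hit_rate - 0.5
    if edge <= 0:
        raise ValueError("hit_rate must be above 0.5")
    standard_dev = 0.5  # standard deviation of a single 50/50 right-or-wrong outcome
    return math.ceil((z * standard_dev / edge) ** 2)


n = observations_needed(0.52)

# Hypothetical observation rates for three kinds of forecaster.
trades_per_day = 200_000
days_for_high_frequency = n / trades_per_day
years_for_monthly_calls = n / 12
years_for_annual_calls = n / 1

print(n, days_for_high_frequency, years_for_monthly_calls, years_for_annual_calls)
```

Under these assumptions a 52% hit rate needs on the order of 2,500 calls to show up. A high frequency desk gets there in a fraction of a day; someone making one call a month would need more than two centuries, and someone making one call a year would need millennia.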
Rodrigo:00:20:10And look, we employ people that have been very successful in that high frequency space, I think, eight, nine years, five weeks of losses. But before anybody jumps on that bandwagon, sadly, it doesn’t take a lot of money. And it’s very low capacity. We’re talking again, our audience is the steward of wealth, large pools of money and we got to think about our capability of managing hundreds of millions of dollars with high liquidity for the long term. So it becomes much more difficult to do a lot of short term predictions in that way.
Mike:00:20:46There’s a capacity.
Rodrigo:00:20:45You need to think about the problem in a different way.
Mike:00:20:50There are capacity constraints to markets, it’s not infinite liquidity before you start to impact the outcome. The study of physics doesn’t change physics, the study of chemistry doesn’t change chemistry, the study of markets and the exploitation of potential anomalies does.
Adam:00:21:09It really is about our business is dominated by these long term narratives. Everybody talks about the trend for 2021 and they make decisions on this, that capital market expectations dominate how consultants allocate for their clients. This can be a real problem; it is a real problem. So, to think about the Long Now approach, we need to recognize the pitfalls of predicting, of narrative and understand what Tetlock has demonstrated works better and find that balance, find that equilibrium.
Mike:00:21:46Agreed, and look for some blind spots too, that are a function of luck, mistaken as facts, or skill. So one of those peeves for you and I is the use of the US equity market to determine a risk premium for equities, using that pervasively across things. There was a recent paper that came out that said, wait a second, you actually have a much higher potential for a negative real rate of return in equities. If you actually consider all of the equity markets over 30 year periods of time, you have, I think it was a 12% chance for a negative real return. Over 30 years, it’s like a 0% chance in the US.
Adam:00:22:28 1.2%. So you’re actually, if you broaden your dataset to include all developed markets over a longer time horizon, then your probability of an adverse outcome from equity investing is 10 times larger than what you would presume if you only looked at the US data.
Mike:00:22:46Right. And think of that in that 100 year horizon that we talked about, and the transition of power from the various dominant regimes, British to the US, etc. And you are managing wealth now on a timeframe that is 50 to 100 years, these things are going to happen.
Rodrigo:00:23:05So this sets it up nicely for the next episode which is when we bring people into our world, we start to make them think of what is the first step we need to take here. And the recognition that predicting long term which dominant energy, is very hard. And if we’re not able to predict, if our capital market expectations are wrong, what is the best way to start? What is the do no harm portfolio. There’s going to be a few building blocks here, but we’re going to talk about that in the next episode.
Rodrigo:00:23:30Diversify and all that fun stuff.
Mike:00:23:30Preparation, an ounce of prevention is worth a pound of cure. Anyway, amazing.
Rodrigo:00:23:36Cue the music.