Let's begin with a simple example. Say we flip a coin 10 times. For simplicity, we will assume the probability is 50% heads and 50% tails. (As I discussed in an earlier post, the odds are actually closer to 51% heads and 49% tails.) Knowing this, we could forecast that, of the 10 tosses, we will get 5 heads and 5 tails, since the odds of each binary outcome are 50%. If we looked at each coin toss individually, we would say that each toss had the same odds of landing on either side. So we could guess heads or tails, based upon probability, and be equally likely to be correct. If one of us "called" heads and the other "called" tails, one of us must be wrong. One of us would be wrong even though we both chose outcomes with equal probabilities. However, over the course of 10 trials, we would each expect to be wrong about half the time. (Though in such a limited number of trials, the chance that a given caller is wrong all 10 times is still roughly 0.1%.)
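The roughly 0.1% figure is easy to check. Here is a minimal sketch, assuming a fair coin and independent tosses; the function name `prob_all_wrong` is just something I made up for illustration.

```python
from fractions import Fraction

def prob_all_wrong(num_tosses: int, p_wrong: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that one caller gets every one of num_tosses independent tosses wrong."""
    return p_wrong ** num_tosses

p = prob_all_wrong(10)
print(p)                   # 1/1024
print(f"{float(p):.4%}")   # 0.0977%
```

Using `Fraction` keeps the arithmetic exact, so the 1-in-1,024 result is not blurred by floating-point rounding.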
My forecasts (like any others, but for the purpose of this post I will be referring specifically to my data) contain win probabilities for each Senate and gubernatorial race this year. It is tempting to look at those projections and determine (as I have tallied on the right side of this page) that the GOP will have 53 seats in the Senate and 27 governorships. The problem is that this only tallies the front runners and doesn't take into account how much of a front runner each candidate is. In the tally, Sandoval (R-NV) and Perdue (R-GA) are both currently front runners to win their races. However, they do not have the same win probabilities. Sandoval is roughly a 99.98% favorite to win and Perdue is a 52.53% favorite to win. This means that if we could hold the election 10,000 times, we could expect Sandoval to win 9,998 times and lose 2 times. For Perdue, in 10,000 elections we could expect him to win 5,253 times and lose 4,747 times. Therefore, by front runner I mean that if we could hold the election many times, each candidate would win their respective race a majority of the time. The probability describes how many times each candidate would win their race if the election were held many times. I run these elections (the entire board collectively rather than these individual races) 50,000 times. This means that since my current probability of the Republicans winning the Senate is 56%, Republicans end up with a majority of seats in roughly 28,000 of my 50,000 simulations. On the flip side, Democrats end up winning the Senate roughly 22,000 times out of those same 50,000 simulations.
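To make the "hold the election 10,000 times" idea concrete, here is a rough sketch of simulating a single race from its win probability. This is not my actual model, which simulates the entire board collectively; it assumes each simulated election is an independent draw, and the helper name `simulate_front_runner_wins` is hypothetical.

```python
import random

def simulate_front_runner_wins(win_prob: float, n_sims: int, rng: random.Random) -> int:
    """Count how many of n_sims simulated elections the front runner wins."""
    # rng.random() is uniform on [0, 1), so it falls below win_prob
    # with probability win_prob.
    return sum(rng.random() < win_prob for _ in range(n_sims))

rng = random.Random(2014)  # fixed seed (an arbitrary choice) so reruns match
for win_prob in (0.9998, 0.5253):  # the two win probabilities quoted above
    wins = simulate_front_runner_wins(win_prob, 10_000, rng)
    print(f"{win_prob:.2%} favorite: won {wins:,} of 10,000 simulated elections")
```

The counts will land near 9,998 and 5,253 rather than exactly on them, which is the point: those figures are long-run expectations, not guarantees for any one batch of simulations.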
The problem with forecasting events that only happen once is that the election represents only one outcome of all possible outcomes. While it only has a 0.12% chance of occurring, the GOP does end up with a 56-44 majority in my projections 12 times today. While extremely unlikely, it could happen. Another example here is the lottery. For the Powerball specifically, the odds of winning are 1:175,223,510. These odds seem impossible, even more so than the 0.12% chance of the GOP having 56 seats. However, if 175 million people buy 1 ticket each (all with different number combinations), the odds of someone winning the Powerball are 99.87%, even though each individual person still has a 1:175,223,510 chance of winning. This is often loosely called the law of large numbers; more precisely, it is the idea sometimes known as the law of truly large numbers: given enough tries, even a very unlikely outcome becomes likely to happen to someone.
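The Powerball arithmetic can be sketched in a few lines. This assumes, as the example does, that every ticket sold carries a different combination, so the chance that somebody wins is simply the share of all combinations that have been bought. The names below are mine, chosen for illustration.

```python
TOTAL_COMBINATIONS = 175_223_510  # Powerball odds quoted in this post

def prob_someone_wins(distinct_tickets: int, total: int = TOTAL_COMBINATIONS) -> float:
    """Chance that at least one of the distinct tickets is the winning combination."""
    # Distinct tickets cannot cover more combinations than exist.
    return min(distinct_tickets, total) / total

print(f"{prob_someone_wins(175_000_000):.2%}")  # 99.87%
```

If tickets were picked at random instead, some combinations would be duplicated and the chance of a winner would be noticeably lower, so the "all different" assumption matters here.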
Now that, of course, was an extreme example. The odds of the front runner winning these races are better than 1:175,223,510. The question I am getting to is: when is a forecast considered "accurate"? The answer is not "when all races are projected for the correct party." In fact, if my probabilities are calibrated correctly, the front runners I have here should only win about 87.06% of the time. If I have 100 races where the front runner is at 75%, 25 of those 100 front runners are expected to lose. The average probability of all of my front runners at the moment is 87.06%. This means that, since I am projecting 72 races, I expect to miss roughly 9 of them on average (72 x 12.94% is about 9.3). This number will fluctuate, and should shrink as the front runners begin to pull away and as uncertainty drops in these last 3 weeks.
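The expected number of misses is just the sum of each front runner's chance of losing. Here is a small sketch; since only the 87.06% average is quoted in this post, the list below stands in 72 identical races at that average, which is an assumption for illustration and not the real per-race probabilities.

```python
def expected_misses(front_runner_probs):
    """Expected number of front runners who lose: the sum of (1 - p) over all races."""
    return sum(1 - p for p in front_runner_probs)

# 100 races with the front runner at 75%
print(expected_misses([0.75] * 100))            # 25.0

# Stand-in for the 72 races: every front runner set to the 87.06% average
print(round(expected_misses([0.8706] * 72), 1)) # 9.3
```

Because an average times a count equals the sum, the total comes out the same whether the 72 probabilities are all equal or vary from race to race, as long as they average 87.06%.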
If the election were held today and I correctly projected 70 of these 72 races, the probabilities I have listed for each race would not have been accurate. That being said, since the systematic error of such a model is roughly +/-15%, 70 of 72 would actually be an acceptable outcome. Because this error skews toward more confidence rather than less, the expected "inaccuracy" is more likely to be smaller than my current 12.94%. This inaccuracy is so high this year because so many races are just that close. If the margins in these races were larger, or there were fewer close races, the average win probability would be much greater.
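For a rough sense of how surprising 70 of 72 would be if the stated probabilities were exactly right, here is a back-of-the-envelope sketch. It makes two strong simplifying assumptions that my model does not make: every front runner sits at the 87.06% average, and the races are independent of one another. Under those assumptions the number of correct calls follows a binomial distribution. The function name `prob_at_least` is made up for this example.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float) -> float:
    """P(at least k successes in n independent trials, each with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance of calling 70 or more of 72 races correctly under the assumptions above
print(f"{prob_at_least(70, 72, 0.8706):.4%}")
```

In reality races tend to move together, so treat the printed figure as an illustration of the logic rather than a real estimate.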
I don't like calling anyone in particular out, but I do need to say: this is why you should be skeptical of anyone who says their projections are right because in 2012 they correctly projected all 33 Senate races. Yes, I am talking about Sam Wang over at the Princeton Election Consortium. That claim would hold up only if all of the projected winners were at 100%. However, this is never the case. Nate Silver of FiveThirtyEight makes the point correctly when he says, "Out of all FiveThirtyEight forecasts that give candidates about a 75 percent shot of winning, do the candidates in fact win about 75 percent of the time over the long run? It’s a problem if these candidates win only 55 percent of the time. But from a statistical standpoint, it’s just as much of a problem if they win 95 percent of the time."
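The check Silver describes can be sketched as a simple calibration table: group past forecasts by stated win probability, then compare each group's average forecast with how often those candidates actually won. The function name, the 5-point bucket width, and the sample records below are all made up for illustration; they are not real forecasts or results.

```python
from collections import defaultdict

def calibration_table(forecasts, bucket_pct: int = 5):
    """Group (win_probability, won) pairs into percentage buckets.

    Returns a sorted list of (bucket_lower_pct, mean_forecast, observed_win_rate, count).
    """
    buckets = defaultdict(list)
    for prob, won in forecasts:
        # Round to a whole percent first to sidestep floating-point edge cases
        lower = int(round(prob * 100)) // bucket_pct * bucket_pct
        buckets[lower].append((prob, won))
    table = []
    for lower in sorted(buckets):
        rows = buckets[lower]
        mean_forecast = sum(p for p, _ in rows) / len(rows)
        observed = sum(1 for _, won in rows if won) / len(rows)
        table.append((lower, mean_forecast, observed, len(rows)))
    return table

# Hypothetical records, invented purely to show the output shape
sample = [(0.75, True), (0.75, True), (0.75, True), (0.75, False),
          (0.95, True), (0.95, True)]
for lower, mean_forecast, observed, n in calibration_table(sample):
    print(f"{lower}%+ bucket: forecast {mean_forecast:.0%}, actual {observed:.0%}, n={n}")
```

A well-calibrated forecaster would show forecast and actual columns that roughly match in every bucket, given enough races in each; a handful of races per bucket, as in this toy sample, is far too few to judge.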