# Methodology

### Introduction

This web site presents predictions of the 2008 presidential electoral college outcome (if the election were held today) based on state-level polls collected by electoral-vote.com. At least once a day, a script fetches that site's poll table, computes simulations of election outcomes (details below) from the state-level polls, and posts them on the front page.

### Simulation-based predictions: winning probability and expected number of electoral votes

I use simple statistical techniques that take into account how close the candidates are in each state's poll to compute how likely each candidate is to win in November, and how many electoral votes each candidate will receive. Polls report only imprecise information about the electorate's opinion. However, it is possible to use the poll results together with the reported margin of error to compute the probability that one candidate wins the state. I compute this probability in each state, and use it to simulate 10000 electoral college outcomes in each state. I add up the electoral votes across states in each simulated election. Then, I compute the fraction of elections won by each candidate, the average number of electoral college votes obtained by the candidate, and the standard deviation of the electoral college votes. These are the numbers reported in the main display on the front page. For a more technical explanation, see below.
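
The procedure described above can be sketched roughly as follows. This is only a minimal illustration, not the script that runs this site: the function names, the poll numbers, and the default number of simulations are all assumptions made for the example. It uses the normal approximation from the technical notes (standard deviation equal to the margin of error divided by 1.96).

```python
import math
import random

def win_probability(share_a, share_b, margin_of_error):
    """P(candidate A's two-party share exceeds 50%), normal approximation.

    Shares and margin of error are fractions (0.48 means 48%).
    """
    two_party = share_a / (share_a + share_b)  # ignore others and undecided
    sigma = margin_of_error / 1.96             # 95% interval -> std deviation
    z = (two_party - 0.5) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

def simulate(states, n_sims=10_000, seed=0):
    """states: list of (share_a, share_b, margin_of_error, electoral_votes).

    Returns (fraction of simulations won by A, mean votes for A, std dev).
    """
    rng = random.Random(seed)
    probs = [(win_probability(a, b, moe), ev) for a, b, moe, ev in states]
    total_ev = sum(ev for _, ev in probs)
    # Each state is drawn independently; the winner takes all its votes.
    totals = [sum(ev for p, ev in probs if rng.random() < p)
              for _ in range(n_sims)]
    mean = sum(totals) / n_sims
    sd = math.sqrt(sum((t - mean) ** 2 for t in totals) / n_sims)
    win_frac = sum(t > total_ev / 2 for t in totals) / n_sims
    return win_frac, mean, sd

# Hypothetical polls, for illustration only.
example_states = [
    (0.48, 0.46, 0.04, 27),
    (0.44, 0.50, 0.03, 15),
    (0.51, 0.43, 0.04, 21),
]
print(simulate(example_states))
```

Fixing the random seed makes a run reproducible; a different seed gives slightly different numbers.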

The basic prediction is also reported on the main display and is simpler to explain. It calculates electoral college votes assuming that each candidate wins with certainty the states where they are ahead in the poll. This does not take into account how close the candidates are in the poll. In states with 50-50 shares I assign each candidate half of the state's electoral votes.
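
As a rough sketch of this simpler count (the function name and the poll numbers are made up for the example, not taken from the site's script):

```python
def basic_prediction(states):
    """states: iterable of (share_a, share_b, electoral_votes).

    The poll leader takes all of a state's votes; an exact tie splits them.
    """
    votes_a = votes_b = 0.0
    for share_a, share_b, ev in states:
        if share_a > share_b:
            votes_a += ev
        elif share_b > share_a:
            votes_b += ev
        else:
            votes_a += ev / 2
            votes_b += ev / 2
    return votes_a, votes_b

# Hypothetical polls (shares in percent): A leads, B leads, tie.
print(basic_prediction([(48, 46, 10), (44, 50, 7), (47, 47, 4)]))
```

Note that a 2-point lead and a 30-point lead are treated identically here.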

The histogram reports the percentage of simulated elections that result in a given range of electoral votes going to Obama. It gives an idea of the precision of the estimated expected number of electoral votes.
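
One way such a histogram could be tabulated from the simulated totals is sketched below. The bin width of 20 votes and the function name are assumptions for the example; the binning used on the front page may differ.

```python
from collections import Counter

def histogram(totals, bin_width=20):
    """Percentage of simulations falling in each range of electoral votes.

    totals: simulated electoral vote counts for one candidate.
    Returns {(low, high): percent}, with both ends of each range inclusive.
    """
    counts = Counter((t // bin_width) * bin_width for t in totals)
    n = len(totals)
    return {(low, low + bin_width - 1): 100.0 * c / n
            for low, c in sorted(counts.items())}

# Four hypothetical simulated totals.
print(histogram([250, 255, 270, 300]))
```

A narrow histogram means the simulations mostly agree; a wide one means the expected number of electoral votes is estimated imprecisely.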

### Warning

Please use caution in using these results to predict who will win the election in November (see notes 2, 3, and 5 below). My computations assume that the only problem with these polls is sampling error. That is, every voter has an equal probability of being sampled. Obviously, pollsters may also make systematic mistakes (for example, they may not sample enough young voters, or they may have a biased method of picking likely voters). These mistakes are not corrected by my methodology. I prefer to view these results as a tool to interpret current state polls.

### Technical notes

1. I compute Obama's vote share out of the total of Obama+McCain shares (i.e. I ignore other candidates' share, if present, and undecided voters). I assume that, in each state, the sampled Obama vote share follows a normal distribution, with mean equal to the share and standard deviation equal to the reported margin of error divided by 1.96 (see note 4). I then compute the probability that Obama's vote share in the population is greater than 50%. Finally, I use this probability to independently simulate 100000 election outcomes in each state. Using the results from the simulations, I count the electoral votes in each simulation and report summary statistics.
2. I assume that the state winner gets all of the state's electoral votes. However, Maine (<span class="dem">4 electoral votes</span>) and Nebraska (<span class="rep">5 votes</span>) use a winner-take-all system at the congressional district level, with two electoral votes going to the statewide winner. This introduces a possible error in the predicted vote difference of no more than 3 votes.
3. It is not clear how to count undecided voters. In 2004, I ignored the undecided and my prediction turned out to be correct. I'm going to stick with this assumption.
4. Pollsters report the margin of error to provide a 95% confidence interval around each candidate's reported vote share. There is a big variance in predictions across pollsters and over time, as documented by this table from electoral-vote.com. Some people believe that the margins of error are understated by a factor of 1.4.
5. Please note that the results are accurate only as long as the polls are accurate, and as long as the page I use to fetch poll results is accurate and unbiased in its choice of polls to report. While I have no reason to believe it is not, it is also true that a particular choice of pollsters over the 50 states can skew the outcome of the electoral college one way or the other. The simulation-based prediction is to some extent robust to this kind of data-coaching, but is nevertheless affected by it.
6. There may be frequent swings in the candidates' winning probability. This is because some pivotal states are also very close calls, and the simulations are sensitive to even small changes in winning margins. If this causes you motion sickness, I suggest you look instead at the average winning margin and its confidence interval. If a big state such as Florida goes from one candidate to the other as a result of a change in the poll's prediction, there is a big change in the expected number of electoral votes.
7. The expected average of electoral votes gives an idea of what we should expect the winning margin to be. It is computed as the average of the winning margins in all simulations of the electoral college. The 95% confidence interval reports the accuracy of the prediction of the winning margin. Note that the average winning margin may be in favor of the Democratic candidate even if the probability that the Democratic candidate wins is below 50%. This happens when the Democratic candidate wins less than half of the simulations, but the winning margins in those cases are larger than McCain's winning margins in the rest of the simulations. This is possible when the probabilities of Obama or McCain winning are close to 50%.
8. It is possible for one candidate to be winning with probability above 50% but to be behind in the basic prediction, which is computed by adding up all electoral college votes from states where the candidate has a share above 50% in the poll. This is one of the cases where my page contributes most to understanding what is going on in the polls. It may happen when one candidate is winning by small margins in states that hold a majority of electoral college votes, while the other candidate is winning by wider margins in the other states. The basic prediction does not care whether the lead of one candidate is 1% or 30%: it assigns all of the state's votes to that candidate. However, intuitively, the probability that a candidate wins that state is different if the poll reports a 1% lead or a 30% lead. Therefore the probabilistic prediction may give a probability above 50% to the candidate who is actually losing a majority of electoral college votes.
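
The situation in note 8 can be checked on a toy example. The sketch below is an illustration only: the three states, their electoral votes, and the win probabilities are invented, and the probabilities are computed exactly by enumerating every combination of state outcomes rather than by simulation, which is feasible only for a handful of states.

```python
from itertools import product

def exact_win_probability(states):
    """states: list of (probability A wins the state, electoral_votes).

    Returns the exact probability that A wins a majority of electoral
    votes, assuming states are independent.
    """
    total = sum(ev for _, ev in states)
    prob_a_wins = 0.0
    for outcome in product([True, False], repeat=len(states)):
        prob = 1.0
        votes_a = 0
        for a_wins, (p_a, ev) in zip(outcome, states):
            prob *= p_a if a_wins else 1.0 - p_a
            if a_wins:
                votes_a += ev
        if votes_a > total / 2:
            prob_a_wins += prob
    return prob_a_wins

# A leads narrowly (60% chance each) in two states worth 3 votes each;
# B is certain to win a third state worth 4 votes.
toy_states = [(0.6, 3), (0.6, 3), (0.0, 4)]
print(exact_win_probability(toy_states))
```

In this toy case the basic prediction gives A 6 of the 10 votes, yet A needs both close states to win, so A's winning probability is only about 36% and B is the favorite.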

Updated: 07 Jan 18 14:53
