A question on my recent post about political opinion polls asks
– at what point does the trend become relevant?
– and how do you calculate the margin of error between two polls?
Those are good questions, and the reply was getting long enough that I decided to promote it to a post of its own. The issue is that proportions will fluctuate up and down slightly from poll to poll even if nothing is changing, and we want to distinguish this from real changes in voter attitudes — otherwise there will be a different finding every month and it will look as if public opinion is bouncing around all over the place. I don’t think you want to base a headline on a difference that’s much below the margin of error, though reporting the differences is fine if you don’t think people can find the press release on their own.
The (maximum) margin of error, which reputable polls usually quote, gives an estimate of uncertainty that's designed to be fairly conservative. If the poll is well-designed and well-conducted, the difference between the poll estimate and the truth will be less than the maximum margin of error 95% of the time for true proportions near one-half, and more often than 95% for proportions further from one-half. The difference will be less than half the margin of error about two-thirds of the time, so being less conservative doesn't let you shrink the margin very much. In this case the difference was well under half the margin of error. In fact, if there were no changes in public opinion you would still see month-to-month differences this big about half the time.
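If you don't trust those percentages, you can check them with a quick simulation. This is just a sketch of mine, not part of the original analysis: it takes the sample size from the previous post (948) and an assumed true proportion of one-half, draws lots of polls, and counts how often the poll misses the truth by more than the maximum margin of error, or by more than half of it.

```python
import random

N = 948           # sample size from the previous post
P = 0.5           # assumed true proportion
moe = 1 / N**0.5  # maximum margin of error, 1/sqrt(N)

# simulate many polls and record how far each one misses the truth
reps = 10_000
errors = [abs(sum(random.random() < P for _ in range(N)) / N - P)
          for _ in range(reps)]

within_moe = sum(e < moe for e in errors) / reps
within_half = sum(e < moe / 2 for e in errors) / reps
print(f"within the margin of error: {within_moe:.0%}")   # roughly 95%
print(f"within half the margin:     {within_half:.0%}")  # roughly 68%, i.e. about two-thirds
```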
For trends based on just two polls, the margin of error is larger than for a single poll, because it could happen by chance that one poll was a bit too low and the other was a bit too high: the difference between the two polls can easily be larger than the difference between either poll and the truth.
The best way to overcome the random fluctuations and pick up small trends is to do some sort of averaging of polls, either over time or across competing polling organisations. In the US, the website fivethirtyeight.com combines all the published polls to get estimates and probabilities of winning the election, and they do very well in short-term predictions. Here's a plot for the Australian (2007) elections, by Simon Jackman, of Stanford, where you can see individual poll results (with large fluctuations) around the average curve (which has much smaller uncertainties). KiwiPollGuy has apparently done something similar for NZ elections (though I'd be happier if their identity or their methodology were public).
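The serious poll-averaging models are well beyond a blog post, but here's a minimal sketch of the simplest version: pool several polls of the same question, weighting each by its sample size. The numbers in the example are hypothetical, and this ignores everything the real models handle (house effects, changes over time), but it shows why the pooled estimate has a smaller margin of error than any single poll.

```python
from math import sqrt

def average_polls(polls):
    """polls: list of (proportion, sample_size) pairs for the same question.
    Returns (pooled estimate, maximum margin of error for the pooled estimate)."""
    total_n = sum(n for _, n in polls)
    # weighting by sample size is the same as lumping all the samples together
    pooled = sum(p * n for p, n in polls) / total_n
    return pooled, 1 / sqrt(total_n)

# hypothetical example: three polls taken in the same month
polls = [(0.47, 948), (0.49, 1000), (0.45, 850)]
est, moe = average_polls(polls)
print(f"pooled estimate {est:.1%} +/- {moe:.1%}")  # smaller MoE than any one poll
```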
So, how are these numbers computed? If the poll were a uniform random sample of N people, and the true proportion were P, the margin of error would be 2 * square root(P*(1-P)/N). The problem then is that we don't know P: that's why we're doing the poll. The maximum margin of error takes P=0.5, which gives the largest margin of error, and one that's pretty reasonable for a range of P from, say, 15% to 85%. The formula then simplifies to 1/square root(N). If N is 1000, that's 3.16%; for N=948, as in the previous post, it's 3.25%.
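As code, the two formulas from that paragraph look like this (a sketch only; the function names are mine):

```python
from math import sqrt

def margin_of_error(p, n):
    """2 * square root(P*(1-P)/N), the formula above."""
    return 2 * sqrt(p * (1 - p) / n)

def maximum_margin_of_error(n):
    """Take P = 0.5, so the formula collapses to 1/sqrt(N)."""
    return margin_of_error(0.5, n)

print(f"{maximum_margin_of_error(1000):.2%}")  # 3.16%
print(f"{maximum_margin_of_error(948):.2%}")   # 3.25%
```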
Why is it 2 * square root(P*(1-P)/N)? Well, that takes more maths than I’m willing to type in this format so I’m just going to mutter “Bernoulli” at you and refer you to Wikipedia.
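For anyone who does want a little of the maths, here's the short version, using only standard results (nothing special to polling): each respondent is a Bernoulli trial, and the 2 is the usual rounding of 1.96.

```latex
% Sampling variance of a proportion from N independent Bernoulli(P) trials:
\[
\operatorname{Var}(\hat{p}) = \frac{P(1-P)}{N},
\qquad
\operatorname{se}(\hat{p}) = \sqrt{\frac{P(1-P)}{N}}
\]
% A 95% interval is +/- 1.96 standard errors (Normal approximation),
% and 1.96 gets rounded up to 2:
\[
\text{margin of error} \approx 2\sqrt{\frac{P(1-P)}{N}}
\]
```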
For trends based on two polls, as opposed to single polls, it turns out that the squared uncertainties add, so the square of the margin of error for the difference is twice the square of the margin of error for a single poll. Converting back to actual percentages, that means the margin of error for a difference based on two polls is about 1.4 times (square root of 2 times) as large as for a single poll.
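In code, again just as a sketch of the arithmetic in that paragraph:

```python
from math import sqrt

def moe_difference(n1, n2):
    """Maximum margin of error for the difference between two independent
    polls: the squared margins add, then take the square root."""
    return sqrt((1 / sqrt(n1))**2 + (1 / sqrt(n2))**2)

single = 1 / sqrt(948)                    # one poll of 948: about 3.25%
print(f"{moe_difference(948, 948):.2%}")  # two such polls: about 4.59%
print(moe_difference(948, 948) / single)  # 1.414..., i.e. sqrt(2)
```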
In reality, the margins of error computed this way are an underestimate, because of non-response and other imperfections in the sampling, but they don’t do too badly.