August 22, 2014

Margin of error for minor parties

The 3% ‘margin of error’ usually quoted for polls is actually the ‘maximum margin of error’: it applies to a party polling near 50%, and is an overestimate for minor parties. On the other hand, it also assumes simple random sampling and so tends to be an underestimate for major parties.
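To see why the quoted figure is a maximum, here is a minimal sketch of the usual normal-approximation margin, 1.96 × sqrt(p(1 − p)/n), which is largest at p = 0.5. The function name `normal_margin` is hypothetical, and the formula assumes simple random sampling.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% interval


def normal_margin(p, n, z=Z_95):
    """Normal-approximation margin of error for a proportion p from n respondents.

    Assumes simple random sampling; p is a proportion between 0 and 1.
    """
    return z * math.sqrt(p * (1 - p) / n)


n = 1000
for pct in (50, 10, 2):
    p = pct / 100
    # Report the margin in percentage points
    print(f"{pct:>2}%: +/- {100 * normal_margin(p, n):.1f} points")
```

With n = 1000 this gives roughly ±3.1 points at 50% but under ±1 point at 2%. The symmetric approximation is rough for very small percentages, which is one reason to prefer separate lower and upper limits.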

In case anyone is interested, I have done the calculations for a range of percentages (code here), both under simple random sampling and under one assumption about real sampling.

 

Lower and upper ‘margin of error’ limits (in percent) for a sample of size 1000 at each observed percentage, under the usual assumption of independent sampling.

Percentage   Lower   Upper
1            0.5     1.8
2            1.2     3.1
3            2.0     4.3
4            2.9     5.4
5            3.7     6.5
6            4.6     7.7
7            5.5     8.8
8            6.4     9.9
9            7.3     10.9
10           8.2     12.0
15           12.8    17.4
20           17.6    22.6
30           27.2    32.9
50           46.9    53.1

 

Lower and upper ‘margin of error’ limits (in percent) for a sample of size 1000 at each observed percentage, assuming that complications in sampling inflate the variance by a factor of 2, which empirically is about right for National.

Percentage   Lower   Upper
1            0.3     2.3
2            1.0     3.6
3            1.7     4.9
4            2.5     6.1
5            3.3     7.3
6            4.1     8.5
7            4.9     9.6
8            5.8     10.7
9            6.6     11.9
10           7.5     13.0
15           12.0    18.4
20           16.6    23.8
30           26.0    34.2
50           45.5    54.5
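The linked code is not reproduced here, so the following is only a sketch of one way to get asymmetric limits of this kind. It assumes an exact binomial (Clopper-Pearson) interval, and it treats a variance inflation factor as dividing the sample size (1000 respondents with a factor of 2 count as 500). Both choices, the rounding to whole respondents, and the names `binom_cdf`, `exact_interval` and `poll_interval` are assumptions of this sketch, not necessarily the method behind the tables.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * log_p + (n - i) * log_q)
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exact_interval(x, n, alpha=0.05):
    """Clopper-Pearson interval for x successes out of n, found by bisection."""

    def bisect(g):
        # g is increasing in p on (0, 1); find the p where it crosses zero
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: P(X >= x) = alpha/2; upper limit: P(X <= x) = alpha/2
    lower = 0.0 if x == 0 else bisect(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = 1.0 if x == n else bisect(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


def poll_interval(pct, n=1000, design_effect=1.0):
    """Lower and upper limits in percent for an observed percentage pct.

    Assumption: variance inflation is handled by shrinking n to an effective
    sample size, and counts are rounded to whole respondents.
    """
    n_eff = round(n / design_effect)
    x = round(pct / 100 * n_eff)
    lower, upper = exact_interval(x, n_eff)
    return 100 * lower, 100 * upper


for deff in (1.0, 2.0):
    print(f"variance inflation factor {deff}")
    for pct in (1, 2, 5, 10, 50):
        lower, upper = poll_interval(pct, 1000, deff)
        print(f"  {pct:>2}%: {lower:.1f} to {upper:.1f}")
```

For an observed 1%, this gives limits in line with the 1% rows of the two tables above. Using the effective sample size is a common shortcut for a variance inflation factor, but it is only an approximation to what a full analysis of the sampling design would give.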

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Excellent public service!

    I’ve been tweeting “symmetric” “margin of error” calculations at people. Am concerned that if people see something asymmetric they will get anxious.

    Good to have somewhere to point people though.

    10 years ago

  • ‘Maximum margin of error’ is a terrible name. The actual error can easily be greater.

    10 years ago

    • Thomas Lumley

      It is indeed a terrible name. Sadly, it’s the name we’ve got.

      Come to think of it, “error” isn’t an ideal term, either.

      10 years ago

      • I think I remember you complaining about ‘error’ for any difference between a data point and a regression line, in which case I totally agree. But I think it’s a fairly reasonable term for the difference between a measurement and the thing the measurement was trying to measure.

        10 years ago

        • Thomas Lumley

          Yes, for statisticians. But for non-scientists there is apparently a strong association between ‘error’ and ‘mistake’ that gives a misleading impression when we talk about ‘error’.

          [What I was complaining about with regression was getting the roles of Y and mu wrong in talking about error. Typically, Y is the truth and Y-mu is the error in approximating the truth by the model, not the reverse.]

          10 years ago