October 13, 2016

Weighting surveys

From the New York Times: “How One 19-Year-Old Illinois Man Is Distorting National Polling Averages”

There is a 19-year-old black man in Illinois who has no idea of the role he is playing in this election.

He is sure he is going to vote for Donald J. Trump.

I think the story exaggerates the impact of this guy’s opinions on polling averages, but it’s a great illustration of one of the subtleties of polling.

Even in New Zealand, you often see people claiming, for example, that opinion polls will underestimate the Green Party vote because Green voters are younger and more urban, and so are less likely to have landline phones. As we see from the actual elections, that isn’t true. Pollsters know about these simple forms of bias, and use weighting to fix them: if they poll half as many young voters as they should, each of their votes counts twice. Weighting isn’t as good as actually having a representative sample, but it’s ok, and unlike actually having a representative sample, it’s achievable.
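The "counts twice" arithmetic can be sketched in a few lines of Python. This is only an illustration: the group names, population shares, and responses are made-up numbers, and `poststratify` is a hypothetical helper rather than anything a particular polling firm uses. Each respondent's weight is their group's population share divided by its share of the sample.

```python
from collections import Counter

def poststratify(groups, population_share):
    """Weight = population share of the group / sample share of the group."""
    n = len(groups)
    counts = Counter(groups)
    return [population_share[g] / (counts[g] / n) for g in groups]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical poll of 100 people: young voters should be 30% of the
# sample but are only 15%, i.e. half as many as there should be.
groups = ["young"] * 15 + ["old"] * 85
# 1 = supports the party, 0 = does not (invented responses:
# 9 of 15 young and 17 of 85 old respondents support it).
supports = [1] * 9 + [0] * 6 + [1] * 17 + [0] * 68
population_share = {"young": 0.30, "old": 0.70}  # assumed known totals

weights = poststratify(groups, population_share)
raw = sum(supports) / len(supports)
adjusted = weighted_mean(supports, weights)

print(f"weight for a young respondent: {weights[0]:.2f}")
print(f"raw estimate: {raw:.2f}, weighted estimate: {adjusted:.2f}")
```

With these invented numbers each young respondent gets a weight of 2, and the estimate of support moves from 26% unweighted to 32% weighted, because the under-sampled group is the more supportive one.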

One of the tricky parts of weighting is choosing which groups to weight by. If you make the groups too broadly defined, you don’t remove enough bias; if you make them too narrowly defined, you end up with a few people getting really extreme weights, making the sampling error much larger than it should be. That’s what happened here: the survey had one person in one of its groups, and that person turned out to be unusual. But it gets worse.
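One rough way to see the cost of extreme weights is Kish's approximate effective sample size, (Σw)² / Σw². The sketch below uses invented numbers (1,000 respondents, one of them given a hypothetical weight of 30) and is not a reconstruction of the poll in the story; `kish_effective_n` is an illustrative name.

```python
def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Hypothetical panel: everyone weighted equally...
equal = [1.0] * 1000
# ...versus the same panel where one person in a tiny group gets weight 30.
one_extreme = [1.0] * 999 + [30.0]

n_equal = kish_effective_n(equal)
n_extreme = kish_effective_n(one_extreme)
share = 30.0 / sum(one_extreme)  # fraction of the total weight held by one person

print(f"effective n, equal weights: {n_equal:.0f}")
print(f"effective n, one weight of 30: {n_extreme:.0f}")
print(f"share of total weight held by that respondent: {share:.1%}")
```

Under these assumptions a single heavily weighted respondent cuts the effective sample size by nearly half, which is why narrowly defined weighting groups inflate the sampling error.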

The impact of the weighting was amplified because this is a panel survey, polling the same people repeatedly. Panel surveys are useful because they allow much more accurate estimation of changes in opinions, but an unlucky sample will persist over many surveys.

Worse still, one of the weighting factors used was how people say they voted in 2012. That sounds sensible, but it breaks one of the key assumptions about weighting variables: you need to know the population totals. We know the totals for how the population really voted in 2012, but reported vote isn’t the same thing at all; people are surprisingly unreliable at reporting how they voted in the past.

The actual impact on polling aggregators such as 538 is probably pretty small, since they model and try to remove ‘house effects’ (differences between surveys). However, the poll does give aid and comfort to people who don’t want to believe the consensus results, and that is not helpful.

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  •

    “people are surprisingly unreliable at reporting how they voted in the past”

    Just to put some numbers on that: in the New Zealand Election Survey (a multiparty system, where people might consider some parties very similar), somewhere between 17% and 20% of people in the 2014 panel recalled voting in 2011 for a party different to the one in their response collected in 2011.

    8 years ago

  •

    I’ve often thought it would be more cost effective to increase the sample size rather than stuff around with sampling. Especially in New Zealand where the population size is small and some of the samples are inordinately hard to find.

    On a separate note, one time when I was contacted by a phone pollster they told me “oh you’re close enough” after I answered the questions to determine if I was the right demographic. This gave me the impression the pollster was not overly careful about getting that precisely right.

    8 years ago

  • steve curtis

    I’ve never worried about so-called low turnouts for elections. The millions who do vote are the best ‘sample’ you will ever get.

    8 years ago

    • Megan Pledger

      But the “sample” is biased towards the Baby Boomers so they are coddled while the young and poor keep getting hammered by new govt policies. The latest one being the government trying to favour landlords (heavily weighted towards baby boomers) over renters (heavily weighted towards the young and poor) because the government don’t like the outcomes of the tenancy tribunal.

      8 years ago