The polling spectrum
I’ve had two people already complain to me on Twitter about the Stickybeak polling at The Spinoff. I’m a lot less negative than they are.
To start with, I think the really important distinction in surveys is between those that are actually trying to get the right answer, and those that aren’t. Stickybeak are on the “are trying” side.
There’s also a distinction between studies that are really only making an effort to get internally-valid comparisons and those that are trying to match the population. Internally-valid comparisons can still be useful: if you have a big self-selected internet sample you won’t learn much about what proportion of people take drugs, but you might be able to learn how the proportion of cannabis users trying to cut down compares with the proportion of nicotine users trying to cut down, or whether people who smoke weed and drink beer do both at once or on separate days, or other useful things.
Stickybeak are clearly trying to get nationally representative estimates (at least for their overall political polling): they talk about reweighting to match census data by gender, age, and region, and their claimed secret sauce is chatbots to raise response rates for online surveys.
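For readers who haven’t met it, here’s a minimal sketch of what that sort of reweighting can look like: raking (iterative proportional fitting) so the weighted sample matches population margins for gender, age, and region. The respondents and the ‘census’ margins below are invented for illustration; this is not Stickybeak’s actual code or their actual targets.

```python
# Minimal raking (iterative proportional fitting) sketch.
# The toy sample and the 'census' margins are made up for illustration.
import pandas as pd

sample = pd.DataFrame({
    "gender": ["F", "F", "M", "M", "F", "M"],
    "age":    ["18-34", "35-54", "18-34", "55+", "55+", "35-54"],
    "region": ["Auckland", "Rest of NI", "South Is",
               "Auckland", "South Is", "Rest of NI"],
})

# Hypothetical population shares for each weighting variable
targets = {
    "gender": {"F": 0.51, "M": 0.49},
    "age":    {"18-34": 0.30, "35-54": 0.35, "55+": 0.35},
    "region": {"Auckland": 0.33, "Rest of NI": 0.43, "South Is": 0.24},
}

weights = pd.Series(1.0, index=sample.index)

for _ in range(50):                       # cycle until the margins settle
    for var, target in targets.items():
        # weighted share of each category at the current weights
        current = weights.groupby(sample[var]).sum() / weights.sum()
        # scale each respondent by (target share) / (current share)
        weights = weights * sample[var].map(pd.Series(target) / current)

# Weighted estimates then use these weights; the weighted margins for
# gender, age, and region now match the targets (up to convergence error).
print((weights.groupby(sample["region"]).sum() / weights.sum()).round(3))
```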
Now, just because you’re trying to get the right answer doesn’t mean you will. There are plenty of people who try to predict Lotto results or earthquakes, too. And with Stickybeak, it’s too soon to say. We know that online panels can give good answers: YouGov has done well with this approach; its respondents are not necessarily representative, but it has a lot of information about them. We’re also pretty sure that pure random sampling for political opinion doesn’t work any more; response rates are so low that either quota sampling or weighting is needed to make the sample look at all like the population.
So what do I think? I would have hoped to see more variables used to reweight (ethnicity, and finer-scale geography), with total sample size larger, not smaller, than the traditional polls. I’d also like to see a better uncertainty description. The Spinoff is quoting
For a random sample of this size and after accounting for weighting the maximum sampling error (using 95% confidence) is approximately ±4%.
The accounting for weighting is not always done by NZ pollsters, so that’s good to see, but ‘For a random sample of this size’ seems a bit evasive. Either they’re claiming 4% is a good summary of the (maximum) sampling error for their results, in which case they should say so, or they aren’t, in which case they should stop hinting that it is. Still, we know that the 3.1% error claimed by traditional pollsters is an underestimate, and they largely get a pass on it.
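For context, here’s the standard ‘maximum sampling error’ calculation behind figures like these, using the worst case p = 0.5 and (optionally) inflating for weighting via a design effect. The sample size and the design effect below are illustrative assumptions, not figures Stickybeak have published.

```python
# Back-of-the-envelope maximum margin of error at 95% confidence,
# using the worst case p = 0.5 and inflating for the loss of
# precision from weighting via an assumed design effect.
import math

def max_sampling_error(n, design_effect=1.0, z=1.96):
    return z * math.sqrt(0.25 * design_effect / n)

# n = 1000 with no allowance for weighting: about 3.1%, the figure
# traditional NZ polls usually quote.
print(round(100 * max_sampling_error(1000), 1))

# The same n with an assumed design effect of 1.6 from weighting:
# about 3.9%, i.e. roughly the quoted 4%.
print(round(100 * max_sampling_error(1000, design_effect=1.6), 1))
```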
If you want to know whether to trust their results, I can’t tell you. Stickybeak are new enough that we don’t really know how accurate they are.
Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.
There seem to be more of these ‘rapid reaction surveys’, like Stickybeak, which normally work for commercial and private clients, getting into the political prediction game.
A recent NZ Herald story had this headline
https://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=12356984
“Do voters want the election delayed? New NZ Herald-Kantar poll reveals the answer”
Kantar, which owns Colmar Brunton, seems to be using this brand for non-random polling. Or as they put it: “…an automated market research platform designed for insights professionals, marketeers… blah blah…”
https://www.colmarbrunton.co.nz/launch-of-kantar-marketplace/
If you’ve got an existing reputation for traditional polling but you’re worried about its sustainability, it makes sense to try alternative approaches under a new label. But I’m not sure the ‘random’ vs ‘non-random’ distinction really holds up for most opinion research, given how low the response rates to phone calls are getting. Which is a pity, because it’s a lot harder to demonstrate to outsiders that you are serious and competent in non-response adjustment than that you are serious and competent in random-digit dialling.
The problem I have with so-called “re-weighting” of these online panels is that there is a well-established “digital divide” by generation (older people are less tech-savvy) and by socio-economic and ethnic status (disadvantaged people have less online access, as shown by the need to make special arrangements for educational outreach under the COVID lockdown). I would be more impressed if polling companies had to show how close their re-weighted samples were to publicly available information on key indicators, like voting pattern at the last election (at least for these political polls).
Vote at the last election isn’t available from the people sampled, typically — only self-reported vote at the last election is — so it’s hard to calibrate that. I mentioned ethnicity as a gap, but they do, of course, use age.
Also, if reweighting fundamentally didn’t work, traditional pollsters would be getting completely wrong answers, too. They also recruit their sample at least partly online, and they reweight in broadly similar ways.