Posts from July 2020 (21)

July 31, 2020

Bogus polls

The recent trends in opinion-poll support for the National Party got a lot of attention. That’s because real opinion polls, like those done by Colmar Brunton and Reid Research (and the internal party polling that parties tell us about when they think it will help them), are genuine attempts to estimate popular opinion. You can argue about how good they are, but that’s the point: you can argue about how good they are, because there are factual grounds for discussion.

Newshub ran a bogus online clicky poll with the question “Who would you prefer as Prime Minister – Judith Collins or Jacinda Ardern?”  Of the people who clicked on the poll, 53% preferred Ms Collins, and 47% preferred Ms Ardern.  Let’s compare that to the two real polls. The 1News/Colmar Brunton poll had

  • Jacinda Ardern: 54% 
  • Judith Collins: 20%

The 3/Reid poll had

  • Jacinda Ardern: 62% 
  • Judith Collins: 15% 

Why are these so different from the Newshub clicky poll? The first point is that there’s no reason for them to be similar. Two of them are estimates of popular opinion; the other one is a video game.

On top of that, the question is different.  The real polls are asking who (out of basically anyone) is your preferred PM.  The bogus poll forces the choice down to Ardern vs Collins.  If you supported Simon Bridges or Todd Muller — or Metiria Turei  or Winston Peters — the real polls let you say so, and the bogus poll doesn’t.
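To see how far apart the numbers can get, here is a minimal simulation sketch. The population shares come from the real polls above; the click-through rates are invented for illustration only. The point is that once respondents select themselves and “someone else” isn’t an option, the headline number reflects who clicks, not public opinion.

```python
# Minimal sketch: a random-sample poll versus a self-selected, forced-choice
# clicky poll of the same (hypothetical) population. Click rates are made up.
import random

random.seed(1)

# Hypothetical population with preferred-PM shares like the real polls:
# 54% Ardern, 20% Collins, 26% someone else.
population = ["Ardern"] * 54 + ["Collins"] * 20 + ["Someone else"] * 26

def random_sample_poll(n=1000):
    """Simple random sample: results track the population, give or take
    ordinary sampling error."""
    sample = random.choices(population, k=n)
    return {name: round(sample.count(name) / n, 3) for name in set(sample)}

def clicky_poll(n=1000, collins_click=0.9, ardern_click=0.3):
    """Self-selected and forced to a binary choice: only people motivated
    enough to click are counted, and 'someone else' voters vanish."""
    clicks = []
    for person in random.choices(population, k=20 * n):
        if person == "Collins" and random.random() < collins_click:
            clicks.append("Collins")
        elif person == "Ardern" and random.random() < ardern_click:
            clicks.append("Ardern")
        if len(clicks) == n:
            break
    return {name: round(clicks.count(name) / len(clicks), 3) for name in set(clicks)}

print(random_sample_poll())  # close to the population shares
print(clicky_poll())         # can land near 50:50, or anywhere, depending on who clicks
```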

Research Association NZ, who are the professional association for opinion researchers in NZ, have a code of practice for political polling (PDF). It’s only binding on their members, but it does have best practice advice for the media, such as using the term “poll” only for serious attempts to estimate public opinion, not for bogus clicky website things.

July 30, 2020

Briefly

  • The Algorithm Charter has been released. Stories from Newshub, Newsroom, The Guardian, The Register, and ZDNet.
  • Covid-19 in Victoria: bad, and according to modelling by Peter Ellis, still getting worse.  If you’re in NZ, make sure you have a mask and hand sanitiser available and at least have the contact apps on your phone, in case we get another outbreak.
  • RadioNZ’s podcast “The Detail” has an episode on polling.  I’m reliably informed that I’m on it.
  • In 2020, we’re amid that critical juncture for ’90s music—we can finally start asking today’s teens, “What music do you recognize from the ’90s?”. From pudding.cool

July 29, 2020

Gender guessing software

A company called Genderify has what they say is “an AI-powered tool for identifying the gender of your customers”. This is an example of something that is not worth doing (asking is easy and reliable; people will be upset when you get it wrong), but also very difficult.

After seeing some examples on Twitter, I decided to try it on some senior members of the Stats department (whose gender identity I’m reasonably confident of):

“Thomas Lumley” is 63.90% likely to be male and 36.10% likely to be female, and you have to like the four-digit precision. But “Dr Thomas Lumley” is 89.40% likely to be male, and “prof thomas lumley” gets up to 94.60%!

“Ilze Ziedins” is 85.20% likely to be male, which will surprise her. “Dr Ilze Ziedins” gets to 96.00%.

“James Curran” is 99.60% likely to be male; adding “Dr” or “Prof” gets him up to 99.90%.

“Rachel Fewster” is 72.00% likely to be male; adding her professorial title puts that up to 95.40%.

“Renate Meyer” is at 62.30%; her doctorate moves that up to 88.20%, and her promotion to professor makes it 94.00%.

Note that none of these are classically gender-neutral or gender-ambiguous names: no Hadley or Hilary or Cameron.  The overall level of accuracy is pretty terrible to start with — but the response to adding qualifications is bizarre.  If that wasn’t in the basic pre-release testing, then what was?

Even better (worse): it’s not just that adding “Dr” or “Prof” makes it think you’re more likely to mean a man; adding “Dame” also does.
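If you want to repeat this kind of check systematically, a sketch like the one below will do it. The guess_male_probability() function is a hypothetical stand-in (here it just returns 0.5); Genderify’s own interface isn’t reproduced, so plug in whatever gender-guessing service you’re auditing.

```python
# Sketch of a title-sensitivity audit for a gender-guessing service.
# guess_male_probability() is a placeholder, not Genderify's actual API.

TITLES = ["", "Dr ", "Prof ", "Dame "]
NAMES = ["Thomas Lumley", "Ilze Ziedins", "James Curran", "Rachel Fewster"]

def guess_male_probability(name: str) -> float:
    """Stand-in for the service under test: return its estimated P(male)."""
    return 0.5  # replace with a real lookup

def title_shift(name: str) -> dict:
    """How much each honorific moves P(male) relative to the bare name.
    A sensible guesser should barely move at all."""
    baseline = guess_male_probability(name)
    return {(title.strip() or "none"): guess_male_probability(title + name) - baseline
            for title in TITLES}

for name in NAMES:
    print(name, title_shift(name))
```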

 

Update: on Twitter, (Dr, Prof) Casey Fiesler raised the possibility that Genderify are just trolling, which I must say is looking quite plausible.

The StatsChat guide to polls

It’s getting to be that time of the triennium again, so here are some highlights from past StatsChat posts on electoral polling.

July 28, 2020

Super Rugby Australia Predictions for Round 5

Team Ratings for Round 5

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team        Current Rating   Rating at Season Start   Difference
Brumbies          4.66                4.67               -0.00
Reds             -1.37               -0.31               -1.10
Rebels           -4.02               -5.52                1.50
Waratahs         -7.01               -7.12                0.10
Force           -10.54              -10.00               -0.50

 

Performance So Far

So far there have been 8 matches played, 7 of which were correctly predicted, a success rate of 87.5%.
Here are the predictions for last week’s games.

Game                      Date     Score     Prediction   Correct
1  Rebels vs. Waratahs    Jul 24   29 – 10      5.60       TRUE
2  Force vs. Brumbies     Jul 25    0 – 24     -8.60       TRUE

 

Predictions for Round 5

Here are the predictions for Round 5. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                    Date     Winner     Prediction
1  Force vs. Rebels     Jul 31   Rebels       -2.00
2  Brumbies vs. Reds    Aug 01   Brumbies     10.50
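For readers wondering roughly how team ratings turn into a predicted margin, below is a generic sketch of an Elo-style points-difference predictor. This is not the method described on the Department home page; the home-ground constant and the update weight are invented purely for illustration.

```python
# Generic sketch of a ratings-based margin predictor (NOT the actual method
# behind these tables). HOME_ADVANTAGE and UPDATE_WEIGHT are made-up values.

HOME_ADVANTAGE = 2.0   # assumed points credited to the home team
UPDATE_WEIGHT = 0.1    # assumed fraction of the prediction error fed back

def predict_margin(ratings, home, away):
    """Expected points difference; positive favours the home team."""
    return ratings[home] - ratings[away] + HOME_ADVANTAGE

def update_ratings(ratings, home, away, home_score, away_score):
    """Nudge both ratings toward the margin actually observed."""
    error = (home_score - away_score) - predict_margin(ratings, home, away)
    ratings[home] += UPDATE_WEIGHT * error
    ratings[away] -= UPDATE_WEIGHT * error

ratings = {"Brumbies": 4.66, "Reds": -1.37, "Rebels": -4.02,
           "Waratahs": -7.01, "Force": -10.54}

print(predict_margin(ratings, "Brumbies", "Reds"))   # positive: home win expected
update_ratings(ratings, "Brumbies", "Reds", 22, 20)  # hypothetical final score
print(ratings["Brumbies"], ratings["Reds"])          # ratings drift toward the result
```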

 

Super Rugby Aotearoa Predictions for Round 8

Team Ratings for Round 8

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating   Rating at Season Start   Difference
Crusaders          14.68               15.15               -0.50
Hurricanes          7.98                8.31               -0.30
Blues               6.88                5.39                1.50
Chiefs              5.41                7.94               -2.50
Highlanders         1.61               -0.22                1.80

 

Performance So Far

So far there have been 14 matches played, 9 of which were correctly predicted, a success rate of 64.3%.
Here are the predictions for last week’s games.

Game                          Date     Score     Prediction   Correct
1  Crusaders vs. Hurricanes   Jul 25   32 – 34     13.30       FALSE
2  Blues vs. Chiefs           Jul 26   21 – 17      6.40       TRUE

 

Predictions for Round 8

Here are the predictions for Round 8. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                       Date     Winner      Prediction
1  Chiefs vs. Crusaders    Aug 01   Crusaders     -4.80
2  Highlanders vs. Blues   Aug 02   Blues         -0.80

 

NRL Predictions for Round 12

Team Ratings for Round 12

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team           Current Rating   Rating at Season Start   Difference
Storm               14.03               12.73                1.30
Roosters            12.40               12.25                0.20
Raiders              6.55                7.06               -0.50
Eels                 5.47                2.80                2.70
Panthers             3.61               -0.13                3.70
Rabbitohs            2.55                2.85               -0.30
Sharks               1.94                1.81                0.10
Wests Tigers         0.96               -0.18                1.10
Sea Eagles           0.62                1.05               -0.40
Knights             -2.26               -5.92                3.70
Dragons             -4.00               -6.14                2.10
Bulldogs            -5.93               -2.52               -3.40
Cowboys             -6.32               -3.95               -2.40
Warriors            -7.34               -5.17               -2.20
Broncos            -10.48               -5.53               -4.90
Titans             -13.79              -12.99               -0.80

 

Performance So Far

So far there have been 88 matches played, 57 of which were correctly predicted, a success rate of 64.8%.
Here are the predictions for last week’s games.

Game                        Date     Score     Prediction   Correct
1  Eels vs. Wests Tigers    Jul 23   26 – 16      5.90       TRUE
2  Cowboys vs. Sea Eagles   Jul 24   12 – 24     -4.00       TRUE
3  Broncos vs. Storm        Jul 24    8 – 46    -20.80       TRUE
4  Roosters vs. Warriors    Jul 25   18 – 10     26.00       TRUE
5  Sharks vs. Dragons       Jul 25   28 – 24      6.30       TRUE
6  Raiders vs. Rabbitohs    Jul 25   18 – 12      6.00       TRUE
7  Knights vs. Bulldogs     Jul 26   12 – 18      7.00       FALSE
8  Titans vs. Panthers      Jul 26   14 – 22    -16.40       TRUE

 

Predictions for Round 12

Here are the predictions for Round 12. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date     Winner         Prediction
1  Dragons vs. Rabbitohs      Jul 30   Rabbitohs        -4.50
2  Wests Tigers vs. Warriors  Jul 31   Wests Tigers     12.80
3  Broncos vs. Sharks         Jul 31   Sharks          -10.40
4  Roosters vs. Titans        Aug 01   Roosters         28.20
5  Cowboys vs. Raiders        Aug 01   Raiders         -10.90
6  Sea Eagles vs. Panthers    Aug 01   Panthers         -1.00
7  Bulldogs vs. Eels          Aug 02   Eels             -9.40
8  Storm vs. Knights          Aug 02   Storm            16.30

 

July 27, 2020

Rogue polls

I wrote about ordinary sampling variation and ‘rogue polls’ for The Spinoff.
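As a taste of what ordinary sampling variation does, here’s a toy simulation (the 50% support level and the 3-point threshold are arbitrary choices of mine, not anything from the Spinoff piece): with polls of 1,000 people, around 5–6% of perfectly conducted polls land more than 3 points from the truth.

```python
# Toy illustration of ordinary sampling variation: simulate many honestly
# conducted polls of n=1000 and count how often the result looks 'rogue',
# i.e. more than 3 points from the true value. All numbers are arbitrary
# illustration choices.
import random

random.seed(2020)

TRUE_SUPPORT = 0.50
N = 1000           # respondents per poll
POLLS = 10_000     # simulated polls

def one_poll() -> float:
    """Share of N randomly sampled respondents who say 'yes'."""
    return sum(random.random() < TRUE_SUPPORT for _ in range(N)) / N

results = [one_poll() for _ in range(POLLS)]
rogue = sum(abs(r - TRUE_SUPPORT) > 0.03 for r in results) / POLLS
print(f"Polls more than 3 points from the truth: {rogue:.1%}")
# comes out around 5-6%, with nobody doing anything wrong
```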

 

July 24, 2020

Hard, soft, and real

In clinical trials we make two important distinctions between measurements. There are ‘hard’ and ‘soft’ outcomes: ‘hard’ ones are objectively and reproducibly measurable; ‘soft’ ones have some subjectivity and observer bias. There are also ‘surrogate’ and ‘real’ or ‘patient-centered’ outcomes. ‘Real’ outcomes are what we care about; ‘surrogate’ outcomes are things we measure because we can measure them well and we expect them to correlate with real outcomes. Both distinctions are valuable, but they are different distinctions, and you don’t want to confuse them: a ‘hard’ outcome is not necessarily a ‘real’ one.

There’s a story on Newshub headlined “Bisexual men are real, study finds”. One of the researchers had previously doubted this, but has now been convinced, and has a paper in PNAS about an analysis combining data from many previous studies. These studies involve wiring someone’s penis up to detect arousal and then showing him erotic images. The claim is that these measurements are objective:

If men who self-report Kinsey scores in the bisexual range indeed have relatively bisexual arousal patterns, then both Minimum Arousal and the Bisexual Arousal Composite should show an inverted U-shaped distribution across the Kinsey range (i.e., men who self-identify as 0 [exclusively heterosexual] and 6 [exclusively homosexual] should have the lowest scores for these variables; men in intermediate groups should have greater values, with the peak resting at a Kinsey score of 3); the Absolute Arousal Difference should show a U-shaped distribution (i.e., exclusively heterosexual and exclusively homosexual men should have lower values than bisexual-identified men).
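As a purely illustrative aside (this is not the paper’s analysis, and the numbers below are invented), the “inverted U” prediction can be read as a concrete quantitative check: fit a quadratic in Kinsey score and look for negative curvature with the peak near 3.

```python
# Illustration only: what an 'inverted U across the Kinsey range' looks like
# as a testable pattern. The arousal values are invented; this is not the
# analysis in the PNAS paper.
import numpy as np

kinsey = np.arange(7)                                    # Kinsey scores 0..6
arousal = np.array([0.1, 0.3, 0.6, 0.8, 0.6, 0.3, 0.1])  # made-up composite means

# Fit arousal ~ a*k^2 + b*k + c. An inverted U means a < 0, with the
# peak at k = -b / (2a), which the quoted prediction puts near 3.
a, b, c = np.polyfit(kinsey, arousal, deg=2)
print(f"curvature {a:.3f} (negative = inverted U), peak at Kinsey {-b / (2 * a):.1f}")
```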

The reason for emphasising these measurements is that the researchers don’t completely trust self-report (while agreeing it is valuable):

However, because the scale relied on self-reports, results could not provide definitive evidence for bisexual orientation. For example, surveys have shown that a large proportion of men who identify as gay or homosexual had gone through a previous and transient phase of bisexual identification 

I don’t think anyone (whatever their opinion on bisexuality) would deny that men who lie about sex are real. The problem is treating the physical arousal measurements as basically definitive of bisexuality.  In the clinical trials terminology, the arousal measurement is a relatively hard outcome, but it is a surrogate outcome.

With modern data science (and sufficiently dodgy ethics) there would be other surrogate outcomes that someone has probably explored.  Are there a significant number of men on Tinder who swipe right for both male and female profiles?  Are there many PornHub accounts of men who watch both straight and gay porn? Are there men who have shared a one-bedroom home with both men and women over time?  All of these are clearly reductive: they would give you one-dimensional information about bisexuality, but they are measuring different things and there’s no reason to expect they would agree on how common it is.  The same is true for physiological arousal.  Measuring it can be valuable; the demographics of physiological arousal can be a valid area of study; but it can’t answer the yes/no question.

Some men claim to be attracted to both men and women, and behave as if their claims are true. It turns out, according to this paper, that for some of these men the physiological measurements of arousal show the relationships that you’d expect.  If there weren’t any men whose physiological measurements of arousal show those relationships, that would be an interesting fact, but the real question would be why the measurements don’t fit with the phenomenon of bisexuality.  If you think of this paper as just trying to answer a question about physiological arousal then, OK, that’s the question it tries to answer. And in fact one of the researchers is quoted further down in the Newshub story saying:

“It has always been clear that bisexual men exist in terms of self-identity and behaviour, but many, including myself, were sceptical about their ability to be sexually aroused to both men and women.” 

Contrast that, though, with the paper’s “Significance” section, which starts out:

“There has long been skepticism among both scientists and laypersons that male bisexual orientation exists.”

Or with the second sentence of the press release:

“The existence of male bisexuality is contested, with skeptics claiming that men who self-identify as bisexual are actually either homosexual or heterosexual.”

Or with the title of the research paper itself:

Robust evidence for bisexual orientation among men

The stretching of the study findings to the headline “Bisexual men are real, study finds” can’t just be blamed on the media.

When we talk about whether Alexander the Great or Shakespeare was bisexual, there are difficulties in even agreeing on the concept over centuries or millennia of social distance.  But I think most people would agree there’s more to the question than what would have happened if you wired them up to a machine and showed them porn.

Briefly

Non-representative sampling!