Posts from September 2017 (36)

September 17, 2017

Polls that aren’t any use


From last week in the Herald: 73.6 per cent of landlords plan rent rises if Labour wins. It’s been a while since I noticed a bogus-poll headline, but they keep coming back.

This time there are two independent reasons this number is meaningless.  First, it’s a self-selected survey — a bogus poll.  You can think of self-selected surveys as a type of petition: they don’t tell you anything useful about the people who didn’t respond, so the results are only interesting if the absolute number responding in a particular category is surprisingly high.  In this case, it’s 73.6% of 816 landlords. According to an OIA request in 2015, there are more than 120,000 landlords in NZ, so we’re looking at a ‘yes’ response from less than half a percent of them.
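As a rough check on those numbers, here is the arithmetic in Python (a sketch; the 120,000 figure is the OIA lower bound cited above):

```python
# Arithmetic from the paragraph above: 73.6% of 816 respondents,
# against the 2015 OIA estimate of more than 120,000 landlords in NZ.
respondents = 816
yes_share = 0.736
landlords_nz = 120_000  # a lower bound, so the true share is even smaller

yes_count = yes_share * respondents       # about 600.6 'yes' responses
print(f"{yes_count:.0f}")                 # -> 601
print(f"{yes_count / landlords_nz:.2%}")  # -> 0.50%
```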

Second, there’s an important distinction in polling questions to worry about.  If a nice pollster calls you up one evening and asks who you’re voting for, there’s no particular reason to say anything other than the truth.  The truth is the strongest possible signal of your political affiliation.  If a survey asks “will you raise rents if Labour gets in and raises costs?”, it’s fairly natural to say “yes” as a sign that you don’t support Labour, whether it’s true or not. There’s no cost to saying “yes”, but if you’re currently setting rents at what you think is the right level, there is a cost to raising them.

Those of you who do arithmetic compulsively will have noticed another, more minor, problem with the headline.  There is no number of votes out of 816 that rounds correctly to 73.6%:  600/816 is 73.52941%, ie, 73.5%, and 601/816 is 73.65196%, ie, 73.7%.  And, of course, headlining the results of any poll, even a good one, to the nearest tenth of a percentage point is silly.
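A quick check in Python confirms this (a minimal sketch):

```python
# No whole number of votes out of 816 rounds to 73.6% at one decimal place.
hits = [k for k in range(817) if round(100 * k / 816, 1) == 73.6]
print(hits)  # -> []: 600/816 gives 73.5% and 601/816 gives 73.7%
```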

September 13, 2017

Thresholds and discards, again

There are competing explanations out there about what happens to votes for a party that doesn’t reach the 5%/1 electorate threshold.  This post is about why I don’t like one of them.

People will say (as on NZ Morning Report this morning) that your votes are reallocated to other parties.  In some voting systems, such as the STV we use for local government elections, reallocating votes is a thing. Your voting paper literally (or virtually) starts off in one party’s pile and is moved to a different party’s pile.

That’s not what happens with the party votes for Parliament.  If the Greens don’t make 5%, party votes for the Greens are not used in allocating List seats.  It’s exactly as if those voters hadn’t cast a party vote, which I think is a simple enough explanation to use.

Now, in the vast majority of cases the result will be the same as if the votes had been reallocated in proportion — unless something weird like a tie happens at some stage in the counting — but one of the explanations is what happens and the other one isn’t.

If you think the two explanations convey the same meaning, you shouldn’t object to using the one that’s actually correct. And if you think they convey different meanings, you definitely shouldn’t object to using the one that’s actually correct.

 

September 12, 2017

NRL Predictions for Finals Week 2

Team Ratings for Finals Week 2

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team Current Rating Rating at Season Start Difference
Storm 14.43 8.49 5.90
Broncos 6.56 4.36 2.20
Raiders 4.07 9.94 -5.90
Panthers 2.77 6.08 -3.30
Cowboys 2.68 6.90 -4.20
Sharks 2.55 5.84 -3.30
Eels 2.51 -0.81 3.30
Roosters 1.34 -1.17 2.50
Dragons -0.94 -7.74 6.80
Sea Eagles -1.11 -2.98 1.90
Bulldogs -3.55 -1.34 -2.20
Wests Tigers -3.72 -3.89 0.20
Rabbitohs -3.84 -1.82 -2.00
Warriors -7.23 -6.02 -1.20
Titans -9.03 -0.98 -8.10
Knights -9.54 -16.94 7.40

 

Performance So Far

So far there have been 196 matches played, 118 of which were correctly predicted, a success rate of 60.2%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Roosters vs. Broncos Sep 08 24 – 22 -2.50 FALSE
2 Storm vs. Eels Sep 09 18 – 16 17.90 TRUE
3 Sea Eagles vs. Panthers Sep 09 10 – 22 -2.30 TRUE
4 Sharks vs. Cowboys Sep 10 14 – 15 4.20 FALSE

 

Predictions for Finals Week 2

Here are the predictions for Finals Week 2. The prediction is my estimated expected points difference, with a positive margin being a win to the home team and a negative margin a win to the away team (a rough reconstruction of how these margins relate to the ratings follows the table).

Game Date Winner Prediction
1 Broncos vs. Panthers Sep 15 Broncos 7.30
2 Eels vs. Cowboys Sep 16 Eels 3.30
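For what it’s worth, these predictions are consistent with a simple reading of the ratings table: home rating minus away rating, plus a home-ground advantage of about 3.5 points. That constant is my inference from the numbers above, not the published description of the method, so treat this as a hypothetical sketch:

```python
# Hypothetical reconstruction: predicted margin = home rating - away rating
# plus a home advantage. The 3.5-point home advantage is inferred from the
# tables above, not taken from the published description of the method.
def predicted_margin(home_rating, away_rating, home_advantage=3.5):
    """Positive values favour the home team by that many points."""
    return home_rating - away_rating + home_advantage

print(round(predicted_margin(6.56, 2.77), 2))  # Broncos vs. Panthers -> 7.29; table says 7.30
print(round(predicted_margin(2.51, 2.68), 2))  # Eels vs. Cowboys -> 3.33; table says 3.30
```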

 

Mitre 10 Cup Predictions for Round 5

Team Ratings for Round 5

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team Current Rating Rating at Season Start Difference
Canterbury 20.07 14.78 5.30
Taranaki 5.49 7.04 -1.50
North Harbour 5.30 -1.27 6.60
Otago 4.60 -0.34 4.90
Tasman 4.24 9.54 -5.30
Counties Manukau 3.29 5.70 -2.40
Wellington 3.14 -1.62 4.80
Waikato -0.79 -0.26 -0.50
Auckland -1.87 6.11 -8.00
Manawatu -3.81 -3.59 -0.20
Bay of Plenty -3.83 -3.98 0.10
Northland -4.06 -12.37 8.30
Hawke’s Bay -13.28 -5.85 -7.40
Southland -21.09 -16.50 -4.60

 

Performance So Far

So far there have been 30 matches played, 19 of which were correctly predicted, a success rate of 63.3%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Wellington vs. Hawke’s Bay Sep 06 40 – 27 17.70 TRUE
2 Counties Manukau vs. North Harbour Sep 07 18 – 27 4.40 FALSE
3 Canterbury vs. Southland Sep 08 78 – 20 42.30 TRUE
4 Manawatu vs. Bay of Plenty Sep 08 17 – 20 5.60 FALSE
5 Auckland vs. Taranaki Sep 09 38 – 49 -1.70 TRUE
6 Northland vs. Waikato Sep 09 37 – 7 -5.70 FALSE
7 Tasman vs. Wellington Sep 10 37 – 35 5.40 TRUE
8 Hawke’s Bay vs. Otago Sep 10 21 – 64 -7.90 TRUE

 

Predictions for Round 5

Here are the predictions for Round 5. The prediction is my estimated expected points difference, with a positive margin being a win to the home team and a negative margin a win to the away team.

Game Date Winner Prediction
1 Canterbury vs. Counties Manukau Sep 13 Canterbury 20.80
2 Northland vs. North Harbour Sep 14 North Harbour -5.40
3 Southland vs. Auckland Sep 15 Auckland -15.20
4 Taranaki vs. Bay of Plenty Sep 15 Taranaki 13.30
5 Waikato vs. Manawatu Sep 16 Waikato 7.00
6 Otago vs. Tasman Sep 16 Otago 4.40
7 Counties Manukau vs. Hawke’s Bay Sep 17 Counties Manukau 20.60
8 Wellington vs. Canterbury Sep 17 Canterbury -12.90

 

Currie Cup Predictions for Round 10

Team Ratings for Round 10

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team Current Rating Rating at Season Start Difference
Sharks 4.36 2.15 2.20
Western Province 4.04 3.30 0.70
Lions 3.41 7.41 -4.00
Cheetahs 3.23 4.33 -1.10
Blue Bulls 0.09 2.32 -2.20
Pumas -7.60 -10.63 3.00
Griquas -10.29 -11.62 1.30

 

Performance So Far

So far there have been 27 matches played, 18 of which were correctly predicted, a success rate of 66.7%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Griquas vs. Lions Sep 08 17 – 34 -8.50 TRUE
2 Western Province vs. Cheetahs Sep 09 57 – 14 3.20 TRUE
3 Pumas vs. Sharks Sep 09 25 – 27 -8.50 TRUE

 

Predictions for Round 10

Here are the predictions for Round 10. The prediction is my estimated expected points difference, with a positive margin being a win to the home team and a negative margin a win to the away team.

Game Date Winner Prediction
1 Pumas vs. Western Province Sep 15 Western Province -7.10
2 Griquas vs. Sharks Sep 16 Sharks -10.10
3 Lions vs. Blue Bulls Sep 16 Lions 7.80

 

September 11, 2017

Stat of the Week Competition: September 9 – 15 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday September 15 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of September 9 – 15 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: September 9 – 15 2017

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

September 10, 2017

Should there be an app for that?

As you may have heard, researchers at Stanford have tried to train a neural network to predict sexual orientation from photos. Here’s the Guardian’s story.

Artificial intelligence can accurately guess whether people are gay or straight based on photos of their faces, according to new research that suggests machines can have significantly better “gaydar” than humans.

There are a few questions this should raise.  Is it really better? Compared to whose gaydar? And WTF would think this was a good idea?

As one comment on the study says

Finally, the predictability of sexual orientation could have serious and even life-threatening implications to gay men and women and the society as a whole. In some cultures, gay men and women still suffer physical and psychological abuse at the hands of governments, neighbors, and even their own families.

No, I lied. That’s actually a quote from the research paper (here). The researchers say this sort of research is ethical and important because people don’t worry enough about their privacy. Which is a point of view.

So, you might wonder about the details.

The data came from a dating website, using self-identified gender for the photo combined with the gender they were interested in dating to work out sexual orientation. That’s going to be pretty accurate (at least if you don’t care how bisexual people are classified, which they don’t seem to). It’s also pretty obvious that the pictures weren’t put up for the purpose of AI research.

The Guardian story says

 a computer algorithm could correctly distinguish between gay and straight men 81% of the time, and 74% for women

which is true, but is a fairly misleading summary of accuracy.  Presented with a pair of faces, one of which was gay and one wasn’t, that’s how accurate the computer was.  In terms of overall error rate, you can do better than 81% or 74% just by assuming everyone is straight, and the increase in prediction accuracy for random people over human judgment is pretty small.
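To see why pairwise accuracy and accuracy on random people can diverge, consider a base rate well below 50% (the 5% below is purely illustrative, not a figure from the study):

```python
# With a minority base rate, the do-nothing classifier looks very accurate.
# The 5% base rate is illustrative only, not taken from the study.
base_rate = 0.05
everyone_straight = 1 - base_rate  # accuracy of labelling everyone straight
print(f"{everyone_straight:.0%}")  # -> 95%, beating the quoted 81% and 74%
```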

More importantly, these are photos from dating profiles. You’d expect dating profile photos to give more hints about sexual orientation than, say, passport photos, or CCTV stills.  That’s what they’re for.  The researchers tried to get around this, but they were limited by the mysterious absence of large databases of non-dating photos classified by sexual orientation.

The other question you might have is about the less-accurate human ratings.  These were done using Amazon’s Mechanical Turk.  So, a typical Mechanical Turk worker, presented only with a single pair of still photos, does do a bit worse than a neural network.  That’s basically what you’d expect with the current levels of still image classification: algorithms can do better than people who aren’t particularly good and who don’t get any particular training.  But anyone who thinks that’s evidence of significantly better gaydar than humans in a meaningful sense must have pretty limited experience of social interaction cues. Or have some reason to want the accuracy of their predictions overstated.

The research paper concludes

The postprivacy world will be a much safer and hospitable place if inhabited by well-educated, tolerant people who are dedicated to equal rights.

That’s hard to argue with. It’s less clear that normalising the automated invasion of privacy and use of personal information without consent is the best way to achieve this goal.

Why you can’t predict Epsom from polls

The Herald’s poll aggregator had a bit of a breakdown over the Epsom electorate yesterday, suggesting that Labour had a chance of winning.

Polling data (and this isn’t something a statistician likes saying) is essentially useless when it comes to Epsom, because neither side benefits from getting their own supporters’ votes. National supporters are a clear majority in the electorate. If they do their tactical voting thing properly and vote for ACT’s David Seymour, he will win.  If they do the tactical voting thing badly enough, and the Labour and Green voters do it much better, National’s Paul Goldsmith will win.

Opinion polls over the whole country don’t tell you about tactical voting strategies in Epsom. Even opinion polls in Epsom would have to be carefully worded, and you’d have to be less confident in the results.

There isn’t anywhere else quite like Epsom. There are other electorates that matter and are hard to predict — such as Te Tai Tokerau, where polling information on Hone Harawira’s popularity is sparse — but in those electorates the polls are at least asking the right question.

Peter Ellis’s poll aggregator just punts on this question: the probability of ACT winning Epsom is set at an arbitrary 80%, and he gives you an app that lets you play with the settings. I think that’s the right approach.

September 6, 2017

Threshold and discards

There have been a few discussions on Twitter about what happens to votes for parties who don’t make the threshold of 5% or one electorate. I’m going to try to make it clearer than is feasible in 140 characters, but without mentioning quotients.

If you voted for a party that gets less than 5% of the party vote and does not win any electorate, your party vote is not used in determining the list seats.  It doesn’t get reassigned, reweighted, or re-anything. It just isn’t used — exactly as if those people hadn’t cast a party vote (Electoral Act, section 191(4)). Last time, about 150,000 votes were set aside at this point.

The votes for the parties that are left (2.3 million, last time) are now used to allocate 120 seats.  Complicated procedures are used to work out a number of votes per seat, call it N.  The total for each party is divided by N, and rounded to the nearest whole number, so you need at least ½N to get one seat, 1½N to get two seats, and so on. (That’s not how the Electoral Act describes it; this is the equivalent ‘Webster’ method rather than the ‘Sainte-Laguë’ method.)
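In code, the Sainte-Laguë form of the same calculation looks like this (a minimal sketch with made-up party names and vote totals; the real procedure also has tie-breaking rules):

```python
# Sainte-Lague (highest averages) allocation, equivalent to the 'Webster'
# rounding description above. Parties and vote totals are made up.
def sainte_lague(votes, seats=120):
    won = {party: 0 for party in votes}
    for _ in range(seats):
        # The next seat goes to the party with the largest quotient
        # votes / (2 * seats_won_so_far + 1).
        best = max(votes, key=lambda p: votes[p] / (2 * won[p] + 1))
        won[best] += 1
    return won

print(sainte_lague({"A": 1_150_000, "B": 900_000, "C": 250_000}))
# -> {'A': 60, 'B': 47, 'C': 13}
```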

So what are the implications?

I don’t like the term ‘wasted’ vote — if you’re voting for, say, Ban 1080 or Aotearoa Legalise Cannabis, it’s presumably not in the expectation of getting representation in Parliament, but more as a way of making your views known.  However, if your intent is to increase representation in Parliament of people whose views you support, this is the basic guideline:

  • If the opinion polls show a party is nowhere near the 5%/one electorate threshold, the expected impact of increased votes for that party on the composition of Parliament is very small (compared to a major party).
  • If the opinion polls show a party is close to the 5% threshold (in either direction) and isn’t certain to get an electorate, the expected impact of increased votes for that party on the composition of Parliament is relatively large (compared to a major party).
  • If a party is reasonably certain to get an electorate or to get over the 5% threshold, the expected impact of increased votes for that party on the composition of Parliament is about the same as for a major party.