Posts from March 2014 (59)

March 21, 2014

Common exposures are common

A California head-lice treatment business has had huge success in publicising itself with the claim that selfies are causing a rise in nits among teenagers. The Herald mentions this in Sideswipe, the right place for this sort of story, but other international sites have been less discriminating.

There are no actual numbers involved, and nothing like representative data even if you’re in the South Bay area of central California. More importantly, though, there is no comparison group. The owner of the business, Mary MacQuillan, says “Every teen I’ve treated, I ask about selfies, and they admit that they are taking them every day.”  That’s probably only a slight exaggeration at most, but every teen she hasn’t treated has also probably been taking photos that way. It’s something teenagers do.  Common exposures are common.

So, why were news organisations around the world publicising this? The fact that it’s about teenagers and the internet goes a long way to explaining it. It doesn’t need evidence because teenage use of technology is automatically scary and newsworthy: as Ms MacQuillan says, “I think parents need to be aware, and teenagers need to be aware too. Selfies are fun, but the consequences are real.”

You get the same thing happening with ‘chemicals’, as the dihydrogen monoxide parody website loves to point out:

A recent stunning revelation is that in every single instance of violence in our country’s schools, …, dihydrogen monoxide was involved.

 

March 20, 2014

Beyond the margin of error

From Twitter, this morning (the graphs aren’t in the online story)

Now, the Herald-Digipoll is supposed to be a real survey, with samples that are more or less representative after weighting. There isn’t a margin of error reported, but the standard maximum margin of error would be  a little over 6%.

There are two aspects of the data that make it not look representative. The first is that only 31.3%, or 37% of those claiming to have voted, said they voted for Len Brown last time. He got 47.8% of the vote. That discrepancy is a bit larger than you’d expect just from bad luck; it’s the sort of thing you’d expect to see about 1 or 2 times in 1000 by chance.
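As a rough check on that "1 or 2 times in 1000", here is a minimal sketch using a normal approximation. The sample size is an assumption: the story doesn't report it, and about 250 respondents is simply what a maximum margin of error of a little over 6% would imply. The function names are made up for illustration.

```python
import math

# ASSUMPTION: the story does not report the sample size. About 250
# respondents is what a maximum margin of error of a little over 6% implies.
ASSUMED_SAMPLE_SIZE = 250


def normal_tail(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def brown_discrepancy(n=ASSUMED_SAMPLE_SIZE, claimed_turnout=0.85,
                      true_share=0.478, observed_share=0.37):
    """z-score and tail probabilities for Len Brown's share among claimed voters."""
    # Only respondents who say they voted can report a mayoral vote.
    n_voters = n * claimed_turnout
    # Standard error of a sample proportion if the true share is 47.8%.
    se = math.sqrt(true_share * (1 - true_share) / n_voters)
    z = (true_share - observed_share) / se
    return z, normal_tail(z), 2 * normal_tail(z)


z, one_sided, two_sided = brown_discrepancy()
print(f"z = {z:.2f}, one-sided p = {one_sided:.4f}, two-sided p = {two_sided:.4f}")
```

Under these assumptions the gap is a bit over three standard errors, and the one-sided and two-sided tail probabilities bracket the "1 or 2 in 1000" range. A different sample size would move the numbers.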

More impressively, 85% of respondents claimed to have voted. Only 36% of those eligible in Auckland actually voted. The standard polling margin of error is ‘two sigma’, twice the standard deviation.  We’ve seen the physicists talk about ‘5 sigma’ or ‘7 sigma’ discrepancies as strong evidence for new phenomena, and the operations management people talk about ‘six sigma’ with the goal of essentially ruling out defects due to unmanaged variability.  When the population value is 36% and the observed value is 85%, that’s a 16 sigma discrepancy.
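The sigma arithmetic can be sketched the same way. Again the sample size of 250 is an assumption chosen to be consistent with a maximum margin of error of a little over 6%, not a number from the story, and the function names are hypothetical.

```python
import math

# ASSUMPTION: sample size is not reported; 250 is consistent with a
# maximum margin of error of a little over 6%.
ASSUMED_SAMPLE_SIZE = 250


def max_margin_of_error(n):
    """Two-sigma margin of error at p = 0.5, where it is largest."""
    return 2 * math.sqrt(0.5 * 0.5 / n)


def sigmas(observed, population, n):
    """How many standard deviations the observed proportion is from the population value."""
    sd = math.sqrt(population * (1 - population) / n)
    return (observed - population) / sd


moe = max_margin_of_error(ASSUMED_SAMPLE_SIZE)
turnout_sigmas = sigmas(0.85, 0.36, ASSUMED_SAMPLE_SIZE)
print(f"maximum margin of error: {moe:.1%}")
print(f"claimed vs actual turnout: {turnout_sigmas:.1f} sigma")
```

With these inputs the margin of error comes out a little over 6% and the turnout discrepancy at about 16 standard deviations.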

The text of the story says ‘Auckland voters’, not ‘Aucklanders’, so I checked to make sure it wasn’t just that 12.4% of the people voted in the election but didn’t vote for mayor. That explanation doesn’t seem to work either: only 2.5% of mayoral ballots were blank or informal. Nor does it work if you assume the sample was people who voted in the last national election. Digipoll are a respectable polling company, which is why I find it hard to believe there isn’t a simple explanation, but if so it isn’t in the Herald story. I’m a bit handicapped by the fact that the University of Texas internet system bizarrely decides to block the Digipoll website.

So, how could the poll be so badly wrong? It’s unlikely to just be due to bad sampling — you could do better with a random poll of half a dozen people. There’s got to be a fairly significant contribution from people whose recall of the 2013 election is not entirely accurate, or to put it more bluntly, some of the respondents were telling porkies.  Unfortunately, that makes it hard to tell if results for any of the other questions bear even the slightest relationship to the truth.

 

 

 

March 19, 2014

Revised Super 15 Predictions for Round 6

A mistake in my code assigned the wrong country to the Sharks, so the previously posted prediction for the Bulls versus Sharks game was miscalculated. I now have the Sharks to win by 0.10 points.

All the other predictions for the round are unchanged.

Team Ratings for Round 6

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

| Team | Current Rating | Rating at Season Start | Difference |
|---|---|---|---|
| Sharks | 6.52 | 4.57 | 2.00 |
| Crusaders | 6.07 | 8.80 | -2.70 |
| Chiefs | 5.30 | 4.38 | 0.90 |
| Brumbies | 4.19 | 4.12 | 0.10 |
| Bulls | 3.92 | 4.87 | -1.00 |
| Waratahs | 3.65 | 1.67 | 2.00 |
| Stormers | 1.95 | 4.38 | -2.40 |
| Hurricanes | -0.22 | -1.44 | 1.20 |
| Reds | -0.28 | 0.58 | -0.90 |
| Blues | -1.90 | -1.92 | 0.00 |
| Cheetahs | -3.35 | 0.12 | -3.50 |
| Highlanders | -3.85 | -4.48 | 0.60 |
| Lions | -4.15 | -6.93 | 2.80 |
| Force | -4.65 | -5.37 | 0.70 |
| Rebels | -6.20 | -6.36 | 0.20 |

 

Performance So Far

So far there have been 29 matches played, 20 of which were correctly predicted, a success rate of 69%.

Here are the predictions for last week’s games.

|   | Game | Date | Score | Prediction | Correct |
|---|---|---|---|---|---|
| 1 | Chiefs vs. Stormers | Mar 14 | 36 – 20 | 6.10 | TRUE |
| 2 | Rebels vs. Crusaders | Mar 14 | 19 – 25 | -8.70 | TRUE |
| 3 | Hurricanes vs. Cheetahs | Mar 15 | 60 – 27 | 3.80 | TRUE |
| 4 | Highlanders vs. Force | Mar 15 | 29 – 31 | 5.80 | FALSE |
| 5 | Brumbies vs. Waratahs | Mar 15 | 28 – 23 | 2.70 | TRUE |
| 6 | Lions vs. Blues | Mar 15 | 39 – 36 | 1.50 | TRUE |
| 7 | Sharks vs. Reds | Mar 15 | 26 – 6 | 7.80 | TRUE |
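For anyone wanting to see how the "Correct" column and the success rate relate to the scores, here is a minimal sketch. It is not the code used to produce the predictions. It assumes a prediction counts as correct when the predicted margin and the actual home-minus-away margin have the same sign, and that a drawn match would count as incorrect. That treatment of draws is an assumption, not something stated in the post.

```python
# (home, away, home score, away score, predicted margin), copied from last week's table.
LAST_WEEK = [
    ("Chiefs", "Stormers", 36, 20, 6.10),
    ("Rebels", "Crusaders", 19, 25, -8.70),
    ("Hurricanes", "Cheetahs", 60, 27, 3.80),
    ("Highlanders", "Force", 29, 31, 5.80),
    ("Brumbies", "Waratahs", 28, 23, 2.70),
    ("Lions", "Blues", 39, 36, 1.50),
    ("Sharks", "Reds", 26, 6, 7.80),
]


def is_correct(home_score, away_score, predicted_margin):
    """True when predicted and actual margins have the same sign.

    ASSUMPTION: a draw (actual margin 0) is scored as incorrect.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


flags = [is_correct(hs, aws, margin) for _, _, hs, aws, margin in LAST_WEEK]
print(f"last week: {sum(flags)} of {len(flags)} correct")

# Season to date, using the totals quoted in the post.
season_rate = 20 / 29
print(f"season: {season_rate:.0%}")
```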

 

Predictions for Round 6

Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

|   | Game | Date | Winner | Prediction |
|---|---|---|---|---|
| 1 | Highlanders vs. Hurricanes | Mar 21 | Hurricanes | -1.10 |
| 2 | Waratahs vs. Rebels | Mar 21 | Waratahs | 12.40 |
| 3 | Blues vs. Cheetahs | Mar 22 | Blues | 5.40 |
| 4 | Brumbies vs. Stormers | Mar 22 | Brumbies | 6.20 |
| 5 | Force vs. Chiefs | Mar 22 | Chiefs | -5.90 |
| 6 | Lions vs. Reds | Mar 22 | Lions | 0.10 |
| 7 | Bulls vs. Sharks | Mar 22 | Sharks | -0.10 |
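The sign convention is easy to get backwards, so here is a small illustrative sketch of how the "Winner" column follows from the predicted margin. The function name is made up, and returning None for a margin of exactly zero is an assumption, since the post doesn't say how that case is handled.

```python
# (home, away, predicted margin, listed winner), copied from the Round 6 table.
ROUND_6 = [
    ("Highlanders", "Hurricanes", -1.10, "Hurricanes"),
    ("Waratahs", "Rebels", 12.40, "Waratahs"),
    ("Blues", "Cheetahs", 5.40, "Blues"),
    ("Brumbies", "Stormers", 6.20, "Brumbies"),
    ("Force", "Chiefs", -5.90, "Chiefs"),
    ("Lions", "Reds", 0.10, "Lions"),
    ("Bulls", "Sharks", -0.10, "Sharks"),
]


def predicted_winner(home, away, margin):
    """Positive margin: home team wins. Negative margin: away team wins.

    ASSUMPTION: a margin of exactly zero gives no predicted winner.
    """
    if margin > 0:
        return home
    if margin < 0:
        return away
    return None


for home, away, margin, listed in ROUND_6:
    print(f"{home} vs. {away}: {predicted_winner(home, away, margin)} ({margin:+.2f})")
```

Games like Lions vs. Reds and Bulls vs. Sharks show how little separates the two sides when the margin is 0.10 either way.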

 

NRL Predictions for Round 3

Team Ratings for Round 3

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

| Team | Current Rating | Rating at Season Start | Difference |
|---|---|---|---|
| Roosters | 12.30 | 12.35 | -0.00 |
| Rabbitohs | 8.09 | 5.82 | 2.30 |
| Sea Eagles | 8.07 | 9.10 | -1.00 |
| Storm | 7.19 | 7.64 | -0.40 |
| Cowboys | 4.04 | 6.01 | -2.00 |
| Bulldogs | 3.73 | 2.46 | 1.30 |
| Knights | 1.16 | 5.23 | -4.10 |
| Panthers | 0.84 | -2.48 | 3.30 |
| Titans | -1.49 | 1.45 | -2.90 |
| Sharks | -1.60 | 2.32 | -3.90 |
| Broncos | -2.37 | -4.69 | 2.30 |
| Dragons | -4.18 | -7.57 | 3.40 |
| Warriors | -5.81 | -0.72 | -5.10 |
| Raiders | -5.86 | -8.99 | 3.10 |
| Wests Tigers | -8.36 | -11.26 | 2.90 |
| Eels | -17.54 | -18.45 | 0.90 |

 

Performance So Far

So far there have been 16 matches played, 6 of which were correctly predicted, a success rate of 37.5%.

Here are the predictions for last week’s games.

|   | Game | Date | Score | Prediction | Correct |
|---|---|---|---|---|---|
| 1 | Sea Eagles vs. Rabbitohs | Mar 14 | 14 – 12 | 5.20 | TRUE |
| 2 | Broncos vs. Cowboys | Mar 14 | 16 – 12 | -3.40 | FALSE |
| 3 | Warriors vs. Dragons | Mar 15 | 12 – 31 | 7.40 | FALSE |
| 4 | Storm vs. Panthers | Mar 15 | 18 – 17 | 13.10 | TRUE |
| 5 | Roosters vs. Eels | Mar 15 | 56 – 4 | 30.50 | TRUE |
| 6 | Titans vs. Wests Tigers | Mar 16 | 12 – 42 | 19.40 | FALSE |
| 7 | Knights vs. Raiders | Mar 16 | 20 – 26 | 15.30 | FALSE |
| 8 | Bulldogs vs. Sharks | Mar 17 | 42 – 4 | 4.10 | TRUE |

 

Predictions for Round 3

Here are the predictions for Round 3. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

|   | Game | Date | Winner | Prediction |
|---|---|---|---|---|
| 1 | Wests Tigers vs. Rabbitohs | Mar 21 | Rabbitohs | -12.00 |
| 2 | Broncos vs. Roosters | Mar 21 | Roosters | -10.20 |
| 3 | Panthers vs. Bulldogs | Mar 22 | Panthers | 1.60 |
| 4 | Sharks vs. Dragons | Mar 22 | Sharks | 7.10 |
| 5 | Cowboys vs. Warriors | Mar 22 | Cowboys | 14.40 |
| 6 | Sea Eagles vs. Eels | Mar 23 | Sea Eagles | 30.10 |
| 7 | Raiders vs. Titans | Mar 23 | Raiders | 0.10 |
| 8 | Storm vs. Knights | Mar 24 | Storm | 10.50 |

 

March 18, 2014

Your gut instinct needs a balanced diet

I linked earlier to Jeff Leek’s post on fivethirtyeight.com, because I thought it talked sensibly about assessing health news stories, and how to find and read the actual research sources.

While on the bus, I had a Twitter conversation with Hilda Bastian, who had read the piece (not through StatsChat) and was Not Happy. On rereading, I think her points were good ones, so I’m going to try to explain what I like and don’t like about the piece. In the end, I think she and I had opposite initial reactions to the piece from the same starting point: the importance of separating what you believe in advance from what the data tell you.

Briefly

  • At Pantheon, a visualisation of globally known people over time

[Update: Just had to add this one from a Huffington Post survey: Nearly a quarter of Americans know what we should do about the Ukraine Administrative Adjustment Act of 2005]

Big Data & privacy presentation

If you have time, there’s an interesting event that will be streamed from New York University this (NZ) morning (10:30am today NZ time, 5:30pm yesterday NY time):

..the Data & Society Research Institute, the White House Office of Science and Technology Policy, and New York University’s Information Law Institute will be co-hosting a public event entitled The Social, Cultural, & Ethical Dimensions of “Big Data.” The purpose of this event is to convene key stakeholders and thought leaders from across academia, government, industry, and civil society to examine the social, cultural, and ethical implications of “big data,” with an eye to both the challenges and opportunities presented by the phenomenon.

The event is being organised by danah boyd, who we’ve mentioned a few times and whose new book I plan to write about soon.

Three fifths of five eighths of not very much at all

The latest BNZ-REINZ Residential Market Survey is out, and the Herald has even embedded the full document in their online story, which is a very promising change.

According to the report, 6.4% of home sales in March were to offshore buyers, 25% of whom were Chinese. 25% of 6.4% is 1.6%.

If you look at real estate statistics (eg, here) for last month you find 6125 residential sales through agents across NZ. 25% of 6.4% of 6125 is 98. That’s not a very big number.  For context, in the most recent month available, about 1500 new dwellings were consented.
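The arithmetic is short enough to check by hand, but here it is spelled out as a sketch. All four inputs are the figures quoted above, and the variable names are just labels for illustration.

```python
# Figures quoted in the post.
offshore_share = 0.064             # share of sales to offshore buyers (survey estimate)
chinese_share_of_offshore = 0.25   # share of those offshore buyers who were Chinese
agent_sales_last_month = 6125      # residential sales through agents across NZ
new_dwellings_consented = 1500     # approximate, most recent month available

# Share of all sales going to offshore Chinese buyers: 25% of 6.4%.
chinese_share_of_all = offshore_share * chinese_share_of_offshore

# Implied number of sales in a month.
implied_sales = chinese_share_of_all * agent_sales_last_month

print(f"share of all sales: {chinese_share_of_all:.1%}")
print(f"implied sales: {implied_sales:.0f}")
print(f"as a fraction of new dwellings consented: {implied_sales / new_dwellings_consented:.1%}")
```

So the implied number of sales is under 100 in a month when roughly fifteen times as many new dwellings were consented.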

You also find, looking at the real estate statistics, that last month was February, not March. The BNZ-REINZ Residential Market Survey is not an actual measurement: the estimates are averages of round numbers based on the opinions of real-estate agents across the country. Even if we assume the agents know which buyers are offshore investors as opposed to recent or near-future immigrants (they estimate 41% of the foreign buyers will move here), it’s pretty rough data. To make it worse, the question on this topic just changed, so trends are even harder to establish.

That’s probably why the report said in the front-page summary “one would struggle, statistically-speaking, to conclude there is a lift or decline in foreign buying of NZ houses.”

The Herald  boldly took up that struggle.

Seven sigma?

The cosmologists are excited today, and there is data visualisation all over my Twitter feed.

That’s a nice display of uncertainty at different levels of evidence, before (red) and after (blue) adding new data. To get some idea of what is greater than zero and why they care, read the post by our upstairs neighbour Richard Easther (head of the Physics department).