Posts from March 2015 (48)

March 9, 2015

Not all there

One of the most common problems with data is that it’s not there. Families don’t answer their phones, over-worked nurses miss some forms, and even tireless electronic recorders have power failures.

There’s a large field of statistical research devoted to ways of fixing the missing-data problem. None of them work (that’s not my cynical opinion, that’s a mathematical theorem), but many of them are more likely to make things better than worse. The best way to handle data you don’t have depends on what sort of data it is and why you don’t have it, but even the best ways can confuse people who aren’t paying attention.

Just ignoring the missing data problem and treating the data you have as all the data is effectively assuming the missing data look just like the observed data. This is often very implausible. For example, in a weight-loss study it is much more likely that people who aren’t losing weight will drop out. If you just analyse data from people who stay in the study and follow all your instructions, unless this is nearly everyone, they will probably have lost weight (on average) even if your treatment is just staring at a container of felt-tip pens.
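To make the dropout argument concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the function name `simulate_dropout`, the 3 kg spread in weight change, and the dropout probabilities (60% for people who gained weight, 20% for people who lost it) are invented, not taken from any real study. The treatment is set up to have exactly zero effect.

```python
import random
import statistics


def simulate_dropout(n=10_000, seed=1):
    """Simulate a weight-loss trial whose treatment has no effect at all."""
    rng = random.Random(seed)
    # True weight change in kg, centred on zero: the treatment does nothing.
    changes = [rng.gauss(0.0, 3.0) for _ in range(n)]

    completers = []
    for change in changes:
        # Assumed dropout rule (made up for illustration): people who gained
        # weight drop out 60% of the time, people who lost weight only 20%.
        p_drop = 0.6 if change > 0 else 0.2
        if rng.random() >= p_drop:
            completers.append(change)

    return (
        statistics.mean(changes),      # everyone, including dropouts
        statistics.mean(completers),   # only the people we get to observe
        len(completers) / n,           # fraction who stayed in the study
    )


if __name__ == "__main__":
    everyone, observed, kept = simulate_dropout()
    print(f"mean change, all participants: {everyone:+.2f} kg")
    print(f"mean change, completers only:  {observed:+.2f} kg")
    print(f"fraction completing:           {kept:.0%}")
```

Under these assumed numbers the average over all participants sits near zero, while the average over completers is clearly negative, so a completers-only analysis would credit a useless treatment with weight loss.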

That’s why it is often sensible to treat missing observations as if they were bad. The Ministry of Health drinking water standards do this. For example, they say that only 96.7% of New Zealand received water complying with the bacteriological standards. That sounds serious. Of the 3.3% failures, however, more than half (2.0%) were just failures to monitor thoroughly enough, and only 0.1% had E. coli transgressions that were not followed up by immediate corrective action.

From a regulatory point of view, lumping these together makes sense. The Ministry doesn’t want to create incentives for data to ‘accidentally’ go missing whenever there’s a problem. From a public health point of view, though, you can get badly confused if you just look at the headline compliance figure and don’t read down to page 18.

The Ministry takes a similarly conservative approach to the other standards, and the detailed explanations are more reassuring than the headline compliance figures. There are a small number of water supplies with worrying levels of arsenic — enough to increase lifetime cancer risk by a tenth of a percentage point or so — but in general the biggest problem is inadequate fluoride concentrations in drinking water for nearly half of Kiwi kids.

 

March 5, 2015

Showing us the money

The Herald is running a project to crowdsource data entry and annotation for NZ political donations and expenses: it’s something that’s hard to automate and where local knowledge is useful. Today, they have an interactive graph for 2014 election donations and have made the data available.


Briefly

March 4, 2015

NRL Predictions for Round 1

Team Ratings for Round 1

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Rabbitohs                13.06                   13.06       -0.00
Cowboys                   9.52                    9.52       -0.00
Roosters                  9.09                    9.09       -0.00
Storm                     4.36                    4.36        0.00
Broncos                   4.03                    4.03       -0.00
Panthers                  3.69                    3.69       -0.00
Warriors                  3.07                    3.07       -0.00
Sea Eagles                2.68                    2.68        0.00
Bulldogs                  0.21                    0.21        0.00
Knights                  -0.28                   -0.28       -0.00
Dragons                  -1.74                   -1.74       -0.00
Raiders                  -7.09                   -7.09       -0.00
Eels                     -7.19                   -7.19       -0.00
Titans                   -8.20                   -8.20        0.00
Sharks                  -10.76                  -10.76       -0.00
Wests Tigers            -13.13                  -13.13       -0.00

 

Predictions for Round 1

Here are the predictions for Round 1. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

No.  Game                     Date    Winner      Prediction
1    Broncos vs. Rabbitohs    Mar 05  Rabbitohs        -6.00
2    Eels vs. Sea Eagles      Mar 06  Sea Eagles       -6.90
3    Cowboys vs. Roosters     Mar 07  Cowboys           3.40
4    Knights vs. Warriors     Mar 07  Knights           0.70
5    Titans vs. Wests Tigers  Mar 07  Titans            7.90
6    Panthers vs. Bulldogs    Mar 08  Panthers          6.50
7    Sharks vs. Raiders       Mar 08  Raiders          -0.70
8    Dragons vs. Storm        Mar 09  Storm            -3.10
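The sign convention for the prediction column can be written out as a few lines of code. This is only a sketch of how to read the table, not the rating method itself; the function name `predicted_winner` is hypothetical, and returning `None` for a margin of exactly zero is an assumption, since the text does not say what happens in that case.

```python
def predicted_winner(home, away, margin):
    """Pick the winner from an expected points difference (home minus away)."""
    if margin > 0:
        return home   # positive margin: home team expected to win
    if margin < 0:
        return away   # negative margin: away team expected to win
    return None       # assumption: a zero margin gives no pick


# Round 1 fixtures as listed above: (home, away, prediction, listed winner).
ROUND_1 = [
    ("Broncos", "Rabbitohs", -6.00, "Rabbitohs"),
    ("Eels", "Sea Eagles", -6.90, "Sea Eagles"),
    ("Cowboys", "Roosters", 3.40, "Cowboys"),
    ("Knights", "Warriors", 0.70, "Knights"),
    ("Titans", "Wests Tigers", 7.90, "Titans"),
    ("Panthers", "Bulldogs", 6.50, "Panthers"),
    ("Sharks", "Raiders", -0.70, "Raiders"),
    ("Dragons", "Storm", -3.10, "Storm"),
]

if __name__ == "__main__":
    for home, away, margin, listed in ROUND_1:
        pick = predicted_winner(home, away, margin)
        print(f"{home} vs. {away}: {pick} (listed: {listed})")
```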

 

Super 15 Predictions for Round 4

Team Ratings for Round 4

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Waratahs                  8.16                   10.00       -1.80
Crusaders                 7.07                   10.42       -3.30
Hurricanes                5.52                    2.89        2.60
Chiefs                    4.08                    2.23        1.90
Brumbies                  3.84                    2.20        1.60
Sharks                    2.81                    3.91       -1.10
Stormers                  2.80                    1.68        1.10
Bulls                     1.82                    2.88       -1.10
Blues                     0.43                    1.44       -1.00
Highlanders              -2.53                   -2.54        0.00
Cheetahs                 -4.29                   -5.55        1.30
Lions                    -4.33                   -3.39       -0.90
Force                    -5.17                   -4.67       -0.50
Reds                     -5.91                   -4.98       -0.90
Rebels                   -7.29                   -9.53        2.20

 

Performance So Far

So far there have been 21 matches played, 13 of which were correctly predicted, a success rate of 61.9%.

Here are the predictions for last week’s games.

No.  Game                  Date    Score    Prediction  Correct
1    Highlanders vs. Reds  Feb 27  20 – 13        8.10  TRUE
2    Force vs. Hurricanes  Feb 27  13 – 42       -3.40  TRUE
3    Cheetahs vs. Blues    Feb 27  25 – 24       -0.50  FALSE
4    Chiefs vs. Crusaders  Feb 28  40 – 16       -1.80  FALSE
5    Rebels vs. Brumbies   Feb 28  15 – 20       -7.60  TRUE
6    Bulls vs. Sharks      Feb 28  43 – 35        2.20  TRUE
7    Lions vs. Stormers    Feb 28  19 – 20       -3.60  TRUE
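The "Correct" column and the success rate can be checked with a short sketch. It assumes a prediction counts as correct when the predicted margin and the actual margin (home score minus away score) have the same sign, and that a draw or a zero prediction counts as incorrect; the second part is a guess, since no such game appears in the table. The names `prediction_correct` and `success_rate` are hypothetical.

```python
def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins favour the same team."""
    actual_margin = home_score - away_score
    # Same sign means the same predicted and actual winner. A zero on either
    # side gives a product of zero and is treated as incorrect (assumption).
    return actual_margin * predicted_margin > 0


def success_rate(correct, played):
    """Percentage of matches predicted correctly, to one decimal place."""
    return round(100 * correct / played, 1)


# Last week's games from the table above:
# (home score, away score, prediction, listed "Correct" value).
LAST_WEEK = [
    (20, 13, 8.10, True),
    (13, 42, -3.40, True),
    (25, 24, -0.50, False),
    (40, 16, -1.80, False),
    (15, 20, -7.60, True),
    (43, 35, 2.20, True),
    (19, 20, -3.60, True),
]

if __name__ == "__main__":
    flags = [prediction_correct(h, a, m) for h, a, m, _ in LAST_WEEK]
    print(f"correct last week: {sum(flags)} of {len(flags)}")
    print(f"season so far: {success_rate(13, 21)}%")
```

Under that sign rule the computed flags agree with the listed column for all seven games, and 13 correct out of 21 gives the 61.9% quoted above.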

 

Predictions for Round 4

Here are the predictions for Round 4. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

No.  Game                    Date    Winner    Prediction
1    Chiefs vs. Highlanders  Mar 06  Chiefs         10.60
2    Brumbies vs. Force      Mar 06  Brumbies       13.00
3    Blues vs. Lions         Mar 07  Blues           9.30
4    Reds vs. Waratahs       Mar 07  Waratahs      -10.10
5    Cheetahs vs. Bulls      Mar 07  Bulls          -2.10
6    Stormers vs. Sharks     Mar 07  Stormers        4.00

 

March 2, 2015

A nice cuppa

Q: What do you think about this new research on tea preventing diabetes?

A: That’s not what it says.

Q: Sure it is. Big black letters, right at the top: “Three cups of tea a day can cut your risk of diabetes… even if you add milk”

A: I mean that’s not what the research says.

Q: The bit about milk?

A: Well, they didn’t study milk at all, but that’s not the main problem.

Q: They didn’t study cups?

A: No. Or diabetes. Or, in one of the studies, tea.

Q: Hmm. Ok, so this “glucose-lowering effect” they write about, is that a lab study?

A: Yes.

Q: Mice?

A: One of the studies used rats; the other didn’t.

Q: Cells, then?

A: No, just enzymes in a test tube, and a highly processed chemical extract of tea.

Q: Ok, forget about that one. But the rat study, that measured actual glucose lowering and actual tea?

A:  Almost. They gave the rats a high-sugar drink, and if they were given the tea first, their blood glucose didn’t go up as much.

Q: Which of the two studies was this one?

A: The one where the story just says the results were similar and doesn’t give the researchers’ names, only their institution.

Q: Wouldn’t you think the story would say more about this one, since it actually involves blood glucose and, like, living things?

A: In a perfect world, yes.

Q: The story says they don’t think milk would make a difference. What about sugar?

A: No mention of it.

Q: That’s strange. Quite a lot of British people have sugar in their tea. Wouldn’t it be helpful to say something?

A: You’d think.

Q: How much tea did the rats get?

A: The lowest effective dose they report is 62.5 mg/kg of freeze-dried tea powder.

Q: What’s that in cups?

A: The research paper says “corresponds to nine cups of black tea”.

Q: Per day?

A: No, all at once.

Q: So we need to get bigger cups?

A: Or fewer reprinted British ‘health’ stories.

Stat of the Week Competition: February 28 – March 6 2015

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with a chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday March 6 2015.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of February 28 – March 6 2015 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: February 28 – March 6 2015

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!