Stat of the Week Competition Discussion: April 11 – 17 2015
If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!
If you have to make a decision with several options, each with different types of positive and negative effects, it’s going to be hard. Techniques for breaking down complex decisions into sets of simpler questions are very valuable, but it’s important that the way you break down the problem and recombine the answers fits with how you answer the simpler questions.
I’ve been pointed to what looks like an unfortunate example from the NZTA, in assessing options for the Petone–Grenada link road to be constructed near Wellington. The road comes in two sections: from Petone to the eastern section of Lincolnshire Farm, and from there to Grenada. According to the scoping report (PDF), these can be decided independently of each other, so there’s an ideal opportunity to simplify the decision making. NZTA describes four options P1 to P4 for the first section, and four options A to D for the second section.
I would have expected them to just make independent recommendations for the two sections, but what they actually did was more complicated. First, they looked at the P options and decided based on four criteria that P4 was best. They then looked at A+P4, B+P4, C+P4, and D+P4 for the same four criteria, and said in a footnote (p172) “Upon combining one of Option P1, P2, P3 or P4 with one Option A, B, C or D the effect more towards the negative takes precedence.”
This can only make sense if the harms or benefits weren’t independent. Sometimes that’s possible. In particular, one of the criteria was “resilience”, and you might argue that it doesn’t matter how robust the second part of the road is when the first part is under several meters of rock and mud, or filled with bumper-to-bumper traffic jams. It could make sense to take the worst value of the two sections when assessing resilience: but people who know more about Wellington-area transport than I do still seem dubious.
The same argument certainly doesn’t apply for the other criteria: archaeological, ecological, landscape/visual impact, and transport benefit/cost. If one section of the road is an environmental nightmare, that doesn’t make the environmental impact of the other section unimportant. If one section of the road is unavoidably ugly, that doesn’t excuse making the other section ugly. If one section destroys an important heritage site, it doesn’t mean the other section doesn’t have to care about preservation of the past. If one section is ridiculously expensive it doesn’t mean the costs are unimportant for the other section.
The impact of decomposing and recombining the evaluation as they did is that any criterion where P4 was bad becomes much less important in choosing among options A to D. P4 was very bad on the landscape/visual criterion, and moderately bad on ecology.
By now you should be expecting the punch line: evaluated independently, options A and B look good because they score well on ecology and landscape/visual criteria. Evaluated in combination with P4, they look terrible, because the ecology and landscape benefits are masked by the “more negative” combining rule. That’s a problem with the combining rule, not with the road. Here’s a colour-coded version of the information in Table 23-19, p182 (from T. Duran)
Not only is the combining rule obviously missing some information, it’s not even internally consistent. If the evaluation had been done in the opposite order they might well have chosen A first, and then looked at A+P1 to A+P4. Even if D was what they’d chosen first, P3+D would then look slightly better than P4+D.
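To see how the masking happens, here is a minimal sketch in Python. The scores are hypothetical numbers on a -3 to +3 scale, made up purely for illustration and not taken from the NZTA report; `combine_worst` and `advantages` are names invented for this example.

```python
# Hypothetical scores from -3 (very bad) to +3 (very good).
# Illustrative only: these are NOT the ratings in the NZTA report.
CRITERIA = ["resilience", "ecology", "landscape", "cost"]

P4 = {"resilience": 2, "ecology": -2, "landscape": -3, "cost": 1}
A = {"resilience": 0, "ecology": 2, "landscape": 1, "cost": 0}
D = {"resilience": 1, "ecology": -2, "landscape": -3, "cost": 1}


def combine_worst(first, second):
    """Footnote rule: on each criterion the more negative effect takes precedence."""
    return {c: min(first[c], second[c]) for c in CRITERIA}


def advantages(x, y):
    """Criteria on which option x scores strictly better than option y."""
    return [c for c in CRITERIA if x[c] > y[c]]


# Evaluated on its own, A beats D on ecology and landscape.
print(advantages(A, D))

# Combined with P4 under the "more negative" rule, those advantages vanish,
# because P4 is already bad on exactly those criteria.
print(advantages(combine_worst(P4, A), combine_worst(P4, D)))
```

With these made-up numbers the first call lists ecology and landscape and the second lists nothing: the rule has thrown away everything that distinguished A from D on the criteria where P4 was bad.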
It’s very tempting to look for ways of combining preferences that don’t rely on numbers, just on orderings, but in most cases they aren’t available, and attempts to do it leave you worse off than before.
This evaluation wasn’t set up to focus only on resilience — even assuming that the resilience assessment is valid, which I hear is also being questioned — it was set up to value the four criteria equally. It really looks as though a minor detail of the approach to simplifying the evaluation has had a large, accidental effect on the result.
When quoting results of medical research there’s often confusion between odds and probabilities, but there are stories in the Herald and Stuff at the moment that illustrate the difference.
As you know (unless you’ve been on Mars with your eyes shut and your fingers in your ears), Jeremy Clarkson will no longer be presenting Top Gear, and the world is waiting with bated breath to hear about his successor. Coral, a British firm of bookmakers, say that Sue Perkins is the current favourite.
The Herald quotes the Daily Mail, and so gives the odds as odds:
It has made her evens for the role, ahead of former X-factor presenter Dermot O’Leary who is 2-1 and British model Jodie Kidd who is third at 5-2.
Stuff translates these into NZ gambling terms, quoting the dividend, which is the reciprocal of the probability at which these would be regarded as fair bets:
Bookmaker Coral have Perkins as the equivalent of a $2 favourite after a flurry of bets, while British-Irish presenter Dermot O’Leary was at $3 and television personality and fashion model Jodie Kidd at $3.50.
Odds of 5-2 mean that betting £2 and winning gives you a profit of £5. The NZ approach is to quote the total money you get back: a bet of $2 gets you $2 back plus $5 profit, for a total of $7, so a bet of $1 would get you $3.50.
The fair probability of winning at odds of 5-2 is 2/(5+2); the fair probability for a dividend of $3.50 is 1/3.50, the same number.
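For anyone who wants to check the arithmetic, here is a small Python sketch of the two conversions. The function names are made up for this example; the odds are the three quoted above.

```python
from fractions import Fraction


def dividend_from_odds(profit, stake):
    """Total returned per $1 bet for fractional odds 'profit-stake'."""
    return Fraction(profit + stake, stake)


def fair_probability_from_odds(profit, stake):
    """Win probability at which fractional odds 'profit-stake' are a fair bet."""
    return Fraction(stake, profit + stake)


def fair_probability_from_dividend(dividend):
    """Win probability at which a dividend (total return per $1) is a fair bet."""
    return 1 / Fraction(dividend)


# Evens is 1-1; the other two quotes are 2-1 and 5-2.
for name, profit, stake in [("Perkins", 1, 1), ("O'Leary", 2, 1), ("Kidd", 5, 2)]:
    dividend = dividend_from_odds(profit, stake)
    prob = fair_probability_from_odds(profit, stake)
    # The two routes to the probability agree exactly.
    assert prob == fair_probability_from_dividend(dividend)
    print(f"{name}: {profit}-{stake} -> ${float(dividend):.2f}, probability {prob}")
```

Using exact fractions rather than floating point makes it easy to see that the odds and the dividend really do encode the same number.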
Of course, if these were fair bets the bookies would go out of business: the actual implied probability for Jodie Kidd is lower than 1/3.5 and the actual implied probability for Sue Perkins is lower than 0.5. On top of that, there is no guarantee the betting public is well calibrated on this issue.
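One quick way to see that these can't all be fair bets: only one person gets the job, so fair probabilities for different candidates can't add up to more than 1. A short Python check, using only the three quotes above (the variable names are mine):

```python
from fractions import Fraction

# Fair-bet probabilities implied by evens (1-1), 2-1 and 5-2.
quoted = {
    "Perkins": Fraction(1, 2),
    "O'Leary": Fraction(1, 3),
    "Kidd": Fraction(2, 7),
}

total = sum(quoted.values())
print(total, float(total))
```

The total is 47/42, about 1.12, which is already more than 1 before counting any other candidate. The excess is where the bookmaker's margin comes from.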
Number of learner license tests taken in New Zealand, according to One News.
We’ll follow up to see if the future prediction part of the graph turns out to be correct.
From the Herald (originally from the Independent)
Short people are at a greater risk of heart attack – and there’s little they can do about it because the link is genetic.
This one is partly the fault of the researchers and partly the fault of the journalists. The press release says
“We have shown that the association between shorter height and higher risk of coronary heart disease is a primary relationship and is not due to confounding factors such as nutrition or poor socioeconomic conditions.”
That’s partly true, and new and interesting, but (a) it’s being oversold (“the” association?) and (b) even if it were completely true, it wouldn’t imply the “there’s little they can do about it” added by the journalists.
Taking the second point first: knowing that something has a genetic component tells you absolutely nothing about how easy or hard it is to change. At a biological level hair colour and eye colour have similar degrees of genetic influence, but one of them is very easy to change and the other is more difficult and inconvenient.
Also, it’s certainly not true that height is entirely genetically determined. There is a genetic component: tall people have tall children. There is also an environmental component: most people are taller than their grandparents. Here’s a graph (source) showing how the heights of Dutch people changed over sixty years: the Dutch went from some of the shortest people in Europe to some of the tallest, and this was an environmental change, not a genetic change.
The research paper doesn’t even claim that among modern Westerners the association between height and heart attack risk is all genetic, though if you only have the press release you have to read carefully to avoid getting that impression. Even within the (fairly homogeneous) groups of people being studied, the genetic variants they used explain only about 10% of the variation in height.
What’s new in this research is that some of the relationship between height and heart attack risk is genetic. Until now, it was possible that all the association was explained by environmental factors in childhood or before birth that made people shorter and also, separately, increased their heart attack risk.
For the part of the relationship explained by genetic variation there are basically three possible sorts of explanation:
These are all interesting, and there’s a reasonable hope of being able to separate them out with more data and experiments.
The last sentence of the research paper is a good counterpoint to the media coverage
More generally, our findings underscore the complexity underlying the inherited component of CAD.
[Disclosure: I work with one of the cohorts that is part of one of the consortia that is part of the whole Cardiogram group and I know some of the researchers — but that would be true of anyone in the field]
The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.
Team | Current Rating | Rating at Season Start | Difference |
---|---|---|---|
Rabbitohs | 11.30 | 13.06 | -1.80 |
Roosters | 9.06 | 9.09 | -0.00 |
Cowboys | 6.51 | 9.52 | -3.00 |
Storm | 5.32 | 4.36 | 1.00 |
Broncos | 4.51 | 4.03 | 0.50 |
Panthers | 2.88 | 3.69 | -0.80 |
Bulldogs | 1.52 | 0.21 | 1.30 |
Warriors | 1.47 | 3.07 | -1.60 |
Knights | 0.29 | -0.28 | 0.60 |
Dragons | -1.66 | -1.74 | 0.10 |
Sea Eagles | -2.21 | 2.68 | -4.90 |
Eels | -5.31 | -7.19 | 1.90 |
Raiders | -6.34 | -7.09 | 0.70 |
Wests Tigers | -7.54 | -13.13 | 5.60 |
Sharks | -8.87 | -10.76 | 1.90 |
Titans | -9.60 | -8.20 | -1.40 |
So far there have been 40 matches played, 22 of which were correctly predicted, a success rate of 55%.
Here are the predictions for last week’s games.
No. | Game | Date | Score | Prediction | Correct |
---|---|---|---|---|---|
1 | Bulldogs vs. Rabbitohs | Apr 03 | 17 – 18 | -7.80 | TRUE |
2 | Titans vs. Broncos | Apr 03 | 16 – 26 | -11.30 | TRUE |
3 | Knights vs. Dragons | Apr 04 | 0 – 13 | 7.80 | FALSE |
4 | Sea Eagles vs. Raiders | Apr 04 | 16 – 29 | 10.30 | FALSE |
5 | Roosters vs. Sharks | Apr 05 | 12 – 20 | 25.40 | FALSE |
6 | Eels vs. Wests Tigers | Apr 06 | 6 – 22 | 8.60 | FALSE |
7 | Panthers vs. Cowboys | Apr 06 | 10 – 30 | 2.40 | FALSE |
8 | Storm vs. Warriors | Apr 06 | 30 – 14 | 6.50 | TRUE |
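The Correct column can be reproduced from the other columns. The sketch below assumes the first-named team is the home team, that the score is given as home – away, and that a positive prediction means a home win and a negative one an away win; treating a draw as an incorrect prediction is my assumption, not something stated in the tables.

```python
# (home, away, home_score, away_score, predicted_margin), copied from the table above.
# Assumption: the first-named team is at home and the score is home - away.
games = [
    ("Bulldogs", "Rabbitohs", 17, 18, -7.80),
    ("Titans", "Broncos", 16, 26, -11.30),
    ("Knights", "Dragons", 0, 13, 7.80),
    ("Sea Eagles", "Raiders", 16, 29, 10.30),
    ("Roosters", "Sharks", 12, 20, 25.40),
    ("Eels", "Wests Tigers", 6, 22, 8.60),
    ("Panthers", "Cowboys", 10, 30, 2.40),
    ("Storm", "Warriors", 30, 14, 6.50),
]


def is_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins have the same sign.

    A draw (actual margin of zero) counts as incorrect here, by assumption.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


results = [is_correct(hs, as_, pred) for _, _, hs, as_, pred in games]
print(results)
print(f"{sum(results)} of {len(results)} correct")
```

Under those assumptions the output matches the Correct column: three of the eight games were picked correctly.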
Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.
No. | Game | Date | Winner | Prediction |
---|---|---|---|---|
1 | Broncos vs. Roosters | Apr 10 | Roosters | -1.50 |
2 | Sharks vs. Knights | Apr 10 | Knights | -6.20 |
3 | Eels vs. Titans | Apr 11 | Eels | 7.30 |
4 | Panthers vs. Sea Eagles | Apr 11 | Panthers | 8.10 |
5 | Warriors vs. Wests Tigers | Apr 11 | Warriors | 13.00 |
6 | Dragons vs. Bulldogs | Apr 12 | Bulldogs | -0.20 |
7 | Raiders vs. Storm | Apr 12 | Storm | -8.70 |
8 | Rabbitohs vs. Cowboys | Apr 13 | Rabbitohs | 7.80 |
The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.
Team | Current Rating | Rating at Season Start | Difference |
---|---|---|---|
Crusaders | 10.43 | 10.42 | 0.00 |
Waratahs | 8.34 | 10.00 | -1.70 |
Hurricanes | 5.72 | 2.89 | 2.80 |
Brumbies | 4.58 | 2.20 | 2.40 |
Chiefs | 3.79 | 2.23 | 1.60 |
Bulls | 2.49 | 2.88 | -0.40 |
Stormers | 2.03 | 1.68 | 0.30 |
Sharks | 0.17 | 3.91 | -3.70 |
Blues | 0.09 | 1.44 | -1.30 |
Highlanders | -0.23 | -2.54 | 2.30 |
Lions | -3.32 | -3.39 | 0.10 |
Force | -4.56 | -4.67 | 0.10 |
Cheetahs | -7.14 | -5.55 | -1.60 |
Rebels | -7.20 | -9.53 | 2.30 |
Reds | -8.20 | -4.98 | -3.20 |
So far there have been 53 matches played, 36 of which were correctly predicted, a success rate of 67.9%.
Here are the predictions for last week’s games.
No. | Game | Date | Score | Prediction | Correct |
---|---|---|---|---|---|
1 | Hurricanes vs. Stormers | Apr 03 | 25 – 20 | 8.90 | TRUE |
2 | Rebels vs. Reds | Apr 03 | 23 – 15 | 4.30 | TRUE |
3 | Chiefs vs. Blues | Apr 04 | 23 – 16 | 7.80 | TRUE |
4 | Brumbies vs. Cheetahs | Apr 04 | 20 – 3 | 16.00 | TRUE |
5 | Sharks vs. Crusaders | Apr 04 | 10 – 52 | -1.60 | TRUE |
6 | Lions vs. Bulls | Apr 04 | 22 – 18 | -2.70 | FALSE |
Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.
No. | Game | Date | Winner | Prediction |
---|---|---|---|---|
1 | Blues vs. Brumbies | Apr 10 | Blues | 0.00 |
2 | Crusaders vs. Highlanders | Apr 11 | Crusaders | 14.70 |
3 | Waratahs vs. Stormers | Apr 11 | Waratahs | 10.80 |
4 | Force vs. Cheetahs | Apr 11 | Force | 7.10 |
5 | Bulls vs. Reds | Apr 11 | Bulls | 15.20 |
6 | Lions vs. Sharks | Apr 11 | Lions | 0.50 |
Me: Either that will be warranted by the data, in which case it’s a good thing, or it won’t, in which case I’m a bad statistician. Are you saying I’m a bad statistician?
First, from Mother Jones magazine, via Twitter
The impact of the carbon tax looks impressive, but this is a bar chart — it starts at zero and they’ve only shown the top fifth of it.
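To get a feel for how much a cut-off axis exaggerates, here is a rough Python sketch. The numbers are hypothetical (they are not the emissions figures in the chart), and the function names are invented for this example.

```python
def real_change(old, new):
    """Relative change in the quantity itself."""
    return (new - old) / old


def drawn_change(old, new, axis_min):
    """Relative change in bar height when the axis starts at axis_min, not zero."""
    return (new - old) / (old - axis_min)


# Hypothetical values: the axis starts at 80, so only the top fifth
# of a bar of height 100 is drawn.
old, new, axis_min = 100, 95, 80
print(f"real change:  {real_change(old, new):.0%}")
print(f"drawn change: {drawn_change(old, new, axis_min):.0%}")
```

With these made-up values a 5% fall in the quantity is drawn as a 25% fall in bar height, five times larger than it really is.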
They do link to the data, the quarterly Greenhouse Gas Inventory update. In that report, Figure 8 is
The dotted line is the same data as the bar chart, except that the dotted line has data for every quarter and the bar chart has data only for the July-September quarter each year. And the line chart has a wider range on the vertical axis — it doesn’t go down to zero, but it isn’t a bar chart, so it doesn’t have to. The other point about the line chart is that there’s a solid line there as well. The solid line is adjusted for seasonal variation and weather. If you wanted to know about real changes in how Australians are using energy, that’s the line you’d use.
Second, a beautiful map of CO2 emissions from fossil fuel combustion, from the Washington Post via Flowing Data
The ‘vertical’ scale here is a colour scale; what’s misleading is that it’s a logarithmic scale. The map makes it look as if a large fraction of CO2 emission comes from transporting stuff through empty areas, but the pale beige indicates emissions thousands of times lower than in the urban/suburban areas. Red ink isn’t anywhere close to being proportional to CO2.
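A small Python sketch of why a logarithmic colour scale has this effect. The range of 1 to 1,000,000 units is hypothetical, chosen only to make the arithmetic easy, and `ramp_position` is a name made up for this example; it is not how the map itself was produced.

```python
import math


def ramp_position(value, lo, hi, log=True):
    """Position of value along a colour ramp from 0 (palest) to 1 (darkest)."""
    if log:
        return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))
    return (value - lo) / (hi - lo)


# Hypothetical scale limits, not taken from the map.
lo, hi = 1, 1_000_000
value = hi / 1000  # a grid cell emitting a thousand times less than the maximum

print(f"log scale:    {ramp_position(value, lo, hi, log=True):.3f}")
print(f"linear scale: {ramp_position(value, lo, hi, log=False):.3f}")
```

On the log scale the low-emission cell sits halfway along the colour ramp; on a linear scale it would sit at about 0.001 and be indistinguishable from the background.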