Posts from April 2015 (35)

April 13, 2015

Stat of the Week Competition Discussion: April 11 – 17 2015

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

April 12, 2015

Reductionism and the Petone-Grenada link

If you have to make a decision with several options, each with different types of positive and negative effects, it’s going to be hard. Techniques for breaking down complex decisions into sets of simpler questions are very valuable, but it’s important that the way you break down the problem and recombine the answers fits with how you answer the simpler questions.

I’ve been pointed to what looks like an unfortunate example from the NZTA, in assessing options for the Petone–Grenada link road to be constructed near Wellington. The road comes in two sections: from Petone to the eastern section of Lincolnshire Farm, and from there to Grenada. According to the scoping report (PDF), these can be decided independently of each other, so there’s an ideal opportunity to simplify the decision making.  NZTA describes four options P1 to P4 for the first section, and four options A to D for the second section.

I would have expected them to just make independent recommendations for the two sections, but what they actually did was more complicated. First, they looked at the P options and decided based on four criteria that P4 was best.  They then looked at A+P4, B+P4, C+P4, and D+P4 for the same four criteria, and said in a footnote (p172) “Upon combining one of Option P1, P2, P3 or P4 with one Option A, B, C or D the effect more towards the negative takes precedence.”

This can only make sense if the harms or benefits weren’t independent.  Sometimes that’s possible. In particular, one of the criteria was “resilience”, and you might argue that it doesn’t matter how robust the second part of the road is when the first part is under several meters of rock and mud, or filled with bumper-to-bumper traffic jams. It could make sense to take the worst value of the two sections when assessing resilience: but people who know more about Wellington-area transport than I do still seem dubious.

The same argument certainly doesn’t apply for the other criteria: archaeological,  ecological,  landscape/visual impact, and transport benefit/cost. If one section of the road is an environmental nightmare, that doesn’t make the environmental impact of the other section unimportant. If one section of the road is unavoidably ugly, that doesn’t excuse making the other section ugly. If one section destroys an important heritage site, it doesn’t mean the other section doesn’t have to care about preservation of the past. If one section is ridiculously expensive it doesn’t mean the costs are unimportant for the other section.

The impact of decomposing and recombining the evaluation as they did is that any criterion where P4 was bad becomes much less important in choosing among options A to D. P4 was very bad on the landscape/visual criterion, and moderately bad on ecology.

By now you should be expecting the punch line: evaluated independently, options A and B look good because they score well on ecology and landscape/visual criteria. Evaluated in combination with P4, they look terrible, because the ecology and landscape benefits are masked by the “more negative” combining rule. That’s a problem with the combining rule, not with the road. Here’s a colour-coded version of the information in Table 23-19, p182 (from T. Duran)

[Figure: options evaluated separately and in combination]

Not only is the combining rule obviously missing some information, it’s not even internally consistent. If the evaluation had been done in the opposite order they might well have chosen A first, and then looked at A+P1 to A+P4. Even if D was what they’d chosen first, P3+D would then look slightly better than P4+D.
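To see how the “more negative takes precedence” rule can hide differences, here is a minimal sketch. The scores are made up for illustration on a -3 to +3 scale (they are not the values from the NZTA report), and the equal-weight total is just one simple way to summarise four criteria.

```python
# Hypothetical scores on a -3..+3 scale; NOT the values from the NZTA report.
CRITERIA = ["resilience", "ecology", "landscape", "cost"]

P4 = {"resilience": 1, "ecology": -2, "landscape": -3, "cost": 0}
A = {"resilience": -1, "ecology": 1, "landscape": 1, "cost": 0}
D = {"resilience": 1, "ecology": -1, "landscape": -2, "cost": 0}


def combine_worst(first, second):
    """The footnote's rule: for each criterion keep the more negative score."""
    return {c: min(first[c], second[c]) for c in CRITERIA}


def total(scores):
    """Equal weighting of the criteria (an assumption of this sketch)."""
    return sum(scores[c] for c in CRITERIA)


# Evaluated on their own, A beats D.
print(total(A), total(D))  # 1 -2

# Combined with P4, A's ecology and landscape advantages are masked
# by P4's bad scores, and D now looks better than A.
print(total(combine_worst(P4, A)), total(combine_worst(P4, D)))  # -6 -4
```

With these invented numbers the ranking of A and D flips purely because of the combining rule, which is the kind of effect described above.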

It’s very tempting to look for ways of combining preferences that don’t rely on numbers, just on orderings, but in most cases they aren’t available, and attempts to do it leave you worse off than before.

This evaluation wasn’t set up to focus only on resilience — even assuming that the resilience assessment is valid, which I hear is also being questioned — it was set up to value the four criteria equally. It really looks as though a minor detail of the approach to simplifying the evaluation has had a large, accidental effect on the result.

April 10, 2015

Briefly

  • A properly-conducted opinion poll in Cuba, done in secret. Impressive.
  • As the Herald reports, New Zealand moved from 1st to 5th on the index reported by Social Progress Imperative. The story also points out, helpfully, that a lot of this is changes in how things are measured.  It turns out this goes further:  a 2014 version of the index is available using the new measurements. When the same definitions are used for the two years, NZ stays at the same ranking (5th) and improves on the actual values (from 86.93 to 87.08).
  • JPMorgan is using workplace data to predict which employees are likely to ‘go rogue’. Matt Levine doesn’t really worry. The Bloomberg News story worries a bit, but only “Policing intentions can be a slippery slope. Do people get a scarlet letter for something they have yet to do?” They don’t seem to consider false positives: people who weren’t going to do anything wrong (or more wrong than is necessary if you work for an investment bank).
  • The NZ Association of Scientists is having a conference titled “Speaking Out: Going public on difficult issues”. There will probably be more stuff on line soon, but currently you can read an expanded version of Peter Gluckman’s talk, and listen to (NZAS President) Nicola Gaston on Radio NZ; the Twitter hashtag is 

Odds and probabilities

When quoting results of medical research there’s often confusion between odds and probabilities, but there are stories in the Herald and Stuff at the moment that illustrate the difference.

As you know (unless you’ve been on Mars with your eyes shut and your fingers in your ears), Jeremy Clarkson will no longer be presenting Top Gear, and the world is waiting with bated breath to hear about his successor.  Coral, a British firm of bookmakers, say that Sue Perkins is the current favourite.

The Herald quotes the Daily Mail, and so gives the odds as odds:

It has made her evens for the role, ahead of former X-factor presenter Dermot O’Leary who is 2-1 and British model Jodie Kidd who is third at 5-2.

Stuff translates these into NZ gambling terms, quoting the dividend, which is the reciprocal of the probability at which these would be regarded as fair bets:

Bookmaker Coral have Perkins as the equivalent of a $2 favourite after a flurry of bets, while British-Irish presenter Dermot O’Leary was at $3 and television personality and fashion model Jodie Kidd at $3.50.

Odds of 5-2 mean that betting £2 and winning gives you a profit of £5.  The NZ approach is to quote the total money you get back: a bet of $2 gets you $2 back plus $5 profit, for a total of $7, so a bet of $1 would get you $3.50.

The fair probability of winning for odds of 5-2 is 2/(5+2); the fair probability for a dividend of $3.50 is 1/3.50, the same number.
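The conversions can be written down in a few lines. This is a small sketch using exact fractions; the function names are mine, and “evens” is treated as odds of 1-1.

```python
from fractions import Fraction


def fair_probability(win, stake):
    """Fair probability of winning for fractional odds win-stake (e.g. 5-2)."""
    return Fraction(stake, win + stake)


def dividend(win, stake):
    """Total returned per $1 bet: the stake back plus the profit."""
    return Fraction(win + stake, stake)


quoted = {"Perkins": (1, 1), "O'Leary": (2, 1), "Kidd": (5, 2)}

for name, (win, stake) in quoted.items():
    p = fair_probability(win, stake)
    d = dividend(win, stake)
    # The dividend is the reciprocal of the fair probability.
    assert d == 1 / p
    print(f"{name}: odds {win}-{stake}, dividend ${float(d):.2f}, fair probability {p}")
```

The dividends come out as $2.00, $3.00 and $3.50, matching the figures Stuff quoted.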

Of course, if these were fair bets the bookies would go out of business: the actual implied probability for Jodie Kidd is lower than 1/3.5 and the actual implied probability for Sue Perkins is lower than 0.5.  On top of that, there is no guarantee the betting public is well calibrated on this issue.
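One way to see that the bets can't all be fair: add up the fair probabilities for just the three candidates quoted. The dividends below are the ones from the Stuff story; the rest is arithmetic.

```python
from fractions import Fraction

# Dividends quoted by Stuff for the top three candidates.
dividends = {
    "Perkins": Fraction(2),
    "O'Leary": Fraction(3),
    "Kidd": Fraction(7, 2),
}

# Probability at which each bet would be fair.
implied = {name: 1 / d for name, d in dividends.items()}

total = sum(implied.values())
print(total, float(total))  # 47/42, about 1.119
```

The three front-runners alone add up to more than 1, before counting anyone else in the market, so the real probabilities have to be lower than the fair-bet figures.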

 

April 9, 2015

Graph of the week

[Graph: learner license tests]

Number of learner license tests taken in New Zealand, according to One News.

We’ll follow up to see if the future prediction part of the graph turns out to be correct.

Height and heart attack: genetic determinism is still wrong

From the Herald (originally from the Independent)

Short people are at a greater risk of heart attack – and there’s little they can do about it because the link is genetic.

This one is partly the fault of the researchers and partly the fault of the journalists.  The press release says

“We have shown that the association between shorter height and higher risk of coronary heart disease is a primary relationship and is not due to confounding factors such as nutrition or poor socioeconomic conditions.”

That’s partly true, and new and interesting, but (a) it’s being oversold (“the” association?) and (b) even if it were completely true, it wouldn’t imply the “there’s little they can do about it” added by the journalists.

Taking the second point first: knowing that something has a genetic component tells you absolutely nothing about how easy or hard it is to change. At a biological level hair colour and eye colour have similar degrees of genetic influence, but one of them is very easy to change and the other is more difficult and inconvenient.

Also, it’s certainly not true that height is entirely genetically determined. There is a genetic component: tall people have tall children. There is also an environmental component: most people are taller than their grandparents.  Here’s a graph (source) showing how the heights of Dutch people changed over sixty years: the Dutch went from some of the shortest people in Europe to some of the tallest, and this was an environmental change, not a genetic change.

[Graph: heights of Dutch people over time]

The research paper doesn’t even claim that among modern Westerners the association between height and heart attack risk is all genetic, though if you only have the press release you have to read carefully to avoid getting that impression. Even within the (fairly homogeneous) groups of people being studied, the genetic variants they used explain only about 10% of the variation in height.

What’s new in this research is that some of the relationship between height and heart attack risk is genetic. Until now, it was possible that all the association was explained by environmental factors in childhood or before birth that made people shorter and also, separately, increased their heart attack risk.

For the part of the relationship explained by genetic variation there are basically three possible sorts of explanation:

  • Being short has some direct biological effect on risk: for example, smaller people have smaller blood vessels, which might get blocked by smaller blood clots.
  • Being short subjects you to different environmental risks: for example, if shorter people had lower incomes (on average) they might have higher risk for various social and lifestyle reasons.
  • The genetic variants that make you shorter also have some separate effect on heart attack risk: for example, the same variant might affect growth in infancy and also affect diabetes risk in later life.

These are all interesting, and there’s a reasonable hope of being able to separate them out with more data and experiments.

The last sentence of the research paper is a good counterpoint to the media coverage

More generally, our findings underscore the complexity underlying the inherited component of CAD.

 

 

[Disclosure: I work with one of the cohorts that is part of one of the consortia that is part of the whole Cardiogram group and I know some of the researchers — but that would be true of anyone in the field]

April 8, 2015

NRL Predictions for Round 6

Team Ratings for Round 6

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Rabbitohs                11.30                   13.06       -1.80
Roosters                  9.06                    9.09       -0.00
Cowboys                   6.51                    9.52       -3.00
Storm                     5.32                    4.36        1.00
Broncos                   4.51                    4.03        0.50
Panthers                  2.88                    3.69       -0.80
Bulldogs                  1.52                    0.21        1.30
Warriors                  1.47                    3.07       -1.60
Knights                   0.29                   -0.28        0.60
Dragons                  -1.66                   -1.74        0.10
Sea Eagles               -2.21                    2.68       -4.90
Eels                     -5.31                   -7.19        1.90
Raiders                  -6.34                   -7.09        0.70
Wests Tigers             -7.54                  -13.13        5.60
Sharks                   -8.87                  -10.76        1.90
Titans                   -9.60                   -8.20       -1.40

 

Performance So Far

So far there have been 40 matches played, 22 of which were correctly predicted, a success rate of 55%.

Here are the predictions for last week’s games.

Game                          Date    Score     Prediction  Correct
1 Bulldogs vs. Rabbitohs      Apr 03  17 – 18        -7.80  TRUE
2 Titans vs. Broncos          Apr 03  16 – 26       -11.30  TRUE
3 Knights vs. Dragons         Apr 04  0 – 13          7.80  FALSE
4 Sea Eagles vs. Raiders      Apr 04  16 – 29        10.30  FALSE
5 Roosters vs. Sharks         Apr 05  12 – 20        25.40  FALSE
6 Eels vs. Wests Tigers       Apr 06  6 – 22          8.60  FALSE
7 Panthers vs. Cowboys        Apr 06  10 – 30         2.40  FALSE
8 Storm vs. Warriors          Apr 06  30 – 14         6.50  TRUE
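For anyone who wants to check the Correct column, here is one reading of it as a short sketch: a prediction counts as correct when the predicted margin and the actual margin have the same sign. It assumes the first-named team is the home team and that the score is listed home first; treating a draw or a zero prediction as incorrect is my choice, not something stated in the post.

```python
# (home, away, home_score, away_score, predicted_margin), copied from the table above.
games = [
    ("Bulldogs", "Rabbitohs", 17, 18, -7.80),
    ("Titans", "Broncos", 16, 26, -11.30),
    ("Knights", "Dragons", 0, 13, 7.80),
    ("Sea Eagles", "Raiders", 16, 29, 10.30),
    ("Roosters", "Sharks", 12, 20, 25.40),
    ("Eels", "Wests Tigers", 6, 22, 8.60),
    ("Panthers", "Cowboys", 10, 30, 2.40),
    ("Storm", "Warriors", 30, 14, 6.50),
]


def is_correct(home_score, away_score, predicted_margin):
    """True when predicted and actual margins point to the same winner."""
    actual_margin = home_score - away_score
    # Assumption: a draw or a zero prediction is counted as incorrect.
    return actual_margin * predicted_margin > 0


results = [is_correct(h, a, p) for _, _, h, a, p in games]
print(results)
print(f"{sum(results)} of {len(results)} correct")  # 3 of 8 correct
```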

 

Predictions for Round 6

Here are the predictions for Round 6. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner      Prediction
1 Broncos vs. Roosters        Apr 10  Roosters         -1.50
2 Sharks vs. Knights          Apr 10  Knights          -6.20
3 Eels vs. Titans             Apr 11  Eels              7.30
4 Panthers vs. Sea Eagles     Apr 11  Panthers          8.10
5 Warriors vs. Wests Tigers   Apr 11  Warriors         13.00
6 Dragons vs. Bulldogs        Apr 12  Bulldogs         -0.20
7 Raiders vs. Storm           Apr 12  Storm            -8.70
8 Rabbitohs vs. Cowboys       Apr 13  Rabbitohs         7.80

 

Super 15 Predictions for Round 9

Team Ratings for Round 9

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team            Current Rating  Rating at Season Start  Difference
Crusaders                10.43                   10.42        0.00
Waratahs                  8.34                   10.00       -1.70
Hurricanes                5.72                    2.89        2.80
Brumbies                  4.58                    2.20        2.40
Chiefs                    3.79                    2.23        1.60
Bulls                     2.49                    2.88       -0.40
Stormers                  2.03                    1.68        0.30
Sharks                    0.17                    3.91       -3.70
Blues                     0.09                    1.44       -1.30
Highlanders              -0.23                   -2.54        2.30
Lions                    -3.32                   -3.39        0.10
Force                    -4.56                   -4.67        0.10
Cheetahs                 -7.14                   -5.55       -1.60
Rebels                   -7.20                   -9.53        2.30
Reds                     -8.20                   -4.98       -3.20

 

Performance So Far

So far there have been 53 matches played, 36 of which were correctly predicted, a success rate of 67.9%.

Here are the predictions for last week’s games.

Game                          Date    Score     Prediction  Correct
1 Hurricanes vs. Stormers     Apr 03  25 – 20         8.90  TRUE
2 Rebels vs. Reds             Apr 03  23 – 15         4.30  TRUE
3 Chiefs vs. Blues            Apr 04  23 – 16         7.80  TRUE
4 Brumbies vs. Cheetahs       Apr 04  20 – 3         16.00  TRUE
5 Sharks vs. Crusaders        Apr 04  10 – 52        -1.60  TRUE
6 Lions vs. Bulls             Apr 04  22 – 18        -2.70  FALSE

 

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date    Winner      Prediction
1 Blues vs. Brumbies          Apr 10  Blues             0.00
2 Crusaders vs. Highlanders   Apr 11  Crusaders        14.70
3 Waratahs vs. Stormers       Apr 11  Waratahs         10.80
4 Force vs. Cheetahs          Apr 11  Force             7.10
5 Bulls vs. Reds              Apr 11  Bulls            15.20
6 Lions vs. Sharks            Apr 11  Lions             0.50

 

April 7, 2015

Briefly

  • NPR’s Science Friday covers BAHfest, a competition to produce what look like scientific arguments for nutty conclusions. Hilarious, but also important: serious and scammy pseudoscience uses the same tricks.
  • Emma Pierson, a scientist who studies dating, analyses a year of emails with her boyfriend:
    Him: You’re going to find some weird pattern and break up with me.

    Me: Either that will be warranted by the data, in which case it’s a good thing, or it won’t, in which case I’m a bad statistician. Are you saying I’m a bad statistician?

  • And a post by Emma Pierson at 538.com: “people just want to date themselves”
  • Another story about changes in cancer risk that just uses number of diagnoses, without even gesturing in the direction of screening bias.

Evils of Axis

First, from Mother Jones magazine, via Twitter

[Bar chart: Australian carbon emissions]

The impact of the carbon tax looks impressive, but this is a bar chart — it starts at zero and they’ve only shown the top fifth of it.
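To get a feel for how much a cut-off axis can exaggerate a change, here is a rough sketch. The numbers are invented for illustration (they are not the emissions data): two bars of 100 and 90, drawn on an axis that starts at 80, so that only the top fifth of the taller bar is visible.

```python
def apparent_drop(before, after, axis_start=0.0):
    """Drop in visible bar height, as a fraction of the first bar's visible height."""
    return (before - after) / (before - axis_start)


# Hypothetical values, chosen only to illustrate the effect.
before, after = 100, 90

print(apparent_drop(before, after))                 # 0.1: the real change, 10%
print(apparent_drop(before, after, axis_start=80))  # 0.5: the bar looks half as tall
```

In this made-up case a 10% fall is drawn as a bar that shrinks by half.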

They do link to the data, the quarterly Greenhouse Gas Inventory update.  In that report, Figure 8 is

[Line chart: Figure 8 of the Greenhouse Gas Inventory update]

The dotted line is the same data as the bar chart, except that the dotted line has data for every quarter and the bar chart has data only for the July-September quarter each year. And  the line chart has a wider range on the vertical axis — it doesn’t go down to zero, but it isn’t a bar chart, so it doesn’t have to. The other point about the line chart is that there’s a solid line there as well. The solid line is adjusted for seasonal variation and weather. If you wanted to know about real changes in how Australians are using energy, that’s the line you’d use.

 

Second, a beautiful map of CO2 emissions from fossil fuel combustion, from the Washington Post via Flowing Data

[Map: CO2 emissions from fossil fuel combustion]

The ‘vertical’ scale here is a colour scale; what’s misleading is that it’s a logarithmic scale. The map makes it look as if a large fraction of CO2 emission comes from transporting stuff through empty areas, but the pale beige indicates emissions thousands of times lower than in the urban/suburban areas. Red ink isn’t anywhere close to being proportional to CO2.
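A quick sketch of why a logarithmic colour scale is deceptive to the eye. The range used here (1 to 1,000,000 in arbitrary units) is an assumption for illustration, not the scale from the Washington Post map.

```python
import math


def linear_position(value, low, high):
    """Position of value along a linear colour scale, from 0 to 1."""
    return (value - low) / (high - low)


def log_position(value, low, high):
    """Position of value along a logarithmic colour scale, from 0 to 1."""
    return (math.log10(value) - math.log10(low)) / (math.log10(high) - math.log10(low))


# Hypothetical range, in arbitrary units.
low, high = 1.0, 1_000_000.0
value = 1_000.0  # a thousand times lower than the top of the scale

print(log_position(value, low, high))     # about 0.5: halfway along the colour ramp
print(linear_position(value, low, high))  # about 0.001: essentially the bottom colour
```

On the log scale a value a thousand times smaller than the maximum still gets a mid-range colour, which is why the amount of ink says little about the amount of CO2.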