November 7, 2013

Why you should eat in crowded food halls

There are a couple of posts being promoted on the internet about an important and relatively subtle form of selection bias. Epidemiologists know it as Berkson’s Paradox; in modern causal inference terminology it’s ‘conditioning on colliders’; and for an economist it’s a consequence of the production-possibility frontier.

The basic issue is very simple. As Gabriel Rossman puts it at The Atlantic:

 There is no ontological reason why we can’t have shoes that are both hideous and uncomfortable but rather there is a practical reason in that nobody wears shoes that are terrible in every way and so such shoes don’t make it unto the market. 

In the same way, there’s no necessary reason why cricketers who are good at bowling have to be bad at batting. Being able to deliver the ball so it misleads or outpaces the batsman doesn’t make it any harder to spot bowling trickery or to react fast. And in fact, if you look at 12-year-olds, often the same kids are good at batting and bowling. In international-level cricket, though, all-rounders are pretty rare, and someone who can take 5 wickets in a Test innings is very unlikely to be able to score a Test century. The slight positive correlation you see in kids turns into a strong negative correlation in adults. The reason is that getting into an international cricket team requires you to be very, very good at batting or very, very good at bowling. Since it’s more likely that you’re very, very good at one thing than two, most international cricketers are either batsmen or bowlers, but not both. Among those who are selected, there’s a negative correlation.
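The cricket argument is easy to check with a small simulation. The sketch below is illustrative only: the skill scores, the weak positive correlation of 0.2, and the selection cutoff of 2.5 standard deviations are all made-up assumptions, not real cricket data. It gives every player a batting and a bowling score, "selects" anyone who is outstanding at either one, and compares the correlation in the whole population with the correlation among those selected.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=100_000, rho=0.2, cutoff=2.5, seed=1):
    """Hypothetical players: skills are standard normal with correlation rho.

    A player is selected if batting OR bowling exceeds the cutoff,
    i.e. if the better of the two skills is outstanding.
    """
    rng = random.Random(seed)
    players = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        batting = z1
        # Mix in z1 so that corr(batting, bowling) is rho in the population.
        bowling = rho * z1 + math.sqrt(1 - rho ** 2) * z2
        players.append((batting, bowling))

    selected = [(bat, bowl) for bat, bowl in players if max(bat, bowl) > cutoff]

    corr_everyone = pearson(*zip(*players))
    corr_selected = pearson(*zip(*selected))
    return corr_everyone, corr_selected, len(selected)


corr_everyone, corr_selected, n_selected = simulate()
print(f"all players:      correlation = {corr_everyone:+.2f}")
print(f"selected players: correlation = {corr_selected:+.2f} (n = {n_selected})")
```

Nothing in the simulated skills makes good bowlers worse at batting; the sign of the correlation flips purely because of who gets through the selection rule.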

There are examples in the social sciences: opposition to marijuana legalisation is positively correlated with opposition to government wealth redistribution in the US as a whole, but uncorrelated among Republican voters.

There are examples in medicine: the genetic variant Factor V Leiden is strongly associated with deep-vein thrombosis in the population in general, but not at all predictive of recurrence in people who have already had one.

And there are examples in dining: for a given price, a successful restaurant has to do well enough on some combination of food quality, pleasant ambience, trendiness, etc. So among the restaurants that survive, these qualities will end up negatively correlated: a place with plain decor and no buzz must be getting by on its food. If you want good inexpensive food in downtown Auckland, try one of the Asian food courts.
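The restaurant version differs from the cricket one in that survival depends on a combination of qualities rather than on excelling at any single one. A rough sketch, assuming three independent, standard-normal quality scores and a survival rule that their sum must clear a bar (the scores, the bar of 3.0, and the function name are all hypothetical):

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def restaurant_correlation(n=100_000, bar=3.0, seed=7):
    """Food/ambience correlation among simulated restaurants that survive.

    Each restaurant gets independent scores for food, ambience and
    trendiness; it survives only if the three scores sum to more than bar.
    """
    rng = random.Random(seed)
    survivors = []
    for _ in range(n):
        food, ambience, trendiness = (rng.gauss(0, 1) for _ in range(3))
        if food + ambience + trendiness > bar:
            survivors.append((food, ambience))

    foods, ambiences = zip(*survivors)
    return pearson(foods, ambiences), len(survivors)


corr, n_survivors = restaurant_correlation()
print(f"food vs ambience among {n_survivors} survivors: {corr:+.2f}")
```

Here the qualities start out completely unrelated, and the negative correlation among survivors comes entirely from conditioning on the total.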

(via @gnat, who points to one of the posts and notes: Anyone who thinks it’s possible to draw truthful conclusions from data analysis without really learning statistics needs to read this.)


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.