Stat of the Week Competition Discussion: October 10 – 16 2015
If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!
Q: Did you see there’s a giant rock with the potential to end life on Earth?
A: This one?
Q: Yes. Are they exaggerating?
A: Depends what you mean. In a sense it does have the potential to end human life on Earth, but it would have to actually hit Earth to do that.
Q: But it’s “similar to the 1862 Apollo asteroid which was classified as a potentially hazardous object”
A: Similar except for being a lot further away. As the story says, “Potentially Hazardous Objects” approach closer than 7,402,982km, and this one is about 25 million km away at its closest.
Q: That’s an awfully precise number, 7,402,982, isn’t it? Why do they need it to the nearest kilometre?
A: They don’t. It’s 0.05 Astronomical Units, and whoever did the conversion doesn’t understand about significant digits. Wikipedia, for example, rounds it to 7.5 million km.
Q: And the other really precise numbers? It says the asteroid is moving at 64,374km/hr, but surely the speed will change more than 1km/hr because, you know, gravity and physics and stuff?
A: That’s 40,000 miles per hour. Again, looks like one significant digit in the original.
Q: So how far away is this asteroid compared to, say, the moon?
A: To the nearest power of ten, 100 times further away. The actual ratio is about 65, taking the moon to be roughly 384,000 km away.
Q: That’s quite a lot. Why is NASA making a fuss about this asteroid?
A: They aren’t. They issued a press release about asteroid rumours in August, headlined “There is no asteroid threatening Earth”. The NASA @asteroidwatch twitterwallah is getting a bit tetchy about the whole thing.
Q: Does the asteroid have something to do with the “blood moon” we had recently?
A: Only in the sense that they were both completely unsurprising and harmless astronomical events.
(h/t @philiplyth)
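The conversions in the Q&A above are easy to check. Here is a minimal Python sketch; `round_sig` is a hypothetical helper name, and the idea that 7,402,982 km began as 4.6 million miles is an assumption based only on the arithmetic fitting (0.05 AU itself is nearer 7.48 million km).

```python
import math

KM_PER_MILE = 1.609344      # exact, by definition of the international mile
KM_PER_AU = 149_597_870.7   # exact, by the IAU definition of the astronomical unit


def round_sig(x, digits):
    """Round x to the given number of significant digits (illustrative helper)."""
    if x == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)


speed_kmh = 40_000 * KM_PER_MILE   # the 40,000 mph figure from the Q&A
limit_km = 4.6e6 * KM_PER_MILE     # ASSUMPTION: the threshold was quoted as 4.6 million miles
limit_from_au = 0.05 * KM_PER_AU   # the 0.05 AU threshold from the Q&A

print(round(speed_kmh))             # over-precise conversion, as in the news story
print(round_sig(speed_kmh, 1))      # keeping the one significant digit of the input
print(round(limit_km))              # over-precise conversion of the assumed mileage
print(round_sig(limit_from_au, 2))  # two significant digits, as in the Wikipedia rounding
```

The point is only that the output should carry no more significant digits than the input did; which input figure the reporter actually converted is a guess.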
Yesterday I wrote about a ‘gay epigenetics’ story in the Herald. I wasn’t convinced that there was anything worth publicising at this point, and I thought there wasn’t enough detail to interpret the results.
Ed Yong, a science journalist who was actually at the conference, has a story today in the Atlantic. He fingers the conference as the responsible party for the publicity (here’s their press release), though with the active cooperation of the researchers.
His story has more detail and makes it clear that there’s very little evidence, and more importantly that the lead researcher knew this:
“The reality is that we had basically no funding,” he said. “The sample size was not what we wanted. But do I hold out for some impossible ideal or do I work with what I have? I chose the latter.”
For pilot research presented to consenting scientists that might be reasonable, but for press releases it isn’t.
Epigenetics is an area of science where New Zealand has an international reputation. It would be a pity if it ended up as one of the areas where you can be sure that basically nothing that makes it to the newspapers is true.
From the Herald (from the Telegraph)
Factors ranging from exposure to certain chemicals to childhood abuse, diet and exercise may affect the DNA controlling sexuality, according to research being presented at a US conference on genetics.
…
They believe they can predict with 70 per cent accuracy whether a man is gay or straight, simply by looking at those parts of the genome.
[There’s a slightly better story in Nature News.]
70% accuracy doesn’t seem all that impressive. Using the usual figures on the proportion of men who are gay, the approach of assuming everyone is straight unless you are told otherwise is better than 90% accurate, and doesn’t need expensive genetics. Presumably they mean something different by 70% accuracy, but we don’t know what.
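To make the comparison concrete, here is a small Python sketch of the "assume everyone is straight" baseline. The prevalence values are illustrative assumptions, not figures from the research, and `baseline_accuracy` is a hypothetical name.

```python
def baseline_accuracy(prevalence):
    """Accuracy of always predicting the more common class.

    prevalence: assumed proportion of the population in the rarer group.
    """
    return max(prevalence, 1 - prevalence)


claimed_accuracy = 0.70  # the figure quoted in the news story

# Illustrative prevalence values only (assumptions, not data)
for prevalence in (0.02, 0.05, 0.10):
    acc = baseline_accuracy(prevalence)
    print(f"prevalence {prevalence:.0%}: baseline {acc:.0%}, "
          f"beats claimed 70%: {acc > claimed_accuracy}")
```

On this definition of accuracy the no-information baseline beats 70% whenever the rarer group is under 30% of the population, which is why the 70% figure presumably measures something else, such as accuracy within a sample balanced by design.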
More importantly, this is research in identical twins. If you take pairs of people who are genetically identical, had the same environment in the womb, and then very similar environments in infancy and childhood, you’ve stripped out nearly all the other factors that could affect sexual orientation. That’s the point of doing the research this way — you get a clearer view of potentially-small differences — but it’s a limitation when you’re trying to make claims about people in general.
Also, there’s an important difference between genetics and epigenetics here. The epigenetic markers, as the story says, can be affected by things that happen to you during childhood. But that means we can’t necessarily assume the correlations between epigenetic differences and sexual orientation are causal. The “factors ranging from exposure to certain chemicals to childhood abuse, diet and exercise” that can affect epigenetic markers could also affect sexual orientation directly — especially since the epigenetic markers were measured in cells from the lining of the mouth, not in, say, the brain.
On top of all that, this is another annoying example of research being publicised before it’s published. It’s not at all impossible that the claims are true, but there isn’t enough public information to tell. The research was presented at the conference of the American Society of Human Genetics. People at the conference would have been able to see more detail, and maybe ask questions. We can’t. We won’t be able to until there’s a published research paper. That would have been the time for publicity.
And finally, there’s an interesting assumption revealed in the headline “Boys ‘turned gay by childhood shift in genes’”. The research looked at differences between identical twins. It says absolutely nothing about which twin changed and which one stayed the same: you could equally well say “Boys turned straight by childhood shift in genes”.
Quartz has an interesting analysis of a recent Twitter storm over abortion, triggered by the US Republicans’ attempts to defund Planned Parenthood. The headline is striking: “How to tell whether a Twitter user is pro-choice or pro-life without reading any of their tweets.”
The writers describe how they could use words in Twitter profiles to predict people’s attitudes. They also found that social network structure was a very strong predictor: people shared the views of those they followed. They write: “so polarized is the social network structure that even very basic, obvious characteristics stop mattering if we know who your friends are.”
It might seem strange that you could do so well in predicting attitudes across multiple countries on a controversial topic. It would be strange, except that the data they used was restricted to a small group of people who were participating in a Twitter argument about abortion. The story admits this, but not until near the end.
In real life, you probably can’t learn that much about someone’s views on abortion by whether they tweet about cats or football. In the context of a small, highly polarised argument, you probably can. In real life, people don’t necessarily agree with the views of the people they follow on Twitter, but in that context it’s not surprising that they do. And in real life, if someone wants to find out your views on a controversial topic they’d probably be better off asking you than tracking down all your friends and asking them.
Some cautionary tales
Official statistics agencies publish lots of useful information that gets used by researchers, by educators, by businesses, by journalists, and (with the help of groups like Figure.NZ) by everyone else. A dilemma for these agencies is how to handle changes in the best ways to measure something. If you never change the definitions you get perfectly consistent reports of no-longer-useful information. If you do change the definitions, things don’t match up.
This graph is from a blog post by a Canadian economist, Livio Di Matteo. It shows the number of Canadians employed in the lumber industry over time, patched together from several Statistics Canada time series.
Dr Di Matteo is a professional, and wasn’t trying to do anything subtle here — he just wanted a lecture slide — and a lot of this data was from the time when Stats Canada was among the best in the world, so it’s not a problem that’s easy to avoid. It’s just harder than it sounds to define who works in the lumber industry. For example, are the log drivers in the lumber industry, or are they something like “transport workers, not elsewhere classified”?
The basic method is described on my Department home page.
Here are the team ratings prior to 09 October along with the ratings at the start of the Rugby World Cup.
Team | Rating at 09 October | Rating at RWC Start | Difference |
---|---|---|---|
New Zealand | 26.72 | 29.01 | -2.30 |
South Africa | 23.39 | 22.73 | 0.70 |
Australia | 21.59 | 20.36 | 1.20 |
Ireland | 16.82 | 17.48 | -0.70 |
England | 16.17 | 18.51 | -2.30 |
Wales | 13.37 | 13.93 | -0.60 |
France | 10.94 | 11.70 | -0.80 |
Argentina | 9.69 | 7.38 | 2.30 |
Scotland | 5.82 | 4.84 | 1.00 |
Fiji | -2.19 | -4.23 | 2.00 |
Samoa | -4.83 | -2.28 | -2.50 |
Italy | -6.04 | -5.86 | -0.20 |
Tonga | -8.60 | -6.31 | -2.30 |
Japan | -9.31 | -11.18 | 1.90 |
USA | -16.91 | -15.97 | -0.90 |
Georgia | -17.74 | -17.48 | -0.30 |
Canada | -17.89 | -18.06 | 0.20 |
Romania | -19.77 | -21.20 | 1.40 |
Uruguay | -31.41 | -31.04 | -0.40 |
Namibia | -33.09 | -35.62 | 2.50 |
So far there have been 32 matches played, 26 of which were correctly predicted, a success rate of 81.2%.
Here are the predictions for previous games.
No. | Game | Date | Score | Prediction | Correct |
---|---|---|---|---|---|
1 | England vs. Fiji | Sep 18 | 35 – 11 | 29.20 | TRUE |
2 | Tonga vs. Georgia | Sep 19 | 10 – 17 | 11.20 | FALSE |
3 | Ireland vs. Canada | Sep 19 | 50 – 7 | 35.50 | TRUE |
4 | South Africa vs. Japan | Sep 19 | 32 – 34 | 33.90 | FALSE |
5 | France vs. Italy | Sep 19 | 32 – 10 | 17.60 | TRUE |
6 | Samoa vs. USA | Sep 20 | 25 – 16 | 13.70 | TRUE |
7 | Wales vs. Uruguay | Sep 20 | 54 – 9 | 51.50 | TRUE |
8 | New Zealand vs. Argentina | Sep 20 | 26 – 16 | 21.60 | TRUE |
9 | Scotland vs. Japan | Sep 23 | 45 – 10 | 14.40 | TRUE |
10 | Australia vs. Fiji | Sep 23 | 28 – 13 | 24.10 | TRUE |
11 | France vs. Romania | Sep 23 | 38 – 11 | 33.30 | TRUE |
12 | New Zealand vs. Namibia | Sep 24 | 58 – 14 | 64.00 | TRUE |
13 | Argentina vs. Georgia | Sep 25 | 54 – 9 | 24.60 | TRUE |
14 | Italy vs. Canada | Sep 26 | 23 – 18 | 12.50 | TRUE |
15 | South Africa vs. Samoa | Sep 26 | 46 – 6 | 23.90 | TRUE |
16 | England vs. Wales | Sep 26 | 25 – 28 | 11.20 | FALSE |
17 | Australia vs. Uruguay | Sep 27 | 65 – 3 | 50.30 | TRUE |
18 | Scotland vs. USA | Sep 27 | 39 – 16 | 21.40 | TRUE |
19 | Ireland vs. Romania | Sep 27 | 44 – 10 | 38.80 | TRUE |
20 | Tonga vs. Namibia | Sep 29 | 35 – 21 | 27.40 | TRUE |
21 | Wales vs. Fiji | Oct 01 | 23 – 13 | 23.80 | TRUE |
22 | France vs. Canada | Oct 01 | 41 – 18 | 29.60 | TRUE |
23 | New Zealand vs. Georgia | Oct 02 | 43 – 10 | 44.90 | TRUE |
24 | Samoa vs. Japan | Oct 03 | 5 – 26 | 7.10 | FALSE |
25 | South Africa vs. Scotland | Oct 03 | 34 – 16 | 16.00 | TRUE |
26 | England vs. Australia | Oct 03 | 13 – 33 | 3.30 | FALSE |
27 | Argentina vs. Tonga | Oct 04 | 45 – 16 | 17.00 | TRUE |
28 | Ireland vs. Italy | Oct 04 | 16 – 9 | 24.70 | TRUE |
29 | Canada vs. Romania | Oct 06 | 15 – 17 | 2.70 | FALSE |
30 | Fiji vs. Uruguay | Oct 06 | 47 – 15 | 28.60 | TRUE |
31 | South Africa vs. USA | Oct 07 | 64 – 0 | 37.90 | TRUE |
32 | Namibia vs. Georgia | Oct 07 | 16 – 17 | -17.00 | TRUE |
The prediction is my estimated expected points difference with a positive margin being a win to the first-named team, and a negative margin a win to the second-named team.
No. | Game | Date | Winner | Prediction |
---|---|---|---|---|
1 | New Zealand vs. Tonga | Oct 09 | New Zealand | 35.30 |
2 | Samoa vs. Scotland | Oct 10 | Scotland | -10.70 |
3 | Australia vs. Wales | Oct 10 | Australia | 8.20 |
4 | England vs. Uruguay | Oct 10 | England | 54.10 |
5 | Argentina vs. Namibia | Oct 11 | Argentina | 42.80 |
6 | Italy vs. Romania | Oct 11 | Italy | 13.70 |
7 | France vs. Ireland | Oct 11 | Ireland | -5.90 |
8 | USA vs. Japan | Oct 11 | Japan | -7.60 |
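As a rough check on how the predictions relate to the ratings, here is a minimal Python sketch. It assumes the predicted margin is simply the first team's rating minus the second team's, plus an optional home-advantage term. That is an inference from the tables above, not the author's stated method, and `predicted_margin` and `home_advantage` are hypothetical names.

```python
# Ratings at 09 October, copied from the ratings table above
ratings = {
    "New Zealand": 26.72, "Tonga": -8.60,
    "Australia": 21.59, "Wales": 13.37,
    "Argentina": 9.69, "Namibia": -33.09,
    "England": 16.17, "Uruguay": -31.41,
}


def predicted_margin(first, second, home_advantage=0.0):
    """Expected points difference, first-named team minus second-named team.

    ASSUMPTION: rating difference plus a home-advantage term; the real
    method is described elsewhere and may differ.
    """
    return ratings[first] - ratings[second] + home_advantage


for first, second in [("New Zealand", "Tonga"),
                      ("Australia", "Wales"),
                      ("Argentina", "Namibia"),
                      ("England", "Uruguay")]:
    print(f"{first} vs. {second}: {predicted_margin(first, second):.2f}")
```

With no home advantage this gives values close to the listed predictions for New Zealand, Australia and Argentina. For England vs. Uruguay the raw difference is about 6.5 points below the listed 54.10, which would be consistent with a home-advantage term for the host nation, though that too is a guess.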
The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.
Team | Current Rating | Rating at Season Start | Difference |
---|---|---|---|
Taranaki | 12.77 | 7.70 | 5.10 |
Canterbury | 12.56 | 10.90 | 1.70 |
Tasman | 8.40 | 12.86 | -4.50 |
Auckland | 8.30 | 5.14 | 3.20 |
Wellington | 4.76 | -4.62 | 9.40 |
Hawke’s Bay | 4.14 | -0.57 | 4.70 |
Counties Manukau | 4.05 | 7.86 | -3.80 |
Otago | 1.46 | -4.84 | 6.30 |
Waikato | -6.41 | -6.96 | 0.50 |
Bay of Plenty | -6.46 | -9.77 | 3.30 |
Manawatu | -8.81 | -1.52 | -7.30 |
North Harbour | -9.59 | -10.54 | 0.90 |
Southland | -10.77 | -6.01 | -4.80 |
Northland | -18.37 | -3.64 | -14.70 |
So far there have been 62 matches played, 44 of which were correctly predicted, a success rate of 71%.
Here are the predictions for last week’s games.
No. | Game | Date | Score | Prediction | Correct |
---|---|---|---|---|---|
1 | Wellington vs. Hawke’s Bay | Sep 30 | 22 – 22 | 3.70 | FALSE |
2 | North Harbour vs. Otago | Oct 01 | 32 – 39 | -7.10 | TRUE |
3 | Waikato vs. Counties Manukau | Oct 02 | 9 – 30 | -3.30 | TRUE |
4 | Tasman vs. Canterbury | Oct 03 | 25 – 41 | 3.30 | FALSE |
5 | Manawatu vs. Taranaki | Oct 03 | 10 – 44 | -14.00 | TRUE |
6 | Auckland vs. Northland | Oct 03 | 64 – 21 | 28.00 | TRUE |
7 | Southland vs. Hawke’s Bay | Oct 04 | 28 – 35 | -11.40 | TRUE |
8 | Bay of Plenty vs. Wellington | Oct 04 | 13 – 31 | -5.20 | TRUE |
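The "Correct" column can be reproduced from the scores and predictions. The sketch below assumes a prediction counts as correct when the predicted margin and the actual margin have the same sign, so a draw is always scored as a miss; that rule matches the table above but is an inference, and `prediction_correct` is a hypothetical name.

```python
def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins have the same sign.

    ASSUMPTION: a drawn game (actual margin 0) never counts as correct.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


# (game, home score, away score, predicted margin), from the table above
games = [
    ("Wellington vs. Hawke's Bay", 22, 22, 3.70),
    ("North Harbour vs. Otago", 32, 39, -7.10),
    ("Waikato vs. Counties Manukau", 9, 30, -3.30),
    ("Tasman vs. Canterbury", 25, 41, 3.30),
    ("Manawatu vs. Taranaki", 10, 44, -14.00),
    ("Auckland vs. Northland", 64, 21, 28.00),
    ("Southland vs. Hawke's Bay", 28, 35, -11.40),
    ("Bay of Plenty vs. Wellington", 13, 31, -5.20),
]

results = [prediction_correct(h, a, p) for _, h, a, p in games]
week_correct = sum(results)

# Season totals quoted in the text: 44 correct out of 62
season_rate = 44 / 62

print(f"{week_correct} of {len(games)} correct last week")
print(f"season success rate: {season_rate:.0%}")
```

Note that this scores only the direction of the result, not how close the predicted margin was to the actual one.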
Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.
No. | Game | Date | Winner | Prediction |
---|---|---|---|---|
1 | Northland vs. Otago | Oct 07 | Otago | -15.80 |
2 | Taranaki vs. Tasman | Oct 08 | Taranaki | 8.40 |
3 | Hawke’s Bay vs. Waikato | Oct 09 | Hawke’s Bay | 14.60 |
4 | Canterbury vs. Southland | Oct 10 | Canterbury | 27.30 |
5 | Wellington vs. Manawatu | Oct 10 | Wellington | 17.60 |
6 | Counties Manukau vs. Auckland | Oct 10 | Auckland | -0.30 |
7 | North Harbour vs. Northland | Oct 11 | North Harbour | 12.80 |
8 | Otago vs. Bay of Plenty | Oct 11 | Otago | 11.90 |