Posts from June 2017 (28)

June 12, 2017

Stat of the Week Competition Discussion: June 10 – 16 2017

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

June 7, 2017

Fraud or typos?

The Guardian says “Dozens of recent clinical trials may contain wrong or falsified data, claims study”

A UK anaesthetist, John Carlisle, has scraped 5000 clinical-trial publications, where patients are divided randomly into two groups before treatment is assigned, and looked at whether the two groups are more similar or more different than you’d expect by chance. His motivation appears to be that having groups which are too similar can be a sign of incompetent fraud by someone who doesn’t understand basic statistics. However, the statistical hypothesis he’s testing isn’t actually about fraud, or even about incompetent fraud.

As the research paper notes, some of the anomalous results can be explained by simple writing errors: saying “standard deviation” when you mean “standard error” — and this would, if anything, be evidence against fraud.  Even in the cases where that specific writing error isn’t plausible, looking at the paper can show data fabrication to be an unlikely explanation.  For example, in one of the papers singled out as having a big difference not explainable by the standard deviation/standard error confusion, the difference is in one blood chemistry measurement (tPA) that doesn’t play any real role in the conclusions. The data are not consistent with random error, but they also aren’t consistent with deliberate fraud.  They are more consistent with someone typing 3.2 when they meant 4.2. This would still be a problem with the paper, both because some relatively unimportant data are wrong and because it says bad things about your workflow if you are still typing Table 1 by hand in the 21st century, but it’s not of the same scale as data fabrication.
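To see why the standard deviation/standard error mix-up matters for this kind of check, here is a minimal sketch in Python. The numbers (50 patients per group, a true standard deviation of 1.0, baseline means of 3.4 and 3.2) are invented for illustration and do not come from any of the trials, and the simple z-test is a stand-in, not the method used in the research paper.

```python
import math


def baseline_z(mean_a, mean_b, sd, n):
    """z statistic for the difference in baseline means of two groups of size n,
    assuming both groups share the same standard deviation `sd`."""
    se_diff = sd * math.sqrt(2.0 / n)
    return (mean_a - mean_b) / se_diff


def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical trial: all of these values are made up for illustration.
n = 50
true_sd = 1.0
mean_a, mean_b = 3.4, 3.2

# Table 1 labelled correctly: the reported spread really is a standard deviation.
z_correct = baseline_z(mean_a, mean_b, true_sd, n)

# Table 1 mislabelled: the authors report the standard error (sd / sqrt(n))
# but call it a standard deviation, and the reader takes them at their word.
reported_as_sd = true_sd / math.sqrt(n)
z_mislabelled = baseline_z(mean_a, mean_b, reported_as_sd, n)

print(f"correct label:    z = {z_correct:.2f}, p = {two_sided_p(z_correct):.3g}")
print(f"mislabelled SE:   z = {z_mislabelled:.2f}, p = {two_sided_p(z_mislabelled):.3g}")
```

With these made-up numbers the correctly labelled table gives a z statistic of about 1, while reading the standard error as a standard deviation multiplies it by √50, to about 7. That looks like an impossible imbalance between randomised groups, but it comes from a labelling error, not from invented data.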

You’d think the Guardian might be more sympathetic to typos as an explanation of error.

 

Super 18 Predictions for Round 16 Game, Hurricanes vs Chiefs

Team Ratings for Round 16 Game, Hurricanes vs Chiefs

The basic method is described on my Department home page.

This week is pretty crazy: just one game from round 16, played when round 15 has not been completed and won’t be for a month.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Hurricanes             17.92                   13.22        4.70
Crusaders              13.98                    8.75        5.20
Highlanders            11.43                    9.17        2.30
Lions                  10.96                    7.64        3.30
Chiefs                  8.49                    9.75       -1.30
Brumbies                3.44                    3.83       -0.40
Blues                   2.65                   -1.07        3.70
Sharks                  1.52                    0.42        1.10
Stormers                0.53                    1.51       -1.00
Waratahs               -0.50                    5.81       -6.30
Bulls                  -5.20                    0.29       -5.50
Jaguares               -5.38                   -4.36       -1.00
Force                  -8.85                   -9.45        0.60
Cheetahs               -9.83                   -7.36       -2.50
Reds                  -10.78                  -10.28       -0.50
Kings                 -13.53                  -19.02        5.50
Rebels                -15.58                   -8.17       -7.40
Sunwolves             -18.38                  -17.76       -0.60

 

Performance So Far

So far there have been 120 matches played, 91 of which were correctly predicted, a success rate of 75.8%.
Here are the predictions for last week’s games.

#  Game                        Date    Score    Prediction  Correct
1  Blues vs. Reds              Jun 02  34 – 29       14.60  TRUE
2  Crusaders vs. Highlanders   Jun 03  25 – 22        6.50  TRUE
3  Chiefs vs. Waratahs         Jun 03  46 – 31       12.70  TRUE
4  Brumbies vs. Rebels         Jun 03  32 – 3        21.60  TRUE
5  Force vs. Hurricanes        Jun 03  12 – 34      -22.90  TRUE

 

Predictions for Round 16, Hurricanes vs. Chiefs

Here are the predictions for the Round 16 game this week. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                        Date    Winner      Prediction
1  Hurricanes vs. Chiefs       Jun 09  Hurricanes       12.90

 

NRL Predictions for Round 14

Team Ratings for Round 14

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating  Rating at Season Start  Difference
Storm                   7.50                    8.49       -1.00
Broncos                 5.57                    4.36        1.20
Sharks                  5.18                    5.84       -0.70
Raiders                 5.17                    9.94       -4.80
Panthers                3.37                    6.08       -2.70
Sea Eagles              3.19                   -2.98        6.20
Roosters                3.00                   -1.17        4.20
Cowboys                 2.07                    6.90       -4.80
Dragons                 0.73                   -7.74        8.50
Eels                   -1.60                   -0.81       -0.80
Titans                 -2.11                   -0.98       -1.10
Warriors               -3.71                   -6.02        2.30
Rabbitohs              -4.69                   -1.82       -2.90
Bulldogs               -5.04                   -1.34       -3.70
Wests Tigers           -7.57                   -3.89       -3.70
Knights               -13.12                  -16.94        3.80

 

Performance So Far

So far there have been 99 matches played, 58 of which were correctly predicted, a success rate of 58.6%.
Here are the predictions for last week’s games.

#  Game                        Date    Score    Prediction  Correct
1  Storm vs. Knights           Jun 02  40 – 12       23.30  TRUE
2  Eels vs. Warriors           Jun 02  32 – 24        5.70  TRUE
3  Dragons vs. Wests Tigers    Jun 03  16 – 12       13.30  TRUE
4  Roosters vs. Broncos        Jun 03  18 – 16        0.70  TRUE
5  Cowboys vs. Titans          Jun 03  20 – 8         6.80  TRUE
6  Sea Eagles vs. Raiders      Jun 04  21 – 20        1.60  TRUE
7  Bulldogs vs. Panthers       Jun 04  0 – 38         0.90  FALSE
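The “Correct” column can be reproduced from the score and the prediction: a prediction counts as correct when the predicted margin and the actual home-minus-away margin have the same sign. A minimal sketch in Python using last week’s NRL games from the table above. It assumes the first-listed team is the home team; the function name `is_correct` and the treatment of draws (counted as incorrect) are assumptions made for illustration, not necessarily how the actual forecasting code works.

```python
# (home team, away team, home score, away score, predicted home margin)
# Values copied from the table of last week's games.
games = [
    ("Storm", "Knights", 40, 12, 23.30),
    ("Eels", "Warriors", 32, 24, 5.70),
    ("Dragons", "Wests Tigers", 16, 12, 13.30),
    ("Roosters", "Broncos", 18, 16, 0.70),
    ("Cowboys", "Titans", 20, 8, 6.80),
    ("Sea Eagles", "Raiders", 21, 20, 1.60),
    ("Bulldogs", "Panthers", 0, 38, 0.90),
]


def is_correct(home_score, away_score, prediction):
    """True when the predicted and actual margins have the same sign.

    Assumption: a drawn game (margin 0) is counted as an incorrect prediction.
    """
    margin = home_score - away_score
    return margin * prediction > 0


results = [is_correct(hs, as_, pred) for _, _, hs, as_, pred in games]

for (home, away, *_), ok in zip(games, results):
    print(f"{home} vs. {away}: {'TRUE' if ok else 'FALSE'}")

# Season-to-date success rate, from the counts quoted in the text.
season_rate = round(100 * 58 / 99, 1)
print(f"Season success rate: {season_rate}%")
```

Only the Bulldogs vs. Panthers game comes out as incorrect: the prediction favoured the home side by 0.90 points and the away side won by 38.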

 

Predictions for Round 14

Here are the predictions for Round 14. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

#  Game                        Date    Winner      Prediction
1  Sharks vs. Storm            Jun 08  Sharks            1.20
2  Sea Eagles vs. Knights      Jun 09  Sea Eagles       19.80
3  Broncos vs. Rabbitohs       Jun 09  Broncos          13.80
4  Titans vs. Warriors         Jun 10  Titans            5.60
5  Panthers vs. Raiders        Jun 10  Panthers          1.70
6  Eels vs. Cowboys            Jun 10  Cowboys          -0.20
7  Wests Tigers vs. Roosters   Jun 11  Roosters         -7.10
8  Bulldogs vs. Dragons        Jun 12  Dragons          -2.30

June 5, 2017

Briefly

  • Possibly a record false positive rate: “a substantial number of takedown requests submitted to Google are for URLs that have never been in our search index, and therefore could never have appeared in our search results… Nor is this problem limited to one submitter: in total, 99.95% of all URLs processed from our Trusted Copyright Removal Program in January 2017 were not in our index” (Google submission to Register of Copyrights (PDF), via Techdirt)
  • Problem with rental costs in Canada’s historical CPI: “the clerks who recorded the data were under an instruction that, since the CPI was to represent prices paid by better off working class families, to edit out any rental figures what were above a designated threshold. By the end of the 1950s they were throwing out more than half of the reported rents.” (Worthwhile Canadian Initiative). Data doesn’t just happen: it’s choices by people.
  • I’ve mentioned the University of Washington course “Calling Bullshit on Big Data” before. Now the New Yorker has a story about it.
  • What different sorts of things can go wrong with a statistical prediction rule? A taxonomy, from Ed Felten.
  • Explore NZ mortality rates divided up by ethnicity, income, and age
  • “What we learned from three years of interviews with data journalists, web developers and interactive editors at leading digital newsrooms” Storybench, via Alberto Cairo
  • A couple of examples from the fine UK election tradition of disinformation graphics: Scotland, London

Stat of the Week Competition: June 3 – 9 2017

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday June 9 2017.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of June 3 – 9 2017 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: June 3 – 9 2017

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

June 2, 2017

Time for stakeholder participation?

Q: Did you see ‘young blood’ cuts cancer and Alzheimer’s risk?

A: That’s the headline, yes.

Q: This is the Silicon Valley startup that’s transfusing young people’s blood into older people?

A: Well, Monterey rather than Silicon Valley, but yes.

Q: Isn’t it a pity we used up all the vampire jokes on Theranos?

A: I’m sure they aren’t really dead, just sleeping.