Posts from May 2018 (28)

May 7, 2018

One lump or two?

Q: How many spaces do you use after a full stop?

A: Whatever LaTeX thinks is appropriate. We have computers so that people don’t have to worry about that sort of thing.

Q: How many spaces do you actually type?

A: Two. I was brainwashed by the program that taught me touch-typing.

Q: Me too! Science has just proved us right!

A: <eyeroll emoji>

Q: It’s in the Herald! And in the Washington Post: “One space between each sentence, they said. Science just proved them wrong.”

A: I see what they did there. <unimpressed emoji>

Q: But Science?

A: The researchers took a group of people who used two spaces after full stops and another group who used one, and got them to read text with different spacing conventions. They measured reading speed and comprehension.

Q: And the one-space texts were read faster and comprehended better?

A: “In the analysis of reading speed, although there was not an overall significant effect of period spacing or typing condition, there was a significant effect of comma spacing such that readers read paragraphs faster when they were written with only one space after the commas, as is the common convention.”

Q: That’s… not really what the story says.

A: No. No, it isn’t. There was a slight tendency for people to read faster when the spacing in the text matched the spacing they personally tended to use.

Q: And what do they mean about “only one space after the comma”?

A: They looked at one or two spaces after the comma as well as after the full stop.

Q: <anguished face emoji>

A: Indeed.

Q: But at least it shows two spaces at the end of the sentence is ok?

A: Under the conditions of the experiment, yes, there’s weak support for that point of view.

Q: I’m sensing an implied asterisk here.

A: The Herald story says, near the start “Some said this was blasphemy. The designers of modern fonts had built the perfect amount of spacing, they said. Anything more than a single space between sentences was too much.”

Q: Yes?

A: The experiment was done in the monospaced font Courier New.

Q: How is that relevant to, like, anything in the modern world? I mean, even if you want a monospaced font for coding or something, you can use Monaco or Hack or Consolas.

A: Well, you might think that if two spaces was bad in real-world use it would be even worse in Courier New, but it’s not how I would have done the experiment.

 

h/t David Hogg

May 6, 2018

The Midas touch

There’s a shocked story on the internet about a New York restaurant serving chicken wings coated in gold for US$1000.

It’s a great example of not doing the maths. Gold currently costs $42/gram, and culinary gold leaf is easily available for not much more than twice that.  A gram of gold leaf is a lot: nearly a square metre.

In fact, if you track down a more detailed source (and in this case I’m sorry to say the Daily Mail qualifies), you find that 10 gold-covered wings set you back $30, at a restaurant whose regular “small plates” wings are $15.

The $1000 price tag is for 50 wings plus a bottle of Champagne Armand de Brignac.  It’s the posh bubbly that explains the stratospheric price, not the gold.
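For anyone who wants to redo the maths, here is a rough sketch of the gold cost per wing. The price and coverage figures come from the paragraphs above; the surface area of a wing (`WING_AREA_CM2`) is a hypothetical guess of mine, so treat the output as an order-of-magnitude estimate only.

```python
# Figures taken from the post.
GOLD_PRICE_PER_GRAM = 42.0      # US$ per gram of gold
LEAF_MARKUP = 2.0               # culinary leaf costs "not much more than twice" the metal
LEAF_AREA_PER_GRAM_M2 = 1.0     # "nearly a square metre" per gram, rounded up to 1

# Hypothetical assumption, not from the post: area of gold leaf needed per wing.
WING_AREA_CM2 = 100.0


def gold_cost_per_wing(wing_area_cm2: float = WING_AREA_CM2) -> float:
    """Approximate cost in US$ of the gold leaf needed to cover one wing."""
    leaf_price_per_m2 = GOLD_PRICE_PER_GRAM * LEAF_MARKUP / LEAF_AREA_PER_GRAM_M2
    wing_area_m2 = wing_area_cm2 / 10_000  # 1 square metre = 10,000 square cm
    return leaf_price_per_m2 * wing_area_m2


if __name__ == "__main__":
    per_wing = gold_cost_per_wing()
    print(f"Gold per wing: about ${per_wing:.2f}")
    print(f"Gold for 50 wings: about ${50 * per_wing:.2f}")
```

Even if the assumed area is off by a factor of several, the gold is a small part of a $1000 bill.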

May 3, 2018

Undercounting

Kirsty Johnson and Chris Knox have a report in the Herald today on why the number of unresolved reports of sexual assault has gone up.

“On the face of it, one could assume the driver behind the spike in unresolved cases was the overall increase in reporting rates. Data shows reporting has also climbed steadily since last decade.”

That’s what I would have assumed. It’s true in a sense. But the change wasn’t more reporting to police; it was more reporting by police. Many cases where the police felt they could not get a conviction were classified for reporting purposes as not being reports of a crime and so were lumped together with false accusations and cases where the reported behaviour wasn’t criminal.

It’s good news that the police don’t do this any more, but it’s also important that the past minimisation of reports is now more widely known.

[I should also note that the reporters show their working: there’s a page on the data and calculations, and plenty of links to police documents and other sources.]

 

May 2, 2018

Briefly

  • From the journal BMJ: “Data about the timing of when laboratory tests were ordered were more accurate than the test results in predicting survival in 118 of 174 tests (68%).” In particular, for one of the tests they looked at, having the blood test ordered at 5am (regardless of the result) was more of a bad sign than having it ordered at 5pm and giving an unfavourable result. This isn’t all that surprising: if you get lab tests ordered at 5am there’s a problem, and a favourable result for the test just means the problem isn’t the one that the test was looking for. But it is potentially a problem for interpreting conclusions from large-scale data mining or, to be fair, from uninformed statistical analysis.
  • NBC News in Chicago did a story on direct-to-consumer genetic testing. As several journalists have done, they sent a DNA sample to multiple testing companies and compared the results.  They also did the same thing with a sample from a Labrador.  Most of the companies said the dog’s DNA didn’t genotype successfully. Most, but not all.
  • Tim Harford recommends books on understanding data-driven prediction algorithms
  • A talk by Jonathan Corum of the New York Times on graphics for science stories
  • From Karl Broman and Kara Woo: “Data Organization in Spreadsheets”. If you, or someone you know, uses spreadsheets …
  • How to display uncertainty in election predictions. From The Crosstab.
  • Someone in the US was hit by lightning, bitten by a shark, bitten by a venomous snake, and attacked by a bear in one year. National Geographic says:

“Since each event is independent the odds of each are multiplied together, he said, making the odds of this happening 893.35 quadrillion to one.”

Of course, they really aren’t independent, and the next sentence pretty much points this out:

McWilliams just chalks all this up to being in the wrong place at the wrong time. He encourages everyone to experience the outdoors. “I still go hiking, I still catch rattlesnakes, and I will still swim in the ocean,
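To see how much the independence assumption can matter, here is a toy sketch. Every number in it is hypothetical and chosen only for illustration: a small share of people spend a lot of time outdoors and face a much higher yearly risk of each of four events, and the events are independent only within each group.

```python
from math import prod

# All numbers below are hypothetical, chosen only to illustrate the point.
P_OUTDOORSY = 0.01                    # share of people with heavy outdoor exposure
RISK_OUTDOORSY = [1e-3, 1e-3, 1e-3, 1e-3]  # yearly risk of each event for that group
RISK_OTHERS = [1e-6, 1e-6, 1e-6, 1e-6]     # yearly risk of each event for everyone else


def marginal_risks():
    """Population-wide yearly probability of each event."""
    return [
        P_OUTDOORSY * high + (1 - P_OUTDOORSY) * low
        for high, low in zip(RISK_OUTDOORSY, RISK_OTHERS)
    ]


def naive_joint():
    """Multiply the population-wide risks, as if the events were independent."""
    return prod(marginal_risks())


def actual_joint():
    """Probability of all four events when independence holds only within a group."""
    return (
        P_OUTDOORSY * prod(RISK_OUTDOORSY)
        + (1 - P_OUTDOORSY) * prod(RISK_OTHERS)
    )


if __name__ == "__main__":
    print(f"Multiplying the marginals: {naive_joint():.3g}")
    print(f"Allowing for shared exposure: {actual_joint():.3g}")
    print(f"Ratio: {actual_joint() / naive_joint():.3g}")
```

With these made-up inputs, multiplying the marginal risks understates the chance of all four events by several orders of magnitude, because someone who hikes, swims in the ocean, and catches rattlesnakes is at raised risk of all of them at once.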

 

The breakthrough problem

Here’s a blog post from CSIRO: “Bloody data: biostatistics and the first blood test for Alzheimer’s”

Here are some of the media headlines:

New blood test detects Alzheimer’s disease up to 20 years before symptoms begin

New blood test could detect risk of Alzheimer’s disease 30 years early

‘Holy grail’ blood test for early Alzheimer’s indicators could help patients begin treatment before they show symptoms

World-first blood test to diagnose dementia

Science has created a simple blood test for Alzheimer’s

Here’s the research paper and two press releases about it: from the Florey Institute and from Melbourne Uni.

 

As StatsChat readers will know, this isn’t the first “world-first” early test for Alzheimer’s to end up in the media. And as you’ll expect, the test hasn’t actually been able to detect Alzheimer’s disease 20 or 30 years before symptoms begin. That would require either waiting 20 or 30 years to see who developed symptoms, or having a test that could be run with blood samples from 20 or 30 years ago.

What the test actually does is predict quite accurately how much amyloid-beta protein people have in their brain using just a blood sample. That’s useful, because the current ways of measuring amyloid in the brain are expensive and involve radiation and/or sampling fluid from the spine, and these aren’t things anyone wants to do at the level of population screening.

We don’t currently know if measuring amyloid in the brain gives good 20-year predictions of Alzheimer’s — there’s increasing suspicion that the relationship between Alzheimer’s and amyloid is more complicated than was thought.   But it’s the best candidate we’ve got for an early test, and a low-cost, low-risk, low-pain version of it would be very useful in selecting people for clinical trials targeting amyloid levels — as I’ve said about a few of the other “first blood test for Alzheimer’s” candidates.

The research paper did look at using the test to distinguish people with definite, known Alzheimer’s from a healthy comparison group.  They found the test picked up almost all of the Alzheimer’s cases, but gave false positives in nearly 20% of the healthy controls.  Maybe that’s just saying that 20% of the controls will end up with the disease in 20-30 years, but there’s no way to know that in the short term.

This is good research, giving a genuine improvement on an important component of one step in testing potential future drugs to prevent Alzheimer’s. And the actual text of some of the stories is quite reasonable. The headlines, however, are actively misleading, because of the need to frame science stories as breakthroughs.

This isn’t the Holy Grail. Indiana Jones isn’t just ten minutes and three plot twists away from the final credits. That just isn’t how science typically works.

May 1, 2018

Super 15 Predictions for Round 12

Team Ratings for Round 12

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team          Current Rating   Rating at Season Start   Difference
Hurricanes    15.41            16.18                    -0.80
Crusaders     14.50            15.23                    -0.70
Highlanders   11.01            10.29                    0.70
Chiefs        9.89             9.29                     0.60
Lions         8.93             13.81                    -4.90
Bulls         -0.05            -4.79                    4.70
Stormers      -0.31            1.48                     -1.80
Sharks        -0.59            1.02                     -1.60
Blues         -1.96            -0.24                    -1.70
Waratahs      -2.18            -3.92                    1.70
Brumbies      -2.53            1.75                     -4.30
Jaguares      -3.12            -4.64                    1.50
Reds          -8.71            -9.47                    0.80
Rebels        -9.87            -14.96                   5.10
Sunwolves     -17.80           -18.42                   0.60

 

Performance So Far

So far there have been 67 matches played, 48 of which were correctly predicted, a success rate of 71.6%.
Here are the predictions for last week’s games.

Game                         Date     Score     Prediction   Correct
1 Hurricanes vs. Sunwolves   Apr 27   43 – 15   38.50        TRUE
2 Stormers vs. Rebels        Apr 27   34 – 18   13.20        TRUE
3 Reds vs. Lions             Apr 28   27 – 22   -16.20       FALSE
4 Blues vs. Jaguares         Apr 28   13 – 20   6.80         FALSE
5 Brumbies vs. Crusaders     Apr 28   8 – 21    -13.00       TRUE
6 Bulls vs. Highlanders      Apr 28   28 – 29   -7.90        TRUE
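The Correct column appears to record whether the predicted margin and the actual margin favour the same side. Here is a minimal sketch of that check using last week’s games. It assumes the home team is listed first, and the handling of draws and zero predictions is my own choice for the sketch, not something the post specifies.

```python
# Last week's Super Rugby results as listed in the table above:
# (home team, away team, home score, away score, predicted home margin).
# Assumption: the first-named team is the home team.
RESULTS = [
    ("Hurricanes", "Sunwolves", 43, 15, 38.50),
    ("Stormers", "Rebels", 34, 18, 13.20),
    ("Reds", "Lions", 27, 22, -16.20),
    ("Blues", "Jaguares", 13, 20, 6.80),
    ("Brumbies", "Crusaders", 8, 21, -13.00),
    ("Bulls", "Highlanders", 28, 29, -7.90),
]


def is_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins have the same sign.

    Draws and predictions of exactly zero count as incorrect here,
    which is an assumption made for this sketch.
    """
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0


def success_rate(n_correct, n_played):
    """Percentage of matches predicted correctly, to one decimal place."""
    return round(100 * n_correct / n_played, 1)


correct = [is_correct(h, a, p) for _, _, h, a, p in RESULTS]

if __name__ == "__main__":
    print(correct)
    print(f"{sum(correct)} of {len(correct)} correct last week")
    print(f"Season so far: {success_rate(48, 67)}%")
```

The season totals passed to `success_rate` are the 48 correct predictions out of 67 matches quoted above.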

 

Predictions for Round 12

Here are the predictions for Round 12. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                       Date     Winner        Prediction
1 Chiefs vs. Jaguares      May 04   Chiefs        17.00
2 Rebels vs. Crusaders     May 04   Crusaders     -20.40
3 Waratahs vs. Blues       May 05   Waratahs      3.80
4 Hurricanes vs. Lions     May 05   Hurricanes    10.50
5 Stormers vs. Bulls       May 05   Stormers      3.20
6 Sharks vs. Highlanders   May 05   Highlanders   -7.60

 

NRL Predictions for Round 9

Team Ratings for Round 9

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team           Current Rating   Rating at Season Start   Difference
Storm          14.61            16.73                    -2.10
Dragons        5.12             -0.45                    5.60
Panthers       4.33             2.64                     1.70
Broncos        1.24             4.78                     -3.50
Sharks         0.70             2.20                     -1.50
Wests Tigers   0.61             -3.63                    4.20
Raiders        0.57             3.50                     -2.90
Rabbitohs      0.07             -3.90                    4.00
Roosters       -0.10            0.13                     -0.20
Cowboys        -1.72            2.97                     -4.70
Eels           -2.33            1.51                     -3.80
Warriors       -2.86            -6.97                    4.10
Bulldogs       -2.95            -3.43                    0.50
Sea Eagles     -5.50            -1.07                    -4.40
Knights        -6.04            -8.43                    2.40
Titans         -8.06            -8.91                    0.90

 

Performance So Far

So far there have been 64 matches played, 33 of which were correctly predicted, a success rate of 51.6%.
Here are the predictions for last week’s games.

Game                       Date     Score     Prediction   Correct
1 Dragons vs. Roosters     Apr 25   24 – 8    7.00         TRUE
2 Storm vs. Warriors       Apr 25   50 – 10   19.00        TRUE
3 Rabbitohs vs. Broncos    Apr 26   20 – 24   2.80         FALSE
4 Sea Eagles vs. Knights   Apr 27   12 – 18   5.10         FALSE
5 Panthers vs. Bulldogs    Apr 27   22 – 14   10.70        TRUE
6 Titans vs. Sharks        Apr 28   9 – 10    -6.50        TRUE
7 Cowboys vs. Raiders      Apr 28   8 – 18    2.40         FALSE
8 Eels vs. Wests Tigers    Apr 29   24 – 22   -0.30        FALSE

 

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game                          Date     Winner      Prediction
1 Broncos vs. Bulldogs        May 03   Broncos     7.20
2 Knights vs. Rabbitohs       May 04   Rabbitohs   -3.10
3 Panthers vs. Cowboys        May 04   Panthers    9.10
4 Raiders vs. Titans          May 05   Raiders     11.60
5 Warriors vs. Wests Tigers   May 05   Warriors    1.00
6 Sharks vs. Eels             May 05   Sharks      6.00
7 Dragons vs. Storm           May 06   Storm       -6.50
8 Roosters vs. Sea Eagles     May 06   Roosters    8.40

 

Aviva Premiership Predictions for Round 22

Team Ratings for Round 22

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

 

Team                 Current Rating   Rating at Season Start   Difference
Saracens             12.08            7.47                     4.60
Exeter Chiefs        10.26            7.99                     2.30
Wasps                5.74             5.89                     -0.20
Leicester Tigers     3.90             4.64                     -0.70
Bath Rugby           0.96             1.23                     -0.30
Sale Sharks          0.89             -1.73                    2.60
Gloucester Rugby     -0.73            0.21                     -0.90
Newcastle Falcons    -1.34            -3.33                    2.00
Northampton Saints   -1.90            1.53                     -3.40
Harlequins           -3.08            0.84                     -3.90
Worcester Warriors   -4.72            -4.37                    -0.30
London Irish         -6.62            -4.94                    -1.70

 

Performance So Far

So far there have been 126 matches played, 87 of which were correctly predicted, a success rate of 69%.
Here are the predictions for last week’s games.

 

Game                                       Date     Score     Prediction   Correct
1 Leicester Tigers vs. Newcastle Falcons   Apr 27   33 – 25   8.30         TRUE
2 Exeter Chiefs vs. Sale Sharks            Apr 28   34 – 19   12.00        TRUE
3 Gloucester Rugby vs. Bath Rugby          Apr 28   20 – 43   2.90         FALSE
4 Worcester Warriors vs. Harlequins        Apr 28   44 – 13   -0.50        FALSE
5 London Irish vs. Saracens                Apr 29   14 – 51   -14.20       TRUE
6 Wasps vs. Northampton Saints             Apr 29   36 – 29   11.10        TRUE

 

Predictions for Round 22

Here are the predictions for Round 22. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

 

Game                                          Date     Winner               Prediction
1 Bath Rugby vs. London Irish                 May 05   Bath Rugby           10.60
2 Harlequins vs. Exeter Chiefs                May 05   Exeter Chiefs        -10.30
3 Newcastle Falcons vs. Wasps                 May 05   Wasps                -4.10
4 Northampton Saints vs. Worcester Warriors   May 05   Northampton Saints   5.80
5 Sale Sharks vs. Leicester Tigers            May 05   Leicester Tigers     -0.00
6 Saracens vs. Gloucester Rugby               May 05   Saracens             15.80
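The -0.00 for Sale Sharks vs. Leicester Tigers looks odd, but it is what you would expect if the underlying prediction is a very small negative number rounded to two decimal places, with the winner picked from the sign before rounding. A small sketch of that reading follows; the unrounded value is hypothetical, since only the rounded figure is shown, and `predicted_winner` is an illustrative helper rather than the code behind these tables.

```python
def predicted_winner(home, away, predicted_margin):
    """Pick the winner from the sign of the unrounded margin.

    A positive margin is a home win and a negative margin an away win,
    as described above. A margin of exactly zero falls to the away team
    here, which is an arbitrary choice for this sketch.
    """
    return home if predicted_margin > 0 else away


# Hypothetical unrounded prediction: the table only shows -0.00.
margin = -0.004

if __name__ == "__main__":
    # Rounding for display keeps the minus sign on a small negative number.
    print(f"{margin:.2f}")
    print(predicted_winner("Sale Sharks", "Leicester Tigers", margin))
```

In other words, that game is as close to a coin toss as these predictions get.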