Posts from August 2015 (48)

August 26, 2015

NRL Predictions for Round 25

Team Ratings for Round 25

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team           Current Rating   Rating at Season Start   Difference
Roosters       9.80             9.09                     0.70
Broncos        7.71             4.03                     3.70
Cowboys        7.26             9.52                     -2.30
Rabbitohs      4.27             13.06                    -8.80
Bulldogs       3.96             0.21                     3.80
Storm          3.50             4.36                     -0.90
Sea Eagles     2.25             2.68                     -0.40
Dragons        0.62             -1.74                    2.40
Sharks         -0.86            -10.76                   9.90
Raiders        -2.27            -7.09                    4.80
Panthers       -2.92            3.69                     -6.60
Eels           -4.32            -7.19                    2.90
Knights        -4.74            -0.28                    -4.50
Warriors       -6.20            3.07                     -9.30
Wests Tigers   -7.77            -13.13                   5.40
Titans         -8.96            -8.20                    -0.80

Performance So Far

So far there have been 176 matches played, 100 of which were correctly predicted, a success rate of 56.8%.

Here are the predictions for last week’s games.

No.   Game                      Date    Score    Prediction  Correct
1     Dragons vs. Panthers      Aug 20  19 – 12  6.40        TRUE
2     Rabbitohs vs. Bulldogs    Aug 21  18 – 32  6.10        FALSE
3     Sharks vs. Wests Tigers   Aug 22  40 – 18  7.90        TRUE
4     Warriors vs. Cowboys      Aug 22  16 – 50  -5.60       TRUE
5     Roosters vs. Broncos      Aug 22  12 – 10  5.60        TRUE
6     Titans vs. Raiders        Aug 23  28 – 12  -6.80       FALSE
7     Sea Eagles vs. Eels       Aug 23  16 – 20  11.80       FALSE
8     Storm vs. Knights         Aug 24  6 – 20   15.20       FALSE

Predictions for Round 25

Here are the predictions for Round 25. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

No.   Game                       Date    Winner        Prediction
1     Rabbitohs vs. Broncos      Aug 27  Broncos       -0.40
2     Sea Eagles vs. Roosters    Aug 28  Roosters      -4.60
3     Eels vs. Sharks            Aug 29  Sharks        -0.50
4     Knights vs. Bulldogs       Aug 29  Bulldogs      -5.70
5     Storm vs. Cowboys          Aug 29  Cowboys       -0.80
6     Wests Tigers vs. Warriors  Aug 30  Wests Tigers  2.40
7     Titans vs. Dragons         Aug 30  Dragons       -6.60
8     Raiders vs. Panthers       Aug 31  Raiders       3.60
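
As a small illustration of the sign convention, the sketch below maps a predicted margin to the predicted winner. The function name `predicted_winner` is made up for this example, and the margins are copied from the table above; it says nothing about how the margins themselves are estimated.

```python
def predicted_winner(home: str, away: str, margin: float) -> str:
    """Return the side favoured by a predicted margin (home points minus away points)."""
    if margin > 0:
        return home   # positive margin: win to the home team
    if margin < 0:
        return away   # negative margin: win to the away team
    return "draw"     # a margin of exactly zero favours neither side


# Three of the Round 25 games: (home team, away team, predicted margin)
games = [
    ("Rabbitohs", "Broncos", -0.40),
    ("Wests Tigers", "Warriors", 2.40),
    ("Raiders", "Panthers", 3.60),
]

for home, away, margin in games:
    print(f"{home} vs. {away}: {predicted_winner(home, away, margin)} ({margin:+.2f})")
```

The home team is always listed first, so only the sign of the margin is needed to read off the winner.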

August 25, 2015

Computation and art

[Image: fishiness scatterplot]

Normally I wouldn’t be linking favourably to this scatterplot, which has an ill-defined sampling scheme, and where at least the y-axis data are objectively wrong.  On the other hand, normally the scatterplot would be there to convey information.  In this case it’s just an index to some beautiful animated triangular art

[Image: animated triangular shark]

The point, and the relevance to this blog, is the way Matt Daniels has written software to make these pictures (relatively) easy to create.

Incidentally, before anyone starts complaining that sharks and fish are separate, that bit is exactly correct.  Fish (typical fish with bones, such as the swordfish in the animation) have a more recent common ancestor with sheep than with sharks.

August 24, 2015

Stat of the Week Competition: August 22 – 28 2015

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday August 28 2015.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of August 22 – 28 2015 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

Stat of the Week Competition Discussion: August 22 – 28 2015

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

August 23, 2015

Barcharts with delusions of grandeur

The cricket graphics system now allows 3-d barcharts projected over the playing field, and casting actual virtual shadows.

[Image: chartjunk1, a 3-d barchart projected over the cricket field]

Yeah, nah.

Briefly

  • I can’t resist mentioning a pointless number puzzle originating in New York. How does this sequence continue: 50, 42, 34, 23, 14,…
  • Big Data: from Jennifer Gardy, a Canadian genomic epidemiologist, on Twitter. This is 5000TB, ie, about 5 million gigabytes, of genetic sequence data.
  • From Upshot at the New York Times again, How to Know Whether to Believe a Health Study. StatsChat readers will be familiar with most of this.
  • Another problem with errors in automatic classifications, for apps that purport to recognise bird songs
    If it tells you a nuthatch’s call is in fact a great tit (a bird which, by the way, is thought to have over 40 different types of call), you’ll take that mistake away with you and carry on misdiagnosing nuthatches.
    (On the other hand, I completely disagree with the argument that recognising bird songs should take work, dammit)
  • Nathan Yau, at Flowing Data, is recreating the Statistical Atlas of the United States, using modern data and the 1870-style graphics.

August 22, 2015

Changing who you count

The New York Times has a well-deserved reputation for data journalism, but anyone can have a bad day.  There’s a piece by Steven Johnson on the non-extinction of the music industry (which I think makes some good points), but which the Future of Music Coalition doesn’t like at all. And they also have some good points.

In particular, Johnson says

“According to the OES, in 1999 there were nearly 53,000 Americans who considered their primary occupation to be that of a musician, a music director or a composer; in 2014 more than 60,000 people were employed writing, singing, or playing music. That’s a rise of 15 percent.”

He’s right. This is a graph (not that you really need one)

[Graph: twopoints, the 1999 and 2014 totals]

The Future of Music Coalition give the numbers for each year, and they’re interesting. Here’s a graph of the totals:

[Graph: allpoints, the totals for each year]

There isn’t a simple increase; there’s a weird two-humped pattern. Why?

Well, if you look at the two categories, “Music Directors and Composers” and “Musicians and Singers”, making up the total, it’s quite revealing

[Graph: twolines, the two categories by year]

The larger category, “Musicians and Singers”, has been declining.  The smaller category, “Music Directors and Composers”, was going up slowly, then had a dramatic three-year, straight-line increase, then decreased a bit.

Going into the Technical Notes for the estimates (eg, 2009), we see:

May 2009 estimates are based on responses from six semiannual panels collected over a 3-year period

That means the three-year increase of 5000 jobs/year is probably a one-off increase of 15,000 jobs. Either the number of “Music Directors and Composers” more than doubled in 2009, or, more likely, there was a change in definitions or sampling approach.  The Future of Music Coalition point out that the Bureau of Labor Statistics FAQs say this is a problem (though they’ve got the wrong link: it’s here, question F.1):

Challenges in using OES data as a time series include changes in the occupational, industrial, and geographical classification systems
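
To see why a one-off jump can look like a steady three-year climb, here is a minimal sketch. It assumes the published figure behaves like an equal-weight trailing average of three yearly levels, which is a simplification of the six-panel scheme, and the starting level of 10,000 is purely illustrative rather than taken from the OES data.

```python
def trailing_mean(values, window=3):
    """Equal-weight trailing average; early entries use as many values as exist."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical underlying level: a single step of 15,000 after year four.
true_level = [10_000] * 4 + [25_000] * 4

published = trailing_mean(true_level)
yearly_change = [b - a for a, b in zip(published, published[1:])]

print(published)      # the step is spread over three successive estimates
print(yearly_change)  # three consecutive rises of 5000, then flat again
```

Under that assumption, the single reclassification shows up as three equal yearly rises, which matches the straight-line shape in the smaller category.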

In particular, the 2008 statistics estimate only 390 of these people as being employed in primary and secondary schools; the 2009 estimate is 6000, and the 2011 estimate is 16880. A lot of primary and secondary school teachers got reclassified into this group; it wasn’t a real increase.

When the school teachers are kept out of “Music Directors and Composers”, to get better comparability across years, the change is from 53000 in 1999 to 47000 in 2014. That’s not a 15% increase; it’s an 11% decrease.
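
The adjusted comparison is simple arithmetic. A minimal sketch, using the rounded figures quoted above (the helper name `percent_change` is just for this example):

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return 100 * (new - old) / old


# Totals with the reclassified school teachers left out
adjusted = percent_change(53_000, 47_000)
print(f"{adjusted:.1f}%")  # a fall of roughly 11 percent
```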

Official statistics agencies try not to change their definitions, precisely because of this problem, but they do have to keep up with a changing world. In the other direction, I wrote about a failure to change definitions that led the US Census Bureau to report that four times as many pre-schoolers were cared for by fathers as by mothers.

August 20, 2015

The second-best way to prevent hangovers?

From Stuff: “Korean pears are the best way to prevent hangovers, say scientists.”

This is precisely not what scientists say; in fact, the scientist in question is even quoted (in the last line of the story) as not saying that.

Meanwhile, as a responsible scientist, she reminded that abstaining from excess alcohol consumption is the only certain way to avoid a hangover.

At least Stuff got ‘prevention’ in the headline. Many other sources, such as the Daily Mail, led with claims of a “hangover cure.”  The Mail also illustrated the story with a photo of the wrong species: the research was on the Asian species Pyrus pyrifolia, rather than the European pear Pyrus communis. CSIRO hopes that European pears are effective, since that’s what Australia has vast quantities of, but they weren’t tested.

What Stuff doesn’t seem to have noticed is that this isn’t a new CSIRO discovery. The blog post certainly doesn’t go out of its way to make that obvious, but right at the bottom, after the cat picture, the puns, and the Q&A with the researcher, you can read

Manny also warns this is only a preliminary scoping study, with the results yet to be finalised. Ultimately, her team hope to deliver a comprehensive review of the scientific literature on pears, pear components and relevant health measures.

That is, the experimental study on Korean pears isn’t new research done at CSIRO. It’s research done in Korea, and published a couple of years ago. There’s nothing wrong with this, though it would have been nice to give credit, and it would have made the choice of Korean pears less mysterious.

The Korean researchers recruited a group of young Korean men, and gave alcohol (in the form of soju), preceded by either Korean pear juice or placebo pear juice (pear-flavoured sweetened water).  Blood chemistry studies, as well as research in mice by the same group, suggest that the pear juice speeds up the metabolism of alcohol and acetaldehyde. This didn’t prevent hangovers, but it did seem to lead to a small reduction in hangover severity.

The study was really too small to be very convincing. Perhaps more importantly, the alcohol dose was about eight and a half standard drinks (540ml of 20% alcohol, roughly 85g of ethanol) over a short period of time, so you’d hope it was relevant to a fairly small group of people.  Even in Australia.

August 19, 2015

Stereotype and caricature

I’ve posted a few times about the maps, word clouds, and so on that show the most distinctive words by gender or state — sometimes they are even mislabelled as the “most common” words.  As I explained, these are often very rare words; it’s just that they are slightly less rare in one group than in the others.
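
A toy sketch of the difference between “most common” and “most distinctive”. The counts below are invented for illustration (they are not from any survey), and the smoothed frequency ratio is just one simple way to score distinctiveness, not necessarily what any of those maps or word clouds used.

```python
from collections import Counter

# Hypothetical response counts for two groups
group_a = Counter({"green": 900, "blue": 850, "teal": 40, "dusty teal": 8})
group_b = Counter({"green": 920, "blue": 870, "teal": 30, "dunno": 9})


def most_distinctive(target: Counter, other: Counter, smoothing: float = 0.5) -> str:
    """Word in `target` with the largest ratio of relative frequencies versus `other`."""
    target_total = sum(target.values())
    other_total = sum(other.values())

    def ratio(word: str) -> float:
        # Smoothing keeps the ratio finite when the other group never used the word.
        in_target = (target[word] + smoothing) / target_total
        in_other = (other[word] + smoothing) / other_total
        return in_target / in_other

    return max(target, key=ratio)


print(group_a.most_common(1)[0][0], "|", most_distinctive(group_a, group_b))
print(group_b.most_common(1)[0][0], "|", most_distinctive(group_b, group_a))
```

Both groups share the same most common word, while the most distinctive word in each is one that fewer than one response in a hundred used.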

An old post from the XKCD blog gives a really good example. Randall Munroe set up a survey to show people colours and ask for the colour name. He got five million responses, from over 200,000 sessions, and came up with nearly 1000 reasonably well-characterised colours.  You can download the complete data, if you care.

The survey asked participants about their chromosomal sex, because two of the colour receptor genes are on the X-chromosome and this is linked to colour blindness (and possibly to tetrachromatic vision). It turned out that the basic colour names were very similar between male and female respondents, though women were slightly more likely to use modifiers (“lime green” vs “green”).

However, Munroe also looked at the responses that differed most in frequency between men and women. These were all uncommon responses, but all from multiple people, and after extensive spam filtering.

You can probably guess which group is which:

  1. Dusty Teal
  2. Blush Pink
  3. Dusty Lavender
  4. Butter Yellow
  5. Dusky Rose

  1. Penis
  2. Gay
  3. WTF
  4. Dunno
  5. Baige

(Presumably this is a gender effect, not an X-linked language defect.)

ITM Cup Predictions for Round 2

Team Ratings for Round 2

The basic method is described on my Department home page.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team              Current Rating   Rating at Season Start   Difference
Tasman            12.79            12.86                    -0.10
Canterbury        11.10            10.90                    0.20
Counties Manukau  6.75             7.86                     -1.10
Taranaki          5.78             7.70                     -1.90
Auckland          4.50             5.14                     -0.60
Hawke’s Bay       1.41             -0.57                    2.00
Manawatu          -0.41            -1.52                    1.10
Wellington        -2.70            -4.62                    1.90
Otago             -5.04            -4.84                    -0.20
Southland         -5.37            -6.01                    0.60
Northland         -5.61            -3.64                    -2.00
Waikato           -6.88            -6.96                    0.10
Bay of Plenty     -9.39            -9.77                    0.40
North Harbour     -10.92           -10.54                   -0.40

Performance So Far

So far there have been 7 matches played, 5 of which were correctly predicted, a success rate of 71.4%.

Here are the predictions for last week’s games.


No.   Game                             Date    Score    Prediction  Correct
1     Southland vs. Auckland           Aug 13  23 – 23  -7.20       FALSE
2     Waikato vs. Tasman               Aug 14  20 – 35  -15.80      TRUE
3     Bay of Plenty vs. North Harbour  Aug 14  20 – 11  4.80        TRUE
4     Taranaki vs. Wellington          Aug 15  14 – 19  16.30       FALSE
5     Otago vs. Canterbury             Aug 15  24 – 38  -11.70      TRUE
6     Counties Manukau vs. Manawatu    Aug 16  36 – 35  13.40       TRUE
7     Hawke’s Bay vs. Northland        Aug 16  39 – 10  7.10        TRUE
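
The Correct column can be reproduced from the scores and predictions, assuming a prediction counts as correct only when the actual margin has the same sign as the predicted one (so a draw counts as a miss, as in game 1). A minimal sketch, with `is_correct` a hypothetical helper name:

```python
def is_correct(home_score: int, away_score: int, predicted_margin: float) -> bool:
    """True when the actual and predicted margins favour the same side."""
    actual_margin = home_score - away_score
    # Same sign gives a positive product; a drawn game gives zero and counts as a miss.
    return actual_margin * predicted_margin > 0


# (home score, away score, predicted margin) for last week's seven games
results = [
    (23, 23, -7.20),
    (20, 35, -15.80),
    (20, 11, 4.80),
    (14, 19, 16.30),
    (24, 38, -11.70),
    (36, 35, 13.40),
    (39, 10, 7.10),
]

n_correct = sum(is_correct(*game) for game in results)
success_rate = 100 * n_correct / len(results)
print(f"{n_correct} of {len(results)} correct ({success_rate:.1f}%)")
```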

Predictions for Round 2

Here are the predictions for Round 2. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.


No.   Game                             Date    Winner       Prediction
1     North Harbour vs. Wellington     Aug 20  Wellington   -4.20
2     Tasman vs. Bay of Plenty         Aug 21  Tasman       26.20
3     Manawatu vs. Waikato             Aug 22  Manawatu     10.50
4     Northland vs. Southland          Aug 22  Northland    3.80
5     Otago vs. Hawke’s Bay            Aug 22  Hawke’s Bay  -2.40
6     Auckland vs. Taranaki            Aug 23  Auckland     2.70
7     Canterbury vs. Counties Manukau  Aug 23  Canterbury   8.40