Posts from November 2018 (10)

November 28, 2018

Rugby Premiership Predictions for Round 9

Team Ratings for Round 9

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Current Rating Rating at Season Start Difference
Saracens 12.17 11.19 1.00
Exeter Chiefs 12.00 11.13 0.90
Wasps 4.90 8.30 -3.40
Leicester Tigers 3.70 6.26 -2.60
Gloucester Rugby 3.35 1.23 2.10
Northampton Saints 3.06 3.42 -0.40
Harlequins 1.77 2.05 -0.30
Bath Rugby 1.34 3.11 -1.80
Sale Sharks -0.30 -0.81 0.50
Worcester Warriors -1.92 -5.18 3.30
Newcastle Falcons -3.07 -3.51 0.40
Bristol -5.43 -5.60 0.20

 

Performance So Far

So far there have been 48 matches played, 38 of which were correctly predicted, a success rate of 79.2%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Newcastle Falcons vs. Bath Rugby Nov 23 16 – 8 0.30 TRUE
2 Worcester Warriors vs. Harlequins Nov 23 20 – 13 0.70 TRUE
3 Exeter Chiefs vs. Gloucester Rugby Nov 24 23 – 6 13.50 TRUE
4 Sale Sharks vs. Northampton Saints Nov 24 18 – 13 1.50 TRUE
5 Wasps vs. Bristol Nov 24 32 – 28 17.10 TRUE
6 Leicester Tigers vs. Saracens Nov 25 22 – 27 -2.50 TRUE

 

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Harlequins vs. Exeter Chiefs Nov 30 Exeter Chiefs -4.70
2 Bristol vs. Leicester Tigers Dec 01 Leicester Tigers -3.60
3 Gloucester Rugby vs. Worcester Warriors Dec 01 Gloucester Rugby 10.80
4 Northampton Saints vs. Newcastle Falcons Dec 01 Northampton Saints 11.60
5 Saracens vs. Wasps Dec 01 Saracens 12.80
6 Bath Rugby vs. Sale Sharks Dec 02 Bath Rugby 7.10
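
As a rough check on how these margins relate to the ratings table, the sketch below subtracts the rating difference (home minus away) from each published prediction. It assumes the prediction is approximately the rating difference plus a home-ground allowance. That is only one reading of the numbers in this post, not a description of the actual method, and the variable names are illustrative.

```python
# Ratings and Round 9 predictions copied from the tables in this post.
ratings = {
    "Saracens": 12.17, "Exeter Chiefs": 12.00, "Wasps": 4.90,
    "Leicester Tigers": 3.70, "Gloucester Rugby": 3.35,
    "Northampton Saints": 3.06, "Harlequins": 1.77, "Bath Rugby": 1.34,
    "Sale Sharks": -0.30, "Worcester Warriors": -1.92,
    "Newcastle Falcons": -3.07, "Bristol": -5.43,
}

fixtures = [  # (home team, away team, published prediction)
    ("Harlequins", "Exeter Chiefs", -4.70),
    ("Bristol", "Leicester Tigers", -3.60),
    ("Gloucester Rugby", "Worcester Warriors", 10.80),
    ("Northampton Saints", "Newcastle Falcons", 11.60),
    ("Saracens", "Wasps", 12.80),
    ("Bath Rugby", "Sale Sharks", 7.10),
]

# Assumption (not stated in the post):
#   prediction = home rating - away rating + home advantage
implied_home_advantage = [
    round(pred - (ratings[home] - ratings[away]), 2)
    for home, away, pred in fixtures
]

for (home, away, _), adv in zip(fixtures, implied_home_advantage):
    print(f"{home} vs. {away}: implied home advantage {adv}")
```

Under that assumption, every fixture implies a home advantage of roughly 5.5 points. The small spread is what you would expect from ratings and predictions being rounded for display.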

 

Pro14 Predictions for Round 10

Team Ratings for Round 10

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Current Rating Rating at Season Start Difference
Leinster 12.53 9.80 2.70
Glasgow Warriors 9.87 8.55 1.30
Munster 9.34 8.08 1.30
Scarlets 5.36 6.39 -1.00
Connacht 1.88 0.01 1.90
Cardiff Blues 0.37 0.24 0.10
Ulster 0.31 2.07 -1.80
Ospreys -0.50 -0.86 0.40
Edinburgh -1.29 -0.64 -0.70
Cheetahs -2.61 -0.83 -1.80
Treviso -4.36 -5.19 0.80
Dragons -9.31 -8.59 -0.70
Southern Kings -9.44 -7.91 -1.50
Zebre -11.61 -10.57 -1.00

 

Performance So Far

So far there have been 63 matches played, 52 of which were correctly predicted, a success rate of 82.5%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Scarlets vs. Ulster Nov 24 29 – 12 8.10 TRUE
2 Leinster vs. Ospreys Nov 24 52 – 7 15.50 TRUE
3 Glasgow Warriors vs. Cardiff Blues Nov 24 40 – 15 13.00 TRUE
4 Cheetahs vs. Treviso Nov 24 31 – 25 6.30 TRUE
5 Southern Kings vs. Connacht Nov 26 14 – 31 -5.80 TRUE
6 Zebre vs. Munster Nov 26 7 – 32 -15.60 TRUE
7 Dragons vs. Edinburgh Nov 26 18 – 12 -4.40 FALSE
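
The "Correct" column and the running success rate can be reproduced from the table. The sketch below assumes a prediction counts as correct when the predicted margin and the actual margin have the same sign (so a draw, or a predicted margin of exactly zero, would count as a miss). That scoring rule is an assumption, not something the post states.

```python
# Last week's Pro14 results, copied from the table above:
# (home points, away points, predicted margin, reported "Correct" flag)
results = [
    (29, 12, 8.10, True),     # Scarlets vs. Ulster
    (52, 7, 15.50, True),     # Leinster vs. Ospreys
    (40, 15, 13.00, True),    # Glasgow Warriors vs. Cardiff Blues
    (31, 25, 6.30, True),     # Cheetahs vs. Treviso
    (14, 31, -5.80, True),    # Southern Kings vs. Connacht
    (7, 32, -15.60, True),    # Zebre vs. Munster
    (18, 12, -4.40, False),   # Dragons vs. Edinburgh
]


def picked_winner(home_pts, away_pts, predicted_margin):
    """True when predicted and actual margins have the same sign."""
    actual_margin = home_pts - away_pts
    return actual_margin * predicted_margin > 0


flags = [picked_winner(h, a, p) for h, a, p, _ in results]

# Running tally: the previous Pro14 post reports 46 correct from 56 matches.
correct = 46 + sum(flags)
played = 56 + len(results)
success_rate = round(100 * correct / played, 1)

print(flags)
print(f"{correct} of {played} correct: {success_rate}%")
```

The computed flags match the reported column, and adding this round to the earlier tally gives the 52 of 63 quoted above.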

 

Predictions for Round 10

Here are the predictions for Round 10. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Munster vs. Edinburgh Dec 01 Munster 15.10
2 Ospreys vs. Zebre Dec 01 Ospreys 15.60
3 Cheetahs vs. Connacht Dec 02 Cheetahs 0.00
4 Ulster vs. Cardiff Blues Dec 02 Ulster 4.40
5 Dragons vs. Leinster Dec 02 Leinster -17.30
6 Glasgow Warriors vs. Scarlets Dec 02 Glasgow Warriors 9.00
7 Southern Kings vs. Treviso Dec 02 Treviso -0.60

 

November 27, 2018

NZ Census updates

Setting the record straight?

Brian Wansink, a prominent food researcher from Cornell, was forced to retire earlier this year. Andrew Gelman has some good perspectives. Wansink’s research was on contextual effects on eating — eg, the impact of plate size — and a bunch of these papers have now been retracted.

This week, Wired has a Thanksgiving-themed article about his research. It includes this quote from an email he sent to colleagues ahead of his retirement:

“We may believe that our papers have been unfairly retracted. But what they can’t retract is the impact these have had on people’s lives and the impact they will continue to have.”

That’s a perfect summary of the problems with studies that over-promise and are over-publicised.  Whether the research was done well or not, it’s going to stick.  Subsequent developments — whether modifications, replication failures, or retractions — never get the impact of the original claim.

November 26, 2018

Briefly

  • “Rather than assume algorithms will produce better outcomes and hope they don’t accelerate discrimination, we should assume they will be discriminatory and inequitable unless designed specifically to redress these issues.” Lucy Bernholz
  • “Introduction of [a predictive risk screening tool] resulted in a statistically significant increase in emergency hospital admissions and use of other [National Health] services without evidence of benefits to patients or the [National Health Service].” In the academic journal BMJ, so a bit more technical
  • Why the NY Times map of the US election results is so good: a Twitter thread
  • Stacey Kirk in the Sunday Star-Times on the campaign to get Pharmac to pay for one of the most expensive drugs in the world.
  • Interesting interactive in the Herald about quality-of-life and work in NZ cities. It’s very economist in style. That’s true in the good sense that it appreciates high house prices are a signal that lots of people want to live somewhere and low house prices are a signal that lots of people don’t. It’s also true in the bad sense that there are some places where not many people want to live, but the people who do live there really like it, and this sort of analysis suppresses that variation in preferences.
  • Interesting book on data science and data use: “Data Feminism”

Privacy and mathwashing

The Herald has a story from the Washington Post on an “AI” screening tool for babysitters, that allegedly uses both computer vision and text processing to screen social media for risk factors.   Here are three quotes from it:

1. A company co-founder says:

Parents, he said, should see the ratings as a companion that “may or may not reflect the sitter’s actual attributes.”

But the danger of hiring a problematic or violent babysitter, he added, makes the AI a necessary tool for any parent hoping to keep his or her child safe.

The first thing to note about this is that you could make the same claims about astrology or handwriting analysis or a tarot reading. There’s no quantitative information about accuracy given, and it’s hard to see how the company could even know much about how accurate its ratings were, or how biased. It’s not even certain that the risk rating is positively correlated with risk to kids; the company seems careful not to make even a claim this weak.

 

2.

Parents could, presumably, look at their sitters’ public social media accounts themselves. But the computer-generated reports promise an in-depth inspection of years of online activity, boiled down to a single digit: an intoxicatingly simple solution to an impractical task.

If the algorithms actually predicted risk better and were less biased than typical employers there might be an advantage to this: your social media would be shared with a faceless US company rather than your potential employer, so there might be less actual privacy invasion — after all, some faceless US companies already have your social media. It might be less embarrassing than your boss knowing what your favourite member of the appropriate sex calls you. The computer could also be set up to ignore irrelevant information like whether you talk about your sexual orientation online.   With the setup as it is, that’s not the case, and one of the biggest risks is the completely unfounded appearance of both accuracy and objectivity — “mathwashing” as the jargon puts it.

 

3.

Where she lives, “100 per cent of the parents are going to want to use this,” she added. “We all want the perfect babysitter.”

One of the significant risks of automating human judgement is that it can go viral. There’s a limit to how well ordinary human prejudices can scale — you’ve got some chance of finding someone who has different biases. The prejudices of one computer checklist, though, can keep someone completely out of an employment sector.

November 20, 2018

Rugby Premiership Predictions for Round 8

Team Ratings for Round 8

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Current Rating Rating at Season Start Difference
Saracens 11.95 11.19 0.80
Exeter Chiefs 11.68 11.13 0.50
Wasps 5.52 8.30 -2.80
Leicester Tigers 3.92 6.26 -2.30
Gloucester Rugby 3.67 1.23 2.40
Northampton Saints 3.38 3.42 -0.00
Harlequins 2.34 2.05 0.30
Bath Rugby 1.76 3.11 -1.40
Sale Sharks -0.61 -0.81 0.20
Worcester Warriors -2.49 -5.18 2.70
Newcastle Falcons -3.49 -3.51 0.00
Bristol -6.04 -5.60 -0.40

 

Performance So Far

So far there have been 42 matches played, 32 of which were correctly predicted, a success rate of 76.2%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Gloucester Rugby vs. Leicester Tigers Nov 16 36 – 13 3.60 TRUE
2 Harlequins vs. Newcastle Falcons Nov 16 20 – 7 11.00 TRUE
3 Bath Rugby vs. Worcester Warriors Nov 17 30 – 13 8.90 TRUE
4 Northampton Saints vs. Wasps Nov 17 36 – 17 1.80 TRUE
5 Saracens vs. Sale Sharks Nov 17 31 – 25 19.30 TRUE
6 Bristol vs. Exeter Chiefs Nov 18 29 – 31 -13.30 TRUE

 

Predictions for Round 8

Here are the predictions for Round 8. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Newcastle Falcons vs. Bath Rugby Nov 23 Newcastle Falcons 0.30
2 Worcester Warriors vs. Harlequins Nov 23 Worcester Warriors 0.70
3 Exeter Chiefs vs. Gloucester Rugby Nov 24 Exeter Chiefs 13.50
4 Sale Sharks vs. Northampton Saints Nov 24 Sale Sharks 1.50
5 Wasps vs. Bristol Nov 24 Wasps 17.10
6 Leicester Tigers vs. Saracens Nov 25 Saracens -2.50

 

Pro14 Predictions for Round 9

Team Ratings for Round 9

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Current Rating Rating at Season Start Difference
Leinster 11.52 9.80 1.70
Glasgow Warriors 9.36 8.55 0.80
Munster 8.92 8.08 0.80
Scarlets 4.65 6.39 -1.70
Connacht 1.39 0.01 1.40
Ulster 1.02 2.07 -1.00
Cardiff Blues 0.89 0.24 0.60
Ospreys 0.51 -0.86 1.40
Edinburgh -0.82 -0.64 -0.20
Cheetahs -2.58 -0.83 -1.70
Treviso -4.39 -5.19 0.80
Southern Kings -8.95 -7.91 -1.00
Dragons -9.77 -8.59 -1.20
Zebre -11.18 -10.57 -0.60

 

Performance So Far

So far there have been 56 matches played, 46 of which were correctly predicted, a success rate of 82.1%.
Here are the predictions for last week’s games.

Game Date Score Prediction Correct
1 Ospreys vs. Glasgow Warriors Nov 03 20 – 29 -3.50 TRUE
2 Edinburgh vs. Scarlets Nov 03 31 – 21 -2.00 FALSE
3 Treviso vs. Ulster Nov 04 10 – 15 -0.10 TRUE
4 Connacht vs. Dragons Nov 04 33 – 12 14.60 TRUE
5 Southern Kings vs. Leinster Nov 04 31 – 38 -16.90 TRUE
6 Cheetahs vs. Munster Nov 05 26 – 30 -7.60 TRUE
7 Cardiff Blues vs. Zebre Nov 05 37 – 0 14.90 TRUE

 

Predictions for Round 9

Here are the predictions for Round 9. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

Game Date Winner Prediction
1 Glasgow Warriors vs. Cardiff Blues Nov 24 Glasgow Warriors 13.00
2 Leinster vs. Ospreys Nov 24 Leinster 15.50
3 Scarlets vs. Ulster Nov 24 Scarlets 8.10
4 Cheetahs vs. Treviso Nov 24 Cheetahs 6.30
5 Southern Kings vs. Connacht Nov 26 Connacht -5.80
6 Zebre vs. Munster Nov 26 Munster -15.60
7 Dragons vs. Edinburgh Nov 26 Edinburgh -4.40

 

November 16, 2018

What do statisticians do all day?

Yesterday and today we had talks by our postgraduate students about their short research projects:

  • An investigation into criticism of the FST software. (forensic DNA analysis: you want it to be right)
  • Comparison of accuracy between different algorithms for solving the Normal Equations in Regression Analysis (It’s hard to get big rounding errors nowadays)
  • Multi-catchment Streamflow Modelling by Reduced-rank Regression (improving hydropower modelling)
  • A Study on Equity in Academic Outcomes in First Year Statistics Courses (Could do better)
  • Costs and Financing of Routine Immunisation (estimating cost components in low-middle income countries)
  • Statistical Examination of the Relationship between Maternal Diet, Metabolome and Gestational Diabetes Mellitus (predicting diabetes is hard, especially in the future)
  • Robustness of Spatial Capture-Recapture Models to Misspecified Detection Functions (if you’re listening for whales or gibbons, how close do you need to be?)
  • An Examination of Participant Perspectives on the Scampy Tool (teaching the ideas of randomness at an introductory level)
  • Who lives in deprived places? The association between individual and area level socioeconomic position (guessing someone’s income from where they live isn’t all that reliable)
  • Is LIBS a reliable technology for the forensic analysis of glass? (more forensics, but with lasers)
  • De-batching data from a complex experiment (life would be simpler if you didn’t have to worry about lab ‘batch effects’ when studying ocean acidification)
  • Optimal path in random graphs (maths about really big networks)
  • Spatial Distribution of fish on the Chatham Rise (if you want to count them, you need to know where to look)
  • Exploring climate variables (in particular, extreme values like hottest place, rainiest day)
  • Interactive Tools for Climate Data (for browsing through NZ historical weather data)
  • How old is that mud: Convex Biclustering Applied in Tephrostratigraphy of the Orakei Basin (Looking for volcanic ash layers in mud samples)
  • Comparison of Methods for Inferring Granger Causality (time series techniques for economists)
  • Predicting Patronage (how does Patreon support vary over time, and can you predict it?)
  • Expected Information Matrices for Some Poisson Variants. (Calculations and software for some new counting models)

November 7, 2018

Graph of the week

From Axios

The wealth of the world’s billionaires rose by $1.4 trillion in 2017, the largest annual increase ever.

The details: Nearly all of that increase was driven by the Asia-Pacific region, and specifically China, where billionaire wealth rose 39%.

The graph is not well designed for illustrating the claim about billionaires: in a stacked bar chart like this it is easy to compare the first level, the sum of the first two, and the sum of all three, but not the top level.  If the chart had been stacked the other way up it would be more obvious that it doesn’t seem to go along with the claim.

Measuring the height of the bars in MacOS Preview (because it’s not possible to read the graph to high enough precision) I get 35 pixels increase in the Asia-Pacific region and 46 in the rest of the world, so the growth in the Asia-Pacific region actually is less than half the total growth.  Estimating the total growth by counting pixels does agree with the $1.4 trillion total, so what has gone wrong?

Clicking through to the UBS/PwC press release gives a bit more detail:

Chinese billionaires increased in number to 373 in 2017 from 318 in 2016 and their wealth rose by 39 percent to USD 1.12 trillion

So, the net increase in total wealth for Chinese billionaires is about $0.31 trillion (from roughly $0.81 trillion to $1.12 trillion). That hardly qualifies as “nearly all” of the increase for the Asia-Pacific region, let alone for the world.
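
The arithmetic can be laid out explicitly. This is a minimal sketch using only the figures quoted in this post. It reads “rose by 39 percent to USD 1.12 trillion” as 39% growth on the 2016 value, which is an interpretation of the press release wording, and the variable names are illustrative.

```python
# Bar-height increases measured from the Axios chart, in pixels.
asia_pacific_px = 35
rest_of_world_px = 46
asia_pacific_share = asia_pacific_px / (asia_pacific_px + rest_of_world_px)

# Chinese billionaire wealth "rose by 39 percent to USD 1.12 trillion".
wealth_2017 = 1.12                        # USD trillion
growth = 0.39
wealth_2016 = wealth_2017 / (1 + growth)  # back out the starting value
china_increase = wealth_2017 - wealth_2016

total_increase = 1.4                      # USD trillion, worldwide
china_share_of_world = china_increase / total_increase

print(f"Asia-Pacific share of growth: {asia_pacific_share:.0%}")
print(f"Increase for Chinese billionaires: ${china_increase:.2f} trillion")
print(f"Share of worldwide increase: {china_share_of_world:.0%}")
```

On this reading the increase for Chinese billionaires is a bit over $0.3 trillion, less than a quarter of the worldwide total. Taking 39% of the 2017 figure instead (1.12 × 0.39) gives about $0.44 trillion, which is still under a third of $1.4 trillion, so the conclusion holds either way.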

The original source is the UBS/PwC Billionaires report. Their graph is better (apart maybe from the second y-axis). Not only is it the right way up for looking at increases in Asia vs the rest of the world, but it’s got tick marks on the right-hand axis where they’re actually useful. And it shows some historical context.

Normally I wouldn’t go to this much effort for a business news item, but the Axios Edge newsletter where I read this is edited by Felix Salmon, who is usually better at the difference between “nearly all” and “nearly half”.

 

P.S. Yes, I did think of titling this post Crazy Rich Asians. No, I’m not sorry I didn’t.