January 10, 2023

Rugby Premiership Predictions for Round 16

Team Ratings for Round 16

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
Sale Sharks | 5.71 | 4.14 | 1.60
Saracens | 2.62 | -5.00 | 7.60
Leicester Tigers | 2.54 | 7.93 | -5.40
Gloucester | 2.20 | 5.92 | -3.70
Exeter Chiefs | 1.87 | 3.67 | -1.80
Northampton Saints | 1.58 | 3.99 | -2.40
Harlequins | 1.28 | 3.92 | -2.60
London Irish | 0.99 | -1.65 | 2.60
Wasps | -0.18 | 0.77 | -1.00
Bristol | -3.64 | -2.43 | -1.20
Bath | -5.68 | -9.15 | 3.50
Newcastle Falcons | -6.54 | -8.76 | 2.20
Worcester Warriors | -11.69 | -12.27 | 0.60

 

Performance So Far

So far there have been 72 matches played, 47 of which were correctly predicted, a success rate of 65.3%.
Here are the predictions for last week’s games.

# | Game | Date | Score | Prediction | Correct
1 | Gloucester vs. Saracens | Jan 07 | 16 – 19 | 5.00 | FALSE
2 | Newcastle Falcons vs. Leicester Tigers | Jan 08 | 45 – 26 | -7.10 | FALSE
3 | Exeter Chiefs vs. Northampton Saints | Jan 08 | 35 – 12 | 2.70 | TRUE
4 | Harlequins vs. Sale Sharks | Jan 09 | 16 – 24 | 1.10 | FALSE
5 | London Irish vs. Bristol | Jan 09 | 23 – 7 | 8.20 | TRUE
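
As a rough check on the table above, a prediction appears to count as correct when its sign matches the sign of the actual home-minus-away margin. The sketch below assumes that rule (it agrees with every row shown, and treating a draw as incorrect matches the one drawn game in the Top 14 results further down) and recomputes the running success rate from the totals reported a round earlier. The function and variable names are illustrative only.

```python
# Last week's games: (home score, away score, predicted home margin).
games = [
    (16, 19, 5.00),   # Gloucester vs. Saracens
    (45, 26, -7.10),  # Newcastle Falcons vs. Leicester Tigers
    (35, 12, 2.70),   # Exeter Chiefs vs. Northampton Saints
    (16, 24, 1.10),   # Harlequins vs. Sale Sharks
    (23, 7, 8.20),    # London Irish vs. Bristol
]


def is_correct(home, away, prediction):
    """Assumed rule: correct when predicted and actual margins share a sign.

    A draw (margin 0) never counts as correct under this rule.
    """
    margin = home - away
    return margin * prediction > 0


results = [is_correct(*game) for game in games]

# Totals reported before this round (Round 15 post): 67 played, 45 correct.
played = 67 + len(games)
correct = 45 + sum(results)
success_rate = 100 * correct / played

print(results)
print(f"{correct}/{played} = {success_rate:.1f}%")
```

Two of the five games come out as correct, which takes the totals from 45 of 67 to 47 of 72.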

 

Predictions for Round 16

Here are the predictions for Round 16. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

# | Game | Date | Winner | Prediction
1 | Sale Sharks vs. Bath | Jan 28 | Sale Sharks | 15.90
2 | Leicester Tigers vs. Northampton Saints | Jan 29 | Leicester Tigers | 5.50
3 | Saracens vs. Bristol | Jan 29 | Saracens | 10.80
4 | Exeter Chiefs vs. Gloucester | Jan 29 | Exeter Chiefs | 4.20
5 | London Irish vs. Harlequins | Jan 30 | London Irish | 4.20
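
The method itself is described elsewhere, but the numbers in the ratings and prediction tables for this round are consistent with a simple pattern: home rating minus away rating, plus a fixed home-ground bonus of about 4.5 points. That constant is an inference from these five rows, not a documented parameter (the other competitions on this page seem to imply different values), and the function name is hypothetical. A minimal sketch under that assumption:

```python
# Current ratings taken from the Round 16 table.
ratings = {
    "Sale Sharks": 5.71, "Bath": -5.68,
    "Leicester Tigers": 2.54, "Northampton Saints": 1.58,
    "Saracens": 2.62, "Bristol": -3.64,
    "Exeter Chiefs": 1.87, "Gloucester": 2.20,
    "London Irish": 0.99, "Harlequins": 1.28,
}

HOME_ADVANTAGE = 4.5  # assumption inferred from the published predictions


def predicted_margin(home, away):
    """Expected home-minus-away points difference under the assumed model."""
    return ratings[home] - ratings[away] + HOME_ADVANTAGE


# (home team, away team, published prediction)
fixtures = [
    ("Sale Sharks", "Bath", 15.90),
    ("Leicester Tigers", "Northampton Saints", 5.50),
    ("Saracens", "Bristol", 10.80),
    ("Exeter Chiefs", "Gloucester", 4.20),
    ("London Irish", "Harlequins", 4.20),
]

for home, away, published in fixtures:
    sketch = predicted_margin(home, away)
    print(f"{home} vs. {away}: {sketch:.2f} (published {published:.2f})")
```

The published figures look rounded to one decimal place, so agreement to within about 0.05 is the most that can be expected.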

 

January 9, 2023

Briefly

  • “We were able to put together a relatively good data set of case numbers for all states, but we were explicitly forbidden to make the data publicly available, even though our data was more accurate than what was appearing in the media.” Rob Hyndman, quoted by the ABC
  • Yet another example that counting isn’t simply neutral, from the Wikipedia entry for the Bechdel Test, via depths of wikipedia: “What counts as a character or as a conversation is not defined. For example, the Sir Mix-a-Lot song ‘Baby Got Back’ has been described as passing the Bechdel test, because it begins with a valley girl saying to another ‘oh my god, Becky, look at her butt’.”
  • From the Washington Post: is your name more common for dogs or people? (in the US, of course)
  • From the New York Times, estimated carbon emissions by neighbourhood across the USA.
  • From David Hood, using the Ministry of Health public data, our holiday Covid wave. Something different seems to have happened in Tairāwhiti, and it seems to have happened at roughly the same time as the Rhythm’N’Vines festival.
January 8, 2023

Murderous Kiwis

Newshub has a story, “Map: New Zealand’s murder hotspots revealed”.

This is the map

The map (and the text) don’t say what these geographical units are. Based on the context and the presence of “Counties Manukau” as one of them, I would expect them to be police districts: this (just a map, no data) is from the NZ Police website

There are a few confusing things about the Newshub map, though. We seem to be missing Wellington (in the text, too), along with Auckland City and Northland. The ‘Southern’, ‘Eastern’, and ‘Central’ police districts are under a label ‘Auckland’ at the top right, making them look as though they might be southern, eastern, and central Auckland.

As always, there’s the question of the appropriate denominator.  Police districts are large enough that the distinction between the location of the murder and the residence of the victim might not matter too much (in contrast to census area units and assault), and I’m going to assume that the data include homicides in private homes (in contrast to census area units and assault) because that would have been mentioned otherwise. So it seems reasonable to use a general population denominator. This is trickier than I would have expected; it seems quite hard to find the police district populations. If you’re putting in a police OIA request like this one you might want to ask them for populations as well.

Looking at maps, the police districts seem to (at least approximately) be combinations of DHBs*, so I used the populations of those DHBs. Here are the comparisons just by counts of homicides over nearly three years (we’re missing Wellington and Northland)

And here are the (approximated) rates per thousand people over those three years. You might worry about how well the three Auckland districts can be separated; it wouldn’t be hard to combine them.

Bay of Plenty looks higher and Canterbury, Counties, and Waitematā look lower when you account for the differences in numbers of people.  Comparisons like this usually want rates (how dangerous), not counts (how many), if a relevant denominator is available.

Newshub does get points, though, for correctly saying all these numbers are pretty low by international standards.

 

* DHB: Deprecated Health Boundary

January 5, 2023

How common is long covid and why don’t we know?

You see widely varying estimates for the probability of getting long Covid and for the recovery prognosis. Some of this is because people are picking numbers to recirculate that match their prejudices, but some of it is because these are hard questions to answer.

For example, the Hamilton Spectator (other Hamilton, not ours) reports a Canadian study following 106 people for a year. The headline was initially “75 per cent of COVID ‘long haulers’ free of symptoms in 12 months: McMaster study”. It’s now “25 per cent of COVID patients become ‘long haulers’ after 12 months: Mac study”. Both are misleading, though the second is better.

This study started out with 106 people, with an average age of 57. They had substantially more severe Covid than average:

Twenty-six patients recovered from COVID19 at home, 35 were admitted to the ICU, and 45 were hospitalized but not ICU-admitted

For comparison, in New Zealand the hospitalisation rate has been about 1% of reported cases, with about 0.03% of reported cases admitted to the ICU. It’s not a representative sample, and this matters for estimating overall prevalence. On top of that, only half the study participants have 12-month data. That means the proportion known to have become ‘long-haulers’ is only about 12%; the 25% is a guess that the people who didn’t continue with the study were similar.
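
The step from 25% to about 12% is a lower bound: if only about half the participants have 12-month data, the people known to still have symptoms are 25% of that half. A quick sketch, assuming exactly half of the 106 were followed up (the text above says only "half", so the exact count is an assumption):

```python
enrolled = 106
followed_up = enrolled // 2      # assumption: exactly half have 12-month data
rate_among_followed = 0.25       # reported share still symptomatic at 12 months

# People known to still have symptoms, as a share of everyone enrolled.
known_long_haulers = rate_among_followed * followed_up
known_share = known_long_haulers / enrolled

# Bounds on the true share, depending on the people lost to follow-up.
lost = enrolled - followed_up
lower = known_share                               # none of them have symptoms
upper = (known_long_haulers + lost) / enrolled    # all of them have symptoms

print(f"known: {known_share:.1%}; possible range {lower:.1%} to {upper:.1%}")
```

The 25% headline figure sits inside that range only if the people who dropped out resemble the people who stayed.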

A more general problem is that “long covid” isn’t an easily measurable thing. There are people who are still unwell in various ways a long time after they get Covid. There are multiple theories about what exactly is the mechanism, and it’s quite possible that more than one of these theories is true; we don’t even know that ‘long covid’ is just a single condition. Because we aren’t sure about the mechanism or mechanisms, there isn’t a test for long Covid the way there is for Covid. If you have symptoms plus a positive RAT or PCR test for the SARS-CoV-2 virus you have Covid; that’s what ‘having Covid’ means. There isn’t a simple, objective definition like that for long Covid.

Because there isn’t a simple, objective test for long covid, different studies define it in different ways: usually as having had Covid plus some set of symptoms later in time. Different studies use different symptoms. The larger the study, the more generic the symptom measurements tend to be, and so you’d expect higher rates of people to report having those symptoms.  If you simply ask about ‘fatigue’ you’ll pick up people with ordinary everyday <gestures-broadly-at-internet-and-world> as well as people with crushing post-Covid exhaustion, even though they’re very different.

There are also different time-frames in different studies: more people will have symptoms for three months than for twelve months just because twelve months is longer.  Twelve-month follow-up also implies the study must have started earlier; a study that followed people for twelve months after initial illness won’t include anyone who had Omicron and might include a lot of unvaccinated people.

The different definitions and different populations matter. The majority of people in New Zealand have had Covid. There’s no way that 25% of them have the sort of long Covid that someone like Jenene Crossan or Daniel Freeman did; it would be obvious in the basic functioning of society. Some people do have disabling long Covid; some people have milder versions; some have annoying post-Covid symptoms; some people seem to recover ok (though they might be at higher risk of other disease in the future). We don’t have good numbers on the size of these groups, or ways to predict who is who, or treatments, and it’s partly because it’s difficult and partly because the pandemic keeps changing.

It’s also partly because we haven’t put enough resources into it.

Ok boomers?

A graph, which has been popular on the internets, in this instance via Matthew Yglesias

Another graph, showing the same thing per capita rather than as each generation’s share of the total, also via Matthew Yglesias. This one appears to have a very different message.

And a third graph, from the FRED system operated by the Federal Reserve Bank, showing US real per-capita GDP

So: Gen X have a much lower share of US wealth than the Baby Boomers did at the same age.  This is partly because we are a smaller fraction of the population than they were: per-capita wealth is similar.  But per-capita wealth being similar isn’t as good as it sounds, because the US as a whole is substantially richer now than when the Boomers were 50.

This isn’t a gotcha for either of the first two graphs (different questions are allowed to have different answers), but it might be useful context for the comparison.

January 3, 2023

United Rugby Championship Predictions for Week 12

 

 

Team Ratings for Week 12

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
Leinster | 17.66 | 16.79 | 0.90
Ulster | 9.06 | 9.27 | -0.20
Munster | 8.74 | 9.78 | -1.00
Stormers | 8.09 | 7.14 | 1.00
Bulls | 7.22 | 7.84 | -0.60
Sharks | 6.21 | 6.95 | -0.70
Edinburgh | 2.82 | 3.58 | -0.80
Glasgow | 1.65 | -0.00 | 1.70
Connacht | 0.76 | -1.60 | 2.40
Ospreys | -1.58 | -0.83 | -0.80
Lions | -2.44 | -1.74 | -0.70
Benetton | -3.90 | -3.68 | -0.20
Scarlets | -4.88 | -1.23 | -3.60
Cardiff Rugby | -5.28 | -7.42 | 2.10
Dragons | -9.94 | -11.81 | 1.90
Zebre | -18.15 | -16.99 | -1.20

 

Performance So Far

So far there have been 85 matches played, 65 of which were correctly predicted, a success rate of 76.5%.
Here are the predictions for last week’s games.

# | Game | Date | Score | Prediction | Correct
1 | Edinburgh vs. Glasgow | Dec 31 | 25 – 32 | 6.40 | FALSE
2 | Zebre vs. Benetton | Dec 31 | 17 – 40 | -9.00 | TRUE
3 | Sharks vs. Bulls | Jan 01 | 47 – 20 | 1.00 | TRUE
4 | Stormers vs. Lions | Jan 01 | 40 – 8 | 12.90 | TRUE
5 | Cardiff Rugby vs. Ospreys | Jan 02 | 19 – 22 | 1.10 | FALSE
6 | Scarlets vs. Dragons | Jan 02 | 33 – 17 | 8.20 | TRUE
7 | Ulster vs. Munster | Jan 02 | 14 – 15 | 5.70 | FALSE
8 | Leinster vs. Connacht | Jan 02 | 41 – 12 | 20.00 | TRUE

 

Predictions for Week 12

Here are the predictions for Week 12. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

# | Game | Date | Winner | Prediction
1 | Dragons vs. Bulls | Jan 07 | Bulls | -12.70
2 | Munster vs. Lions | Jan 07 | Munster | 15.70
3 | Benetton vs. Ulster | Jan 08 | Ulster | -8.50
4 | Edinburgh vs. Zebre | Jan 08 | Edinburgh | 25.50
5 | Cardiff Rugby vs. Scarlets | Jan 08 | Cardiff Rugby | 3.60
6 | Connacht vs. Sharks | Jan 08 | Sharks | -0.90
7 | Ospreys vs. Leinster | Jan 08 | Leinster | -14.70
8 | Glasgow vs. Stormers | Jan 09 | Stormers | -1.90

 

Top 14 Predictions for Round 15

Team Ratings for Round 15

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
Stade Toulousain | 8.29 | 6.34 | 1.90
La Rochelle | 6.45 | 6.88 | -0.40
Racing 92 | 4.82 | 4.86 | -0.00
Bordeaux Begles | 4.76 | 5.27 | -0.50
Montpellier | 4.28 | 4.18 | 0.10
Stade Francais | 4.27 | -1.05 | 5.30
Toulon | 3.90 | 4.09 | -0.20
Clermont | 1.70 | 4.05 | -2.40
Castres Olympique | 1.67 | 2.87 | -1.20
Lyon | 1.05 | 3.10 | -2.10
Aviron Bayonnais | -1.49 | -4.26 | 2.80
Section Paloise | -1.51 | -2.12 | 0.60
USA Perpignan | -5.43 | -2.75 | -2.70
Brive | -5.51 | -4.20 | -1.30

 

Performance So Far

So far there have been 98 matches played, 69 of which were correctly predicted, a success rate of 70.4%.
Here are the predictions for last week’s games.

# | Game | Date | Score | Prediction | Correct
1 | Bordeaux Begles vs. Montpellier | Dec 31 | 40 – 10 | 5.50 | TRUE
2 | Aviron Bayonnais vs. Toulon | Jan 01 | 23 – 18 | 0.70 | TRUE
3 | Castres Olympique vs. Racing 92 | Jan 01 | 26 – 26 | 3.70 | FALSE
4 | Lyon vs. Brive | Jan 01 | 27 – 30 | 14.20 | FALSE
5 | Stade Francais vs. Section Paloise | Jan 01 | 37 – 3 | 10.80 | TRUE
6 | USA Perpignan vs. La Rochelle | Jan 01 | 10 – 29 | -4.40 | TRUE
7 | Clermont vs. Stade Toulousain | Jan 02 | 13 – 32 | 1.20 | FALSE

 

Predictions for Round 15

Here are the predictions for Round 15. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

# | Game | Date | Winner | Prediction
1 | Bordeaux Begles vs. Aviron Bayonnais | Jan 08 | Bordeaux Begles | 12.80
2 | Brive vs. Toulon | Jan 08 | Toulon | -2.90
3 | Clermont vs. USA Perpignan | Jan 08 | Clermont | 13.60
4 | La Rochelle vs. Stade Toulousain | Jan 08 | La Rochelle | 4.70
5 | Section Paloise vs. Lyon | Jan 08 | Section Paloise | 3.90
6 | Stade Francais vs. Castres Olympique | Jan 08 | Stade Francais | 9.10
7 | Montpellier vs. Racing 92 | Jan 09 | Montpellier | 6.00

 

Rugby Premiership Predictions for Round 15

Team Ratings for Round 15

The basic method is described on my Department home page.
Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team | Current Rating | Rating at Season Start | Difference
Sale Sharks | 5.19 | 4.14 | 1.10
Leicester Tigers | 3.83 | 7.93 | -4.10
Gloucester | 2.67 | 5.92 | -3.20
Northampton Saints | 2.61 | 3.99 | -1.40
Saracens | 2.15 | -5.00 | 7.20
Harlequins | 1.81 | 3.92 | -2.10
Exeter Chiefs | 0.84 | 3.67 | -2.80
London Irish | 0.54 | -1.65 | 2.20
Wasps | -0.18 | 0.77 | -1.00
Bristol | -3.19 | -2.43 | -0.80
Bath | -5.68 | -9.15 | 3.50
Newcastle Falcons | -7.82 | -8.76 | 0.90
Worcester Warriors | -11.69 | -12.27 | 0.60

 

Performance So Far

So far there have been 67 matches played, 45 of which were correctly predicted, a success rate of 67.2%.
Here are the predictions for last week’s games.

# | Game | Date | Score | Prediction | Correct
1 | Sale Sharks vs. Leicester Tigers | Dec 31 | 40 – 5 | 2.80 | TRUE
2 | Bath vs. Newcastle Falcons | Jan 01 | 24 – 16 | 6.40 | TRUE
3 | Gloucester vs. London Irish | Jan 01 | 8 – 6 | 7.30 | TRUE
4 | Saracens vs. Exeter Chiefs | Jan 01 | 35 – 3 | 3.00 | TRUE
5 | Northampton Saints vs. Harlequins | Jan 02 | 46 – 17 | 2.70 | TRUE

 

Predictions for Round 15

Here are the predictions for Round 15. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

# | Game | Date | Winner | Prediction
1 | Gloucester vs. Saracens | Jan 07 | Gloucester | 5.00
2 | Newcastle Falcons vs. Leicester Tigers | Jan 08 | Leicester Tigers | -7.10
3 | Exeter Chiefs vs. Northampton Saints | Jan 08 | Exeter Chiefs | 2.70
4 | Harlequins vs. Sale Sharks | Jan 09 | Harlequins | 1.10
5 | London Irish vs. Bristol | Jan 09 | London Irish | 8.20

 

January 1, 2023

Briefly

  • The “Great Kiwi Christmas Survey” led to stories at Herald, Newshub, Farmers Weekly, and Radio NZ on what people were eating for their Christmas meal. The respondents for the “Great Kiwi Christmas Survey” were variously described as “over 1000”, “over 1800”, and “over 3300” Kiwis, which seems a bit vague. According to Newsroom, this was actually a bogus poll: “We promoted the survey through social media channels and sent the survey to those people who had signed up to receive information from us,” concedes Lisa Moloney, the promotions manager for Retail Meat NZ and Beef + Lamb NZ. Headlines based on bogus polls aren’t ever ok, even when you don’t think the facts really matter. Newsroom argued that the results under-represented vegetarians, which is plausible, but you can’t really tell from the data presented on the number of vegetarians. Not all Christmas meals at which vegetarians are present will be centred around plant-based food, as any vegetarian can tell you.
  • Stuff, with the help of Auckland Transport, wrote about Auckland’s most prolific public transport user. Apparently, someone took 3400 trips over a year.  It’s surprising that’s even possible: nearly ten trips per day, every day,  and since the person is doing this on a gold card, starting no earlier than 9am on weekdays.  Assuming the numbers are correct — actually, whether the numbers are correct or not — it’s also a bit disturbing that this analysis was done.  The summaries of typical and top 100 users seem a lot more reasonable. The piece says “Stuff asked to interview the person, however Auckland Transport would not reveal their identity for privacy reasons.”, which is good, but you might want them not to be in a position to reveal it.
  • “Support for low-income housing followed a similar pattern, with broad approval for building it someplace in the country (82 percent) but much less for building it locally (65 percent)” at 538. There should be a word for this.
  • Interesting discussion on the Slate Money podcast about a data display, the “Fed Dot Plot”, which shows the best guesses of members of the Federal Open Market Committee as to what interest rates they will want in the future; each dot is one person. The Fed is trying to de-emphasise this graph at the moment, partly because people tend to over-interpret it. Importantly, there’s no individual uncertainty shown, and there’s no way to tell how much of the difference between people is due to differences in what they think the economic situation will be and how much is due to differences in how they expect they will want to react to it.
December 31, 2022

Death by Chocolate?

The BBC: “Hershey sued in US over metal in dark chocolate claim”.

This is a slight variation on normal headline grammar:  Hershey isn’t being sued over something they claimed; they are being sued because Consumer Reports claims to have found surprisingly high concentrations of lead and cadmium in dark chocolate from a wide range of manufacturers, small and large, organic and conventional, fair-trade and … whatever the opposite of that is.  The cadmium seems to come from the soil — chocolate eaters are on the wrong end of phytoremediation here — and the experts don’t actually know where the lead comes from. Hershey is being sued because they’re a potentially rewarding target, not because they are more at fault than other chocolate makers.

So, how bad is it? Consumer Reports say that the heavy-metal concentrations exceed health standards if you eat an ounce (like, 30g) every day. To get this result, they used the strictest health thresholds they could find: as they phrase it, “CR’s scientists believe that California’s levels are the most protective available”.  We can look at how California computed its threshold (MADL) for cadmium — at least, how it did in 2001; it’s possible there’s a stricter threshold that I haven’t found on Google.  The procedure was to take the highest concentration with no observed adverse effects in animals, scale it by weight, and divide by 1000 for safety.  With cadmium, they didn’t have a no-effect study, they only had a study showing adverse effects, so they put in an extra factor of 10 to account for that.  So, the threshold we’re comparing to is 10,000 times lower than the lowest concentration definitely shown to be harmful.  The California law doesn’t say it’s dangerous to exceed this threshold; it says that if you’re under this threshold you’re so safe that you don’t have to warn consumers that there’s cadmium present. (PDF)

For chemicals known to the state to cause reproductive toxicity, an exemption from the warning requirement is provided by the Act when a person in the course of doing business is able to demonstrate that an exposure for which the person is responsible will have no observable reproductive effect, assuming exposure at 1,000 times the level in question

Presumably the same is basically true of lead. Now, lead and cadmium are well worth avoiding, even at levels not specifically known to be harmful. Lead, in particular, seems to have small adverse effects even at very low concentrations. But the level of risk from doses anywhere in the vicinity of the California MADL is, by careful design, very low.

We can look at NZ dietary exposures to cadmium, in the incredibly-detailed NZ Total Diet Study (PDF). We’re averaging about 5.2 ug per kg of bodyweight per month for women, 6.6 for men, and 12 for 5-6 year old kids. The provisional monthly tolerable dose given in that report is 25, in the same units.

Our numbers are a bit higher than France and Australia, a bit lower than Hong Kong, and about the same as Italy. If you take the hypothetical 58kg woman used in the California regulatory maths, she would consume about 10 ug/day of cadmium. The California limit is 4.1 ug/day and the NZ limit works out to 48 ug/day. So, an ounce of high-cadmium dark chocolate per day, if it’s, say, twice the California limit, is a significant fraction of the typical cadmium consumption, but well under any levels actually known to have health risks.
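
The daily figures above come from converting the per-kilogram monthly numbers for a 58kg person. Here is a minimal sketch of that conversion, assuming a 30-day month (my assumption, though it reproduces the rounded figures in the text); the variable and function names are illustrative.

```python
BODY_WEIGHT_KG = 58    # hypothetical woman from the California calculation
DAYS_PER_MONTH = 30    # assumption for the monthly-to-daily conversion


def daily_ug(monthly_ug_per_kg):
    """Convert ug per kg of bodyweight per month to ug per day for one person."""
    return monthly_ug_per_kg * BODY_WEIGHT_KG / DAYS_PER_MONTH


typical_intake = daily_ug(5.2)     # NZ average for women
nz_limit = daily_ug(25)            # provisional tolerable monthly dose
california_limit = 4.1             # ug/day, as quoted in the text
chocolate = 2 * california_limit   # the "say, twice the California limit" scenario

print(f"typical intake: {typical_intake:.1f} ug/day")
print(f"NZ limit:       {nz_limit:.1f} ug/day")
print(f"chocolate:      {chocolate:.1f} ug/day, "
      f"{chocolate / typical_intake:.0%} of typical intake")
```

On these assumptions the chocolate alone stays well under the NZ limit while being a large share of the typical daily intake.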

For years, the StatsChat rule on dark chocolate has been “If you’re eating it primarily for the health benefits, you’re doing it wrong”. That still seems to hold.