Posts from May 2014 (77)

May 10, 2014

How close is the nearest road in New Zealand?

Gareth Robins has answered this question with a very beautiful visualization generated with a surprisingly compact piece of R code.

Distance to the nearest road in New Zealand

Check out the full-size image and the code here.

May 9, 2014

Terrible, horrible, no good, very bad month

From Stuff:

The road toll has moved into triple figures for 2014 following the deadliest April in four years.

Police are alarmed by the rising number of deaths, that are a setback after the progress in 2013 when 254 people died in crashes in the whole year – the lowest annual total since 1950.

So far this year 102 people have died on the roads, 15 more than at the same point in 2013, Assistant Commissioner Road Policing Dave Cliff said today.

The problem with this sort of story is how it omits the role of random variation — bad luck.  The Police are well aware that driving mistakes usually do not lead to crashes, and that the ones which do are substantially a matter of luck, because that’s key to their distracted driver campaign. As I wrote recently, their figures on the risks from distracted driving are taken from a large US study which grouped together a small number of actual crashes with a lot of incidents of risky driving that had no real consequence.

The importance of bad luck in turning bad driving into disaster means that the road toll will vary a lot. The margin of error around a count of 102 is about +/- 20, so it’s not clear we’re seeing more than misfortune in the change.  This is especially true because last year was the best on record, ever. We almost certainly had good luck last year, so the fact that it’s wearing off a bit doesn’t mean there has been a real change in driver behaviour.
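As a rough check on that margin of error, here is a minimal sketch in Python. It assumes the road toll behaves like a Poisson count, so its standard deviation is about the square root of the count; that is a modelling assumption, not something stated in the Stuff story. The 2013 figure of 87 is simply 102 minus the 15 quoted above.

```python
import math

def poisson_margin(count, z=1.96):
    """Approximate 95% margin of error for a Poisson count (assumed model)."""
    return z * math.sqrt(count)

deaths_2014 = 102                  # year-to-date toll quoted in the story
deaths_2013 = deaths_2014 - 15     # "15 more than at the same point in 2013"

margin = poisson_margin(deaths_2014)

# Compare the two years: under the Poisson assumption the variance of the
# difference of two independent counts is the sum of the counts.
difference = deaths_2014 - deaths_2013
difference_se = math.sqrt(deaths_2014 + deaths_2013)
z_score = difference / difference_se

print(f"margin of error: +/- {margin:.1f}")
print(f"difference: {difference}, standard error: {difference_se:.1f}, z: {z_score:.2f}")
```

Under these assumptions the margin comes out at about 20, and the increase of 15 is only a little more than one standard error, which is well within what bad luck alone could produce.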

It was a terrible, horrible, no good, very bad month on the roads, but some months are like that. Even in New Zealand.

Seeking case-study material for journo unit standard

As many of you will know, I’m on the working group developing the content of a unit standard in statistical concepts for the National Diploma in Applied Journalism, which journalists with basic qualifications pursue while on the job to extend their skills.

It’s now time for me to ask you to send New Zealand examples of well-written statistically-based stories and poorly-written statistically-based stories that we can use. BUT you need to be able to send me the source data and an explanation (however rough) of what’s wrong with the story and how it could be improved.

I think that teachers may have existing examples that might do, and I would love to see them. You can email me your juicy contributions to statschat@gmail.com.

May 8, 2014

Where to see NZ data

Today, Wiki New Zealand came to our department to talk about their work.  I mentioned them in late 2012 when they first went live, but they’ve developed a lot since then.

The aim of Wiki New Zealand is to have ALL THE DATAS about New Zealand in graphical form, so that people who aren’t necessarily happy with spreadsheets and SQL queries can browse the information. Their front page at the moment has data on cannabis use, greenhouse gas emissions,  wine grape and olive plantings, autism, and smoking.

 Check them out. 

Where would they get that impression?

From Stuff: “New Zealand’s worst air is not where you think”. That’s not actually true. New Zealand’s worst air, according to the story, is pretty much where I thought it would be, in coastal Canterbury and Otago. However, if you search the Stuff website for the term “air pollution”, you get:

[Screenshot: Stuff search results for “air pollution”]

 

So if you expected Auckland to be the worst, you know who to blame.

Briefly

 

Think I’ll go eat worms

This table is from a University of California alumni magazine:

[Image: screenshot of the table]

 

Jeff Leek argues at Simply Statistics that the big problem with Big Data is they, too, forgot statistics.

Who’s afraid of the NSA?

Two tweets in my timeline this morning linked to this report about this research paper, saying “americans have stopped searching on forbidden words”.

That’s a wild exaggeration, but what the research found was interesting. They looked at Google Trends search data for words and phrases that might be privacy-related in various ways: for example, searches that might be of interest to the US government security apparat or searches that might be embarrassing if a friend knew about them.

In the US (but not in other countries) there was a small but definite change in searches at around the time of Edward Snowden’s NSA revelations. Search volume in general kept increasing, but searches on words that might be of interest to the government decreased slightly.

The data suggest that some people in the US became concerned that the NSA might care about them. Since there presumably aren’t enough terrorists in the US to explain the difference, it looks as though knowing about the NSA surveillance is having an effect on the political behaviour of (a subset of) ordinary Americans.

There is a complication, though. A similar fall was seen in the other categories of privacy-sensitive data, so either the real answer is something different, or people are worried about the NSA seeing their searches for porn.

May 7, 2014

Super 15 Predictions for Round 13

Team Ratings for Round 13

The basic method is described on my Department home page. I have made some changes to the methodology this year, including shrinking the ratings between seasons.

Here are the team ratings prior to this week’s games, along with the ratings at the start of the season.

Team         Current Rating  Rating at Season Start  Difference
Crusaders              8.08                    8.80       -0.70
Sharks                 5.49                    4.57        0.90
Chiefs                 4.24                    4.38       -0.10
Brumbies               4.08                    4.12       -0.00
Waratahs               3.05                    1.67        1.40
Bulls                  1.99                    4.87       -2.90
Hurricanes             1.63                   -1.44        3.10
Blues                 -0.10                   -1.92        1.80
Stormers              -0.21                    4.38       -4.60
Highlanders           -1.71                   -4.48        2.80
Force                 -2.36                   -5.37        3.00
Reds                  -2.86                    0.58       -3.40
Cheetahs              -2.90                    0.12       -3.00
Rebels                -4.74                   -6.36        1.60
Lions                 -6.69                   -6.93        0.20
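The Difference column looks like the current rating minus the start-of-season rating, rounded to one decimal place, which would explain entries such as -0.00 for the Brumbies. A small sketch to check that reading against the table follows; the rounding rule is an inference from the numbers, not a description of the code that produced them.

```python
# (current rating, rating at season start, difference as shown in the table)
ratings = {
    "Crusaders": (8.08, 8.80, -0.70),
    "Sharks": (5.49, 4.57, 0.90),
    "Chiefs": (4.24, 4.38, -0.10),
    "Brumbies": (4.08, 4.12, -0.00),
    "Waratahs": (3.05, 1.67, 1.40),
    "Bulls": (1.99, 4.87, -2.90),
    "Hurricanes": (1.63, -1.44, 3.10),
    "Blues": (-0.10, -1.92, 1.80),
    "Stormers": (-0.21, 4.38, -4.60),
    "Highlanders": (-1.71, -4.48, 2.80),
    "Force": (-2.36, -5.37, 3.00),
    "Reds": (-2.86, 0.58, -3.40),
    "Cheetahs": (-2.90, 0.12, -3.00),
    "Rebels": (-4.74, -6.36, 1.60),
    "Lions": (-6.69, -6.93, 0.20),
}

def rounded_difference(current, start):
    """Current minus start, rounded to one decimal (assumed display rule)."""
    return round(current - start, 1)

# Teams whose tabulated difference disagrees with the assumed rule.
mismatches = [
    team
    for team, (current, start, shown) in ratings.items()
    if abs(rounded_difference(current, start) - shown) > 1e-9
]
print(mismatches)
```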

 

Performance So Far

So far there have been 74 matches played, 48 of which were correctly predicted, a success rate of 64.9%.
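For a sense of how much that success rate could move around by chance, here is a rough sketch. It treats the 74 games as independent trials and uses the normal approximation to the binomial; both are simplifying assumptions, not part of the prediction method itself.

```python
import math

matches_played = 74
correct_predictions = 48

success_rate = correct_predictions / matches_played

# Normal-approximation 95% margin of error for a proportion (assumed model).
standard_error = math.sqrt(success_rate * (1 - success_rate) / matches_played)
margin = 1.96 * standard_error

print(f"success rate: {success_rate:.1%}")
print(f"approximate 95% margin: +/- {margin:.1%}")
```

On these assumptions the success rate is 64.9% give or take about 11 percentage points, so a single good or bad round does not say much about the method.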

Here are the predictions for last week’s games.

   Game                      Date    Score    Prediction  Correct
1  Blues vs. Reds            May 02  44 – 14        3.70  TRUE
2  Rebels vs. Sharks         May 02  16 – 22       -6.30  TRUE
3  Crusaders vs. Brumbies    May 03  40 – 20        6.30  TRUE
4  Chiefs vs. Lions          May 03  38 – 8        12.90  TRUE
5  Waratahs vs. Hurricanes   May 03  39 – 30        4.80  TRUE
6  Stormers vs. Highlanders  May 03  29 – 28        6.20  TRUE
7  Bulls vs. Cheetahs        May 03  26 – 21        7.80  TRUE
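The Correct column can be reproduced from the scores and predictions with a few lines of Python. This sketch assumes the first-named team is the home team, that the prediction is home points minus away points, and that a prediction counts as correct when it has the same sign as the actual margin (so a draw would count as incorrect). The function name is made up for illustration.

```python
# (home team, away team, home score, away score, predicted margin)
last_week = [
    ("Blues", "Reds", 44, 14, 3.70),
    ("Rebels", "Sharks", 16, 22, -6.30),
    ("Crusaders", "Brumbies", 40, 20, 6.30),
    ("Chiefs", "Lions", 38, 8, 12.90),
    ("Waratahs", "Hurricanes", 39, 30, 4.80),
    ("Stormers", "Highlanders", 29, 28, 6.20),
    ("Bulls", "Cheetahs", 26, 21, 7.80),
]

def prediction_correct(home_score, away_score, predicted_margin):
    """True when the predicted and actual margins favour the same team."""
    actual_margin = home_score - away_score
    return actual_margin * predicted_margin > 0

results = [
    prediction_correct(home_score, away_score, predicted)
    for _, _, home_score, away_score, predicted in last_week
]
print(f"{sum(results)} of {len(results)} predictions correct")
```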

 

Predictions for Round 13

Here are the predictions for Round 13. The prediction is my estimated expected points difference with a positive margin being a win to the home team, and a negative margin a win to the away team.

   Game                      Date    Winner       Prediction
1  Chiefs vs. Blues          May 09  Chiefs             6.80
2  Rebels vs. Hurricanes     May 09  Hurricanes        -2.40
3  Highlanders vs. Lions     May 10  Highlanders        9.00
4  Brumbies vs. Sharks       May 10  Brumbies           2.60
5  Cheetahs vs. Force        May 10  Cheetahs           3.50
6  Bulls vs. Stormers        May 10  Bulls              4.70
7  Reds vs. Crusaders        May 11  Crusaders         -6.90

 

May 6, 2014

Animal testing in New Zealand

Wiki New Zealand, which has information on all sorts of things, has a graph showing animal use for research/testing/teaching in NZ over time.  The data are from the annual report (PDF) of the National Animal Ethics Advisory Committee.

Here’s a slightly more detailed graph showing types of animals and who used them, over time.

[Graph: animals used, by type of animal and by who used them, over time]

 

It’s also important to remember that nearly all the livestock and domestic animals weren’t harmed significantly — research on things like different feed or stocking densities still counts.  Most of the rodents and rabbits ended up dead, as did about a third of the fish.

The two big increases recently are commercial livestock (most of which are no worse off than they would be anyway as livestock) and fish at universities. The increase in fish is probably due at least in part to substitution of zebrafish for mice in some biological research.

No, I don’t know what the government departments did with 40000 birds in 2009. [Update: thanks to James Green in comments, I now do. I]

 

[Update: here’s the data in accessible form]