December 17, 2015

Briefly

  • “Polls Suggest Trump Will Win Between 8 Percent And 64 Percent Of The Vote”: on the actual predictive accuracy of polls this far in advance, from 538.
  • A social-network analysis to find which European Union staff members are most worth influencing
  • Google trends for “data is” vs “data are”. The red line is people correctly treating ‘data’ as plural; the blue line is people correctly treating ‘data’ as a mass noun. (via David Smith)

    [Chart: Google Trends for “data is” vs “data are”]

December 15, 2015

Graphs: when zero is not a relevant value

Bar charts have a filled area tying the axis to the plotted value, and this only makes sense when the axis is at a true zero. Scatterplots and line plots don’t have the same limitation, and can be useful even when there isn’t a true zero or it isn’t a relevant value.

Here’s the Wikipedia compilation of world average temperature estimates back into deep time:

All_palaeotemps.svg

The zero on the graph is the 1960-1990 average, because that’s a reasonable point of comparison. It’s not a true zero; you couldn’t use bar charts.

Here’s the Berkeley Earth estimate of average land temperatures, based on actual thermometer readings at weather stations, using all the data, with open code, data and methods.

global-land-TAVG-Trend

They could have put a zero on the graph by using differences from the average for some period — their data output is difference from the 1951-1980 average — but they presumably thought it was clearer to just label in degrees Celsius and not make everyone do the conversion.

We had a comment suggesting that zero Celsius should be on this sort of graph, and there’s a graph circulating on Twitter that has its baseline at zero Fahrenheit.

CWN3D6nWUAUmQWW

These look like deliberately uninformative choices: there’s nothing special about zero Fahrenheit and nothing special about zero Celsius as temperatures, either in any absolute sense or as mean global temperatures.

The only natural zero for temperature is zero kelvin. If you want to argue there has to be a zero on climate graphs, it should be that one. But you’d look silly.

temp-zero

If you want to use graphs of temperature history to make a point about policy, the graph needs to be one where differences that would matter for policy are clearly visible. As far as I know, no-one denies that a rapid 4C (7F) change in global temperature would be important. If your graph would make it look unimportant, your graph is wrong.
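
To put rough numbers on that, here is a minimal Python sketch of how much of the vertical axis a 4C change occupies when the axis is forced to start at various zeros. The 15C global mean is an assumed round figure for illustration, not a value read off any of the graphs above, and `share_of_axis` is a hypothetical helper.

```python
def share_of_axis(change_c, mean_c, baseline_c):
    """Fraction of the axis height (baseline up to the warmed mean)
    that is taken up by the temperature change itself."""
    top = mean_c + change_c
    return change_c / (top - baseline_c)


MEAN_C = 15.0    # assumed round figure for mean global temperature, in Celsius
CHANGE_C = 4.0   # the rapid 4C change discussed in the text

# Candidate axis baselines, all expressed in degrees Celsius
baselines_c = {
    "zero kelvin": -273.15,
    "zero Fahrenheit": (0 - 32) * 5 / 9,
    "zero Celsius": 0.0,
}

for label, baseline in baselines_c.items():
    print(f"{label}: {share_of_axis(CHANGE_C, MEAN_C, baseline):.1%} of the axis")
```

Under these assumptions a change everyone agrees is important takes up under 2% of a graph with its baseline at zero kelvin, which is the sense in which such a graph hides the differences that matter.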

 

December 14, 2015

A sense of scale

It was front page news in the Dominion Post today that about 0.1% of registered teachers had been investigated for “possible misconduct or incompetence in which their psychological state may have been a factor.” Over a six-year period. And 5% of them (that is, 0.005% of all teachers) were struck off or suspended as a result.

Actually, the front page news was even worse than that:

CWKJ22nUwAEguz2

 

but since the “mentally-ill” bit wasn’t even true, the online version has been edited.

Given the high prevalence of some of these psychological and neurological conditions and the lack of a comparison group, it’s not even clear that they increase the risk of being investigated or struck off. After all, an early StatsChat story was about a Dom Post claim that “hundreds of unfit teachers” were working in our schools, based on 664 complaints over two years.

It would be interesting to compare figures for, say, rugby players or journalists. Except that would be missing the other point.  As Jess McAllen writes at The Spinoff, the phrasing and placement of the story, especially the original one, is a clear message to anyone with depression, or anxiety, or ADHD. Anyone who wants to think about the children might think about what that message does for rather more than 0.1% of them.

(via @publicaddress)

New StatsNZ data interface


The service lets you click on a map and see data for that location (eg, that meshblock).  You can also download the data and map information, or write computer queries to its API.

Briefly

  • Andrew Chen has some interestingly boring graphs of election turnout:
    flagout
    Voting in referendums is lower than in general elections, but in a fairly uniform way.
  • The Herald, which usually does better than this, ran a story about voting intentions for round two based on a bogus clicky poll
  • “What’s the likelihood you will run into a 0-4 year to lighten up your day?” In Vancouver, toddlers tend to live in apartments. (via @vb_jens)
  • “70% Of Iran’s Science And Engineering Students Are Women”, from Amy Guttman at Forbes

Stat of the Week Competition: December 12 – 18 2015

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday December 18 2015.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of December 12 – 18 2015 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.

December 11, 2015

Flag notes

So, the preliminary count is up and it’s a victory for opinion polling, if not aesthetics.

As the UMR poll predicted, the two Lockwood ferns were very close, and everything else was well back.  Red Peak did a bit worse than in the opinion poll.

It’s notable that each of the two Lockwood ferns got more votes than Red Peak, Hypnoflag, Qualmark Fern, Informal, and Invalid put together.

It is, of course, theoretically possible that there is a Condorcet winner other than one of the Lockwood ferns, because almost anything is theoretically possible in voting-system maths. All you’d need is about three-quarters of the votes ending up at each fern at stage 3 to have Red Peak rather than the other fern next on their preference list.
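
For anyone who wants to see how that could happen, here is a minimal Python sketch of a Condorcet check. The ballot counts are entirely hypothetical (they are not the referendum results), the fern names are placeholders, and `condorcet_winner` is an illustrative helper that assumes every ballot ranks every option.

```python
def condorcet_winner(ballots):
    """Return the option that wins every head-to-head contest, or None.

    `ballots` maps a full ranking (tuple, most preferred first)
    to the number of voters who cast that ranking.
    """
    options = {option for ranking in ballots for option in ranking}

    def beats(a, b):
        # Count voters ranking a above b, and voters ranking b above a
        a_votes = sum(n for r, n in ballots.items() if r.index(a) < r.index(b))
        b_votes = sum(n for r, n in ballots.items() if r.index(b) < r.index(a))
        return a_votes > b_votes

    for candidate in options:
        if all(beats(candidate, other) for other in options if other != candidate):
            return candidate
    return None


# Hypothetical ballots: two ferns split the first preferences,
# and every fern voter ranks Red Peak second.
ballots = {
    ("Fern A", "Red Peak", "Fern B"): 40,
    ("Fern B", "Red Peak", "Fern A"): 40,
    ("Red Peak", "Fern A", "Fern B"): 20,
}

print(condorcet_winner(ballots))
```

In this made-up example Red Peak has the fewest first preferences but beats each fern 60 to 40 head-to-head, because every fern voter ranks it above the other fern.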

Against sampling?

Stuff has a story from the Sydney Morning Herald, on the claim that smartphones will be obsolete in five years. They don’t believe it. Neither do I, but that doesn’t mean we agree on the reasons.  The story thinks not enough people were surveyed:

The research lab surveyed 100,000 people across its native Sweden and 39 other countries.

With around 1.9 billion smartphone users globally, this means ConsumerLab covered just 0.0052 per cent of active users for its study.

This equates to about 2500 in each country; the population of Oberon

If you don’t recognise Oberon, it’s a New South Wales town slightly smaller than Raglan.

Usually, the Sydney Morning Herald doesn’t have such exacting standards for sample size. For example, their recent headline “GST rise backed by voters if other taxes cut: Fairfax-Ipsos poll” was based on 1402 people, about the population of Moerewa.

The survey size is plenty large enough if it was done right. You don’t, as the saying goes, have to eat the whole egg to know that it’s rotten. If you have a representative sample from a population, the size of the population is almost irrelevant to the accuracy of survey estimates from the sample. That’s why opinion polls around the world tend to sample 1000-2000 people, even though that’s 0.02-0.04% of the population of New Zealand, 0.004-0.009% of the population of Australia, or 0.0003-0.0006% of the population of the USA.
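
A quick sketch of why, in Python, assuming a simple random sample and the usual Normal approximation for an estimated proportion. The population figures are rough 2015 values assumed here for illustration, and `margin_of_error` is a hypothetical helper, not anything from the survey or the story.

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple
    random sample of size n, with the finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    # The correction shrinks the error only when n is a real fraction of the population
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


# Rough population sizes (assumed, for illustration only)
populations = {
    "New Zealand": 4_600_000,
    "Australia": 24_000_000,
    "USA": 320_000_000,
}

for country, size in populations.items():
    moe = margin_of_error(1000, size)
    print(f"{country}: +/- {100 * moe:.2f} percentage points")
```

With a sample of 1000 the margin of error comes out at about 3.1 percentage points for all three countries. The finite population correction only starts to matter when the sample is a sizeable fraction of the population, which is never the case for national polls.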

What’s important is whether the survey is representative, which can be achieved either by selecting and weighting people to match the population, or by random sampling, or in practice by a mixture of the two.  Unfortunately, the story completely fails to tell us.

Looking at the Ericsson ConsumerLab website, it doesn’t seem that the survey is likely to be representative — or at least, there aren’t any details that would indicate it is. This means it’s like, say, the Global Drug Survey, which also has 100,000 participants, out of over 2 billion people worldwide who use alcohol, tobacco, and other drugs, and which Stuff and the SMH have reported on at great length and without the same skepticism.

December 10, 2015

#!;@? punctuation

 

From Stuff

Ending your texts with a full stop is truly monstrous. We all know this. Grammar be darned, it just doesn’t look friendly.

Now a study has confirmed it. Researchers led by Binghamton University’s Celia Klin report that text messages ending with a full stop are perceived as being less sincere, probably because the people sending them are heartless.

Or from The Wireless, which at least knows what ‘grammar’ means:

Researchers at the University of Binghamton in the US have released a study concluding that the full stop is evil. Or, in their own words: “Inclusion of a sentence-final period in text messages affects readers’ perception of the sincerity of the messages”. 

The quote is correct; that is what the researchers said! But as graphics guru Edward Tufte points out, one of the characteristics of numbers is that they have a magnitude as well as a direction!

How evil is the full stop? Well, assuming the people you text have the same assumptions about writing as undergraduate students in upstate New York, the following graph gives a picture! These are two Normal distributions with the mean and variance that the researchers found for messages with and without a full stop!

 sincereness  

Truly monstrous and evil!!

The press release also says

In some very recent follow-up work, Klin’s team found that a text response with an exclamation mark is interpreted as more, rather than less, sincere.

That’s a relief!!!!

December 9, 2015

Briefly

  • “Clearly, they had to train the model against some existing data set, and the one they chose was Nazi Germany in the run-up to Operation BARBAROSSA.” There are newly declassified documents about the USSR’s computer model for the risk of a US surprise attack. As Alex Harrowell writes, “The neural network that classifies cat photos must by definition contain enough information to make a random collection of pixels catlike, although uncannily not quite right. Similarly, RYAN picked up a lot of unrelated data and invariably made it vaguely Hitler-y.”
  • The usually-reliable New York Times has an infographic saying luxury hotels spend $1200 per room per month on bathroom products. Felix Salmon doesn’t believe it. Neither do I.
  • As I’ve pointed out from time to time, you don’t have to choose which of the AA and the petrol companies to believe about costs and prices: MBIE monitors the ‘importer margin’ on a weekly basis.
  • A book recommendation list from people involved with the UK charity Sense about Science
  • A description of how an infographic about bacteria in house dust was designed, at Scientific American