Posts from July 2012 (55)

July 17, 2012

Excellence in Statistical Reporting Award

The American Statistical Association gives an annual award for Excellence in Statistical Reporting.  This year it goes to Amanda Cox, a graphics editor at the New York Times, some of whose graphs we’ve highlighted on the blog.  Here are some more examples and talks.

The award was created to encourage and recognize members of the communications media who have best displayed an informed interest in the science of statistics and its role in public life. The award can be given for a single statistical article or for a body of work. In selecting the recipient, consideration is given to:

  • Correctness, clarity, fairness, brevity, and professionalism of the communication
  • Importance, relevance and overall effectiveness in impacting the intended audience
  • Impact on the growth and national or regional exposure of statistics
  • Appreciation and emphasis of the statistical aspects of a particular issue or event
  • Excellent coverage of research on statistics or statistical issues

You’re all individuals

The Herald is at least showing some scepticism about Italian-style patisserie that is supposed to make you lose weight (the treats include green tea and guarana, i.e., caffeine). The manufacturer isn’t willing to give any numbers:

But she said it was not possible to measure how much eating the treats would help boost the metabolism because ‘everyone is different’.

Of course, this is a pretty transparent excuse.  If the fact that everyone is different made measurements impossible, medical science would be in a bad way.  We can measure the average effect.  We can measure the variability in the effect.  We can measure the proportion of people helped.  And we do all these things.  For example, we’ve known for years that angiotensin-converting enzyme inhibitors reduce blood pressure on average by about 10 mmHg.  More recently, some Sydney researchers reanalyzed the data from the randomized trials to look at how much person-to-person variation in effect there was, and found it was extremely small.

The story goes on to say

Registered public health nutritionist Charlotte Stirling-Reed said that consumers should always look for evidence before making purchases based on health claims.

True, but that would spoil all the fun.

Margin of error yet again

In my last post I more-or-less assumed that the design of the opinion polls was handed down on tablets of stone.  Of course, if you really need more accuracy for month-to-month differences, you can get it.  The Household Labour Force Survey gives us the official estimates of the unemployment rate.  We need to be able to measure changes in unemployment that are much smaller than a few percentage points, so StatsNZ doesn’t just use independent random samples of 1,000 people.

The HLFS sample contains about 15,000 private households and about 30,000 individuals each quarter. We sample households on a statistically representative basis from areas throughout New Zealand, and obtain information for each member of the household. The sample is stratified by geographic region, urban and rural areas, ethnic density, and socio-economic characteristics. 

Households stay in the survey for two years. Each quarter, one-eighth of the households in the sample are rotated out and replaced by a new set of households. Therefore, up to seven-eighths of the same people are surveyed in adjacent quarters. This overlap improves the reliability of quarterly change estimates.

That is, StatsNZ uses a much larger sample, which reduces the sampling error at any single time point, and surveys the same households more than once, which reduces the sampling error when estimating changes over time.  The example they give on that web page shows that the margin of error for annual change in the employment rate is on the order of 1 percentage point.  StatsNZ calculates sampling errors for all the employment numbers they publish, but I can’t find where these are published.

[Update: as has just been pointed out to me, StatsNZ publish the sampling errors at the bottom of each column of the Excel version of their table,  for all the tables that aren’t seasonally adjusted]
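The variance arithmetic behind that overlap benefit is easy to sketch. Here’s a minimal illustration in Python; the unemployment rate, sample size, and between-quarter correlation are made-up numbers for illustration, not StatsNZ’s actual design parameters:

```python
import math

def se_proportion(p, n):
    """Standard error of an estimated proportion from a simple random sample."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical numbers: a 6.5% unemployment rate, 15,000 responding individuals.
p, n = 0.065, 15000
se = se_proportion(p, n)

# Two independent samples: the variances of the two estimates add.
se_change_independent = math.sqrt(2) * se

# Overlapping samples: var(p2 - p1) = 2 * var(p) * (1 - rho), where rho is the
# correlation between quarters induced by re-surveying the same households.
rho = 0.5  # made-up correlation, for illustration only
se_change_overlap = math.sqrt(2 * se ** 2 * (1 - rho))

print(round(100 * se_change_independent, 2), round(100 * se_change_overlap, 2))
```

The positive correlation between quarters subtracts from the variance of the change, so the overlapping design gives a noticeably smaller standard error for quarter-to-quarter movements than two independent samples of the same size would.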

July 16, 2012

Euro-zone debt crisis hits number-crunchers, too …

Here’s a different statistical take on the Euro-zone crisis:

Debt crisis: Italy’s statisticians threaten ‘stats black-out’

Italy’s official statisticians are threatening to down calculators and stop reporting on its stricken economy – as they themselves fall victim to the recession they are paid to track. 

Read the details here.

Ewen Macdonald: (Vile) trial by opinion poll

This appeared on stuff.co.nz and in Fairfax papers last week:

After a harrowing trial that gripped the nation, a survey has revealed just one in five New Zealanders think Ewen Macdonald did not murder his brother-in-law Scott Guy.

A jury of 11 handed down a not guilty verdict to Macdonald, 32, last week, after a month-long trial in the High Court at Wellington.

But results to be made public by market research company UMR today show just 20 per cent of people surveyed agreed with Ewen Macdonald being acquitted of slaying Mr Guy outside his rural Feilding home in July 2010.

Living in New Zealand means agreeing to deal with criminal allegations transparently in the courtroom, not the court of (ill-informed, speculative) public opinion. The only people with the information on which to make an informed opinion are members of the jury – and they have delivered a verdict that police will not appeal.  What was UMR thinking?

When a dog bites a man, that’s not news

A question on my recent post about political opinion polls asks

– at what point does the trend become relevant?

– and how do you calculate the margin of error between two polls?

Those are good questions, and the reply was getting long enough that I decided to promote it to a post of its own. The issue is that proportions will fluctuate up and down slightly from poll to poll even if nothing is changing, and we want to distinguish this from real changes in voter attitudes — otherwise there will be a different finding every month and it will look as if public opinion is bouncing around all over the place.  I don’t think you want to base a headline on a difference that’s much below the margin of error, though reporting the differences is fine if you don’t think people can find the press release on their own.

The (maximum) margin of error, which reputable polls usually quote, gives an estimate of uncertainty that’s designed to be fairly conservative. If the poll is well-designed and well-conducted, the difference between the poll estimate and the truth will be less than the maximum margin of error 95% of the time for true proportions near one-half, and more often than 95% for smaller proportions.  The difference will be less than half the margin of error about two-thirds of the time, so being less conservative doesn’t let you shrink the margin very much.   In this case the difference was well under half the margin of error.  In fact, if there were no changes in public opinion you would still see month-to-month differences this big about half the time.
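Both of those coverage claims can be checked with a quick simulation (a sketch with made-up poll parameters: a true proportion of one-half and a sample of 1,000):

```python
import math
import random

random.seed(1)

N = 1000        # poll sample size
P = 0.5         # true proportion: the worst case for the margin of error
sims = 2000     # number of simulated polls
moe = 1 / math.sqrt(N)   # maximum margin of error, about 3.2%

within_full = within_half = 0
for _ in range(sims):
    estimate = sum(random.random() < P for _ in range(N)) / N
    if abs(estimate - P) < moe:
        within_full += 1
    if abs(estimate - P) < moe / 2:
        within_half += 1

# Close to 0.95 and close to two-thirds, respectively.
print(within_full / sims, within_half / sims)
```

The simulated poll lands within the maximum margin of error about 95% of the time, but within half of it only about two-thirds of the time, which is why relaxing the 95% convention doesn’t buy you a much smaller margin.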

For trends based on just two polls, the margin of error is larger than for a single poll, because it could happen by chance that one poll was a bit too low and the other was a bit too high: the difference between the two polls can easily be larger than the difference between either poll and the truth.

The best way to overcome the random fluctuations and pick up small trends is to do some sort of averaging of polls, either over time or over competing polling organisations.  In the US, the website fivethirtyeight.com combines all the published polls to get estimates and probabilities of winning the election, and it does very well in short-term predictions.  Here’s a plot for the Australian 2007 election, by Simon Jackman of Stanford, where you can see individual poll results (with large fluctuations) around the average curve (which has much smaller uncertainties).  KiwiPollGuy has apparently done something similar for NZ elections (though I’d be happier if their identity or their methodology was public).

So, how are these numbers computed?  If the poll was a uniform random sample of N people, and the true proportion was P, the margin of error would be 2 × √(P(1−P)/N).  The problem then is that we don’t know P — that’s why we’re doing the poll. The maximum margin of error takes P = 0.5, which gives the largest margin of error, and one that’s pretty reasonable for a range of P from, say, 15% to 85%. The formula then simplifies to 1/√N.  If N is 1,000, that’s 3.16%; for N = 948, as in the previous post, it is about 3.25%.
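In code, the arithmetic is just (restating the formula above, nothing more):

```python
import math

def margin_of_error(p, n):
    """Margin of error (two standard errors) for a proportion p from a sample of n."""
    return 2 * math.sqrt(p * (1 - p) / n)

def max_margin_of_error(n):
    """The worst case, at p = 0.5, which simplifies to 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

print(round(100 * max_margin_of_error(1000), 2))  # 3.16
print(round(100 * max_margin_of_error(948), 2))   # 3.25
```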

Why is it 2 × √(P(1−P)/N)?  Well, that takes more maths than I’m willing to type in this format, so I’m just going to mutter “Bernoulli” at you and refer you to Wikipedia.

For trends based on two polls, as opposed to single polls, it turns out that the squared uncertainties add: the square of the margin of error for the difference is twice the square of the margin of error for a single poll.  Converting back to actual percentages, that means the margin of error for a difference based on two polls is about 1.4 times larger than for a single poll.
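Since the squared uncertainties add, the factor is just √2 ≈ 1.4. A minimal sketch, using the 948-person poll from the Roy Morgan example:

```python
import math

def max_margin_of_error(n):
    """Maximum margin of error for a single poll of n people: 1 / sqrt(n)."""
    return 1 / math.sqrt(n)

def margin_of_error_of_difference(n):
    """Margin of error for the change between two equally sized, independent polls.

    Squared uncertainties add, so the margin for the difference is
    sqrt(2) times the single-poll margin.
    """
    return math.sqrt(2) * max_margin_of_error(n)

# For a 948-person poll: about 3.2% for one poll, about 4.6% for a change.
print(round(100 * max_margin_of_error(948), 1),
      round(100 * margin_of_error_of_difference(948), 1))
```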

In reality, the margins of error computed this way are an underestimate, because of non-response and other imperfections in the sampling, but they don’t do too badly.

Stat of the Week Competition: July 14 – 20 2012

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday July 20 2012.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of July 14 – 20 2012 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: July 14 – 20 2012

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

July 14, 2012

BBC radio equivalent of StatsChat

If you don’t like StatsChat, you will probably not enjoy “More or Less”, a BBC radio show/podcast/blog on statistics in the British media, presented by Tim Harford.  Their new season starts this week.

Poll shows not much

According to the Herald

The latest Roy Morgan Poll shows support for the National Party has fallen two per cent since early June.

The poll is based on 948 people, so the maximum margin of error (which is a good approximation for numbers near 50%) is about 3.2%, and the margin of error for a change between two polls is about 1.4 times larger: 4.6%.