Stat of the Week Winner: August 6-12 2011
Thanks to all who added nominations for our first Stat of the Week competition. The nominations were all fascinating for a variety of reasons and much could be written about each of them. We’ve chosen Eric Crampton’s nomination of John Pagani’s heated blog post on youth unemployment:
In the midst of extensive discussion of the rise in youth unemployment starting around Q4 2008, Pagani points to changes in apprenticeship funding as a policy shift that could have generated the change (arguing against changes in the youth minimum wage as having been the cause). He writes:
“If it wasn’t the removal of the youth minimum wage that caused youth unemployment to increase, then it would have to have been caused by something else that happened around the same time. One other big change was the a sharp fall in young people getting skills for work.
In December 2008 there were 133,300 people in industry training. By the end of last year, there were 108,000.”
You could be forgiven for assuming that about 25,000 kids had been kicked out of apprenticeships – it sure looks like he’s referring to youths. All the other discussion is on youth unemployment. But the number he’s citing is overall enrolment in training and apprenticeships. And the drop in youth enrolment in training – about 4,000 – is nowhere near large enough to provide a plausible alternative explanation.
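To make the comparison concrete, here is a minimal Python sketch of the arithmetic, using only the figures quoted above. The 4,000 youth figure is the approximate value given in the nomination, and the variable names are purely illustrative.

```python
# Total people (all ages) in industry training, as quoted in the blog post.
total_dec_2008 = 133_300
total_end_2010 = 108_000

# Approximate drop in youth enrolment, as stated in the nomination.
youth_drop = 4_000

# Overall drop implied by the two quoted totals.
total_drop = total_dec_2008 - total_end_2010

# Fraction of the overall drop that is attributable to youths.
youth_share_of_drop = youth_drop / total_drop

print(f"Overall drop: {total_drop:,}")
print(f"Youth share of the drop: {youth_share_of_drop:.1%}")
```

On these figures, youths account for roughly one sixth of the overall fall, which is why the headline number overstates the change for young people.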
Congratulations Eric!
Thank you for your award for Stat of the Week, which I will always prize, if ‘always’ is a statistical approximation of ‘never’.
I used a lot of stats in my blog post because I was trying to demonstrate the extent of a new problem with youth unemployment. That’s why I referenced Mr Crampton’s interesting graph, and tried to show we have an unusually dire problem emerging.
Then I wanted to show that the simple explanation might not be the only one available.
That is a lot of statistical work for a short, easy read.
In one sentence in this post I referred to overall numbers in industry training. You seem to have decided this reference is a red herring in a post about youth unemployment.
I wish you had asked me to defend the point because I would have questioned Mr Crampton’s assertions, based on his misuse of statistics.
I can’t see any evidence in your post that the statistic I quoted was wrong.
In my post I accepted that an investigation of Mr Crampton’s graph required me to postulate another policy change that occurred at around the same time. I believe this is scrupulous economics.
I nowhere made the claim that the statistic at issue was exclusively about teenagers.
What I tried to show was that overall numbers in industry training fell sharply. It can be inferred from that precipitous decline that sharp falls also occurred among teenagers.
You and Mr Crampton have inferred I was confusing overall numbers with youth numbers, and therefore I have to accept the inference was reasonable. But the alternative explanation is reasonable too: That the figure was an example of the whole.
And therefore what we have is undesirable ambiguity of expression, which is not the same thing as misuse of statistics.
If I had another thousand words in the blog post, I would have referred to a recent NZ Institute study, citing OECD figures, showing deterioration in numbers of teens leaving school unprepared for work.
This figure supports my view there is a decline in ‘young people getting skills for work’ and demonstrates the category ‘young people getting skills for work’ is much wider than the category ‘young people in industry training.’
But the waterfall of numbers and statistics would make the post unreadable to anyone but an expert.
If there’s been a decline in young people in work, I would predict a rise in the numbers in training, just as I would predict a rise in university enrollments when unemployment rates reduce the ability of young people to directly enter the workforce without qualifications. Therefore, the correct comparison might not be the numbers in industry training in December 2008, but the higher number theory might have predicted last year. So where Mr Crampton accuses me of exaggerating the decline in industry training, it’s plausible (but not provable) that I might have underestimated the decrease in a way favorable to his case, and unfavorable to mine.
My point in the ‘133,300 to 108,000’ sentence was to demonstrate that numbers in industry training went down; it’s not unreasonable to infer many of those affected might have been teenagers.
At most, therefore, you can say I could have added, for clarity, the words ‘of all ages’ in the sentence “In December 2008 there were 133,300 people in industry training. By the end of last year, there were 108,000.”
Still, that statement, existing in a paragraph on its own and unqualified in any way, is literally true. I note neither Mr Crampton nor you dispute its accuracy.
Is a statistical statement that is exactly true really the worst example of statistical misuse in New Zealand this week?
So we turn to context. Did I use the figure in a contextually misleading way?
Let’s compare my statement to Mr Crampton’s reference to a ‘drop in youth enrolment in training’ of 4,000.
What trick is this? He has implied “youth enrolment in training” is the same thing as “apprenticeships” or “young people getting skills for work.” This is, with respect, egregious misdirection.
May I nominate this misdirection, and your acceptance of it, as the worst examples of statistical misuse for this week?
“Youth enrolment in training”, “apprenticeships” and “young people getting skills for work” are radically different concepts.
Industry training is not the same as apprenticeships, and neither is the same as “young people getting skills for work.”
I don’t know where the figure ‘4000’ is sourced from, but even if we accept the statement “the drop in youth enrolment in training [was] about 4,000”, that in no way is conclusive enough to support his conclusion that it’s “nowhere near large enough to provide a plausible alternative explanation.”
He could only make that claim by making “youth enrolment in training” mean the same as “young people getting skills for work.” But if they do mean the same, then the figure 4000 is hopelessly low. He is only counting a subset. Alternatively, whatever the derivation of the number 4000, it can’t plausibly be the same as the full category ‘young people getting skills for work.’
What you and he can’t defend is the equation of ‘youth enrolment in training’ with the statement he asserts I implied: “25,000 kids kicked out of apprenticeships.” Nowhere do I reference, even contextually, that 25,000 kids have been kicked out of apprenticeships. I would not make such a bizarre statement, because I doubt there have ever been 25,000 kids in apprenticeships.
Relying on this misuse of statistics to claim I have misused stats is … ironic.
So, will you nominate yourselves for bad stat of the week?
13 years ago
John, you cited industry training as the relevant statistic. I pointed out what a tiny portion of that stat was comprised of youths. If there’s any problem with industry training being the relevant stat, the fault is yours. If you want the source on the figure, it’s on the TEC website. The one that says “Industry Training 2010”. Here:
http://www.tec.govt.nz/Tertiary-Sector/Performance-information/Industry-training/Industry-training-2010/
Just compare the youth component for end 2010 with the youth component for end 2008.
You’re right that you were literally correct when you cited the training numbers. But given the context, it was either mendacious or incompetent. You might as well have used even bigger numbers and not told anybody that you were referencing WTO worldwide figures.
13 years ago
Suppose I had a big op ed piece on the problem of armed robbery in New Zealand. Then, half way through, I said “Just last year, 250,000 people were victims of armed robbers.” If somebody then rightly pointed out that I was using total stats from some set of 50 countries instead of just NZ, it would be more than a bit rich of me to whine “Oh, I never said ‘in New Zealand’ in that particular sentence! Unfair!”
13 years ago
Excellent start to the competition. Congratulations to both John and Eric.
13 years ago
To Michael I would say that the existence of a disagreement does not make all points in the disagreement valid, nor validate the selection of the example. It could be that they made a mistake.
I notice the silence of the Stats department when invited to defend their decision, suggesting to me that the Department can dish it out, but shamefully can’t defend a dishonest and lazy decision.
How revealing that an academic department is so weak in its integrity, so poor in its intellectual honesty and so cowardly in its public accountability.
Eric, thank you firstly for acknowledging I am ‘literally correct.’
Everything after your ‘but’ avoids entirely the substance of my response. To reprise the points you have been unable so far to answer, despite multiple entries into the comments to attempt to do so:
1. That the category ‘industry training’ is not the only example of youth training in existence, nor the only one I explicitly mentioned.
2. That the reference to industry training could easily be read as an *example* of a decline in training overall, and therefore what I have done at worst is create ambiguity. I recognise I could have been clearer on the point, but that doesn’t make out the claim of a misuse of statistics.
It’s more like saying ‘house prices in Wellington are up. I saw the northern suburbs alone have risen six per cent.’ Naturally, the second statement doesn’t prove the first, but it’s hardly ‘mendacious or incompetent’.
Describing it that way is not a substitute for a substantive response to this point.
You are saying: ‘young people getting skills for work’ is to ‘industry training’ as ‘WTO’ is to ‘New Zealand.’ Even on your figures this analogy is wildly wrong: On YOUR figure the numbers would be an error of 4000-25,000, or one to six. NZ is to WTO about 0.4%, or one to 250.
Since your point is that my comparison appears to exaggerate, what does that make your comparison?
You cannot consistently accuse me of a false comparison and then apply utterly false ones yourself.
3. That while you accuse me of taking too large a category as the starting point of comparison, I accuse you of taking far too low a comparative starting point. The figure you link to (4000) cannot plausibly be equated with the total number of kids getting skills for work.
I notice your response to this point is ‘you cited industry training as the relevant statistic.’ This claim is false.
I cited: “young people getting skills for work.”
I gave ‘industry training’ figures as an example that supports the point, but it is only you who has anywhere conflated the two figures.
4. And, further, that your accusation that my starting point is too wide has to deal with a good case that the starting point I chose is arguably too small.
Once again, I invite you to retract your misuse of statistics, which seems to be compounding with every fresh post, and I invite the stats department to defend or repudiate their original decision.
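For readers who want to check the two ratios compared in point 2, here is a minimal Python sketch. It takes the 4,000 and 25,000 figures and the 0.4% NZ-to-WTO share exactly as stated in the comment; none of them is independently verified here, and the variable names are illustrative only.

```python
from fractions import Fraction

# Youth drop relative to the overall drop, using the rounded figures in the comment.
youth_to_total = Fraction(4_000, 25_000)

# NZ relative to the WTO, using the 0.4% share asserted in the comment.
nz_to_wto = Fraction(4, 1000)

# Express each ratio as "one to N".
youth_one_to_n = 1 / youth_to_total
nz_one_to_n = 1 / nz_to_wto

# How many times more lopsided the second comparison is than the first.
factor = youth_to_total / nz_to_wto

print(f"Youth to total: one to {float(youth_one_to_n):.2f}")
print(f"NZ to WTO: one to {float(nz_one_to_n):.0f}")
print(f"Difference in scale: {float(factor):.0f}x")
```

On these inputs the first ratio is about one to six and the second is one to 250, a gap of roughly forty-fold.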
13 years ago
John, as a reminder, here was the quote that I used in the nomination:
“If it wasn’t the removal of the youth minimum wage that caused youth unemployment to increase, then it would have to have been caused by something else that happened around the same time. One other big change was the a sharp fall in young people getting skills for work. In December 2008 there were 133,300 people in industry training. By the end of last year, there were 108,000. No wonder unemployment has gone up.”
The natural reading of the section is that your presented stat referred to youths in training. Most folks haven’t census or HLFS data sitting at the front of their minds so they wouldn’t recognize that it’s more than a little implausible that about a third of the population cohort was in training, but what they would see is your citing a drop of about 25k people in training and reckon it referred to youths, since every other bit of your article is about youths. If 25k kids had been booted out of industry training programmes by changes in policy under National, that would have been a plausible alternative explanation for the big increase in youth unemployment that started late 2008. But the youth component of that figure – which would have had to have been at your fingertips if you were pulling TEC’s .xls sheets to source your figures – was much much smaller.
You will note that I never disputed the cited total figures; it’s then hardly a concession that I’m happy to point to the TEC figures showing 130k people in Industry Training end-2007, 133.3k end-2008 and 102.5k end-2010 (the last a bit different from your figure, but I figured you there just made an honest mistake in citing the September quarter figure rather than December quarter for year end). But that you had the totals right made things more damning rather than less: the Excel sheets where those numbers live have a separate tab very clearly labelled “Industry by Age”. You could have chosen the relevant statistic. Instead, you chose the bigger and misleading one.
Of course, there are other training programmes for youths. The Modern Apprenticeships programme had enrolled about 8900 15-19 year olds in 2008, dropping to 8100 in 2010. So you could add another 700 to the drop in youths in training. (See the TEC site, linked above, for the data if you’re keen). The drop there seems pretty inconsequential, but if you like 4700 instead of 4000, that’s fine.
But without a longer time series of how youth apprenticeships and industry training have tracked with changes in unemployment over a period running back earlier than 2008, we can’t say whether the drop in training is about what we’d expect with the economic downturn or not. Happy to take your point that more kids will be wanting training when the economy’s poor if you’ll take mine that fewer employers will be happy to take on trainees when they’re contemplating layoffs for more senior staff. I’d expect the supply side constraint here to be the more binding.
My point with the fictional WTO stat was to show the kind of error you made, and how large the discrepancies can be with such a technique.
Finally, I’ll dispute your new analogy on housing prices. What you did was rather more: “house prices in Wellington are down a bit. REINZ data shows house prices dropped only $4000 in the year to July – just a bit over a percent.” Except that the real figure I’m citing there, for the country as a whole, bears little resemblance to the Wellington data where prices dropped $13,750 over the same period – a bit over three percent. See what I did there? I misled people without actually having a sentence, taken alone, be factually incorrect. It’s a nice trick, until somebody calls you on it.
13 years ago
If you wanted to argue ‘the drop off in industry training is not enough to explain the increase in youth unemployment because youth unemployment is only a subset of the drop off in industry training’, then that would be an arguable point.
Then we could get onto a debate about whether supply or demand side constraints explain the phenomenon, and – more importantly – the extent to which other categories have dropped off as well.
But you are making a different argument: that I was mendacious and incompetent for even citing the drop off in industry training as an example in the context of youth unemployment.
But they have dropped off. So have other categories of young people getting skills for work. I don’t see this as misleading.
If I had said: ‘here’s one example where training has fallen away’, then your case collapses completely.
So everything in your case turns on an assertion that there is no reasonable reading of the sentence as an example; that the only reasonable reading is that the number I gave was an attempt to describe every possible category of young people getting skills for work.
I doubt most people read it that way, though, because most of us know young people get skills elsewhere. Adult community education, for example.
If I wanted to make the point you claim I was trying to make, I would have written: There has been a sharp fall in young people in industry training. And then given the figures for the fall in industry training.
But I didn’t. I used two different category descriptors because I was describing two different categories.
I said there has been a sharp fall in the numbers of “young people getting skills for work.”
I’m not saying it is unreasonable to read it the way you did; I am saying it is also reasonable to read it the way it is intended.
I take the responsibility for ambiguity, but not for misusing a statistic, because the figure is not misused.
You seem to be arguing that because the industry training figures don’t explain the fall on their own, they are incapable of being a component of the whole. In this you are wrong.
I made the point that category A has fallen. I present by way of evidence that a subset of category A has fallen. Indeed that does not entirely prove my first point, but nor is it inconsistent.
One reason that numbers of young people getting skills for work is down is that young people in industry training is down, and the overall numbers in industry training give an idea of the direction.
Your entire point is that you felt I was representing the subset as the entire category.
But you would need to explain why I used two different category definitions.
And you would have to explain why the decrease in young people in industry training is inconsistent with my claim that numbers of young people getting skills for work are down sharply.
If you can’t do the latter, then it can’t be misleading.
13 years ago
I rather suspect that a reasonable majority would have read your sentence as I did, and that that’s why it got the prize.
Of course changes in training could explain some of the difference. But be careful here. The graph of mine that you put up shows the excess of youth unemployment over what we’d expect youth unemployment to be given adult unemployment rates. If the training changes hit only the younger cohort, then it helps explain the gap. But as it obviously hit both kids and adults, it’ll show up in both adult and youth unemployment rates. And so it’s only the extent to which youths are disproportionately enrolled in training rather than the raw youth numbers that could explain the gap between expected and realized outcomes. And that’s much smaller than the drop in youth enrolment.
13 years ago
A very good example of misleading statistics, don’t blame the stats department.
13 years ago