Posts from November 2016 (30)

November 28, 2016

Stat of the Week Competition: November 26 – December 2 2016

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday December 2 2016.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of November 26 – December 2 2016 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: November 26 – December 2 2016

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

November 26, 2016

Garbage numbers from a high-level source

The World Economic Forum (the people who run the Davos meetings) are circulating this graph:

According to the graph, New Zealand is at the bottom of the OECD, with 0% waste composted or recycled.  We’ve seen this graph before, with a different colour scheme. The figure for NZ is, of course, utterly bogus.

The only figure the OECD report had on New Zealand was for landfill waste, so obviously landfill waste was 100% of that figure, and other sources were 0%.   If that’s the data you have available, NZ should just be left out of the graph — and one might have hoped the World Economic Forum had enough basic cluefulness to do so.

A more interesting question is what the denominator should be. The definition the OECD was going for was all waste sent for disposal from homes and from small businesses that used the same disposal systems as homes. That’s a reasonable compromise, but it’s not ideal. For example, it excludes composting at home. It also counts reuse and reduced use of recyclable or compostable materials as bad rather than good.

But if we’re trying to approximate the OECD definition, roughly where should NZ be? I can’t find figures for the whole country, but there’s some relevant, if outdated, information in Chapter 3 of the Waste Assessment for the Auckland Council Waste Management Plan. If you count just kerbside recycling pickup as a fraction of kerbside recycling+waste pickup, the diversion figure is 35%. That doesn’t count composting, and it’s from 2007-8, so it’s an underestimate. Based on this, NZ is probably between the USA and Australia on the graph.
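As a rough illustration of the arithmetic, here is a minimal Python sketch of a diversion rate that treats an unreported figure as unknown rather than as zero. The function name, the country labels, and the tonnages are all hypothetical, chosen only to show the calculation; they are not the OECD or Auckland Council figures.

```python
from typing import Dict, Optional


def diversion_rate(recycled: Optional[float], landfilled: Optional[float]) -> Optional[float]:
    """Fraction of collected waste that is recycled or composted.

    Returns None when either component was not reported, so that a
    missing figure is never silently treated as zero.
    """
    if recycled is None or landfilled is None:
        return None
    total = recycled + landfilled
    if total <= 0:
        return None
    return recycled / total


# Hypothetical tonnages, chosen only to illustrate the arithmetic.
countries: Dict[str, Dict[str, Optional[float]]] = {
    "Country A": {"recycled": 35.0, "landfilled": 65.0},
    # Only landfill was reported: recycling is unknown, not zero.
    "Country B": {"recycled": None, "landfilled": 100.0},
}

rates = {
    name: diversion_rate(d["recycled"], d["landfilled"])
    for name, d in countries.items()
}

# Countries with an unknown rate are left off the chart entirely.
plottable = {name: rate for name, rate in rates.items() if rate is not None}
print(plottable)
```

With these made-up inputs, the country that reported only landfill is dropped from the chart instead of appearing at 0%.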

Where good news and bad news show up

In the middle of last year, the Herald had a story in the Health & Wellbeing section about solanezumab, a drug candidate for Alzheimer’s disease. The lead was

The first drug that slows down Alzheimer’s disease could be available within three years after trials showed it prevented mental decline by a third.

Even at the time, that was an unrealistically hopeful summary. The actual news was that solanezumab had just failed in a clinical trial, and its manufacturers, Eli Lilly, were going to try again, in milder disease cases, rather than giving up.

That didn’t work, either.  The story is in the Herald, but now in the Business section. The (UK) Telegraph, where the Herald’s good-news story came from, hasn’t yet mentioned the bad news.

If you read the health sections of the media you’d get the impression that cures for lots of diseases are just around the corner. You shouldn’t have to read the business news to find out that’s not true.

November 25, 2016

Thanksgiving

It’s well into Thanksgiving Day in the US now, and that’s a nice tradition to export. So, today, I’m thankful for geophysics.

In the Late Bronze Age, it made perfect sense that earthquakes were caused by God or gods getting upset. That, on a larger scale, is how people often behave, and whether we are made in God’s image or he in ours, you’d expect some similarities.  And when an earthquake destroys a city, well, whether you think God is more offended by homosexuality or homelessness, by not giving enough to the temple or not giving enough to the poor, there’s going to be something in any major city to piss him off.

Now we have maps like this one from GNS Science:

[Image: Kaikoura earthquake faults]

and this one, which I made for a very early StatsChat post, showing all sufficiently-large earthquakes from 1973 to mid-2011.

[Image: world map of earthquakes, 1973 to mid-2011]

Working from travellers’ tales in the Middle East it would be impossible to see the patterns, but technologies including GPS, helicopters, the internet, and a worldwide network of seismometers make them much clearer. Earthquakes mostly happen along a small set of lines, and scientists can measure the strains in the rock around those lines that lead to the earth rupturing. The global pattern, together with a vast network of other evidence, fits an explanation where whole continents are pushed around on the Earth by convection deep inside, bumping and grinding as they collide. It doesn’t fit an explanation based on human behaviour being different in different places, even though that might have seemed a less grandiose explanation before we got the data.

There’s a lot we don’t know about earthquakes, but we understand them well enough to make high-risk/low-risk predictions, to describe the patterns of aftershocks, to do tsunami warnings (on a good day), and to buy and sell earthquake insurance.  We don’t know exactly why one building is destroyed and another is spared, but there aren’t any mysteries about it: it’s the sort of thing we could work out given time and money.

Science isn’t a pure good; there are many things we can do with more knowledge of the world, and the blue circles on the world map above show some seismic events that are the result of human action. But even they have become less frequent.

And now that God has gotten out of the natural-disaster business, many people in this country don’t believe in him, and those that do still believe mostly (with sad exceptions) have a higher opinion of him than their ancestors did.

November 24, 2016

Briefly

  • “The problem scientists have to face here isn’t whether the data is real, but whether this is an appropriate way to represent it.” On the sea-ice graphic that’s going around.
  • “Using the language of economics, judgment is a complement to prediction and therefore when the cost of prediction falls demand for judgment rises. We’ll want more human judgment.” Harvard Business Review
  • Apps blamed for rise in road deaths (NY Times)
  • The sort of basic search skills Tim O’Reilly describes can also be applied to non-political fake news. If you start with “Ice cream for breakfast makes you smarter, claims scientist” from the Herald you can easily find the Japanese story that’s the source. If you look a little harder, as my brother did, you can find the 2013 story on the same Japanese site, which has a little more detail. According to Google Translate, the research was sponsored by an ice-cream company and the source for the story is the company website. The researcher is real, but the research appears not to have been published, and there has been plenty of time since 2013. Ice-cream doesn’t really matter, but the question of which stories in the newspaper we’re supposed to take seriously does matter.

November 23, 2016

Indigenous data – why is it important?

In a data-driven world, indigenous peoples are becoming increasingly concerned about who owns and represents statistics about indigenous people: that is, who has access to the data, its cultural integrity, and how people’s privacy and autonomy are protected.

Not only do governments collect data about their citizens, but so, too, do indigenous peoples about themselves – just think of the data that iwi need to collect about their own people in this post-settlement era. As an example, I’m a registered member of Waikato-Tainui. The central administration knows six or so generations of my whakapapa because becoming registered means putting your links on paper that a kaumatua then signs off. It knows my home marae and all sorts of personal details such as where I live and my birth date. As I have been the privileged recipient of educational scholarships from the iwi, it also knows my academic record and quite a lot of personal stuff about my goals and aspirations.

So why is this important? Indigenous people have historically had a problematic relationship with researchers, academics and other data collectors. Researcher Andrew Sporle, pictured at right (Rangitāne, Ngāti Apa, Te Rārawa) recently told me that “From a Māori perspective, we were all too often the researched, not the researchers, and Māori realities were often portrayed as a strange and inferior ‘other’. Indigenous peoples are asserting the right to govern and protect the data that are so important to our development. We cannot afford to lose control of data about us.”

Data, he added, is a “highly valuable strategic asset” for Māori development. “In the age of big data, Māori want access to data to support our decision‐making and to be involved when big data is used to make decisions about us.”

In this field, things have been moving fast of late, and New Zealand statisticians are among the leaders. Andrew and Tahu Kukutai, pictured left (Ngāti Maniapoto, Te Aupōuri), Associate Professor at the Institute of Demographic and Economic Analysis, University of Waikato, are among the founding members of Te Mana Raraunga (the Māori Data Sovereignty Network), which was set up last year to assert Māori rights and interests in relation to data.

The group’s guiding motto is “He whenua hou, Te Ao Raraunga; Te Ao Raraunga, He whenua hou”, or “Data is a new world, a world of opportunity.”  It advocates “for the development of capacity and capability across the Māori data ecosystem, including data rights and interests, data governance, data storage and security, and data access and control”.

Andrew and Tahu attended last month’s Indigenous Open Data Summit in Madrid, Spain, alongside independent statisticians Kirikowhai Mikaere (Tūhourangi, Ngāti Whakaue) and James Hudson (Ngāti Pukeko, Ngāti Awa, Ngāi Tai, Tūhoe), a researcher for Auckland Council’s Independent Māori Statutory Board. The summit, the first of its kind, provided a forum to discuss what action was being taken to protect the use of data about indigenous peoples.

Tahu and John Taylor, Emeritus Professor at the Centre for Aboriginal Economic Policy Research at the Australian National University,  have edited the just-released first book on indigenous data, titled Indigenous Data Sovereignty – Towards an Agenda, published by ANU Press.

It’s free to download and provides a comprehensive overview of why indigenous oversight of data is important, focusing largely on Australasia. It’s an interesting read and provides a perspective on data that has been missing for too long.

The local contributors include Darin Bishop (Ngāruahine, Taranaki), team leader of organisational knowledge at Te Puni Kōkiri, the Ministry of Māori Development; Dickie Farrar (Whakatōhea, Te Whānau ā Apanui, Te Aitanga ā Mahaki), CEO of the Whakatōhea Māori Trust Board;  James Hudson, mentioned above; Maui Hudson (Ngāruahine, Te Mahurehure, Whakatōhea), Associate Professor in the Faculty of Māori and Indigenous Studies at the University of Waikato; GP Rawiri Jansen (Ngati Hinerangi); Lesley McLean (Whakatōhea, Te Whānau ā Apanui), tribal database coordinator for the Whakatōhea Māori Trust Board; and leading demographer Ian Pool, Emeritus Professor at Waikato University.

November 21, 2016

Stat of the Week Competition: November 19 – 25 2016

Each week, we would like to invite readers of Stats Chat to submit nominations for our Stat of the Week competition and be in with the chance to win an iTunes voucher.

Here’s how it works:

  • Anyone may add a comment on this post to nominate their Stat of the Week candidate before midday Friday November 25 2016.
  • Statistics can be bad, exemplary or fascinating.
  • The statistic must be in the NZ media during the period of November 19 – 25 2016 inclusive.
  • Quote the statistic, when and where it was published and tell us why it should be our Stat of the Week.

Next Monday at midday we’ll announce the winner of this week’s Stat of the Week competition, and start a new one.


Stat of the Week Competition Discussion: November 19 – 25 2016

If you’d like to comment on or debate any of this week’s Stat of the Week nominations, please do so below!

November 20, 2016

Gained in translation

From a talk  at the workshop on Fairness, Accountability, and Transparency in Machine Learning, via Twitter

[Image: translation screenshot, “she is a nurse”]

There’s obviously something wrong with these translations, but it’s also hard to do better.

To step back, there has classically been a translation problem where Greek and Latin have separate words for man as distinguished from woman and for man ‘as distinguished from beasts and angels’. It can be quite hard to guess which word was in the original source, if you’re working from the English translation.  This problem has a simple solution, since modern English also has a clear (and increasingly unavoidable) distinction between ‘man’ on the one hand and  ‘human’ or ‘person’ on the other.

This isn’t that problem.  It’s kind of the opposite.

The correct translation of “O bir doktor” is one of “He is a doctor”, “She is a doctor”, and “They are a doctor” and the correct translation of “O bir hemşire” is one of “He is a nurse”, “She is a nurse”, and “They are a nurse”.  Without more context, though, you can’t tell which, and none of them is unmarked or neutral.  “He” and “She” are obviously too narrow, and while singular “they” has always been standard English for an unspecified individual, it is only recently standard for a specific individual if they have asked to be referred to that way because of non-binary gender identification.

This is an example where the ambiguities probably have to be put back in by humans, because predictive analytics is unavoidably going to follow the stereotypes. Or, as a new Harvard Business Review article rather optimistically says about the impacts of machine learning:

Using the language of economics, judgment is a complement to prediction and therefore when the cost of prediction falls demand for judgment rises. We’ll want more human judgment.
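
The contrast between prediction and judgment can be sketched with a toy Python example. This is not how any real translation system works: the pronoun counts are invented for illustration, and the function names are hypothetical. It only shows that a predictor which picks the most common pronoun in its training text will reproduce whatever skew that text has, while returning every valid reading leaves the choice to a person.

```python
from collections import Counter
from typing import List

# Hypothetical counts of how often each English pronoun appeared with
# each noun in some training text. These numbers are made up.
TRAINING_COUNTS = {
    "doctor": Counter({"he": 70, "she": 25, "they": 5}),
    "nurse": Counter({"he": 10, "she": 85, "they": 5}),
}


def predict_pronoun(noun: str) -> str:
    """Pure prediction: pick the single most frequent pronoun."""
    return TRAINING_COUNTS[noun].most_common(1)[0][0]


def candidate_translations(noun: str) -> List[str]:
    """Keep the ambiguity: return every valid reading for a human to choose from."""
    subjects = ["He is", "She is", "They are"]
    return [f"{subject} a {noun}" for subject in subjects]


print(predict_pronoun("doctor"))
print(predict_pronoun("nurse"))
print(candidate_translations("nurse"))
```

With these invented counts the predictor says "he" for the doctor and "she" for the nurse, which is the stereotype in the data rather than anything in the Turkish sentence.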