November 16, 2015

Measuring gender

So, since we’re having a Transgender Week of Awareness at the moment, it seems like a good time to look at how statisticians ask people about gender, and why it’s harder than it looks.

By ‘harder than it looks’ I don’t just mean that it isn’t a binary question; we’re past that stage, I hope.  Also, this isn’t about biological sex — in genetics I do sometimes care how many X chromosomes someone has, but most questionnaires don’t need to know. It’s harder than it looks because there isn’t just one question.

The basic Male/Female binary question can be extended in (at least) two directions.  The first is to add categories to represent other ways people identify their gender beyond just male/female, which can be fluid over time, or can have more than two categories. Here a write-in option is useful since you almost certainly don’t know all the distinctions people care about across different cultures. In a specialised questionnaire you might even want to separate out questions about fluid/constant identity from non-binary/diversity, but for routine use that might be more than you need.

A second direction is to ask about transgender status, which is relevant for discrimination and (or thus) for some physical and mental health risks.  (Here you might also want to find out about people who, say, identify as female but present as male.) We have very little idea how many people are transgender (it makes data on sexual orientation look really precise), and that’s a problem for service provision and in many other areas.

Life would get simpler for survey collectors if you combined these into a single question, or if you had a Male/Female/It’s Complicated question with follow-up questions for the third group. On the other hand, it’s pretty clear why trans people don’t like that approach. These really are different questions. For people whose answer to the first question is something like “it depends” or a culturally specific third option, the combination may not be too bad. The problem comes when the answer to the second type of question might be “Trans (and yes I sometimes get comments behind my back at work but most people are fine)”, but the answer to the first is “Female (and just as female as people with ovaries and a birth certificate, ok)”.
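To make the "different questions" point concrete, here is a minimal sketch of a survey record that stores the two answers in separate fields and tabulates each on its own. The class, field names, category labels, and the four toy responses are all illustrative assumptions, not taken from any real questionnaire.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Response:
    # Question 1: gender identity; multiple answers are allowed.
    identity: Set[str] = field(default_factory=set)
    # Free-text answer to question 1, if the respondent wrote one in.
    write_in: Optional[str] = None
    # Question 2: transgender status, asked separately (None = not answered).
    transgender: Optional[bool] = None


def tabulate(responses: List[Response]):
    """Count answers to each question on its own, without merging them."""
    by_identity = Counter()
    by_trans = Counter()
    for r in responses:
        for label in r.identity:
            by_identity[label] += 1
        if r.write_in is not None:
            by_identity["write-in"] += 1
        by_trans[r.transgender] += 1
    return by_identity, by_trans


# Hypothetical toy data, made up purely for illustration.
responses = [
    Response(identity={"female"}, transgender=True),
    Response(identity={"female"}, transgender=False),
    Response(identity={"male"}, transgender=False),
    Response(write_in="it depends"),
]
by_identity, by_trans = tabulate(responses)
```

In this layout the first two respondents are both simply counted as female in the identity table, and the transgender count comes from the second question alone, so neither answer has to qualify the other.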

Earlier this year Stats New Zealand ran a discussion and had a go at a better gender question, and it is definitely better than the old one, especially when it allows for multiple answers and for a write-in answer. They also have a ‘synonym list’ to help people work with free-text answers, although that’s going to be limited if all it does is map back to binary or three-way groups. What they didn’t do was to ask for different types of information separately. [edit: i.e., they won’t let you unambiguously say ‘female’ in an identity question then ‘trans’ in a different question]
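As a rough sketch of how a synonym list might work with free-text answers, the snippet below normalises each write-in and looks it up in a small table. The table entries, the function names, and the "unclassified" label are hypothetical and are not Stats New Zealand's actual list. The design choice worth noting is that the lookup returns a detailed code and keeps the original text, rather than collapsing everything into two or three groups.

```python
from typing import Tuple

# Hypothetical synonym list: normalised free-text answer -> detailed code.
SYNONYMS = {
    "female": "female",
    "woman": "female",
    "male": "male",
    "man": "male",
    "non binary": "non-binary",
    "nonbinary": "non-binary",
    "gender fluid": "gender fluid",
    "genderfluid": "gender fluid",
}


def normalise(text: str) -> str:
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(text.lower().replace("-", " ").split())


def code_answer(text: str) -> Tuple[str, str]:
    """Return (detailed code, original text) for one free-text answer.

    Answers missing from the list are marked 'unclassified' rather than
    forced into an existing group, and the original wording is retained
    so the list can be extended later.
    """
    detailed = SYNONYMS.get(normalise(text))
    if detailed is None:
        return ("unclassified", text)
    return (detailed, text)
```

For example, `code_answer(" Non-Binary ")` gives the code `"non-binary"`, while `code_answer("it depends")` is left as `"unclassified"` with its text intact for later review.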

It’s true that for a lot of purposes you don’t need all this information. But then, for a lot of purposes you don’t actually need to know anything about gender.

(via Writehanded and Jennifer Katherine Shields)

Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • Megan Pledger

    I participated in that group and presented a potential question because if the people who generally answer a gender question without a second thought have even a slight error rate (or get bolshie) then it’s going to swamp the gender diverse group.

    I actually thought the question shouldn’t be put to anyone under 18 (maybe under 16), because they are probably still living at home and might have problems if they answer honestly (as they are supposed to) and family members see their answer. (And the answer is potentially unstable anyway in the younger age groups.)

    An 18 year old can probably tell their parents that they don’t want them to see their form but a 14 year old can’t.

    9 years ago

    • Thomas Lumley

      Error rate, either accidental or deliberate, is definitely a problem, as it is for sexual orientation.

      9 years ago

  • John Egan

    Well said Tom. At the university the new default is Male, Female, X, or Other. The X one strikes me as strange, since I’ve never met anyone who identifies as X. I do know a lot of people who identify as trans or gender fluid. But a fa’afafine, takatāpui, or FTM all have different experiences. However, on surveys having 12 options doesn’t, from my experience, get much better usable data: the numbers are often so low in each that they get aggregated regardless.

    9 years ago

  • David Hood

    I have encountered, possibly slightly tongue in cheek, a bit of mourning that people’s nice easy population pyramid graphs are going to go away. I just want to say:

    * 3D graphs are generally bad

    * Pie graphs are bad

    * Population pyramid graphs are actually a 3D graph, made by rotating a binary nominal variable axis through the third dimension. It is like a histogram crossed with a pie chart :) In fact, since there clearly must be a z-axis that it is rotating through, you can imagine it as looking down on a pie graph where the two segments do not necessarily have the same radius.

    Conclusion: don’t mourn the population pyramid, embrace the facet graph.

    FWIW I work a lot with historic data (reconciling 140 years of how people described their religion, hurrah) but would much rather have a source that reflects the true complexities of society than official classifications. I am reminded of last century’s discussions around what “race” was, see “Stories of old computer risks” in https://catless.ncl.ac.uk/Risks/6.50.html

    9 years ago