September 5, 2012

Lotto and abstract theory

There is a recurring argument in statistics departments around the world about how much abstract theory should be taught to students, and how much actual applied statistics. One of the arguments in favour of theory, even for students who are being trained to do applied data analysis, is that theory gives you a way to substitute calculation for thought. Thinking is hard, so we try to save it for problems where it is needed.

The current top Google hit for “big wednesday statistics” offers a nice illustration. It’s a website selling strategies to increase your chance of winning, based on a simple message:

If you play a pattern that occurs only five percent of the time, you can expect that pattern to lose 95 percent of the time, giving you no chance to win 95 percent of the time. So, don’t buck the probabilities.

For example,

When you select your lotto numbers, try to have a relatively even mix of odd and even numbers. All odd numbers or all even numbers are rarely drawn, occurring only one percent of the time. The best mix is to have 2/4, 4/2 or 3/3, which means two odd and four even, or four odd and two even, or three odd and three even. One of these three patterns will occur in 83 percent of the drawings.

Now, if you understand how the lottery is drawn and know some basic probability, you can tell that this advice can’t possibly work, without even reading it carefully. But if you had to explain the fallacy to someone, it might take a bit of thought to locate it. If 99% of wins have a mixture of odd and even (actually, more like 98%), why doesn’t that make it bad to choose all odd or all even?

When you have an answer (or have given up), click through for more:

It’s more likely that a mixture of odd and even will win, but there are many more such combinations, so it’s less likely that the winning combination will be the one you picked.  These factors exactly cancel each other out: there are 177100 all-odd combinations, 177100 all-even combinations, and 15536500 mixed combinations.
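The counts above can be reproduced with a few binomial coefficients. This is a minimal sketch that assumes the game draws 6 numbers from 1 to 50 (25 odd, 25 even); the post does not state the format, but that assumption matches its total of 15890700 combinations. The variable names are illustrative only.

```python
from math import comb

# Assumed game format: 6 numbers drawn from 1..50 (not stated in the post,
# but consistent with its total of 15890700 combinations).
BALLS, PICKS = 50, 6
ODD = EVEN = BALLS // 2  # 25 odd numbers and 25 even numbers

total = comb(BALLS, PICKS)         # every possible combination
all_odd = comb(ODD, PICKS)         # combinations using only odd numbers
all_even = comb(EVEN, PICKS)       # combinations using only even numbers
mixed = total - all_odd - all_even # everything else

print(total, all_odd, all_even, mixed)
print(f"share of combinations that mix odd and even: {mixed / total:.1%}")
```

Under that assumption the mixed share comes out just under 98%, in line with the "more like 98%" remark earlier in the post.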

The chance of winning with an all-even combination is the chance that an all-even combination wins, times the chance that the one that wins is yours:  (177100/15890700)×(1/177100)=1/15890700.

The chance of winning with an all-odd combination is the chance that an all-odd combination wins, times the chance that the one that wins is yours:  (177100/15890700)×(1/177100)=1/15890700.

The chance of winning with a mixed combination is the chance that a mixed combination wins, times the chance that the one that wins is yours: (15536500/15890700)×(1/15536500)=1/15890700.

If you don’t understand how the probabilities work, it might seem an amazing coincidence that these fractions cancel perfectly, but they have to cancel because we know the final answer must be the same. Thinking about how it works is harder than just knowing the probabilities are equal.
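The cancellation can be checked with exact fractions. The sketch below uses the class sizes quoted in the post; the helper name `chance_of_winning` is made up for illustration.

```python
from fractions import Fraction

TOTAL = 15890700  # total number of combinations, as quoted in the post

def chance_of_winning(class_size: int, total: int = TOTAL) -> Fraction:
    """P(a combination from this class wins) * P(it is yours | class wins)."""
    return Fraction(class_size, total) * Fraction(1, class_size)

# Class sizes quoted in the post.
classes = {"all-odd": 177100, "all-even": 177100, "mixed": 15536500}
results = {name: chance_of_winning(size) for name, size in classes.items()}

for name, p in results.items():
    print(f"{name}: {p}")
```

The class size appears once in a numerator and once in a denominator, so it drops out whatever its value; that is the sense in which the fractions have to cancel.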

 

 

 


Thomas Lumley (@tslumley) is Professor of Biostatistics at the University of Auckland. His research interests include semiparametric models, survey sampling, statistical computing, foundations of statistics, and whatever methodological problems his medical collaborators come up with. He also blogs at Biased and Inefficient.

Comments

  • avatar
    Anne

    I have a lotto question. It makes sense to me that playing the same numbers always means you have a better chance of winning. If you chose different numbers each time, it means you need to choose the right number at the right time; whereas if you play every week using the exact same number your chance of winning should increase, because you only need that number to come up once a year, at any time. Show the theory to either prove or disprove this statement.

    12 years ago

    • avatar
      Thomas Lumley

      Disprove:

      Most simply, your number choice is independent (causally, and statistically) of which ball comes out of the machine. Therefore your chance of winning is the same either way.

      By computation:
      If you pick a different combination each time, then over a year you have 52 combinations (×4 for four lines), each with one chance to win, for a total of 52×4×1/15890700.

      If you pick the same combinations each week, you have 1 combination (×4 for four lines), with 52 chances to win, for a total of 1×4×52/15890700.

      It’s the same number either way.

      12 years ago
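The arithmetic in the reply above can be checked exactly. This is a rough sketch, assuming four distinct lines per draw and 52 independent draws a year; the variable names are illustrative.

```python
from fractions import Fraction

N = 15890700          # total combinations, from the post
LINES, WEEKS = 4, 52  # four lines per ticket, one ticket a week

# Expected number of top-prize wins per year under each strategy.
expected_different = Fraction(WEEKS * LINES * 1, N)  # 52 combinations, 1 chance each
expected_same = Fraction(1 * LINES * WEEKS, N)       # 1 combination, 52 chances

# Each draw is independent of your choice, so the weekly chance is LINES/N
# whichever numbers you hold (assuming the four lines are distinct).
p_week = Fraction(LINES, N)
p_at_least_one_win = 1 - (1 - p_week) ** WEEKS

print(expected_different == expected_same)
print(float(p_at_least_one_win))
```

Because the weekly chance does not depend on which numbers are held, both the expected number of wins and the chance of at least one win in the year come out the same for either strategy.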

      • avatar
        Nick Iversen

        How do you answer this question easily?

        If you put in two entries in a given week what is the optimal number-picking strategy for the two series of numbers? Should the numbers overlap?

        You can’t answer by saying it doesn’t matter what numbers you choose because you have to share the prizes with other players.

        For example, if you make both tickets identical and you are the only winner then one ticket was wasted. But if you share the prize with someone else then the other ticket becomes valuable.

        Given the possibility of a wasted ticket then it seems to me that the best strategy for multiple entries is to make the entries as disparate as possible.

        Picking tickets at random doesn’t achieve this – ergo there exists a better strategy than picking at random.

        12 years ago

        • avatar
          Thomas Lumley

          There are two issues here. The first is the probability of getting the top prize, the second is sharing with other people.

          If we consider only the division 1 prize, and consider only sharing with yourself, not with other people, you can’t beat a simple random sample without replacement from the possible combinations. That is, choose them at random but drop any exact duplicates within the same draw.

          For sharing with other people, the strategy depends on what they choose, which makes it hard. It’s no longer a trivial probability problem, it’s a non-trivial game theory problem. This is especially true as we don’t have good data and we don’t know what utility function to use.

          You still want to avoid exact duplicates: two identical combinations give you half the chance of winning that two different combinations do, and increase the winnings if you do win by less than a factor of two. (That’s true for expected returns, and even more true for more plausible utility functions.)

          I believe simple random sampling without replacement is the minimax strategy, and an evolutionarily stable strategy — that is, if everyone has the same information and is proceeding rationally, you still can’t reliably beat random choice dropping duplicates. In reality, though, lots of people pick their birthdays, so you can do better.

          In this case I would be surprised if you could beat a strategy of developing a list of undesirable features in a combination (eg all under 30), and using simple random sampling without replacement from the remainder. But I can’t prove it.

          12 years ago

        • avatar
          Thomas Lumley

          I should also note that while random sampling without duplicates is better than pure random sampling, the difference is tiny. Your chance of matching 6 balls with a minimum 4-line ticket is

          4/15890700 without duplicates, against 3.999999622/15890700 with pure random sampling.

          12 years ago
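The two figures in the comment above can be reproduced as follows. This is a sketch under the assumption that "pure random sampling" means four lines drawn independently and uniformly, so that duplicates are possible; exact fractions are used because the difference is too small to trust to floating point.

```python
from fractions import Fraction

N = 15890700  # total combinations, from the post
LINES = 4     # minimum ticket: four lines

# Four distinct lines: the winning combination matches at most one of them.
p_no_duplicates = Fraction(LINES, N)

# Four independent uniform picks (duplicates allowed): chance that at least
# one of them matches the winning combination.
p_pure_random = 1 - (1 - Fraction(1, N)) ** LINES

# Express both as "x / 15890700" for comparison with the comment.
print(f"{float(p_no_duplicates * N):.9f} / {N}")
print(f"{float(p_pure_random * N):.9f} / {N}")
```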

  • avatar

    Simply put, the rule is get a lucky dip, don’t choose your own numbers.

    12 years ago

  • avatar
    Steffen Klaere

    I like to view playing the lottery as a Poisson process with parameter 1/15,890,700. Then the average waiting time for a division 1 prize is 15,890,700 drawings, i.e. approximately 15,000 years when one considers two drawings per week. As a comparison, Auckland City council estimates the average waiting time for a volcano to erupt in Auckland to be 1,000 years. In Stats 210 we computed that based on this the probability of a volcano erupting in the next 50 years is approximately 5%. Adapting this to the lottery, the chance of a division 1 prize in 50 years is 0.03%. With these probabilities, one shouldn’t worry about optimal choice of numbers, but rather feel very lucky to actually win it.

    12 years ago

    • avatar
      Steffen Klaere

      I do apologize, it is 150,000 years, because even with two drawings a week we still only get 100 draws in a year and not 1,000…

      12 years ago
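The waiting-time figures in the two comments above can be sketched numerically. This assumes, as the commenter does, a Poisson process with about 100 draws a year and one combination played per draw; the constant names are illustrative.

```python
import math

N = 15890700            # total combinations, from the post
DRAWS_PER_YEAR = 100    # commenter's round figure for two draws a week
YEARS = 50

# Lottery: rate of top-prize wins per year when playing one combination per draw.
lotto_rate = DRAWS_PER_YEAR / N
mean_wait_years = 1 / lotto_rate

# P(at least one event in t years) = 1 - exp(-rate * t); expm1 keeps precision
# when the exponent is very small.
p_lotto = -math.expm1(-lotto_rate * YEARS)

# Volcano: mean waiting time of 1,000 years, as quoted in the comment.
p_volcano = -math.expm1(-YEARS / 1000)

print(f"mean wait for a division 1 prize: {mean_wait_years:,.0f} years")
print(f"chance of a win in {YEARS} years: {p_lotto:.4%}")
print(f"chance of an eruption in {YEARS} years: {p_volcano:.1%}")
```

With these inputs the mean wait is on the order of 150,000 years, consistent with the correction in the follow-up comment.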

  • avatar
    Jamie Murdoch

    There are 16 teams in the NRL. Only one team can win the competition each year.

    Assuming all things are equal (which they aren’t in the case of sport) …

    You can either

    (1) back the same team every year and by the law of averages they would win once in the next 16 years.

    or
    (2) back a different team every year, and all averaged out you’d back one winner some time in those 16 years.

    Backing the same team every year would do absolutely nothing to improve your odds (especially if you are like me and are into your 18th season supporting the Warriors :)

    12 years ago

  • avatar
    Paora Yates

    Just a random comment here. I have one rule for picking my lotto lines – always take 2 or 3 of the coldest 17 numbers in each line. You’ll find that covers more than 90% of previous draws and it reduces the possibilities to 2,408,560 (from 3,838,380).

    12 years ago

  • avatar
    Martin Kealey

    While it’s true that removing duplicates only marginally improves the odds of winning the first division prize, it’s not the whole story (just most of it – about 80% of the expected return).

    To a first approximation, I would attempt to eliminate all pairs of games that could potentially simultaneously win something, by not having more than 3 numbers in common between any two games.

    I would also apply this rule to games that other people might be playing in the same draw, such as the “lucky numbers” published with some horoscopes.

    12 years ago

  • avatar
    Joshua Dlamini

    I believe that there are a lot of mathematics facts that could aid in cracking lottery predictions. Also, the abilities of Microsoft Excel shouldn’t be taken lightly for instance. Most statisticians make one mistake when calculating lottery predictions: they don’t deduct the history combination out of the probability because it has already occurred.

    If the probability of winning the lottery is 1/1000000, you cannot say you have 1 chance in a million to win it: the probability was 1/1000000 at the advent of the game, and as the draws pass by the probability gets lower and lower, because certain combinations that form part of the 1/1000000 probability have been exhausted.

    11 years ago

    • avatar
      Thomas Lumley

      No, the lottery has no history, as you can tell from how it is drawn, so no combinations get exhausted.

      11 years ago

      • avatar
        Joshua Dlamini

        It does have history because a combination is never drawn more than once

        10 years ago

        • avatar
          Thomas Lumley

          That’s simply not true. Have you watched a televised lotto draw? The machine draws balls out of a rotating drum and that gives the combination. There isn’t any reference to previous draws in the process.

          10 years ago

  • avatar
    mack sazmand

    Do you think that consecutive numbers have lower probability? such as 1,2,3,…

    10 years ago

    • avatar
      Thomas Lumley

      No. Think of how the lotto machine works. Whether it picks a ball labelled ‘2’ isn’t going to be affected by the number on the first ball it picked.

      10 years ago

  • avatar
    Kevin nosworthyk

    Has anyone in here heard of a lotto abstract painting??

    10 years ago