Consider the following game: I write down a random real number between 0 and 1, and ask you to guess it. What’s the probability that you guess it correctly? The answer is zero. You might wonder: “But it’s *possible* for me to guess the correct answer! That means that the probability has to be more than zero!” and you would be justified in wondering, but you’d be wrong. It’s true that events that are impossible have zero probability, but the converse is not true in general. In the rest of this post, we show why the answer above was in fact zero, and why this doesn’t need to do irreparable damage to your current worldview.

Let’s begin by showing that the probability that you guess my number right is zero. Let $p$ be the probability in question. The idea is to show that $p \le \epsilon$ for any positive real number $\epsilon$. We know that $p \ge 0$, and if it’s smaller than ANY positive number, then it has to be zero!

The argument is as follows. Let’s call the number I randomly picked $x$. Imagine that the interval $[0,1]$ is painted white. Pick any positive real number $\epsilon$ (no larger than 1, so that a strip of that length fits). Then there is a sub-interval of length $\epsilon$ within the interval $[0,1]$ containing $x$. Imagine that this sub-interval is painted black, so now we have a black strip of length $\epsilon$ on the original white strip, and the number I chose is in the black strip. What’s the probability that your guess lands on the black strip? It has to be $\epsilon$, since that’s the proportion of the white strip that is covered. But in order for your guess to equal my number $x$, it has to land in the black strip, so your probability of guessing $x$ can’t be larger than the probability of guessing a number on the black strip! Therefore $p \le \epsilon$. Since $\epsilon$ was arbitrary, $p$ is smaller than any positive number (apply the bound with $\epsilon/2$), so $p = 0$.
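The strip-painting step can be checked numerically. The following is only a rough sketch: the function name `strip_hit_frequency` and the trial counts are made up for illustration, and floating-point numbers are a finite set, so this simulates the bound on the black strip rather than a true draw from the continuum. It estimates how often a uniform guess lands in a black strip of a given length containing the secret number.

```python
import random

def strip_hit_frequency(x, eps, trials, rng):
    """Estimate the chance that a uniform guess lands in a strip of length eps containing x.

    Assumes 0 <= x <= 1 and 0 < eps <= 1.
    """
    # Centre the strip on x, shifting it if needed so it stays inside [0, 1].
    left = min(max(x - eps / 2, 0.0), 1.0 - eps)
    right = left + eps
    hits = sum(1 for _ in range(trials) if left <= rng.random() <= right)
    return hits / trials

rng = random.Random(0)  # fixed seed so the run is repeatable
x = rng.random()        # the "secret" number
for eps in (0.5, 0.1, 0.01, 0.001):
    freq = strip_hit_frequency(x, eps, 200_000, rng)
    print(f"strip length = {eps:<6} fraction of guesses in the strip = {freq:.4f}")
```

The printed fractions should track the strip length, and since guessing the secret number exactly requires landing in every such strip, its probability is squeezed below each of them.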

You should now be convinced that this event indeed has zero probability of happening, yet it’s still possible. This phenomenon arises from the following geometric fact: **it’s possible to have a non-empty set with zero “volume”.** The term “volume” depends on the context; in the case of a point on the interval $[0,1]$, “volume” is length. The probability of an event measured on the interval $[0,1]$ is equal to its length, and a single point on the interval has zero length, yet it’s still a non-empty subset of the interval! Probability is basically a measure of “volume” where the entire space has “volume” equal to 1. By defining probability in this way, we can prove all kinds of neat facts using something called **measure theory**.

To recap, you should have learned the following from this post:

- The probability of randomly choosing a specific number in the interval $[0,1]$ is equal to zero
- Events that have zero probability are still possible

## 17 comments

July 14, 2011 at 4:30 pm

j2kun

Now wait just a minute. You can’t possibly write down a truly random number between 0 and 1, because, with probability 1, it would not have a finite description!

I think the real distinction is that in our minds, where the pure mathematics lives, probability zero does imply impossibility. On the other hand, if we tried to implement that situation in real life, we would necessarily break the rules of the game: the set of numbers you’d pick from would be finite, bounded by your tolerance for its description length.

Nice proof though, the colored intervals make things very easy to envision, and hence obvious.

July 14, 2011 at 5:15 pm

Alan Guo

Good point. The issue of sampling a truly random real number is tricky. In real life, finiteness conditions pretty much make sure that this kind of stuff doesn’t happen. For example, you can’t pick a positive integer uniformly at random: that would force each integer to have probability zero, and since there are countably many integers, the probability of their union is the sum of their probabilities, which would be zero, but it needs to equal 1!
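The countable-additivity argument in this comment can be written out in one line. As a sketch, suppose (hypothetically) that a uniform distribution on the positive integers existed and gave every integer the same probability $c$; the symbol $c$ is introduced here only for illustration:

$$
1 = P(\{1, 2, 3, \dots\}) = \sum_{n=1}^{\infty} P(\{n\}) = \sum_{n=1}^{\infty} c =
\begin{cases}
0 & \text{if } c = 0, \\
\infty & \text{if } c > 0.
\end{cases}
$$

Neither case equals 1, so no such distribution can exist.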

July 14, 2011 at 5:29 pm

j2kun

Hmm… well I suppose you could construct a truly random number with an infinite sequence of coin flips, determining the binary expansion of the number (i.e. if the ith flip is heads, then the ith digit is a 1, otherwise 0). I’m pretty sure that’d produce a uniformly random number between 0 and 1 in the limit of the sequence, but again, you can’t do it in real life.

Speaking of coin flips, I wish I knew more about probability theory. They say that every fact about probability can be deduced as facts about the probability space of coin flips. I guess that’s because every probability space can be built up from a bunch of spaces of coin flips… Like I said, I wish I knew more.
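A finite truncation of the coin-flip construction from this comment is easy to sketch. The helper names `number_from_flips` and `random_flips` are hypothetical, and the pseudo-random generator only stands in for a fair coin. With $n$ flips there are $2^n$ equally likely outcomes, so an exact match has probability $2^{-n}$, which is positive for every finite $n$ and shrinks toward zero as $n$ grows.

```python
import random
from fractions import Fraction

def number_from_flips(flips):
    """Turn a finite list of coin flips (1 = heads, 0 = tails) into the binary fraction 0.b1 b2 b3 ..."""
    value = Fraction(0)
    for i, bit in enumerate(flips, start=1):
        value += Fraction(bit, 2 ** i)  # the i-th flip contributes bit / 2**i
    return value

def random_flips(n, rng):
    """Simulate n coin flips with a pseudo-random generator (an assumption, not a true coin)."""
    return [rng.getrandbits(1) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (4, 8, 16, 32):
    target = number_from_flips(random_flips(n, rng))
    # An independent guess built from n flips equals the target with probability 2**-n.
    print(f"n = {n:2d}  target = {float(target):.10f}  P(exact match) = {Fraction(1, 2 ** n)}")
```

Exact rational arithmetic via `Fraction` is used so that the truncated expansion is represented without rounding.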

December 5, 2011 at 11:47 pm

Ladislav Mecir

You are stating that zero-probability events are possible. Here is a counterproposal. I know that if you write down a real number between 0 and 1, there is a probability of 1 that it will be irrational. So that is an event that is almost sure! If you are an expert at picking numbers at random, did you ever succeed in obtaining such a result?
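The claim that the number is irrational with probability 1 follows from the same strip-painting idea as in the post, applied to countably many points at once. As a sketch, write $x$ for the randomly chosen number, fix any positive $\epsilon$, list the rationals in $[0,1]$ as $q_1, q_2, q_3, \dots$ (possible because they are countable), and cover $q_k$ with a strip of length $\epsilon / 2^k$:

$$
P(x \in \mathbb{Q}) \le \sum_{k=1}^{\infty} \frac{\epsilon}{2^k} = \epsilon \quad \text{for every } \epsilon > 0, \qquad \text{so} \quad P(x \in \mathbb{Q}) = 0 \quad \text{and} \quad P(x \notin \mathbb{Q}) = 1.
$$

The rationals are therefore another non-empty set of measure zero, this time an infinite one.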

January 14, 2012 at 11:43 pm

Anonymous

Doesn’t the existence of non-measurable real sets (assuming the axiom of choice) also imply that there’s no such thing as a ‘random’ real number?

January 25, 2012 at 5:57 pm

Allan F. Randall

I am not convinced. Proofs like this seem to me inherently fishy, and not just because we can’t do it “in real life”. After all, if your probability theory results in “zero probability” not implying “impossible”, then so much the worse for your probability theory. What else could “zero probability” mean? It flies in the face of any reasonable sense of the word “probability”.

The real answer, I think, is that your problem is ill-framed, since you haven’t clearly defined a “picking procedure” when you talk about picking a random real. Any clearly-specified picking procedure will, I suspect, reduce your proof about zero probability into a proof about probability going to zero in the limit, which is something I can actually make sense of. j2kun’s picking procedure is, for instance, well-defined, but only produces a real number in the limit.

March 28, 2012 at 7:03 pm

Jonathan Williford

Actually, Alan’s claims are correct. Wikipedia has a pretty good article about it: http://en.wikipedia.org/wiki/Almost_surely. It mentions the coin flip example that j2kun points out, and it also gives a “throwing dart” example which doesn’t depend on limits.

June 11, 2012 at 6:46 pm

Allan F. Randall

The dart example only avoids dependence on limits by replacing them with hand-waving. As in the other examples, no precise “picking procedure”, of any sort, is defined. The whole argument is based on the dubious idea that a dart actually has an infinite number of points to choose from, and that it picks one of these out “at random”. What does that mean? I can imagine what it might mean… but only by appealing to limits, in which case, there is no notion to be found here of “zero probability without impossibility”. Perhaps it can be explained in a rational and understandable way, without using limits, but the Wikipedia article certainly makes no attempt to do so.

June 11, 2012 at 6:58 pm

j2kun

I agree that the dart-throwing example is not well-defined, since it depends on our ability to measure where the dart lands, which is also inherently finite. Already the article begins with the mathematical abstraction of a “unit square” (which also doesn’t exist in the real world) and says that the dart hits at exactly one point (points don’t exist in the real world either, and the head of the dart has positive surface area).

I think a big problem here is that probability does not provide anything to talk about the randomness of a particular event, such as the dart hitting a specific location.

Going back to the post’s topic, I think it’s reasonable to give a large finite upper bound on the amount of time one has to pick a number. In this case, there are only finitely many real numbers one can write down, and hence a positive probability that someone will guess it.

June 14, 2012 at 7:51 am

Allan F. Randall

j2kun: “I think a big problem here is that probability does not provide anything to talk about the randomness of a particular event…”

Should we even expect it to? I think the problem is in thinking that a truly ‘particular’ and ‘individual’ event can even have a probability all on its own, in the first place. Giving a particular event a probability surely only makes sense in the context of a larger ensemble of possible events.

June 11, 2012 at 10:50 pm

Alan Guo

The point is that probability theory based on measure theory does not depend on any “picking procedure”. The dependence on a picking procedure is a flaw of past attempts to create a theory of probability. You get into philosophical issues about the interpretation of probabilities, even for finite sets. What does it mean for a coin to land heads up with probability 1/2? Does it mean that 1/2 of the time, it lands heads up? 1/2 of how many times? What does it mean to uniformly pick a number between 1 and n? How do you generate this random number? The point is that a probability distribution is simply an assignment of nonnegative numbers to a set which sum to (or integrate to, if the set isn’t discrete) 1. You don’t talk about picking procedures because as far as we know, there could be no such thing as a truly random procedure in the physical world (and do you really want to axiomatize mathematics based on the physical world?)

June 14, 2012 at 7:39 am

Allan F. Randall

Okay, I’ll bite:

“What does it mean for a coin to land heads up with probability 1/2?”

That, since there are TWO possible outcomes, of which ONE fits the category of “heads”, the probability of heads is 1/2. Proviso: the notion of “possibility” must be precisely defined for this scenario with a mathematical model (not hand-waving) – what I have called a “picking procedure”… e.g., a computer program that produces all the possible outcomes, and another program that examines these results and classifies them according to whatever categories we have established (in this case, head-ness and tail-ness).

“What does it mean to uniformly pick a number between 1 and n?”

It means that our model of the picking action is a computer program that produces the numbers 1 to n. Proviso: all subsequent actions, if any, by our program must keep the n different computation streams separate from each other (they must not share data once they have “chosen” different results).

“How do you generate this random number?”

If by ‘you’, you mean an actual person, then the above model needs to also model this person, call him ‘Alan’. Then you need another program that examines the n results and categorizes them according to Alan-ness.

“You don’t talk about picking procedures because as far as we know, there could be no such thing as a truly random procedure in the physical world”

Quantum mechanics at least appears to say otherwise.

“You get into philosophical issues about the interpretation of probabilities…”

Are you suggesting that an (at least apparent) contradiction like “non-impossible zero probabilities” does NOT get us into philosophical problems?

June 14, 2012 at 8:49 pm

Alan Guo

“It means that our model of the picking action is a computer program that produces the numbers 1 to n. Proviso: all subsequent actions, if any, by our program must keep the n different computation streams separate from each other (they must not share data once they have “chosen” different results).”

This doesn’t at all produce a uniform distribution. What I was raising as an issue is the fact that even your computer program doesn’t have a way to sample from the set {1,…,n} UNIFORMLY. It’s not at all clear that you can produce a truly uniform distribution with a computer (if you think so, write a program that generates a truly uniform number between 1 and n). This is why probability distributions are formally not specified by procedures but by a measure function on the probability space.

The fact of the matter is that when sampling from a continuous distribution, as opposed to a discrete distribution, the probability of your sample being equal to a specific number is zero, but the probability of it falling within an interval is positive. This is not a fact about the physical world or anything like that — it just follows from how we have chosen to axiomatize probability.

You may object to the fact that “sampling from a continuous distribution” is not well-defined, because you can’t do it with a computer in finite time. But the whole point of the more general notion of “sampling” is that it’s not limited by computer procedures. In the same way, we know that real numbers exist even though almost all of them cannot be written down by a computer procedure. And how much is “almost all”? It’s 100% of them, even though it’s not all of them. In other words, THERE EXIST real numbers that can be written down by computers (for example, the integers) but these numbers constitute 0% of the set of real numbers (and this is precisely the whole point of the post — a set having measure zero is not necessarily the empty set).
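The challenge earlier in this comment, to write a program that generates a truly uniform number between 1 and n, shows where the difficulty sits. The sketch below uses rejection sampling on blocks of random bits; the function name `uniform_1_to_n` is hypothetical. Its output is uniform only under the assumption that the underlying bits are fair and independent. Python’s `secrets.randbits` draws them from the operating system’s randomness source, which is not provably uniform, so the program moves the question to the bit source rather than settling it.

```python
import secrets

def uniform_1_to_n(n, random_bits=secrets.randbits):
    """Return an integer in {1, ..., n} by rejection sampling on blocks of random bits.

    Uniform only if random_bits(k) yields k fair, independent bits (an assumption).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n == 1:
        return 1                     # nothing to choose, and no bits are needed
    k = (n - 1).bit_length()         # smallest k with 2**k >= n
    while True:
        candidate = random_bits(k)   # integer in [0, 2**k) built from k bits
        if candidate < n:            # keep it only if it falls in [0, n)
            return candidate + 1     # shift to the range 1..n

print([uniform_1_to_n(6) for _ in range(10)])  # ten simulated die rolls
```

Rejecting out-of-range candidates, instead of reducing them modulo n, avoids favouring the smaller values when n is not a power of two.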

October 23, 2012 at 5:46 pm

IsaacToast

Thanks for the good article! I had a hard time understanding your argument until I re-read “ANY” in “if it’s smaller than [ANY] positive number, then it has to be zero”. Laymen like me would have an easier time if you kindly emphasized the phrase “smaller than ANY” :)

October 23, 2012 at 5:49 pm

Alan Guo

Thanks for the comment! I’ll change the article to emphasize “ANY”.

July 6, 2013 at 9:01 am

Stephen Tashiro

The rigorous mathematical answer to whether events with probability zero are impossible (or not) is that the mathematical theory of probability does not comment on this question because it does not define the concept of a “possible” or “impossible event” in the physical sense. It doesn’t even comment on whether it is actually possible to take a random sample.

Elementary non-rigorous books on probability speak in terms of making observations and taking random samples, but the rigorous theory of probability only defines distributions for such terminology. It does not assert or assume the possibility of taking a random sample.

The question of whether a zero probability event is possible is an interesting philosophical debate, but it is not addressed by the measure theoretic approach to probability. Attempting to explain the relationship between zero probability events and impossible events using measure theory is merely proposing one intuitive way to interpret measure theory, not what measure theory actually asserts.