The Grand Old Stumper
Reader Bryan Davies quoted a poser from Math Overflow
(have we heard of that?) which read:
In a country in which people only want boys every family continues to have children until they have a boy. If they have a girl, they have another child. If they have a boy, they stop. What is the proportion of boys to girls in the country?
Don’t read my answer nor anybody else’s until you’ve had a go.
The question is ill-posed as stated because it relies on several unstated assumptions. One set, and the one I’ll use, is that the country begins with no kiddies and with n couples, all the same age, who never die during their reproductive years and can reproduce at will and do, always on New Year’s Day, a day of celebration. Further, no babies are killed or die before they are born and none are killed or die once they escape into the wild. Babies are born once per year for every couple until success (a boy!). And no immigration nor emigration.
One last assumption is no genetic engineering or other meddling: forget the kind of things that happen in China and India. Why complicate things? Now, something causes each child to be a boy or girl, but we do not know what this something is in each case. Probability is a measure of information, not of biology. Therefore, given there are only two concrete choices, we deduce from our assumptions the probability (which I repeat measures our uncertainty, not the biology) is 1/2 for boys, same for girls.
So, Year 0, there are no boys, no girls and no ratio neither.
Year 1, the uncertainty in the number of boys will (given our assumptions) follow a binomial, characterized with p = 1/2 and n chances. Pr(0 boys | assumptions) = (1/2)^n, Pr(1 boy | assumptions) = n * (1/2)^n, Pr(2 boys | assumptions) = (n choose 2) * (1/2)^n and so forth. The “(1/2)^n” is always there because of a nifty quirk of the binomial with p = 1/2: the usual p^k * (1 – p)^(n – k) collapses to (1/2)^n whatever k is.
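For those who like to check such things by machine, here is a minimal sketch of the Year 1 probabilities. It assumes exactly what is assumed above (n couples, p = 1/2); the function name boys_pmf is mine, made up for illustration, and exact fractions are used so nothing gets lost to rounding.

```python
from fractions import Fraction
from math import comb

def boys_pmf(n):
    """Pr(k boys | n couples, p = 1/2) for k = 0..n, as exact fractions."""
    half_n = Fraction(1, 2) ** n          # the ever-present (1/2)^n
    return [comb(n, k) * half_n for k in range(n + 1)]

pmf = boys_pmf(8)
for k, pr in enumerate(pmf):
    print(f"Pr({k} boys) = {pr}")
```

With n = 8 the list starts at 1/256 for no boys and peaks at k = 4, and the probabilities sum to 1, as they must.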
The proportion of boys to girls follows right from this. If there are 0 boys, the proportion is 0/n, because there must be, given our assumptions, n girls. The probability of seeing this proportion is the same as seeing 0 boys. And so on for all the other proportions, 1/(n-1), 2/(n-2), etc., except if all boys are born then the proportion is infinite (well, n/0 anyway). Lastly, the most likely occurrence is (if n is even, or within rounding if not) n/2 boys, giving a proportion of 1/1. This follows because p = 1/2.
Visually (with n = 8), we might have seen this:
That is, b1 = 3 boys and n – b1 = 5 girls, and thus a ratio of 3/5.
Year 2. Those couples who have had a boy exit the competition, the remainder have another go. The uncertainty in the number of boys in this new crop will again be characterized with a binomial with the same p but with n – b1 chances. Again, the most likely outcome, to our knowledge, is (n – b1)/2 new boys. That’s again because p = 1/2.
This year might have given, say, a b2 = 3, thus
The ratio counts Year one’s b1 boys and n – b1 girls plus this year’s crop, for a total of b1 + b2 = 6 boys and (n – b1) + (n – b1 – b2) = 7 girls. The ratio is 6/7.
Year 3 is a repeat, our uncertainty another binomial but with (n – b1 – b2) = 2 chances. The most likely number of boys is 1. Suppose we see two boys:
The total boys is 8, the total girls 7, for a ratio of 8/7.
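The bookkeeping of the three years can be sketched in a few lines. This is only an illustration under the assumptions already stated (one birth per remaining couple per year, p = 1/2, couples stop at the first boy); the names tally and simulate are hypothetical helpers of mine, and the seed is arbitrary.

```python
import random

def tally(n, boys_per_year):
    """Running (boys, girls) totals given each year's number of new boys."""
    remaining, boys, girls, out = n, 0, 0, []
    for b in boys_per_year:
        boys += b
        girls += remaining - b   # couples still trying who missed had a girl
        remaining -= b           # couples who got a boy exit the competition
        out.append((boys, girls))
    return out

def simulate(n, rng):
    """Draw each year's new boys with p = 1/2 until every couple has one."""
    remaining, draws = n, []
    while remaining > 0:
        b = sum(rng.random() < 0.5 for _ in range(remaining))
        draws.append(b)
        remaining -= b
    return draws

print(tally(8, [3, 3, 2]))       # [(3, 5), (6, 7), (8, 7)]
draws = simulate(8, random.Random(1))
print(draws, tally(8, draws)[-1])
```

The first print replays the example above (3/5, then 6/7, then 8/7); the second shows one random history, which will generally end on a different ratio.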
If you are mathematically inclined, you will notice this ratio is not 1/1, which (I’m guessing) is the answer the examiner wants. It’s close, though.
The reason it is close is that each year the most likely occurrence, to within rounding, is half boys, half girls from the couples who are still going at it. Adding all those halves up, as it were, gives half-and-half boys and girls as the most likely final outcome. But this isn’t necessarily the outcome.
We could figure the probability of seeing 8 total boys and 7 total girls easily enough, if tediously. It involves calculating the probability of seeing 3 boys and 5 girls in Year 1 and 3 boys and 2 girls in Year 2 and 2 boys and 0 girls in Year 3. But then we’d have to figure the other ways (if any exist) to get 8b/7g in three years. Conceptually simple, because each combination follows a binomial with known parameters, but, as claimed, tedious to run through.
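Tedium is what computers are for. Below is a sketch, under the same assumptions (n = 8, p = 1/2 each year), that multiplies the per-year binomials along one history and then hunts for every history ending at 8 boys and 7 girls after three years. The names path_prob and paths are hypothetical, invented here for illustration.

```python
from fractions import Fraction
from math import comb

def path_prob(n, boys_per_year):
    """Probability of one exact history of new boys per year (p = 1/2)."""
    remaining, pr = n, Fraction(1)
    for b in boys_per_year:
        pr *= Fraction(comb(remaining, b), 2 ** remaining)
        remaining -= b
    return pr

def paths(n, years, boys, girls):
    """Every history over `years` years that ends at the given totals."""
    def walk(remaining, year, b_tot, g_tot, so_far):
        if year == years:
            if (b_tot, g_tot) == (boys, girls):
                yield tuple(so_far)
            return
        for b in range(remaining + 1):
            yield from walk(remaining - b, year + 1,
                            b_tot + b, g_tot + remaining - b, so_far + [b])
    return list(walk(n, 0, 0, 0, []))

print(path_prob(8, (3, 3, 2)))            # 35/2048
ways = paths(8, 3, 8, 7)
print(ways)
print(sum(path_prob(8, w) for w in ways))
```

The history in the example, (3, 3, 2), has probability 35/2048. The search turns up other histories too; note that one of them, (1, 7, 0), is really finished after two years, so whether it counts as "in three years" is a matter of how you read the question.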
Now it could have been, for whatever sized n, that Year 1 saw all boys. We know the probability of this is (1/2)^n, which is always greater than 0 for any finite n (which it always will be). Meaning, our knowledge does not preclude an infinite ratio: it could happen, especially with small n.
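To get a feel for how fast that chance shrinks, here is a tiny sketch. It assumes only p = 1/2 and n independent couples, as above; the function name pr_all_boys and the particular values of n are mine, picked for illustration.

```python
from fractions import Fraction

def pr_all_boys(n):
    """Pr(every one of n couples has a boy in Year 1 | p = 1/2)."""
    return Fraction(1, 2) ** n

for n in (1, 2, 8, 20):
    pr = pr_all_boys(n)
    print(n, pr, float(pr))
```

For n = 8 this gives 1/256: small, but never zero.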
For large n, we follow the same pattern as above. But eventually two things happen: the kiddies become adults and pair off and begin producing their own children, and eventually the thus created grandpas and grandmas cease their efforts. How many child bearing years does a woman have, after all, assuming she’s pushing out a kid a year?
Ceasing to produce is easy to account for, but figuring the number of new couples is hard, because that depends on (you guessed it) the proportion of extant boys-now-men to girls-now-women. If the proportion of boys and girls is not 1/1, and while this is the most likely it is not certain, then some boys or girls will go marriageless. You then have to decide whether they’re going to remain that way (easy), or whether the strays can marry the strays who will probably come along the next year (hard, because how long will they remain fecund?).
Now all this is discrete and tedious, but if one has the energy and time it could all be ploughed through. We also have the sense, because of the symmetries and clear assumptions, that the “system” will reach a limit where the proportion of boys to girls is roughly equal.
It may be equal in any year, but it’s more likely, we guess, to be only “near” equal, where we can work out what it means to be near.