Did Elvis Have an Identical Twin? Probably

Two Elvises

This article is based on an example from the second edition of Think Bayes, forthcoming from O’Reilly Media.

Elvis Presley had a twin brother who died at birth. It’s unknown if they were identical or fraternal twins, but we can use Bayes’s Rule and data from the U.S. Census Bureau to figure the odds.

Here’s how:

  1. First, we need some background information about the relative frequencies of identical and fraternal twins.

  2. Then we’ll use Bayes’s Rule to take into account one piece of data, which is that Elvis’s twin was male.

  3. Then we’ll take into account a second piece of data, which is that Elvis’s twin died at birth.

Step 1: Get the Data

For background information, I’ll use data from 1935, the year Elvis was born. The source is the U.S. Census Bureau report Birth, Stillbirth, and Infant Mortality Statistics for the Continental United States, the Territory of Hawaii, the Virgin Islands 1935.

It includes this table, which shows the total number of plural births in the United States.

The table doesn’t report which twins are identical or fraternal, but we can use the data to estimate it.

With the numbers in the table, we can compute the fraction of twins that are opposite sex, which I’ll call x.

opposite = 8397          # opposite-sex twins, from the table
same = 8678 + 8122       # same-sex twins: sum of the table's two same-sex counts

x = opposite / (opposite + same)

But the quantity we want is the fraction of twins who are fraternal, which I’ll call p_f. Let’s see how we can get from x to p_f.

Because identical twins have the same genes, they are almost always the same sex. Fraternal twins do not have the same genes; like other siblings, they are about equally likely to be the same or opposite sex.

So we can write the relationship:

x = p_f / 2 + 0

which says that the opposite-sex twins include half of the fraternal twins and none of the identical twins.

And that implies

p_f = 2 * x

We can also compute the fraction of twins that are identical, p_i:

p_i = 1 - p_f
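The steps so far can be collected into one short script. This is a minimal sketch that reuses the counts quoted from the table; the printout formatting is my own choice, not part of the original analysis.

```python
# Counts quoted from the 1935 table
opposite = 8397          # opposite-sex twins
same = 8678 + 8122       # same-sex twins

# Fraction of twins that are opposite sex
x = opposite / (opposite + same)

# Opposite-sex twins are half of the fraternal twins and none of the
# identical twins, so x = p_f / 2
p_f = 2 * x
p_i = 1 - p_f

print(f"x   = {x:.3f}")
print(f"p_f = {p_f:.3f}")
print(f"p_i = {p_i:.3f}")
```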

In 1935 about 2/3 of twins were fraternal and 1/3 were identical.
So if we know nothing else about Elvis, the probability is about 1/3 that he was an identical twin.

But we have two pieces of information that affect our estimate of this probability:

  • Elvis’s twin was male, which is more likely if they were identical.

  • Elvis’s twin died at birth, which is also more likely if they were identical.

Step 2: Apply Bayes’s Rule

To take this information into account, we will use Bayes’s Rule:

odds(H|D) = odds(H) * likelihood_ratio(D)

That is, the posterior odds of the hypothesis H, after seeing data D, are the product of the prior odds of H and the likelihood ratio of D.
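Because this form of the rule works with odds rather than probabilities, it helps to be able to convert between the two. The helper names below (`odds`, `prob`, `bayes_update`) are hypothetical, chosen for illustration, and the example numbers are made up; this is a sketch, not code from the book.

```python
def odds(p):
    """Convert a probability to odds in favor."""
    return p / (1 - p)


def prob(o):
    """Convert odds in favor to a probability."""
    return o / (o + 1)


def bayes_update(prior_odds, likelihood_ratio):
    """Bayes's Rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Made-up example: a prior probability of 0.25 (odds of 1:3) and data that
# is 3 times more likely under H than under the alternative.
posterior = bayes_update(odds(0.25), 3)
print(posterior, prob(posterior))
```

In this made-up case the prior odds of 1:3 and a likelihood ratio of 3 cancel, leaving even odds.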

We can use p_i and p_f to compute the prior odds that Elvis was an identical twin.

prior_odds = p_i / p_f

The prior odds are about 0.5:1.

Now let’s compute the likelihood ratio of D, where D is the observation that Elvis’s twin was male, that is, the same sex as Elvis. The probability that twins are the same sex is nearly 100% if they are identical and about 50% if they are fraternal. So the likelihood ratio is 100 / 50 = 2.

likelihood_ratio = 2

Now we can apply Bayes’s Rule:

posterior_odds = prior_odds * likelihood_ratio

The posterior odds are close to 1, or, in terms of probabilities:

posterior_prob = posterior_odds / (posterior_odds + 1)

Taking into account that Elvis’s twin was male, the probability is close to 50% that they were identical twins.
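One way to check this update is to compute the same posterior with the more familiar probability form of Bayes's theorem and confirm that the two agree. The sketch below assumes the idealized likelihoods from the text (1.0 for identical twins, 0.5 for fraternal twins); the names `like_i`, `like_f`, and `posterior_check` are mine.

```python
# Prior from Step 1 (counts quoted from the 1935 table)
opposite = 8397
same = 8678 + 8122
p_f = 2 * opposite / (opposite + same)
p_i = 1 - p_f

# Likelihood of "twin is the same sex" under each hypothesis (idealized)
like_i = 1.0   # identical twins: treated as always the same sex
like_f = 0.5   # fraternal twins: same sex about half the time

# Odds form of Bayes's Rule
posterior_odds = (p_i / p_f) * (like_i / like_f)
posterior_prob = posterior_odds / (posterior_odds + 1)

# Probability form: P(H|D) = P(H) * P(D|H) / P(D)
total = p_i * like_i + p_f * like_f
posterior_check = p_i * like_i / total

print(f"odds form:        {posterior_prob:.4f}")
print(f"probability form: {posterior_check:.4f}")
```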

Step 3: More Data, More Bayes’s Rule

Now let’s take into account the second piece of data: Elvis’s twin died at birth.

It seems likely that there are different risks for fraternal and identical twins, so I’ll define:

  • r_f: The probability that one twin is stillborn, given that they are fraternal.

  • r_i: The probability that one twin is stillborn, given that they are identical.

We can’t get those quantities directly from the table, but we can compute:

  • y: the probability of “1 living”, given that the twins are opposite sex.

  • z: the probability of “1 living”, given that the twins are the same sex.

y = (258 + 299) / opposite
z = (655 + 564) / same

Assuming that all opposite-sex twins are fraternal, we can infer that the risk for fraternal twins is y:

r_f = y

To compute r_i, we can write the following relation:

z = q_i * r_i + q_f * r_f

which says that the risk for same-sex twins is the weighted sum of the risks for identical and fraternal twins, with weights

  • q_i, the fraction of same-sex twins who are identical, and

  • q_f, the fraction of same-sex twins who are fraternal.

The previous update conditioned on the twins being the same sex, so q_i is the posterior probability we computed there; q_f is its complement.

q_i = posterior_prob
q_f = 1 - posterior_prob

Solving for r_i, we get

r_i = (z - q_f * r_f) / q_i

Now we can compute the likelihood ratio:

likelihood_ratio2 = r_i / r_f

In this dataset, the probability that one twin dies at birth is about 19% higher if the twins are identical.
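The calculation in this step can be sketched end to end as follows. It repeats the first update so that the block runs on its own, and then substitutes r_i back into the weighted sum to confirm that it reproduces z. The variable `z_check` is my addition.

```python
# Counts quoted from the 1935 table
opposite = 8397
same = 8678 + 8122

# Posterior from the first update (Step 2)
p_f = 2 * opposite / (opposite + same)
p_i = 1 - p_f
posterior_odds = (p_i / p_f) * 2
q_i = posterior_odds / (posterior_odds + 1)
q_f = 1 - q_i

# Fraction of "1 living" outcomes among opposite-sex and same-sex twins
y = (258 + 299) / opposite
z = (655 + 564) / same

# Risk for fraternal twins, then solve z = q_i * r_i + q_f * r_f for r_i
r_f = y
r_i = (z - q_f * r_f) / q_i
likelihood_ratio2 = r_i / r_f

# Sanity check: the weighted sum should reproduce z
z_check = q_i * r_i + q_f * r_f

print(f"r_f = {r_f:.4f}, r_i = {r_i:.4f}")
print(f"likelihood ratio = {likelihood_ratio2:.2f}")
```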

Finally, we can apply Bayes’s Rule again to compute the posterior odds after both updates:

posterior_odds2 = posterior_odds * likelihood_ratio2

Or, if you prefer probabilities:

posterior_prob2 = posterior_odds2 / (posterior_odds2 + 1)

Taking into account both pieces of data, the posterior probability that Elvis was an identical twin is about 54%.
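In odds form, the two updates amount to multiplying the prior odds by each likelihood ratio in turn. The sketch below uses the rounded values quoted in the text (prior odds of about 0.5, likelihood ratios of 2 and about 1.19), so it reproduces the result only approximately.

```python
from math import prod

prior_odds = 0.5                 # about 1/3 to 2/3, from Step 1 (rounded)
likelihood_ratios = [2, 1.19]    # twin was male; twin died at birth (rounded)

# Posterior odds are the prior odds times the product of the likelihood ratios
posterior_odds2 = prior_odds * prod(likelihood_ratios)
posterior_prob2 = posterior_odds2 / (posterior_odds2 + 1)

print(f"posterior odds:        {posterior_odds2:.2f}")
print(f"posterior probability: {posterior_prob2:.2f}")
```

`math.prod` requires Python 3.8 or later.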

More Reading

This example is from the second edition of Think Bayes, forthcoming from O’Reilly Media. The first four chapters are available now as an early release.

The code in this example is in a Jupyter notebook you can run on Colab.

I learned about this problem from Bayesian Data Analysis.
Their solution takes into account that Elvis’s twin was male, but not the additional evidence that his twin died at birth.

Jonah Spicher, who took my Bayesian Statistics class at Olin College, came up with the idea to use data from 1935 to compute the likelihood of the data.