Age Difference Between Leading Actors and Actresses

Much has been made of the age gap between leading men and women.  Male romantic leads are often cast opposite much younger females.  And it is often difficult for actresses to find roles after age 40. While I suspect this gender gap has improved in the past 50 years, I decided to see if it is still alive and well.  I built a spider to scrape age data for the top 5,000 actors and actresses, as ranked by IMDb "starmeter."

For the 500 most popular actors and actresses on IMDb, the average actor is age 40.77 and the average actress is age 33.39.  Expanding to the top 5,000 actors and actresses on IMDb, the gap narrows.  Here the average actor is 44.74 and the average actress is 39.47.

Not surprisingly, star popularity is correlated with age.  The top 50 actors are the youngest, and show the largest age gap between men and women.  Across the top 5,000 actors, the average age is higher and the age gap is smaller.  This chart shows how the age gap narrows as popularity decreases.

Another interesting phenomenon is that 60% of the top 500 most popular stars on IMDb are female.  This trend holds true for the top 100 most popular stars, as well as for the top  50.  Put another way, only 40% of the 500 most popular stars on IMDb are male.

However, when the sample size is stretched to include the 5,000 most popular stars, women equal men almost exactly (50.59% - 49.41%).

So why are women more likely to have high starmeter ratings?  Amazon's starmeter algorithm is a measure of what people are searching for.  A glance at the IMDb message boards suggests IMDb's userbase is disproportionately male.  So it could very well be that men search for their favorite actresses at a higher rate than women search for their favorite actors.  Thus, actresses may have a slight advantage in obtaining top "starmeter" rankings on IMDb.

"95% of Income Gains to the Top 1%" is a Misleading Statistic

"Obama admits 95% of income gains go to top 1%."

- CNN, September 15, 2013.

Chances are you've seen articles like this pop up in your Facebook news feed, along with angry commenters calling for revolution and storming the Bastille. The statistic conjures images of Mark Zuckerberg and Oprah laughing maniacally as they mug the poor. But take hope...

The 95% statistic is wildly misleading.

And if you put down your pitchforks and torches for a minute, I'll explain the major fallacies behind this statistic.

First, Some Background...

"During the 1980s, the wealthiest 1 percent of Americans got 70 percent of the income gains." - Bill Clinton and Al Gore, Putting People First, 1992.

This statistic helped Clinton beat Bush Sr. in the 1992 election (see page 77 of Alan Reynolds's "Income and Wealth" for a thorough smackdown of this erroneous statistic). Awkwardly for Clinton, those same one-percenters made 45% of income gains throughout Clinton's tenure [1].

Bush Jr.'s presidency saw income gains continue to accrue to the nation's richest. Dubya raised this 1% statistic in the run-up to the Obama-McCain presidential election [2].  In fact, this handy 1% statistic comes up in every election cycle. It even birthed the ninety-nine percent political movement in 2011.

So who are these pesky one-percenters?

The Non-Enduring Class Fallacy

The term "one-percenter" is a non-enduring class fallacy. There is no static class of individuals earning top 1% income gains year-over-year.

In fact, the smaller the percentage we choose, the larger the inaccuracy. Even if we make statements about 100% of Americans, we are not talking about the same individuals each year. People are born, die, or move away. Marilyn vos Savant, who holds a Guinness record for the highest recorded I.Q., made this point in her 1996 book "The Power of Logical Thinking."

Berkeley produced the 1% study that currently has Obama in such hot water. But when the IRS and CBO present Berkeley with their raw income data, Berkeley does not get individual names of income gainers. There is no way to track who is in the 1% year-over-year.

So who are those 3.13 million people in this year's top 1%? Are they all palm-rubbing Goldman Sachs partners in $3,000 suits?

Probably not.

And it's not all athletes, artists, and entrepreneurs either.

Imagine a man selling the family farm to pay medical bills. Or an exonerated convict winning a legal judgement. Or a struggling screenwriter selling a script after a decade of waiting tables. The point is we don't know. This year they may be one-percenters. Next year they may be ninety-nine percenters.

So who's making all the money?

Only 63% of Americans are in the workforce. So roughly half the population makes all the money.

Also consider that people at the peak of their careers, age 54 - 64, have the highest income. Hey wait, 12% of the population is making most of the money! That's unfair! Oh wait, no. This makes total sense.

But let's get to the bigger fallacies...

Decile Analysis is Wildly Misleading

The Berkeley study cites decile analysis, which economists use to study income gains. The problem is, decile analysis can be used to say pretty much anything.

To understand why decile analysis is so clunky, consider the ten fictional households of Stokesville:

 

Let's say the household earner in Decile 9 launches her singing career. She signs a $200,000 recording contract! This poor Decile just made a lot of money, right?

Wrong.

The richest decile did...

 

...Because our singer jumped to the richest decile.

So the highest decile made a 100% income gain. And the poorest decile made no income gain whatsoever.

And now a journalist can claim the highest income decile made a 100% income gain at the expense of the poor.

Darn those wealthy people for making all the money!

 

But this is wildly misleading!

Yes. And this is the methodology of the Berkeley study and all other income distribution studies.
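The singer scenario can be sketched in a few lines of Python. The incomes below are hypothetical values chosen for illustration (they are not the original Stokesville table), and the assumption is that the singer's household simply earns $200,000 after the contract. The sketch compares gains measured by decile slot (richest before versus richest after, whoever that is) with gains measured by following each household.

```python
# Hypothetical incomes for ten households, one per decile (assumed values,
# not the original Stokesville table): household_1 earns 100,000 and
# household_10 earns 10,000.
before = {f"household_{i}": 110_000 - 10_000 * i for i in range(1, 11)}

after = dict(before)
after["household_9"] = 200_000  # the singer's new income (assumed)


def gains_by_decile_slot(before, after):
    """Compare the k-th richest household before with the k-th richest
    after, whether or not it is the same household."""
    old = sorted(before.values(), reverse=True)
    new = sorted(after.values(), reverse=True)
    return [n - o for o, n in zip(old, new)]


def gains_by_household(before, after):
    """Follow each household individually."""
    return {name: after[name] - before[name] for name in before}


slot_gains = gains_by_decile_slot(before, after)
household_gains = gains_by_household(before, after)

print("Top decile slot gain:   ", slot_gains[0])
print("Bottom decile slot gain:", slot_gains[-1])
print("Singer's household gain:", household_gains["household_9"])
print("Richest household's own gain:", household_gains["household_1"])
```

Under these assumed numbers, the slot view credits the top decile with a doubling of income, while the household view shows that the only household whose income changed was the singer's.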

And it gets much, much crazier. Consider the following scenario in Stokesville:

Everyone in Stokesville receives a 100% raise. Plus ten new jobs are created for the bottom five deciles!

Everyone wins, right?

Wrong again. According to decile analysis, the top half of Stokesville received 100% of the income gains while the bottom half received zero percent!

And despite getting equal raises, the top decile's income grew 95%.  More than any other decile.  Gains always accrue to the top decile.

But there are more fallacies to Berkeley's one percent study...

The Biased Sample Fallacy

Why does the Berkeley study only choose the time period of 2009 - 2012? According to their own numbers, the top 1% took 75% of the income losses during the recession of 2007 - 2009. So what are the cumulative numbers from 2007 - 2012? Did the wealthiest 1% only earn 20% of the income gains over that full period? Suddenly this news headline is a lot less sexy.

The wealthiest suffer more when the stock market crashes (2007 - 2009) and gain more when the stock market rises (2009 - 2012). The Dow Jones rose nearly 60% from 2009 - 2012 (see chart). Berkeley's choice to only report the income gains of one-percenters during a massive stock market run seems like a biased sample.

And now we get to the main point...

The Median-Mode Fallacy

Consider the following problem:

1) 9 people in Stokesville are 5 feet tall
2) 1 person in Stokesville is 6 feet tall

Therefore, the "average" height in Stokesville is 5 foot 1.

So 90% of the population is below average?

Now imagine a person moves to Stokesville who is 1 million feet tall. Suddenly, everyone else is roughly 90,000 feet below average. This is what happens when you introduce a billionaire into an economy…
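A minimal Python sketch of the height example, using only the standard library. It compares the mean (the "average" above) with the median, which barely notices the outlier. The variable names are illustrative.

```python
from statistics import mean, median

# Nine people at 5 feet and one at 6 feet, as in the example above.
heights_ft = [5] * 9 + [6]
print("mean:", mean(heights_ft), "median:", median(heights_ft))

# Add one resident who is 1 million feet tall.
with_giant = heights_ft + [1_000_000]
print("mean:", round(mean(with_giant)), "median:", median(with_giant))
```

With eleven residents the mean lands near 90,900 feet, while the median stays at 5 feet. That gap between mean and median is the signature of a single extreme outlier.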

The Billionaire Dilemma

Imagine Stokesville has a total population of 1,000 millionaires. Plus one Warren Buffett (net worth ~ $60 Billion).

The stock market rises 10%. The 1,000 millionaires made $100,000,000! A good year!

But Warren Buffett made $6 Billion. So 98% of the income gains went to the top 0.1% (one resident out of 1,001).
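The arithmetic behind that figure can be checked with a short Python sketch. It assumes each millionaire holds exactly $1 million and that every resident's net worth rises by the same 10%; both are simplifications for illustration.

```python
millionaires = 1_000
millionaire_worth = 1_000_000      # assumed: exactly $1M each
buffett_worth = 60_000_000_000     # ~$60 billion, as stated above
market_return = 0.10               # everyone's net worth rises 10%

millionaire_gains = millionaires * millionaire_worth * market_return
buffett_gain = buffett_worth * market_return

# Share of all gains that went to the single richest resident.
buffett_share = buffett_gain / (buffett_gain + millionaire_gains)

# Share of the population that the single richest resident represents.
population_share = 1 / (millionaires + 1)

print(f"Millionaires' combined gain: ${millionaire_gains:,.0f}")
print(f"Buffett's gain:              ${buffett_gain:,.0f}")
print(f"Buffett's share of gains:    {buffett_share:.1%}")
print(f"Buffett's share of people:   {population_share:.2%}")
```

Every resident's growth rate is identical here, yet the share-of-gains figure still looks lopsided, because it tracks the size of each starting balance and not how anyone's fortunes changed.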

Note the zero-sum fallacy. Everyone in Stokesville is wealthy. Everyone's net worth increased 10%. But a politician can argue that Stokesville is economically unhealthy because the uber-rich are taking 98% of the pie. Gains in the wealthy do not equal suffering in the poor.

Billions and billions…

Adding billionaire outliers to an economy kills income gain analysis. And we are fortunate to live in a country with 442 billionaires and counting. In 2007, before the financial crisis, America boasted 16,600,000 millionaires. That's 5.3% of the U.S. population. In 2007, an American had a one-in-twenty chance of being a millionaire. Even after the financial crisis, America has more millionaires than any other country.

As long as we have great innovators like Elon Musk, Jeff Bezos, Sergey Brin, and Larry Page, then we are going to have billionaires. This is great news. And yes, it will throw off our income gain statistics. It will destroy normal distribution curves and create wonky income studies. But the successes of the wealthy do not necessarily come at the expense of the poor.

Birthday Statistics

Each spring I'm hit by a deluge of birthdays to attend.  The deluge tapers off in July.  This made me curious: are my friends more likely to have spring birthdays?  I did some digging and found the answer: overwhelmingly, yes. First, the control group.  Here are average US birthdays by month (2009 census):

As one would expect, US birthdays average 8.33% per month (as 100% divided by 12 months = 8.33%).  Now, below are my friends' birthdays by month:

Fully 25% of my friends are born in May and June.  And three of my friends share my exact birthday, June 18th.  When you consider the math of the Birthday Problem, this seems unlikely.  What are the odds of four individuals in a set of 167 sharing the exact same birthday?

By way of control group, only two of my other 167 friends share the same birthday with each other.
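One narrow version of that question can be answered with a binomial tail: if birthdays were spread evenly over 365 days and friends were independent of one another (both assumptions, and the first is known to be only approximately true), how likely is it that at least three of 167 friends land on one specific date? This is a rough sketch, not the full Birthday Problem, which asks about any shared date. The function name is illustrative.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), via the complement of P(X < k)."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


# Assumptions: 167 friends, uniform birthdays over 365 days, independence.
friends = 167
p_same_day = 1 / 365

p_three_or_more = prob_at_least(3, friends, p_same_day)
print(f"P(at least 3 friends share one given date) = {p_three_or_more:.4f}")
```

Under these assumptions the probability comes out at roughly one percent. That is unlikely but far from impossible, and it says nothing yet about whether friends' birthdays are truly independent.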

DATA SET

To obtain the data above, I took my total set of Facebook friends and parsed 180 that I feel a genuine connection with (as many Facebook friends are acquaintances).  Of 180 friendships, I was able to scrape birthday data for 176 of them, which produced the chart above.

THE BIG QUESTIONS

Why am I nearly twice as likely to have a friend born in the spring as in the summer?  Why am I nearly three times as likely to have a friend born in June as a friend born in January?

Is this random chance or do other people notice trends among their friends as well?

THE SCIENCE OF BIRTHDAYS

Turns out, science has spotted many birthday correlations, none of which are properly understood.  For instance, children with autism are 16% more likely to be born in winter months [1]. Spring babies are at a 17% higher risk of suicide [2]. A mother's exposure to sunlight (read: vitamin D levels) during gestation may be a significant factor in fetal development. For instance, both MS and schizophrenia are strongly correlated with babies who came to term during winter months, or in northern latitudes with lower levels of sunlight [1][2]. If vitamin D can have such a marked effect on fetal health and development, is it possible that brain and personality may be affected as well? Since photoperiodism can affect the brain chemistry of adults (fully 10% of Alaskans suffer from Seasonal Affective Disorder), can daylight itself be a factor?

Other bizarre birthday statistics:

* US teen mothers are more likely to give birth in January than any other month [3]
* February babies have a higher likelihood of narcolepsy [4]
* Pilots are more likely to be born in March [4]
* People with autumn birthdays have the longest lifespans; spring birthdays have the shortest. A person born in October will outlive a person born in March by an average of 215 days [4]
* June and July babies consistently have the highest likelihood of short-sightedness [4]
* September babies get the best grades and test scores in school [4]

Conclusions

I think astrology is malarkey. But is it possible that birth month affects personality? Is my statistical sample of 167 friends simply too small to be meaningful? It is interesting to me that among my very best friends, spring babies are still over-represented, with a distribution mirroring the chart above. Possibly science will begin to formulate explanations for the statistical correlations between birth date and personality, health, and aptitude.