Not only can you get a good idea of age from a name, you can also generate names that match an age and sex. I have a niece who recently did a science fair project where she used Markov chains seeded with U.S. census data over the last hundred years to create new names. With about 90% accuracy, people could tell the sex of a fake name and whether it was from 100, 50, or <10 years ago.
An interesting side note was that she put in a simple profanity filter, but in all of her trial runs it never picked up any "fuq" or variant names.
Edit: Here are sample boy names:
Shill
Flay
Roshard
Per
Coll
Milius
Madfrego
Derry
Fer
Fordy
Carlel
Marler
Rommyronance
Jord
Felwooke
Rott
Luper
Bent
Zekin
Othen
Nolanterry
Jerarton
Here are some girl names:
Esalessie
Rine
Nolenn
Alynna
Myrtinet
Faybeciline
Aline
Orassabenda
Phina
Dorgia
Lideleaste
Beara
Sonilinn
Judelia
Monangeora
Jarnina
Geleene
Emozellyn
Maudra
Verta
Lortis
Fret
Kathoph
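For anyone curious how a generator like this might work, here is a minimal sketch of a character-level Markov chain. The training names, the order-2 context, and the function names `build_chain` and `generate` are assumptions for illustration; the actual project's code and data are not shown in this thread.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding markers, assumed not to occur in real names

def build_chain(names, order=2):
    """Map each `order`-character context to the characters seen right after it."""
    chain = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            chain[padded[i:i + order]].append(padded[i + order])
    return chain

def generate(chain, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until END is drawn or max_len is hit."""
    context, out = START * order, []
    while len(out) < max_len:
        nxt = rng.choice(chain[context])
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out).capitalize()

# Hypothetical training set; a real run would use names from one era and one sex.
training = ["Mildred", "Dorothy", "Myrtle", "Maude", "Edith", "Ethel", "Martha"]
chain = build_chain(training)
print([generate(chain, rng=random.Random(seed)) for seed in range(5)])
```

Training one chain per decade and sex is what would make the output "sound like" a given era, since each chain only ever sees that era's letter patterns.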
I did specify "fantasy" for a reason. If you're writing fantasy you want your names to sound natural yet unlike anything your reader will have seen before. Hence, I guess, why many names in fantasy novels (I'm thinking A Song of Ice and Fire and Wheel of Time) are pretty much a normal name with one or two letters replaced.
I'll keep your link in mind though - for my own writing which is in a non-fantasy setting.
fakenamegenerator.com does have "Hobbit" as a name set. Also the Norwegian name set does generate some interesting Nordic names like Rosenvinge, Valgard, etc.
My first name is Aubrey, which completely flipped to a girl's name in the US about ten years ago. According to this chart, the fraction of female Aubreys is approaching 12% at birth. When that fad wears off, it will make a nice spike in the curve for many decades. By the way, Aubrey means "elf leader" or "king of the elves".
This is one of the ways cold readers home in on all kinds of things about the person they are reading. It is a very effective way to guess someone's mother's or grandmother's name or sister's name.
If the audience is a group of mostly 30-to-50-year-old women, the reader has a good starting point. It goes something like "Is there a Laura or Lisa here?" There is a high probability there will be one of those. Once a woman acknowledges that her name is Laura, the reader can see her approximate age and guess what her mother's, grandmother's, or grandfather's name is. They use other cues to figure out which dead relative the woman is there to "hear" from and then say something like "Someone with an M or K is coming forward." If the target reacts to one of those letters, the reader guesses "Mar.... Marg... Mary... Margaret... Margaret... Is that your mother?"...
I actually used something similar to this (but not as sophisticated) at a previous startup to generate recommendations of people to invite to the app because the app's target demographic was women ages 20-40.
Baby Name Wizard (linked in the article) is one of the true hidden gems on the internet. It looks like a fluffy website for moms-to-be, but then you start poking around at the graphs and you realize that an hour of your life has disappeared...
I found the age range on "Jennifer" to be particularly interesting.
My sister Jennifer (see http://en.wikipedia.org/wiki/Jennifer_Tilly for details) is in her mid-50s. She was in college before she met another Jennifer her own age. People are still misled by her name and believe that she has to be a lot younger than she really is.
The moral is that if you have the great fortune to pick a girl's name that will be popular some day but is not now, that girl will probably be happy about it. :-)
I have played poker with Jennifer Tilly a handful of times at the WSOP. After looking at her Wiki, I realize I was around 15 years off her age. This has less to do with the fact that her name is uncommon for her age, and more to do with the fact that she doesn't look 55.
Me too. I've met her when she came to Melbourne to play the Aussie Millions several times as well.
On TV she often plays a loveable dumb blonde character. Yet she has/had a blog online somewhere (I'm looking now, cannot find it again) that is well worth reading - she writes like someone who is particularly well read. She is highly intelligent and funny, I wish she had kept it up.
Oh, and she's stunning irrespective of her age. What a wonderful lady.
Yeah, I've always liked her! She has a sister too, who looks even better? Certain genetic combinations age well; she has the right mix. Along with Irish and Mexican.
Don't believe everything you see on the internet! The Irish comes from my father, her stepfather, and she is not that. And "Mexican" was a common misidentification of Caucasian-Asian mixes before that mix was common. (Remember, at the time she was born, marriage between Chinese and white people was illegal in large portions of the country. See "Loving v. Virginia" for more on that.)
Meg is an interesting one too, appearing from nowhere to become popular for ten years from the mid-50s to 60s, then disappearing again. But oh how dreamy she was in the Big Chill.
so according to your username and that wiki page, you are therefore her half-brother ben tilly. your mother is patricia, your father is john ward, and you are from british columbia, specifically texada island.
honestly, i'm not really sure why you would post identifying information on the internet.
There is enough identifying information about me out there that there is no point in denying it. That ship sailed for me many years ago. And sometimes it is convenient to be able to say something and have people realize that I have direct experience with it.
Of course you shouldn't believe everything you read on the Internet. Contrary to what you surmised, I was actually born in California, did most of my growing up in Victoria, British Columbia, and Jennifer is not part-Irish.
Now if you want to get disturbed, go read my sister's book, Singing Songs. None of what is discussed there was stuff that I had any control over, so I have no shame about it. And it is all stuff I've said in public before.
As I said, that ship sailed for me long ago. There is no point in hiding it.
Many posters on Hacker News use identifying profiles, as do I. IMO a very small percentage of comments posted here could be damaging to that person. On the other hand I think that many opportunities can arise from being identifiable here.
fuck you. is this really a meaningful thing for you to do? you really have nothing better to do than trace a username? even if it's all public info you're a piece of shit for tracing it for no reason besides outing because you can. fuck off and die you parasite.
I used to work for an NLP startup. We focused on stuff you could do with Romanized names -- names that were originally not written in the Latin alphabet and ended up being written in the Latin alphabet using some kind of transliteration scheme.
For example, we could take a name and generate a pretty comprehensive, and culturally aware, list of variants.
Jennifer -> Jenifer, Jen, Jenny, Jennie, etc.
Richard -> Rich, Richie, Dick, Dickie, Ritchard, etc.
Rho -> No, Lo, Loh, Noh, Roh, Ro, Nho, etc.
The intention of course was to build up lists of name variants that could be used during identification checks.
We also had some pretty significant statistical models that could guess gender and provide a descending list, with confidence levels, of the most likely countries of origin for a name. It was surprisingly accurate and could account for different Romanization schemes popular in different countries. It could even guess if a name was a surname or a given name.
What did we build the models on? Somehow, one of the founders was able to swing access to U.S. border control data. Even though it was names and country-of-origin data, it was de-identified (having a list of names doesn't mean we know who the names belong to). There were something north of a billion names in the collection, and it included place of birth, country of origin, gender, etc. Names were mined for digraphs so we could build CFGs that could be walked to generate variants. There was lots of manual work as well: endless regex writing and testing, QA, that sort of thing.
For some countries, we had pretty poor data, to be honest (I think we had a couple dozen North Koreans), but for most of the world our coverage was surprisingly good. It turns out all that work boiled down into a surprisingly small library, just a couple dozen megabytes in size, that was pretty fast -- I don't remember how fast, but something like a few thousand names per hour. It was pretty niche, but eventually the company was acquired and I went on my way.
I always assumed that technology like that would find its way into more applications, but I'm constantly surprised it hasn't.
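To give a rough feel for variant generation, here is a toy sketch. It is far simpler than the digraph-mined grammars described above: it just expands each character through a hand-written substitution table and joins every combination. The `RULES` table and the `variants` function are hypothetical and only tuned to reproduce the "Rho" example.

```python
from itertools import product

# Hypothetical interchangeable spellings; a real system would mine these from data
# and make them depend on the likely source language.
RULES = {
    "r": ["r", "l", "n"],  # initial consonant often Romanized differently
    "h": ["h", ""],        # the h may be dropped
    "o": ["o", "oh"],      # the vowel may gain a trailing h
}

def variants(name, rules=RULES):
    """Expand each character into its alternatives and join every combination."""
    options = [rules.get(ch, [ch]) for ch in name.lower()]
    combos = {"".join(parts).capitalize() for parts in product(*options)}
    combos.discard(name.capitalize())  # the input itself is not a variant
    return sorted(combos)

print(variants("Rho"))
```

A flat table like this overgenerates (it also produces spellings nobody uses), which is presumably why the real thing needed grammars, statistics, and a lot of manual QA.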
>'I always assumed that technology like that would find its way into more applications, but I'm constantly surprised it hasn't.'
Many years ago, I was working on a large project for an organization with apparently nothing consistent between half a dozen systems, each with tens of thousands of users, except names. Naturally, those names were full of exactly the kind of variations you're describing.
When I went looking for a solution to do exactly what you're describing I ran into solutions that were both vague about their functionality and expensive. Like you say, pretty niche - it seemed that everyone was used to selling very specific 'solutions' not a library/API.
I ended up hacking together a very basic script to accomplish the same. It took days to run thanks to my non-existent coding skills, but the accuracy was pretty good.
What it couldn't line up was solved by later decoding and discovering correlations between the long forgotten conventions used for unique IDs in the various systems.
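A basic script along those lines might look like the sketch below. This is not the original script; it assumes plain string similarity via the standard library's `difflib.SequenceMatcher`, and the `normalize` and `best_match` helpers, the 0.8 threshold, and the sample records are all made up for illustration.

```python
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase and drop everything except letters and spaces."""
    return "".join(c for c in name.lower() if c.isalpha() or c == " ").strip()

def best_match(name, candidates, threshold=0.8):
    """Return the most similar candidate, or None if nothing clears the threshold."""
    target = normalize(name)
    scored = [(SequenceMatcher(None, target, normalize(c)).ratio(), c)
              for c in candidates]
    score, match = max(scored, default=(0.0, None))
    return match if score >= threshold else None

# Hypothetical user lists from two systems
system_a = ["Jennifer Smith", "Richard O'Neil"]
system_b = ["Jenifer Smith", "Richard ONeil", "Ian Brown"]
for person in system_a:
    print(person, "->", best_match(person, system_b))
```

Comparing every name against every other name is quadratic, which is one way a script like this ends up taking days on tens of thousands of users per system.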
1. Marketers surely have mined this data to the hilt -- cross-referencing these trends with address lists and full-name email prefixes can make targeted promotions a lot more effective.
2. My own name is relatively rare in the U.S. among my age cohort (http://www.wolframalpha.com/input/?i=ian) to the point where some adults had problems pronouncing it when I was in elementary school 35 years ago ("Isn't that a girl's name?"). But I suspect, based on anecdotal evidence and personal observation, that the name is more common in England, Scotland, Australia and Canada. And the Wolfram data shows that it has been growing in popularity for many years in the U.S.
Can we agree to use the plural "their" for ambiguous sex third-person possessive? "His" is sexist, but so is "her", which is distracting on top of that because it isn't conventional.
As someone who's not a native English speaker, can I ask you why "his" is sexist?
A quick read of a dictionary (http://dictionary.reference.com/browse/his) says that his is "the possessive form of he", and the second definition of "he" is "anyone (without reference to sex)". That's also what I was taught in middle/high school.
Sorry if I'm just missing something and this is a stupid question.
Something is sexist when language users think it is. The problem with he/him/his is that the primary meaning refers to males, only. Because of that, anyone reading it gets pushed towards the primary meaning.
Because it leads to a default assumption of a male actor which has often been relied upon to exclude women, especially from career roles. While a writer or speaker may intend a term generically, readers and listeners often infer (or pretend to infer) the gendered meaning.
For example, consider the following headline 'What you can tell about a doctor from the sort of shoes he wears.' You're probably not picturing any women's shoes when you read that.
It can be argued that "right" as in "right side" has good connotations, because it is also the word for correct. In Germanic languages, it is the same word as the word for "higher", which arguably also has positive connotations. Then it can be argued that "right handed" has culturally, and through these languages, been given a higher status than being left handed.
If I could flip a switch, I would make it so that "their" or something similar was a gender-neutral pronoun, so that we can avoid both sounding sexist and sounding awkward. On the other hand, languages probably have a lot of cultural baggage associated with them that is outdated. But these are artifacts of the history of the languages intertwined with the past cultures that used them; we don't tacitly embrace the old connotations that they had simply by using them in this day and age.
The word "left" is also associated with being bad: 'sinister' comes from "sinistra", Latin for left. 'Gauche', or clumsy, is the French word for left. And 'dextrous', another good thing, is from the Latin for right. And this isn't accidental, but because being left-handed used to be considered a sign of evil. Some nice etymology lists here: http://english.stackexchange.com/questions/39092/how-did-sin...
We originally changed it to "their", but quickly reverted it because "her" is relevant to the content here. Would that we could as easily "revert" this dreadful subthread—but we'll content ourselves with marking it off-topic.
Singular "they" is perfectly good, perfectly historic English (there have been countless HN threads on this, with copious citations) and it's only a matter of time till the convention goes back to being generally accepted and eliminates the pronoun gender problem [1]. In the meantime, let's restrain ourselves from having flamewars about it.
1. Which we only have because of meddling 18th and 19th-century prescriptivist grammarians in the first place. Thanks, meddling prescriptivist grammarians!
Can you explain your footnote a bit more? Pronouns are a closed class, so I have a hard time believing some prescriptivists could have changed the way normal people use pronouns.
This turns out to be a little harder to dig up from HN Search than I thought, so I've made a list of some of my favorite links on the topic. If you find any other high-quality ones, please let me know—there are several I couldn't easily find again in five minutes. I know Language Log has had many good posts about it.
There are many memorable details in this history, such as that the first English grammarian to prescribe generic 'he' was a successful female entrepreneur (who ironically was mostly an anti-prescriptivist), and that the name of another was the delightfully apropos Sir Charles Coote.
That is commonplace for non-gendered terms, but in this case the article relates more particularly to women.
I don't think "his" and "her" are sexist per se, though men tend to use the former and women the latter when referring to a theoretical or nongendered person or whatever. "His or her" everyone seems to agree is bulky, and fringe-use alternatives like "zyr" or something don't have nearly the mindspace to suggest as a reasonable alternative.
>though men tend to use the former and women the latter when referring to a theoretical or nongendered person or whatever //
Any evidence to support that assertion? If one isn't comfortable using the neutral pronouns - identical as they are with the masculine pronouns in English - the tendency is to use [singular] "they" or "his or her" IME (anecdotal as that is). I don't find women generally choose to use feminine pronouns more unless they're trying to make a point in doing so.
Example: suppose there is a sentence "Each Cub Scout must build and light a fire in order to gain his backwoodsman badge". People, myself included, will tend towards saying "gain their backwoodsman badge" rather than choosing to say "his backwoodsman badge" or "her backwoodsman badge" according to the speaker's sex. Of course some people will also get upset about the gender neutrality of words that end "man".
'Singular they' is common in the UK and quite a few Commonwealth countries, as well as in Ireland. While it occasionally gives rise to ambiguity, I certainly find it less confusing than randomly selected gendered pronouns, which suggest a specificity that is often not present.
On this article, the headline led me to think that there was something distinctive about the distribution of female names that made it far easier to guess (not tell) someone's age if they were female rather than male.
A better headline, one that more accurately reflected the actual content and avoided unnecessary gender classification and grammatical ambiguities, would have been 'How to guess Americans' ages from their names.'
Using ‘her’ instead of ‘his’ is hardly sexist. That’s completely absurd. If it’s used consciously to make a statement there is no issue, at least not for another few centuries or so.
I merely meant that the sentence should agree with itself. If you want to talk about a woman, write "How to Tell A Woman’s Age When All You Know Is Her Name".
Not sure why you're being downvoted. I'm pretty particular about grammatical correctness but I've come around to thinking that singular "their" is a reasonable approach. I wouldn't necessarily call "his" sexist but, especially when used in the context of certain occupations for example, it does perpetuate a stereotype. And interspersing random "hers" calls attention to itself and is distracting.
There's also precedent with thou and you although that evolution was a bit more complicated and isn't quite the same thing.
That seems like a tenuous argument at best. What I got from the article was that it examined both genders, and that women tended to fall into line with this method slightly more. It would be a different story if it were developed as a method for gauging women's ages, and extrapolated to men - your reasoning would hold then.
Forcing the use of "her" instead of "his" is just as sexist. If we want to improve the mental model of the listener (reader) through the use of language, then we should make a conscious effort to be correct instead of argumentative, and say "his or her" or "their".
It's not sexist. Sometimes you use one pronoun, sometimes another. Or you can say "his or her", or "their", or whatever you feel like for any given situation.
Edit: it's not ambiguous. The person in the title is a hypothetical single person. Whoever wrote the headline decided that the person they made up was feminine.
It's only ambiguous absent other information and the second half of the sentence provides that. So now it is someone whose identity we do not know, but we do know that the someone is female.
I'm British. I know two women called Deirdre. They're both Irish. It seems that the name had fallen out of favour in Britain by the 70s, but was still fashionable in Ireland until at least the 80s.
Since Deirdre is an Irish name, it is possible that the decline in popularity (among the general population) in the UK was at least partly due to increased anti-Irish sentiment in the 1970s.
There is a lot of variation in names by nationality. Lots of Irish names (esp. female names) can be very hard to pronounce, such as Aoife, Oonagh, Caoimhe, Niamh, etc.
Tabitha
http://www.wolframalpha.com/input/?i=tabitha
Rise begins right around the second season of _Bewitched_ when the character Tabitha was introduced; peak (maybe coincidentally) appears to be around the short-lived 1977 spinoff series.
I bet Liam Neeson is the reason for the new wave of Liams, but I can't decide whether the Karate Kid remake is the reason for the wave of Jaydens, or if it's just a coincidence.
Wouldn't we expect the mode to be 75, while the median and mean would be younger? Assuming, of course, that Wizard of Oz produced any sort of lasting increase on the popularity of Dorothy as a name.
I wouldn't expect a lasting increase so much as an S curve. The movie comes out and there is a spike in Dorothy usage. A few years pass and the original spike slows down. Then the movie is seen as old-fashioned and you start to see a negative trend. Then Dorothy is seen as an old lady's name, so it sees an even sharper decline.
I've noticed my name, Bret, spiked in popularity in 1959 and 1982, corresponding nicely with the TV shows 'Maverick' and 'Bret Maverick'.
http://www.wolframalpha.com/input/?i=bret
I was curious about one of the deadest male names, Isadore, so I looked it up. It's of Greek origin and it turns out the female counterpart, Isadora, is the ninth most popular name for baby girls in Chile in 2006. The website linked from the article indicates that it's never ranked in the top 1000 in the US. Interesting how a shared, ancient name could be so wildly divergent in usage.
Kirk Douglas was known as Isadore Demsky when he grew up in Amsterdam NY in the early 20th Century.
Apparently it was a popular male name for immigrants and first generation children in the early 20th Century. It was often shortened to "Izzy."
There was a social trend in America during the middle of the 20th Century to "anglicize" names. For example, I have uncles who changed their birth name in the early 50s from "Wozniak" to "Wagner." Even Izzy Demsky became Kirk Douglas when he grew up.
Take, for example, the children in "The Godfather" books. The older children have "old country" names (Santino, Fredo) and the younger children have "new world" names (Michael, Connie). It's almost as if the older kids "Americanize" the family when they go to school.
Anyhow, names are funny things when taken in aggregate.
My immigrant grandparents named my mother an anglicized version of their intended name after pressure from the older children, who said, "In America you say ____, not ____". I think there is probably something to the theory that the family gets more Americanized as the older children are raised in American culture and "correct" some of their parents' old world ways.
The combination of the SSA baby names data, which is very cool and deep on its own, with the SSA actuarial data is pretty neat, partly because I hadn't known about the actuarial data set. When I saw that the OP had tried to calculate surviving persons of a given name and birth year, I assumed that they had just used the SSA's death database. Until at least 2010, the SSA had a list of every SSA person who had died, along with when they were born and their Social Security number. Since the SSN, until relatively recently, was indicative of which state the SSN holder was actually born in, that, combined with the babynames-per-state data, could get you very granular calculations. I'm sure the SSA's actuarial table gets it pretty much within an acceptable margin of error, but who knows, maybe some awkwardly named people were doomed to a shorter lifespan? (I'm only half joking, I think.)
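As I understand the approach, the surviving-persons estimate is just births for a name in a given year multiplied by the actuarial probability of that cohort still being alive. A minimal sketch with made-up numbers (the `births` and `survival` figures and both function names are hypothetical, not SSA data):

```python
def living_by_year(births, survival):
    """Estimated living name-holders per birth year: births x P(alive today)."""
    return {year: births[year] * survival[year] for year in births}

def median_age(living, current_year):
    """Age of the median living name-holder, counting from the youngest cohort up."""
    total = sum(living.values())
    running = 0.0
    for year in sorted(living, reverse=True):  # most recent birth year first
        running += living[year]
        if running >= total / 2:
            return current_year - year

# Hypothetical inputs for one name
births = {1940: 10000, 1970: 20000, 2000: 5000}
survival = {1940: 0.5, 1970: 0.9, 2000: 0.99}

living = living_by_year(births, survival)
print(living, median_age(living, current_year=2014))
```

A real version would use one actuarial table per sex and every birth year, but the arithmetic stays the same.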
More broadly there are doubtless correlations to various demographics (income, education level, etc.) that have different life expectancies. Though race is certainly one. (As, obviously, is gender.)
Would be interesting to apply it to a group of friends. Since they're likely to be similar ages, you should be able to get an improved guess from combining the distributions for all of their names.
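One way to sketch the combining step: if you assume the friends were all born in roughly the same decade and that their names are independent given that decade, you can multiply the per-name distributions and renormalize. The distributions below are invented for illustration, and `combine` is a hypothetical helper.

```python
def combine(distributions):
    """Multiply per-name birth-decade distributions and renormalize.

    Assumes a shared birth decade for the whole group and independence
    between names given that decade.
    """
    decades = set.intersection(*(set(d) for d in distributions))
    joint = {decade: 1.0 for decade in decades}
    for dist in distributions:
        for decade in decades:
            joint[decade] *= dist[decade]
    total = sum(joint.values())
    return {decade: p / total for decade, p in sorted(joint.items())}

# Hypothetical P(birth decade | name) for three friends
jennifer = {1960: 0.10, 1970: 0.50, 1980: 0.30, 1990: 0.10}
lisa = {1960: 0.40, 1970: 0.40, 1980: 0.15, 1990: 0.05}
heather = {1960: 0.05, 1970: 0.45, 1980: 0.40, 1990: 0.10}

group = combine([jennifer, lisa, heather])
print(max(group, key=group.get), group)
```

With these numbers the combined distribution is much more concentrated on one decade than any single name's distribution, which is the improvement you would hope for.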
"The peak year for boys named Joseph was 1914 — when about 39,000 of them were born. Those 1914 Josephs would be due to celebrate their 100th birthdays at some point this year. But only about 130 of them were still alive as of Jan. 1."
Something quite poignant in this. I'd be interested in seeing a life expectancy chart based on name.
I'm pretty sure that would be "a life expectancy chart". It's pretty unlikely that your name has any impact on your life expectancy. But, since name popularity is influenced quite a bit by social/cultural status, and those do affect life expectancy, you'd probably see some differences along those lines.
In case anyone is interested, there have been some studies done where researchers send in two identical resumes. The only difference is that one has a traditionally 'white' sounding name, whereas the other has a name more associated with minorities. The 'white' sounding name performs better in these types of tests.
^ This article gives some more information, including an interesting story about two brothers named Winner and Loser. The most relevant quote, however, comes right at the end:
"The data show that, on average, a person with a distinctively black name—whether it is a woman named Imani or a man named DeShawn—does have a worse life outcome than a woman named Molly or a man named Jake. But it isn't the fault of his or her name. If two black boys, Jake Williams and DeShawn Williams, are born in the same neighborhood and into the same familial and economic circumstances, they would likely have similar life outcomes. But the kind of parents who name their son Jake don't tend to live in the same neighborhoods or share economic circumstances with the kind of parents who name their son DeShawn. And that's why, on average, a boy named Jake will tend to earn more money and get more education than a boy named DeShawn. DeShawn's name is an indicator—but not a cause—of his life path."
(Levitt and Dubner, "A Roshanda by any other name")
This analysis has been done countless times by countless different people, so it seems a little presumptive to attribute it to one organization or author. Name data is available and is something that we all can relate to, so it gets an easy readership.
It's surprising that Jacob isn't one of the top 25 most common male names considering that it's been the most popular male baby name for 14 of the past 15 years.
I assume it's the 25 most popular names of the living. But either way I would have expected the most popular male baby name for 14 consecutive years to make the list.
Does anyone know of a service or website where you input a particular name and it gives you statistics like the average age of people with that name?
Amusing. My brother and I are almost smack on the median of our names. Yet I was named after my father (and his father), my brother after our mother's father.
Baby names were heavily discussed in Freakonomics as indicators of a variety of things. An interesting read if you like this sort of data. Relevant content: http://freakonomics.com/tag/baby-names/
Yes. Modern baby names are status symbols and are treated like a branding exercise by many helicopter parents. It's both topical and relevant to the world we live in, as is the analytic exercise of extrapolating from incomplete but meaningful subsets of otherwise random-seeming data. It's the data equivalent of found-object art, another pastime of the 20th century's aspiring middle classes.
My name is Sebastian, which was extremely popular in Germany in the early 80s (not once was I the only Sebastian in the classroom). In the USA people would now imagine a small child when hearing my name. It's very interesting how different popular names are in different countries.
Was that perhaps influenced by the name "Bastian" for the main character of Ende's 1979 book "The Neverending Story" (Die unendliche Geschichte in the original German)?
That's very much culture-, language-, and country-specific. Naturally, societies tend to have certain preferences in names in different time periods. But those are only tendencies, not a set-in-stone list of names.
Amazing that the oldest male names do not include biblical Old Testament names but the youngest male names do! A sign of increasing religious fundamentalism?