Wonderment Rank
The other day, I idly wondered about the band-aids or surgical tape visible on Michael Jackson’s fingertips. I took out my smartphone and started typing in the query “why did michael jackson wear bandages on his fingers”. As I typed the first few words, the search box filled with a slew of Google search suggestions:
Why did michael jackson die?
Why did michael jackson turn white?
Why did michael jackson change his nose?
Why did michael jackson bleach his skin?
etc…
It occurred to me that the number of questions like these would be higher for some celebrities, like Michael Jackson, than for others. I decided to try to measure this, so I wrote a little script that measures the “Wonderment Rank” of various people and things.
You give the script a word or phrase, such as “michael jackson” or “kittens”, and it constructs a series of partial search queries:
why did ____
why does ____
why do ____
how come ___
why didn’t ___
why doesn’t ___
why don’t ___
It then totals the Google search traffic for all of these partial phrases, using this undocumented API, and reports the results. Wonderment Rank is reported as a single number, in millions of searches.
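For readers who want to see the steps above in code, here is a minimal Python sketch. It is not the script linked at the end of this post. The endpoint is the one quoted in the comments below; the XML layout (a `num_queries` element with an `int` attribute per suggestion) is an assumption about what the toolbar output looked like, and the function names are made up for illustration. It parses response text you pass in and makes no network calls.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The seven partial-query prefixes listed above.
PREFIXES = ["why did", "why does", "why do", "how come",
            "why didn't", "why doesn't", "why don't"]

def partial_queries(subject):
    """Build one partial search query per prefix for the given subject."""
    return [f"{prefix} {subject}" for prefix in PREFIXES]

def suggest_url(query):
    """URL for one partial query against the suggest endpoint."""
    return "http://google.com/complete/search?" + urlencode(
        {"output": "toolbar", "q": query})

def total_queries(xml_text):
    """Sum the search counts in one response.

    ASSUMPTION: each suggestion carries <num_queries int="..."/>.
    """
    root = ET.fromstring(xml_text)
    return sum(int(node.get("int", 0)) for node in root.iter("num_queries"))

def wonderment_rank(responses):
    """Total searches across all responses, in millions."""
    return sum(total_queries(text) for text in responses) / 1_000_000

# Hypothetical responses, invented purely to exercise the parser.
SAMPLE_A = ('<toplevel><CompleteSuggestion><suggestion data="why did x y"/>'
            '<num_queries int="1500000"/></CompleteSuggestion></toplevel>')
SAMPLE_B = ('<toplevel>'
            '<CompleteSuggestion><suggestion data="why does x y"/>'
            '<num_queries int="500000"/></CompleteSuggestion>'
            '<CompleteSuggestion><suggestion data="why does x z"/>'
            '<num_queries int="250000"/></CompleteSuggestion>'
            '</toplevel>')

if __name__ == "__main__":
    for query in partial_queries("kittens"):
        print(suggest_url(query))
    print(wonderment_rank([SAMPLE_A, SAMPLE_B]))
```

A response with no counts in it simply contributes zero to the total.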
For the record, here is the Wonderment Rank of Michael Jackson, along with a few other celebrities.
whitney houston | 2393.62 |
steve jobs | 1112.22 |
the beatles | 1344.29 |
lady gaga | 1095.69 |
michael jackson | 960.31 |
justin bieber | 621.47 |
angelina | 176.97 |
jack black | 119.30 |
ben franklin | 50.82 |
davy jones | 0.40 |
sacha baron cohen | 0 |
As you can see, the recently deceased tend to score highly (except for Davy Jones). Some celebrities, such as Sacha Baron Cohen, score nary a blip, which leads me to believe that Google has some kind of arbitrary cutoff in reporting results.
The results (and my methodology) suggest that Wonderment Rank is strongly correlated with overall search popularity. It’s not the same thing, however. Consider kittens. Google Trends reveals that “kittens” has roughly twice the search traffic of “jack black”, but kittens merit a relatively low Wonderment Rank of 21. Yes, they are cute and playful, but not exactly mysterious.
kittens | 21.80 |
puppies | 134.74 |
Celebrities and kittens aren’t the only things people wonder about. They also wonder about politics.
republicans | 2286.87 |
democrats | 1277.82 |
politicians | 1612.73 |
romney | 4374.55 |
ron paul | 2320.61 |
gingrich | 2110.03 |
obama | 1379.45 |
santorum | 733.16 |
And we all wonder about existential questions. We wonder about the motivations of God, and about Jesus. We wonder about life, death, taxes, and why do birds suddenly appear?
life | 9238.58 |
jesus | 6199.19 |
god | 4560.73 |
death | 2247.20 |
tax,taxes | 1813.87 |
birds | 1145.98 |
Men wonder about women. Women wonder about men, but not to the same degree. As a man, I expected women to score more highly, but apparently, I was wrong:
men | 10774.63 |
women | 6900.09 |
Parents wonder about their kids:
kids | 12749.90 |
boys | 10196.22 |
girls | 8862.12 |
babies | 3068.72 |
And people wonder about anybody, anyone, someone, and nobody:
no one | 14135.55 |
anyone | 11386.79 |
nobody | 2663.10 |
anybody | 4845.46 |
someone | 3743.05 |
For the nerds in the audience, here is the Perl script I wrote to measure Wonderment Rank, and here is a version I made in Ruby.
Over the next few days, I’ll post a few more results.
UPDATE: A few days after I wrote this, Google stopped providing search-query counts in their suggest API. This change makes this metric much less precise, although the API can still be used to detect some level of interest.
EDIT: I corrected the spelling of Sacha Baron Cohen – thanks Clive!
March 1st, 2012 at 3:21 pm
Done a scatter plot of wonderment vs total searches? Interesting cases should lie off the line of correlation the bulk of the cases should fall on…
March 1st, 2012 at 3:24 pm
Yeah, this makes a lot of sense.
March 1st, 2012 at 3:56 pm
It’s probably worth pointing out that this is an amusing novelty, but not actually meaningful: the results reported by Google tend to fluctuate wildly and can’t be depended upon to be accurate.
March 1st, 2012 at 4:39 pm
We were wondering about that…
March 1st, 2012 at 5:29 pm
I’ve quickly hacked up http://dl.dropbox.com/u/5343094/scatterplot.png based on a list of top 100 ghastly celebrities of 2011 and a rather dubious list of “ten most controversial people.”
Working is at http://dl.dropbox.com/u/5343094/Wonderment.zip
March 1st, 2012 at 5:34 pm
Cool, thanks for this! Unclear from the limited IDs how freaky the ones above the line are. Might be good to show thumbnail portraits instead of dots.
March 2nd, 2012 at 12:29 am
Some sort of celebrity-o-plot? I did think that, but there is a confounding factor: when I did the names of senators I got one guy who was streets ahead on wonderment rank, but he shared his name with about half a dozen athletes, too. I also feel slightly sorry for Ray Romano, whoever he is.
March 20th, 2012 at 1:55 pm
Hi Jim,
Interesting stats. Just wanted to mention that ‘Sasha Baron Cohen’ is actually ‘Sacha Baron Cohen’, so I would be interested in knowing the combined stats (although it will probably make no significant change in the rankings). As an aside, I thought I would share a ‘security’-related story from today, which I think highlights the false confidence people have in what they consider a ‘strong’ password.
I was in a meeting room today where there was a corporate wifi available but when I asked for the encryption key I was told the following three pieces of information:
1. The key is darthvader
2. It contains a mixture of Upper Case and Lower Case
3. It contains non-standard characters (symbols)
Armed with this information, I ‘guessed’ the encryption key first time.
I would be interested in how many of your ‘puzzle solving’ audience would do the same as it is now obvious to me that the ‘tricks’ that some system administrators apply are not just well worn rules applied blindly.
OK…The answer as you probably also guessed is D@rthV@d3r
This could be made so much more secure by not capitalizing the first letter of each word and not using the @ sign for all ‘a’s in the passcode.
I don’t normally get much time for this sort of exercise, but I am intrigued as to whether people who consider themselves ‘champions’ of security are actually deceiving themselves.
March 20th, 2012 at 2:11 pm
Hey Clive!
Interestingly, within 3-4 days of my writing the article, Google stopped providing the search volume numbers in that undocumented API, which breaks this metric. For example:
http://google.com/complete/search?output=toolbar&q=why+did+sacha+
I *am* still using the API as an informal way of measuring search demand, but it’s not as precise as it once was.
I vaguely recall not having much luck with either SBC spelling, but I can’t confirm that. There is indeed a hit for “why does sacha baron cohen”.
Regarding security:
I think it’s becoming increasingly clear that passwords aren’t a very effective security measure, and that we need alternatives. In the past 5 years I’ve made the transition from:
1) Using an easily guessed “low security” password on all but 2-3 sites, and using a slightly less easily guessed password on my email, web server and bank.
2) Using a slightly less easily guessed formulaic password that includes a few letters from the domain name for each site.
3) Using completely random strings of letters and digits for nearly every single site I access unless I truly don’t care about “sharing” it.
I was only able to achieve #3 with help from a utility (“1Password”) that stores and enters them for me. This makes my web browsing more secure, but makes certain sites hard to access via mobile. I believe they have an iOS app, but I haven’t tried it yet. The formulaic replacements using “3” and “@” that you mention are already common transformations in use by dictionary cracking programs (e.g. “John the Ripper”).
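To make that last point concrete, here is a small Python sketch of how few candidates those tricks actually produce. It is an illustration only, not how John the Ripper implements its rules; the `candidates` function and the `SUBSTITUTIONS` table are names I made up, and the only transformations modelled are the ones from your story (capitalizing each word, “@” for “a”, “3” for “e”).

```python
from itertools import product

# The two character swaps from the story; hypothetical rule table.
SUBSTITUTIONS = {"a": "@", "e": "3"}

def candidates(words):
    """Yield every variant made by toggling capitalization and each swap."""
    chars = sorted(SUBSTITUTIONS)
    # One on/off switch for capitalization plus one per substitution.
    for capitalize, *use in product([False, True], repeat=1 + len(chars)):
        active = {c: SUBSTITUTIONS[c] for c, on in zip(chars, use) if on}
        parts = []
        for word in words:
            body = "".join(active.get(ch, ch) for ch in word)
            if capitalize:
                body = body[:1].upper() + body[1:]
            parts.append(body)
        yield "".join(parts)

if __name__ == "__main__":
    guesses = list(candidates(["darth", "vader"]))
    print(len(guesses))
    print("D@rthV@d3r" in guesses)
```

With three on/off rules there are only eight variants of the base phrase to try, which is why these substitutions add so little protection.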
I *much* prefer systems like ssh, where I can use public-key cryptography under the hood to get authenticated, and not have to worry, normally, about entering passwords at all.