The folks at Pingdom, a company focused on server performance monitoring, posted a fascinating little piece of research based on Google’s Insights for Search tool. I’m interested both in their specific research question – what social network tools are popular in what parts of the world? – and the richness of the data available via this tool from Google. (Basically, I’m feeling boneheaded that I hadn’t realized this data set was available.)
The Pingdom folks tried a simple experiment, using Search Insight to search for information on a dozen social networking sites. The Insight tool reveals how popular searches for particular terms are, and where in the world those searches are coming from. This lets the Pingdom folks conclude:
# Facebook is most popular in Turkey and Canada.
# Friendster and Imeem are most popular in the Philippines.
# LinkedIn is most popular in India.
# Twitter is most popular in Japan.
# LiveJournal is more popular in Russia than it is in the United States.
Of course, that’s not quite what these searches measure. Companies generally keep their traffic data extremely private. (We try to be a bit more transparent, publishing an analysis of Global Voices logs online, but those numbers probably aren’t entirely accurate and we increasingly rely on Google Analytics to track what’s happening on our own servers… and we don’t publish those numbers.) The Insight data isn’t measuring traffic to those sites, or their number of active members, just the number of folks searching for those sites via Google. That may or may not be an effective proxy for interest in those networks. I’m a Facebook user, and I have the site bookmarked, so I would rarely find myself searching for the site. It’s possible that the search data is a more effective proxy for the strength of a brand in a particular market, or for the level of interest in a specific site from non-participants.
On the other hand, the information Pingdom turns up through this proxy looks pretty similar to the results Le Monde published a few months ago, using information from Valleymag Datamonitor, which evidently had access to numbers that measured the users of social networking sites broken down by their national origin.
Google appears willing to share a surprising amount of data with this tool – a search for “red sox” (rapidly becoming my “foobar”) gives a graph of searches for red sox since 2004, marking peaks in the graph with news stories. (The highest peak in the Red Sox graph is associated with the 2004 World Series victory… the peak for the 2007 series victory is puny in comparison.) Regional interest shows that while interest in the Sox is highest in the US, there’s substantial interest in the Dominican Republic and Puerto Rico (baseball-crazy nations with citizens who play for the Sox), and a surprising amount of interest in the team from Ireland. (Okay, given the massive Irish-American population in Boston, perhaps not that surprising.) It’s also possible to graph interest in a topic by nation, comparing the rise and fall of Dominican and puertorriqueño interest in the Sox over time.
Finally, the tool shows ten terms most commonly associated with a search term, and “emerging” terms that have recently been associated with your term. Predictably, the Red Sox are associated with Boston, tickets and the Yankees… less predictably, center fielder Jacoby Ellsbury appears as a “breakout”, a term that’s recently become popular.
I had some fun with another tool designed to reveal this “associated search” data, Overture’s (now unavailable) keyword selector tool. Back in 2004, you could feed the KST a term like “red sox” and discover that there had been 10,068 searches in the previous month for “red sox suck”… in contrast to 30,527 searches for “yankees suck”. I fed my favorite search terms – a list of the world’s nations – into the tool and cranked out an interesting data set that I blogged about as “the Freudian web”, our associations between particular nations and our interest in them. The data that came out of that research suggested that there were some nations where our interests were basically in seeing naked ladies, others where we mostly wanted to buy cheap prescription drugs.
What’s especially nice about the Google tool is that we can look at the interest in a search term from different nations. Search Google Insight for “Sweden” – you’ll discover that “Sweden” is a more popular search in Gambia than in Sweden itself. Why? Click through on Gambia and you’ll get a page tracking searches from the Gambia for Sweden, which reveals that Gambians are searching for universities in Sweden. (So are Nigerians, Ghanaians, Cameroonians and Bengalis.)
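How can a small country outrank Sweden on searches for “Sweden”? One plausible reading, and it is my assumption rather than something the numbers above prove, is that the tool ranks regions by the share of their searches that match a term, not by the raw count. The sketch below illustrates that idea. The function name `relative_interest` and all of the counts are hypothetical, invented purely for illustration; they are not real Google data.

```python
def relative_interest(term_searches, total_searches):
    """Score each region by its share of searches for a term.

    Shares are rescaled so that the region with the highest share scores 100.
    This is an assumed scoring scheme, not a documented Google formula.
    """
    shares = {
        region: term_searches[region] / total_searches[region]
        for region in term_searches
    }
    top = max(shares.values())
    return {region: round(100 * share / top) for region, share in shares.items()}


# Hypothetical counts, made up for this example only.
term_searches = {"Sweden": 50_000, "Gambia": 400}
total_searches = {"Sweden": 10_000_000, "Gambia": 20_000}

scores = relative_interest(term_searches, total_searches)
for region, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(region, score)
```

With these made-up numbers, Sweden produces far more searches for the term in absolute terms, but Gambia devotes a larger fraction of its searches to it, so Gambia comes out on top.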
I suspect there’s something wonderful that can be done with this data, though I don’t quite know what experiment to run yet. I’m interested in the ways that searches can proxy interest in specific topics, especially in international news. Searching for “Ossetia” reveals a predictable uptick in interest in the past month. And it wasn’t surprising to see the most interest in the term coming from Russia. But why are Finns searching for information about Ossetia? Again, clicking through is interesting – Google tells us that the searches for information about Ossetia aren’t just from Finland, but from the province of Southern Finland, a part of the nation that borders on Russia. Perhaps Finns in that corner of the country are looking anxiously to the east and wondering whether Russian incursions could come across their own border. Or perhaps there are just a lot of Russians in southern Finland. A search for Ghana, classified by US states, reveals the strongest interest in Maryland and DC, an area that’s got a huge Ghanaian expatriate population.
Fun data sets like this have the tendency to ruin my productivity until I can find some interesting way to manipulate them. Thanks, Google, for spoiling my month of August.