Dave Sifry’s latest update on the State of the Blogosphere touches on an issue near and dear to my heart: the representation of language in the blogosphere. He offers the surprising possibility that Japanese may have unseated English as the dominant language of the blogosphere, showing data that 37% of blog posts indexed on Technorati were apparently in Japanese.
There are a number of possible explanations for this phenomenon. Much blogging in Japan is mobile blogging, so posts tend to be shorter and more frequent. Because the study counted posts rather than blogs, this could partially explain the skew. And it’s possible that some language misidentification is going on as well.
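To make the posts-versus-blogs point concrete, here is a small sketch in Python. The numbers are entirely made up for illustration (they are not Technorati's figures), and the function names are my own; the only point is that a language with fewer blogs can still lead on post count if its bloggers post more often.

```python
# Hypothetical figures, chosen only to illustrate the effect.
# They are NOT Technorati data.
sample = {
    "Japanese": {"blogs": 200, "posts_per_blog": 10},
    "English": {"blogs": 400, "posts_per_blog": 4},
    "Other": {"blogs": 400, "posts_per_blog": 3},
}


def share_by_blogs(data):
    """Fraction of all blogs written in each language."""
    total = sum(v["blogs"] for v in data.values())
    return {lang: v["blogs"] / total for lang, v in data.items()}


def share_by_posts(data):
    """Fraction of all posts written in each language."""
    posts = {lang: v["blogs"] * v["posts_per_blog"] for lang, v in data.items()}
    total = sum(posts.values())
    return {lang: n / total for lang, n in posts.items()}


if __name__ == "__main__":
    by_blogs = share_by_blogs(sample)
    by_posts = share_by_posts(sample)
    for lang in sample:
        print(f"{lang}: {by_blogs[lang]:.0%} of blogs, {by_posts[lang]:.0%} of posts")
```

With these invented inputs, the language with the fewest blogs ends up with the largest share of posts, which is the kind of skew a per-post count could produce.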
A few months ago, sorting through the Technorati Top 100, I noted that a number of Chinese bloggers had entered the elite, and suggested that Chinese blogs as a whole may be undercounted by engines like Technorati, as many Chinese blog servers don’t contact the ping servers that those engines rely on to find new blog posts. Since I made that post, it looks like Technorati has begun indexing a few more Chinese blogging sites… but it still seems to be missing most blogs on Bokee.com, for instance. When I got the chance to hang out with Isaac Mao in Manila, I asked him whether Chinese blog hosts would start supporting ping servers any time soon. He pointed out that many Chinese geeks don’t see the advantage of having their blogs indexed by an English-language, American-hosted service…
Sifry’s post acknowledges that Technorati is likely undercounting the Korean and French blogospheres because major blog providers there don’t contact ping servers.
All of which suggests that English-language blogging is becoming a smaller plurality each day. Which makes me very happy that we made the decision a few months ago at Global Voices to focus heavily on translating blog posts as well as linking to them. Haitham, Veronica, David, Feng and Alice have been steadily translating content from Arabic, Russian, Spanish, Chinese and French, respectively, and we’ll be introducing our new Portuguese translator in a few days. This has let us run fantastic posts, like this analysis by David Sasaki of the Spanish blogosphere’s reaction to today’s boycott of US goods in much of the Spanish-speaking world. But it makes me hungry for even more, including projects like Blogamundo, which promise large-scale systems to help translate blog content.
In the meantime, monolingual idiots like me are made even more aware of what we’re missing…
Pingback: Lessig Blog
The Sifry post confirms a lot of what I have been expecting for some time. The rapid fire moblogging that occurs is a better model for how the upcoming generation will digest information.
Translation services would be amazing; I can’t wait until Japanese is added.
I follow the third-party “lost in the shuffle” English-translated posts from some Asian blogs because I believe that Asian tech pop-culture is an accurate predictor of social technologies that will eventually stick in the US. Seeing translation done in a more consistent manner would be enormously helpful in getting speedy, accurate news.
Very interesting. I have some bad news for the commenter who is eagerly awaiting Japanese translation. I am learning Japanese and converse with Japanese friends (at an intermediate level), and I can assure you that Japanese is, evidently, impossible to translate properly, at least by machine – so I wouldn’t get your hopes up ;-)
That is, if I pop any letter from my Japanese friends into any translation engine that thinks it can do Japanese, I get utter garbage. Some of the simple nouns are translated correctly, but often it’s not even clear what the person is talking about at all. It’s soooo incredibly awful.
I think there are a lot of reasons for this, which I won’t get into, but suffice it to say, Japanese grammar and usage have grown up completely independently of any western influence. Don’t think for a minute that I’m exaggerating either. If I relied on translation to read any Japanese communication I receive, I simply would not be able to reply at all.
Maybe an example will illustrate: Here’s a translation from a completely understandable but slightly casual Japanese sentence.
First, I used AltaVista – BabelFish to translate it. What was he actually trying to say, do you think? (Keep in mind that I am male.)
Good morning, Peter excessively you do not worry and also the て is all right, when you become pregnant, human Kiyouko who everyone lightly is completed you saw to be, the human one people where the morning sickness is terrible you experience
Google does a little bit better in some areas, worse in others:
Good morning, Peter not worrying excessively, it is all right, when you become pregnant, the human capital child which everyone lightly is completed you saw to be, the human one people where the morning sickness is terrible you experience
What he was actually saying (my translation):
Peter, don’t worry too much, everything will be alright. Everyone has varying degrees of morning sickness when they get pregnant. Some have a rather light sickness, and others get it quite nasty, like Kyoko. Different people have different experiences, it just depends on the person.