My friends at TED have launched an exciting new project today, the TED Open Translation Project. It’s a powerful system to allow the “social translation” of their video content. This tool demonstrates the state of the art in social translation on the web today, and I think there are a lot of lessons in the tool and thinking behind it for anyone who hopes to make the polyglot internet more comprehensible, and for anyone thinking about online cooperation.
I’m aware that most people think of translation as roughly as interesting as developing Linux device drivers – necessary, but far from sexy. My hope is to convince you that translation is one of the keys to helping the internet reach its potential, and to get you at least a tenth as excited about this new tool and approach as I am.
For the past couple of years, TED has shared an amazing set of videos, talks delivered at the TED conferences in California, the UK, and Tanzania. These talks are some of the most fascinating and thought-provoking video content available on the web – many smart people have discovered TED talks and promptly lost a week or more gorging themselves on intellectual candy.
(A personal top five, for those who’ve not taken a deep dive into the videos that are available. I’m not going to argue that these are the “best” talks given at TED, but they are the ones that have had the most influence on me and my work:
– Ngozi Okonjo-Iweala, former Nigerian minister of finance, on the debate on trade and aid in Africa, framed in deeply personal terms, as she talks about her family’s struggles during the Biafran war.
– Swedish doctor and scientist Hans Rosling uses statistics and visualization to rethink international development over the course of decades and centuries.
– Majora Carter on the importance of environmental issues to urban communities, and the connection between community development and the green movement.
– Oxford development economist Paul Collier explains his brilliant book, “The Bottom Billion,” in eighteen minutes.
– Nigerian author Chris Abani on humanity, cruelty, compassion and storytelling. I’m not sure I’ve ever seen a talk swing between humor and brutality as rapidly and powerfully as Chris does in this talk. When he finished giving it live, I left the theatre because I didn’t want to hear anything else that day.)
For the past couple of years, these talks have been available to anyone with a good internet connection and the time to download them… but they’re only helpful to people who speak English, the language the talks were delivered in. TED, and specifically June Cohen, the director of TED Media, recognized that there’s huge demand for TED’s content around the world – take a look at TEDtoChina, a fan site that offers summaries of TED talks in Chinese.
Translation is supposed to be difficult, time-consuming and expensive. Professional translators routinely charge between $0.20 and $0.40 per word – translating this blog post into one other language would cost over $500 at market rates. The cost of machine translation has fallen from cheap to free, with powerful systems incorporated into Google and other search engines… but the results are far from perfect, and tend to miss the nuance of complex texts. Very few of us choose to read blogs – even on topics we enjoy and follow – via machine translation because the experience is so awkward.
But maybe translation doesn’t need to be so difficult and expensive. Maybe it’s something that interested, talented people will do for free, if given the right opportunities and incentives. That idea inspired the Global Voices community to launch Lingua, our project to translate Global Voices content into over twenty languages. In 2006, we discovered that Portnoy Zheng, an amazing Taiwanese blogger, was translating Global Voices stories into Chinese, and inviting other translators to help with his efforts.
We were thrilled, and started pointing Chinese-speaking readers to Portnoy’s efforts. Other groups, starting with the Francophones, proposed that volunteer translation of Global Voices content into other languages become an official feature of our community, and beginning in 2007, we’ve integrated volunteer translations into our site – under many of the headlines on the main site, you’ll see “zh”, “fr”, “mg” or another two-letter language code. Click on that code, and you’ll find yourself on a translation of that post.
There’s a growing movement to make “social translation” – translation of online information by users around the world, motivated more by community recognition and appreciation than by money – a mainstream approach to making the web more accessible to all readers. The movement has been led by the open source software community and by projects like Dwayne Bailey’s Pootle toolkit, a set of tools that make it easier to localize open source software. (Dwayne launched translate.org.za, a project that makes key software available in South Africa’s eleven official languages.) Inspiring projects in the space include WorldWide Lexicon, an open platform to allow cooperative translation of any website; Meedan, an online community that uses social translation as well as machine translation to build dialog between Arabic and English speakers; and dotsub, a powerful video subtitling and translation tool that invites anyone to become a subtitler or translator.
Cohen and her team looked closely at the tools and teams building the social translation movement and built a new community that learned from the successes and failures of other projects in the space. TED’s tool is based on dotsub, with some very powerful new features added, and their model for recruiting, recognizing and rewarding translators is inspired in part by some of the work we’ve done at Global Voices. For visitors to the site, this means that you can browse videos by language, selecting one of the 32 talks available with Spanish subtitles, or the sole talk available in Kyrgyz.
Select a talk in one of its translated forms, and you’ll get a subtitled video, a translated title and description of the talk. Featured in this description are the two people responsible for translating the talk, the lead translator and the reviewer – like Global Voices, TED is inviting translators to join the community, pairing new translators with trusted reviewers to evaluate the work and to offer any changes or suggestions. Another link on the page leads to an “interactive transcript” – this allows a viewer to select a point in the talk and fast-forward to see the slides and images that accompany the speaker’s words.
Not only is this a fantastically cool way to navigate these talks, it leads to my favorite undocumented feature of the system, which Cohen calls “the Rosetta Stone”. Pick a transcript of a talk in a language you speak. Then select subtitles in a language you don’t speak. You can watch the talk in three languages – the English of the speaker’s words, the Spanish of the transcript and the Turkish of the subtitles. (I suspect my wife, who speaks English and Hebrew well, and is learning Arabic, will be addicted to this feature in the near future.)
(This ability to view the same text in many languages may turn out to be one of the most important aspects of the project in the long run. As TED translates hundreds of talks, they’re creating “parallel corpora”, the raw material for machine translation systems. This might be too small to build really strong Turkish-to-Vietnamese translation technology, but the idea of pulling corpora from tools like dotsub is something that machine translation folks should be taking a close look at.)
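For readers wondering what a “parallel corpus” built from subtitles might look like in practice, here is a minimal Python sketch. It assumes each subtitle track is a list of (start time, text) cues and that a translated cue shares its start time with the original cue. That is an assumption made for illustration, not a description of how dotsub or TED actually store subtitles, and the function name and sample sentences are invented placeholders.

```python
from typing import Dict, List, Tuple

# A cue is (start time in seconds, subtitle text). This shape is hypothetical.
Cue = Tuple[float, str]


def align_by_start_time(source: List[Cue], target: List[Cue]) -> List[Tuple[str, str]]:
    """Pair cues from two subtitle tracks that share the same start time."""
    target_by_time: Dict[float, str] = {start: text for start, text in target}
    pairs = []
    for start, text in source:
        # Skip source cues that have no translated counterpart yet.
        if start in target_by_time:
            pairs.append((text, target_by_time[start]))
    return pairs


# Invented sample data: the Spanish track is missing the last cue.
english = [(0.0, "Thank you."), (2.5, "Let me tell you a story."), (6.0, "It begins here.")]
spanish = [(0.0, "Gracias."), (2.5, "Déjenme contarles una historia.")]

parallel = align_by_start_time(english, spanish)
for en, es in parallel:
    print(f"{en}\t{es}")
```

Each resulting pair is one aligned sentence in two languages, which is the kind of raw material a machine translation system would train on. Real subtitle tracks would likely need fuzzier matching, since translators may split or merge cues.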
The system is launching with 375 translations, representing 42 languages. Some extremely popular talks, like Al Gore’s talk on climate change, are available in over twenty languages – others are available just in English and one other language. What’s remarkable to me is how many of the talks were translated by volunteers – 200 of the first 300 translations posted, and June tells me that 450 volunteer translations are in the queue and will launch soon. She calculates that if TED had to pay for those translations, the 650 underway would have cost roughly $500,000. While that sum might be something sponsors, like Nokia, which is the lead sponsor for the translation project, might have been able to cover, June estimates the cost of translating all TED talks into 40 languages at over $13 million. Achieving what TED really wants to accomplish – all talks in 300 languages – would cost over $100 million. It’s simply not possible to take on a task of that size without trying a social translation approach.
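Those figures hang together reasonably well, and a quick back-of-the-envelope check in Python shows how. The only inputs are the numbers quoted above: roughly $500,000 for 650 translations and over $13 million for 40 languages. The implied number of talks is my own inference from those two figures, not a number TED has given, and the helper name `total_cost` is just for illustration.

```python
# Figures quoted in the post.
quoted_cost = 500_000          # dollars for the translations underway
quoted_translations = 650
cost_40_languages = 13_000_000  # lower bound quoted for 40 languages

# Average cost of one translated talk, roughly $769.
cost_per_translation = quoted_cost / quoted_translations

# Inference (not a quoted figure): how many talks the $13 million estimate implies.
implied_talks = cost_40_languages / (40 * cost_per_translation)


def total_cost(talks: float, languages: int, per_translation: float = cost_per_translation) -> float:
    """Cost of translating every talk into every language at a flat rate."""
    return talks * languages * per_translation


estimate_300 = total_cost(implied_talks, 300)

print(f"per translation: ${cost_per_translation:,.0f}")
print(f"implied talks: {implied_talks:.1f}")
print(f"300 languages: ${estimate_300:,.0f}")
```

Scaling the 40-language figure to 300 languages gives about $97.5 million. Since $13 million was quoted as a lower bound, that is in line with the “over $100 million” estimate.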
Why are people queueing up to translate TED talks for free? The system June and TED have launched leverages some of the lessons we’ve learned about social translation:
– Translation can be fun, if the content’s enjoyable. There aren’t a lot of people lining up to translate UN internal memos for free. (According to some estimates, transcripts of UN meetings can cost as much as $8000 an hour to produce, leading to an organizational translation budget of $100 million per year.) But TED talks are fascinating to a wide audience, and some people are excited about investing the time to translate them.
– Choice matters. On Global Voices, we don’t attempt to translate every story into every language – we let translators choose what stories they’re interested in. We don’t get a complete edition of our content, but we wouldn’t have such great participation if we assigned specific stories to translators. My guess is that TED is seeing a similar phenomenon, and that translators will initially gravitate to a small set of highly popular talks, then start translating talks that meet their personal interests over time.
– Translators need recognition. On the TED site, translators are some of the most prominently featured people on the page – click through on the translator or reviewer’s name, and you get a page featuring her photo, her work and recognizing her contributions. On Global Voices, we try to feature authors and translators equally – that model doesn’t make as much sense for TED, where the speakers are often celebrities, but it’s clear that TED is taking the translator’s role very seriously and honoring the contributions.
– Community matters. Our translators have the same sort of internal communications systems that our authors do – they divide up tasks, consult each other for assistance and support, and generally function as a tight community. My guess is that language communities are going to emerge on TED in much the same way, and that the translator/reviewer mechanism is going to be critically important for building support, friendships and communities.
– Not all rewards are (directly) financial. GV rewards its most productive translators with travel funding to help them attend our annual meetings. I wouldn’t be surprised to see TED try something similar if they’re able to secure the funding. And we’ve found that translators use their GV experience as evidence that they are competent professional translators and gain more professional translation work from their association with us – again, I’d expect to see something similar with TED. My guess is that prominent translators in the TED community will also become “go-to” guys and gals for TEDsters who are looking for contacts in Turkey or Poland.
I’m really excited about TED’s project for two reasons. One is that it’s great to see an organization I respect and admire adopting and improving on a strategy we’ve embraced at Global Voices. June and I had coffee in NYC a couple of weeks ago, and when she told me that the translations produced by volunteers were frequently better than those produced by professional translation agencies, I was so happy I gave her a high-five. It makes perfect sense to me – translators motivated by pride, community support and interest might well do a better job than those just collecting a paycheck.
I’m also thrilled because TED operates on a very large stage, and their embrace of social translation sends a message to organizations and projects around the world who are considering whether and how they tackle issues of language. Because translation is historically difficult and expensive, most organizations have simply avoided it, except when absolutely necessary.
The internet is huge, growing, and being built by people who speak hundreds of different languages. There are editions of Wikipedia in over 200 languages, and some scholars estimate that there’s as much user content created in Chinese as there is in English. Unless we find scalable, inexpensive ways to translate, we’re each going to face an internet that grows every day while we can understand less and less of its content. Until we figure out better solutions to translation, we’re fooling ourselves into believing we’re more cosmopolitan and connected than we actually are.
Social translation isn’t the only solution, and it won’t solve the problem by itself. But it’s a great first step, and TED deserves real congratulations for building this great tool and bringing this strategy to global prominence… and for its commitment to the values of connection and bridging that underlie its effort to make this information available around the world.
I suppose that the next step is to switch the comment timeline by language when the user switches the subtitles, no?
Because this change at TED is amazing now, but in a few weeks it’s possible that we, the users, will build another Tower of Babel in the comments panel.
It’s a great point, LuisCarlos. I suppose one response would be to allow people to translate comments as well. It’s harder, since they could be written in any language, not just English… but conquering Babel has to start somewhere.
I’d also use terms such as localization and contextualization along with or instead of just translation, because that’s what we actually do on GV Lingua and what the polyglot Net requires…
The ability to find and include local references, links to blogs/resources, relevant footnotes, etc. is equally crucial at the social level. It is even more fun and rewarding for volunteers, and it fosters actual communities, as compared to just ‘professional translators’.
I think that’s absolutely right, Bernardo. When I talk about GV’s work, I usually talk in terms of a three part model: filtering, translation and contextualization. In the case of TED, it’s probably going to be much heavier on the translation and lighter on the context… but that depends quite a bit on the subject and the language pair. I’ve watched Ndesanjo Macha translate complex technical talks into Swahili, coining terms as he goes to describe concepts that haven’t previously been expressed in the language. That sort of work is far beyond the sorts of translation you can hand to a machine or outsource… :-)
Great news, it will be exciting to see how this develops.
A thought about social/community translation: even if everything on the web was suddenly magically translated perfectly (e.g. by MT), would people all suddenly go out and read it? From my experience in GV, the answer I think is (for the most part) NO.
Language barriers correspond with cultural differences, differences in world views, etc. and in fact them being there creates a kind of “speed bump”, which in a positive sense means that only the “interesting” stuff gets translated. That’s a *good thing* IMHO — we have enough information as it is, we need more natural filters, and language barriers are the most effective ones I know of. The fact that someone has put the time and effort into translating something means that it might be worth reading: certainly takes more effort than the 2 seconds to post a link to Twitter.
Nice post, Ethan. I want to follow on the point Chris makes in the comment above. Definitely, translation is a radically important measure of attention. Not only does it represent a huge personal effort placed behind a piece of content (a huge commitment to the ideas and the author) but it is also the most explicit way of moving content between definable (and radically diverse) networks. We at Meedan are excited to see ‘social translation’ starting to get buzz; we are even more excited to start tracking translation metadata and get a picture of which ideas and voices are truly bridging the huge, diverse and still acutely under-informed internet population.
Quote: The cost of machine translation has fallen from cheap to free, with powerful systems incorporated into Google and other search engines… but the results are far from perfect, and tend to miss the nuance of complex texts.
Subtle nuances are good for literature. Subtle nuances are not good if you want readers (or listeners) to understand your message clearly.
For clearly-written texts that do not contain subtle nuances, machine translation gives acceptable results. See ‘Evaluation of international English and machine translation’ (http://www.international-english.co.uk/mt-evaluation-details.html).
Mike, I see that your group is promoting a simplified version of English for use in translation and international communication. While I think that’s a worthy project, the work I’m talking about here has to do with making existing material online available to a global audience. Unfortunately, blogs, newspaper articles and TED talks all routinely include levels of nuance that resist machine translation.
For instance, here’s a “translated” section from a Taiwanese newspaper, talking about the Global Voices translation project:
“”Global Voices” in the face of pure English in Taiwan, Wei Zheng big boys have changed after the entry, he automatically will be in English translated into Chinese so that more people watch, headquarters after it knows very encouraging, “the world’s translation program “in 2007, the implementation of interpretation by the students of National Taiwan Normal University Institute of latitude as the best money managers, has been 22 kinds of languages, translation of 104 volunteers.”
(see http://is.gd/Bb3B for the original)
That’s not a subtle literary text – that’s a straightforward newspaper article. If I want to understand what’s being talked about in Taiwan or Tibet, I need human beings to help translate, or vastly better automated tools.