How big’s Wikipedia? Around 23 meters, near as I can tell.

I’ve been reading UserFriendly since the third or fourth day it’s been online. JD “Illiad” Frazier was good enough to feature Geekcorps prominently on the site, helping us recruit dozens of volunteers. But I read less out of loyalty and more because he periodically makes me laugh so hard I fall out of my chair.

The cartoon is a reference to a recent Reuters story where Wikipedia founder Jimmy Wales talks about plans to create a static version of the encyclopedia for distribution in the developing world.

This isn’t exactly a breaking news story: Jimmy’s said from the start of the project that he was trying to create a free encyclopedia that could be distributed to people offline. There’s a page on Wikipedia – Pushing to 1.0 – on the topic which has been around since last November. And the project seems to focus, primarily, on creating a static version that could be distributed on CDROM or DVD, and less on partial paper versions.

(If Wikipedia were printed out, it would create a pile of pages roughly 23 meters high. According to a mid-2003 study, the average article on Wikipedia was 2,100 characters, or roughly 2KB. There are currently 804,000 English language articles, so that's about 1.6 million KB of text. Assuming 600 words to a printed page, we’re talking 3-4 KB per page. Let’s call it 7KB for both sides of a printed sheet. We’d need 229,714 printed sheets to contain the English edition. A ream – 500 sheets of paper – is roughly 5cm thick, giving us roughly 10 sheets per millimeter, or about 23 meters for the whole stack. The comprehensive edition, containing 1.8 million articles in 100 languages, is roughly 51 meters high…)
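For anyone who wants to check the back-of-the-envelope math, here is a minimal Python sketch of the same estimate. The function and constant names are my own, and the figures (2KB per article, 7KB per double-sided sheet, 10 sheets per millimeter) are just the rough assumptions from the paragraph above, not measured values.

```python
# Rough assumptions taken from the estimate above (not measured values).
KB_PER_ARTICLE = 2    # average article size, per the mid-2003 study
KB_PER_SHEET = 7      # text on both sides of one printed sheet
SHEETS_PER_MM = 10    # a 500-sheet ream is roughly 5 cm thick


def sheets_needed(articles):
    """Number of double-sided sheets needed, rounded to the nearest sheet."""
    return round(articles * KB_PER_ARTICLE / KB_PER_SHEET)


def stack_height_m(articles):
    """Height of the printed stack in meters."""
    millimeters = sheets_needed(articles) / SHEETS_PER_MM
    return millimeters / 1000


if __name__ == "__main__":
    for label, count in [("English edition", 804_000),
                         ("All languages", 1_800_000)]:
        print(f"{label}: {sheets_needed(count):,} sheets, "
              f"about {stack_height_m(count):.0f} m high")
```

Swapping in a different article count (or a different guess for KB per sheet) gives the stack height for any other edition.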

Illiad’s a little off when he references a single page in a local language…but not by much. The Kiswahili edition has 87 articles… Bambara just reached 68… Lingala’s up to 144.

Does this mean that Wikipedia’s a failed experiment in African languages? Probably not. Using the ‘net in Africa right now requires pretty good English or French language skills – while projects like Translate.org.za and Jambo Open Office are creating browsers and other tools in local languages, the vast majority of African net users at present are bilingual or multilingual. We’ve seen, in the Middle East Wikipedia community, that many speakers who are bilingual in English and Arabic are contributing to the English Wikipedia rather than the Arabic one, because the Arabic Wikipedia is so small, and there’s a perception that for the work to be recognized, used and appreciated, it needs to be on the widely used English encyclopedia.

It’s going to be interesting to follow the development of the Swahili Wikipedia. There’s a small but passionate group of Swahili language bloggers, and one of the key figures in that community, Ndesanjo Macha, has gotten involved with the Wikipedia project. If the Swahili community can mobilize to bring the Swahili Wikipedia to a thousand or ten thousand entries, then perhaps it makes a good deal of sense to distribute it on CD or paper to students who wouldn’t otherwise have affordable reference materials in their native language.

7 thoughts on “How big’s Wikipedia? Around 23 meters, near as I can tell.”

  1. I was blown away by the size of the Ido encyclopedia, Steven – pretty amazing!

    At 29,435 articles, my calculations say you should be able to fit the Esperanto version onto 8,410 sheets of paper. That’s a thick stack, but under a meter – you could start carrying it with you wherever you go… :-)

  2. Ethan, I’m puzzled. Why is this a good thing? Wikipedia’s deficiencies are well documented. Promoting Wikipedia in Africa is only likely to increase information asymmetries and inequality.

  3. I’ve been one of the many folks posting criticisms of Wikipedia as well on the issue of systemic bias – you can see my earlier post on Ghana and GSM and an earlier post on systematic bias in Wikipedia.

    The response I’ve gotten from the community – and from Jimmy, personally – is pretty simple: “Come help us out”. I thought this was a cop-out at first… increasingly I’m inclined to think that it’s the right answer. When I complain about bias in mainstream media, there’s not much I can do – or help my friends in the developing world do – to correct that bias. With Wikipedia, I’m able to show friends how to create and edit entries, correct mistakes and open new topics that aren’t being well covered.

    My support for Wikipedia is basically parallel to my support for blogging in the developing world. Yes, blogs are biased towards tech issues and against the developing world. Yes, Wikipedia’s lots weaker on Africa than on Europe. But in both cases, these are problems we’re invited to help fix.

    I also think it’s worth noting that this is probably the only option. Britannica could step up and offer their encyclopedia for free in Africa – I don’t see them doing it any time soon. I think Wales has the right idea and the right motivation, but that it’s going to take a lot of work before Wikipedia is ready for a print edition for Africa.

  4. Jimmy Wales has pointed out that 50% of all edits to the English Wikipedia are done by just 0.7% of users (615 people), and that less than 2% of users (about 1500 people) have written nearly three quarters of it – so that, in his words, “500-1500 people wrote the English Wikipedia”. The point is, a pretty small number of dedicated people can produce a pretty impressive resource. You don’t need tens of thousands of contributors, and even a smaller language community can do it if there’s leadership and inspiration.

  5. It seems to me that the idea of a print edition of Wikipedia for Africa is really backwards thinking — paper is far too expensive! Where I live in the UK I can buy a solar powered, scientific calculator for £1 (~$1.50), a price which practically everyone can afford. It seems to me that this kind of machine is not a million miles away from a solar powered, electronic book.

    Based on your own calculations (804,000 * 4KB), the answer to the question “How Big’s Wikipedia?” is actually about 1GB (about 3GB uncompressed), or put another way, about the size of a scientific calculator! I am not really suggesting that you could add 1GB into a scientific calculator without significantly increasing the cost of the thing based on today’s memory prices. But I am suggesting that this will *soon* be the most realistic way of distributing Wikipedia, Project Gutenberg, and an English learner to the school children of the world. Moore’s Law says so.

    By the way, for the purposes of this device the estimate of 3-4K a page is no longer a fair one since we are now back to wanting a hyperlinked document, rather than a purely textual one. A lean binary HTML representation would go a long way in reducing Wikipedia’s currently bloated page sizes, and might make life easier on the poor calculator. Regardless of the required size though, Moore’s Law will make short work of it.
