Derek Lackaff and Alexander Halavis presented some interesting comparative research under the title “Sins of Omission?” Halavis explained that they’d planned to lead off with the Colbert video clip, but had been thwarted by Jimmy Wales’s use of it. In the clip Colbert mentions that the article on “Lutherans” is shorter than the article on “truthiness” – Halavis asks whether it’s a problem that Lutherans are underrepresented in Wikipedia.
Will the underrepresentation of some topics be corrected over time? Perhaps the future is here but not evenly distributed? Halavis suggests that many critiques of Wikipedia have focused on factuality – is the information in Wikipedia accurate? There’s another set of questions about sins of omission, not commission: authority is not completely encompassed by accuracy but has to include questions about breadth, the amount of coverage on a particular topic.
Lackaff explains their experiment – they looked at three specialist encyclopedias – in physics, poetry and linguistics – and tried to check for completeness on an article by article basis. This is fairly hard to do – article titles don’t always map neatly. What they did was take the article headers from the encyclopedia, then search Wikipedia using Google to find matching articles. There were some ambiguous and false positive matches, but they felt they could make a reasonable comparison.
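To make the matching problem concrete, here is a minimal sketch of a title comparison. It is not the method Lackaff and Halavis used (they searched Wikipedia through Google); it swaps in a simple local comparison of normalized titles, and the function names and sample titles are hypothetical.

```python
import unicodedata

def normalize(title):
    """Lowercase a title and strip accents and punctuation before comparing."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_accents)
    return " ".join(kept.lower().split())

def compare(print_headwords, wiki_titles):
    """Split print headwords into those with a matching Wikipedia title and those without."""
    wiki = {normalize(t) for t in wiki_titles}
    both = [h for h in print_headwords if normalize(h) in wiki]
    print_only = [h for h in print_headwords if normalize(h) not in wiki]
    return both, print_only

# Hypothetical sample data, for illustration only.
headwords = ["Sonnet", "Haiku", "Négritude", "Terza rima"]
wiki_titles = ["sonnet", "Negritude", "Haiku (poetry)"]

both, print_only = compare(headwords, wiki_titles)
print("both:", both)
print("print only:", print_only)
```

In this toy data "Haiku" lands in the print-only list because the hypothetical Wikipedia title carries a disambiguating suffix, which is the sort of mismatch that makes an article-by-article comparison hard.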
Of the linguistics articles checked, 424 could be found both in Wikipedia and in print, while 112 were found in print only. In physics, 399 were in both, 89 in print only. In poetry, 551 were in both, 330 in print only. On the one hand, this is pretty impressive – Wikipedia has pretty good coverage of topics from specialist encyclopedias, books we’d expect to have more esoteric topics. On the other hand, we could argue that Wikipedia falls short, with 37% “missing” entries in the poetry encyclopedia.
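The “missing” share for each field follows directly from those counts. A quick sketch of the arithmetic, assuming the share is print-only entries divided by all print entries checked (the helper name is mine):

```python
# Counts reported in the talk: (found in both, found in print only)
counts = {
    "linguistics": (424, 112),
    "physics": (399, 89),
    "poetry": (551, 330),
}

def missing_share(both, print_only):
    """Fraction of the print encyclopedia's entries with no Wikipedia match."""
    return print_only / (both + print_only)

for field, (both, print_only) in counts.items():
    print(f"{field}: {missing_share(both, print_only):.1%} missing")
```

On these numbers poetry comes out at roughly 37%, matching the figure quoted, while linguistics and physics land at about 21% and 18%.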
Looking at results a different way, Wikipedia comes out much better – searching for topics that fall under the header of “linguistics”, we find 12,130 in Wikipedia versus a few hundred in the whole linguistics encyclopedia. Because Wikipedia is a generalist encyclopedia, not a specialist one, it’s going to have a broader remit. And Wikipedia has no physical limits, no need to fit everything within the covers of a book.
There are some fascinating questions to ask in this direction: does it matter whether Wikipedia has equally full coverage in every field? Does coverage of some fields need to be more complete? Halavis notes, “We’re already covering Finnish profanity better than any other encyclopedia in print” – is this a good thing, or a sign of weaknesses in other parts of the encyclopedia?
Jim Giles is a reporter for the journal Nature who was responsible for the article that compared the accuracy of articles in Encyclopedia Britannica and Wikipedia. He’s very clear that their piece was a work of journalism, not a peer-reviewed study…
(Peer review is a topic that comes up in the first talk as well – Wikipedia has been challenged on the basis that it’s not peer reviewed. Obviously, it is – “but we don’t like the peers”. In other words, it’s not PhD reviewed…)
Giles was inspired by an article in the Guardian titled “Can you trust Wikipedia?”. He noticed that many of his colleagues at Nature were using Wikipedia, but were very uncertain how much to trust it. He and his colleagues decided to test it, using the Encyclopedia Britannica as a benchmark.
They selected 50 articles on scientific topics, looking for appropriate EB and WP articles. They assigned each pair – stripped of identifying information – to an expert and asked them to look for inaccuracies. A team of Nature referees reviewed the expert judgements, made a final call and tallied the results.
They got 42 usable comparisons, which revealed 3.9 errors per article in WP and 2.9 in EB. There were very few major errors in either source – most were statements that were somewhat misleading or inaccurate.
Britannica wrote a rather strong refutation, a twenty-page paper titled “Fatally Flawed”, which demanded a retraction by Nature. This was the result of a long process with EB, which had first asked for corrections, then demanded to review data, before publishing its refutation. They offered four major criticisms:
– The data wasn’t completely released
– The comparison hadn’t used the correct EB sources
– The EB material was edited
– The reviewers made mistakes
Giles responds to these in turn: it wasn’t possible to release the referee reports to EB because Nature had promised anonymity to reviewers. In some cases, searches on the EB website turned up content from sources other than the main EB volumes – the comparison used these articles, because it was an attempt to compare the EB source a user would most likely use, not the totality of EB content. The “editing” of EB articles involved combining two EB articles – on “ethanol” and “the uses of ethanol” – which allowed for a better comparison with WP. And any mistakes the reviewers made would likely affect both sources.
Giles mentions that there are other questions the study didn’t ask. Should EB be used as the benchmark for checking Wikipedia? Lots of the things Nature people wanted to study weren’t in EB. How do we compare omissions – what’s missing in Wikipedia versus in EB? How important is timeliness? Is EB more likely to be stale because it can’t change as quickly? (This wasn’t a major factor in the 42-article comparison.) And would a future study evaluate style and clarity? (Giles says that several reviewers found the Wikipedia articles difficult to read due to stylistic decisions.)