The architecture of serendipity

June 9, 2008

Law professors Cass Sunstein and Eugene Volokh brightened my drive to Harvard last week with a dialog about “the architecture of serendipity”. Sunstein is well-known for his concerns about echo chambers and “media cocoons” that allow citizens to insulate themselves, hearing only the opinions and perspectives of people who agree with them. (He develops this idea at length in Republic.com, later Republic.com 2.0, and to a lesser degree in his excellent Infotopia. My review of Infotopia, if you’re interested.) He’s concerned that the customizability and choice offered in digital media can make it easier for citizens to insulate themselves from the sorts of differing views they need to make informed decisions as citizens.

Volokh, one of the smartest bloggers out there, believes that Sunstein overestimates the diversity of old media. He argues that most newspapers have such a strong center-left bias that they serve as an ideological cocoon, and suggests that blogs invite people to break free of ideological bias by linking to the pieces they critique. So far, so good – this is a well-trodden path, on both sides of the argument.

It gets more interesting, in my opinion, when Sunstein starts defending old media, invoking “the architecture of serendipity”. (Here’s a clip on the NYTimes website, an excerpt from the longer dialog on Bloggingheads.tv, which includes this argument.) He argues that the “daily newspaper, when it’s working well, builds in the architecture of serendipity.” It’s designed to draw the eye to a headline or story that you might not otherwise encounter, hoping to capture your focus and draw you into a story you didn’t know you were interested in, but which gives you information that changes your worldview.

My regular readers know that I’m interested in serendipity as one of the tools that can help combat homophily, the tendency of birds of a feather to flock together, and to share their preferred sources of information, often at the expense of other voices and sources of information. But it’s difficult just to identify good examples of serendipity, and much harder to figure out how to engineer it. It’s worth looking closely at newspapers as tools that try to generate serendipity, and to ask questions about whether we’re losing this function in a move from paper-based to digital media.

Today’s New York Times has six major stories and seven minor stories on the front page. The major stories include headlines, large blocks of text and, in two cases, photos or graphics. They carry substantial hooks to interest a reader – 200-400 words of text, plus images, designed to convince a reader to a) buy the newspaper and b) read the body of the story. The seven stories at the bottom of the page include 17-48 words of text as hook, and three include pictures. Count every mention of a page inside the edition you could turn to – the paper equivalent of a hyperlink – and there are 23 links a reader can follow from the front page.

The contrast to the online edition of the Times is pretty stark. Just counting possible links (using a search for anchor tags in the source HTML), there are 423 other webpages linked from the front page. A more careful count, ignoring ads, links to RSS feeds and links to account tools for online readers, gives 315 content links, possible stories or sections a reader could explore from the front page. While there are almost 14 times as many pages for a reader to explore, they’ve got much less information on what links to follow: while twelve stories have text hooks, the wordcount ranges between 10 and 26 words. While there’s a good chance one of those stories might convince you to click on it, you won’t start reading it on the front page, the way you might with the 200-400 word stories in the paper edition. (There are lots more images to choose from – 15, one of which is a video – in contrast to the seven images on the paper front page.)
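For anyone who wants to repeat this kind of count, here is a minimal sketch using only Python's standard library. It is not the script behind the numbers above. The `NON_CONTENT_MARKERS` list and the sample HTML are hypothetical stand-ins, and a real page would need its own rules for what counts as an ad, a feed or an account tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCounter(HTMLParser):
    """Collect the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


# Hypothetical path fragments that mark a link as "not content".
NON_CONTENT_MARKERS = ("/ads/", "/rss", "/login", "/account")


def count_links(html):
    """Return (all links, links left after dropping non-content ones)."""
    parser = LinkCounter()
    parser.feed(html)
    content = [
        href for href in parser.hrefs
        if not any(marker in urlparse(href).path.lower()
                   for marker in NON_CONTENT_MARKERS)
    ]
    return len(parser.hrefs), len(content)


# Made-up front page fragment, for illustration only.
sample = """
<a href="/2008/06/09/world/asia/india.html">India</a>
<a href="/rss/homepage.xml">RSS</a>
<a href="/login">Log in</a>
<a href="/pages/business/index.html">Business</a>
<a name="top">anchor without a link</a>
"""

total, content = count_links(sample)
print(f"{total} links, {content} of them content links")
```

The first number corresponds to the raw anchor-tag count, the second to the "more careful" count. Which links get filtered out is a judgment call, so two people counting the same page may not agree exactly.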

Okay, so the paper gives 7% as many options to the reader as the online edition does, though it provides up to 20 times as much text to get a reader invested in a story. So what? And isn’t this just a function of what each medium is good at? If the paper edition of the New York Times could support hyperlinks, wouldn’t there be 300 on the front page? (And if computer monitors were as eye-friendly as printed paper, wouldn’t the Times website feature lots more text?)
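The arithmetic behind those figures is easy to check. The sketch below uses only the counts quoted in this post. The assumption that "20 times" compares the shortest print hook with the shortest online hook is mine, and the variable names are illustrative.

```python
# Counts quoted in the post.
print_links = 23          # page references on the paper front page
web_links = 315           # content links on the online front page
print_hook_words = (200, 400)  # range for the major print stories
web_hook_words = (10, 26)      # range for the online text hooks

share = print_links / web_links      # paper options as a share of online
multiple = web_links / print_links   # online options per paper option

# Assumed pairing: shortest print hook against shortest online hook.
short_vs_short = print_hook_words[0] / web_hook_words[0]
# For comparison: longest print hook against longest online hook.
long_vs_long = print_hook_words[1] / web_hook_words[1]

print(f"paper offers {share:.1%} as many links")
print(f"online offers {multiple:.1f}x as many links")
print(f"hook length ratio: {short_vs_short:.0f}x (short) "
      f"or {long_vs_long:.1f}x (long)")
```

Other pairings of the hook lengths give other ratios, so the text figure is best read as a rough order of magnitude.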

Newspapers have at least three public-interest functions. They report news, they offer a space for public debate, and they prioritize news for readers. There are powerful online alternatives for those first two functions. I’m starting to get concerned that there’s not much good thinking about that third critical function.

As Sunstein points out in his conversation with Volokh, there’s a much wider range of information available online than there was in the days when old media was the only media. Not only do we have an explosion of citizen media, we’ve now got the opportunity to read newspapers from around the world (including an amazing African collection via AllAfrica.com) and access a much larger wealth of newswire stories than would be available in any newspaper. We haven’t achieved perfect equity in this field – people in wealthy nations have far greater opportunity to read and write online than people in developing nations, and there are a whole lot more small-town American and European newspapers online than websites for African and Indian papers – but it’s hard to make an argument that we live in anything other than a more info-rich and info-diverse environment.

(There is, on the other hand, a good argument to be made that certain types of media, especially investigative journalism and international journalism written by foreign correspondents, are in real danger. You might be interested in a previous post on business models for “difficult journalism“. My sense is that there’s less and less support for difficult journalism, especially at papers like those in the Tribune network, which are facing strong management pressure to decrease the amount of news they report, ensuring parity with advertising.)

It’s also pretty clear that we’re not hurting for spaces in which citizens can express opinions. It’s not easy to get a letter to the editor published in the New York Times, but it’s pretty trivial these days to publish a blog. (See, again, my caveat that this is a whole lot easier to do in Canada than in Cambodia.) And there are new types of news outlets that specialize in amplifying personal opinion, like the Huffington Post, which are able to put some opinions in front of very large readerbases.

With such an embarrassment of riches, you might expect unprecedented diversity from online news sites. You’d be disappointed. Major news aggregation sites like Yahoo News and Google News offer tens of thousands of stories… but there’s a huge amount of overlap and clustering. The Project on Excellence in Journalism, as part of their 2006 State of the News Media, offers “A Day in the Life of the News”, an attempt to look at the entirety of the day’s news in the United States. They report, “The level of repetition in the 24-hour news cycle is one of the most striking features one finds in examining a day of news. Google News, for instance, offers consumers access to some 14,000 stories from its front page, yet on this day they were actually accounts of the same 24 news events.” It’s not that there aren’t more stories available on Google News – there’s tons of deep coverage accessible to anyone willing to search – but that you may be disappointed if you’re relying on Google News to put a story on the front page that you didn’t expect to be interested in but find compelling or useful (my operating definition of serendipity).

You may be disappointed for a different reason by news voting sites like Digg and Reddit. These sites rely on their users to suggest stories, and to vote on which ones should lead the coverage. As a result, these sites provide a lot of stories that may be interesting, if you share the interests (and perhaps the demographics and psychographics) of the reader/editors. But they’re pretty unlikely to surprise you with serendipity – because readers have so much in common (see Whois reddit?, in which community respondents self-report that the site’s users are 92% male, 70% employed in the IT industry or as students, and 70% from the US), they often use the same sources to find stories, and are likely to vote up stories that emphasize certain technical and political viewpoints. (See my post on “Homophily, serendipity and xenophilia” for lots more on this idea.)

And here’s where the 19th century technology of the daily newspaper proves itself to be a very powerful “persuasive technology“. When an editor assigns front-page real estate to a story, she’s telling the reader that these are the stories that demand the most attention and persuading you to read them… or at least read them long enough to decide whether or not they trigger your serendipity switch. Many newspapers have a convention of putting the biggest stories of the day “above the fold” and saving the bottom of the front page for important local stories and for “the serendipity box”, a place on the page for a story that might escape your attention if the editor didn’t feature it.

The Times, to their credit, sometimes treats half the front page as an opportunity to drive readers to stories they probably don’t know they’re interested in. Today’s largest story, in terms of page real estate, is “Inside Gate, India’s Good Life; Outside, the Servant’s Slums”, a story about class divides in modern, urban India. It’s certainly not breaking news, and it doesn’t have much to connect it to the day’s news agenda. But it’s a lovely piece of storytelling – the key factor David Weinberger identifies in getting people to pay attention to developing world news. Being on the front page works – the India story is the #3 story emailed from the NYTimes site today, suggesting that a large number of people found it compelling enough to pass it on. (Story #8 is a front page story from yesterday on South Korean mothers moving their children to Australia or New Zealand for better educational opportunities, another classic serendipity box story.)

A possible (counterintuitive) conclusion is that more choice might mean less serendipity. It’s probably possible for you to read the six major stories on the New York Times front page, and might be possible for you to read the 13 the editor chose to feature today. I don’t care who you are, you’re not going to read the 315 stories linked to from the Times’s online page. Navigating that page requires a great deal of personal choice – you surf through and pick the topics that are of interest to you… which may mean you filter out topics you don’t know you’re interested in, or topics you’re actively uninterested in, which might capture your attention in a moment of serendipity.

In the paper edition, you’re trading choice for trust. It’s harder to find precisely the stories you know you want, but you’ve got the opportunity to let the editor surprise you. It isn’t always the case, but the most surprising story I encounter in a given day is often something put forward by the “Old Gray Lady“.

If it’s possible to engineer serendipity with nineteenth century technology, it’s certainly possible with the resources we have today. But it’s not easy. Most recommendation technologies – the algorithms Amazon or Netflix use to suggest what movies you might watch next – are a form of collaborative filtering. These systems take information about your preferences (either the movies you tell them are your favorites, or the ones you’ve expressed interest in by purchasing) and use this information to find other users who’ve expressed the same preferences. Then they recommend items that those other users have liked and that you haven’t expressed a preference about.
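To make the idea concrete, here is a toy sketch of user-based collaborative filtering. It is a simplified illustration, not how Amazon or Netflix actually do it. The users, the movie lists and the choice of Jaccard overlap as the similarity measure are all my assumptions.

```python
def jaccard(a, b):
    """Overlap between two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(target, likes, top_n=3):
    """Suggest items liked by similar users that the target hasn't rated."""
    mine = likes[target]
    scores = {}
    for user, theirs in likes.items():
        if user == target:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0:
            continue  # no shared taste, so this user has no vote
        for item in theirs - mine:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


# Made-up preference data.
likes = {
    "you": {"Alien", "Blade Runner"},
    "ann": {"Alien", "Blade Runner", "The Thing"},
    "bob": {"Alien", "Amelie"},
    "cat": {"Amelie", "Chocolat"},
}

print(recommend("you", likes))
```

Notice who never gets a say. "cat" shares nothing with "you", so "Chocolat" can't be recommended, however good it is. That is the homophily problem in miniature.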

My friend Nathan Kurz, who’s turned his substantial brainpower to this topic more than once, argues that these systems aren’t about recommendation, but about prediction. The sorts of systems Netflix is seeking via its Netflix Prize do a good job of making consistent, safe recommendations of stuff you’re predisposed to like, penalizing systems that take a risk to try and recommend stuff you’d never heard of and are going to love. Quoting Nate:

Predict how well the user will like each of the items in the dataset, and recommend the items with the highest predicted values. And since Root Mean Square Error is easy to measure (and hence easy to write papers about) this is what many algorithms try to optimize.

The problem with this is that it tends to produce the safe recommendations in the user’s comfort zone, rather than the risky recommendations that might expand their horizons. But the solution to this is not to use this same prediction system and randomize the results, but to design a system based around recommendations rather than around predictions. Instead of predicting what is most likely to be liked, give the recommendations most likely to be loved.
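One way to see the difference Nate is describing is the toy sketch below. The ratings are invented, and "share of five-star ratings" is only my stand-in for "most likely to be loved", not something Nate proposes.

```python
import math


def rmse(predicted, actual):
    """Root Mean Square Error between predicted and actual ratings."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mean_rating(ratings):
    """The single prediction that minimizes RMSE for these ratings."""
    return sum(ratings) / len(ratings)


def love_rate(ratings, love=5):
    """Share of raters who gave the top score (an assumed proxy for 'loved')."""
    return sum(r >= love for r in ratings) / len(ratings)


# Made-up ratings from users with tastes similar to yours.
neighbour_ratings = {
    "safe pick": [4, 4, 4, 4, 4],    # everybody likes it, nobody loves it
    "risky pick": [5, 5, 5, 1, 1],   # most love it, some hate it
}

by_prediction = max(neighbour_ratings,
                    key=lambda item: mean_rating(neighbour_ratings[item]))
by_love = max(neighbour_ratings,
              key=lambda item: love_rate(neighbour_ratings[item]))

for item, ratings in neighbour_ratings.items():
    guess = [mean_rating(ratings)] * len(ratings)
    print(item, "mean:", mean_rating(ratings),
          "rmse of that guess:", round(rmse(guess, ratings), 2))
print("highest predicted rating:", by_prediction)
print("most likely to be loved:", by_love)
```

A system scored on RMSE is rewarded for the safe pick, where its prediction is never far off. The risky pick costs it error even though more than half the raters gave it top marks.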

Nate’s got some good theories about how to build systems that engineer serendipity, but they largely boil down to matching people to people. Expand that set of people you’re taking recommendations from to include people who share some interests, but live in a different information universe, and you’re likely to diversify your recommendations. Find someone who shares your interest in early 1980s techno music, but lives in Lagos, and you’re likely to find some serendipitous recommendations.
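A rough sketch of that matching idea is below. It is my own illustration, not Nate's design. The people, their interests and their news sources are invented, and "information universe" is approximated crudely as the set of sources someone reads.

```python
def jaccard(a, b):
    """Overlap between two sets: shared items divided by all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def bridge_neighbours(target, people, min_shared=1):
    """Rank people who share an interest with the target,
    putting those whose sources overlap least with the target's first."""
    me = people[target]
    found = []
    for name, other in people.items():
        if name == target:
            continue
        shared = me["interests"] & other["interests"]
        if len(shared) < min_shared:
            continue  # nothing in common, so no basis for trust
        overlap = jaccard(me["sources"], other["sources"])
        found.append((overlap, name))
    return [name for overlap, name in sorted(found)]


# Hypothetical profiles.
people = {
    "you": {"interests": {"1980s techno", "us politics"},
            "sources": {"paper a", "site b"}},
    "boston_fan": {"interests": {"1980s techno"},
                   "sources": {"paper a", "site b", "paper c"}},
    "lagos_fan": {"interests": {"1980s techno", "afrobeat"},
                  "sources": {"paper x", "site y"}},
    "stranger": {"interests": {"cricket"},
                 "sources": {"paper z"}},
}

print(bridge_neighbours("you", people))
```

A standard collaborative filter would rank the Boston fan first, because they look most like you. This sketch reverses that ordering among people who share at least one interest, so the Lagos fan comes first.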

Sunstein’s proposed solutions for architecting serendipity are also pretty human-focused. He recommends that bloggers make a conscious effort to link (civilly and politely) across ideological lines and that both bloggers and blog readers should monitor their media consumption to ensure they’re diversifying their inputs. In other words, the move into digital media may shift the responsibility for finding serendipity from editors to readers. It’s hard to know whether this will happen – as Volokh observes, “to the extent that this is a problem, it’s a problem that’s a result of basic human failings, and that freedom and extra choice will reinforce those failings…”

As with most basic human failings, you’ve got to accept that something’s a problem before you can address it. There’s been a lot of celebration and self-congratulation about the diversity of voices that we’re able to hear in this new medium. (I’m guilty as charged on those scores.) It’s worth thinking about whether we’re doing as good a job of discovering new voices as we are at raising our own voices.

Bonus links:

– Professor Sunstein offers the idea that a university can serve as a source of serendipity, putting people in touch with people they’d otherwise not have the chance to interact with.

– On the subject of people you’re probably not interacting with, David Sasaki, managing editor of Global Voices’s Rising Voices initiative, goes into a maximum security prison in Kingston to visit his grantees, a group of Jamaicans learning to blog in prison.

– Sometimes it’s the pictures, not the words, that catch our eye. Jen Brea interviews blogger Cedric Kalonji about his astounding photoblog which documents daily life in Kinshasa.

26 thoughts on “The architecture of serendipity”

Kevin Donovan June 9, 2008 at 7:00 pm

I think it is important to note two types of serendipity – variety of view and variety of topic. They may or may not go together in a given system, but they are both desired.

Amazon recommendations don’t provide much variety of topic. They recommend (predict) the same genre over and over. What they do well, though, is recommend a variety of view – perusing development aid books, the user is recommended both Sachs and Easterly.

Insular political blogs have neither a variety of viewpoints nor a variety of topics.

In architecting serendipity, I think it is important that both topics and viewpoints are diversely represented.

David Sasaki June 9, 2008 at 10:04 pm

Universities do have the potential to inspire serendipitous encounters, but because of their vast scale, I find that student populations still tend to break down by ethnicity, ideology, and taste. Most liberal arts colleges, which are small enough to force us out of our comfort zones, tend to have more homogeneous populations than their public counterparts.

I’m writing this comment from an even better place for serendipity – a hostel in South America. The rise of hostels throughout the world – and the special social interactions they often draw out – is very much deserving of an ethnography.

What remains clear to me after reading this post is that we need more research. It is all well to say that the format of a traditional newspaper, compared to the index page of the New York Times, contributes to more diverse, more ‘serendipitous’, reading. But do the numbers actually prove it? This, of course, would be very easy to measure online – you just look at outbound clicks from the New York Times home page.

But how do we know who reads what after scanning over the front page of the print publication? Just because the serendipity is there doesn’t mean that it is leading to more serendipitous reading. I don’t think this is a question you can answer with surveys – it’s very chic to appear worldly and I think survey respondents would self-report to have more diverse reading habits than they actually do. The only thing I can think of would be to pore over hours of Starbucks CCTV footage to see who reads what after scanning the front page. It would be one hell of a research project.

There is one last complexity which gets thrown into the mix. Among my group of friends, you are somewhat of an anomaly in that you seem to read the New York Times starting at the index page. I never ever navigate to the index page. When I read a New York Times article it comes from either my RSS reader, from nytimesriver.com, or a link (blog, facebook, brightkite, twitter.) It wouldn’t be fair to simply compare the front page of the New York Times to the index page of the New York Times website; it should be compared to the entire media ecology which feeds readers to New York Times articles.

Ben Coleman June 10, 2008 at 11:03 am

David:

How about surveying people the day after (or a particular period afterwards) to see what articles they remember from the newspaper?

True, some might falsely claim to have read X article, but if you asked some follow up questions, you could easily find out. People’s memory may be an issue, but you could control for that.

I agree: more research is needed!

Nicholas Laughlin June 10, 2008 at 3:01 pm

I too read the NYT starting at the index or “front” page–and until now, David, it never occurred to me to think of this as unusual. It’s part of my morning ritual: get out of bed, make coffee, check email, read newspapers online (I’m lucky enough to work at home, so I can afford to ease myself into the day like this, slowly).

I habitually load up the home pages of, more or less in this order, the Trinidad Guardian, Trinidad Express, Stabroek News (Guyana), Jamaica Gleaner and Observer, NY Times, and London Guardian. And of course I start at the “front” pages of all these papers precisely because I want to see which stories their editors consider most important that day. It has less to do with trusting the editors’ judgement (as to what is the most important event of the moment) and more to do, I suppose, with a curiosity about how current events are being categorised into various levels of newsworthiness. (Does that make any sense?)

I suspect that most people who read newspapers online do it this way–starting at the “front”, even if they quickly navigate away–but again it would be useful to have real statistics.

Ethan, it seems to me that a crucial element in all this is understanding better exactly how and why newspaper editors decide which stories to lead with, either in the printed papers or online–and how do they decide what goes in that serendipity box? Surely some well-funded media studies department somewhere is looking into this? Interviewing the relevant editors from, say the 25 North American papers with the highest circulations to investigate how much is decided by editorial policy, how much by personal interests, how much by other factors–eg pushing a particular story for a prize, or wanting to get the best return from an expensive investigative report, or even something as “trivial” as pushing a story up because it has a particularly clever headline.

Speaking of clever headlines–I for one have clicked on a link for a story on a subject I had no prior interest in simply because the headline was witty, or because the accompanying photo caught my eye, or even because I recognised the reporter’s name in the byline. All these things complicate the questions you’re asking.

The entire subject is of course vitally fascinating and I read all your homophily/serendipity posts with the greatest interest.

Nairobian Perspective June 12, 2008 at 4:40 am

It’s refreshing to note that the growth of the blogosphere is sparking such intellectual discourse, especially in relation to engineering serendipity, which in essence is discovery by accident! Your articles stir interest and raise questions that may lead to quite some good content and discourse.

Nathan Kurz June 12, 2008 at 1:56 pm

Hi Ethan —

Thanks for the context into which you put my quote. I think the example of shared musical tastes with someone coming from a different culture is excellent.

I’m still trying to distill my thoughts on why the front page of the NYT succeeds as well as it does on serendipity. I haven’t yet had time to listen to the Sunstein and Volokh interview. But my guess is that it works because the intrinsic quality of each of the front page articles is very high in terms of being ‘compelling reading’, and because the targeted audience of the Times is just broad enough that at least a couple of the articles will be outside the usual interests of a given reader while still within a realm of comfort.

Personally, I gave up on reading the online front page of the Times Online a couple years ago when they did a major redesign that drastically increased the number of links on the front page. Previously the website looked a lot more like the print edition. At the time, I thought it was the ugliness and lack of readability that drove me off, but in retrospect, perhaps it was the loss of focus and bag-of-links approach that left me cold.

Currently, I get the majority of my news via Reddit. I disagree a bit with your assessment of the lack of breadth that one can find there: while the bulk of the readership is fairly homogeneous, there is a lot of breadth in the links if one is willing to dig a bit. You do have to work to find these, though. Normally, I quickly scan through the top 200 links on Reddit more or less daily, and generally find about 10 things that I pop up in a new window based on title and sometimes comments. Of these, I find about 5 that I am happy to have read. Tellingly, I find this 2.5% success rate to be the best means currently available for filtering to find interesting reading. I have hopes of developing a site that can do much better than this.

As to the larger question of how to better engineer serendipity, I fear that one must first answer the even broader question of why people should want to read about world events in the first place. Instead of concentrating on the ‘engineer’ part, concentrate on defining ‘better’. As the Times example points out, better is not synonymous with more. How exactly does greater awareness of world events produce an intersection of personal benefit for the individual and societal benefit for the world? I’m finding it hard to proceed in my thinking about this topic without a better definition here.

Ethan June 12, 2008 at 2:54 pm

Nate, my use of reddit is much like yours, and my success rate is much like yours… and that’s left me feeling deeply worried about homophily effects. The stories I find via Reddit that I’m happy to find are usually around US politics and usually have either a strong libertarian or left slant to them. I’m not finding a lot of meaningful international news via Reddit, and suspect that this is a reflection of the interests and expertise of the people involved in the community.

As you’ve probably figured out, I’m trying to sketch book chapters with blogposts like this one. The argument for the importance of international news – and news in general – is on the to do list. But I agree – defining serendipity involves defining what it is we’re asking news to do for us.