I posted a couple of weeks back about an experiment I was beginning to run, looking at what headlines from the New York Times get “selectively amplified” by weblogs. With the (generous, much appreciated) help of Kevin Marks, my tools are now checking headlines for their inclusion in Technorati.
Anna, commenting on my blog post, observed that my data would be stronger if I reported what percentage of international stories that appeared in the Times were picked up by bloggers. So I’ve reworked the tools to track the categories the Times filed the stories into, as well as the age of the stories.
Headlines that have appeared on the Times website over the last two days are classified into 17 categories (by the Times website’s architects; I’m using the section designation in the story URL). Ten major categories (arts, business, health, international, national, NY regional, politics, science, sports and technology) have nine or more stories each; seven minor categories (books, dining, education, fashion, obits, travel, theatre) have four or fewer.
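A minimal sketch of that classification step, for anyone who wants to try something similar. It assumes story URLs follow a `/YYYY/MM/DD/section/...` layout, which is inferred from the dated links mentioned in this post rather than from any Times documentation. The function name and the example URLs are made up for illustration.

```python
import re
from urllib.parse import urlparse

# Assumed layout: /YYYY/MM/DD/<section>/... (an inference, not a documented format).
_DATED_PATH = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/([^/]+)/")

def section_of(url):
    """Return the path segment that follows the date in a story URL, or None."""
    match = _DATED_PATH.match(urlparse(url).path)
    return match.group(4).lower() if match else None

# Hypothetical example URLs, invented for this sketch.
examples = [
    "http://www.nytimes.com/2004/09/23/politics/example-story.html",
    "http://www.nytimes.com/2004/09/22/international/africa/example-story.html",
    "http://www.nytimes.com/pages/business/index.html",
]
for url in examples:
    print(section_of(url), url)
```

URLs without a date in the path come back as `None`, so undated index pages can be skipped rather than filed under the wrong category.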
Of the major categories, “politics” stories were blogged the most, with 214 links; “technology” lags well behind with 51 posts, followed by “international” and “business” with 43 each. The ranking is a bit different when we consider the number of blogposts per NYT story – “politics” continues to crush the pack with 3.82 posts per story; “technology” and “science” follow at 1.7 and 1.67 posts per story, respectively. “international” falls to 6th of 10 categories, business to 8th. Sorted by the percentage of stories that get blogged, “politics” continues to lead the pack at 78.57% – “health” edges “science” and “technology” for second at 69.23%, and “international” and “business” stay in their places at 6th and 8th.
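The three measures used here (total posts, posts per story, and percentage of stories blogged) can all be derived from one number per story: how many blog posts link to it. The sketch below shows one way to do that aggregation. The `summarize` function and the toy records are hypothetical and are not taken from the real data set.

```python
from collections import defaultdict

def summarize(stories):
    """Aggregate (category, blog_post_count) pairs, one pair per story."""
    grouped = defaultdict(list)
    for category, posts in stories:
        grouped[category].append(posts)

    summary = {}
    for category, counts in grouped.items():
        n = len(counts)
        blogged = sum(1 for c in counts if c > 0)  # stories with at least one link
        summary[category] = {
            "stories": n,
            "total_posts": sum(counts),
            "posts_per_story": round(sum(counts) / n, 2),
            "pct_blogged": round(100 * blogged / n, 2),
        }
    return summary

# Toy records, invented for illustration only.
toy = [("politics", 5), ("politics", 0), ("politics", 1),
       ("sports", 0), ("sports", 1)]
for category, row in summarize(toy).items():
    print(category, row)
```

Posts per story rewards a category with a few heavily linked stories, while percentage blogged rewards breadth, which is why the two rankings can disagree.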
| category | blog posts per story | % of stories blogged | # of stories | mean story age in days | total blog posts |
|---|---|---|---|---|---|
| major categories | | | | | |
| politics | 3.82 | 78.57% | 56 | 1.77 | 214 |
| health | 1.31 | 69.23% | 13 | 2.77 | 14 |
| science | 1.67 | 66.67% | 9 | 2 | 15 |
| technology | 1.7 | 53.33% | 30 | 2.5 | 51 |
| national | 1.54 | 46.15% | 13 | 1.92 | 20 |
| international | 1.34 | 43.75% | 32 | 0.81 | 43 |
| arts | 1.5 | 35.71% | 14 | 3.07 | 21 |
| business | 0.7 | 22.95% | 61 | 0.82 | 43 |
| nyregion | 0.46 | 18.00% | 50 | 0.76 | 23 |
| sports | 0.25 | 15.63% | 32 | 0.84 | 8 |
| minor categories | | | | | |
| obits | 1.5 | 75.00% | 4 | 2.5 | 6 |
| education | 1.5 | 50.00% | 4 | 4.5 | 6 |
| books | 2.5 | 75.00% | 4 | 2.5 | 10 |
| dining | 2.67 | 100.00% | 3 | 1 | 8 |
| fashion | 5 | 100.00% | 1 | 5 | 5 |
| travel | 0 | 0.00% | 0 | 4 | 0 |
| theatre | 0 | 0.00% | 0 | 0 | 0 |
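The rankings quoted earlier can be checked against the major-category figures in the table. This is only a sketch: the numbers are copied from the table as published, and the `ranking` helper is a name made up for the example.

```python
# (posts per story, % of stories blogged), copied from the major-category rows.
major = {
    "politics": (3.82, 78.57),
    "health": (1.31, 69.23),
    "science": (1.67, 66.67),
    "technology": (1.7, 53.33),
    "national": (1.54, 46.15),
    "international": (1.34, 43.75),
    "arts": (1.5, 35.71),
    "business": (0.7, 22.95),
    "nyregion": (0.46, 18.00),
    "sports": (0.25, 15.63),
}

def ranking(column):
    """Category names sorted from highest to lowest value in the given column."""
    return sorted(major, key=lambda name: major[name][column], reverse=True)

by_posts = ranking(0)  # ranked by posts per story
by_pct = ranking(1)    # ranked by % of stories blogged
for name in ("international", "business"):
    print(name, by_posts.index(name) + 1, by_pct.index(name) + 1)
```

On both measures “international” comes out 6th and “business” 8th of the ten major categories.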
The categories most ignored by bloggers are sports (0.25 blogposts per story, 15.63% of 32 stories blogged) and New York regional news (0.46 blogposts per story, 18% of 50 stories blogged). There’s a simple regional explanation for this: these sections of the Times are of most interest to New Yorkers (who, though numerous, represent only a small portion of the world’s bloggers), while the other major sections cover national or international issues.
I’m listing the categories in quotes because the names are somewhat deceptive. “politics” contains only stories regarding US politics; stories on European politics are under “international”. But many of the “international” stories have a strong US component: of the 14 stories that were blogged, four explicitly mention the US and three mention Iraq.
Two focus on Canada, two on Europe (France and Greece), and one each on Israel and Iran. The only “international” story blogged on a low-development nation is a story about conflicts between white farmers and Masai herders in Kenya. (A quick glance at the unblogged “international” pieces reveals that the Times did run stories on developing nations, including stories on Haiti, Russia, Indonesia and North Korea.)
Here are the full results, for anyone who’s curious.
I’m planning on running the script for a few more days to see if the results differ meaningfully. I feel pretty confident that most NYT stories that are getting blogged will reach Technorati within 36 hours; searching Technorati for links like “www.nytimes.com/2004/09/23” gives us a general sense of how many links per day to the Times appear. At 5pm today:
9/23 -> 166 links
9/22 -> 577 links
9/21 -> 594 links
9/20 -> 833 links
9/19 -> 766 links
9/18 -> 310 links
9/17 -> 526 links
I conclude from this that many of a story’s links appear the day after it runs; the dip on the 18th (a Saturday) probably has to do with the size of the newspaper. So running the script for three or four days will likely ensure that most links to stories will register. I’m also hoping to run an alternative data set (possibly The Guardian) before releasing more complete results in a couple of weeks.
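To make the daily counts easier to read, here is a small sketch that rebuilds the dated search string for each day and attaches the day of the week. The counts are the ones listed above. `search_prefix` is a hypothetical helper name, and nothing here queries Technorati.

```python
from datetime import date

# Links per story date, as listed above (counted at 5pm on 9/23/2004).
links_by_day = {
    date(2004, 9, 17): 526,
    date(2004, 9, 18): 310,
    date(2004, 9, 19): 766,
    date(2004, 9, 20): 833,
    date(2004, 9, 21): 594,
    date(2004, 9, 22): 577,
    date(2004, 9, 23): 166,
}

def search_prefix(day):
    """Build the dated URL prefix used as the search string for one day."""
    return day.strftime("www.nytimes.com/%Y/%m/%d")

for day, links in sorted(links_by_day.items()):
    print(day.strftime("%A"), search_prefix(day), links)

# The day with the fewest links, leaving out today's still-incomplete count.
complete = {d: n for d, n in links_by_day.items() if d != date(2004, 9, 23)}
quietest = min(complete, key=complete.get)
```

The quietest complete day, 9/18, falls on a Saturday, which fits the guess that a smaller paper draws fewer links. Seven days of data can’t confirm that on their own.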
I always have mixed feelings about turning up evidence that supports my pet theory – that bloggers as a whole don’t give more of a damn about the developing world than residents of wealthy nations as a whole. Blogs give us a great opportunity to tell mainstream media what we care about – unfortunately, what we care about is Bush and the iPod. At least I’m converging on a title for the article or book I’m hoping to write on this subject: “The Media Sucks, and It’s Your Fault”.