Tuesday, August 25, 2009

N.Y. Times mines its data to identify words that readers find abstruse

By Zachary M. Seward
from the Nieman Journalism Lab (http://www.niemanlab.org/)

If The New York Times ever strikes you as an abstruse glut of antediluvian perorations, if the newspaper’s profligacy of neologisms and shibboleths ever sets off apoplectic paroxysms in you, if it all seems a bit recondite, here’s a reason to be sanguine: The Times has great data on the words that send readers in search of a dictionary.

As you may know, highlighting a word or passage on the Times website calls up a question mark that users can click for a definition and other reference material. (Though the feature was recently improved, it remains a mild annoyance for me and many others who nervously click and highlight text on webpages.) Anyway, it turns out the Times tracks usage of that feature, and yesterday, deputy news editor Philip Corbett, who oversees the Times style manual, offered reporters a fascinating glimpse into the 50 most frequently looked-up words on nytimes.com in 2009. We obtained the memo and accompanying chart, which offer a nice lesson in how news sites can improve their journalism by studying user behavior.

All of the 25-cent words I used in the lede of this post are on the list. The most confusing to readers, with 7,645 look-ups through May 26, is sui generis, the Latin term roughly meaning “unique” that’s frequently used in legal contexts. The most ironic word is laconic (#4), which means “concise.” The most curious is louche (#3), which means “dubious” or “shady” and, as Corbett observes in his memo, inexplicably found its way into the paper 27 times over 5 months. (A Nexis search reveals that the word is all over the arts pages, and Maureen Dowd is a repeat offender.)

Corbett also notes that some words, like pandemic (#24), appear on the list merely because they are used so often. Along those lines, feckless (#17) and fecklessness (#50) appear to be the favorite confounding words of Times opinion writers. The most looked-up word per instance of usage is saturnine (#5), which Dowd wielded to describe Dick Cheney’s policy on torture.

This is mostly just interesting — quiz: how many of these words can you define? — but it’s also a reminder that news sites are sitting on a wealth of data, from popular search terms to click rates, that can help them adjust to reader preferences. So are Times scribes being asked to rein in their vocabularies? That might be a Sisyphean (#37) task, but no, Corbett merely advised reporters to “avoid the temptation to display our erudition at the reader’s expense.”

After the jump, I’ve taken the original chart of 50 words, which was compiled by director of web analytics James Robinson, and built my own spreadsheet that also calculates look-ups per use.
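For anyone who wants to reproduce the per-use column, the arithmetic is simply total look-ups divided by the number of times the word appeared in the paper. Here is a minimal Python sketch of that calculation. The names `WordStats` and `rank_by_lookups_per_use` are my own, and the rows are made-up placeholders, not figures from the Times chart.

```python
from dataclasses import dataclass


@dataclass
class WordStats:
    word: str
    lookups: int  # total look-ups recorded for the word
    uses: int     # number of times the word appeared in print

    @property
    def lookups_per_use(self) -> float:
        # Guard against division by zero for a word with no recorded uses.
        return self.lookups / self.uses if self.uses else 0.0


def rank_by_lookups_per_use(rows):
    """Return rows sorted from most to fewest look-ups per use."""
    return sorted(rows, key=lambda r: r.lookups_per_use, reverse=True)


# Placeholder rows for illustration only; these are NOT the Times's numbers.
rows = [
    WordStats("alpha", lookups=1000, uses=100),
    WordStats("beta", lookups=600, uses=10),
    WordStats("gamma", lookups=900, uses=300),
]

for row in rank_by_lookups_per_use(rows):
    print(f"{row.word}: {row.lookups_per_use:.1f} look-ups per use")
```

Ranking by this ratio rather than by raw look-ups is what separates words that are merely common from words that genuinely stump readers.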


Most frequently looked-up words on NYTimes.com, 2009
Jan. 1 - May 26, 2009
