Here's a post that I reblogged from Tom Ewing over on Tumblr, followed by the reply I gave. The article in question is "How Promotion Affects Pageviews On The New York Times Website" by Brian Abelson.*
via @bmichael on Twitter. Long, knotty work on the extent to which promotion causes pageviews, which is obviously important to know when “pageviews” is your metric of success. There’s also a good bit at the beginning about the perils of a metric dominating your approach to your job - germane to the pageview issue but worth bearing in mind more generally.
It’s complex work, but here’s my gloss after one reading.
Say you’re an editor, and want to allocate resources, know what and who to commission, etc. Pageviews are a powerful metric for doing this, because pageviews pay the bills. Obviously promotion is going to influence pageviews to some degree, and the important question becomes: how much?
If the level of influence of promotion is relatively low, then you can reasonably suppose that the extra pageviews of that article are down to its topic, its writer, its innate brilliance, etc.
If the level of influence of promotion is relatively high, then you can reasonably suppose that writer, topic, etc. don’t make that much difference. They don’t make NO difference - partly because to some extent judgements about what works are already baked into the system, and articles about “dull” topics don’t make the front page of the New York Times. So a high level of influence wouldn’t show that, e.g., you can make Lorem Ipsum text a hit with promotion. But it would show that - on the scale of pageviews of a major website - the power of promotion to drive traffic outstrips the power of anything else.
So what is the level of influence? The piece suggests it’s about 90%. 90% of the variance in pageviews can be explained by the level of promotion a piece receives. (That sounds pretty big. It probably is big.)
Tom, I think you should rethink your analysis here: nothing in the article tells us how much of the correlation between "promotion" and pageviews is caused by promotion leading to pageviews, how much by high pageviews leading to further promotion, and how much is due to other causes. Yet your endorsing the 90 percent figure means that you take the correlation to be entirely down to promotion, with nothing going the other way and no other inputs.**
That I put the word "promotion" in scare quotes doesn't mean I think it's false but rather that something like "length of time a piece is linked on the homepage" isn't quite part of the normal connotation of the word "promotion" and this difference needs to be highlighted. I also think that when Abelson uses the words "predict" and "explain" they need scare quotes even more, that they're potentially wrong. Again, all he's showing is that if he's got one set of numbers he can get within 10 percent of another one; he's not explaining the connection between the numbers. He obviously believes that promotion is the main driver here, but he certainly hasn't shown that it's the main driver or proven that it explains the pageviews (and he's not saying that it's entirely responsible for the 90 percent, though he only gives one sentence to the possibility of influence feeding back from pageviews to promotion).
My problem with the word "predicts" is the connotation of one thing leading to a later thing — whereas "how long a piece is linked on the homepage" is as after the fact as "number of pageviews," and again we can envision pageviews influencing time on the homepage as much as vice versa.
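To make the point about direction concrete, here is a minimal sketch in Python. It is not Abelson's model and it uses none of his data: the two simulated "worlds," the function names, and every number in them (hours on the homepage, views per hour, noise levels) are made up for illustration. In the first world, time on the homepage drives pageviews; in the second, pageviews drive how long editors leave a piece up. The "variance explained" figure is computed the same way in both.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation: the share of variance in one variable
    that a straight-line fit on the other 'explains'."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def promotion_drives_views(n, rng):
    """Hypothetical world A: homepage hours are set first, views follow."""
    hours = [rng.uniform(0, 24) for _ in range(n)]
    views = [1000 * h + rng.gauss(0, 2000) for h in hours]
    return hours, views


def views_drive_promotion(n, rng):
    """Hypothetical world B: views come first (topic, writer, etc.),
    and editors keep well-read pieces on the homepage longer."""
    views = [rng.uniform(0, 24000) for _ in range(n)]
    hours = [v / 1000 + rng.gauss(0, 2) for v in views]
    return hours, views


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for label, world in [("promotion -> views", promotion_drives_views),
                         ("views -> promotion", views_drive_promotion)]:
        hours, views = world(2000, rng)
        print(label,
              "| R^2 (views on hours):", round(r_squared(hours, views), 3),
              "| R^2 (hours on views):", round(r_squared(views, hours), 3))
```

Both worlds are built to give a high figure, around 0.9, and within each world the number is identical whichever variable you call the "predictor," because the formula is symmetric. So a 90 percent figure on its own can't tell you which of these worlds, or what mixture of them, you are living in.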
In short, I don't think the piece earns the word "affects" in its title.