In 1994, I was doing some consulting for The New York Times Information Services Group (NYTISG), which had put that newspaper’s content onto America Online’s proprietary online service. It had recently launched a website containing New York Times Syndicate stories, which were taken from the front sections of the newspaper, and NYTISG was deliberating whether to launch a website for the newspaper itself.
But first it wanted to promote the NYT Syndicate stories’ website. Its staff had noticed from usage logs that one person in particular was accessing the syndicate site many times each day. So, they asked me to track down that person and ask him if he’d be willing to be part of a promotional ad for the website.
Running that person’s IP address through a WHOIS engine, I found that it resolved to filo.stanford.edu. I phoned Stanford University’s IT Department and learned that filo.stanford.edu wasn’t one person, but a computer server operating a Web indexing spider under the direction of two doctoral candidates, Jerry Yang and David Filo.
I reported this to the vice president in charge of NYTISG. I explained what a Web indexing spider was and how the basics of that technology were in the public domain, and I suggested that The New York Times Company should operate one. That way, I said, the Times’ site would offer online users not only ‘All the News that’s Fit to Print’ but also a searchable index of everything that is online.
The vice president (who now publishes a newspaper in the state of Washington) looked at me as if I were daft and replied, “No, we’re a newspaper, not some sort of online encyclopedia or phone book.”
Of all the consulting advice that I’ve given clients during the past dozen years, it was the one bit that I wish had been followed.
The New York Times Company and almost every other newspaper in the world now wish they had the online traffic, gross revenues, and market capitalization of Yahoo!, the Web indexing company that Jerry Yang and David Filo built from that spider.
I mention this because last week the World Association of Newspapers (WAN), the trade organization representing 18,000 of the world’s newspapers, announced that it had joined other traditional publishers’ organizations in efforts to determine whether they can legally charge the search engines that index their news.
The other publishers’ organizations involved are the International Publishers Association (IPA), International Federation of the Periodical Press (FIPP), European Newspaper Publishers’ Association (ENPA), European Publishers Council (EPC), European Magazine Publishers Association (FAEP), French association for magazine publishers (SPMI), association of French national newspapers (SPP), and the French regional daily newspaper association (SPQR), plus the French news agency Agence France-Presse (AFP). Some online groups that are dominated by subsidiaries of print publishers, such as the Online Publishers Association of the UK, have given the endeavor a “cautious welcome.”
Don’t be fooled by this initiative. Rather than catch up by doing what they should have done long ago, these publishers are searching for legal ways to tax the railroad because the gravy train has left them behind.
The publishers’ organizations acknowledge that search engines “provide a valuable service to publishers in terms of traffic generation” but claim that the search engines “have built their business models in large part on taking content for free.”
Let’s consider that phrase “taking content for free,” but first look at this:
- WAN – Newspaper, Magazine and Book Publishers Organizations to …
A task force of global and European publishers organizations, led by the World
Association of Newspapers, has agreed to work together to examine the options …
www.wan-press.org/article9055.html – 15k – Feb 3, 2006 -.
Did I just now take WAN’s content? Was that citation comprehensible? From it, would you know what WAN and those publishers are doing?
I ask because I’ve just cited WAN’s online content exactly the same way, word for word, that Google News automatically did. Neither that eight-word abstract headline nor its 25-word abstract text even explains what WAN and the other publishers’ organizations are doing.
Look at Google News or Yahoo! News and see for yourself how the search engines briefly abstract news organizations’ headlines and texts. Whether even the gist of a news story, let alone its content, is comprehensible as cited by the search engines depends entirely on whether that news organization’s headline writer and text author were pithy and cogent enough. Does the publishers’ content really consist of an (often incomplete) headline of fewer than ten words and an incomplete text of fewer than two dozen words?
No. I think the publishers’ claim that the search engines are taking their content is absurd. The search engines are merely pointing people to the content on the news organizations’ own sites. Indeed, the search engines’ citations of the publishers’ news are even briefer than academic abstracts or business abstracts. Ask a librarian or professor.
However, WAN President (and Group COO of newspaper publishing company Independent News & Media PLC) Gavin O’Reilly told the Financial Times, “That’s often enough” for readers browsing the top stories, and “the fact here is that we’re dealing with basic theft.”
If you accept Mr. O’Reilly’s logic, then don’t be surprised if the restaurant industry sues the newspaper industry for providing capsule listings and reviews that point potential diners to restaurants. The newspaper industry could defend itself by claiming that it doesn’t actually provide the food content to restaurants’ potential patrons (at best, the reviews might provide a bit of the restaurants’ flavors) and that the reviews are simply pointing potential patrons to those restaurants and thereby increasing those restaurants’ traffic. However, the restaurant owners might claim ‘That’s often enough’ to prevent people from coming onsite and browsing.
Is “theft” what the search engines are doing? Is it theft to point to something that is being given away for free? The news organizations aren’t charging anyone for accessing their news online.
If the newspaper industry wants to claim that citing its headlines online is “theft”, then it might first pursue the other daily newspapers that are doing it (such as this or this regular example). Publishers should pursue their direct competitors who are doing it, before claiming that the search engines are competitors.
Mr. O’Reilly also said, “The irony is that these search engines exist, largely, because of the traditional news and content aggregators and profit at their expense.”
That statement is both patently and historically false. Search engines such as Google and Yahoo! existed for many years before indexing the news organizations’ websites. During that time the search engines, without pointing to those organizations’ news, grew to dwarf those organizations in terms of online traffic, asset value, and market capitalization. News has never accounted for a significant fraction of these search engines’ traffic or revenues.
The only basis for theft that Mr. O’Reilly might legitimately claim is that the search engines have ‘stolen’ advertisers and consumers who might otherwise have used those publishers’ websites more often or more fully. Google and Yahoo! now have more online advertisers than those publishers because these search engines attract more consumers than any of those publishers’ sites do. And the search engines attract more consumers than the publishers’ sites do because the search engines provide a service that those publishers long ago failed to provide but could have.
Indeed, the organizations announced their legal initiative eight days after Google announced that it was removing the ‘Beta’ from Google News. I don’t think that timing was random.
In the WAN announcement, Mr. O’Reilly referred to the ‘Napsterisation’ of content, hoping to lend credence to his industry’s claim that search engines are stealing its content.
However, what the search engines are actually doing has nothing to do with ‘napsterizing’ content or stealing content. What the search engines provide to consumers is a package of pointers to where information, now including news from all sources, is located online. As I mentioned from my New York Times experience, the newspaper industry could have done that a dozen years ago. Or five years ago. Or now.
The newspaper industry didn’t and doesn’t. Five years ago, it did attempt to create what it called Internet ‘portals’, but those websites contained only the content from that newspaper company and its affiliates. A portal just to them.
I don’t wish this WAN endeavor well. I think it’s a waste of time and money that the newspaper industry could better spend on providing the types of services that it should have provided long ago or those that it needs to provide now. Perhaps it is too late for The New York Times or other newspaper companies to become Googles or Yahoos (or Ebays or MySpaces), yet there are still plenty of new services to be developed online.
Don’t try to tax a gravy train that you’ve missed. Start another one.