Automated Content Access Protocol

Automated Content Access Protocol ("ACAP") is a proposed method of providing machine-readable permissions information for content. This will allow automated processes (such as search-engine web crawling) to comply with publishers' policies without the need for human interpretation of legal terms. ACAP was developed by the publishing industry with technical partners (including search engines). It is intended to provide support for more sophisticated online publishing business models, but has been criticised for being biased towards the fears of publishers who see search and aggregation as a threat, rather than as a source of traffic and new readers.

Current status

In November 2007 ACAP announced that the first version of the standard was ready. No non-ACAP members, whether publishers or search engines, have yet adopted it. A Google spokesman appeared to have ruled out adoption^[1]. Google's CEO has since indicated that Google has no objection to implementing ACAP^[2] and is working on resolving the technical issues that are, at present, preventing implementation. No progress has been announced since those remarks in March 2008, and Google^[3], along with Yahoo and MSN, has since reaffirmed its commitment to the use of robots.txt and sitemaps.

Previous milestones

In April 2007 ACAP commenced a pilot project in which the participants and technical partners undertook to specify and agree various use cases for ACAP to address. A technical workshop, attended by the participants and invited experts, was held in London to discuss the use cases and agree next steps.

By February 2007 the pilot project was launched and participants announced.

By October 2006, ACAP had completed a feasibility stage and was formally announced^[4] at the Frankfurt Book Fair on 6th October 2006. A pilot program commenced in January 2007 involving a group of major publishers and media groups working alongside search engines and other technical partners.

ACAP and search engines

One of ACAP's initial goals is to provide better rules to search engine crawlers (or robots) when accessing websites. In this role it can be considered as an extension to the Robots Exclusion Standard (or "robots.txt") for communicating website access information to automated web crawlers.

It has been suggested^[5] that ACAP is unnecessary, since the robots.txt protocol already exists for the purpose of managing search engine access to websites. However, others^[6] support ACAP’s view^[7] that robots.txt, devised over 10 years ago and subsequently unmaintained, is no longer sufficient. ACAP argues that robots.txt was devised at a time when both search engines and online publishing were in their infancy and as a result is insufficiently nuanced to support today’s much more sophisticated business models of search and online publishing. ACAP aims to make it possible to express more complex permissions than the simple binary choice of “inclusion” or “exclusion”.
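The "simple binary choice" described above can be illustrated with a minimal sketch using Python's standard-library robots.txt parser. The robots.txt content, the crawler name ExampleBot and the host publisher.example are hypothetical and chosen only for illustration. The sketch shows robots.txt behaviour alone; it does not attempt to show ACAP's own syntax.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary publisher site (an assumption,
# not taken from any real site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /archive/
"""

# Parse the rules directly from the string instead of fetching them.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(path, agent="ExampleBot"):
    """Return True if the (hypothetical) crawler may fetch the path."""
    return parser.can_fetch(agent, "https://publisher.example" + path)


# Each path receives a plain yes/no answer and nothing more.
print(may_fetch("/news/today.html"))
print(may_fetch("/archive/2006/story.html"))
```

The parser's answer for any URL is a single boolean. Conditions of the kind ACAP aims to express, going beyond inclusion or exclusion, have no place in that answer.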

As an early priority, ACAP is intended to provide a practical and consensual solution to some of the rights-related issues which in some cases have led to litigation^[8]^[9] between publishers and search engines.

Only one search engine, the little-known Exalead, has confirmed that it will adopt ACAP.

Comment and debate

The project has generated considerable online debate in the search^[10], content^[11] and intellectual property^[12] communities. Two themes recur in the commentary: that keeping the specification simple will be critical to its successful implementation, and that the aims of the project are focussed on the needs of publishers rather than readers. Many have seen the latter as a flaw.

ACAP participants

Publishers confirmed as participating in the ACAP pilot project include (as at 16th February 2007)

Notes and references

[1] Search Engine Watch report of Rob Jonas' comments on ACAP

[2] IT Wire report of Eric Schmidt's comments on ACAP

[3] Improving on Robots Exclusion Protocol: Official Google Webmaster Central Blog

[4] Official ACAP press release announcing project launch

[5] News Publishers Want Full Control of the Search Results

[6] Why you should care about Automated Content Access Protocol

[7] ACAP FAQ on robots.txt

[8] "Is Google Legal?" OutLaw article about Copiepresse litigation

[9] Guardian article about Google's failed appeal in Copiepresse case

[10] Search Engine Watch article

[11] Shore.com article about ACAP

[12] IP Watch article about ACAP
