LawMeme, Yale Law School
Robots.txt and Copyright: Not Yet
Posted by Rebecca Bolin on Thursday, July 14 @ 21:04:38 EDT Copyright
It was only a matter of time until robots minding their own business got mixed up in copyright battles. Robots are doing the grunt work in dealing with a network bigger than anything managed by humans before. They are trudging through the Internet, sorting the Internet, and archiving the Internet. It was really only a question of what the facts would look like when a robot stepped on someone’s copyright. In fact, I remember a poolside conversation about what this lawsuit might look like. I have expected this lawsuit for years. But when I saw it, I barely recognized it.

This lawsuit, Healthcare Advocates, Inc. v. Harding, Early, Follmer & Frailey, et al., was filed earlier this month. William Patry does a much better job with the facts of this lawsuit than I could, but here is my description in short. A corporation is suing the Internet Archive (IA), the non-profit that runs the Wayback Machine, for violating a contract under which IA would block old archived pages once a robots.txt file was posted on the company's website.

I had outlined a series of issues about the legal status of robots, robots.txt, and the DMCA, but this just isn't the case that will decide these issues. At best, this case will show what kind of notice robots.txt is in a contract.

Notice using robot headers matters because if robots.txt is a good way of giving notice, it could change the fair use inquiry for all kinds of robots. Owners might even be able to contract out of fair use by blocking the robots themselves (the modern 'No Trespass' sign), making IA and Google subject to those limitations with legal force. Robot Exclusion Headers and protocols are currently voluntary standards, NOT rules, especially not legal ones. If a robots.txt file is given the power of notice, it turns that standard into a rule, a scary prospect given how many other voluntary technical standards are run, promoted, and designed by third parties.
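
To make the "voluntary" point concrete, here is a minimal sketch of how a well-behaved robot might consult robots.txt, using Python's standard-library `urllib.robotparser`. The file contents, the crawler name `ExampleArchiveBot`, and the `example.com` URLs are all made up for illustration; nothing here reflects how IA or Google actually build their crawlers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is excluded from the whole
# site, everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleArchiveBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleArchiveBot", "SomeOtherBot"):
        for url in ("http://example.com/index.html",
                    "http://example.com/private/notes.html"):
            print(agent, url, may_fetch(ROBOTS_TXT, agent, url))
```

The check happens entirely on the robot's side. A robot that never calls something like `may_fetch` still receives the page, which is what makes this a standard and not a rule.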

Even the robot standards are not settled. Some commands, such as No-Email-Collection, may or may not even be considered part of the standards. Healthcare Advocates claims IA had a contract with them about the archived pages, and the robots.txt file invoked their obligation to not display old pages. Maybe I have a contract with you when you visit my webpage about what kind of robot access you can use (this is still not settled), and if I do, I can claim you should have known about generic robot exclusions, or even a particular robot exclusion command, perhaps No-Email-Collection. Here the line between rule and standard is fuzzy. No-Email-Collection may be some sort of explicit contract term in a robot-readable header, but perhaps without special contracting (as the creators of this tag always invoke) it should have no force because it is a novel voluntary standard that very few even know about. Generic robot exclusions have been around for a few years, and that is a long time on the Internet, so we view them as settled.
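
One way to picture the problem with novel commands: a robot acts only on the directives its author taught it to read. The sketch below assumes a hypothetical robot that knows three fields, and it places a No-Email-Collection line in a robots.txt-style file purely for illustration (the actual proposal may be delivered some other way). The names `KNOWN_FIELDS`, `SAMPLE`, and `split_directives` are invented here.

```python
# Fields this hypothetical robot was written to understand.
KNOWN_FIELDS = {"user-agent", "allow", "disallow"}

# Illustrative file; the last line is a directive the robot has never seen.
SAMPLE = """\
User-agent: *
Disallow: /archive/
No-Email-Collection: /   # hypothetical placement
"""


def split_directives(text):
    """Split 'field: value' lines into those the robot understands and the rest."""
    understood, unrecognized = [], []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue  # blank or malformed line
        field, value = line.split(":", 1)
        directive = (field.strip().lower(), value.strip())
        if directive[0] in KNOWN_FIELDS:
            understood.append(directive)
        else:
            unrecognized.append(directive)
    return understood, unrecognized


if __name__ == "__main__":
    understood, unrecognized = split_directives(SAMPLE)
    print("acts on:", understood)
    print("never sees as a term:", unrecognized)
```

A typical parser simply drops the unrecognized list without complaint, so the robot has no way to register that a term was offered, let alone agree to it.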

Contracting with robots is difficult, and as a society we are going to have to figure out how to do it. Perhaps robots.txt is a good, legally-binding way to communicate with robots, or maybe it's just the best we have. Either way, there need to be limits. At a minimum, those limits should apply to novel commands; a robot shouldn't be able to sign something it doesn't understand. In this case, I would have liked to see some actual notice in addition to the robots.txt file, given Healthcare knew so much.

 
Re: Robots.txt and Copyright: The Other Shoe Drops (Score: 1)
by HowardGilbert on Friday, July 15 @ 12:40:01 EDT
Access to a Web site is provided through HTTP and URL. Access may be limited by various technical means including filters and lists based on IP address or userid/password. Apparently none of these measures was used in this case. The process of accessing a Web page over HTTP transmits a copy of its contents over the network to the receiver. Since the transmission is made by an agent of the copyright owner (the Web server), the copy necessarily left in Browser memory and incidentally left in the Browser cache and in proxies is valid (there are other HTTP headers that can be used to avoid such incidental copies if the copyright owner chooses to use them).

However, a willingness of the copyright owner to allow you to view the material doesn't imply a licence to make your own copy of it. That next step, of copying the material to disk, is a second operation performed by the client and not by the copyright owner. It is legal only if there is permission or a licence to do so, though a personal copy for later use may be "fair use".

There are three things here. The willingness of the copyright owner to transmit the information to any Browser makes the information itself public and precludes any claim of trespass. If I throw something at you across a public street I can hardly claim it was trespass that you received it.

The robots.txt file is a request to the particular subset of the public that happens to be robots to not make copies or indices of the material. It is not a technical means to restrict access, nor is it a contract, or anything else that is obvious under law.

The last question, however, is whether a particular agency, even non-profit, may hold and redistribute to others copyrighted material that was publicly furnished to it but which was not accompanied by any licence to redistribute. In this question the robots.txt is indirectly useful as evidence that the copyright restriction was not simply present in the text of the document but that the owner took some effort to prevent robots from even unintentionally circumventing the restriction. That doesn't seem to be the claim, but maybe it is buried in the actual case.



Re: Robots.txt and Copyright: The Other Shoe Drops (Score: 1)
by ehnonymous on Friday, July 15 @ 15:45:16 EDT
I was once taught by a Copyright prof that the best way to win a case on DMCA violation is to create a technological barrier to copyright violation that is _extremely ineffective_. Things like the famous Sharpie trick, see Can You Violate Copyright Law With a Magic Marker? By Brendan I. Koerner at Slate [slate.msn.com], prove that you can theoretically win a suit based on a DMCA violation... with a monumentally stupid technological barrier. If it's a bigger crime to "break something" and enter than to just enter without permission (compare burglary, where "breaking" and entering means something entirely different, although criminal trespass is still not the same as burglary), then apparently copyright owners can put up a limp string across their doorway, or a gossamer thread, or a note with "go away" attached by chewed gum to the doorway. Entering requires removing said obstacle, like using a program requires ignoring the shrinkwrap license and clicking "I agree," whether one does or not. To the extent that robots.txt is considered by a court to be an effective technological countermeasure to violation of copyright, that court is a ass - a idiot. - Paraphrasing Bumble, Dickens, _Oliver Twist_.


