Robots.txt meant for search engines don’t work well for web archives | Internet Archive Blogs

lkfitz's bookmarks 2017-04-24

Summary:

"A few months ago we stopped referring to robots.txt files on U.S. government and military web sites for both crawling and displaying web pages (though we respond to removal requests sent to info@archive.org). As we have moved towards broader access it has not caused problems, which we take as a good sign. We are now looking to do this more broadly."

Link:

http://blog.archive.org/2017/04/17/robots-txt-meant-for-search-engines-dont-work-well-for-web-archives/

From feeds:

Open Access Tracking Project (OATP) » lkfitz's bookmarks

Tags:

oa.new oa.ia oa.preservation oa.search

Date tagged:

04/24/2017, 14:32

Date published:

04/24/2017, 10:32