Robots.txt meant for search engines don’t work well for web archives | Internet Archive Blogs
lkfitz's bookmarks 2017-04-24
Summary:
"A few months ago we stopped referring to robots.txt files on U.S. government and military web sites for both crawling and displaying web pages (though we respond to removal requests sent to info@archive.org). As we have moved towards broader access it has not caused problems, which we take as a good sign. We are now looking to do this more broadly."