Wayback Machine and delayed indexing of pages

beSpacific 2025-11-19

Via Mark Graham, Director, Wayback Machine, Internet Archive, in response to a question from a leading government documents librarian about the decrease in indexed pages and the fact that pages that used to be captured regularly now routinely fail to capture, even after repeated attempts…

I manage the Wayback Machine at the Internet Archive (past 10 years) but I am new to this list. You are correct: the dominant factor at play here is that our indexes have not been keeping up with our archiving. Large numbers of archives have not yet been indexed, and others are being indexed but then “fall out” of (short-lived) indexes as they move from one level of index to another. (It is complicated!) We are in the process of configuring upgraded indexing hardware to address this, and hope to have things mostly caught up by the end of the year. I note we continue to archive more than a billion URLs/day, and I assure you that no actual archives are being lost and that our indexes will eventually catch up…