
That's a historical question. At the time, most if not all bots were either search engines or archival crawlers. The file was even called "RobotsNotWanted.txt" at the beginning but was renamed "robots.txt" for simplicity. To give another example, the Internet Archive stopped respecting it a couple of years ago, and they discuss this point (crawlers vs. other bots) here [1].

[1] https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea...



You mean search bots vs. other bots? The Internet Archive's bot is a crawler too.

Their post demonstrates no real difference between search bots and archive bots. robots.txt was never about SEO alone. Sites exclude print-friendly versions so visitors land on the full pages, with the ads and the links to other pages. Sites exclude internal search pages to conserve resources. The post itself admits sites exclude large files to control costs. And they can't honestly believe sites want sensitive areas like administrative pages archived.
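
Something like this covers every one of those cases, and none of it is about search ranking. A minimal sketch (the paths here are hypothetical; User-agent and Disallow are the standard directives, and # starts a comment):

    User-agent: *
    Disallow: /print/      # keep print-friendly duplicates out of indexes
    Disallow: /search      # internal search result pages waste crawl resources
    Disallow: /downloads/  # large files cost bandwidth
    Disallow: /admin/      # sensitive areas shouldn't be indexed or archived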

In reality, the Internet Archive stopped respecting robots.txt because they wanted to archive what sites didn't want them to archive. Many sites disallowed the Internet Archive's bot specifically. Many sites allowed only specific bots. Many sites disallowed all bots and meant all bots. And hiding old snapshots whenever a new domain owner changed robots.txt was a self-inflicted problem: robots.txt describes what may be crawled now, not retroactively. They knew all of this. Each of those cases is trivial to express, as sketched below.
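
For concreteness, here's what those three cases look like. ia_archiver is the user agent the Internet Archive's crawler has traditionally honored; the rest is standard robots.txt, where an empty Disallow means "allow everything" for that bot:

    # Disallow the Internet Archive specifically
    User-agent: ia_archiver
    Disallow: /

    # Allow only one specific bot...
    User-agent: Googlebot
    Disallow:

    # ...and disallow all other bots, meaning all of them
    User-agent: *
    Disallow: /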


If it were purely a historical question, another text file for handling AI requests (e.g. ai-bots.txt) would exist by now. It doesn't, and it likely never will: they don't want to even have to pretend to comply with creators' requests about whether their sites may be used.



