At one point the answer was robots.txt, but it no longer is: https://blog.archive.org/2017/04/17/robots-txt-meant-for-sea... As far as I can tell that information is still current. The process now is to email info@archive.org and request removal, which some "reputation management" firms talk about. Weirdly, I can't find much more info than that.

Furthermore, I don't think archive.org tries to hide or obfuscate its user agent, so it's relatively easy to block its crawler. I know it's possible to manually upload stuff to archive.org, and there are other sources (partnerships with Cloudflare and Brave, at a minimum), but those routes aren't as easy as the Wayback Machine.
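
For what it's worth, blocking by user agent could look something like the sketch below. This is only an illustration: the tokens "archive.org_bot" and "ia_archiver" are my assumption about what the crawler sends, so check your own access logs before relying on them, and the function and middleware names are made up for the example. It also does nothing about manual uploads or the third-party sources mentioned above.

```python
# Hypothetical blocklist. These tokens are assumptions, not confirmed by
# archive.org here; verify them against your own server logs.
BLOCKED_UA_TOKENS = ("archive.org_bot", "ia_archiver")


def is_blocked_user_agent(user_agent, tokens=BLOCKED_UA_TOKENS):
    """Return True if the User-Agent contains a blocked token (case-insensitive)."""
    if not user_agent:
        return False
    ua = user_agent.lower()
    return any(token.lower() in ua for token in tokens)


def block_archivers(app):
    """WSGI middleware: answer 403 to blocked user agents, pass the rest through."""
    def middleware(environ, start_response):
        if is_blocked_user_agent(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware
```

You would wrap an existing WSGI app with block_archivers(app). The same substring check could just as well live in a web server or CDN rule; doing it in the app is only the easiest way to show the idea.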