Nonsense. Regulation rarely works retroactively. Their model is trained and they have the money to license incremental data going forward, potentially exclusively.
My point is that laws regulating the act of scraping itself cannot work, because you can easily scrape in a different country where the law doesn’t apply and then transfer the data in - or indeed train your NN in a different country and transfer the model.
Only copyright can see through all of that. You would have to gut fair use in order to have an effective anti-scraping law.
I'm going largely by memory, but when the U.S. expanded copyright at one point they actually took some stuff out of the public domain. You can look it up, but the current formula is the author's life plus 70 years, with a different formula for corporate works, and when they expanded it most recently there were actually some public domain works that became not public domain retroactively. (A quick Google search reveals the 1976 Act added 19 years to the terms of existing copyrights, which might be what I'm thinking of-- in other words, some works whose copyright had expired then had it renewed and were removed from the public domain.)
There's also copyright reversion, a related new provision that applied to older copyrighted works. Quoting from an article I just pulled up:
"...the 1976 Act created a new right allowing authors and their heirs to terminate a prior grant of copyright, the Act also set forth specific steps concerning the timing and contents of the termination notice that must be served in order to effectuate termination. The termination of a grant may be effective “at any time during a period of five years beginning of the end of 56 years from the date the copyright was originally secured”..."
But this is a red herring, because the fact that a model was trained in the past doesn't make a copyright lawsuit "retroactive". The infringement would presumably be occurring anew every day you make the model available on your web site.