For me, the irony is the opposite side of the same coin: 30 years of "information wants to be free" and "copyright infringement isn't piracy" and "if you don't want to be indexed, use robots.txt"…
…and then suddenly OpenAI are evil villains, and at least some of the people who denounced them for copyright infringement are, in the same post, adamant that the solution is to force the model weights to become public domain.
I broadly agree with you, but I don't see what's contradictory about the solution of model weights becoming public domain.
When it comes to piracy, the people who have viewed it as ethical on the grounds that "information wants to be free" generally also drew the line at profiting from it: copying an MP3 and giving it to your friend or even a complete stranger is ethical, but charging a fee for that (above and beyond what it costs you to make a copy) is not. From that perspective, what OpenAI is doing is evil not because they are infringing on everyone's copyright, but because they are profiting from it.
The deal of the internet has always been: send me what you want and I’ll render it however I want. That now includes feeding it into AI bots. I don’t love being on the same side as these “AI” snake oil salesmen, but they are following the rules of the road.
Robots.txt is just a voluntary thing. We’re going to see more and more of the internet shut off by technical means instead, which is a bummer. But on the bright side, it might kill off the ad-based model. Silver linings and all that.
I say this given what I understand information to be:
Information is about knowledge. What use is knowledge that nobody can know? None. Hence it must be the case that information wants to be copied everywhere it can, freely, for that is the essence of being information: being known.