> And while I doubt most training data is recoverable, "lossy encoding" is still a spectrum. [...] Compare e.g. with JPEG

I'll just finally note that LLMs are not lossy encodings in the same sense as JPEG. LLMs are closer to human-like learning, where learning from data enables us to create entirely new expressions of the same concepts contained in that data, rather than acting as pure functions of the source data. That's why this will be interesting to see play out in the courts.




My belief is that there is no fundamental difference here. That is, learning is a form of compression; learning concepts is just a more sophisticated way of achieving much greater (if lossier) compression. If the courts see it the same way, things will get truly interesting.


Yes, learning concepts is a form of compression, but I'm not sure that implies there's no "fundamental" difference. I see it as akin to a programming language having only first-order functions vs. having higher-order functions. Higher-order functions give you more expressive power but not any more computational power.

You could say a higher-order program can "just" be transformed into a first-order program via defunctionalization, but I think the expressive difference is in and of itself meaningful. I hope the courts can tease that out in the end, and we'll see if LLMs cross that line, or if we need something even more general to qualify.
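To make the defunctionalization point concrete, here's a minimal sketch in Python (the names and tags are illustrative, not from any particular library): the higher-order version passes functions around as values, while the first-order version replaces each function value with a data tag and a single `apply` dispatcher. Behavior is preserved, but the expressiveness of first-class functions is gone.

```python
# Higher-order: functions are first-class values.
def compose(f, g):
    return lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: x * 2
ho_result = compose(inc, double)(10)  # inc(double(10)) = 21

# First-order (defunctionalized): each function value becomes a tag,
# and one `apply` function interprets the tags. No closures are passed.
def apply(tag, x):
    if tag == "inc":
        return x + 1
    if tag == "double":
        return x * 2
    if isinstance(tag, tuple) and tag[0] == "compose":
        _, f, g = tag
        return apply(f, apply(g, x))
    raise ValueError(f"unknown function tag: {tag}")

fo_result = apply(("compose", "inc", "double"), 10)  # also 21
```

The two programs compute the same thing, which is the "just transform it" argument; the counterpoint above is that having to enumerate every possible function as a tag up front is itself the loss of expressive power.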


> I see it as akin to a programming language having only first-order functions vs. having higher-order functions.

Interesting analogy, and I think there are a couple of different "levels" of looking at it. E.g., fundamentally, they're the same thing under Turing equivalence, and in practice one can be transformed into the other - but then, I agree there is a meaningful difference for humans having to read or think in those languages. Additionally, if those are typical programming languages, you can't really have code in the "weaker" language self-upgrade to the point where the upgraded language has the same expressive power as the "stronger" one. If the "weaker" one is Lisp, though, you can lift it like this.

In this sense I see traditional compression algorithms - like the ones we use for archiving, images and sound - to be like those typical weaker languages. There's a fixed set of features they exploit in their compression. But human learning vs. neural network models (or sophisticated enough non-DNN ML) is to me like Lisp vs. that stronger programming language, or even Lisp vs. a better Lisp - both can arbitrarily raise their conceptual levels as needed. But it's still fundamentally compression / programming Turing machines.



