One assumption is that it's one stream of bytes (okay, two if you count stderr), so control and metadata have to be interleaved with the content to be displayed.
No reason the transmission protocol and the application/ui protocol need to be coupled in their level of abstraction. You could send the same data plus just enough bytes to know which data goes with which stream, and then a terminal app could see a content stream and a control stream. I'm not spec-ing out a concrete protocol design here, just pointing out that this is a fundamental assumption behind much of the (lack of) ergonomics in the current approach, namely the interleaving of control characters and display characters. Any additional functionality we want has to be squeezed down into that representation, which makes adding it cumbersome.
> No reason the transmission protocol and the application/ui protocol need to be coupled in their level of abstraction
They aren’t, as there isn’t any application protocol.
Ncurses is the closest you’ll get.
But several TUI frameworks exist (like ncurses and the charm.sh packages) which abstract away the low-level (let’s call it) wire protocol.
Sure, but I’m arguing that replacing the very minimal protocol with a heavier one would be neither an innovation nor useful, since you would always need some kind of low-level streaming protocol underneath. Unless you’re looking to replace pretty much all of Unix land, but that’s a whole different discussion.
Terminals, at the protocol level, assume nothing. It’s just streams of bytes which are processed then rendered.
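For anyone who hasn't looked at the raw bytes, here is a minimal sketch of what "processed then rendered" means for an interleaved stream. It only recognizes CSI escape sequences (ESC `[` ... final byte) and ignores everything else a real terminal handles (OSC, single-byte controls, and so on), and the helper name `split_stream` is made up for the example.

```python
import re

# CSI sequence: ESC [ , parameter bytes 0x30-0x3F, intermediate bytes
# 0x20-0x2F, then one final byte 0x40-0x7E.
CSI = re.compile(rb"\x1b\[[0-?]*[ -/]*[@-~]")


def split_stream(data: bytes) -> tuple[list[bytes], bytes]:
    """Separate CSI control sequences from the bytes meant for display."""
    controls = CSI.findall(data)
    text = CSI.sub(b"", data)
    return controls, text


# "error:" in red, then reset, all in one stream of bytes.
raw = b"\x1b[31merror:\x1b[0m file not found\n"
controls, text = split_stream(raw)
```

Here `controls` ends up holding the two color sequences and `text` holds only the printable part. The receiver has to scan every byte to tell the two apart, which is the interleaving being discussed.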
The protocol makes no technical assumptions, which is why it has both lived so long and become such a mess.