Could anyone with a custom OS make their own TCP stack that violates all rate limiting and always sends data as fast as possible / reports a humongous window size?
Rate limiting is performed upstream; the network isn't just asking nicely. Your packets will be either queued or dropped if you send them faster than allowed.
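To make the "queued or dropped" point concrete, here is a minimal sketch of a token-bucket policer, one common way an upstream box can enforce a rate. The class and function names, and the numbers in the usage note, are made up for illustration; real equipment differs in detail.

```python
class TokenBucket:
    """Toy policer: packets that exceed the allowed rate are dropped."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes  # start with a full bucket
        self.last = 0.0

    def allow(self, now, size):
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False  # over the limit: dropped upstream


def offered(bucket, pkt_size, pkts_per_s, seconds):
    """Send fixed-size packets at a constant rate; return bytes delivered."""
    delivered = 0
    for i in range(int(pkts_per_s * seconds)):
        if bucket.allow(i / pkts_per_s, pkt_size):
            delivered += pkt_size
    return delivered
```

With a 1000 B/s limit, a sender offering exactly 1000 B/s gets everything through, while a sender offering 10x that over the same 10 seconds gets only about the burst plus rate times time delivered. Ignoring congestion control on your end doesn't change what the policer lets past.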
Some ISPs allow high speeds for the first N megabytes (often 20 or so), then throttle the shit out of the rest. This trick presents an entire stream as a bunch of little streams, so that you get the "first 20MB" treatment for the whole thing.
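Back-of-the-envelope version of why that works, as a sketch. It assumes the shaper tracks the fast allowance per connection and ignores connection setup cost; the function names and the rates are hypothetical.

```python
MB = 1_000_000


def transfer_time(total_bytes, burst_bytes, fast_rate, slow_rate):
    """Seconds to move total_bytes over one connection that gets
    fast_rate for the first burst_bytes and slow_rate afterwards."""
    fast_part = min(total_bytes, burst_bytes)
    slow_part = total_bytes - fast_part
    return fast_part / fast_rate + slow_part / slow_rate


def chunked_transfer_time(total_bytes, chunk_bytes, burst_bytes, fast_rate, slow_rate):
    """Same transfer, split into sequential chunks, each sent on a fresh
    connection (assumed to get a fresh fast allowance)."""
    full, rest = divmod(total_bytes, chunk_bytes)
    t = full * transfer_time(chunk_bytes, burst_bytes, fast_rate, slow_rate)
    if rest:
        t += transfer_time(rest, burst_bytes, fast_rate, slow_rate)
    return t


single = transfer_time(100 * MB, 20 * MB, 10 * MB, 1 * MB)
chunked = chunked_transfer_time(100 * MB, 20 * MB, 20 * MB, 10 * MB, 1 * MB)
print(single, chunked)  # 82.0 10.0
```

If the ISP tracks the allowance per subscriber rather than per connection, the chunked number collapses back to the single-stream one.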
Depending on circumstances, the "upstream" place where rate limiting is performed may be your cable modem. People have been known to reflash their cable modems with cracked firmware that doesn't honor the rate limit. This is easy enough to detect and punish further upstream.
Traffic shapers usually aren't actually inspecting or tracking the internal state of the TCP stack at either end of the connection. They just count bytes and credit them to a particular bucket by identifying the flow/protocol/class based on port numbers, IPs, etc.
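Roughly what that accounting looks like, as a sketch. The Packet fields, the class names ("web", "dns", "bulk") and the port rules are all invented for illustration; note that nothing here looks at sequence numbers or window sizes.

```python
from collections import Counter
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    proto: str
    src_port: int
    dst_port: int
    size: int  # bytes on the wire


def classify(pkt):
    """Pick a traffic class from header fields only (hypothetical rules)."""
    ports = (pkt.src_port, pkt.dst_port)
    if pkt.proto == "udp" and 53 in ports:
        return "dns"
    if pkt.proto == "tcp" and (80 in ports or 443 in ports):
        return "web"
    return "bulk"


def account(packets):
    """Credit each packet's bytes to the bucket for its class."""
    usage = Counter()
    for pkt in packets:
        usage[classify(pkt)] += pkt.size
    return usage
```

Whatever window size the endpoint advertises, the shaper only sees bytes arriving in a class and compares the running total against that class's limit.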
There are a lot of ways to write packet prioritization rules, and only some of them can be gamed this trivially. For example, a large hierarchical HTB or HFSC ruleset may prioritize new flows over existing flows but still apply an aggregate limit on the whole protocol. More modern methods like fq_codel automatically give new connections an advantage, but that advantage only applies to a very small amount of data, and new flows are treated no better than sparse flows.
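For the fq_codel case, here is a stripped-down sketch of just the flow scheduling part (new/old flow lists with a byte quantum), leaving out the CoDel drop logic and the flow hashing entirely. Class and method names are mine, not the kernel's.

```python
from collections import deque

QUANTUM = 1514  # bytes of credit per round; fq_codel's default quantum


class Flow:
    def __init__(self, name):
        self.name = name
        self.queue = deque()  # queued packet sizes
        self.deficit = 0


class FlowScheduler:
    def __init__(self):
        self.flows = {}
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, name, size):
        flow = self.flows.setdefault(name, Flow(name))
        flow.queue.append(size)
        # A flow on neither list is inactive, so it (re)enters as "new".
        if flow not in self.new_flows and flow not in self.old_flows:
            flow.deficit = QUANTUM
            self.new_flows.append(flow)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            lst = self.new_flows if self.new_flows else self.old_flows
            flow = lst[0]
            if flow.deficit <= 0:
                # Credit used up: top up and go to the back of the old list.
                flow.deficit += QUANTUM
                lst.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.queue:
                lst.popleft()
                if lst is self.new_flows and self.old_flows:
                    # Force a pass through the old list to avoid starvation.
                    self.old_flows.append(flow)
                continue  # otherwise the flow simply becomes inactive
            size = flow.queue.popleft()
            flow.deficit -= size
            return flow.name, size
        return None
```

A flow only gets to jump the queue for about one quantum before it lands on the old list with everyone else. A flow that drains and goes idle re-enters as "new" next time, which is why opening fresh connections buys nothing over simply being a sparse flow.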