Back in the '90s I was doing JPEG work funded by DEC and later NSF, and one of the things I did was find a decent way to store them at a smaller size and resolution. I found that having the first scan be just the MSB of the Y DC, with the MSBs of the other two DCs in a later scan, came out the same size as having the first scan carry all three, give or take two bytes, over more than 90% of our corpus, and it let something display in about half the time (when a 9600 baud modem was the target).
It's been a long time, but I think what I ended up with was first scan MSB of Y DC, next six bits of Y AC. Then MSB of Cb and Cr DC, then a scan with a few bits of Cb and Cr DC and AC, and finally the rest. The idea was that the same JPEG would be used for a B&W, greyscale, or color thumbnail, but only the first N scans would be sent, followed by 0xffd9 (the EOI marker), with a width and height in the img tag. Anyway, I can't be the only one who figured out in the days of SLIP over modems that this trick was a good idea.
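A minimal sketch of the "send only the first N scans, then 0xffd9" step in Python. This is not the original tooling; the function names are made up for illustration, and it assumes a well-formed progressive JPEG held in memory. It only walks the marker structure and never decodes any image data.

```python
SOI = b"\xff\xd8"  # start of image
EOI = b"\xff\xd9"  # end of image


def _end_of_entropy_data(jpeg: bytes, pos: int) -> int:
    """Return the offset of the first real marker at or after pos."""
    while True:
        i = jpeg.find(b"\xff", pos)
        if i < 0 or i + 1 >= len(jpeg):
            return len(jpeg)
        nxt = jpeg[i + 1]
        if nxt == 0x00 or 0xD0 <= nxt <= 0xD7:
            pos = i + 2  # stuffed 0xFF byte or restart marker: still scan data
        elif nxt == 0xFF:
            pos = i + 1  # fill byte before a marker
        else:
            return i


def truncate_after_scans(jpeg: bytes, n_scans: int) -> bytes:
    """Keep everything up to the end of scan number n_scans, then append EOI."""
    if n_scans < 1:
        raise ValueError("n_scans must be at least 1")
    if jpeg[:2] != SOI:
        raise ValueError("missing SOI marker, not a JPEG")
    pos, scans = 2, 0
    while pos + 2 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte
            pos += 1
            continue
        if marker == 0xD9:  # EOI reached: the file has fewer than n_scans scans
            return jpeg[: pos + 2]
        # Every other marker expected here carries a 2-byte big-endian length.
        seg_len = int.from_bytes(jpeg[pos + 2 : pos + 4], "big")
        pos += 2 + seg_len
        if marker == 0xDA:  # SOS: entropy-coded data follows the scan header
            pos = _end_of_entropy_data(jpeg, pos)
            scans += 1
            if scans == n_scans:
                return jpeg[:pos] + EOI
    return jpeg[:pos] + EOI  # input ended without an EOI marker
```

Something like `truncate_after_scans(data, 1)` would give the smallest preview and larger N the progressively better ones, all cut from the same stored file.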