> Everyone who writes about programming the Intel 286 says what a pain its segmented memory architecture was
Actually this applies more to the pre-80286 processors, since the 80286 introduced virtual memory, and the segment registers were less prominent in "protected mode".
Moreover I wouldn't say it was a pain, at least at the assembly level, once you understood the trick. C had no concept of segmented memory, so you had to tell the compiler which "memory model" it should use.
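For those who never used those compilers, here is roughly what that looked like. This is only a sketch: 'far' and MK_FP() are Borland/Microsoft extensions, not ISO C, and the memory model chosen at compile time (tiny/small/medium/compact/large/huge) set the default pointer size, with the keywords overriding it per declaration.

    /* 16-bit DOS C sketch -- Borland/Microsoft dialect, not ISO C */
    #include <dos.h>                 /* MK_FP in Borland-style dos.h */

    int main(void)
    {
        /* build a segment:offset far pointer; 0xB800 is the
           color text-mode video segment */
        char far *video = MK_FP(0xB800, 0);
        video[0] = 'H';              /* poke video RAM directly */
        return 0;
    }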
> One significant quirk is that the machine is very sensitive to data alignment.
I remember from school something about a "barrel shifter" that removed this limitation, but it was only introduced with the 68020.
On the topic itself, I like to say that a program is portable if it has been ported once (likewise, a module is reusable if it has been reused once). I remember porting a program from a 68K descendant to ARM; the only non-obvious portability issue was that in C the standard doesn't mandate whether the char type is signed or unsigned (it's implementation-defined).
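A minimal illustration of that pitfall, assuming the typical ABIs (plain char is signed on most x86 ABIs and unsigned on most ARM ABIs):

    #include <stdio.h>

    int main(void)
    {
        char c = 0xFF;   /* bit pattern 0xFF in a plain char */

        /* Implementation-defined result: prints -1 where plain char
           is signed, 255 where it is unsigned. Code that compares a
           plain char against negative values, or stashes getchar()'s
           result in a char, silently changes meaning when ported. */
        printf("%d\n", c);
        return 0;
    }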
The segment registers were less prominent on the 80386 in protected mode since you also have paging, and each segment can be 4G in size. On the 80286 in protected mode the segment registers are still very much there (no paging, each segment is still limited to 64k).
Segments are awesome if they're much larger than available RAM - handling pointers as "base + offset" is often much easier to understand than just raw pointers.
Having done some 8086 programming recently, I did find segments rather helpful once you get used to them. They make it easier to think about handling data in a modular fashion; a large (64k maximum) data structure can be given its own segment. The 286 went farther by providing protection to allocated segments. I have a feeling overlays only really become a nuisance once you start working on projects far bigger than were ever intended for that generation of '86. MS-DOS not having a true successor didn't help either.
> > Everyone who writes about programming the Intel 286 says what a pain its segmented memory architecture was
> Actually this concerns more pre-80286 processors, since 80286 introduced virtual memory,
The 8086 had segments, the 286 added protected mode, the 386 added virtual memory. I would agree, though, that the 286 wasn't as bad as people make it sound. In OS/2 1.x it was quite usable.
It was still a pain in protected mode because they botched the layout of the various fields in selectors. Had they made one small change, it would have been trivial for the OS to give processes an effectively linear address space.
Here's a summary of how it worked in protected mode for those not familiar with 286.
In protected mode the segment registers no longer held segment numbers. They held selectors. A selector contained three things:
1. A 13-bit index into a "descriptor" table. A descriptor table was a table describing memory segments, each entry being an 8-byte data structure that included a 24-bit base address for the segment, a 16-bit size, and fields describing access limits and privilege information.
2. A bit that told which of two descriptor tables was to be used. One table was called the global descriptor table (GDT) and was shared by all processes. Typically the GDT would be used to describe the memory segments that were used by the operating system itself. The other table was called the local descriptor table (LDT), and there was one LDT per process. Typically this described the segments where the code and data of the process resided.
3. A 2-bit field containing access-level information, used as part of the 286's 4-ring protection model.
To translate a selector:offset address to a physical address, the index from the selector was used to find a descriptor in the descriptor table the selector referred to. The offset from the selector:offset address was added to the segment base address from the descriptor, and the result was the physical address.
(The 80386 was similar. The main difference, aside from supporting 4 GB segments, was that the base address in the descriptor was no longer necessarily a physical address. The 386 included a paged memory management unit in addition to the segment-based MMU. If the paged MMU was enabled, the base addresses in the segment descriptors were virtual addresses for the paged MMU.)
Here is how they packed the 13-bit index, 1-bit table selection, and 2 bits of access level into a 16-bit selector:
+--------------------------+--+----+
| INDEX | T| AL |
+--------------------------+--+----+
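Putting the pieces together, the translation works out to something like this C sketch. The struct and names are my simplification, not Intel's actual descriptor layout, which packs its 8 bytes differently:

    #include <stdint.h>

    struct descriptor {
        uint32_t base;    /* 24-bit segment base (physical address) */
        uint16_t limit;   /* 16-bit segment size                    */
        /* access/privilege byte omitted in this sketch             */
    };

    static struct descriptor gdt[8192];  /* shared by all processes */
    static struct descriptor ldt[8192];  /* one table per process   */

    uint32_t translate(uint16_t selector, uint16_t offset)
    {
        unsigned index = selector >> 3;        /* INDEX: top 13 bits  */
        unsigned table = (selector >> 2) & 1;  /* T: 0 = GDT, 1 = LDT */
        /* unsigned al = selector & 3;            AL: privilege bits  */

        const struct descriptor *d = table ? &ldt[index] : &gdt[index];
        /* the hardware also faults here if offset > d->limit or the
           privilege check fails; omitted for brevity */
        return d->base + offset;               /* physical address    */
    }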
Now consider an OS that has set up a process to have multiple 64K segments of, say, data with consecutive indexes. When the process wants to add something to a selector:offset pointer it can't just treat that as a 32-bit integer and do the normal 32-bit addition, because of those stupid T and AL fields. If there is a carry from adding something to the offset half of the 32-bit value, the selector half needs to have 0x0008 added to it, not the 0x0001 that carry normally would add.
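In code, the pointer arithmetic ends up looking like this sketch (selector in the top 16 bits, offset in the bottom 16), which is essentially what "huge" pointer arithmetic in the DOS compilers had to do:

    #include <stdint.h>

    uint32_t huge_add(uint32_t selofs, uint32_t n)
    {
        uint32_t offset   = (selofs & 0xFFFF) + n;
        /* each 64k carried out of the offset bumps INDEX, which sits
           3 bits up in the selector, so add 0x0008 per carry, not 1 */
        uint32_t selector = (selofs >> 16) + (offset >> 16) * 0x0008;
        return (selector << 16) | (offset & 0xFFFF);
    }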
If they had instead laid out the fields in the selector like this:
+----+--+--------------------------+
| AL | T| INDEX |
+----+--+--------------------------+
then it would work to treat a 32-bit selector:offset as a linear 29-bit address space with 3 additional bits on top. In most operating systems user mode programs would never have reason to fiddle with the access level bits or the table specifier, so as long as you made it so malloc returned pointers with those set right for the process then almost all the pain of memory on protected mode 286 would have gone away. An array that crossed segment boundaries, for example, would require no special handling by the compiler.
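With that layout the same operation collapses to a plain 32-bit add (same representation as the sketch above):

    #include <stdint.h>

    uint32_t linear_add(uint32_t selofs, uint32_t n)
    {
        /* INDEX now occupies the selector's low bits, adjacent to the
           offset, so a carry out of the offset lands directly in INDEX */
        return selofs + n;
    }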
So why didn't they do this?
One theory I've heard is that because the descriptors in the GDT and LDT are 8 bytes each, to get the address of a descriptor the processor has to compute 8 × INDEX + BASE, where BASE is the base address of the descriptor table; by having INDEX in the top 13 bits of the selector, it is already in effect multiplied by 8, saving them from having to shift INDEX before feeding it into the adder for the address calculation.
I've talked to CPU designers and asked about that, and what they have told me is that shifting INDEX for this kind of calculation is trivial. The segment registers would already be special cases, and their connection to the adder used for address offsetting could likely have been wired up so that the input from the segment registers was shifted. Even if they did need a little more hardware to handle it, the amount would be very small, and almost certainly worth it for the tremendous improvement it would make to the architecture as seen by programs.
My guess is that simply no one involved in designing the selector and descriptor system was a compiler or OS person, and so they didn't realize that how they laid out those 3 fields would matter to user-mode code.