I used to think this, but now mostly (but weakly) don't. Long options buy expressiveness at the cost of density, i.e., they tend to turn "one-liners" into "N-liners". One-liners can be cryptic, but N-liners reduce how much program fits on the screen at once. I personally find it easier to look up flags than to have to page through multiple screenfuls to make sense of something. In this respect, ISTM short options are a /different/ way of helping a subsequent reader, by increasing the odds they see the forest, not just the trees.
Unix's standard error is definitely not the first invention of a sink for errors. According to Doug McIlroy, Unix got standard error in its 6th Edition, released in May 1975 (http://www.cs.dartmouth.edu/~doug/reader.pdf). 5th Edition was released in June 1974, so it's reasonable to suppose Unix's standard error was developed during that 11-month interval. By that time, Multics already had a dedicated error stream, called error_output (see https://multicians.org/mtbs/mtb763.html, dated October 1973).
All the same, I'd be willing to believe that Unix's standard error could have been an "independent rediscovery" of one feature made highly desirable by other features (redirection and pipes). It's not clear how much communication there was among distinct OS researcher groups back then, so even if other systems had an analogue, Bell Labs people might not have been aware of it.
The story that I recall about the origins of stderr is that without it, pipes are a mess. Keeping stdout to just the text that you want to pipe between tools and diverting all “noise” elsewhere is what makes pipes usable.
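For example (a made-up pipeline, but the shape is typical), in

grep -r 'pattern' /some/tree 2>errors.log | sort | uniq -c

the "Permission denied" complaints go to errors.log via stderr, so the data flowing down the pipe stays clean; with only one output stream, the diagnostics would be interleaved with the data and corrupt whatever consumes it.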
S-expressions only really address a smallish part of what people seem to want out of configuration file syntaxes: a simple, recursive syntax.
Among the things that S-expressions don't address per se: how tokens are interpreted (e.g., numeric vs. non-numeric token syntax, and whether non-numeric tokens denote booleans, symbols, timestamps, etc.); whether to use alists or plists for associations, and the semantics of any duplicate keys within an associative construct; and how to specify the schema for a configuration object (required vs. optional elements, the types of each, and so on).
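For instance (a purely illustrative config; the key names and values are made up), the same connection settings might show up as an alist

((:host . "db.example.com")
 (:port . 5432)
 (:use-ssl . t))

or as a plist

(:host "db.example.com" :port 5432 :use-ssl t)

and nothing about "use S-expressions" tells a reader -- or a parser -- which shape to expect, whether :PORT must be a number, or what it means if :HOST appears twice.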
IOW, merely saying "use S-expressions" over-emphasizes syntax while under-emphasizing semantics.
ISTM one could use the same premises to reach the opposite conclusion, namely, that because awk is basically a subset of Perl (excluding CPAN, of course), many things are easier to read, often resulting in more regular, if sometimes longer, code. :-)
(FWIW, I learned Perl before sed and awk, and when I was using Perl every day, it was easy enough to whip up one-liners and throwaway scripts. However, I find that as I stopped using Perl on a day-to-day basis about 17 years ago, I can't produce Perl without re-learning the language; but I can produce sed and awk a few times per year without any refresher. I suspect that -- for me -- the smallness of each of sed and awk has something to do with it. YMMV, of course.)
If you're keen to go down the wikipedia hole, https://en.wikipedia.org/wiki/Six-bit_character_code and then https://en.wikipedia.org/wiki/BCD_(character_encoding) explain that IBM created a 6-bit card punch encoding for alphanumeric data in 1928, that this code was adopted by other manufacturers, and that IBM's early electronic computers' word sizes were based on that code. (Hazarding a guess, but perhaps to take advantage of existing manufacturing processes for card-handling hardware, or for compatibility with customers' existing card-handling equipment, teletypes, etc.)
So backward compatibility is likely the most historically accurate answer. Fewer bits wouldn't have been compatible, more bits might not have been usable!
I'm guessing it was the smallest practical size to encode alphanumeric data, and making it bigger than it needed to be would have added mechanical complexity and expense.
https://en.wikipedia.org/wiki/Six-bit_character_code: "Six bits can only encode 64 distinct characters, so these codes generally include only the upper-case letters, the numerals, some punctuation characters, and sometimes control characters."
IIRC, six characters was also the maximum length of global symbols in C on early Unix systems, possibly just because that's what everyone was used to on earlier systems.
But note that I asked about why six characters, not why six bits per character -- however your note is perhaps suggestive -- maybe the six character limit is similar to the six bit character after all: something established (possibly for mechanical reasons) in 1928? Perhaps?
Right, good questions. Pure conjecture on my part: maybe it's just that 36 is the smallest integral multiple of 6 that also had enough bits to represent integers of the desired width?
Richard P. Gabriel (one of the co-creators of Common Lisp, later a founder of the Lisp vendor, Lucid) has a few interesting things to say about CL's FORMAT in his Patterns Of Software, starting on page 101:
> What are the trade-offs? Format strings don’t look like Lisp, and they constitute a non-Lispy language embedded in Lisp. This isn’t elegant. But, the benefit of this is compact encoding in such a way that the structure of the fill-in-the-blank text is apparent and not the control structure.
IMO McDermott's OUT macro has precisely the drawbacks Gabriel predicts. While McDermott seems to think it's an advantage that
> we no longer have to squeeze the output data into a form intelligible to format, because we can use any Lisp control structure we like
his example PRINT-XAPPING contains essentially the same logic for extracting data from the xapping structure as Steele's FORMAT call does, but buries that data extraction in control structure, alongside the constant strings that go into the output.
And that's where I think FORMAT is really a win: a FORMAT control deliberately separates the control flow and constant text from the data extraction logic. Presumably you could write functions that do the same thing using McDermott's OUT macro, but they'll be more verbose and no more enlightening; what's the point?
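To make that concrete with a tiny invented example (the POINT struct and its accessors exist only for illustration):

(defstruct point name x y)

;; All of the constant text and layout lives in the control string;
;; all of the data extraction lives in the argument forms.
(defun print-point (p &optional (stream t))
  (format stream "~A: x=~A, y=~A~%"
          (point-name p) (point-x p) (point-y p)))

;; (print-point (make-point :name "origin" :x 0 :y 0))
;; prints: origin: x=0, y=0

An OUT-style version would interleave the accessor calls with the literal text, so you could no longer read the shape of the output straight off the control string.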
Does it really make that much difference if comments exist "inside" the embedded sublanguage vs. "outside"? After all, you can always construct a format specification by string concatenation, and in CL, you can trick the parser into doing this for you:
(format stream
#.(concatenate 'string
;; comment about the first piece
<piece 1>
;; comment about the second piece
<piece 2>)
...)
Additionally, for the record, CL's FORMAT is sufficiently hairy that you can achieve the effect of in-band comments if you wanted to:
(format t "~
~0{ Because the previous line ended with a tilde,
the following newline is ignored. All of this text
occurs inside an iteration construct that loops
zero times, so will not be output. However, it
will consume one element from the list of arguments,
so after this loop, we'll use tilde-colon-asterisk
to back up one element, and let what comes next
decide how to format it. So this is more or less
a comment inside a CL FORMAT string.
~}~:*~
~S" 'Foo)
I think what I'd really like is some way to expand format strings into McDermott-style code (and ideally vice versa).
I think CL-PPCRE gets this correct for a similar domain: regex strings. The library is not pedantic about whether you provide the compact string or an expanded nice version.
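For instance (working from memory of the CL-PPCRE API, so treat the details as approximate rather than gospel), these two calls should be equivalent:

(cl-ppcre:scan "bash$" "/bin/bash")

(cl-ppcre:scan '(:sequence "bash" :end-anchor) "/bin/bash")

and CL-PPCRE:PARSE-STRING will give you the parse-tree form of a given string if you want to inspect it.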
Out of curiosity, how often do you find yourself using CL-PPCRE's S-expression notation? (This is a genuine question: I've never felt a desire for an S-expression notation for regular expressions, so I'm curious what I'm missing out on.)
Anyhow, while it's certainly possible to parse FORMAT control strings into S-expressions, ISTM that if you want them to be invertible back into FORMAT strings, you'll end up with control structure and constant strings being contained within the S-expression, with data extraction as a separate concern. IOW, you won't get McDermott's preferred style of interwoven control, data extraction, and constant strings. For instance, you could have a FORMAT control string like the one sketched below.
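(The S-expression rendering here is notation I'm inventing on the spot purely for illustration, not anything an existing library defines.) Take a control string such as

"~{~A: ~S~^, ~}~%"

A round-trippable S-expression version might look something like

(:control
  (:iterate (:a) ": " (:s) :escape ", ")
  :newline)

The iteration, the escape directive, and the literal ": " and ", " text are all in there; but the arguments -- the actual data extraction -- still arrive from outside the expression, which is exactly the separation McDermott's interwoven style gives up.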
I never use it for regex, because when using CL, I tend to be authoring the regex from scratch. But I found it elegant and thought it might be useful for interpreting someone else's hairy regex. The reason I mentioned it, though, is that I thought it might be more useful to me for FORMAT... just because decades of Perl 5 taught me regex really well, but I don't get to use CL's FORMAT syntax every week. ;-)
IMO, the bad parts of Unix are of two semi-distinct kinds: the skin-deep kind, and the genetic defects.
The superficial problems are everywhere, easy to spot, and fun to complain about! The names of commands are obscure and the flags are inconsistent. Any utility's feature set is inherently arbitrary and its limitations are equally so. Just how to accomplish any particular task is often a puzzle, the tradeoffs between using different utilities for the same task are inscrutable, and the time spent contemplating alternative approaches within the tool kit is utterly worthless. Utilities' inputs, outputs, and messages aren't for humans, they're either for coprocesses or for the control layer running the utility; and so users are supposed to learn to conform to the software, rather than vice versa. There's a hodgepodge of minilanguages (sed, find, test, bc, dc, expr, etc.), but they're unnecessary and inefficient if you're working in an even halfway capable language (let's say anything at about the level of awk is potentially "halfway capable"); and so "shelling out" to such utilities is a code smell. The canonical shells are somewhat expressive for process control, but terrible at handling data: consequently, safe, correct, and robust use of the shell layer of Unix is hard, maybe impossible; so nowadays most any use of the Unix shell in any "real" application is also a bad code smell.
I say these are skin-deep in the sense that in theory any particular utility, minilanguage, or shell can be supplanted by something better. Some have tried, but uptake is slow/rare. The conventional rationale for why this doesn't happen is economic: it's either not worth anyone's time to learn new tools that replace old ones, or the network-effect-induced value of the old ones is so high (because every installation has the old ones) that any prospective replacement has to be loads better to get market traction. I have a different theory, which I'll get to below.
But I also think there's a deeper set of problems in the "genetics" of Unix, in that it supports a "reductive" form of problem solving, but doesn't help at all if you want to build abstractions. Let's say one of the core ideas in Unix is "everything is a file" (i.e., read/write/seek/etc. is the universal interface across devices, files, pipes, and so on). "Everything is a file" insulates a program from some (but not all!) irrelevant details of the mechanics of moving bytes into and out of RAM... by forcing all programs to contend with even more profoundly irrelevant details about how those bytes in RAM should be interpreted as data in the program! While it is sometimes useful to be able to peek or poke at bits in stray spots, most programs implicitly or explicitly traffic in data relevant to that program. While every such datum must be /realized/ as bytes somewhere, operating on some datum's realization /as bytes/ (or, by convention, as text) is mostly a place to make mistakes.
Here's an example: consider the question "who uses bash as their login shell?" A classical "Unixy" methodology for attacking such a problem is supposed to be to (a) figure out how to get a byte stream containing the information you want, and then (b) figure out how to apply some tools to extract and perhaps transform that stream into the desired stream. So maybe you know that /etc/passwd is one way to get that stream on your system, and you decide to use awk for this problem, and type
awk -F: '$6 ~ "bash$" { print $1 }' /etc/passwd
That's a nicely compact expression! Sadly, it's an incorrect one to apply to /etc/passwd to get the desired answer (at least on my hosts), because the login shell is in the 7th field, not the 6th. Now, this is just a trivial little error, but that's why I like it as an example. Even in the most trivial cases, reducing anything to a byte stream does mean you can bring any general-purpose tool to bear on a problem, but it also means that any such usage is going to reinvent the wheel in exact proportion to how directly it uses that byte stream; and that reinvention is a source of needless error.
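(For the record -- and hedged, since the exact passwd layout can vary a little across systems -- the corrected one-liner on the usual seven-field layout would be

awk -F: '$7 ~ "bash$" { print $1 }' /etc/passwd

and nothing in the byte stream itself warns you when you miscount.)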
Of course the sensible thing to do in all but the most contrived cases is to perform your handling of byte-level representations with a dedicated library that provides at least some abstraction over the representation details; even thin and unsafe abstractions like C structs are better than nothing. (Anything less than a library is imperfect: if all you've got is a separate process on a pipe, you've just traded one byte stream problem for another. Maybe the one you get is easier than the one you started with, but it still admits the same kinds of incorrect byte interpretation errors.) And so "everything is a file", which was supposed to be a great facility for helping put things together, is usually just an utterly irrelevant implementation detail beneath libraries.
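To stay with the same example, here's a sketch of what even a thin library buys you: POSIX's <pwd.h> hands back a struct passwd with named fields, so the field-counting mistake above simply can't happen (the "ends with bash" test is my own choice, purely for illustration):

#include <pwd.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    struct passwd *pw;

    setpwent();                         /* start from the first entry   */
    while ((pw = getpwent()) != NULL) { /* iterate over account entries */
        size_t len = strlen(pw->pw_shell);
        /* named field, not "field 7 of a colon-separated line" */
        if (len >= 4 && strcmp(pw->pw_shell + len - 4, "bash") == 0)
            printf("%s\n", pw->pw_name);
    }
    endpwent();
    return 0;
}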
And this gets me back around to why I think the superficial stuff hasn't changed all that much: I doubt that the "Unix way" of putting things together has really mattered enough for anyone to bother making the tools or shells substantially better. I got started on Unix in 1999, by which time it was already customary for most people I knew to solve problems inside a capable language for which libraries existed, rather than to use pipelines of Unix tools. (Back then there was lots of Perl, Java, TCL, Python, et al.; nowadays less Perl and TCL, more Ruby and JavaScript.) Sure, you've needed a kernel to host your language and make your hard drive get warm, but once you have a halfway capable language (defined above), if it also has libraries and some way to call C functions (which awk didn't), you don't need the Unix toolkit, or a wide range of the original features of Unix itself (pipes, fork, job control, hierarchical process structure, separate address spaces, etc.).
And that's just stuff related to I/O and pipes. One could look at the relative merits of Unix's take on the file namespace, Plan 9's revision of the idea, and then observe that "logical devices" addressed much of that set of problems as early as the early-to-mid 70s on TOPS-20 and VMS, without (AFAICT) accompanying propaganda about how simple and orthogonal and critical it is that there be a tree-shaped namespace (except that it's a DAG) and that everything in the namespace work like a file (except when it doesn't).
My point is that people have said about Unix that it's good because it's got a small number of orthogonal ideas, and look how those ideas can hang together to produce some results! That's all fine, though in practice the attempt to combine the small number of ideas ends up giving fragile, inefficient, and unmaintainable solutions; and what you need to do to build more robust solutions on Unix is to ignore Unix, and just treat it as a host for an ecology of your own invention or selection, which ecology will probably make little use of Unix's Unix-y-ness.
(As to why Unix-like systems are widespread, it's hard not to observe some accidents of history: it was of no commercial value to its owner at a moment when hardware vendors needed a cheap operating system. Commercial circumstances later changed so that it made sense for some hardware vendors to subsidize free Unix knockoffs. Commercial circumstances have changed again, and it still makes sense for some vendors to continue subsidizing Unix knockoffs. But being good for a vendor to sell and being good for someone to use can very often be different things...)