Well camel-case *does* reduce the length of the identifier without sacrificing c...

dalke · on Feb 6, 2016

The underscore in computing was developed to be able to separate words used as part of a variable name on computers with only upper-case. Quoting https://en.wikipedia.org/wiki/Underscore#History :

> IBM's report on NPL (the early name of what is now called PL/I) leaves the character set undefined, but specifically mentions the break character, and gives RATE_OF_PAY as an example identifier

It links to the 1964 report at http://bitsavers.informatik.uni-stuttgart.de/pdf/ibm/npl/320... which defines the "_" as the "break" character, on page 22, and on p23 says "[A]n identifier is a string of alphabetic characters, digits, and break characters with the initial character always alphabetic. Any number of break characters are allowed within an identifier; however, consecutive break characters are not permitted. Also, a break character cannot be the final character of an identifier."
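As a rough illustration of the rule quoted above, here is a minimal Python sketch. It assumes "alphabetic" means ASCII letters and that the break character is "_"; the report's actual character set may differ, and the name `is_npl_identifier` is made up for this example.

```python
import re

# Assumption: "alphabetic" = ASCII letters, break character = "_".
# Pattern: one letter, then letters/digits, then zero or more groups of
# a single break character followed by at least one letter or digit.
# That rules out consecutive break characters and a trailing break.
NPL_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)*")


def is_npl_identifier(name: str) -> bool:
    """Return True if name follows the quoted identifier rules (hypothetical helper)."""
    return NPL_IDENTIFIER.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["RATE_OF_PAY", "RATE__OF_PAY", "RATE_", "_RATE", "1RATE"]:
        print(candidate, is_npl_identifier(candidate))
```

Under these assumptions only RATE_OF_PAY passes; the others break the rules on consecutive breaks, a trailing break, or a non-alphabetic first character.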

(To verify the timing, "_" was not in X3.4 1963 (see http://worldpowersystems.com/archives/codes/X3.4-1963/page6.... ) and that ASCII code point was instead "left arrow".)

Your argument regarding "more files to be viewed on a single screen" is valid, but incomplete. It's more a question of total program comprehension rather than a single metric.

This is hard to measure. We can look to related metrics of speed-of-identification and accuracy to see how messy the subject is. The report at http://www.cs.kent.edu/~jmaletic/papers/ICPC2010-CamelCaseUn... says that programmers who are trained in underscore style can recognize underscore style more quickly than camel case, while https://www.researchgate.net/publication/221219628_To_camelc... says that camel case is all around better.

At the very least it suggests that "most practical style" is hard to determine.

EvanPlaice · on Feb 6, 2016

Meta characters were also added for file, unit, record, and group separators. Some may correctly argue that using these chars to structure data in flat files is a simpler and technically superior solution to the alternatives.
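To make that concrete, here is a small Python sketch that uses the ASCII unit separator (0x1F) between fields and the record separator (0x1E) between records. The helper names are made up for this example, and it assumes the field text never contains those two control characters.

```python
# ASCII separator control characters: file, group, record, unit.
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def encode_records(records):
    """Join fields with US and records with RS (hypothetical helper)."""
    return RS.join(US.join(fields) for fields in records)


def decode_records(blob):
    """Inverse of encode_records; an empty string means no records."""
    if not blob:
        return []
    return [record.split(US) for record in blob.split(RS)]


if __name__ == "__main__":
    rows = [["name", "note"], ["Smith, J.", 'said "hi"\nthen left']]
    # Commas, quotes, and newlines need no escaping here.
    print(decode_records(encode_records(rows)) == rows)
```

The point of the sketch is that commas, quotes, and newlines inside fields need no escaping, which is where CSV tends to get complicated.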

Still, people will stick to what they're familiar with despite the technical benefits. That's why we have CSV, TSV, JSON, etc.

I'd argue that 'most practical' is whatever format most people immediately understand at first glance. That literally means everybody, not just a small subset of programmers who already use a particular styling standard.

dalke · on Feb 6, 2016

"whatever format most people immediately understand at first glance"

Certainly that's a useful starting point. The problem is in figuring that out when there are multiple, roughly similar representations.

But it also depends on the goal. Sometimes it's better to learn a new format (Einstein notation, bra-ket notation, copy editing and proofreading symbols, modern staff notation for music, shorthand, etc.) than to use a system that a larger subset of people will understand immediately.

Forth is an example of a programming language which was developed for programmer productivity, on the assumption that the programmer will put in the effort to be proficient in the language.

EvanPlaice · on Feb 7, 2016

I was speaking in terms of the 'eventual' case, not a starting point.

While optimizing syntax/form makes sense in highly specialized domains where no useful alternative exists, I'd argue that the opposite holds true in domains where more 'natural' alternatives are abundant.

Can't say I'm familiar with all of those. For proofreading, meta chars are necessary to indicate edits without the ability to mutate the original text. Musical notation has widely been replaced by tabs for guitar. Shorthand may be useful for writing that isn't consumed by others.

Cursive is a perfect example of a form of language that was created for efficiency. Which, arguably, held true for handwriting. But it didn't add enough of a benefit above/beyond plain handwritten text and was very difficult to duplicate digitally.

Not to rag on Forth, I'm sure it's probably a very good language, but how widely is it used today?

Like I said, no amount of research proving that programmers choose languages based purely on their technical merits can disprove the writing on the wall.

People choose what feels natural to them based on previous experience and/or common convention. Whatever choice requires the least amount of context switching overhead and allows the lowest barrier of communication between devs will win in the end.

That's why TypeScript is immensely popular with developers who have a strong OOP background and prefer writing code in an IDE.

For C, the low-level support for types and memory access makes it a natural fit for systems development. I have written low-level network code in C#; it's an extremely awkward and verbose mess.

Python wins when it comes to simplicity and the ability to write really powerful functionality with a minimum amount of code. The list slicing as well as comprehensions are easy to understand and increase productivity dramatically.
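A quick sketch of the slicing and comprehension features mentioned above (the variable names are just for illustration):

```python
words = ["rate", "of", "pay"]

# Slicing: take a sub-list, or walk the list backwards with a step of -1.
first_two = words[:2]
reversed_words = words[::-1]

# Generator expression: capitalize every word after the first and join them.
camel = words[0] + "".join(w.capitalize() for w in words[1:])

# Dict comprehension: map each word to its length.
lengths = {w: len(w) for w in words}

if __name__ == "__main__":
    print(first_two, reversed_words, camel, lengths)
```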

Someone · on Feb 6, 2016

"Well camel-case does reduce the length of the identifier without sacrificing clarity"

I think camelCase easily beats underscores, but I also think it can reduce clarity a bit as soon as one uses abbreviations or mnemonics that commonly are written in all caps in identifiers.

Do you name your class IoChannel or IOChannel? For some, the former is a channel on a moon of Jupiter.

Do you name your variable classId or classID? For some, the former is related to Freud, so one could expect to see classEgo and classSuperego, too.
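The two spellings also behave differently under mechanical conversion. A minimal sketch in Python, assuming a deliberately naive rule (put an underscore before every capital letter except the first character); `camel_to_snake` is a made-up helper, not any library's API:

```python
import re


def camel_to_snake(name: str) -> str:
    """Naive conversion: underscore before each capital that is not at the start."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


if __name__ == "__main__":
    for name in ["IoChannel", "IOChannel", "classId", "classID"]:
        print(name, "->", camel_to_snake(name))
```

With this rule, IoChannel and classId come out as io_channel and class_id, while IOChannel and classID are split letter by letter into i_o_channel and class_i_d. Smarter converters treat runs of capitals specially, but then they have to guess where the acronym ends.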

Made up examples? Yes, but I don't think you can fully ignore aesthetics; I find that camelCasing such terms as ID, IO, XML and HTML in identifiers sacrifices clarity. 'Id' in particular makes me cringe whenever I see it (yes, that makes me a bit of a snob, but I simply cannot get used to it). That certainly applies to cases where a common abbreviation also is a word or easily read as such.

On the other hand, I also think keeping such abbreviations all uppercase in CamelCase identifiers sometimes "doesn't look right", and "doesn't look right" aka "aesthetically ugly" distracts me from understanding code.

Also, the argument that "showing more" implies "more practical" isn't that strong. If it did, we could take an idea from colorForth and remove spaces, replacing them by color or font changes. We also could use multiple statements on a line.

I think I would prefer a language that allowed hyphens and punctuation in identifiers (the latter are really useful for such conventions as using a trailing '?' to indicate a method that tests a Boolean condition, or a trailing '!' to indicate methods that mutate their arguments). I also like the Dylan convention of using asterisks to indicate class names.

That is 'think', though, because I don't use one for practical reasons such as the availability of libraries.

mercurial · on Feb 6, 2016

> I think camelCase easily beats underscores, but I also think it can reduce clarity a bit as soon as one uses abbreviations or mnemonics that commonly are written in all caps in identifiers.

For my money, I'll take Python conventions of CamelCase for a few things and underscores for the rest. I find underscores a lot more readable (the omnipresence of CamelCase is one of these things that irk me about C#, though it's not as bad as the mutant mix that is Capitalized_underscore which you see in some OCaml codebases).

zem · on Feb 6, 2016

why would you ignore aesthetics? I'd much rather spend my time poring over pleasant looking code than ugly looking code.

bhrgunatha · on Feb 7, 2016

Beauty is in the eye of the beholder?

Look at the never-ending discussions about s-expressions in Lisp and all its variants.