These really aren't idioms. An idiom in natural language is a phrase or saying whose meaning cannot be understood without someone telling you what it's supposed to mean (that is, you can't figure it out just by reading the words).
In programming, an idiom is a common, accepted way of accomplishing a task which you wouldn't figure out to do all on your own and whose meaning isn't obvious to programmers of other languages.
I suppose you could argue that certain idioms are more or less universal: using "i" as the index in a loop isn't obvious even if you know a lot of math (why not use j?).
But no one who doesn't program Python is going to figure out the first thing you do in any new Python project is write the line:
if __name__ == "__main__":
    main()
The fact that it's impenetrable if it's not first explained to you that you should do that is what makes it idiomatic.
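For anyone who has not had it explained, here is a minimal sketch of what the guard does. The `main` function and its return value are just placeholders for illustration: `__name__` is set to `"__main__"` only when the file is run directly, so importing the file from elsewhere does not trigger `main()`.

```python
def main() -> int:
    # Placeholder for the program's real work (hypothetical body).
    print("doing the real work")
    return 0


if __name__ == "__main__":
    # True when the file is executed directly (python thisfile.py),
    # False when it is imported as a module by other code.
    main()
```

The practical upshot is that the same file can serve as both a script and an importable module.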
Idiomatic also means particular to a specific person or group. If your website is based on comparing the same idiom in multiple languages, that's a good indication it's not actually idiomatic. That's probably just a common task.
This use of "idiom" is well established outside of natural language. Merriam-Webster gives the following as the third sense of "idiom":
> a style or form of artistic expression that is characteristic of an individual, a period or movement, or a medium or instrument
So it doesn't have to be inscrutable; just something that's characteristic of a particular way of doing things. The Python community favors list comprehensions over maps and filters, so using list comprehensions is part of writing idiomatic Python even though they are not in themselves impossible to understand.
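As a rough illustration of that preference, here is the same task written both ways. The data and variable names are made up for the example; both forms are valid Python and give the same result.

```python
numbers = [1, 2, 3, 4, 5, 6]

# map/filter style: legal Python, but it tends to read like a translation
# from another language.
squares_functional = list(
    map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
)

# List comprehension: the form most Python programmers would reach for.
squares_of_evens = [n * n for n in numbers if n % 2 == 0]

print(squares_of_evens)  # [4, 16, 36]
```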
I would agree that using list comprehensions over iterating is idiomatic Python, but I would not say that the task of applying a function to a list is an idiom, as the website seems to suggest. There may or may not be an idiomatic way in a given language; it really depends on the language, which is why you can't map tasks to idioms. C# has Python-style list comprehensions, but they're not the preferred way to do things, so I wouldn't call them a programming idiom. I'm not trying to argue that programming idioms are exactly the same thing as idiomatic expressions; I was trying to explain what programming idioms are by appealing to the sense in which the term was coined.
I think you captured definitions really well, but I'm not sure you captured the spirit of a programming idiom.
The spirit is essentially pattern-like, that the specific syntax has a higher meaning in that language such that as soon as you see its shape you know what's up. Moreover, if you don't see that shape it doesn't register as "native" for that language.
In your explanation, the spirit is "entry point is main." It's true the mechanics are unusual to Python, but that's not necessary and not all that counts. If I did the name/main comparison some other way it wouldn't be idiomatic because that very specific construct is the idiom.
Some examples in other languages that aren't that hard to read mechanically:
param = param || default; // idiomatic for ES5 default
let bool = !!(whatever) // idiomatic for boolean coercion in JS
reversed_list = forward_list[::-1] # idiomatic for sequence reversal in Python
x = x >> 1 // idiomatic for x/2 in C
In some of those cases there are practical realities that also drive them (the bitshift divide in particular comes from embedded and other speed concerns), but they aren't constructs that you see very much outside that language, even if they're valid elsewhere. That's what makes them idiomatic to that language.
"The left shift and right shift operators should not be used for negative numbers. The result is undefined behaviour if any of the operands is a negative number."
The division case is optimized for size. The code for using a division instruction is smaller than the code for a shift followed by the magic to make it well behaved for negative integers. So it uses the division instruction.
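To see why a signed divide by two cannot simply be replaced with a shift, here is a small sketch in Python. It assumes Python's `>>`, which is well defined for negative integers and rounds toward negative infinity like an arithmetic shift, while C's `/` truncates toward zero. The function names are invented for the illustration, and the "fix-up" is one common way to reconcile the two, not necessarily what any particular compiler emits.

```python
def truncating_half(x: int) -> int:
    # Mimics C's signed x / 2: the result is rounded toward zero.
    return -((-x) // 2) if x < 0 else x // 2


def shifted_half(x: int) -> int:
    # Arithmetic right shift: the result is rounded toward negative infinity.
    return x >> 1


def fixed_up_shift(x: int) -> int:
    # Add 1 to negative inputs before shifting so that the shift
    # agrees with truncating division.
    return (x + (1 if x < 0 else 0)) >> 1


for value in (7, -7, -8, -1):
    print(value, truncating_half(value), shifted_half(value), fixed_up_shift(value))
# For -7: truncating division gives -3, the plain shift gives -4,
# and the fixed-up shift gives -3 again.
```

The extra add in `fixed_up_shift` is the kind of "magic" that makes the shift sequence longer than a single divide instruction.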
In general, always use -O2. -Os is only a good idea if you're on embedded and a -Os program fits in memory and -O2 doesn't. There really aren't any other cases where -Os is better.
Good point! I think that's true of a lot of older "optimization" techniques--if they're that good, a compiler does them for you.
I'm curious whether any compilers have specific rules to "re-optimize" obsolete practice. Like, assuming a platform does have a cheaper way to do /=2, do any compilers recognize that a >>1 in a given context is meant to be a div 2 and replace the code?
Woah, slow down there governor. Don't you be bringing your filthy verbose Java idioms into our precious JavaScript. My father was an ECMAScript programmer and his father before him was a Perl hacker. We don't take too kindly to extra unnecessary characters around these parts.
You're correct, and I think "patterns" would be a more straightforward term.
I believe the confusion probably stems from the word "idiomatic," which means "using, containing, or denoting expressions that are natural to a native speaker."
i.e. "idiomatic English" simply means that it sounds like a native speaker. Likewise, "idiomatic Python" would be code that reads as if written by a fluent Python programmer.
Of course they are "connected." I was explaining the confusion. However, "connected" is not the same as meaning the same thing.
1. idiom: a group of words established by usage as having a meaning not deducible from those of the individual words (e.g., rain cats and dogs, see the light ).
2. idiomatic: using, containing, or denoting expressions that are natural to a native speaker.
Also "connected" is "idiosyncratic," and that means yet a third thing. "Idiosyncratic programming" would be the opposite of what you want!
Yeah. "The C++ idiom" makes sense though, so thinking of this as a big matrix with the rows labelled with a task and the columns labelled with a language, rather than calling the rows "idioms", you can call the columns "idioms" instead.
It looks exactly like what used to probably be called "cookbook".
The first cookbook I recall was for Perl, and I later saw other language users try to emulate that, and even use the tasks it identified as a template for their own language's cookbook.
I save "idiomatic" for programming linguistic style guidelines that, for many tasks, help select among multiple ways of doing a task (or of larger structuring/approach). For example, in Scheme, tending to avoid mutations, structuring algorithm implementations to use tail calls, using first-class closures, using syntax extension for minilanguages/DSLs, etc. Pythonistas might call their own popular style "Pythonic".
There's overlap with cookbook-type stuff. For example, in some language that permits both approaches, maybe the approach of folding over a set of filenames is arguably more idiomatic than the approach of constructing a list of filenames. Consequently, the cookbook example might be of the folding approach. I don't want to call that particular task an idiom, because I find the term useful for a different purpose, but there's overlap.
Cookbooks often answer the question "which library and/or calls do I use".
If the cookbook spends time on more general tasks, like "coding an algorithm" or "traversing a collection", then the discussion might be more about what's idiomatic in the language.
Some of the fuzzy distinction might be analogous to the fuzzy distinction between language and library.
Totally agree. Does anyone know of any good language tutorials (for any language) targeted at people who are already familiar with programming that introduce a language through its idioms? I've always thought that would be a really good way to get up to speed on a new language really efficiently.
Just that little snippet of Python there could be used to quickly and effectively introduce an experienced non-Python programmer to three or four important ideas in Python.
They are idiomatic to people who haven’t learned how to program. Perhaps it was named with Product Managers, Project Managers, QA, and Business Analysts in mind?
First example I've checked is wrong [1]. Shuffle a list, with a static Random object initialization. Using a static random object is not thread-safe in .NET.
"One common solution is to use a static field to store a single instance of Random and reuse it. That’s okay in Java (where Random is thread-safe) but it’s not so good in .NET – if you use the same instance repeatedly from .NET, you can corrupt the internal data structures." [2]
Very often the very same resources which claim to be the source of truth provide bad code examples.
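One common mitigation for the shared-generator problem is to give each thread its own instance. The sketch below shows that shape in Python rather than C#, purely as an illustration: `thread_rng` and `shuffled` are hypothetical names, and this is not the code from the site or from the quoted article.

```python
import random
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def thread_rng() -> random.Random:
    # Lazily create one generator per thread instead of sharing a single
    # static instance across all threads.
    rng = getattr(_local, "rng", None)
    if rng is None:
        rng = random.Random()
        _local.rng = rng
    return rng


def shuffled(items):
    # Return a shuffled copy, using the calling thread's own generator.
    result = list(items)
    thread_rng().shuffle(result)
    return result


print(sorted(shuffled([3, 1, 2])))  # [1, 2, 3]
```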
Quite frankly, this site's examples are pretty horrible.
Case in point, Idiom #46 in C[0] uses strncpy and predictably fails to correctly terminate the buffer.
Idiom #55[1] converts an integer to a string with itoa, never bothering to mention that it isn't part of the C standard or POSIX, while for some reason using a 4096-byte output buffer.
Idiom #39[2] first example uses clrscr(), which I assume is some sort of old DOS function? The entire example looks unrelated to the problem.
The idea is certainly good, but you need some serious vetting system to make this usable.
That’s the fate of all such sites. They start with a nice idea and a good intent, but the crowd fills it with crap, nobody cares/is able to screen it, and it instantly becomes an unreliable source of nonsense.
Ed: btw, #1 is terminated by calloc(6), but it is indeed a trip mine looking for prey. It is unknown if strncpy was used intentionally or out of ignorance and if the next person will be aware.
Yet this seems to be the perfect place and time to remind the kind audience that sizeof measures the number of char-sized chunks, thus sizeof(char) is always the constant 1, no need to spell it out.
You are aware of how a community driven site works? If it's wrong, then go ahead and put your "right" implementation there.
Beyond that, you're reading threading into the problem. There is no mention of thread-safety in that example, nor does it claim to be thread-safe. If you want to over-analyze the problem statement, that's your prerogative; that doesn't make the answer wrong, however.
> You are aware of how a community driven site works? If it's wrong, then go ahead and put your "right" implementation there.
It doesn't provide any indication of whether an entry has been vetted. At least with stackoverflow you can see discussion in the comments and guess whether an answer has been critiqued.
If you have sufficient expertise in an area and poke around SO long enough, you'll spot the problem with an egalitarian system soon enough: the blind leading the blind.
Unfortunately, the site doesn't inspire participation. There's no way for good answers to rise up above bad answers. And there are a lot of bad answers.
But it's not at all unreasonable to think some newbie may come along, copy & paste that solution, and then call Shuffle from multiple threads.
Thread safety is something that needs to be touched on constantly, and it's smart to write your code such that it would be thread safe. Because otherwise, it just looks like a normal function call. And it could take someone ages to figure out why everything is breaking if they were to actually use that example.
If the site is providing toy examples that are not useful in production, then it should state that.
If "idioms" mean code that adheres to best practices, given that C# code is frequently threaded, their examples should mitigate common threading issues.
It will be ignored. The StackOverflow guys and all participants have put an enormous effort into making it an at least partially trusted source of snippets, and I hope it will never be outranked by sites like this.
(to creator: sorry for this, but in the long run it only harms everyone, unless you “die” for it)
I'd like to draw people's attention away from this and to Rosetta Code[1].
Programming Idioms has a slightly more modern interface, but what matters more is the content. Rosetta Code has more content and better content. If you don't like Rosetta Code's interface, I'd rather you tried to improve Rosetta Code rather than splintering the already-small community of people producing this sort of content.
That's debatable. I suppose it's better than nothing for complete noobs, but a lot of the examples on that site are even worse than the crap found on Stack Overflow.
Any reasonable person reading my comment would know what I'm talking about; the context is directly above it. Using your proposed logic, quoting "Rosetta Code has more content and better content" wouldn't have been enough because it's missing context. There's no need to be fussy about this.
> So write a simpler one and post it. It's a Wiki.
It's my prerogative to criticize other people's content. Simultaneously, I'm under no obligation to write content on someone else's website for free.
Why should you need to ask him that when you wrote it? There's no other interpretation than "Rosetta Code has better content than Programming-Idioms". It's impossible for him to miss that. And it is indeed debatable.
> Why should you need to ask him that when you wrote it?
I'm asking him that because if he wants to debate a point I made, he should understand what point I've actually made. It's clear he didn't understand my point.
> There's no other interpretation than "Rosetta Code has better content than Programming-Idioms". It's impossible for him to miss that.
De facto there is another interpretation, because ravenstine didn't interpret it that way. You would think that it would be impossible for him to miss that, but it's apparent that he did miss that.
> And it is indeed debatable.
"Rosetta Code has better content than Programming Idioms" is certainly debatable, but if you wanted to debate that, wouldn't it make sense to start by comparing something from Programming Idioms to something from Rosetta Code? He didn't bring up Programming Idioms at all--the only site he compares Rosetta Code to is completely irrelevant (Stack Overflow). I'm not sure what point he's debating, but it's certainly not "Rosetta Code has better content than Programming Idioms".
I do agree it needs improvement. It would be nice if it had an interface to browse and do a:b comparisons in a more fluid manner. The programming idioms site does this well.
I'm a bit reluctant to trust the quality of this, however.
The first example I looked up was "Format date YYYY-MM-DD" in Python, which wanted me to use "isoformat()", which produces something else.
It was easy to correct, but I'd like to see a more "democratic" approach to getting correct examples, not just taking the latest one as fact.
edit: Sorry, more precise critique of the usage of "isoformat" would be "which _might_ produce something else" (it depends on the resolution of info in the date object).
When I view that page[1] I see two different techniques listed, isoformat() being the second. The first is to use d.strftime('%Y-%m-%d').
In regard to the second, isoformat on a "datetime" will give increased resolution beyond "YYYY-MM-DD", but a "date" object only resolves year, month and day and such a call will indeed yield the correct output (per the linked documentation at [2]).
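A quick sketch of the difference being described; the dates are arbitrary example values.

```python
from datetime import date, datetime

d = date(2024, 3, 9)
dt = datetime(2024, 3, 9, 14, 30, 5)

print(d.isoformat())            # 2024-03-09
print(dt.isoformat())           # 2024-03-09T14:30:05
print(dt.strftime("%Y-%m-%d"))  # 2024-03-09
print(dt.date().isoformat())    # 2024-03-09
```

So `isoformat()` matches YYYY-MM-DD for a `date`, while for a `datetime` either `strftime('%Y-%m-%d')` or converting with `.date()` first gives the day-only form.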
I just did a Google search for "Format date YYYY-MM-DD in Python", clicked the first link to Stack Overflow and got several answers: the 1st says `.strftime('%Y-%m-%d')`, the 2nd `str(date.today())`, and using `isoformat` is the 4th answer. I guess I'm not the only dev who would do the same in case I actually wanted to check how to correctly format a date in Python: ctrl+t, type, type, enter, click, read, read, ctrl+c, ctrl+v and done. That's the competition for programming-idioms.org if you ask me.
> I'd like to see a more "democratic" approach to getting correct examples
Me too! Stackoverflow contributions are cc-licensed so maybe a tool to easily curate answers to classic issues and build on top of them would work well. I'm not sure crowdsourcing programming idioms from scratch would be the easiest way to go as I feel the work is mostly halfway done out there (but sure could use some curation).
If the date object only stores up to the day, then you can only format the string up to the day. If it stores up to the second, you can format up to the second.
Well, I guess I've been reinventing the wheel for my whole career... and thank god I did, my wheel is not only more performant, but also shorter and produces cleaner code (in my subjective opinion). As simple as:
Isn't raise for "throwing" exceptions/errors though? `raise SystemExit(code)` would imply an error, but if `code` is 0, the OS wouldn't consider it an error while a human looking at it would (correctly?) assume it's an error.
> This exception is raised by the sys.exit() function. It inherits from BaseException instead of Exception so that it is not accidentally caught by code that catches Exception. This allows the exception to properly propagate up and cause the interpreter to exit. When it is not handled, the Python interpreter exits; no stack traceback is printed. The constructor accepts the same optional argument passed to sys.exit(). If the value is an integer, it specifies the system exit status (passed to C’s exit() function); if it is None, the exit status is zero; if it has another type (such as a string), the object’s value is printed and the exit status is one.
> A call to sys.exit() is translated into an exception so that clean-up handlers (finally clauses of try statements) can be executed, and so that a debugger can execute a script without running the risk of losing control. The os._exit() function can be used if it is absolutely positively necessary to exit immediately (for example, in the child process after a call to os.fork()).
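The behaviour described in the quoted documentation can be checked with a small sketch. The function name `run` and the exit code 3 are arbitrary choices for the illustration.

```python
def run():
    events = []
    try:
        try:
            raise SystemExit(3)
        except Exception:
            # Not reached: SystemExit derives from BaseException, not Exception.
            events.append("caught as Exception")
        finally:
            # Clean-up handlers still run while the exit propagates.
            events.append("finally ran")
    except SystemExit as exc:
        # Caught here only so the example can inspect the exit code.
        events.append(f"exit code {exc.code}")
    return events


print(run())  # ['finally ran', 'exit code 3']
```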
Replying to your comment regarding:
> Isn't raise for "throwing" exceptions/errors though?
Well, some would argue that is idiomatic in Python. A quick Google search not only autocompletes "exceptions for control flow" to "exceptions for control flow python", but on the second search, most results discuss that (with many agreeing).
I guess you can say most Python developers wouldn't immediately assume it's an error. I can definitely say I don't.
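One everyday case of exceptions as ordinary control flow is iteration, where running out of items is signalled with `StopIteration`. The sketch below spells out by hand roughly what a `for` loop does; `manual_for` is a made-up name for the illustration.

```python
def manual_for(iterable):
    # Roughly what `for item in iterable` does behind the scenes.
    collected = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            # Not an error: simply the signal that the iterator is exhausted.
            break
        collected.append(item)
    return collected


print(manual_for("abc"))  # ['a', 'b', 'c']
```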
Right. In Python, raising an exception is considered just as safe and normal as returning a value, so Python uses exceptions even for very common things like stopping iterations. IMHO a Python exception is quite similar to a Rust "Result::Err".
Sure, but even in most of those scenarios, I’d prefer to structure the code differently... perhaps write a main() function that just returns, or other functions... more modular, maintainable, testable, ... Admittedly, in Python you need to use sys.exit if you want to return a specific non-zero exit code, but even in that case good code style would be something like
import sys

def main():
    # blablabla: the real work goes here (foo and bar are placeholders)
    if foo:
        return bar
    ...
    return 8

if __name__ == "__main__":
    # Pass main()'s return value to the OS as the exit status.
    sys.exit(main())
Of course that’s just my advice / opinion / preference, other ways are valid as well...
I would expect sys.exit() to be more efficient since it skips the final GC and calls to finalizers. The sys module is always already imported, so 'import sys' amounts to a dict lookup.
These are not the same semantically. It may or may not be practical for you to prove that your program is still correct if finalizers are skipped.
In 99.9% of cases it holds true, it's easier/ cheaper to use existing/ off-the-shelf solutions.
The other 0.1% (not absolute numbers), it makes sense to innovate.
Quite a few idioms fall into this category, like premature optimisation being the root of all evil. You really don't want to be on either end of the scale.
I don't think the website is redundant. I think it is useful to be able to compare languages, especially when using a new language. Plus, languages are interesting: they're not just a syntax, they often bring a different way of thinking.
One issue with the website is that there is no indication of example quality. For instance, the B-Tree* in C# has some issues[0]. Using classes instead of structs is horrible for memory management and garbage collection. It also does not constrain the generic type to structs ("where T : struct"), which follows on from the above. None of this would be apparent unless you understand .NET's memory management, inheritance, and garbage collection, and have spent some time with generics. Many of the other languages have got this right.
[0]https://www.programming-idioms.org/idiom/9/create-a-binary-t...
*I have often wondered why trees aren't standard collections, inserting nodes and tree rotations etc are common use cases for B-Trees. This does feel like re-inventing the wheel, giving developers plenty of opportunity to screw things up. Tries are another structure I think they have left out, there are times when tries are often a better solution than dictionaries (urls spring to mind).
Back then, you could re-invent a shitty wheel, people would ignore you and your idea would die out. Now a shitty-wheel idea gets indexed into a search engine and can potentially fool a non-expert into believing that it's actually a good idea. I suppose that's the problem education has been trying to solve...
Well, people who should have reinvented the wheel ended up stuck with the legacy dependencies they used instead, now in various states of abandonment and disintegration, which do 200 other things besides the wheel function they needed, which nobody can understand, and which are a pain to maintain and frequently break things on updates.
Now they wish they had taken the time to just write their ten-line wheel themselves...
That's why Haskell's Hoogle [1] is amazing (at least during a 7-week course at university). You type a function signature and get already implemented functions.
The flip side of the naming-things-is-hard adage in programming is that once you can name the problem you are having, the problem often evaporates before your eyes.
This is a problem in general but especially in programming where people like to rename things that already exist. However, it's not exclusive to programming - when I hack my way through something ME or EE I ask the experts around me and they can usually tell me the proper name or point me to existing or very similar solutions.
The C implementation (and probably most of the other implementations) of Idiom #41 (reverse a string) doesn't correctly handle UTF-8; it just reverses an array of bytes. Normally I wouldn't take objection to this, but:
1) The problem description specifies "Each character must be handled correctly regardless its number of bytes in memory.", for which the C implementation rather plainly fails.
2) UTF-8 is so ubiquitous nowadays that any string-manipulating function that fails to correctly handle multibyte UTF-8 characters ought to be considered broken.
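The failure mode is easy to reproduce. This sketch uses Python instead of C, with a made-up sample string, to contrast reversing the raw UTF-8 bytes with reversing code points. Reversing code points is still only an approximation of "characters", since combining sequences would be split apart.

```python
text = "h\u00e9llo"  # "héllo", where é takes two bytes in UTF-8
encoded = text.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    # Report whether the bytes decode cleanly as UTF-8.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def reverse_code_points(s: str) -> str:
    # Python strings are sequences of code points, so slicing keeps
    # each multibyte character intact.
    return s[::-1]


# Reversing the byte array puts a continuation byte ahead of its lead byte.
print(is_valid_utf8(encoded[::-1]))  # False
print(reverse_code_points(text))     # olléh
```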
Do these count as “idioms”? Are these problems commonly messed up? I sampled a dozen or so and all the ones I saw are snippets for doing really basic tasks, but didn’t seem particularly idiomatic, nor like tasks that cause a lot of bugs...
I don’t know what you understand an ‘idiom’ is, but being commonly misunderstood or a common source of errors is nothing to do with being an idiom. I don’t know how getting the size of an array could not be idiomatic.
Sorry I wasn't clear. I was responding to two different claims made in the title of this site: first that the samples are "idioms", and second that people should avoid reinventing the wheel in production code. It's not clear to me how these samples accomplish either.
"Idiom" means something that is done in a characteristic style. For example, in Python, it's idiomatic to use list comprehensions rather than for loops.
I don't see how getting the size of an array could be idiomatic in that sense, because there is always only a single way to do it, it's never a stylistic choice.
Python certainly makes some of the tasks so simple as to be moot. But some people try to bend a new language to match what they're used to.
For example, R is a statistical programming language where almost all data structures and operations are for vectors or arrays. But it's not unusual to see people on StackOverflow or blogs reinventing wheels like map/reduce/filter with for loops. I agree it's just inexperience that led to it, but the "intended" way is still idiomatic.
If you're going to list Fortran, COBOL, Prolog, Ada, Lisp, etc - I'm of the opinion that you need to list BASIC as well.
You list VB - but you don't say if this is VB.NET, or prior VB versions (3,4,5,6)? VB.NET isn't quite identical to the others...
...and VB is nothing like plain-old BASIC (and later dialects). In fact, you might lump them all under BASIC, as VB owes more to BASIC than to being its own thing (though VB.NET seems more a name than being BASIC - just my opinion, though, as a long-ago VB3-6 coder - I'd call it "BASIC-ish").
Other than that - this looks interesting. UI could use a bit more work, but seems like a good start.
The wheel is, in terms of elegance and function, a perfect machine - it can't be simplified, nor can it be improved upon. The point of the idiom being that any attempt to do so will either lead back to the wheel, or an inferior, unnecessarily complex alternative.
The problem with applying this to programming is the implication that there are languages, frameworks, patterns, and idioms which are equivalent to the wheel, in that there always exists one objectively correct, perfect solution which can't be improved upon. I don't believe programming has the equivalent of a wheel, yet.
What people tend to mean when they say "don't reinvent the wheel" in the context of programming is either "don't waste time writing code that already exists and is adequate" or "don't use $X because I like $Y."
And what is the difference? Both may be valid and idiomatic in the same language. With more context, such as a forum dialog or a StackOverflow question, these things are usually ironed out. But trying to do the same thing in Haskell (where you might have monadic error handling) or Java (where you might have some exception handling chain) is very different, even though the same function call is used in the end to actually delete a file.
It seems like the checklist blurs the lines between a language and standard libraries, something that shouldn’t be controversial but should be noted.
The checklist also seems to favor dynamic languages. E.g. does Go deserve to miss a checkmark because you can’t determine if a variable exists at runtime (only at compile time)? I can sorta see the source inclusion checkbox, but it’s not entirely clear from the example if this is testing for a macro support, AOP, or dynamic loading of libraries.
The idea is great. But the low quality of many code examples puts me off. For me it would only work if the examples shown are made by a selection of top level developers. Then it would be really nice actually.
About the site design, the little menu on the left should better not move the content window to the right. Now, when you scroll down in 'all idioms' for example, the list is right aligned and left has useless whitespace. A horizontal menu or dropdown would be better I guess.
This site would really gain by having an upvote mechanism, this way users would know what is the "best" way of doing things. Plus, needs more languages.
I don’t know what the exact process is, but one can simply create something like https://codesnippet.stackexchange.com and benefit from the established network.
The first listing was using a `SimpleDateFormat`, which for a long time really was the best way to do this. But now I would never recommend that over `String.format`, especially with the potential threading problems in the former. But the latter was not available until Java 1.5.
Realistically it would just sit there. At least that's what places like StackOverflow evidence. Meaning that whatever momentum it built up over the potential years of being the "right" answer would need to be overcome.
Downvotes are typically reserved for wrong, not merely suboptimal. Especially since optimal and thereby suboptimal can change based on circumstance.
I took a gander at a bunch of the C++ ones. They are pretty horrible. It looks like direct translations from generic snippets (just a syntax transformation).
Fancy data structure manipulation perhaps should be part of a library (API) and not "built in" to the base language. The best languages make it easy to use vast API's rather than hard-wire features into syntax.
Well, I knew it was only a matter of time before someone made this. This was one of my favorite "some day I'll have time and can make this site" ideas.
Reinventing the wheel is a good idea. It's how you learn. Programmers who say "don't reinvent the wheel" don't understand how the craft of wheel making works.
i like looking at how other languages do things. i'm pushed away from Go and pulled toward Rust, the two languages i've been thinking of getting deep into. like the currying example. that move keyword seems pretty nice.
But concentrated on the simpler tasks. Rosetta Code has examples for Gaussian Elimination, but no string to number conversion. It also has some really esoteric languages. It's really more of a Rosetta Stone, while Programming Idioms is more of a lookup for the typical but infrequent stuff you (well, I) tend to forget when working in multiple languages.
you want to be known as the one who invented the wheel, because it looks good as a resume item. That's why the wheel keeps getting reinvented again and again, and usually quite poorly.
I'd say it's great for anyone. I'm classically trained, but a decades-old university education doesn't enable me to just magically know the idiomatic way to do X in a language that was created after I graduated.
My real question would be: what does this do that Google's Stack Overflow hits don't do? One would hope that the answer is, "Give me advice on how to do it in Python 3.6 that didn't get downvoted to death because there's already a SO question on how to do it in Python 2.7." But the cynic in me knows that, in software, particularly software's social environment, the only effective way to clear out the old, obsolete, broken wheels so they're not always in the way is by re-inventing them.
Hey! I'm a self taught programmer and I resemble that remark.
I think that you're painting with a little too broad a brush; the inverse is true too. I've seen way too much blind application of GoF patterns by professional engineers.
I think the key word is understanding. What each idiom needs is a good rubric explaining why this is the/a correct approach, but that is much harder to provide.
If you haven't encountered the phrase before, "I resemble that remark" is absolutely a play on the similarity between resemble and resent, suggesting in a light hearted fashion that the speaker has (perhaps) mild resentment, but recognises the truth because it is personally applicable. It also has an overtone (at least to me) of downplaying the seriousness of the "remark" by answering it with humor
For myself I'm a 25 year industry veteran, with 35 years of self-taught programming experience.
My experience with stereotypes and generalizations is that the problem is not so much that they might be wrong, but rather that they are always incomplete