Slightly orthogonal to the original point, but worth noting that operators can also increase confusion when they don't interact with each other in expected ways. For example, in Python, 'x += y' isn't equivalent to 'x = x + y' in all contexts. Example:
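A minimal sketch with lists, where += mutates the existing object in place while x = x + y rebinds x to a brand-new one:

    # With lists, += calls list.__iadd__, which extends the list in place,
    # while x = x + y builds a new list and rebinds the name x.
    x = [1, 2]
    alias = x          # second reference to the same list
    x += [3]           # in-place extend: alias sees the change
    print(alias)       # [1, 2, 3]

    y = [1, 2]
    alias = y
    y = y + [3]        # new list; alias still points at the old one
    print(alias)       # [1, 2]

    # Another context where they differ: a list inside a tuple
    t = ([1],)
    try:
        t[0] += [2]    # raises TypeError (tuples don't support item assignment)...
    except TypeError:
        pass
    print(t)           # ([1, 2],)  ...yet the inner list was extended anyway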
I agree, I find it awkward when a programming language (like Java or JS) provides operators for its built-in types, but if you want to use different types (say, complex numbers, or numbers with different precision) you have to use much less readable functions instead.
About the argument that operator overloading can be abused for unreadable code: so can functions, naming, or anything else.
I much prefer languages which simply do not make a distinction between functions and operators.
This distinction seems useful only in the context of lower-level languages like C where the operators translate directly to single machine code instructions and thus they provide clues about performance.
But in a higher-level language this distinction is superficial. It's just another function, just with a much shorter name and perhaps using infix instead of prefix notation. E.g. in Lisps there is no such distinction. "+" is a valid function name. In Haskell - same, but you also have a built-in syntax for specifying that you want your function to be called using infix notation by default.
Operators/functions like `+` have well-understood semantics but I don't see how anyone could have an intuition for what `.:` does.
As with many concepts in programming, or any other field that has its own language for that matter, a lot of this comes down to experience.
If you're used to OO-friendly languages like Java or Python, you probably think of the operator (.) as looking up a field or method on an object. If you're used to FP-friendly languages like Haskell, you probably think of (.) as function composition, and it's just as familiar and everyday a sight.
To someone who is new to Haskell or broader functional programming concepts, it might not be immediately clear why ($) is useful if you already have (.), but it's something you soon use as second nature. Before long, you come across concepts like Functor, Applicative and Monad, and with them operators like (<$>), (<*>) and (>>=). The behaviours those operators represent don't even exist outside the context of the underlying abstractions, so someone who isn't familiar with the abstractions won't understand what those operators mean or why they are useful. Again, though, in Haskell these things are routine, everyday tools, just like using (++) to increment a variable in a language like C or JavaScript. That increment operation makes little sense in a functional language where mutability is not the default, and Haskell has no direct equivalent and indeed uses (++) to mean something completely different.
By the same token, to someone who uses (.) all the time for composition of two unary functions, using (.:) for composition of a unary and a binary function is only a small mental jump and makes sense.
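For what it's worth, the idea behind (.:) translates to something like this in Python (the names here are made up for illustration):

    # compose(f, g)(x) applies g, then f: the shape of Haskell's (.)
    def compose(f, g):
        return lambda x: f(g(x))

    # compose2(f, g)(x, y) applies the binary g first, then the unary f:
    # the shape of (.:)
    def compose2(f, g):
        return lambda x, y: f(g(x, y))

    from operator import add
    neg_sum = compose2(lambda n: -n, add)
    print(neg_sum(2, 3))   # -5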
None of this is to say you can't take things too far, but it's important to distinguish between using unnecessarily cryptic notations for no good reason (usually a bad idea) and using notations that might appear cryptic to someone who doesn't yet know the concepts but that provide a concise notation for something that is used all the time by someone who does (often a good thing).
See Silhouette's post that is parallel to the parent. These operators don't make sense until you know the underlying concepts. Teaching the whole thing to children seems ambitious enough to probably be unrealistic.
How do you deal with namespaces (or anything analogous like package names) there?
E.g. if you have a function named "+", it can likely clash with other functions named "+"
If the programming language supports function overloading, that can be resolved through the types.
By the way, I actually like RPN calculators and low-level stack-based programming languages, where operators and functions also work the same way, but that is mostly for interactive use or for small self-contained programs.
This is a problem of the model of polymorphism a language has. It's orthogonal to the function/operator distinction. It doesn't arise based on having this distinction or not.
In Haskell you use typeclasses (you can think of them as Go interfaces or Rust traits) and without them you cannot introduce name clashes. In Scheme (the Lisp that I know)... you don't have anything. You import things from libraries, define variable bindings and depending on the order of all this your variable is going to be bound to something, most likely a procedure... but which one? Depends on the order. Not great. I prefer the Haskell approach to this, but then Haskell is a bit complicated in other areas. :/
But again, this has little to do with "operators vs functions".
> But again, this has little to do with "operators vs functions".
It has to do with it in the sense that you can add package names or namespaces to function names which are already text, but it looks bad to do this with symbolic infix operators like "+"
It would be nice if everything that operates on values looked exactly the same, but then maybe the field of mathematics should start by fixing its notation: mathematics uses a bit of everything you can imagine:
addition, subtraction, multiplication, division, power, root and log are all binary operations, and instead of making them all look the same, mathematicians have chosen:
* an infix symbol for addition and subtraction
* no symbol at all (usually) for multiplication
* a superscript for power
* a horizontal bar with the operands stacked on top of each other for division
* another horizontal bar with a superscript on the left for root
* function-name notation with a subscript for log
Maybe this is a global optimum that math has converged to because this mix of different notations actually is the easiest way for the human brain to read formulas quickly, e.g. the grouping that the division bar creates, the fact that it eliminates the need for some parentheses, and the easy visual difference between all the different notations may really help. In some programming languages, if you've got dozens of parentheses or deeply nested indentation, can you easily tell which argument belongs to which function?
Or, it's just a historically grown monster that could as well have turned out entirely different and can have other forms that would be faster to interpret.
A lot of these things were established long before mathematicians started talking about functions, let alone gave a robust definition of what a function is. It may have obscured the fact that these things are very similar to each other.
At the point when there was an effort to formalize these things, ideas like RPN started popping up.
Mathematics also deals with a much more flexible medium. It's written on a plane, in 2 dimensions, whereas programmers confine themselves to just one. Mathematicians are also free to make up their own notation and symbols, whereas programmers are confined to a single alphabet and most often can't extend the syntax.
EDIT: Case in point: some mathematicians argued for a while about whether abs (absolute value) is a function. Some argued that it's not, because you need two formulas to define it. When somebody pointed out that (for real numbers) it's just "sqrt(x^2)", the first ones agreed. Sounds silly in retrospect now that we have a set-theoretic definition, but at the time it wasn't obvious and all these functions looked very different from each other.
> Case in point: some mathematicians argued for a while about whether abs (absolute value) is a function. Some argued that it's not, because you need two formulas to define it. When somebody pointed out that (for real numbers) it's just "sqrt(x^2)", the first ones agreed.
Isn't that a circular definition, though? sqrt(x²) = ±x, which isn't a function since there are two values in the range for each value in the domain (other than zero). The version they're equating with abs(x) is the absolute value of the square root, or abs(x) = abs(sqrt(x²)), which is true but wouldn't help prove that abs(x) is a function.
Of course, the underlying problem was the premise that a function must be defined by exactly one formula.
When people take the square root of a number while working with real numbers, they almost always mean the non-negative root. It's a convention. Only when they start considering complex numbers does this convention break, and the square root ceases to be a function, even according to the modern definition, because "non-negative" still doesn't narrow it down to a single number in the general case.
Isn't that basically what I said? The convention is that sqrt(x) is generally read as abs(sqrt(x)). The definition is that sqrt(x²) = x, which has positive and negative solutions in x for any x² > 0. You can choose to ignore the negative solutions (or the positive solutions) to make it a function, but I wouldn't consider that any simpler or closer to a single formula than the piecewise-defined version of abs(x). It's an arbitrary restriction—much like the mistaken idea that a function must be defined by exactly one formula.
Why not just say that abs(x) = ±x, ignoring the negative solutions? If you'll accept "the non-negative square root of x²" then I see no reason to reject "the non-negative component of ±x". Both are single formulas with positive and negative solutions combined with a qualifier rejecting the negative solutions.
> The definition is that sqrt(x²) = x, which has positive and negative solutions in x for any x² > 0.
When people write "sqrt", they mean the function, or, "principal square root". "A square root" is a different thing. Saying "the definition" makes sense only within the context where the definition is established, otherwise we have to fallback to the common usage. Wittgenstein would laugh at this conversation.
Also, I'm not arguing for this convention ("it needs to have a single formula to be a function"). And when you start substituting symbols, almost nothing withstands this test, because then you can always twist the formula into several cases. And generally, thinking about functions as something defined by formulas is a very limiting view. Almost all functions cannot be defined by formulas.
It's mostly true that operators are equivalent to functions, but there's an important distinction at parse time too for most languages.
You can have functions named foo, bar, and foobar, but without some sort of convention (usually whitespace or other delimiters) it isn't possible to look at the string foobar and identify whether this is supposed to refer to the single function by that name or to whatever having foo and bar adjacent means in your language (probably composition or application).
Contrast that with operators like +, which serve as delimiters by themselves and can be unambiguously placed next to other function names because they're drawn from a different character set.
Whether that convenience is worthwhile is up for debate, but I think it would be surprising to type a+b and get some kind of parse error that can be rectified just by typing a + b instead.
> Whether that convenience is worthwhile is up for debate, but I think it would be surprising to type a+b and get some kind of parse error that can be rectified just by typing a + b instead.
I kinda prefer that a+b lexes as one identifier: a+b. Also rabbits-are-cute.
> For mathematicians, operators are essential to how they think.
But programming is not math, just like it isn't text, or color, or datetimes, or any other type of value our programs deal with. Even in programs I've written whose purpose is numerical processing, a vanishingly small fraction of the program involves doing math.
We have no problem using a domain-specific language (like regular expressions) for the occasions where we need to work with text. Likewise for querying a database (SQL), or text styling (HTML). We shouldn't need to infect the rest of the language with all the massive complexity of "operators" just to be able to write "a+b" in every possible context.
In fact, even when I need to do math, the built-in operators are too weak to support half the math I need to do. We're paying a huge cost, and not getting our complexity's worth.
It's not even visually correct. The example here is:
x + (y + z) == (x + y) + z
which is true in mathematics but not true in most programming languages, where the values could be fixed-width integers subject to implicit promotion and modular wrap-around, or IEEE-754 floating-point values. He says he wants operators so you can see that addition is associative, but in his language, it's not.
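For example, in Python (or anywhere else with IEEE-754 doubles):

    # Grouping changes the rounding, so floating-point + is not associative.
    x, y, z = 0.1, 0.2, 0.3
    print((x + y) + z == x + (y + z))   # False
    print((x + y) + z, x + (y + z))     # 0.6000000000000001 0.6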
My favorite design, in terms of separating the syntax, is actually Tcl. Remove the complexity of needing infix arithmetic operators from 99% of the language, but keep it available for the times when you want it. It still doesn't support half the math I need, but at least the language was kept simple by having math ("expr") and text ("regexp") off in their own commands.
Operators are a tradeoff between compactness and ambiguity. They add ambiguity because they are polymorphic, and the order of operations becomes implicit, governed by the operator's precedence (for infix or mixed pre/postfix).
Many programming languages abuse the + operator for string concatenation, which is not commutative. This is where my tolerance for operators ends.
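A trivial Python illustration of the non-commutativity:

    print("ab" + "cd")   # 'abcd'
    print("cd" + "ab")   # 'cdab' -- same symbol as numeric +, but order matters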
> Many programming languages abuse the + operator for string concatenation, which is not commutative.
This is why Julia uses `*` for string concatenation, which they claim is “more natural” than `+`. Not sure I agree - I’d rather see a completely separate operator for concatenation, such as Haskell’s `++` or Lua’s `..`.
Arguably the simplest notation for string concatenation would be plain juxtaposition (which is also the usual notation for multiplication, hence why Julia's choice could be argued to be the most natural). Unfortunately programming languages can't quite deal with the ambiguity of having `a (x + y)` and `f(x)` refer to different operations.
Some languages seem to allow it for string literals though.
I understand your confusion, but in math the + operator does not have to be commutative either. It often is, but just like in programming, you can define it any way you want.
I have never seen + be used for a non-commutative operation in mathematics. I have seen it used for something which is to-be-proven-to-be-commutative in the next page or so. Similarly, I have only ever seen juxtaposition (like “ab”) used for associative operations, i.e. satisfying a(bc) = (ab)c.
This is a recurring problem when working with floating point values. Intuitions about concepts like associativity or even equality don't necessarily hold, but in many cases you will still get away with it, so it can feel like there's inconsistent or unpredictable behaviour. That can be a big stumbling block for inexperienced programmers working with this type of data at first, and it surely leads to many bugs.
With experience, programmers adjust their intuitive expectations and guarding against these kinds of bugs becomes second nature, and to some extent that is probably the best we can do because of the nature of the beast. However, it does illustrate the cost of violating intuitive expectations about how familiar operators and functions will behave.
I can imagine there exists advanced math where addition is also not commutative, yet they probably use `+` there too. Aren't you throwing the baby out with the bathwater?
Please note that Perl 6 has been renamed to Raku (https://raku.org using the #rakulang tag on social media). With regards to using ~ , nothing has changed :-)
Yes, well known operators like `+` with well known meanings are useful.
Why would you write a post about this? Is there some kind of disagreement out there about this? I can't think of any popular language that doesn't have it!
Now, consider this instead:
n <> x &~ n
Not only do you not know how to parse this without further knowledge, you also can't pronounce it in a meaningful way or hope to guess what it means.
> Not only do you not know how to parse this without further knowledge, you also can't pronounce it in a meaningful way...
Therein often lies the problem: if you mentally cannot translate the symbols into meaningful concepts it's hard or impossible to work with them. For example, there are primitive tribes whose language only has words for the amounts one, two and three. All amounts above that are simply referred to as 'many', no matter if it's four or a hundred. These people cannot (or at least not easily) be taught basic mathematics because they probably can't mentally picture 2+2=4 being different from 2+3=5.
So if you can find words for these symbols, or make up words of your own for that matter, and are able to read this expression in those words, it will be a lot easier to parse and understand, maybe even intuitively.
edit: another example in the Python domain for me would be list and dict comprehensions. When I first encountered them in other people's code I had a hard time understanding what they did. Searching for them was also hard, since I could only describe them (ie: loop within square brackets) and not name them. Finally when I found their name in an reply on SO I was able to search directly for the concepts and find meaningful answers on how to apply them in certain cases.
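For anyone who hits the same wall, the "loop within square brackets" is a list comprehension; the brace version is a dict comprehension:

    squares = [n * n for n in range(5)]     # list comprehension
    # equivalent to:
    # squares = []
    # for n in range(5):
    #     squares.append(n * n)

    lengths = {word: len(word) for word in ["a", "bc"]}   # dict comprehension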
A new operator is just a letter/word you haven't learned yet. The only things wrong with operators is that non-Asian speakers don't like learning new letters and search engines refuse to index them.
Seems easy to parse for people used to working with bit sets in C. The &~ operator is actually two operators frequently used together to do a set difference(†). So it's asking whether n is different from removing all the bits of n from x. The answer is true.
†: It's so frequently used that on x86_64 there's a single instruction for that, ANDN.
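A quick sketch of that reading in Python, which has the same & and ~ operators (parentheses added to sidestep precedence differences between languages):

    n = 0b0011
    x = 0b0110
    cleared = x & ~n        # "x with all the bits of n removed" -> 0b0100
    print(n != cleared)     # True: n differs from x with n's bits cleared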
Unless of course you are talking about Haskell and its semigroup combine operator and its lens update operator. In that case the expression can't type check: n has type State s a, x has type s, and n also has type s, which means s is an infinite type and fails occurs check.
The former is much easier to parse (at least to me). ~ is a unary operator, so it's easy to think of ~n as one thing. But &~ is two operators, which I think of as two things, not as one (even if the combination is common).
Few people would be able to guess what it means the first time they see it, or think of it as the "obvious way" to merge two dicts
[...]
{**d1, **d2} ignores the types of the mappings and always returns a dict. type(d1)({**d1, **d2}) fails for dict subclasses such as defaultdict that have an incompatible __init__ method.
Though I'm not personally convinced that + would be more obvious or that dict subtypes are merged often enough for this to matter.
As mentioned elsewhere in this thread, the mathematical + operator is commutative: d1 + d2 == d2 + d1. How does the + operator maintain that commutativity in the face of duplicate keys?
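(For what it's worth, the merge operator Python eventually adopted, | from PEP 584, doesn't try to be commutative: the right-hand operand wins on duplicate keys.)

    # Python 3.9+: dict union via |
    d1 = {"a": 1, "b": 2}
    d2 = {"b": 20, "c": 3}
    print(d1 | d2)                  # {'a': 1, 'b': 20, 'c': 3}
    print(d2 | d1)                  # {'b': 2, 'c': 3, 'a': 1}
    print((d1 | d2) == (d2 | d1))   # False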
It is a convention, not a rule, in mathematics to use (+) to represent the operations of commutative (i.e. abelian) groups.
But this convention is not at all respected in programming (e.g. + for string concatenation in many languages) so I have to admit that it's pretty irrelevant.
That already falls apart when the + operator is used for string concatenation so I think the answer is people are using it in a less rigorous sense already.
Is that from a mailing list? I really like it, it's much more readable / user friendly than most mailing list posts I've seen. Call me an inverse luddite or something but I never could get into mailing lists.
This is the kind of thing that wouldn't hurt if it were contributed back to mailman and became the default. Many of these Open Source projects have really poorly thought out defaults (if anybody even thought about these default values at all).
It's been about 15 years since I last installed and configured GNU mailman but I thought it was a solid application for what it does. It's great to see UI improvements in the web interface side of it.
It might also be that your brain is just more used (years of extra experience) to infix notation. So why not take advantage of this?
In the monad world, people also prefer (>>=) to bind.
OTOH, how long does your brain need to get used to other notations?
As an aside, his arguments are completely debunked if you use RPN.
> Commutative: a b + == b a +
> Associative: a b c + + == a b + c +
> ...
Infix notation is preferred because the binary operator is visually close to both of its operands. The same equations as GP, but == is also postfix:
> Commutative: a b + b a + ==
> Associative: a b c + + a b + c + ==
> ...
As expressions get larger and larger, the operands get farther away from their operators. RPN is great for computers to read and maybe humans to write (RPN calculators). Not so much for humans to read.
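A minimal sketch of why it's great for computers: a postfix evaluator is just a stack and a loop, with no precedence or parentheses to worry about (the token format here is made up for illustration):

    import operator

    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}

    def eval_rpn(tokens):
        stack = []
        for tok in tokens:
            if tok in OPS:
                b = stack.pop()        # right operand sits on top
                a = stack.pop()
                stack.append(OPS[tok](a, b))
            else:
                stack.append(float(tok))
        return stack.pop()

    # "a b c + +"  is  a + (b + c)
    print(eval_rpn("1 2 3 + +".split()))   # 6.0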
> It might also be that your brain is just more used (years of extra experience) to infix notation.
I’ve recently started to work with a program that uses Reverse Polish (postfix) notation [1] to implement mathematical equations and I find it very hard to make sense of the equations from the syntax. I hadn’t previously been aware that there were other alternatives to infix notation.
That sounds great. The syntactic simplicity of prefix notation and Scheme seems particularly suited for early education. I imagine it will help form an intuitive mental model of computation and mathematics, before the classic infix notation is taught. I guess you'll see!
It's one of the things I really like about Scala: x call y is always equivalent to x.call(y).
This, combined with relaxed method name restrictions, allows you to do "operator overloading" in quite a sane way and lets you write bigint + 1 or set contains element.
Kotlin is a bit more conservative and only translates the built-in operators to fixed methods: + to the plus method, etc.
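Python sits somewhere in between: a fixed set of operators, each translated to a dunder method, so user-defined types can join in (the Money class here is a made-up toy example):

    class Money:
        def __init__(self, cents):
            self.cents = cents
        def __add__(self, other):      # what `a + b` translates to
            return Money(self.cents + other.cents)
        def __eq__(self, other):       # what `a == b` translates to
            return self.cents == other.cents

    print(Money(100) + Money(250) == Money(350))   # True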
The worst part is there is a talk by Steele called Growing a Language, about designing Java, where he specifically talks about the importance of overloading so that things like future number representations like BigInteger can be first class.
Operators can be useful in a setting where they are expected and natural: numbers, strings, etc. Overriding them for dictionaries, streams, or custom classes has been shown to be a bad idea that makes the authors feel clever but results in code that is both less readable and harder to look up: your IDE could bring up the right documentation when asked about dict.union, but without some advanced syntax and type analysis it will not be able to guess what | means for dicts, and you'll get the documentation for bitwise or.
> it doesn't matter whether the + operator binds tighter to the left or to the right
Depending on the language, (a+b) + c may or may not be the equivalent of a + (b + c)
Interesting. Apparently this is true in Python because it uses arbitrary-precision (bignum) integers.
As I understand it, it's true in Java too, because (signed) integer overflow is defined to wrap-around. It's not true in C, where signed overflow invokes undefined behaviour. In C# it depends on the mode whether you get wrap-around or an exception on overflow.
And of course it's never true with floating-point.
Reducing cognitive load [1a], transforming representations into something we can understand (e.g. x86 vs Python) [1b] and engaging our visual cortices [2] are all amazing things to get ourselves to be more productive.
[1a] Cognitive psychology: working memory 4 to 10 items
[1b] Cognitive psychology: I read in some textbook how this works well; it also makes intuitive sense to me.
[2] This idea was stated out loud in one of 3Blue1Brown's videos, and well, in his case he demonstrates how true it is.
In this case it's justified, because there is no ambiguity. Just as a mathematician would write using traditional notation without parentheses. At the courses I attended in group theory, module theory etc. we were always required to prove that things are associative before dropping parentheses. Lisp notation just requires fewer characters (the '+' isn't repeated), but the convention is the same.
Depends on your language. In Lisp, + takes any number of arguments. In C++, it takes exactly two.
(I'm presuming that by 'add', you mean '+'. If not, then 'add' takes however many arguments it is defined as a function to take, and that still is language- or library-dependent.)
(And if you refer to pure math, then add or + is only a function of more than two arguments if it associates, which is dependent on the argument type.)
No, by `add` I mean exactly the function `add`, not the + operator. He cherry-picked the example to show how operators are "better"; he's wrong. In practice the sum of a sequence is a more important concept than its special case, the sum of two arguments.
What function 'add'? 'add' in some computer language? If so, which language? Or 'add' in mathematics? If so, what is the difference between 'add' and '+'?
Then what is your basis for saying that the sum of a sequence is more important than the sum of two numbers? Given associativity, the sum of a sequence is the same as recursively taking the sum of two numbers. Going the other direction, if you have the sum of a sequence, then the sum of two numbers is obviously included in that. The two approaches therefore seem completely equivalent to me. So why say that one is the more important concept?
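Concretely, the variadic version is usually just the binary one folded over the arguments (a sketch in Python, assuming numeric arguments):

    from functools import reduce

    def add(*args):
        # n-ary sum defined by repeatedly applying binary addition
        return reduce(lambda a, b: a + b, args, 0)

    print(add(1, 2, 3))   # 6, same as (1 + 2) + 3
    print(add())          # 0, the empty sum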
Sure, so prove that add(x1, x2, ..., xn) is equal to any permutation of the numbers x1, ..., xn. You still need to prove that add(x, add(y, z)) = add(x, y, z). So now you have two things to prove...
“Sure” what? I don’t need to prove any of that to use a variadic ‘add’ function. Pretending it’s a function of just two args is just a cherry-picked example to make operators look nicer.
Surely you still want to know about properties like add(x, add(y, z)) = add(x, y, z), to take advantage of the variadic functions. I was just pointing out that making something variadic doesn’t mean you have less work to do correctness-wise - it’s strictly a quality-of-life improvement for using the operator.
The difference between the function add and the operator plus is solely in the resulting parse tree. Chaining pluses together is the same as adding arguments to add. Proving correctness of operators and stuff is just not something I do when I write Python code.
> Chaining pluses together is the same as adding arguments to add
No, chaining multiple pluses together in Python (as in most widely-used programming languages) is exactly the same as doing multiple two-argument additions, and is not the same as doing a multiple-argument addition.
You can see that in this Python code:
n = 2.0 ** 53
print("(1)", n + 1 + 1)
print("(2)", (n + 1) + 1)
print("(3)", n + (1 + 1))
the line (1) has the same result as line (2), but is different from (3).
That's because addition of floating point numbers is only defined for two numbers and is not associative, so `n+1+1` is parsed as `(n+1)+1` and not some magical three-argument floating point addition, which would give the same result as (3).
chaining pluses is the same as adding arguments to add
watch:
1+1+1 == add(1,1,1) == 3
you don't have to be so pedantic about implementation quirks, it's not going to become important in your lifetime, unless you consider this comment thread important, in which case, congrats.