My favorite Erlang program (2013) (joearms.github.io)
338 points by vector_spaces on Sept 7, 2023 | 117 comments



If the server closure F, besides its own messages, can also receive a `{become, F}` message, then you can keep changing the server into something new, again and again.
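A minimal sketch of that pattern (names hypothetical): a counter server that answers its own messages but also honors `{become, F}`:

```erlang
%% Hypothetical sketch: a server with its own protocol ({get, From})
%% that can also be told to become an entirely different server.
counter(N) ->
    receive
        {get, From} ->
            From ! {count, N},
            counter(N);
        {become, F} ->
            F()    % hand control to the new server loop
    end.
```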

Completely unrelated, but I remember talking to Joe at one of the Erlang conferences. He was always excited about technology and always happy to chat with anyone. He was dismayed how Windows had gotten worse and less usable over the years, and how one day we won't be able to browse our own files on it until we sit and watch advertisements for a while to unlock them. Joe was nearly right! Sure enough, years later I hear there are ads in Windows 11 and you have to go out of your way to remove them. Not quite the same yet, but by Windows 13 I am sure we'll get there.


I use Linux at home, and use Windows for work. Outside of the ads, I've been most surprised by

1. My Windows machine has a lot more of the weird issues I used to see on Linux 6-7 years ago. Nothing showstopping, more quality-of-life stuff: the volume in one headphone will be far louder than in the other, despite the mixer showing them the same. Or weird crashes in applications (specifically Microsoft software). Or Windows documentation pages not resolving in any browser, forcing me to use an archive site to view them. Or my laptop occasionally not detecting my external monitor after detecting it fine for weeks. Or the monitor flashing occasionally.

2. Less surprising, more dismaying: Office software by default tries to get you to save files to the cloud instead of locally. Yes, having your files everywhere across all devices is nice, I know, but I would rather have my files local first, with cloud backup, than have my files in the cloud first. I'm old-fashioned, I guess.

3. WSL is pretty jank, but it is probably the only reason Windows is usable at all for me

More topically: I've always heard that about Joe being a wonderful human being, and it makes a lot of sense. When I was first teaching myself to code, I came across his book on Erlang, and it had a huge impact on me. I loved the playfulness, humility, and imagination that he brought to his work. Even though I hardly ended up writing any Erlang, reading his writing changed the way I think about so many things related to code. It's clear he cared a lot about pedagogy. I have a math background, and wish that math, software engineering, and CS culture all embraced playfulness and humility more.


Microsoft fired their entire QA team.

They also fired their entire technical documentation team.

They now treat Windows like freeware/shareware, and monetise it the same way -- selling ads, collecting telemetry, bundling, and being pushy.

Windows Server is abandonware, Microsoft only cares about Azure now. As far as they are concerned, hosting software yourself on-premises is a strange aberration and will eventually just go away.

Entire categories of their products are no longer available for on-prem installation, such as: CosmosDB, Data Factory, Purview, Logic Apps, etc...


"Microsoft fired their entire QA team. They also fired their entire technical documentation team."

Honest question: Is this a scurrilous rumor or the actual truth?

If you're just being hyperbolic, I'm down with that, no complaint, I'm just curious how literally true this is.

(I have to admit it'd explain a lot of what I see lately; I've observed for well over a decade now that if people held Windows to the same standards they do for "the Year of the Linux Desktop!" claims that Windows doesn't necessarily pass the test either. But it does seem to be getting worse lately, and with things like ads it's downright broken by design. Windows now survives in my house by the skin of its teeth, only because I haven't found a Linux Cricut solution and we use that every few months.)


They still hire SDETs, so I assume it's not true; however, I did have a manager who said his whole team was laid off from Microsoft, so I imagine they had some substantial downsizing around that time [0].

[0] https://techcommunity.microsoft.com/t5/exchange-team-blog/wh...


They essentially eliminated the SDET role across the company. There may be a few left here and there, but it used to be a whole thing of its own.

The book How We Test Software At Microsoft describes in detail what the SDET role was about and how Microsoft used to approach quality.

I think it’s a tragedy that they eliminated the discipline entirely.


> Microsoft fired their entire QA team.

> They also fired their entire technical documentation team.

Explains why, when I tried to use the new Clipchamp app, I first had to look for the docs, because what should have been obvious wasn't.

... and when I found the docs

- they are translated (which is bad because it makes it harder to search in online resources when you have to guess what the concepts are called in English)

- and it is impossible to change language

- and navigation is so badly broken that the only way to navigate reliably between pages is to scroll to the top of the one you are currently on, find the link tree and choose another

- oh, and the translations are hilariously bad: "to share" (as in sharing, which in my language is "å dele" or "deling") was consistently translated as "a share" (as in stocks, which in my language is "aksje"). As someone who knows English reasonably well and even translates from time to time, it was easy enough for me to understand, but really, really jarring, and probably incomprehensible to someone who doesn't understand English and actually needs the translation.


They seem to be going cloud only even on the desktop.

I tried to use the built-in video editor in Windows 10, and it told me to start using Clipchamp instead. A few clicks later the new program was installed and running, and before I could do anything it asked for my name. This is normal in many office programs that want to embed author info in metadata, so I entered it without any hesitation. A few seconds later a new email popped up in my inbox - I apparently had signed up for an online video editor service using my Microsoft account without my knowledge.


I've had a headphone balancing issue on macOS as well, probably on my previous machine (Intel, 2016 model, I believe). I ended up installing a tool whose sole purpose was to rebalance the audio whenever it detected the balance had skewed. I have zero clue how that would happen; I can't see any practical purpose for any application or subsystem to be adjusting the audio balance.


Probably it's the headphones doing spatial sound wrong?


In my case at least, my Windows laptop is the only device that has this problem. It works totally fine and as expected on several Linux laptops, my iPhone, my Android, and my partner's phone.


> My Windows machine has a lot more of the weird issues I used to see on Linux like 6-7 years ago. Like nothing showstopping. More like quality of life reducing stuff, like the volume in one headphone will be far louder than in the other

A while back I installed a Windows update that permanently prevented my existing Bose headphones from being able to connect to my laptop over bluetooth.


Your first paragraph. This is actually a feature of the Actor model. I didn’t realise before that it’s possible on the BEAM.


I thought this was basically the whole argument for Erlang, continuous hot-replacement of running servers and resilient server-cluster upgrades. Basically everything that kubernetes offers (from an operational perspective) has been available as a first-class programming construct in Erlang for decades. That was my understanding, anyway.


> Sure enough, years later I hear there are ads in Windows 11 and you have to go out of your way to remove them.

The intrusive ads are not new to Windows 11. They're already there in Windows 10.


A short (35 min) overview of BEAM and why it is not like other VMs: JVM, Node https://www.youtube.com/watch?v=pO4_Wlq8JeI


I have watched his similar talk The Soul of Erlang and Elixir, and it was wonderful. It gets me excited every time.


I miss Joe. His infectious enthusiasm for doing computing _better_ left such an imprint on young me and the way I approach technology today.


    universal_server() ->
        receive
            {become, F} ->
                F()
        end.
Honestly I don't fully appreciate the power of this universal server. Can anyone help?


If you are familiar with Go, it's similar to a goroutine that waits for someone to send it an anonymous function to start executing. In Erlang, you solve problems by spawning lots of processes, and most processes are waiting to accept messages. One critical difference is that in Erlang, processes can run on remote machines, seamlessly. This process accepts a message which is a tuple containing the atom become, which is basically just an enum, and a function. Another process, such as the Erlang shell, can send this tuple message with the function to this process at any time so that it "becomes" that function.

What he is saying is that you can swap out the logic of this waiting loop with whatever protocol logic you want. His example was that he had a fleet of machines that were running this loop, and he sent all of them a function that implemented a gossip protocol. But he could easily send them all another message that turns them all into BitTorrent clients.

Joe was an absolute genius and an extremely kind person. I had the honor of meeting him once. Erlang is still one of the most beautiful technical creations I've ever encountered. It really does make you see concurrency in a whole new way.


On the subject of beauty, I really like this quote from Joe Armstrong:

“Make it work, then make it beautiful, then if you really, really have to, make it fast. 90 percent of the time, if you make it beautiful, it will already be fast. So really, just make it beautiful!”


> Joe was an absolute genius and an extremely kind person. I had the honor of meeting him once.

Completely agree. Joe was always willing to reply to my emails and answer my questions in detail. He was an exceptionally talented explainer, and his messages were always interesting and entertaining, while also being informative.

I never met him in person but I will always treasure the correspondence I had with him.


> One critical difference is that in Erlang, processes can run on remote machines, seamlessly.

Is there a concise way to explain how Erlang achieves this property?


It's not as mysterious as it sounds. Every data structure (including modules and anonymous functions) has a binary serialization, and every Erlang VM is also an RPC server that can receive arbitrary data -- including whole programs -- and execute it. Your VM of course needs to know about the remote VMs to do so, but that's where the rudimentary clustering mechanism in Erlang comes into play.
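You can see the serialization half of this in the Erlang shell (a rough sketch of the mechanism, not the actual distribution protocol):

```erlang
%% Serialize an anonymous function to bytes and revive it; sending a
%% fun to a remote node works along the same lines.
Bin = term_to_binary(fun(X) -> X * 2 end),
F = binary_to_term(Bin),
10 = F(5).
```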


Also, there is no shared memory between processes; they communicate strictly through message passing. Running a piece of code in the local node, in another node on the same machine, or in a node on the other side of the planet is just a matter of saying which pid you want to send the message to. The BEAM will figure out how to route the message to the correct place in the cluster, and your program will be none the wiser.


You can find all the details about Erlang internals in The BEAM book : https://blog.stenmans.org/theBeamBook/


It's basically a distributed RPC system. Your call is a message.


I would be interested if you could elaborate on the advantages and disadvantages of Erlang/Elixir over Go. I have often heard Elixir processes likened to goroutines. Since the process model seems like the primary advantage of Erlang, why would one prefer Erlang/Elixir over Go?


The critical difference is that processes are isolated; they don't share resources like goroutines do. This makes it possible to separate the error handling from the business logic.

When you open a file, a socket, allocate memory, borrow a database connection from a pool, etc. you don't need to write try/catch/finally, or defer() statements, or logging, or any kind of error handling like that - if your code crashes, the VM will take care of the cleanup (because it knows what resources are owned by your process) and the supervisor above your process will log the problem and restart the process if necessary.

This makes Erlang applications much safer by default, and the business logic is clearer and simpler because it's not mixed with error recovery code (the "if err != nil" every other line that Go is famous for). The error kernel (the part of the app that must be carefully written to ensure reliability) can be kept really small[1].

This comes at a cost in performance, because data must generally be copied between processes, whereas goroutines can send pointers directly to each other. But if you can afford it, it's _very_ nice. Besides the error handling this also enables crazy observability (you can connect to a running app and inspect processes[2], kill them, send them messages, jump between nodes in a cluster, etc.), live code reload, and a bunch of nice things like that.

[Process isolation also enables clustering, where processes running on different machines can talk to each other as if they were in the same OS process. The copying means it doesn't matter if the other process is remote or local.]

[1] https://medium.com/@jlouis666/error-kernels-9ad991200abd

[2] https://github.com/zhongwencool/observer_cli#demo
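A toy illustration of that cleanup model (a hedged sketch; real systems use the OTP supervisor behaviour rather than a hand-rolled loop like this):

```erlang
%% The parent traps exits, so a linked worker's crash arrives as an
%% ordinary {'EXIT', Pid, Reason} message and can trigger a restart.
start() ->
    process_flag(trap_exit, true),
    Pid = spawn_link(fun worker/0),
    supervise(Pid).

supervise(Pid) ->
    receive
        {'EXIT', Pid, Reason} ->
            io:format("worker died: ~p; restarting~n", [Reason]),
            start()
    end.

worker() ->
    receive crash -> exit(boom) end.
```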


The clustering is what makes it feel so different, and what makes it such a compelling runtime for certain types of problems.

It's not an accident that large distributed chat programs or MQ systems are often written in erlang. From inside the code, writing a distributed system feels the same as working on one node.

With go, you can't do channels to a go process running on a different server without significantly changing the language. In erlang, you don't notice because that's just how it works.

Write an app that runs on two nodes and sends updates to all clients via websockets. In erlang, you just write it. In EVERY other language, you're loading a non-idiomatic library and probably running an MQ cluster or using an MQ service... probably written in erlang.


Copying doesn't always have to happen. The Erlang VM is free to share blobs behind the scenes.


Indeed, hence the "generally" :-)


I wrote up a fairly detailed comparison of the concurrency models, as someone who has used both for years, a few months ago: https://news.ycombinator.com/item?id=34564228


More dynamic, better management, and it's dead easy to get distributed programs running across multiple machines.


To explain further, the `receive` keyword is a bit like a switch statement (but actually a pattern matcher) for incoming messages.

Here they made a new server that takes a return process (From) and a number (N). The exclamation point sends the result back to the return process.

    factorial_server() ->
        receive
           {From, N} ->
               From ! factorial(N),
               factorial_server()
        end.

    factorial(0) -> 1;
    factorial(N) -> N * factorial(N-1).
This code then spawns the server, sends a message telling it to become a factorial server, then tells that server to send back a message with the factorial of 50. It then specifies its own message listener that takes whatever it receives and returns it.

    test() ->
        Pid = spawn(fun universal_server/0),
        Pid ! {become, fun factorial_server/0},
        Pid ! {self(), 50},
        receive
            X -> X
        end.
A couple of the major advantages of Erlang are its distributed, parallel nature and hot code update. The latter happens in `Pid ! {become, fun factorial_server/0}`, where it overrides the receive loop of universal_server with that of the factorial_server. Though I think proper hot code update doesn't work like this.


And /0 is the local vm?


It's the arity of the function. In Erlang there are no variadic functions, but functions with different arities can have the same name, so universal_server/0 takes no arguments, fib/1 takes one argument, and fib/2 takes two. The second argument might be an accumulator for a recursive Fibonacci, for example, and fib/1 may call fib/2 as an implementation detail.
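A small illustration (hypothetical names) of two same-named functions distinguished only by arity:

```erlang
%% fib/1 is the public entry point; fib/2 threads an accumulator pair.
fib(N) -> fib(N, {0, 1}).

fib(0, {A, _B}) -> A;
fib(N, {A, B}) -> fib(N - 1, {B, A + B}).
```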


`/0` refers to the arity of the function.


Normally you need to know ahead of time what your server is going to be used for and write the code in advance.

But with Erlang, you can have a distributed network of Erlang servers where the server is a generic computing resource that can do anything the client wants.

The code actually comes from the client. No need to get your system administrators to install some binary on all the machines. You simply pass the function along and the remote machine calls it.


A few people here are liking the idea, but isn't this the exact definition of arbitrary code execution (an exploit)?

It's possible people are just showing off the capability of BEAM, though.


It's the definition of any distributed system (or even single server) where you can deploy code, whether it's Erlang or a kubernetes cluster you setup.

Is pushing new code for your server to run "arbitrary code execution"? I guess we can call it that. Is it an exploit?

Depends if the code comes from some random person on the internet from mechanisms that you don't intend for pushing new code to run (e.g. through a buffer overflow on your server or XSS), or if it comes from yourself through your official mechanisms.


"Arbitrary code execution", yes, "exploit", no. You need to be "inside" the BEAM cluster and a member of the BEAM cluster to do this. That is not something you hand off to end users, just as you do not normally hand end users direct access to your database socket or other such resources. In Raymond Chen's terminology [1], if you're sending Erlang terms to the BEAM cluster, you're already on the privileged side of the airtight hatchway.

[1]: https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31...


Only if you allow untrusted clients to join to the cluster. To really make use of something like this, you’d have to control the clients rather than opening it up to anybody on the public internet.


Thanks! I get it — no need to install binaries of specific servers.



Double-check whether you're thinking of the correct abstraction level. The BEAM is more like the JVM than Kubernetes.


One of the most interesting characteristics of the BEAM, in my opinion, is that it's similar to both. As a memory-managed bytecode runtime it's similar to the JVM, and as a distributed process orchestrator+discovery and RPC system, it's similar to Kubernetes.

I often wonder what would have happened if the "BEAM renaissance" (driven largely by the birth of Elixir and associated tools) had happened a decade earlier, before Kubernetes became the de-facto standard for ad-hoc distributed computing in web software.


Basically, the idea is that it can become any type of server you’d like. The actual function to run the server is passed by the client and the Erlang process effectively morphs into that server after constructing it using the provided function. When you consider Erlang’s hot reloading abilities, this simple architecture becomes even more powerful. Another way to look at it is that the Erlang process is just compute waiting for work to do and the work is to run full-blown servers. Pretty neat.


I was typing this answer on my phone and I didn’t realize several folks already responded. Move along, nothing to see here. Haha.


I may be answering the wrong question, but I’ll give it a shot.

Erlang’s architecture is unusual; both the virtual machine and the language are built around the idea of tiny processes operating concurrently, each process running in an infinite loop waiting for incoming messages to interpret.

This allows a process to become whatever code you send it. If you need a process to control a microwave, and then run some quantum computations, and then predict the winner of tomorrow’s football game, you just send it the code it needs for each operation and it happily does so.


Ah, so TL;DR is that each process evals the supplied code and returns the output to the caller?


The universal server may or may not return anything. It just executes whatever is passed to it. In Erlang there are two ways of "calling" (in quotes for a reason). There's conventional function calling which is the regular synchronous style that we all know and love:

  f(10).
This will produce a result and return it to its caller. The other isn't really a call, it's "sending":

  Pid ! 10.
Some process id has been sent the value 10. It may be on this same node, it may be on another node, I don't have to care (sometimes I do though). This is asynchronous. Once a send is done the sending process will continue on (perhaps even terminating). At the other end of the send is a receive (hopefully, otherwise somebody's queue is getting filled up...):

  receive
    N -> ... % do something with this value
  end.
In the case of `universal_server` we don't know what it will become, it's just going to execute whatever 0-ary function is passed. That function may or may not include a "return" (sending a value back to the origin). It could also just terminate the universal server. Or it could temporarily convert the universal server into something else and then become a universal server again.


Slightly different in the low-level details, but conceptually accurate.


You execute that on any node in your system and send it a message `{become, fun some_function/0}`. Once it receives that conforming message (a tuple of two items: the atom `become` and a 0-ary function), that node will stop being a "universal server" and become whatever process "some_function" describes.

And in his case he had access to some 9000 computers. If each was running at least one Erlang node and each node was running a universal server, then with a very simple program he could write a function, serialize the function, and distribute the function to his 9k+ running universal servers and turn them into 9k+ specialized servers.


You can achieve the same goal, if not so elegantly, if you define a node process that processes HTTP POSTs by evaling the body of the request to replace the previous handling function. In practice you'd quickly want to post a function that behaves normally. However you could also define a function that does something "normal" but has a code path for continually redefining the function.

As exotic as this sounds, this is very similar to what web-browsers do with script src tags, especially from 3rd parties. The page is saying "Hey let me eval a function that can do whatever it wants in this context. I trust you!" Most webdevs don't consider this a threat vector!


This universal server is a process listening for a message containing a function, which it then executes. Here the function is itself just a different infinite server, turning a running process into something new. I think it shows off the power of Erlang processes and the ability to pass functions that replace running processes with new behavior without changing pids.


It is a nice example, though one that would rarely be used in real systems. It could help you understand how hot-code reloading might be implemented. A real use could be: if we have some big state and want to apply operations to it, we could send `{execute, F}` to the server and pass code (i.e. a simple reference) instead of data.
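A sketch of that idea (names hypothetical): a server that owns a big state and applies whatever function it is sent, so the code travels to the data instead of the other way around:

```erlang
%% Send {execute, F, From}; the server applies F to its state and
%% replies with the result, keeping the (possibly huge) state local.
state_server(State) ->
    receive
        {execute, F, From} ->
            NewState = F(State),
            From ! {ok, NewState},
            state_server(NewState)
    end.
```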


Is the same trick possible with other BEAM languages? E.g. Elixir?


Yes. Here[0] is a gist showing it done in Elixir -- as you'll see, it looks very similar to the Erlang code:

[0] https://gist.github.com/mndvns/80b00cf67d418e8359fb5566b80ae...


Here is an aggressive worm/virus that exploits this exact mechanism.

https://github.com/wmealing/Elixir-virus


Yes; it is a property of the "Erlang Run Time System"(ERTS)/"VM"(BEAM) - https://news.ycombinator.com/item?id=37415159


Yeah. I don’t see why not.


I spend a lot of time trying to explain why the BEAM is special and why concurrency in Erlang/Elixir/etc. is special when juxtaposed with Go's or Java's concurrency story (now that Loom is right around the corner).

From now on I'll just link them to Joe's favorite Erlang program and this HN thread.


This blog post was recently mentioned in this talk, which is excellent: https://youtu.be/pQ0CvjAJXz4


Excellent presentation!

Covers and explains all the steps in a high-availability, fault-tolerant, distributed Erlang-based system architecture beautifully.

Thank you very much for the pointer.


Is this the system where every patient is represented by an erlang process?


I don't know Erlang beyond a high-level understanding of the language (basically what someone who has read about it, but never programmed in it, would know). Why is this needed (or, at least, why is it nice to have)?:

  universal_server() ->
      receive
         {become, F} ->
             F()
      end.
I mean, what purpose does it serve other than calling F() directly? One could just directly spawn F on the remote machines, no?

(Perhaps that would need to have the code for F already on those remote vms, whereas this eg. also serializes and forwards F's code?)

I would understand the utility if this also handled some common boilerplate, but it seems to just wait for a `{become, F}` message and then call F().


This code, while cool, is not very idiomatic, or at least not something you'd see in production often. That's OK; Joe loved to explore and experiment.

The way it works is that F is a closure. It contains some code but may also have captured variables and its environment. Calling F() then executes the code, with access to the whole environment that surrounded F when it was created.

In Erlang we can send a closure across the network to another Erlang node. That can be used and is cool but you still need the module code referenced in the closure available on all nodes.

Long story short, in production code you're right: we would just call the F module directly and pass it the arguments explicitly. This is more of a cool demonstration of possibilities.


Doesn't this also allow hot reloading on code update? You can update the code and, upon receiving it, auto-reload. So "it's just a closure" is carrying a LOT of context (i.e. the entire codebase).


You could use this for hotloading: send every (relevant) process a message with a new closure to run, and they begin running it when they process that message, but BEAM provides other infrastructure for hotloading.

For modules, BEAM can have two versions loaded: the current version, and an old version. When you make a fully qualified function call module:function(...), it will call the current version, but calls within the module stay within the same version.

So if you load a module's new beam file, all processes that make fully qualified calls into that module get the new version. In case you're wondering, if you load a third version, any processes still running the first version get killed (you can check before loading a third version if you don't want that).

It's idiomatic to write receive loops as

   loop(State) ->
     NewState = receive ...
     end,
     ?MODULE:loop(NewState).
so that if a new version of the module is loaded, processes will update after they next receive a message. Then you might add a timeout or a heartbeat message to ensure the processes update in a reasonable timeframe. Or use gen_server which has the loop and calls your module so processes only linger in your old module while working on a request.


It can, but it's somewhat orthogonal. What F really is, is a closure: besides being a function, it may also capture its surrounding variables. However, if that closure calls a function from some module like foo:bar(Var), the module foo bytecode still needs to be available on the remote node. There is a way to remotely load module bytecode onto remote nodes, but just sending the closure F itself won't do it; it has to be done separately (https://www.erlang.org/doc/man/code#load_binary-3 is one way to do it, or the shell's nl(...) function does the same).


It's cool because Erlang's functional design means F() is just as good a "top-level" context as whatever one you were in before. E.g. this will smoothly let existing requests finish on the old logic while starting new requests on the new logic, and eventually allow the old logic to be GC'd.


> Or would that need to have the code for F already on those remote vms

Yes.

But you can also send the code to the remote VM.

For example, in Elixir you can call `Code.compile_string`, which returns the name of the module and the code as a binary (BEAM bytecode).

You can then send that bytecode to another VM and load it in the remote VM using

   :rpc.call('node@remote_host', :code, :load_binary, [the_module_name, "filename_doesnt_really_matter", binary_code])
and then spawn it

   spawn('node@remote_host', my_module, my_function, [args])


The quoted program is intended to run on the remote machine, and deserializes the new program that the remote machine is supposed to start running. F is a variable that stands for this new program that has been serialized and sent as an inter-process message, which in this case has been sent also over the network.


There is also this lesson from Joe, about how to write a basic server in Erlang:

https://gioorgi.com/2015/erlang-lesson1/

I transcribed it and explained a bit more: it shows the power of an async language like Erlang/Elixir compared to other ones. Sadly, it is little used nowadays.


My favorite Erlang programmer.

RIP Joe.


My latest obsession has been the Julia language, which seems to borrow a number of ideas from Erlang in its distributed model. It's reasonably easy to replicate this kind of "instant-server" that magically works across multiple nodes. I don't know for sure, but I would be surprised if Julia's Distributed module wasn't at least a little inspired by Erlang.

I absolutely love Erlang, and I wish I had more of an opportunity to use it, but it's nice to see that some of its concepts are bleeding into other platforms.


Ah, that's great, sounds like we are allies, for example when arguing for "let it crash" and other takeaways from the Erlang error handling philosophy in the Julia community.


> What I ended up doing was making some scripts to install empty universal Erlang servers on all the Planet lab machines (pretty much like the code in this article) - then I set up a gossip algorithm to flood the network with become messages. Then I had an empty network that in a few seconds would become anything I wanted it to do.

But you already had an empty network that in a few seconds would become anything you wanted it to become; that's how you installed empty universal Erlang servers on all the computers.


The difference, I think, is that you hand off managing the distribution of the code to the Erlang VM, making it much easier to distribute anything else.

Of course security guys are going "Wait, every machine has an open RPC channel that will blindly execute code from every other machine???"


> security guys are going

It's a bit concerning if it's only "security guys" seeing issues with this approach.


A cluster of Erlang nodes in distribution don't have a security boundary once they're connected. Neither do two threads in the same process in a traditional system.

If that's an issue, and I can see why it might be for some, you'll need to use something other than Erlang distribution to connect your cluster.

You can patch out the explicit rpc server, but I'm not sure that you can patch out receiving functions (maybe removing the code that deserializes them from external term format would work?), and you'd need to audit the whole thing to ensure nothing ever calls a function it receives.


I'm not a security guy, so I can't speak for what their concerns would be.

I think my concern would be more in the realm of, you're inherently relying on whatever peer authentication system BEAM uses (and it being bug/exploit-free), to determine whether an RPC call can suddenly inject new code into a running system.

In my experience good security is a lot like an onion - there's lots of layers and generally some crying.

If we consider a "non-BEAM" system that runs on, say, the JVM, or even a "scripting" runtime (e.g. PHP, Python, Ruby, Node.js), you typically have a number of elements that can contribute to making the code that runs relatively immutable outside of a deployment event (e.g. restricting access to ssh/sftp/etc. protocols, varying types of filesystem permissions, read-only volumes, container filesystems, etc.).

Do other systems have similar "run arbitrary code" vulnerabilities? Sure. Allowing uploaded content to be executed by the PHP runtime is a classic example. But there's almost no legitimate purpose for this, it's almost certainly a configuration error/mistake, and it's relatively trivial to prevent it completely.


You can lock things down to your heart's content using features of your OS of choice.


You're not supposed to open the Erlang RPC port to the open Internet, and there is a modicum of a security barrier in its cookie if you're so inept as to run Erlang on a non-firewalled server on the open Internet.

You don't need security people for basic sysadmin know-how.


You do apparently need security people, or basic sysadmin know-how, to understand that not all security threats originate from "the open Internet".


So? How is this specifically relevant to the topic at hand?

If you have someone malicious in your internal network that can connect to your Erlang nodes, you have bigger issues than them connecting to your Erlang nodes.

I don't see the point of your original comment. Having an out-of-the-box RPC mechanism means you gotta secure it as you secure any other internal service. That's sysadmin 101.


> I don't see the point of your original comment. Having an out-of-the-box RPC mechanism means you gotta secure it as you secure any other internal service. That's sysadmin 101

Most RPC systems can't inject code into the running system dynamically. That's the point of my original post.


Yeah, but what happens when your local web browser is tricked into opening a socket to the local Erlang instance by a malicious web page? That's the kind of thing that keeps security professionals up all night.


It’s a binary protocol though, which means that this specific scenario isn’t an issue as far as I’m aware (the only ways of making arbitrary JS-controlled network requests are XHR/fetch(), which is only HTTP, and WebSockets, which have a handshake mechanism to ensure that the other side is really expecting a WS connection, which Erlang won’t be).


Question: I am considering learning Go or Elixir to develop the backend for a high-frequency financial application. Focused strictly on concurrency and scalability benefits of each, which would be the better choice?

Granted they are probably both great at highly parallel and distributed computing applications, I am just wondering if there is an interesting differentiation between these two in this aspect that I should be aware of.


Comparison on concurrency: https://news.ycombinator.com/item?id=34564228

On scalability, given that high-frequency financial applications are generally super performance sensitive, or at least have a reputation for it, I would probably consider the BEAM VM disqualified (not the slowest language but definitely not the fastest), and even still have some serious questions around Go. I like to characterize it as one of the slowest members of the fastest class of languages; yes, it's compiled, and that means it generally beats interpreted and VM-based languages, but it's definitely on the slower end of compiled languages due to its prioritization of compile speed over optimization time.

In the case where I ported a system straight out of Erlang into Go, without a rearchitecture, pretty much a straight port, the Go system was 5-10x faster than the Erlang system, and I'd expect that to hold. BEAM made some decisions back in the 90s that give it some significant performance headaches nowadays... not fatal for all uses by any means, it's completely possible to build a reasonable system on it, but if raw computational performance is high on your list of needs it's got some nontrivial disadvantages.


> if raw computational performance is high on your list of needs it's got some nontrivial disadvantages.

That's what Rustler, Zigler, and Numerical Elixir (Nx) solve. It doesn't mitigate issues around deep copies when passing messages between processes, but the absence of shared memory means that race conditions aren't a problem to worry about, and the memory model is such that garbage collection is never going to be a performance issue even if you scale up to millions of processes per node. It also means you can trivially scale horizontally across multiple nodes because your code doesn't (more accurately can't) depend on sharing memory.
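To illustrate the trade-off being described, here is a minimal Go sketch (names are mine) showing why the deep copies matter: sending a value over a Go channel shares the underlying memory rather than deep-copying it the way BEAM does between processes, which is exactly where the potential for races comes from.

```go
package main

import "fmt"

func main() {
	ch := make(chan []int, 1)
	data := []int{1, 2, 3}

	// The slice *header* is copied on send, but the backing array is shared:
	// sender and receiver see the same memory. BEAM sidesteps this by
	// deep-copying every message between processes.
	ch <- data
	data[0] = 99 // mutation after sending is visible to the receiver

	got := <-ch
	fmt.Println(got[0]) // prints 99
}
```

In BEAM the receiver would still see 1, at the cost of the copy.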


I've been hearing "You can use this slow language and just plug in to a faster one when you need more speed" for over two decades now, and it still has yet to impress me in practice. It's still too easy to accidentally stray back into the slow language for extended periods, and it's virtually guaranteed when it's the "glue layer". Rust straight-up solves race conditions on its own terms, it is not hard to program Go in a way that doesn't have them, etc. It's easier just to write in the fast language in the first place.

Also, I think the BEAM GC is widely misunderstood. Per process it's kinda neat that it can GC without having to consider the others. But globally it's actually kinda slow. Like I said, I've straight-up ported Erlang programs into Go, network clustering and all, and Go was quite significantly faster, even with the not-so-great GC Go had in the ~1.6 era. (I suspect if I ran the same code now it would just work with no GC problems.) BEAM GC is more "hey, look at this neat trick!" than "better than anything around". I actually wouldn't consider it very good at all in modern terms. The Erlang community has a huge problem with thinking it's still 2003 and comparing Erlang to all the other competition in 2003, but unless you're still living in 2003, you need to be comparing to what's around in 2023.


> It's easier just to write in the fast language in the first place.

I’m not sure if that’s measurably true in terms of ergonomics (e.g., there’s no cognitive load difference between using a borrow checker, types, programming defensively, thinking about async at all). But I have no actual experience with go to offer a compelling counterpoint

> Like I said, I've straight-up ported Erlang programs into Go, network clustering and all, and Go was quite significantly faster

Was the JIT any help?

In your view, what technical decisions prevent good performance in the Erlang VM? Why do you think good performance is either impossible, difficult to achieve, or unlikely for these reasons?


"I’m not sure if that’s measurably true in terms of ergonomics (e.g., there’s no cognitive load difference between using a borrow checker, types, programming defensively, thinking about async at all). But I have no actual experience with go to offer a compelling counterpoint"

The differentiation used to be HUGE. In the 2000s, especially early in that era, static languages were a NIGHTMARE. I personally believe this is a large part of the reason that the dynamic scripting languages won as much of the market as they did, despite their manifold disadvantages. They were going up against language environments that were even worse, by a large margin. 1990s C++ was just a nightmare.

Today the difference is much more muted. I've had the pleasure of being assigned a greenfield research project by myself for the last 6 weeks, and I'm not only making plenty of progress in such a task in a statically typed language (Go), I am now well into the point where I am actually better off in a static language, because I can make very significant infrastructure changes safely, without having to bodge adaptors everywhere.

But it's still an opinion thing, sure.

"performance issues"

The dynamic typing story is a big problem. We've seen a whole bunch of attempts over the years to take a dynamically typed language and speed it up with JITs. In a nutshell: you can do better, but we never got anywhere near C-speed performance in general, despite all the hope in the early 2000s... and the programming community loves imbibing old propaganda and still has not really grappled with the fact that JITs do not generally "work" in that sense.

Linked lists as a primary data structure for so many things is not good in 2023. While you avoid some of their issues in that the other end of the linked list is much more likely to be in cache because of the smallness of the local process heap, they're still linked lists.

The BEAM VM does a lot of bookkeeping. When this bookkeeping is useful, it is invaluable, but when you have a system just chugging along, working, and not needing any intervention, it's doing a lot of unnecessary bookkeeping that most other languages aren't. Note I'm not saying this is all bad; I'm just saying, it isn't free.

Pattern matching is probably generally slower than polymorphic dispatch, let alone static type-based dispatch, especially when you start writing deep patterns. I'm a little heterodox on this question in general, probably. I'm actually not entirely convinced that pattern matching is somehow generally "superior" to the alternatives. There are several negatives I'd associate with it that advocates often overlook. It's not disastrously broken or anything, but I find its cost/benefit over alternatives to be much more complicated than often supposed.

Immutability slows down a language. Compilers can try to optimize around it in the common case, but while mutability may have software engineering disadvantages, it is certainly generally faster. And while it's easy to show immutability has a maximum "log n" slowdown factor over mutable code, it's still easy to accidentally do worse than that, and even a "log n" factor is something you'll feel.

It's a lot of little things. And like I said, the totality still adds up to a usable language. It's still generally faster than Python, for what that's worth. But it's also not a fast language. The BEAM VM itself performs well, but the language it implements is slow, and I think in the intervening years there have been a lot of languages taking a lot of different approaches that achieve the same goals without the same slowdowns.


This top-level comment in this thread links to an excellent case study which presents an architecture that may answer your question - https://news.ycombinator.com/item?id=37418015

For performance, drop down to C/C++ (carefully);

1) https://stackoverflow.com/questions/1811516/integrating-erla...

2) https://news.ycombinator.com/item?id=14771658


This was in a top level comment on this post. It perfectly articulates what sets Erlang/BEAM/Elixir apart from other runtimes/languages. Essentially performant, fault-tolerant parallelism by design.

https://www.youtube.com/watch?v=pO4_Wlq8JeI

If you need to break out into SIMD, you also have Nx at your disposal. And Rustler/Zigler provide an escape hatch to low-level languages; with the latter you can write Zig inline:

https://github.com/elixir-nx

https://docs.rs/rustler/latest/rustler/

https://hexdocs.pm/zigler/Zig.html


Without more details, you're just going to get people's favorite languages recommended, or be told that neither is suited to high-frequency trading applications.


My question is more general: what are some relative pros/cons of Go vs Erlang/Elixir when it comes to their approach to concurrency?


I replied in more detail elsewhere in the comments, but in a nutshell Erlang/Elixir processes are a tool to contain errors, and they make some opinionated design choices to achieve that.

Goroutines are a pure concurrency primitive; they don't contain errors. An unhandled panic in any goroutine will take down the entire program.

Elixir would be absolutely beautiful for a trading program - you could run each strategy (trading algorithm instance) in a separate process, so that a bug in one of them can't take down the whole system, and use monitors to guarantee orders are cancelled if a strategy crashes. I actually started learning it because I worked on such a system, and it was hard to sleep at night when any small coding error in the giant C++ codebase could send the whole thing crashing.

It would definitely be too slow for HFT though :)
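A hedged sketch of what that containment costs you in plain Go, where nothing like a supervisor exists out of the box (`runSupervised` is a made-up helper for illustration, not a real library API):

```go
package main

import "fmt"

// runSupervised is a hand-rolled stand-in for what a BEAM supervisor gives
// you for free: run f, recover if it panics, and restart up to maxRestarts
// times. In plain Go an unrecovered panic kills the whole program, so this
// containment has to be written explicitly around every risky unit of work.
func runSupervised(name string, maxRestarts int, f func()) {
	for attempt := 0; ; attempt++ {
		crashed := func() (c bool) {
			defer func() {
				if r := recover(); r != nil {
					fmt.Printf("%s crashed: %v\n", name, r)
					c = true
				}
			}()
			f()
			return false
		}()
		if !crashed || attempt >= maxRestarts {
			return
		}
	}
}

func main() {
	calls := 0
	runSupervised("strategy-1", 3, func() {
		calls++
		if calls == 1 {
			panic("bug in strategy") // contained and restarted, not fatal
		}
		fmt.Println("strategy completed on attempt", calls)
	})
}
```

This only guards one call site; BEAM supervisors also handle linked-process cleanup (e.g. the order-cancellation monitors described above), which there is no stdlib analogue for in Go.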


This is great, really appreciate your insight as you seem to have good experience with both.


I don't really know about Go. I favor functional languages, and so I am drawn to Erlang and Elixir. I recommend watching Sasa Juric's talks on YouTube to best understand Erlang and Elixir's concurrency features. One of his talks is posted elsewhere in this thread already.


Having used both, I think Go has better performance, and due to its static typing you can reason about functions since their parameters are explicitly typed. This scales better as the org grows. Elixir lets you define a spec, but it is not enough. I routinely had to go several call sites up the stack to understand the parameters. However, I hear static types are on their way, and that is huge.

Elixir has GenServer and BEAM niceties which is something you'll have to replicate, likely poorly, in another environment.

As for learnability, meh. Have a senior on staff and new employees will be able to learn either.

Elixir has Phoenix, LiveView, and Ecto. You'll just have to look into them, but they are good. The game changer is iex: you can get a shell into any running node in the cluster and interact with it.

For me, I'm much more experienced in Go and would leverage managed kubernetes if I needed to cluster. I think it scales better for larger orgs. If it is a core group of one small team, Elixir is pretty rad.


I recently started learning Elixir and find it super interesting (along with Erlang). The paradigm is so different. Mostly I am looking at it for hobby projects. It reminds me a bit of OpenVMS, where the OS had built-in clustering facilities, processes had mailboxes, etc.

I've done a bit of Go in the past and it definitely has its place, too. It feels more mainstream. We converted a Python app to Go a few jobs back and the performance increase was incredible.


Thanks, this covers all the angles quite well. What would be the closest equivalent to Phoenix/LiveView for Go (as in the best Django-like stack)?


Nothing, really; nobody has a thing like LiveView that I'm aware of. For Go, I use a request router (chi) and some helper packages for mundane things like reading env configs. I write the backend as an HTTP API and then have a separate frontend; I'm currently learning Svelte for that. Go is very batteries-included.


Build a POC in both and go with what you like working with the most.


Excuse a noob question, but what gets transmitted over the wire as `F` in the `become` message? Source, bytecode, function name, etc.? I.e., does the universal server need F in its "classpath" (or whatever the Erlang equivalent is)? If not, are there platform and/or security issues when calling the universal server in a distributed setting?

Genuine question. It's a fantastic attribute of Erlang to be able to do this.


It will be a closure "fun () -> ...", and what is sent is serialized bytecode plus a copy of the captured variables (remember, they are all immutable). For an intra-node call it will be a reference to the bytecode, with no need to copy anything.

However if this code refers to any functions from modules via syntax "foo:bar/1" then these references will be resolved on the target node, and both nodes better have the same versions of modules loaded.


There are no extra security issues, because at this point you already assume that the other side is trustworthy.

I think you can assume that they are sending something like bytecode, but the runtime might optimize that, e.g. if both nodes are on the same physical machine.


Related:

My favorite Erlang Program (2013) - https://news.ycombinator.com/item?id=31639382 - June 2022 (19 comments)

My favorite Erlang program (2013) - https://news.ycombinator.com/item?id=22413029 - Feb 2020 (54 comments)

My favorite Erlang Program (2013) - https://news.ycombinator.com/item?id=12396420 - Aug 2016 (38 comments)

My favorite Erlang program (2013) - https://news.ycombinator.com/item?id=8807660 - Dec 2014 (2 comments)


> Dean was doing an Erlang project so he asked “What example program would best exemplify Erlang”.

I wish such a canonical example was easy to find for every programming language, particularly those on the boundary between mainstream and exotic languages.

Most languages present themselves by waxing philosophical about HoTT, zero-cost abstractions, or parametric types. Show me an example where your language is clearly better than Python, Ruby, TypeScript, C#, or Rust, because those are the languages to beat, and they already have the entire infrastructure set up so unless you can demonstrably outperform them in some way, it's probably not worth my time to take a deeper look.


But what if it’s better for complex tasks or large programs? I also wish language home pages would show examples of what makes the language special right away but it’s not always that easy.


I miss working with Erlang and BEAM. It's a well built system and in some ways quite different from other things out there.


Do we have any alternative that compares to PlanetLab today?


Get a load of this guy... he knows two erlang programs.



