
> That's a property of the function, whether or not you want the type system to help you with it.

The problem is that in, say, C# or Kotlin, subroutines with the same semantics come in two different colours. The type system "helps" you distinguish between things that aren't meaningfully distinguishable.

> I don't get it. This just looks like async/await (submit/get) with extra steps.

You only need to submit if you want to do stuff in parallel and then join (also, there are no extra steps). The analogue to:

    var a = await doA();
    var b = await doB();
is just:

    var a = doA();
    var b = doB();
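
And if you do want them in parallel, you submit and then join, e.g. with a plain ExecutorService (a minimal sketch; doA/doB as above, and any executor works, including one handing out virtual threads):

    var e  = Executors.newFixedThreadPool(2); // or a virtual-thread-backed executor
    var fa = e.submit(() -> doA());           // runs concurrently
    var fb = e.submit(() -> doB());
    var a  = fa.get();                        // join on the results
    var b  = fb.get();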



> The problem is that in, say, C# or Kotlin, subroutines with the same semantics come in two different colours. The type system "helps" you distinguish between things that aren't meaningfully distinguishable.

The distinction is important though, because it affects the observable behaviour of other effects present in these functions. https://glyph.twistedmatrix.com/2014/02/unyielding.html

The only cases where you want the same function to come in two colour variants is when it's a higher-order function, and in that case what you really want is not to ignore colour but for the function to be polymorphic over colour. (i.e. higher-kinded types).


> The distinction is important though, because it affects the observable behaviour of other effects present in these functions.

Only if the language is single-threaded to begin with, like JavaScript.[1] There is no real difference in C# and Kotlin between, say, sleep and delay. It's a syntactic distinction between semantically equivalent terms (for all practical purposes).

[1]: Even then there are better ways to reason about effects.
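
To make the sleep/delay point concrete (a sketch assuming the Loom API for starting a virtual thread): the one blocking sleep serves both roles, because it merely parks the virtual thread and frees its carrier:

    Thread.startVirtualThread(() -> {
        try {
            Thread.sleep(1000); // blocks only this virtual thread, not a kernel thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    });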


Hmm. I find it hard to believe that whether a given block of code executes on a single OS-level thread or yields and is rescheduled across multiple OS-level threads is never semantically important (to resort to an extreme example, one kind of function can be used in a callback that's invoked by native code and the other can't) - particularly given how many languages (including Java!) started with a green threading model only to later abandon it. And so I'm very dubious about erasing that distinction in low-level functions, even if it's not relevant in the vast majority of cases, because once you erase it, it's impossible to ever recover at a higher level.


It is never semantically important (in a context that is multithreaded anyway) because you don't have more control over scheduling with async/await than you do with Loom's virtual threads. In either case, the code doesn't know if scheduling takes place on a single kernel thread or multiple ones. Nor does it know about the existence of other threads running concurrently. There could be some technical differences, like various GPU drivers allowing only specific kernel threads to use the GPU, but Loom answers that with pluggable schedulers.

The difference in the native call case isn't semantic; it's, at worst, a difference in performance [1], and as far as I know, no one is making that distinction today, so there's nothing to erase. Moreover, as Java doesn't have async/await, there isn't even an artificial distinction to erase.

I don't know all the reasons Java abandoned green threads. One of them was that the classic VM that had green threads was simply replaced with HotSpot; it wasn't an evolution of a single VM. But more practically, and putting aside the fact that it was M:1, at the time Java code relied on native FFI a great deal, whereas today that is quite rare.

[1]: This is a little inaccurate. There could be a difference in liveness properties (deadlock) in that case, but neither async/await nor kernel threads, for that matter, make intrinsic liveness guarantees. You have to trust your scheduler in all cases.


> you don't have more control over scheduling with async/await than you do with Loom's virtual threads. In either case, the code doesn't know if scheduling takes place on a single kernel thread or multiple ones.

You do have that control: the value of having explicitly async functions is that it becomes possible to have non-async functions (just as the value of having explicitly nullable values is that it becomes possible to have non-nullable values, and the value of having explicitly-exception-throwing functions is that it becomes possible to have non-exception-throwing functions). A non-async function is nothing more or less than a function that is guaranteed to execute on a single native thread, whereas an async function might be executed over multiple native threads; in particular after `val x = f()` execution will continue on the same native thread as before, whereas after `val x = await f()` execution might continue on a different native thread.
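
A rough Java analogy of the same distinction, with CompletableFuture standing in for async/await (compute and handle are placeholders):

    var f = CompletableFuture.supplyAsync(() -> compute());
    var x = f.join();                // blocking: we resume on the thread we called from
    f.thenApply(v -> handle(v));     // continuation: may run on whichever thread completes f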

I hope it works out, but there just seems to be so much opportunity for unforeseen edge cases; something very fundamental and global that every (post-1.2) Java programmer is used to knowing will no longer be true.


Everything on a thread always executes on the same thread, whether or not it's virtual; that's an invariant of threads. You cannot observe (unless it's a bug on our part or you're doing something complicated and deliberate) that, when running on a virtual thread, you're actually running on multiple native threads any more than you can observe moving between processors; if you could, that would indeed be a problem. The JDK hides the native thread the same way the native thread hides the processor.
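
A sketch of the invariant (Thread.sleep stands in for any blocking call):

    Thread.startVirtualThread(() -> {
        var self = Thread.currentThread();     // the virtual thread
        try { Thread.sleep(100); }             // park; the carrier may change underneath
        catch (InterruptedException e) { return; }
        assert Thread.currentThread() == self; // but this always holds
    });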

That async/await allows this implementation detail to leak is not a feature, it's a bug. It forces a syntactic distinction without a semantic difference (well, it introduces a semantic difference as a bad side effect of a technical decision -- the semantic difference is that you can observe the current thread changing underneath you, which, in turn, makes the syntactic difference beneficial; in other words, it's a syntactic "solution" to a self-imposed, avoidable problem).


> That async/await allows this implementation detail to leak is not a feature, it's a bug. It forces a syntactic distinction without a semantic difference (well, it introduces a semantic difference as a bad side effect of a technical decision -- the semantic difference is that you can observe the current thread changing underneath you, which, in turn, makes the syntactic difference beneficial; in other words, it's a syntactic "solution" to a self-imposed, avoidable problem).

You're putting the cart before the horse. On the assumption that the programmer needs to be able to control native threading behaviour in at least some cases (if only a small minority), surfacing the details of the points at which evaluation may cross native thread boundaries is useful and important, and async/await is the minimally-intrusive way to achieve that. If it's really possible to abstract over the native threading completely seamlessly such that the programmer never needs to look behind the curtain, then async/await is pointless syntactic complexity - just as if it were possible to write error-free code then exceptions would be pointless syntactic complexity. But if there's still a need to control native threading behaviour then this is going to end up being done by some kind of magic scheduler hints that aren't directly visible in the code and will be easy to accidentally disrupt by refactoring, and in that case I'd rather have async/await (as long as I've got a way to be polymorphic over it).


I reject both assumption and conclusion. To the limited extent such control is needed, virtual threads with pluggable schedulers are superior in every way, and certainly less intrusive. There aren't any "magic scheduler hints", either. You can assign a scheduler to each virtual thread, and that scheduler makes the relevant decisions. That's the same separation of concerns that native threads have, too.

The only place where this could be important is when dealing with Java code that cares about the identity of the native thread by querying it directly (because native code will see the native thread, but it's hidden from Java code). But, 1. such code is very rare, 2. almost all of it is in the JDK, which we control, and 3. even if some library somewhere does it, then that particular library will need to change to support virtual threads well, just as we require for most new big Java features.

Having said that, I don't claim async/await isn't a good solution for some platforms, and in some cases there's little choice. For example, Kotlin has no control of the JDK, which is necessary for lightweight threads, and languages with pointers into the stack as well as no GC and lots of FFI might find that a user-mode thread implementation is too costly. But I think that in cases where usermode threads can be implemented efficiently and in a way that interacts well with the vast majority of code, they are clearly the preferable choice.


> There aren't any "magic scheduler hints", either. You can assign a scheduler to each virtual thread, and that scheduler makes the relevant decisions. That's the same separation of concerns that native threads have, too.

I'm thinking of when you need to "pin" to a particular native thread. In a native-threads + async/await language you have direct and visible control over where that happens. In a green-threads language it's going to involve magic.

> Having said that, I don't claim async/await isn't a good solution for some platforms, and in some cases there's little choice. For example, Kotlin has no control of the JDK, which is necessary for lightweight threads, and languages with pointers into the stack as well as no GC and lots of FFI might find that a user-mode thread implementation is too costly. But I think that in cases where usermode threads can be implemented efficiently and in a way that interacts well with the vast majority of code, they are clearly the preferable choice.

A priori I agree. I expected green threads in Rust to work out too. I just can't get past having seen it fail to work out so many times. Maybe this time is different.


> In a green-threads language it's going to involve magic.

No magic, just a custom scheduler for that particular thread. For example:

    Thread.builder().virtual(Executors.newFixedThreadPool(1)).factory();
would give you a factory for virtual threads that are all scheduled on top of one native thread (I didn't pick a specific one because I wanted to use something that's already in the JDK, but it's just as simple).
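
Using the factory is then plain ThreadFactory usage (doPinnedWork is a placeholder):

    var tf = Thread.builder().virtual(Executors.newFixedThreadPool(1)).factory();
    tf.newThread(() -> doPinnedWork()).start(); // scheduled on that single carrier thread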

> I expected green threads in Rust to work out too. I just can't get past having seen it fail to work out so many times. Maybe this time is different.

Implementing usermode threads in Java and in Rust is very different as their constraints are very different. Here are some differences: Rust has pointers into the stack, Java doesn't; allocating memory (on stack growth) in Rust is costly and needs to be tracked, while in Java it's a pointer bump and tracked by the GC; Rust runs on top of LLVM, while Java controls the backend; Rust code relies on FFI far more than Java. So both the constraints and implementation challenges are too different between them to be comparable.

Not to mention that they have different design goals: Rust, like all low-level languages, has low abstraction (implementation details are expressed in signatures) because it values control more than abstraction. Java is a high-level language with a high level of abstraction -- there is one method call abstraction, and the JIT chooses the implementation; there is one allocation abstraction, and the GC and JIT choose the implementation -- and it values abstraction over control. So whereas, regardless of constraints, Rust happily lives with two distinct thread-like constructs, in Java that would be a failure to meet our goals (assuming, of course, meeting them is possible).

There are many things that work well in high-level languages and not in low-level ones and vice versa, and that's fine because the languages are aimed at different problem domains and environments. Java should be compared to other high-level languages, like Erlang and Go, where userspace threads have been working very well, to the great satisfaction of users.

Having said that, I suggest you take a look at coroutines in Zig, a language that, I think, brings a truly fresh perspective to low-level programming, and might finally be what we low-level programmers have been waiting for all these years.


> No magic, just a custom scheduler for that particular thread.

But there's a spooky-action-at-a-distance between that scheduler and the code that's running on the thread. Code that's meant to be pinned (and may behave incorrectly if not pinned) looks no different from any other code.

> Not to mention that they have different design goals: Rust, like all low-level languages, has low abstraction (implementation details are expressed in signatures) because it values control more than abstraction. Java is a high-level language with a high level of abstraction -- there is one method call abstraction, and the JIT chooses the implementation; there is one allocation abstraction, and the GC and JIT choose the implementation -- and it values abstraction over control. So whereas, regardless of constraints, Rust happily lives with two distinct thread-like constructs, in Java that would be a failure to meet our goals (assuming, of course, meeting them is possible).

Java isn't positioned as a high-level, high-abstraction language and that's not, IME, the user community it has. It's a language that offers high performance at the cost of being verbose and cumbersome - witness the existence of primitive types, the special treatment of arrays, the memory-miserly default numeric types, the very existence of null. I've heard much more about people using Java for low-latency mechanical-sympathy style code than people using it for high-abstraction use cases like scripting or workbooks. It's always been advertised as a safer alternative to C++ - rather like Rust.

(I'm all for trying to expand Java to be useful in other cases, but that's not grounds to sacrifice what it currently does well. For all the criticism Java attracts, it is undeniably extremely successful in its current niche)


> Code that's meant to be pinned (and may behave incorrectly if not pinned) looks no different from any other code.

Same goes for async/await. The decision to keep you running on the same native thread is up to the scheduler.

> Java isn't positioned as a high-level, high-abstraction language and that's not, IME, the user community it has.

I beg to differ. It aims to be a good compromise between productivity, observability and performance. Every choice, from JIT to GC, is about improving performance for the common case while helping productivity, not improving performance by adding fine-grained control. There are a few cases where this is not possible. Primitives are one of them, and, indeed, 25 years later, we're "expanding" primitives rather than finding some automatic way to get optimal memory layout.

> It's always been advertised as a safer alternative to C++ - rather like Rust.

I think you're mistaken, and in any event, this is certainly not our position. Java is designed for a good blend of productivity, observability and performance, and unless the circumstances are unusual, it opts for improving common-case performance with high abstractions rather than worst-case performance with low abstractions like C++/Rust. Roughly speaking, the stance on performance is how do we get to 95% with the least amount of programmer effort.

Anyway, regardless of what I said above, the constraints on the design of usermode threads other than philosophy are also very different for Java than for C++/Rust, for reasons I mentioned. Still, Zig does it more like Java (it still uses the words async and await, but they mean something closer to Loom's submit and join than to what they mean in C#): https://youtu.be/zeLToGnjIUM


> Same goes for async/await. The decision to keep you running on the same native thread is up to the scheduler.

The scheduler decides what happens at each yield point, but code that doesn't yield is guaranteed to stay pinned to a single native thread. A non-async function is somewhat analogous to a critical section; the difference between async and not is a visible distinction between must-run-on-a-pinned-native-thread functions and may-be-shuffled-between-native-threads functions.


Can you give an example where this matters -- i.e. where it's useful to allow moving between native threads, but only at well-known points -- given that the identity of the carrier thread cannot leak to the virtual thread?


> distinguish between things that aren't meaningfully distinguishable.

Are these semantically the same?

  Runnable action1 = () -> System.out.println("foo");
  Runnable action2 = () -> System.out.println("bar");
  action1.run();
  action2.run();

  var action3 = runAsync(() -> System.out.println("foo"));
  var action4 = runAsync(() -> System.out.println("bar"));
  action3.join();
  action4.join();
> You only need to submit if you want to do stuff in parallel and then join (also, there are no extra steps).

I'm not comparing parallel to sequential. My point was that we're already doing parallel programming:

  CompletableFuture<Result> fx = CompletableFuture.supplyAsync(() -> result);
  CompletableFuture<Result> fy = CompletableFuture.supplyAsync(() -> result);
  Result x = fx.join();
  Result y = fy.join();
Parent doesn't like the existing parallel model. New model looks like:

  // > plus no need for thread pools anymore.
  ThreadFactory tf = Thread.builder().virtual().factory();
  ExecutorService e = Executors.newUnboundedExecutor(tf);

  // > No more [...] Completable<Future>, promises et al.
  Future<Result> fx = e.submit(() -> { ... return result; });
  Future<Result> fy = e.submit(() -> { ... return result; });
  Result x = fx.get();
  Result y = fy.get();
If going from the first to the second is an improvement, I don't understand the excitement. Especially when other languages look like:

  fx <- async (return 3)
  fy <- async (return 3)
  x  <- wait fx
  y  <- wait fy


> Are these semantically the same?

No, but that's not what I meant. These are semantically the same:

    action1sync();
    action2sync();

    await action1async();
    await action2async();
> If going from the first to the second is an improvement, I don't understand the excitement.

The excitement is about going from my second example to the first.

> Especially when other languages look like:

It doesn't matter what they look like in the code. They suffer from all the problems I mention in the article: exceptions lose context, debuggers and profilers lose their effectiveness, APIs are split into two disjoint worlds. Virtual threads give you code that doesn't just look synchronous, but behaves that way in every observable respect.

What the code looks like on the screen is a very small portion of the problem we're trying to solve.


(1) That sounds like a recipe to unintentionally miss a ton of concurrency. Having to put an `await` there is a great indicator that you're forcing an order of execution.

(2) Can you give an example of an async and a non-async subroutine that have "the same semantics"?


1. There is no need for await. Threads imply sequential execution.

2. sleep vs. delay in either C# or Kotlin.


> 1. There is no need for await. Threads imply sequential execution.

How do you execute doA() and doB() concurrently? In C# you can do something like:

   var t1 = doA();
   var t2 = doB();
   await Task.WhenAll(new[] {t1, t2});
   var a = t1.Result;
   var b = t2.Result;
EDIT: Ah, never mind, I just saw "You only need to submit if you want to do stuff in parallel and then join" above.


The reality is that you rarely want to run doA and doB concurrently, so optimizing syntax for that case is not useful, whereas you want to be able to call functions without having to worry about their color all the time, where "all the time" here is typically >1 time per function.

Many of you are perhaps scratching your head and going "What? But of course I concurrently do multiple things all the time!" But this is one of those cases where you grossly overestimate the frequency of exceptions precisely because they are exceptions, and so they stick out in your mind [1]. If you go check your code, I guarantee that either A: you are working in a rare and very stereotypical case not common to most code or B: you have huge swathes of promise code that just chains a whole bunch of "then"s together, or you await virtually every promise immediately, or whatever the equivalent is in your particular environment. You most assuredly are not doing something fancy with promises more than one time per function on average.

This connects with academic work that has shown that in real code, there is typically much less "implicit parallelism" in our programs than people intuitively think. (Including me, even after reading such work.) Even if you write a system to automatically go into your code and systematically find all the places you accidentally specified "doA" and "doB" as sequential when they could have been parallel, it turns out you don't actually get much.

[1]: I have found this is a common issue in a lot of programmer architecture astronaut work; optimizing not for the truly most common case, but for the case that sticks out most in your mind, which is often very much not the most common case at all, because the common case rapidly ceases to be memorable. I've done my fair share of pet projects like that.


ParaSail[0] is a parallel language that is being developed by Ada Core Technologies. It evaluates statements and expressions in parallel subject to data dependencies.

The paper ParaSail: A Pointer-Free Pervasively-Parallel Language for Irregular Computations[1] contains the following excerpt.

"This LLVM-targeted compiler back end was written by a summer intern who had not programmed in a parallel programing language before. Nevertheless, as can be seen from the table, executing this ParaSail program using multiple threads, while it did incur CPU scheduling overhead, more than made up for this overhead thanks to the parallelism “naturally” available in the program, producing a two times speed-up when going from single-threaded single core to hyper-threaded dual core."

One anecdote proves nothing but I'm cautiously optimistic that newer languages will make it much easier to write parallel programs.

[0] http://www.parasail-lang.org/

[1] https://programming-journal.org/2019/3/7/


I hope so too.

I want to emphasize that what was discovered by those papers is that if you take existing programs and squeeze out all the parallelism you possibly can, safely and automatically, it doesn't get you very much.

That doesn't mean that new languages and/or paradigms may not be able to get a lot more in the future.

But I do think just bodging promises on to the side of an existing language isn't it. In general that's just a slight tweak on what we already had and you don't get a lot out of it.


I would imagine doA() should do A and return the result, but startA() would start it.

If you just need to wait for both to finish, something like this should do it:

   var t1 = startA();
   var t2 = startB();
   var a = finishA(t1);
   var b = finishB(t2);
finishX would naturally block until X is done.
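
In Java terms that's just submit and get (a sketch; doA/doB are the hypothetical calls from upthread):

    var e  = Executors.newCachedThreadPool(); // or one handing out virtual threads
    var t1 = e.submit(() -> doA());           // startA
    var t2 = e.submit(() -> doB());           // startB
    var a  = t1.get();                        // finishA: blocks until doA is done
    var b  = t2.get();                        // finishB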



