> I am totally convinced that ARC is the right way to go upfront. [...] It gives you deterministic behavior[...]
Does it? In a meaningful way? For ARC with strong refcount > 1, which constitutes the main use case in my experience (weak references are unergonomic, for one), the deterministic destruction is not meaningful locally, as opposed to the modern manual alternative, RAII, for instance. Am I missing something?
I don't think it's quite deterministic, because a simple variable going out of scope can trigger deallocation of a random object tree (which can be large), which you cannot always predict locally (as a programmer) and which can affect performance on a critical path or introduce side effects at unpredictable times - plus there's the overall possibility of uncollected cycles. So in the end, for a programmer who's not willing to manually trace ownership to understand the behavior in every detail, it has much the same disadvantages as a tracing GC (deallocation may kick in at "unpredictable" times), with additional ones on top: fragmentation and poorer memory locality in the long term (no compaction), possible memory leaks (due to object islands), temporary young-generation objects being as expensive as old-generation objects (no bump-pointer allocation in a nursery), and excessive RC increments/decrements that can trash performance (and if they're atomic they can IIRC force cache invalidation, which also makes the behavior less deterministic).
So I don't think it's a silver bullet; it's just a different kind of GC with its own set of disadvantages. I think most of the time RC is chosen because it's simple to implement, or because there's a legacy system where RC is the only choice, and all the other "benefits" are just afterthoughts to further justify the choice.
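To make the "random object tree" point concrete, here's a minimal Swift sketch (names made up): everything owned by a local is torn down at the single point where that local goes out of scope, and how much work happens at that point isn't obvious from the code around it.

    final class Node {
        static var freed = 0
        // a bit of payload so the teardown isn't free
        let payload = [UInt8](repeating: 0, count: 1_024)
        deinit { Node.freed += 1 }
    }

    do {
        var graph: [Node] = []
        for _ in 0..<100_000 { graph.append(Node()) }
        // ... use `graph` on some critical path ...
    }   // <- all 100,000 objects are released right here, at this one scope exit

    print("freed \(Node.freed) objects at a single program point")

The point is fixed, but its cost depends on how big the graph happens to be at runtime.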
Determinism in ARC vs. GC discussions is probably more of a reproducibility thing. With ARC, a program will always free memory in the same spots in the same way when running on the same input. In most GCs, this is not the case. Their behavior depends very much on external factors like elapsed wall-clock time or system-wide memory pressure (e.g. if it runs a collection when requesting more memory from the OS failed).
This kind of GC behavior can make bugs related to external resources much harder to find. And benchmarking an allocation-heavy program with a nondeterministic GC can be hellish due to the wider distribution of measured runtimes.
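A toy Swift sketch of what that buys you for external resources (TempFile is just a made-up wrapper, not a real API): the "close" happens at the same statement on every run, which is exactly the property a tracing GC doesn't guarantee.

    final class TempFile {
        let path: String
        init(path: String) { self.path = path; print("open \(path)") }
        deinit { print("close \(path)") }   // external resource released here
    }

    func run(inputs: [String]) {
        for path in inputs {
            let file = TempFile(path: path)
            _ = file
        }   // with ARC, each "close" prints within its own iteration,
            // before the next "open"
    }

    run(inputs: ["a.tmp", "b.tmp"])
    // Output is identical on every run with the same input:
    // open a.tmp, close a.tmp, open b.tmp, close b.tmp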
> With ARC, a program will always free memory in the same spots in the same way when running on the same input
It's true for pure single-threaded functions, but I doubt it's 100% the case for asynchronous code that depends on side effects (user input, system events, async I/O): with slightly different timing your reference counts differ from the previous run, and as a consequence your object trees are deallocated differently, and in different configurations, each time as well. That's only a problem if objects are shared between threads, though.
Even with single-threaded code, a random change to your codebase which increments a reference here and there can invalidate all your prior assumptions and have a ripple effect on the entire system.
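A rough Swift illustration of that ripple effect (the cache is a hypothetical "change here and there"): one added retain and the object quietly outlives the scope it was written for.

    final class Resource {
        let name: String
        init(name: String) { self.name = name }
        deinit { print("\(name) released") }
    }

    var cache: [Resource] = []          // added later, somewhere else in the codebase

    func process() {
        let r = Resource(name: "buffer")
        cache.append(r)                 // one extra reference "here and there"
        // ... use r ...
    }   // before the cache existed, "buffer released" printed here; now it doesn't

    process()
    print("after process()")            // the buffer is still alive at this point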
> Even with single-threaded code, a random change to your codebase which increments a reference here and there can invalidate all your prior assumptions and have a ripple effect on the entire system.
This is true. The standard library for the language we use at work (Delphi) had a bug in its thread pool class which was of this nature. The threads in the pool would wait for work to be available, pop a work item, do work and wait for more work again.
The issue was that the reference count of the previous work item would not be decremented until the local variable holding it was overwritten by the next work item (which also incremented that new item's reference count). In particular, if there was no more work, the reference would be held until program termination.
This caught me, as a library user, by surprise as I expected the work item to go out of scope and be destroyed when it had been executed, since no more references should be held at that point.
That said, spotting these bugs is quite easy with ARC once you dig into the code, given that the points where the references can be increased or decreased are deterministic. So as long as you have access to the source code of dependencies, it's fairly easy to find the reference counting points.
GC on the other hand is completely async and opaque for the most part.
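For what it's worth, the shape of that bug translates to any refcounted language; a rough single-threaded sketch in Swift (names made up, the real Delphi pool is of course more involved):

    final class WorkItem {
        let id: Int
        init(id: Int) { self.id = id }
        func run() { print("running \(id)") }
        deinit { print("work item \(id) released") }
    }

    func workerLoop(_ queue: inout [WorkItem]) {
        var current: WorkItem?
        while !queue.isEmpty {
            current = queue.removeFirst()   // the previous item is only released
                                            // here, when the local is overwritten
            current?.run()
            // current = nil                // <- the fix: drop the item once it ran
        }
        // In the Delphi pool the thread then blocks waiting for more work,
        // so that last reference was held until program termination.
    }

    var queue = [WorkItem(id: 1), WorkItem(id: 2)]
    workerLoop(&queue)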
Keeping dangling references around for too long is a problem common to all automated memory management schemes. You would have had the same issue with the work item's lifetime under a GC, except that the destruction of the work item would have happened at an undetermined time after the reference was replaced.
ARC is deterministic: you get the same results on multiple runs (minus unsynchronized async fun, of course), you can profile it, etc. You don't have this luxury with a GC.
Of course, deterministic is deterministic: if your allocations depend on nondeterministic randomness, then you won't see deterministic deallocations - I think this is obvious and doesn't have to be spelled out.
With a GC, a deterministic run produces nondeterministic deallocations.
With ARC, a deterministic run produces deterministic deallocations - reproducible behavior that can be profiled, which lets you work on optimizing it.
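That's what makes it measurable in practice: a Swift sketch where the teardown cost lands inside one block you can time, on every run (tree shape and sizes are arbitrary here).

    import Foundation

    final class Node {
        var children: [Node] = []
    }

    func buildTree(depth: Int, fanout: Int) -> Node {
        let root = Node()
        if depth > 0 {
            root.children = (0..<fanout).map { _ in
                buildTree(depth: depth - 1, fanout: fanout)
            }
        }
        return root
    }

    var root: Node? = buildTree(depth: 6, fanout: 8)

    let start = Date()
    root = nil                        // with ARC the whole tree is freed inside
                                      // this one statement, on every run
    print("teardown took \(Date().timeIntervalSince(start))s")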
I'm not sure most people use this technique, but when I was using RC-based systems, I would often assert that an object's refcount was 1 just before I let go of that last reference. Suddenly, we have RAII-like predictable destruction.
I even built it into Vale as one of its memory management options, called "constraint references" [0], though I later switched to generational references, which gave us RAII without the halts. I sometimes wonder how far I could have taken that RC + assert model.
In my experience, the vast majority of objects even in an RC'd language do have strong refcount = 1. Perhaps we were in different domains though, I was mostly in game dev and app dev.
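In Swift terms, the closest thing to that assert-before-release pattern is isKnownUniquelyReferenced; a rough sketch (Buffer is just a stand-in):

    final class Buffer {
        var bytes = [UInt8](repeating: 0, count: 4_096)
        deinit { print("buffer freed") }
    }

    var buffer: Buffer? = Buffer()
    // ... hand `buffer` around; every extra strong reference must be gone by now ...

    // Swift doesn't expose the raw refcount, but this answers the same question:
    // "am I holding the only strong reference?"
    assert(isKnownUniquelyReferenced(&buffer),
           "someone still holds this buffer; destruction won't happen here")
    buffer = nil   // with the assert holding, deinit runs exactly here, RAII-style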