
o11c · 2024-11-13T00:35:21 1731458121

Hm, this makes me wonder: what exactly makes the European pig more harmful than the Polynesian pig which was there for ~1600 years? Is there a viable way of breeding pigs more like the latter?

o11c · 2024-11-12T21:44:39 1731447879

It doesn't feel like it since there hasn't been much improvement in the latter range of time.

At least Windows 7 could claim that it resolved most of the permissions bugs that came from upgrading security to 1970s standards, so it was actually an upgrade to XP.

o11c · 2024-11-12T18:09:35 1731434975

Note that case handling is a place where postgres (which folds to lowercase) violates the standard (which folds to uppercase).

This is mostly irrelevant since you really shouldn't be mixing quoted with unquoted identifiers, and introspection largely isn't standardized.
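
To make the folding difference concrete, here is a minimal sketch. The helper name `fold_identifier` is made up for illustration, it only handles ASCII, and it is not how any real database implements this. It folds an unquoted identifier either the way the standard describes (uppercase) or the way postgres does (lowercase), and leaves quoted identifiers alone.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: fold an identifier in place.
 * to_upper != 0: fold to uppercase, as the SQL standard describes.
 * to_upper == 0: fold to lowercase, as postgres does.
 * A quoted identifier (leading '"') keeps its case. ASCII only. */
void fold_identifier(char *ident, int to_upper)
{
    assert(ident != NULL);
    if (ident[0] == '"')
        return; /* quoted: case is preserved */
    for (size_t i = 0; ident[i] != '\0'; i++) {
        unsigned char c = (unsigned char)ident[i];
        ident[i] = (char)(to_upper ? toupper(c) : tolower(c));
    }
}
```

So an unquoted `MyTable` matches `"mytable"` under lowercase folding but `"MYTABLE"` under uppercase folding, which is why mixing quoted and unquoted spellings of the same name is where the difference bites.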

yen223 · 2024-11-12T19:46:50 1731440810

Given that other mainstream RDBMSes let you configure how case handling should happen, Postgres is arguably the closest to the standard.

Usual caveat of how nobody sticks to the ANSI standard anyway applies.

o11c · 2024-11-12T18:04:40 1731434680

I've been bitten by those before because they are not generated from the actual syntax-parsing code, and thus are sometimes out of sync and wrong (or at least misleading).

o11c · 2024-11-11T22:33:35 1731364415

The real question is: is there any case where a program calls `setenv` in one thread and actually wants it to take effect in other already-existing threads?

That said, GLIBC is pretty good at documenting all the dangerous functions, so it is possible to add locking/copying yourself.
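
A minimal sketch of the "add locking/copying yourself" idea, assuming POSIX threads. The wrapper names `locked_setenv` and `locked_getenv` are made up for illustration, and this only helps if every environment access in the process goes through them.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One process-wide lock guarding all environment access. */
static pthread_mutex_t env_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: set (or overwrite) a variable under the lock. */
int locked_setenv(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    pthread_mutex_lock(&env_lock);
    int rc = setenv(name, value, 1);
    pthread_mutex_unlock(&env_lock);
    return rc;
}

/* Hypothetical wrapper: copy the value into the caller's buffer while
 * the lock is held, so the caller never keeps a pointer into the
 * environment. Returns 0 on success, -1 if unset or buf is too small. */
int locked_getenv(const char *name, char *buf, size_t buflen)
{
    assert(name != NULL && buf != NULL);
    int rc = -1;
    pthread_mutex_lock(&env_lock);
    const char *value = getenv(name);
    if (value != NULL) {
        size_t len = strlen(value);
        if (len < buflen) {
            memcpy(buf, value, len + 1); /* include the terminator */
            rc = 0;
        }
    }
    pthread_mutex_unlock(&env_lock);
    return rc;
}
```

The copy matters as much as the lock: a pointer returned by plain `getenv` can be invalidated by a later `setenv` in another thread.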

fweimer · 2024-11-11T23:19:08 1731367148

Interesting idea. I strongly suspect that there are programs out there that expect that setenv changes the environ array (and they do not treat it as an opaque pointer passed to posix_spawn/execve). With a per-thread setenv, we would need a per-thread environ variable as well. Unfortunately, that's not really compatible with POSIX because environ is not declared in a header. Instead, programmers are expected to write a declaration

     extern char **environ;

into their sources, and that declaration is incompatible with environ being a thread-local variable.

o11c · 2024-11-12T01:10:48 1731373848

Hm, in the end most of the problems do come down to stuff not coming from blessed headers.

Regardless of anything else, how about:

* deprecate direct access to `environ` and add functions to replace it. Have a macro that indicates this and provide a canonical compatibility shim for people to copy if they might use old libcs.

* using linker magic, change the behavior of programs depending on whether they attempt to access `environ` or not, so old-API programs are still thread-unsafe but new ones are thread-safe.

It's amazing how much you can do with the conditionally-linked object files from a static "library". Much of C's cross-TU "UB, no diagnostic required" is inexcusable since we can detect it quite easily with zero overhead (at least, for static linking) using today's linkers by deliberately causing multiple definition errors.

Compatibility with old-ABI programs probably means fixing `environ` is not that simple, but you are libc and libc is in control of dynamic linking ...

fweimer · 2024-11-12T18:27:10 1731436030

Many years ago, glibc did something along those lines for the _res variable (with preprocessor magic instead of linker magic). For the main thread, legacy _res (the actual global data symbol) and new thread-local _res (actually *__res_state()) are the same object, but they diverge for subsequently created threads.

I don't think this would work here because it likely changes semantics too much, and not all binaries that need a thread-safe getenv/setenv combination can be rebuilt, especially since compatibility with both variants from the binaries would likely require some changes to each application/library.
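
For readers unfamiliar with the preprocessor trick described above, here is a rough sketch using C11 `_Thread_local`. All the `demo_*` names and the struct are hypothetical stand-ins rather than glibc's actual definitions, and the sketch does not model the part where the main thread's copy aliases the legacy global symbol.

```c
#include <assert.h>

/* Hypothetical stand-in for the real resolver state. */
struct demo_resolver_state {
    int options;
};

/* One zero-initialized instance per thread. */
static _Thread_local struct demo_resolver_state demo_tls_state;

/* Accessor that returns the calling thread's instance. */
struct demo_resolver_state *demo_state_location(void)
{
    return &demo_tls_state;
}

/* Source code that still writes `demo_state.options` as if it were a
 * plain global now reaches the calling thread's copy instead. */
#define demo_state (*demo_state_location())
```

Code compiled against the header keeps its old spelling, such as `demo_state.options = 1;`, which is why this only helps programs that can be rebuilt.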

ptsneves · 2024-11-12T08:13:23 1731399203

All the inconsistencies you suggest sound like a trap, especially when you suggest ABI and API behaviour divergences. From what I understood of your macro idea this would lead to API changes that would lead to ifdefs for different (g)libc versions. Doesn’t feel good, especially for software distributors.

Also, the changes you mention require changes in the linker scripts distributed by a variety of toolchains, which would need to check whether the libc target was a blessed one. In effect the deprecated API would never move to obsolete and would be kept around forever.

To clarify the above post: libc is in control of dynamic linking through the dl* (dlopen) family of functions.

kelnos · 2024-11-12T02:07:57 1731377277

> so it is possible to add locking/copying yourself.

Not if third-party dependent libraries use getenv/setenv. (The article mentions this as a continuing problem with the steam client.)

o11c · 2024-11-12T06:11:05 1731391865

As a rule you should not assume third-party libraries are at all thread-safe.

cryptonector · 2024-11-12T17:21:19 1731432079

Glibc could crib from Illumos, which has a thread-safe putenv()/setenv()/unsetenv()/getenv().

jandrese · 2024-11-12T21:16:48 1731446208

There are so many better ways to do IPC that this hacky and dangerous getenv/setenv setup is never necessary.

I mean what kind of threading library doesn't have shared memory or message passing?

I'm guessing this mostly happens in situations where the main process can change variables like HTTPS_PROXY and a different thread is running a library that checks those variables before firing up a TCP socket.

Asooka · 2024-11-12T00:21:49 1731370909

Yes, I set up environment variables in a plugin that are later read by already started worker threads. It's not a problem for me because the worker threads are all sleeping on a runqueue, but technically I do want to set an env var in one thread and read it in another that is already running.

o11c · 2024-11-11T04:03:25 1731297805

The other major problem with traffic cameras is that they often shorten the duration of the yellow light at the same time.

WWLink · 2024-11-11T04:59:27 1731301167

Traffic cameras should be used to figure out why people are doing what they're doing at a given intersection. A lot of the time they can be used to figure that out, and eliminate the need for them.

Like I feel like the standard for traffic cameras is they should be temporary installations used in an investigation to figure out why so many people speed or run red lights or something. And yep, they do eventually figure out interesting tricks.

This one's dumb, but an observation I noticed lately is lots of intersections will change to all red and then light the pedestrian walk light before they change the car light to green. Why? Probably to lower the amount of people crossing the street getting hit by someone turning right in their car. Neat trick.

com2kid · 2024-11-11T07:07:57 1731308877

> Why? Probably to lower the amount of people crossing the street getting hit by someone turning right in their car. Neat trick.

That is exactly the reason, you can find plenty of studies on it!

thelittleone · 2024-11-11T13:23:48 1731331428

And then these studies start producing consistent revenue (>1B a year) and they can't afford to turn them off, whatever the impact on the public.

These things in isolation, to me, can seem minor, but when combined, it feels a little like death by a thousand cuts.

com2kid · 2024-11-12T03:15:04 1731381304

Having a 5 second delay between the crosswalk light and the street light doesn't increase revenue, especially because in America, free right turns are ignored by red light cameras, and the light timing change is purely about preventing accidents for cars taking right turns.

cruffle_duffle · 2024-11-11T16:44:56 1731343496

As a pedestrian that “walk before green” light is super rad. What I imagine it is targeting is letting pedestrians “win” and get into the street where they are more visible so they don’t get run over by people making left or right turns into them. If you let the cars' light “win” then people will immediately turn right and “cut off” the pedestrian.

Dunno, just a guess. My city started doing this pattern a few years ago and I am very curious about the research backing it up. It does make me feel “safer” crossing streets at busy intersections. But I’d be curious what the actual rationale says about it.

stephen_g · 2024-11-11T04:08:25 1731298105

That is specifically something that has happened in certain US cities, not in Australia.

o11c · 2024-11-10T16:23:45 1731255825

The whole point of the extensible system is that you can, in fact, extend it as much as you need.

... I just threw this into my zillion-ifier program and ... huh, it took 12 hours even with the GMP version, though I'll admit I didn't really optimize it ... the output is about a gigabyte of text

  eight tredecilliquattuorseptuagintasescentilliduoseptuagintaseptingentillion,
  eight hundred sixteen tredecilliquattuorseptuagintasescentilliunseptuagintaseptingentillion,
  nine hundred forty three tredecilliquattuorseptuagintasescentilliseptuagintaseptingentillion,
  two hundred seventy five tredecilliquattuorseptuagintasescentillinovensexagintaseptingentillion,
  thirty eight tredecilliquattuorseptuagintasescentillioctosexagintaseptingentillion,
  ...

o11c · 2024-11-08T09:07:50 1731056870

Title is confusing: this is not about the original "Roaring", but an extension of it called "Roaring+Run".

Here, "bitmap" = "set of sometimes-compact integers". The "uncompressed" and several "rle" implementations are obvious. Hm, except this only seems to be talking about a particularly-naive RLE approach (suitable for storage but not computation)? If you're doing computation I expect you to use absolute offsets rather than relative ones which means you can just do binary search (the only downside of this is that you can't use variable-length integers now).

Roaring is just a fixed 2-level trie, where the outer node is always an array of pointers and where the inner nodes can be either uncompressed bitvectors (if dense) or an array of low-half integers (if sparse). Also, it only works for 32-bit integers at a fundamental level; significant changes are needed for 64-bit integers.
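
A rough sketch of that two-level layout as described here, with only a membership test. The type and function names are invented for illustration; it follows the "outer array of pointers" description in this comment rather than any particular library's in-memory format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum container_kind { CONTAINER_ARRAY, CONTAINER_BITMAP };

/* Inner node: holds the low 16 bits of every value sharing one prefix. */
struct container {
    enum container_kind kind;
    size_t count;          /* ARRAY: number of entries in lows */
    const uint16_t *lows;  /* ARRAY: sorted, unique low halves (sparse) */
    const uint64_t *bits;  /* BITMAP: 1024 words = 65536 bits (dense) */
};

/* Outer node: indexed directly by the high 16 bits; NULL = empty. */
struct two_level_set {
    const struct container *inner[65536];
};

bool set_contains(const struct two_level_set *set, uint32_t value)
{
    assert(set != NULL);
    const struct container *c = set->inner[value >> 16];
    uint16_t low = (uint16_t)(value & 0xFFFFu);
    if (c == NULL)
        return false;
    if (c->kind == CONTAINER_BITMAP)
        return ((c->bits[low >> 6] >> (low & 63)) & 1u) != 0;
    /* Sparse case: binary search over the sorted low halves. */
    size_t lo = 0, hi = c->count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c->lows[mid] < low)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < c->count && c->lows[lo] == low;
}
```

The fixed 16/16 split is also why 64-bit keys do not fit without changing the structure: there is no natural place for the extra 32 bits.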

This paper adds a third representation for the inner node, the bsearch'able absolute RLE method I mentioned earlier (before even reading the paper beyond the abstract).
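
A minimal sketch of what "bsearch'able absolute RLE" means for one inner node, assuming runs are stored as inclusive (start, last) pairs, sorted and non-overlapping. The names are hypothetical, and the paper's actual encoding may differ, for example by storing a length instead of an end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One run of consecutive values, stored with absolute bounds. */
struct run {
    uint16_t start; /* first value in the run */
    uint16_t last;  /* last value in the run (inclusive) */
};

/* Membership test in O(log n): find the first run whose end is not
 * below the probe, then check that the run starts at or before it.
 * This only works because offsets are absolute; with relative
 * (delta-encoded) runs you would have to scan from the beginning. */
bool runs_contain(const struct run *runs, size_t n, uint16_t low)
{
    assert(runs != NULL || n == 0);
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (runs[mid].last < low)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo < n && runs[lo].start <= low;
}
```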

Overall there's neither anything novel nor anything particularly exciting about this paper, but if you ignore all the self-congratulations it might work as a decent intro to the subject? Except maybe not since there are a lot of things it fails to mention (the ping-pong problem, deduplicated tries, the approach of using a separate set for sparse values in the same range, the "overall sparse but locally semi-dense" representation that uses fixed-size single-word bitsets, ...)

Drup · 2024-11-08T11:44:38 1731066278

You seem well versed in that corner. Do you have a good (and reasonably complete) introduction/exploration of these memory-efficient data structures for computation?

I've been working on memory representation of algebraic data types quite a bit, and I've always wondered if we could combine them with succinct data-structures.

pram · 2024-11-08T18:06:23 1731089183

There's actually a whole website about it! I found it useful when I was doing deeper research into ElasticSearch: https://roaringbitmap.org

o11c · 2024-11-08T06:54:11 1731048851

> Stop prewashing dishes in the sink. Put them straight into the dishwasher and you’re good.

In my experience ... no, you're not. Basically any kind of sauce or dessert will get baked on and become much harder to remove, as will anything from a day before the dishwasher actually gets full enough to be worth running.

idontwantthis · 2024-11-08T18:34:25 1731090865

That shouldn’t be happening. Do you fill both soap dispensers? Check out the video in a sibling comment.

o11c · 2024-11-08T06:43:21 1731048201

I imagine most soundbars cheat since consumer protection is dead, but they're still better (and more convenient) than two simple speakers.