Can you easily chain these, though? (gzcat some.txt|grep foo|sort -u|head -10 etc?). Especially lazily, if the uncompressed stream is of modest size, like a couple of gigabytes?
I'm not sure what you mean by lazily here, but internally[0] it creates real anonymous pipes[1] between the spawned processes, so the data does not go through the ruby process at all.
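To make that concrete, here's a minimal sketch using stdlib Open3.pipeline_r (the file name and contents below are made up for illustration) — it wires the `gzcat | grep | sort -u | head` chain together with real pipes between the children, and Ruby only reads the final stream:

```ruby
require 'open3'
require 'tempfile'
require 'zlib'

# Build a small gzipped sample so the chain has input.
tmp = Tempfile.new(['sample', '.gz'])
Zlib::GzipWriter.open(tmp.path) { |gz| gz.write("foo 2\nbar\nfoo 1\nfoo 2\n") }

# gzcat | grep foo | sort -u | head -10: data flows child-to-child
# through anonymous pipes; Ruby consumes only the last stdout.
lines = Open3.pipeline_r(['gzip', '-cd', tmp.path],
                         ['grep', 'foo'],
                         ['sort', '-u'],
                         ['head', '-10']) do |out, _wait_threads|
  out.each_line.map(&:chomp)
end
# lines == ["foo 1", "foo 2"]
```

Because Ruby reads the last stdout incrementally, you can stop early (e.g. `out.each_line.first(10)`) and the upstream processes will get SIGPIPE and terminate, which is about as lazy as a shell pipeline gets.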
I'm currently working with 150MB worth of gzipped JSON - marshalling the full file from JSON into a Ruby hash eats up a lot of memory. One tweak that allows easier lazy iteration over the file (while keeping temporary disk I/O reasonable) is to pipe it through zcat, then jq in stream mode to convert it to ndjson, then gzip again - producing a temp file that Ruby's zlib can wrap in a stream convenient for lazy iteration per readline...
Generally, marshalling a gig or more of JSON (non-lazily) takes a lot of resources in Ruby.
Some do, some don't. JSON is a special case, as a valid JSON file needs to be a single array or object literal - event-driven (SAX-style) parsing ends up being a hack (like jq's stream mode). In theory json_streamer or yajl should help, but I couldn't get any combination to return a proper lazy iterator.
With the file as ndjson it was easier, if a little sparsely documented (Zlib::GzipReader.new or .wrap?):
my_it = Zlib::GzipReader.wrap(some_ndfile)
obs = my_it.each_line.lazy.map do |line|
  JSON.parse line
end.first(4)
When we can get a line at a time, marshalling a single line isn't an issue.
My issue is more that it's tricky to nest Ruby IO objects and return a lazy iterator - especially when nesting custom filters along the way - at least trickier than it should be.
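One workaround I've seen is to stop nesting IO objects and instead treat each custom filter as a step on an Enumerator::Lazy. A hedged sketch (the sample data is fabricated, and any IO - File, IO.popen, etc. - works in place of the StringIO):

```ruby
require 'json'
require 'stringio'
require 'zlib'

# Fabricate a small gzipped ndjson stream in memory.
buf = StringIO.new
Zlib::GzipWriter.wrap(buf) do |gz|
  gz.write(%({"a":1}\n{"a":2}\n{"a":3}\n))
  gz.finish # finish (not close) keeps the underlying StringIO open
end
buf.rewind

# Custom "filters" compose as ordinary lazy enumerator steps;
# only as many lines are decompressed as first(2) demands.
first_two = Zlib::GzipReader.wrap(buf)
                            .each_line.lazy
                            .map { |line| JSON.parse(line) }
                            .select { |h| h['a'] > 1 }
                            .first(2)
# first_two == [{"a"=>2}, {"a"=>3}]
```

It doesn't give you a composable IO object back, but for per-line processing the lazy chain covers most of what nested filter IOs would.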
Apparently there's a third-party framework that does seem promising:
Didn't realize that! That's one snippet I can maybe eliminate now. (As to why I didn't know: the first thing in the RDoc for Kernel#system is still "see the docs for Kernel#spawn for options" — and then Kernel#spawn doesn't actually have that one, because it doesn't block until the process quits, and so returns you a pid, not a Process::Status. I stopped looking at the docs for Kernel#system itself a long time ago, just jumping directly to Kernel#spawn...)
But come to think of it, if Kernel#system is just doing a blocking version of Kernel#spawn → Process#wait, then shouldn't Process#wait also take an exception: kwarg now?
And also-also, sadly IO.popen doesn't take this kwarg. (And IO.popen is what I'm actually using most of the time. The system! function above is greatly simplified from the version of the snippet I actually use these days — which involves a DSL for hierarchical serial task execution that logs steps with nesting, and reflects command output from an isolated PTY.)
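For anyone following along, a minimal sketch of such a system! helper on top of the Ruby 2.6+ exception: keyword (the real version described above, with logging and PTY handling, is obviously much more involved):

```ruby
# system! raises instead of returning false/nil: RuntimeError on
# non-zero exit, Errno::ENOENT if the command doesn't exist.
def system!(*cmd)
  system(*cmd, exception: true)
end

system!('ruby', '-e', 'exit 0')   # => true
begin
  system!('ruby', '-e', 'exit 1')
rescue RuntimeError => e
  # e.message names the failing command and its exit status
end
```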
> How do you "prove" that other people are conscious?
For sentience scientists mainly look at behavioral cues:
> For example, "if a dog with an injured paw whimpers, licks the wound, limps, lowers pressure on the paw while walking, learns to avoid the place where the injury happened and seeks out analgesics when offered, we have reasonable grounds to assume that the dog is indeed experiencing something unpleasant." Avoiding painful stimuli unless the reward is significant can also provide evidence that pain avoidance is not merely an unconscious reflex (similarly to how humans "can choose to press a hot door handle to escape a burning building").
Exactly. All of that is reasonable, and the behaviors described are obviously present, as anyone who's ever had a dog can tell you. So I don't understand why "are animals conscious" is still being debated at this point.
I’m not saying that this is the case, but all the mentioned behaviors are only indicators and could also be reflexive actions which the dog is genetically programmed to do because they work. If a beetle is flipped, it also has a “program” to get upright again, but that doesn’t mean it’s aware of its situation and is actively deciding something. I’m pretty sure dogs are conscious, but you can’t really tell from the outside. LLMs also appear to reason and make arguments but I wouldn’t call them conscious.
You’re right, you also can’t tell for other people. You can make an assumption because they are very similar to you and you yourself appear to be conscious to yourself. But you can’t really disprove solipsism as far as I know.
> Google currently shares data across its services for the purposes described in its Privacy Policy at g.co/privacypolicy and depending on the previous choices you’ve made about your privacy settings, such as Web & App Activity, YouTube History, and Personalized ads
> As of March 6, 2024, new laws in Europe will require Google to get your consent to link certain services if you want them to continue to share data with each other and other Google services as they do today. For example, linked Google services might work together to help personalize your content and ads, depending on your settings.
This is going to be interesting if I do it to my account.
I remember having to somehow contact support back in the day when Google made me merge my YouTube account and my Gmail account, since they had the same name and email address. Lots of strange buggy behaviour.
> We estimate that 99% of US farmed animals are living in factory farms at present. By species, we estimate that 70.4% of cows, 98.3% of pigs, 99.8% of turkeys, 98.2% of chickens raised for eggs, and over 99.9% of chickens raised for meat are living in factory farms.
I've read that ZFS is less safe than other Linux filesystems if you don't use ECC RAM, because it assumes that there are no memory errors and therefore doesn't provide a tool to repair a filesystem corrupted by such errors. Is this true?
It's not true. That's basically ancient forum myth, alongside the also incorrect "ZFS needs 1GB memory per TB of HDD" nonsense that has thankfully mostly died out finally. ZFS makes no additional assumptions when using ECC vs non-ECC memory.
It is theoretically possible to construct a scenario where evil RAM does exactly the right things needed to fool ZFS and corrupt your filesystem. Any pearl-clutching about this thing that has never happened also ignores that RAM bad enough to do that is going to corrupt any filesystem.
In reality, while ECC memory is always nice to have, it's no more required for ZFS than for any other filesystem. Though personally, now that 32GB+ of RAM is common, I generally prefer error correction/detection over ultimate speed these days. And ironically, ECC memory is actually really nice to overclock, because I can just check my logs and prove whether my system is actually stable.
There are so many actual dangers to your data in comparison that it's laughable. The biggest one being you, followed by hardware failure, malware, and genuine ZFS bugs. I'd stay far away from raw sends of encrypted datasets in ZFS for a while; there are edge cases there that haven't been resolved yet.
I was once summoned by the police because of a "dot-zero" address on one of our servers.
Someone had been buying stuff online with a stolen card and the shop admins provided a list of the IP addresses used, including our server's. All the addresses were dot-zero addresses, so I assume it was just some kind of unfortunate obfuscation.
For example:
https://ruby-doc.org/3.2.2/stdlibs/open3/Open3.html