> How does it work?
> Using a "Virtual Distributed Filesystem" (VDFS): in other words, a decentralized database that emulates a filesystem. It indexes hardware filesystems to create a master database that is synchronized in real time between your devices running Spacedrive.
> What makes this different to Dropbox or Google Drive?
> It is not a storage provider; Spacedrive is simply a database that exists on top of existing storage layers, from cloud services like Dropbox, Google Drive, and iCloud to physical devices and external drives you already own. It doesn't provide more storage; rather, it's a supercharged view of your existing storage.
So more like Syncthing? Or rather Windows File Sharing/Samba? I don't really get it
That explanation reads like it takes in any kind of file system, local or remote, and combines them into one unified file system. Not really new, but most other file managers focus on the popular storage services (Dropbox, Google Drive, OneDrive, Samba, FTP), and are not open source. If you can easily create plugins for any remote storage, not just the traditional file storage services, or even make up your own, then it could become something promising.
Firefox on Desktop tells me to "touch my security key". Not sure how that works.
Firefox Android gives me a few hardware options to store my passkey to.
Chrome Desktop asks me to enable Bluetooth.
Chrome Android asks which Google Account to use.
Well, this is arguably a kind of compression, right? So you'd be trading CPU time for fewer bytes? Is that a desirable tradeoff at chess engine scales?
Bit packing/mapping etc. isn't compression the way you're thinking. What it is, is concise. It requires that the program know what each bit means, rather than having the data tell the program what a value means (as a JSON structure might). So it shifts where meaning is assigned strictly to the program: the convention must be encoded in the program, not figured out at runtime. (Technically you could turn this back into a form of config if you wanted; it just wouldn't be jumbled up with your data.) But it doesn't really compress the data itself; it's just efficient at representing it.
[edit] shorter version of the above: it stores the values, but doesn’t store what they mean.
It's not compression in the normal sense of the word. Most parsing is directly to data. So e.g. you know the square of some piece is in the next 6 bits. In languages that allow it you can cast directly from the next bit offset to e.g. a byte. This is going to be dramatically faster than parsing much more loosely structured JSON. As database sizes increase you also get worse performance there, so it's a double hit. So with these sorts of representations you get something orders of magnitude faster and smaller. Sometimes there really is a free lunch!
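The fixed-offset decoding described above can be sketched with a hypothetical 16-bit move encoding (6 bits for the from-square, 6 bits for the to-square, 4 bits of flags; this exact layout is made up for illustration, not taken from any particular engine). Decoding is just shifts and masks; the meaning of each field lives in the code, not in the data.

```python
# Hypothetical 16-bit chess move layout (for illustration only):
#   bits 0-5:   "from" square (0..63)
#   bits 6-11:  "to" square   (0..63)
#   bits 12-15: flags (promotion, capture, ...)

def encode_move(frm: int, to: int, flags: int) -> int:
    return (flags << 12) | (to << 6) | frm

def decode_move(m: int) -> tuple[int, int, int]:
    frm = m & 0x3F           # low 6 bits
    to = (m >> 6) & 0x3F     # next 6 bits
    flags = (m >> 12) & 0xF  # top 4 bits
    return frm, to, flags

m = encode_move(12, 28, 0)   # e2 -> e4, with squares numbered a1 = 0
assert decode_move(m) == (12, 28, 0)
assert m.bit_length() <= 16  # the whole move fits in two bytes
```

Compare that to a JSON record like `{"from": "e2", "to": "e4"}`: the packed form needs no parser and no per-move allocation, at the cost of the layout convention being hard-coded in the program.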
Also, I'd add that the sizes involved here are kind of insane. I wrote a database system that used substantially better compression, averaging out to ~19 bytes per position IIRC, and I was still getting on the order of 15 gigabytes of data per million games. Ideally you want to support at least 10 million games for a modern chess database, and 150 gigabytes is already getting kind of insane - especially considering you probably want it on an SSD. But if that were JSON, you'd be looking at terabytes of data, which is just completely unacceptable.
To give you an example, the Syzygy tablebase for all endgame positions with 7 pieces remaining is 18.4 TB. The estimated size for 8 pieces is 2 PB.
Different applications have different requirements: if you want to host a website with real-world tournament results involving only humans, you can probably get away with using more bytes. But if you're writing an engine that uses pre-computed positions, you want to be as compact as possible.
I did laugh a bit at this, because "conventional server" and "64 TB RAM" is hilarious to think about in 2023, but will probably be the base config in a Raspberry Pi in 2035 or so:
> In 2020, Ronald de Man estimated that 8-man tablebases would be economically feasible within 5–10 years, as just 2 PB of disk space would store them in Syzygy format, and they could be generated using existing code on a conventional server with 64 TB of RAM
Is your assertion that it takes more time for a CPU to read values out of a 30 byte struct and do a couple shifts and branches than to parse a JSON representation?
There are N possible sequences, and you try N times with a success probability of 1/N each (because it is a good hash function). This means the expected number of hits is 1.
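The expectation argument above can be sanity-checked empirically. The sketch below stands in for a "good hash function" with SHA-256 truncated to its first byte, so N = 256; the truncation and the trial counts are assumptions made for illustration.

```python
import hashlib
import random

# N possible hash outputs; each random input hits a fixed target with
# probability 1/N (assuming the truncated hash behaves like a good hash).
# Over N tries, the expected number of hits is N * (1/N) = 1.
N = 256  # truncate SHA-256 to its first byte: 256 possible outputs

def h(x: int) -> int:
    return hashlib.sha256(str(x).encode()).digest()[0]

random.seed(0)
target = 42
experiments = 200
hits = sum(
    h(random.getrandbits(64)) == target
    for _ in range(experiments * N)  # 'experiments' runs of N tries each
)
avg = hits / experiments  # average hits per N tries
print(avg)  # close to 1, as the expectation argument predicts
```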
A language model estimates the probability of a sequence of words P(w_1, ..., w_n) or equivalently P(word | context).
For compression, word sequences that have higher probability should be encoded with shorter codes, so there is a direct relationship. A well known method to construct such codes based on probabilities is Huffman coding.
This works whether you use a statistical language model using word frequencies or an LLM to estimate probabilities. The better your language model (lower perplexity) the shorter the compressed output will be.
Conversely, you can probably argue that a compression algorithm implicitly defines a language model by the code lengths, e.g., it assumes duplicate strings are more likely than random noise.
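As a concrete sketch of the probability-to-code-length relationship: the snippet below treats unigram word frequencies as a (very crude) language model and builds Huffman codes from it, so more probable words get shorter codes. The corpus and function names are made up for illustration.

```python
import heapq
from collections import Counter

def huffman_codes(freqs: dict[str, int]) -> dict[str, str]:
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # word or a (left, right) pair. The tiebreaker keeps comparisons total.
    heap = [(f, i, w) for i, (w, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least probable subtrees.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1
    codes: dict[str, str] = {}
    def walk(tree, prefix: str) -> None:
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # single-symbol edge case
    walk(heap[0][2], "")
    return codes

text = "the cat sat on the mat the cat".split()
codes = huffman_codes(Counter(text))
# "the" is the most frequent word, so no other word gets a shorter code.
assert all(len(codes["the"]) <= len(c)
           for w, c in codes.items() if w != "the")
```

A better language model (an LLM instead of unigram counts) would assign sharper probabilities, and arithmetic coding would squeeze out the fractional bits that Huffman's whole-bit codes waste, but the principle is the same.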
Why? Only if you reuse the same data multiple times, or if you use a unique random generator that is easily distinguishable from other data. So maybe don't use the 5x5 island, but when done right the approach shouldn't help fingerprinting?
That's probably because you confused Vim and Kakoune key bindings.
Kakoune
- n: next match
- alt+n: previous match
- shift+movement: extend selection by movement
Vim
- n: next
- shift+N: previous match
So if you press shift+N as you're used to from Vim, you start adding a lot of selections instead of going to the previous match. I believe this difference is the most confusing for people who switch from Vim to Kakoune.
Are you sure about that? I haven't seen a node app built from source on nixpkgs yet. That includes Electron apps like Signal Desktop, which is a bit disappointing.
There is this article about trying to package jQuery on Guix:
Guix has several different npm importers (none of them merged), but it's debatable whether it is desirable to build npm packages from source when it creates thousands of barely useful packages.
> AlphaDev uncovered faster algorithms by starting from scratch rather than refining existing algorithms
Finding that specific optimization, especially when given the comments, seems almost trivial by comparison.
Edit: I tried to understand the optimization in question. This is not the full sort3 algorithm; it only works under the assumption that B < C. In that case the GPT-4 answer is actually wrong, because it wasn't given that assumption.