Dead USB Drives Are Fine: Building a Reliable Sneakernet (complete.org)
107 points by pabs3 on Sept 4, 2022 | 82 comments



As a teen in the 90s, I sometimes found sneakernet even more fascinating and exciting than internet connectivity. It had an air of secrecy to it: little bundles of precious data carried between a select group of people on various, often cumbersome and/or expensive storage media. When everything is connected by a bunch of wires, it's just too easy.

Sometimes I wish for a return to that feeling of preciousness, instead of incomprehensible amounts of data carelessly shoved down wider and wider tubes.

For my own sneakernet-like uses, I turn to git-annex. See use case “The Nomad” here: https://git-annex.branchable.com/
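
For anyone curious what that looks like in practice, here is a minimal, hedged sketch of the sneakernet workflow (paths and repository names are placeholders, not taken from the linked page):

    # on the laptop: create an annex and add a big file
    git init ~/annex && cd ~/annex
    git annex init "laptop"
    git annex add party-video.mp4 && git commit -m "add video"

    # clone the repository onto the USB stick and push the content there
    git clone ~/annex /media/usb/annex
    (cd /media/usb/annex && git annex init "usb-drive")
    git remote add usb /media/usb/annex
    git annex copy --to usb party-video.mp4

    # on the destination machine: clone from the stick and fetch the content
    git clone /media/usb/annex ~/annex && cd ~/annex
    git annex get party-video.mp4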


I find it funny how sending files securely still isn't a solved problem.

Record your niece blowing out the candles at her birthday party, and the next day her mom asks you for the original (high quality) video file so she can create a party video... the file is shot on a phone in 4K and uses 300MB of space.

Mail? Nope, too big. Chat platforms? Too big. Cloud upload? You have to share a link, and some services (ahem, Skype) actually open those links when you send them... who knows, they might even save the files there. Also, do you really want personal videos in the cloud? HTTP/(S)FTP servers are too much of a pain to set up for a one-time need. We at least had DCC on IRC; now not even that.

So a USB flash drive it is... and a car ride.


The problem is receiving files, not sending files. You can mail someone an SD card and they will receive it at their house's mailbox. But if their mailbox fills up, the letter will be returned to you. And there is a maximum size for mail that fits in a mailbox. Similarly, people need a digital mailbox you can send files to, but they need to extract their mail from it and empty it out or it will fill up.

We actually made a great solution for this in the 90s: mailservs, inspired by Usenet. We made AOL bots that would take a file, chunk it into 1.5MB sections, and send each via email attachment. Then a client side program would download all the attachments and reassemble them. We downloaded entire movies this way. After download, the mailbox was emptied; if it wasn't, the next transfer would fail.

So people just need mailserv programs to send files, and download their mail regularly.
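
The same chunk-and-reassemble trick is easy to reproduce with standard tools today; a rough sketch (filenames and sizes are just examples):

    # sender: split a big file into 1.5 MB pieces and note a checksum
    split -b 1500k party-video.mp4 party-video.part.
    sha256sum party-video.mp4

    # receiver: after downloading all the attachments, reassemble and verify
    cat party-video.part.* > party-video.mp4
    sha256sum party-video.mp4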


For a while now I think we have been lacking an enabling shareable storage and auth abstraction to build cloud apps on. From this standpoint email is a little bit of shared storage: owner read, public write (with email specific secure acceptance algos), group chat: account read, account write; note app: owner read, write.


I'm sure you can come up with abstractions, but without first doing a whole system design (FRs & NFRs, mapping dependencies, customer requirements, etc) you'll have to scrap those abstractions when they don't match up with how the system ends up needing to operate. So I would recommend not even thinking about technical abstractions until you have a very large multi-layered system visualization.

For example: what is storage? A router stores packets temporarily; is that storage? A DNS resolver caches records; is that storage? A browser stores cookies; is that storage? A chat or email server & client may both store messages in different states for different purposes. What storage should be shared and shouldn't be, in what specific circumstances? Who is allowed to read and write to which thing at which time in which circumstance? All of that will affect your abstraction.


These are fair suggestions for the low-level detail of my comment. I think the core functionality is a web-friendly storage access API that provides for user control and syncing between multiple nodes. It's been done before, but not recently with more modern and lighter protocols. Apps like the classic sample todo or note-taking apps frequently shown on HN should really just be able to connect to user-owned storage - get authorized for a storage allocation either fronting local storage or a node that's easy to spin up on a cloud-hosted VM.

It's sort of a shame that so much of the new web3 storage work intrinsically links storage to a blockchain, instead of providing a blockchain integration as one of many authentication/provisioning models.


> Apps like the classic sample todo or note-taking apps frequently shown on HN should really just be able to connect to user-owned storage - get authorized for a storage allocation either fronting local storage or a node that's easy to spin up on a cloud-hosted VM.

Let's think of the simplest possible implementation of that.

A todo app wants to write to 'your storage'. What are your options for storage? If it's something like Google Drive, that is a proprietary interface, so right off the bat you are now implementing vendor-specific things, so an abstraction is not worth much. You could make some kind of "JavaScript Framework For Storage", aka a js library that has 50 different proprietary implementations, but that would only work for JavaScript apps. Which, if you only code in JavaScript, is fine, but if you want to support some other application written in another language, now somebody has to maintain those 50 different proprietary implementations in another language too. That's just not sustainable.

If instead you want to create one standard web API to access storage through, even if you could define exactly what it should do, you now need to get Google to implement that one standard web API. But why would they? They already have their own Google Drive API which works perfectly well for their own purposes. You would have to show them some significant business advantage to throwing away all the money and code they've sunk into their own API (not to mention that all their customers and their apps have sunk into it) and build and adopt this new API. (That assumes you could even get them to agree to whatever standard API you created, as they may want a half dozen extra features that have nothing to do with storage)

By modeling the whole system, you can quickly see all the weird quirky problems that you will run into trying to make this API and get it adopted. It's not impossible, but it's a much larger problem space than you imagine at first.


This sounds like Tim Berners-Lee's new project. https://solidproject.org/


Maybe? Skimming the protocol, there are aspects of the spec I may not be understanding that superficially seem at odds with a storage container with fairly simple interfaces. Though the security aspects seem to be at about the complexity they have to be. I'm probably missing the full aims of the protocol, though.


In my opinion TBL has lost all credibility after the encrypted media extensions debacle. This pays lip service to privacy but given he has put corporate interests ahead of individuals before I would not trust it. I suspect it is some kind of poisoned chalice.


YoU'vE g0t mE nOstAlgiC for AOL. I wrote my own chat file servers and apps for downloading (which amounted to opening a span of emails and adding them to a download manager). The chat servers I made/used forwarded the uploaded emails, first packaged with a tool like you're talking about if not included, but then you'd typically unpack it yourself after the download. AOL has a bad rep with tech people, but it was sure nice having them host all this rather than what you might have elsewhere, and it was a fun platform to play on.

Tasks like this, or even punters (HTML heading tag denial of service through instant message, lol), were what got me interested in programming in middle school - no training or education, just hacking around trying to figure things out in a pirated VB 3.0. The community was great, with private chatrooms full of users testing and helping others, trying out tools. People shared libraries to learn from, use, adapt. Anyone remember the popular old and terribly named genocide.bas? I'd like to get my hands on some of these old proggies and code if they're still out there... Or even just the cartoon sheep program which would run around the desktop interacting with window borders.


A passworded zip file would probably be my choice, easy enough to open at the other end even for technologically illiterate people, and doesn't take anything fancy to make.
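
As a hedged sketch, either of these produces an encrypted archive most people can open (filenames are placeholders; note that classic zip encryption is weak, so 7z with AES is preferable if both sides have it):

    zip -e party.zip party-video.mp4          # prompts for a password
    7z a -p -mhe=on party.7z party-video.mp4  # AES-256, also hides filenames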


The video is already compressed, a ZIP will not save any significant amount of space in that case.


Pretty sure they're zipping for the encryption, not the compression.

edit: the way I'd solve the rest is just by using a torrent magnet link. The problem with that is that you might have to teach someone how to open a port for UDP traffic, but after they've done it, sending files of any size to them becomes trivial.


If P2P wasn't so awful we could just download from computer to computer. But alas, NAT and IPv4 make that difficult


I also opt for P2P whenever I can, but it does have drawbacks as well, besides NAT. First, both have to be online at the same time. Second, it seems many connections are still not symmetric, so sending files is limited by the upload speed of the sender, which, together with the first problem, makes the downloader's experience suffer because of the uploader.

With that said, better infrastructure (IPv6 + better connections) can make P2P very feasible in the future, hopefully. Or software that defaults to local connections if it's possible (so if we can find the device via a private IP, use that connection instead).


The difficulty created by NAT and IPv4 is drastically overstated. Even if users' devices generally had public IPs, users still wouldn't want to install server software nor leave their computer on. There is no money to be made pushing solutions that cut out the middlemen, so no advertising continually telling people "Try FooTransfer", and thus no network effects. Instead, one user goes "I can send you this using FooTransfer" and the second user goes "that sounds scary and hard".

And doesn't Dropbox work for the given example? That's the mass-market productized/paid/marketed/surveilled solution.


Is P2P awful? I've been using Resilio Sync (née BitTorrent Sync) for a decade now; if there is a reliable piece of software in my stack, this is it! Sharing files is a matter of passing someone a code. It works so well I really don't think about this as a problem.



There seems to be a never-ending stream of these services, each very similar to the last, but none seem to survive more than a few months, a few years at the outside.

I think GP's point is that this churn indicates that it's not a "solved problem". It's clearly solvable, but whoever runs these services doesn't find a revenue stream. It should probably be provided by your ISP just like email and usenet once were, but the cultural expectation isn't there. Maybe they get shut down because people use them for bad things? I don't know.


This has been around for years. There's 'churn' because there's almost nothing to hosting this service. None of the data is sent or goes through a cloud service; it's all P2P connections. There only needs to exist a central server for brokering initial connections and handshakes between clients. Anyone can set up and run a service like this with a few-dollar domain name and a couple bucks a month of VPS hosting.

I think you're really just wanting some big name or trustworthy player like Dropbox to come out with a similar service. It's not going to happen--like I said, by design wormhole and similar tech doesn't send or store data on any central service. There's no value for some big company to give people this service; they're literally just burning bandwidth to shuttle invisible (to them) bits and data. They can't extract anything like advertising targeting, revenue, etc. from the service. It will always exist as some weird self-hosted thing.


I use a personal NAS which solves that problem nicely. The other option that’s much closer to sneaker net would be use of a direct file transfer protocol like Airdrop. There’s also stuff like file.pizza, but that’s not always reliable IME.


NAS with a DynDNS you have to pay for because ipv4 addresses are scarce I take it? I remember when I could fire up a http.server in python to host some local file, and it would be accessible across the world for as long as I wanted it
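
(For reference, the one-liner in question is roughly this; the port and directory are just examples:)

    python3 -m http.server 8000 --directory ~/share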


I point my own custom domain at it, so that part isn’t free. I have my NAS configured to automatically update the DNS record when/if it changes.


The difficulty in sending files isn't a technical problem, it's a political problem. In particular, any efficient file transfer system will immediately be used by people to make copies of copyrighted materials. This will ultimately get them in trouble with the law and shut down. So we can only have file transfer systems with obnoxious limitations.


Google Drive or Dropbox link shared to their email address isn't secure enough for a video of your niece's birthday party??


What is your take on this: encrypt your large file and send it over via some common transfer service?
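
As a rough sketch of that approach (the filename is a placeholder), symmetric encryption before upload can be as simple as:

    gpg --symmetric --cipher-algo AES256 party-video.mp4   # writes party-video.mp4.gpg
    # recipient runs: gpg --decrypt party-video.mp4.gpg > party-video.mp4

The remaining problem is sharing the passphrase over a second channel.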


Isn't this the exact use case of all WebRTC-based file sharing tools?


Forever memories of high school around 1990, passing around backpacks full of floppies in front of our uncomprehending classmates...

And of course a lot of time spent copying - that second floppy drive made my life so much easier !


It’s always the second-last floppy in a series of dozens that is corrupted.


I can almost hear the sound of the FDD trying - and failing - to read that one bad sector.


Rar and par files, though!

Shame it arrived so late in the game, because it was really an ideal match for floppies. Five disks of data, two... three... four? disks of parity, use 'em only if you need 'em.


High speed dubbing for the win


Some good times as a student in a big university and student housing system around 2005 - file sharing with the outside world was banned, but inside the whole student network, p2p networks like Direct Connect were very much alive. Some of the best file sharing ever, so many good music collections to browse through.


I feel you. In the 90s I came into possession of a strange floppy disk with this on the label:

    Dear Friend,

    Please tell my story.

    Sincerely,
    Dave Koresh
After doing a thorough virus scan, I examined the contents of the disk. It contained a bunch of stories and essays by various authors in text-file format, nothing dealing directly with David Koresh or the Branch Davidians.

But the FBI raid of the Branch Davidian compound in 1993 caused... unrest in certain pockets of the American politisphere, primarily on the right but groups like the ACLU also got involved, and Janet Reno was mocked on SNL for her role in the incident. Nobody liked the Branch Davidians, they were a weird-ass cult preparing for an apocalypse that never came except maybe in a parrot sense[0], but there were concerns that the FBI (and potentially the ATF before) acted too aggressively, overstepping legitimate law-enforcement bounds and violating the Davidians' rights, and that caused a lot of political introspection: on our status as a freedom-respecting republic, the competence of our federal law-enforcement apparatus, what is the threshold beyond which the government legitimately could or should take action against weird-ass prepper cults. So there was a lot of philosophy and politics stuff in there, often in the form of USENET postings and other 90s internet copypasta, that dealt tangentially with those issues.

It felt... kind of weird and awesome to have and to read that stuff. Cyberpunk. Here were thoughts that people felt not quite safe transmitting or discussing openly, so they were copied and distributed as floppy-disk samizdat. It was a mysterious object, alarming at sight because it suggested that Koresh might still be alive somewhere (and the shortening of "David" to "Dave" made him seem... humbled?), whose contents held even deeper and more thought-provoking mysteries.

[0] "I heard tell once of a Jefferson City lawyer who had a parrot that would wake him each morning crying out 'today's the day the world shall end as scripture has foretold'. And one day, the lawyer shot him for the sake of peace and quiet I presume, thus fulfilling, for the bird at least, his prophecy." -- Daniel Day Lewis as Lincoln in Lincoln (2012).


Was transferring a 600KB (KiloByte) file to a friend. After 2 hours it failed. Tried again. Failed. Realized both times it ran out of hard drive space. I was 52 Bytes short.

I proceeded to walk a floppy over to his house 2 miles away.


git-annex by default will consider your data dead when it can't be checked in real time, but you can override that so it trusts the data will survive offline forever.

I'd love for it to have some kind of expiry on it after which I need to fsck it.


> I'd love for it to have some kind of expiry on it after which I need to fsck it.

There is "git annex expire" (https://git-annex.branchable.com/git-annex-expire/).


That's cool. However, I don't like the idea of plugging in USB sticks and having them modified by anything. OS X was (and still is) a terrible offender: plug in a USB drive and the OS sneakily inserts stuff outside partitions (so you cannot easily find it, but if you "dd" and take the hash of the drive, it's different before and after simply plugging it into an OS X computer: to me that is pure madness, and it took me a while to figure out what was going on).


Pretty sure Apple at the very least injects a hidden file that's easy to see if you plug the drive into an Apple device and then into a non-Apple device; no idea why they are adding it, though 99% sure it is for file operation administration.

Edit: Appears at least one of the files added is called “.DS_Store” and is still around:

https://appleinsider.com/articles/22/02/19/google-drive-user...


I learned a roommate was snooping around my flash drives because of this. A bunch of .DS_Store files appeared in a drive that I had never connected to an OSX machine.


It's the equivalent of desktop.ini on Windows, so yeah, Finder customisation etc.

Except Mac software does fuck all to prevent it "leaking", so pretty much every zip from a Mac user contains one of these in each folder. The only case where we had desktop.ini do the same was with Dropbox.


DS_Store just includes basic window configuration information for Macs. Stuff like list vs icons, size of the columns, etc, so you have a consistent UI look when you're accessing your remote drives.

It can be toggled to avoid it on removable devices when they're connected to Macs. It's benign, though I guess annoying to some non-Mac users.
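
If it helps, the toggles I know of are these defaults keys (the USB one is newer and may not exist on older macOS versions, so treat this as a hedged sketch):

    defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool true
    defaults write com.apple.desktopservices DSDontWriteUSBStores -bool true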


I have to wonder - are there any USB sticks available that have a read-only toggle like floppies used to have?


No, or very niche, and there's no guarantee that the few that do have write protect enforce it in hardware. For cases when you absolutely must not allow the flash content to be modified, there are USB devices called write blockers. They're used mainly in forensics, where the data might be used as evidence in court and there needs to be a clear chain of custody and an audit trail to say the bits analysed are the same as the bits when the device was first seized.


You might want to have a look at USB SD card readers instead of USB sticks. Many (most?) normal- and mini-size (but not micro-size) SD cards and adapters have write-protect slider switches. Be careful though: I've found some of these sliders to be rather flimsy, and I've had several break off or fall out of SD cards and adapters over the years.


They exist but they're more expensive. One brand is Kanguru: https://www.kanguru.com/collections/kanguru-usb-drives-with-...


The original USB sticks had this, but it was no longer included as the devices became commoditized.


Disk Arbitrator allows you to force OSX to mount new volumes as read-only (or prevent mounting them entirely) to avoid this. https://github.com/aburgh/Disk-Arbitrator

I wish it were built-in.


I didn't understand whether it can handle this use case: a file server with (simple case) a hot-swappable SATA3 bay (or any other bus) and N disks (N - 1 are offline). Send files to the server (rsync, scp, anything), remove the disk and replace it with another one, iterate. NNCP should handle this so far. But would it be able to tell me on which disk a file is stored?

If this were handled at file system level, the FS should pretend all disks are online to let me list files and directories. When I want to access an offline file or mkdir or perform any write operation on offline disks, it would raise an error and tell me which disk to load into the server. Then I try again.

If NNCP doesn't do that, is there any Linux file system able to do it? I googled but I didn't find much.


That use-case seems like what git-annex is designed for.
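
A hedged sketch of what that looks like, with each removable disk initialized as its own git-annex repository (file and repository names are placeholders):

    git annex whereis archive/big.iso   # lists which repositories (disks) hold a copy
    git annex get archive/big.iso       # if none of those disks are plugged in,
                                        # it fails and tells you which ones have the content

The directory-listing part of the question works too, since git-annex keeps a (possibly dangling) symlink or pointer file for every annexed file, even when the content itself lives on an offline disk.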


Curious, anyone know of any active dead drop networks besides the ones listed below?

This website has been around since at least 2010, but a number of the drops listed were physically removed and not delisted from the site:

https://www.deaddrops.com/


I like the idea, but I'm not sure I would plug a random USB thing into my computer...


Agree. Generally speaking, it's an untrustworthy system, and in practice it might be malicious via the files, the USB firmware, being connected to a high-voltage line, etc.

I do think, though, that as a social experiment they're fun, as long as you're only using throwaway systems/devices.

Beyond that, I personally would not suggest damaging property, but instead affixing the USB drive to a padlock like this:

https://ibb.co/7jmNXNR

All in all, it's basically the same as geocaching [1] — which obviously could be dangerous if someone wanted to be malicious, but I've never heard of that happening; I have heard of malicious USB dead drops, though.

[1] https://en.m.wikipedia.org/wiki/Geocaching


A safer alternative would be something like a piratebox (https://piratebox.cc/) instead of a USB dead drop, since a piratebox would have no physical connection to your computer.


I keep being annoyed that Piratebox, Internetinabox, Othernet, aren't all the same project. Like of course an IIAB node should include a messageboard. Of course an Othernet node should have support for offline payload bundles and user uploads. Of course a Toosheh bundle on newly-inserted storage media should be automatically extracted and presented in the interface...

If I ever grow some more software clue, I'll be trying to unify all of the above into a single interface. Wish me luck.


(IIAB) being “Internet In A Box”

_____________

Toosheh being a “satellite filecasting technology deployed in Iran and the Middle East that uses common satellite equipment to deliver digital content without relying on access to the Internet” per Wikipedia:

https://en.m.wikipedia.org/wiki/Toosheh


Yes, agree, wireless dead drops are a thing.

That said, unlike cheap and replaceable USB sticks, they're easy to find using RF analysis. As such, unless it's physically secured, it would likely be quickly stolen. It would also likely require a power source and weatherproofing.


More information on NNCP (including what those four letters mean) here:

http://www.nncpgo.org/


I used sneakernet years ago, mostly as a last-mile connection to the internet. I would go to a place with more bandwidth, download stuff, burn it to CD or save it on an external hard drive, and go back.

I think many people still do it like that. There are 3.7 billion people in the world without internet.


Back in the day, when I went to what you could describe as an engineering high school, we had very limited internet in the dormitory as well as at home. So we organized into teams, where the leader would gather warez links for his teammates to download during the weekend. On Sunday and Monday, back at the dorm, we would upload all the series, movies and stuff to our data hoarder's FTP server, giving many more people access, who in turn would copy the stuff to USB HDDs to carry it to school. There it got copied to our classmates' storage, and from there it got distributed throughout the region, reaching I don't know how many people. My team, doing mostly series and games, had about 10-16 Mbit/s; all teams together reached an unheard of (for consumer internet in that region) bandwidth. We just couldn't have done it alone.

Good times


I ran an open CIFS volume (windows share) in our dorm for music and movies. This was just after Napster went down but before Limewire or something. So the only way to share music was to upload everything you had and have access to whatever everyone else uploaded. You just dropped what you had in a writable folder and it would organize it in read only folders, doing any deduping for identical files and flagging non-identical files with the same meta or anything it didn’t understand.


This reminds me of Angelfire's FTP setup. You got free hosting, but to upload, you had to upload to their open FTP server and use the control panel to copy files into your webspace. The files on the FTP server had an expiration of so many hours, but were otherwise unrestricted. You could upload files larger than what you could use in your free space. It became rather interesting to just browse what people were uploading to the FTP server. I eventually found some people were just using it to transfer files, and some were using it like a message board, dropping .txt notes to each other. These notes were usually just things to indicate some larger file that was being uploaded.


I did the same, going to my dad's workplace. I had no USB drives, but pushed CD-RWs and multisession CDs to their limits. Cool times!


I've been having to do that in Silicon Valley because of the data caps on our crappy Quest internet service. I take my MacBook Pro in to work, download new SDKs, customer memory dumps, etc, then hoof it back home to work on them


It's cool and sounds like reinventing UUCP, which has the advantage of serving as a transparent transport for ordinary email.


http://www.nncpgo.org/Comparison.html

This is really amazing stuff. Years ago, I was heavily interested in scuttlebutt [1]. I'm interested in grid down and "occasionally connected" communications. I particularly liked the island to island sailing analogy.

SSB has a few downsides - mainly that your client needs to download full logs. You can't just request the last 30 days. They call this out as well that your first sync could take over an hour and consume several gigs of data.

I'm just today learning about NNCP, but I've used FidoNet and UUCP in the past, so the concept is pretty familiar to me. In my mental model, I would be less interested in sneakernet than standalone wifi hotspots that one could connect to and exchange data, possibly combined with an AREDN style mesh[2].

[1]: https://staltz.com/an-off-grid-social-network.html [2]: https://arednmesh.readthedocs.io/en/latest/arednGettingStart...


You may be interested to note that NNCP integrated Yggdrasil support recently (though you can also run Yggdrasil at the OS level). Yggdrasil is an always-encrypted IPv6-based mesh, and is a perfect fit for something like ad-hoc wifi (since the nodes can discover a route to each other based even on RF paths).

Yggdrasil can also run as an overlay network atop standard Internet (IPv4 or IPv6), or both. It will opportunistically find peers on a local broadcast domain and find routes to other networks over the standard Internet if need be.

Feel free to drop me a note if you like; I had a very similar experience with SSB. NNCP, while it has a bit of a learning curve, Just Works. It processes thousands of packets for me every day (hourly ZFS snapshot backups for every filesystem I have), some of which are huge, and it Just Works.
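
A hedged sketch of that pattern (the node name, dataset, and snapshot names are placeholders):

    # queue an incremental snapshot as an NNCP packet for the backup node
    zfs send -i tank/home@prev tank/home@now | nncp-file - backupbox:home.incr.zfs

    # on the backup node, once packets have arrived (over the network, or via
    # a USB drive and nncp-xfer), process the inbound spool
    nncp-toss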


Yeah, this use-case fascinates me as well. I've been noodling around with the idea of a 249-gram drone that flies SD cards between locations. BVLOS is still problematic in the US regulatory environment, but perhaps useful elsewhere. And I can always do demo flights indoors.

It would be ultra cool, IMHO, to fly SSB and NNCP traffic between "islands", whatever form those take. For bonus points, do it with pigeons, which have the advantage of operating in a GPS-denied environment.

I don't mind the full sync for SSB, I mind the lack of any progress indicator while it's doing so. But that seems like an easy UI tweak and maybe they've addressed it since I last played with it. What I haven't figured out is whether a brand-new user can sync from another existing user on the same island, or if they need a real internet connection for that first bit.


From the same author, "NNCP is to UUCP what ssh is to telnet; NNCP is an Encrypted, authenticated, onion-routed version of UUCP!"

=> https://www.complete.org/nncp/
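
A hedged sketch of the basic sneakernet flow the article builds on (node names and paths are placeholders):

    nncp-file big-video.mp4 bob:        # queue the file for node "bob"
    nncp-xfer -node bob /media/usb      # copy bob's outbound packets onto the drive

    # ...walk or mail the drive; then on bob's machine:
    nncp-xfer /media/usb                # ingest packets from the drive
    nncp-toss                           # process the inbound spool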


“Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.”

― Andrew S. Tanenbaum


NNCP has similarities to a blockchain. This is really cool!

> NNCP is to UUCP what ssh is to telnet; NNCP is an Encrypted, authenticated, onion-routed version of UUCP!


Anyone know of an existing automated way of processing an untrusted USB drive to extract known and expected encrypted volumes?

_____

* The core issue I have never been able to work out is how, given the potential for a firmware hack, to make sure only the known trusted data makes it out of the isolated processing system to the trusted system; open to looking at anything though, as long as it's open source - it doesn't have to solve that specific issue.


Worse than malware, eventually you'll get people dropping (https://en.wikipedia.org/wiki/USB_Killer)s for the lulz.


FYI your link got corrupted:

https://en.wikipedia.org/wiki/USB_Killer

And interesting - I always knew high voltage was a threat, but never thought of a capacitor being used, since it wouldn't require an independent power source.


Yeah but I can drop an ADUM3160 isolator on the port for a few bucks. If they kill that, I replace that for a few bucks more.

In most cases a chain of cheap hubs would work too.


For clarification, all USB ports are 5 volts DC; the transformer (or computer) will take care of converting the 120/220 V AC to the necessary 5 volts DC.

Commonly there are two types of isolation chipsets: data and power. For a dead drop, you would want a cheap voltage isolator set to trip/blow a fuse at 5 volts DC — with no data isolator.


What?

So what do you do if someone puts 120 volts on the data lines?

You very much want to isolate both. Which is precisely the function of the chipsets I mentioned.


The data chipset blocks all data flow - the whole point of connecting to the dead drop is to get data; and yes, I agree all lines should be checked when capping voltage, though I would not be surprised if off-the-shelf USB power isolators commonly only check the power lines as defined by USB standards.


No.

No, no, no.

A USB data-line isolator does not block data flow. Its entire purpose is to allow data to be communicated across a galvanically-isolated gap. It uses magnetic (transformer) or optical isolators to do that.

It blocks POWER flow, while passing data. Look up the ADUM3160 datasheet.


>> NNCP guarantees the integrity of packets, but not ordering between packets; if you need that, you might look into my Filespooler program. It is designed to work with NNCP and can provide ordered processing.

Besides Filespooler, any other options?


You'll be reliably tackled and body-searched walking around the data center floor with a USB drive in your pocket.




