Some quibbles: \* URNs weren't *just* for ISBN-like identifiers from an assignme...

btrask · on Aug 5, 2015

Great points, you've clearly read the RFCs. :)

- You're right that URNs were not just for ISBNs, but they shaped the formation of the standard and (IMHO) made it inapplicable for hashing. A content addressing system that can't resolve any of the standardized URNs wouldn't be very useful. FWIW one of my earliest prototypes used URNs, and I still use them for ISBN links!

- Magnet links work fine today, but if you look at the original proposal[1], they really were designed for all of the wrong reasons (including explicitly popping up a JS handler). In practice everyone who uses magnet links tunnels URNs through them, which serves no point for a general purpose system. A system that supports magnet links must also support URNs (meaning the arguments against using URNs apply, and magnet: doesn't add much value).

- I've considered eventually adding ni support to StrongLink, although it's not like anyone else uses it so it wouldn't be an interoperability win. I think my hash scheme proposal is much better, so I'm hoping we can just forget about ni entirely. (But to be clear, it's extremely easy for a system to support both.)

[1] http://magnet-uri.sourceforge.net/magnet-draft-overview.txt (warning: SourceForge link)

gojomo · on Aug 5, 2015

I wrote the original magnet-URI proposal, so trust me when I say the JS-stuff was a demo hack, and the content-based names the real point. (Essentially no one ever implemented the JS-handler-negotiation, which was a quasi-web-intents mechanism before that concept was named.)

Magnet-URI's immediate predecessor was the Hash/Urn Gnutella Extensions, 'HUGE' [1], and the reason that all the examples in the magnet-URI spec are hash-URNs, and that such hash-URNs are the main way magnet-URI has been used, is because that's what magnet-URI was for.

I respect that your design opinion is that URNs aren't good for this; it's just false for you to say hash-names are against the URN specs. Neither the language of the URN specs nor historical practice supports that idea. And, hashes are, as you clearly agree, a great way to generate "persistent, location-independent, resource identifiers" (the stated purpose of URNs).

A system (P2P, CDN, local content-addressed stores, etc.) can be plenty useful even if it chooses to support only some URNs, or only some magnet-URIs. All the magnet-using systems have essentially ignored standardized/assigned URNs, and instead used ad-hoc hash URNs, and in total they've been quite useful to a lot of people.

[1] http://rfc-gnutella.sourceforge.net/Proposals/HUGE/draft-gdf...

btrask · on Aug 5, 2015

Sorry, I didn't know Magnet was your work. I think it's an example of the inner-platform effect (tunneling URIs in URIs), but I agree it's served a lot of applications (especially BitTorrent) very well.

You're right that hashes aren't prohibited by the URN specification. My argument is that the URN spec doesn't prohibit anything, because it's too broad. In fact, I think that URNs boil down to URLs, because in order to resolve many schemes (including ISBNs), you need a dynamic lookup to a central authority. I've been considering an article called something like "Locations, names and hashes" in order to explain that locations and names are effectively the same, but hashes are fundamentally different. That is my opinion of the underlying reason why URNs failed to catch on (aside from BitTorrent).

Even the practical point of interoperability is moot, because BitTorrent namespaces its hashes. It's impossible for another system to support existing URN/magnet links without "emulating" torrent files (which introduces too much ambiguity anyway).

BTW, the http://memesteading.com/ link in your profile appears to be broken.

Edit: I see you've worked on a lot of things I've read about, e.g. WARC files. Do you currently work at the Internet Archive? I was planning on approaching them at some point with some ideas.

gojomo · on Aug 5, 2015

Yes, I think many now recognize that the original idea of a stark contrast between URLs and URNs doesn't fit the fuzzy reality. (RFC3986's "URI, URL, and URN" section, https://tools.ietf.org/html/rfc3986#section-1.1.3, acknowledges this point.) My interpretation is: there's quite a few de-facto URNs in use, just without the official label "urn:" or namespace registration, which in practice has turned out to be an unnecessary formality.

(Thanks for the note regarding memesteading.com; mapping updated to work now.)

I'm no longer regularly doing anything for the Internet Archive, but can definitely help make contact! If you're in the bay area, a good way to start learning more about its projects (or show off your own) is to attend the open-house lunches, held most Fridays. (You should just shoot them a note or call before showing up, so they know the expected attendance, or warn you if it's one of the occasional days that it's not held.)

btrask · on Aug 5, 2015

Rather than de facto URNs, I'd say de facto URLs, but we don't have to quibble over that.

I'm on the east coast, unfortunately (North Carolina).

HN is going to start capping the thread depth, but you're welcome to email me if you'd like to talk more (bentrask@comcast.net). I've been trying to come up with an archival web proxy or something sort of related to WARC and the tooling around it, possibly using content addressing (although converting existing web pages seems ugly and I haven't found an ideal way).

chmike · on Aug 5, 2015

The authority is optional in a URI; the path is not. A blank authority is not the same as an absent authority. Look at the examples in section 1.1.2 (page 6) of RFC3986: https://tools.ietf.org/html/rfc3986.

The URI for the mailto scheme has only a relative path without authority and subdirs.
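
To make the blank-versus-absent distinction concrete, here is a minimal sketch that applies the generic splitting regex from RFC 3986, Appendix B. The helper name `split_uri` and the returned dictionary layout are illustrative assumptions, not part of any spec or library. An authority of `None` means the `//` marker was absent; an empty string means it was present but blank.

```python
import re

# Generic URI-splitting regex taken from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Split a URI into its five generic components (hypothetical helper).

    authority is None when no '//' is present, and '' when '//' is
    present but nothing follows it before the path.
    """
    m = URI_RE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


if __name__ == "__main__":
    # mailto: has no authority at all; file:/// has a blank one.
    for example in ("mailto:John.Doe@example.com", "file:///etc/hosts"):
        print(example, split_uri(example))
```

Note that the regex always yields a path component, even if it is the empty string, which is how a URI with nothing between the scheme and the query still parses.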

gojomo · on Aug 5, 2015

The RFC3986 path may be empty – which is essentially the same as absent.

Also, when magnet-URI was composed, the URI spec was RFC 2396, which clearly describes <path> as one of the elements that "may be absent":

https://tools.ietf.org/html/rfc2396#section-3

Also both RFC2396 and RFC3986 allow for URI schemes where everything after the 'scheme:' is opaque, and need not be strictly interpretable as authority/path/etc. (RFC2396 mentions that it will still refer to this opaque-part as a 'path', because "they are mutually exclusive for any given URI and can be parsed as a single component".)
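
As a rough illustration of how this plays out for magnet links, the sketch below uses Python's standard `urllib.parse` to split a magnet URI (empty path, everything in the query) and then breaks each `xt` value of the form `urn:<namespace>:<value>` into its namespace and hash. The function name `magnet_hash_urns` and the hash strings in the usage example are placeholders invented for this sketch; real magnet links carry other parameters that are ignored here.

```python
from urllib.parse import urlsplit, parse_qs


def magnet_hash_urns(magnet_uri):
    """Return (namespace, value) pairs for each URN in the 'xt' parameters.

    Hypothetical helper: it only handles 'xt' values shaped like
    'urn:<namespace>:<value>' and silently skips anything else.
    """
    parts = urlsplit(magnet_uri)
    if parts.scheme != "magnet":
        raise ValueError("not a magnet URI")
    found = []
    for xt in parse_qs(parts.query).get("xt", []):
        pieces = xt.split(":", 2)  # 'urn', namespace, remainder
        if len(pieces) == 3 and pieces[0].lower() == "urn":
            found.append((pieces[1].lower(), pieces[2]))
    return found


if __name__ == "__main__":
    # Placeholder hash values, not real content identifiers.
    uri = "magnet:?xt=urn:sha1:PLACEHOLDERHASH&xt=urn:btih:ANOTHERPLACEHOLDER"
    print(urlsplit(uri).path == "")  # the path component is empty
    print(magnet_hash_urns(uri))
```

The namespace in each pair (for example `sha1` versus `btih`) is what a consuming system would have to recognize before it could do anything with the hash.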