Never use a URL shortening service – even if you own it (shkspr.mobi)
91 points by leephillips on Feb 19, 2023 | 77 comments



I've long had the idea -- what if domain names were designed to resolve per year (or even date)?

In other words, if you type something like gu.com[2002]/rest/of/url then it should continue to resolve whatever IP address is chosen by the person who owned it in 2002. (And the IP can be updated by that owner for the rest of time.)

For end users it wouldn't matter -- putting gu.com into the browser bar always resolves to whoever owns it today.

But anytime you embedded a link into a webpage, best practice would be to append today's year (or date) to the domain in the URL. Or if you added a meta tag to the page for when the page's content was created, all links therein would default to pointing to each domain on that date.

This way, change of ownership of domain would never affect existing links.

Obviously it doesn't solve all the problems -- the historical owner still has to be maintaining the site and serving the content. But as domain names get repurposed for rebranding and financial reasons, it means existing links could still work. Especially if we think we'll be continuing to use domain names for generations more.


A subdomain like 2002.gu.com would work without introducing changes to URL specifications, with the domain registrar giving control of that subdomain to the owner of that year. But it's such a niche case and also domains are probably sold because the old service is being shut down, so it would not work in most cases.
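To make the subdomain variant concrete, here is a rough Python sketch of what "pinning" a link to a year could look like on the publishing side. The function name `pin_to_year` is made up for illustration, and nothing here implies that any registrar actually delegates such subdomains; it only shows the URL rewrite.

```python
from urllib.parse import urlsplit, urlunsplit


def pin_to_year(url: str, year: int) -> str:
    """Prefix the host with a year label, e.g. gu.com -> 2002.gu.com.

    Hypothetical helper: it only rewrites the URL text. Whether
    2002.gu.com resolves at all depends on a delegation scheme that
    does not exist today.
    """
    parts = urlsplit(url)
    host = parts.hostname
    if not host:
        raise ValueError(f"no host in URL: {url!r}")
    netloc = f"{year}.{host}"
    # Keep an explicit port if the original URL had one.
    # Userinfo (user:pass@) is intentionally dropped in this sketch.
    if parts.port is not None:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(pin_to_year("https://gu.com/rest/of/url", 2002))
# → https://2002.gu.com/rest/of/url
```

A CMS could apply something like this to every outbound link at publish time, using the page's creation year.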

I think an external service like archive.org that also captures redirects, so it can list all shortened URLs for a page, would be the better solution.


That would break most websites that switch hosts or use DNS based cloud load balancing.

Embedding the date in the URL would be great for an archive.org extension but it would be practically useless for preventing broken links.


The vast majority of uses for this would be to redirect you to the proper domain, which doesn't even need to touch any of your existing tech, since you could use a cloud service to handle the redirect portion.


I remembered that Tim Berners-Lee did write about a similar idea back in 2000 – introducing a date coded domain hierarchy:

https://www.w3.org/DesignIssues/PersistentDomains.html


Son of a gun. I've never felt in more esteemed company. I figured I couldn't have been the only one to think of it. Thanks so much for the link, I never would have found that on my own!


well, great minds think alike.

to be honest, your / TBL's idea is quite good!


That'd be lovely. Wish there were a way to convince ICANN or whoever to do it.


That effectively is the existing archive.org...

https://web.archive.org/web/[date]/[full url]
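For anyone who wants to build those links programmatically, a small Python sketch follows. The helper name `wayback_url` is my own; the only thing taken from the Wayback Machine is the URL pattern above, with the date written as digits in YYYYMMDDhhmmss order. As far as I know the archive redirects to the capture closest to the requested timestamp, so an exact match is not needed.

```python
from datetime import datetime


def wayback_url(url: str, when: datetime) -> str:
    """Build a Wayback Machine link for `url` as of roughly `when`."""
    # The timestamp is plain digits: YYYYMMDDhhmmss.
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"https://web.archive.org/web/{stamp}/{url}"


print(wayback_url("https://gu.com/rest/of/url", datetime(2002, 6, 1)))
# → https://web.archive.org/web/20020601000000/https://gu.com/rest/of/url
```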


This sounds like an idea for an archive.org or an archive.is browser, or plugin.

It could be compromised, maybe, by encrypting a particular date at the end of the URL, in order to point to an earlier link that the user thinks is the current link.


I can barely believe I’m typing this, but an incentivized blockchain (aka Bitcoin et al) is a legitimate option to ensure a coherent and mutually agreed upon history of information in the face of competing economic actors. Too bad about the climate change and scams that seem to be part of the bargain…


We're talking about URL shorteners here. If you have a 20 character URL, which points at a blockchain, which points at a 50 character URL - just store the original 50 character URL and skip the blockchain.

The only real use case for URL shorteners is length-limited fields like 280-character tweets, but Twitter runs their own URL shortening service already, so again, no blockchain needed.


The article talks about ownership transfer of a url database and the potential for history rewriting. An incentivized blockchain could be part of the solution to ensure history rewrites are not economically possible. I don’t have a protocol sketch, this is just a marginalia comment.

Exercise left for an interested reader, which it sounds like isn’t you :)


The article says that such a URL database should never be used in the first place. You don't have to pay anybody to preserve the history of your data, if you just maintain control of your data in the first place. It's just 30 more bytes to store, I think we all can afford the cost.


I don’t mean to belabour the point, but I think the nuance is worth exploring.

The key point in my view is “control”. You’ve referred to Uber.com/Xmas-sale as an example where Uber the economic entity is in charge of their own shortening and therefore avoids the problem. However, Uber the company may fall on hard times, sell their real and intellectual property and we’re facing the same issue of a lifetime mismatch between an owner and a URI.

Incentivized blockchains are theoretically singletons and perpetual; the worst case of a history rewrite is economically disincentivized. The lifetime of the URI database exceeds the URI lifetime.

In practice, I agree with you that companies should ensure control over their URIs over their lifetime. The advantages of a blockchain approach are far outweighed by the current downsides.


If I understand you, if the URI database outlives the URI itself, the URIs in the database would still be useless, right? If somebody buys uber.com, it does not matter if we preserve "uber.com/xmas-sale", since the uber.com domain can be pointed at a different server. So you need to save the content as well, which means you'd need something like IPFS. I could see the usecase for something like IPFS on a blockchain (assuming IPFS is not working well enough as is; I have no idea)


I think it would not be controversial to say that a blockchain would not address most of the problems IPFS currently has.

Just to avoid misunderstanding -- IPFS right now does work pretty well, but there is plenty of room for improvement.


The Guardian did maintain control of their data.


The other real use case is printed URLs. Also, why is Twitter’s URL shortener exempt from this concern? If I recall right, they got a special exception to have the only single-character .co domain, and if they go into bankruptcy…


> printed URLs

True, but I still think users would trust "uber.com/xmas-sale" over "tinyurl.com/FgL82" or whatever. It's not that much extra typing.

> Also, why is Twitter’s URL shortener exempt from this concern?

Twitter's character limit means that some URLs can't fit in a tweet. So some form of shortening is required, but it's still definitely not ideal. Because everything is owned by the same company, they are controlling their own data (without a blockchain), in a roundabout way. I would never use a t.co link outside of Twitter.


> True, but I still think users would trust "uber.com/xmas-sale" over "tinyurl.com/FgL82" or whatever. It's not that much extra typing.

I would.

One of the main problems I have with shortened URLs is that they remove transparency. I have no idea where the shortened URL actually goes unless I click on it. That means that I have to blindly trust whoever is giving me the shortened URL.

In practice, that means I won't use shortened URLs unless they come from a person or company that I already trust.


Ethereum is a better choice because it has no impact on climate change and it’s more secure. Also smart contracts make it more convenient to store information. You can build a contract that allows anyone to publish a hash in 5 minutes. You could have a map from address to an append-only list of hashes. Anyone who calls the publish(bytes32 hash) function just adds to their own list.
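To show the shape of that data structure, here is a toy in-memory model in Python. It is not a smart contract and nothing in it is enforced by a chain; in a real contract the caller's address would come from the transaction rather than being passed in. The class name `HashRegistry` and the `history` method are assumptions for illustration; only `publish` and the 32-byte hash come from the description above.

```python
import hashlib
from collections import defaultdict


class HashRegistry:
    """Toy model: map from address to an append-only list of 32-byte hashes."""

    def __init__(self):
        self._lists = defaultdict(list)

    def publish(self, sender: str, digest: bytes) -> int:
        """Append `digest` to the sender's own list and return its index."""
        if len(digest) != 32:
            raise ValueError("expected a 32-byte hash")
        self._lists[sender].append(digest)
        return len(self._lists[sender]) - 1

    def history(self, sender: str) -> tuple:
        """Read-only copy of everything `sender` has published, in order."""
        return tuple(self._lists.get(sender, ()))


registry = HashRegistry()
# Example payload: a short-to-long URL mapping, hashed with SHA-256.
record = b"https://example.com/s/abc -> https://example.com/a/long/path"
index = registry.publish("0xAlice", hashlib.sha256(record).digest())
print(index, len(registry.history("0xAlice")))
# → 0 1
```

There is deliberately no method to remove or overwrite an entry; that is the whole point of the append-only list.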


> no impact on climate change

Less impact, surely. Not zero impact.


Basically ENS (Ethereum Name System) but simpler and cheaper.


I don't understand the climate change comment - not yours specifically, it is common. If POW blockchains are genuinely useful (perhaps indispensable) in this way, why does it matter if it "costs" some emissions to keep it running while also providing orders of magnitude more value that would go into... reducing emissions?


> providing orders of magnitude more value

That’s the reason you keep hearing the argument: proof-of-work blockchain technology has not yet demonstrated that it provides “orders of magnitude more value.”


Because the cost is well in excess of the value, and the value won't go into reducing emissions. In effect, it's just people destroying the environment in order to line their own pockets. That's the game we need to change everywhere.


>Too bad about the climate change

The Ethereum merge to Proof of Stake was one of the biggest technological achievements in 2022, how did you miss it?


Ehh, there are legitimate use-cases for a shortening service. One example is SMS messages. We can complain about how antiquated SMS is all day long but it's still one of the best ways to reach people if you don't have an app or have a low install base. In SMS length still matters and so taking:

    https://mydomain.com/path/to/resource?maybe=with&some=query&params=included
down to:

    https://mydomain.com/s/X3s5321
Both makes a difference and hides ugly urls. I agree using a different domain is a danger if you let it lapse but if you use your main domain then I don't see an issue.
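A minimal sketch of how a same-domain scheme like this could work, assuming short codes are just base-62 encodings of a row number. The `Shortener` class, the `/s/` prefix handling and the in-memory list are all illustrative stand-ins; a real service would back this with a database it controls.

```python
import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters


def encode_id(n: int) -> str:
    """Encode a non-negative integer as a base-62 string."""
    if n < 0:
        raise ValueError("id must be non-negative")
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))


def decode_id(code: str) -> int:
    """Inverse of encode_id; raises ValueError on unknown characters."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n


class Shortener:
    """Same-domain short links: the owner keeps both the domain and the table."""

    def __init__(self, base: str = "https://mydomain.com/s/"):
        self.base = base
        self._targets = []  # index in this list == numeric id

    def shorten(self, long_url: str) -> str:
        self._targets.append(long_url)
        return self.base + encode_id(len(self._targets) - 1)

    def resolve(self, short_url: str) -> str:
        if not short_url.startswith(self.base):
            raise ValueError("not one of our short links")
        return self._targets[decode_id(short_url[len(self.base):])]


s = Shortener()
long_url = "https://mydomain.com/path/to/resource?maybe=with&some=query&params=included"
short = s.shorten(long_url)
print(short, s.resolve(short) == long_url)
# → https://mydomain.com/s/0 True
```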


I use SMS heavily and people often send me URLs. I would hate them being shortened ones there just as much as in any other context, for all the same reasons.


If this is your own domain this is not really URL shortening, but an alternative, and, in your example, parallel URI scheme. And you could just replace your ugly URLs with the shorter versions.


I guess it really depends on your definition of URL shortening and I'm happy to admit that maybe mine is the one that's less used.

Shortening 1 = using a shorter/different domain with a shortened path that maps to a longer domain/path

Shortening 2 = using the same domain with a shortened path that maps to a longer path.

If #2 isn't considered "URL Shortening" then I guess I rest my case as I agree that nothing good comes from #1. Of course #2 is still a problem if you have long-lived urls and you don't maintain their integrity. The way we use them is for one-time/short-term login links so it's even less of an issue: no one is bookmarking them/saving them and for sure they aren't in reference papers somewhere.


I think pretty much everyone would call that URL shortening though.


I wouldn't, personally. That's just providing a shorter direct URL. A URL shortener is a lookup service, turning a fake URL into the actual one.


How is providing a shorter URL not URL shortening? Just because it's not necessarily having to do a lookup?

Wikipedia says:

> URL shortening is a technique on the World Wide Web in which a Uniform Resource Locator (URL) may be made substantially shorter and still direct to the required page

That's what it is.


> Just because it's not necessarily having to do a lookup?

Yes. You aren't shortening a URL, you are serving a web page that has a short URL.

I realize this is pedantic, but the relevance is that we're talking about URL shortening services, which this is not using.

I don't mind short URLs much (although I greatly appreciate it if the URL gives me some hint of the content it's pointing at, even if that makes it longer). URL shortening services, however, come with additional problems that lead me to the opinion that they're unacceptable.


Of course anything that's ephemeral doesn't apply, since it's going to quickly become a broken link anyways.


The thing is, the premise of the article (new owner of domain can serve whatever they like to existing links) also applies even if a shortener is not used.

Take this URL from yahoo news that I just picked out at random:

https://www.yahoo.com/news/woman-accidentally-breaks-42-000-...

Let's say tomorrow yahoo sells off "yahoo.com" to the highest bidder. The new owner can reply to queries for that url with whatever they like. The only real difference is the URL above has some meaningful text in the title, so if an article on something very different comes up, there's some indication of something wrong. This is vs. gu.com/ABCD having no indicator of anything amiss if a new owner serves different content.

But the story really boils down to: when a domain is sold to a new owner, that new owner can change the content for any existing URLs for that domain.


I think there is a relevant difference, which is that yahoo.com is basically tied to Yahoo the company in a way that gu.com isn't. You can't sell yahoo.com without selling a major portion of Yahoo, but the vanity URL comes more cheaply.

Companies do break their own URLs, which is bad, but I think using a shortener aggravates the problem.


Redirects are the key to URLs always working. A self-hosted URL shortener is no worse than a self-hosted redirect service. If you fail to maintain ownership of the short domain or you shut down the service, well, how is that different than a same-domain redirect service?


I don’t use URL shorteners, but sometimes a web service will use one for you. Twitter does something weird, where URLs are displayed normally in tweets, but are actually hyperlinks to a shortened address using their t.co domain.


That's an easy way to track outbound links, they just track everyone clicking on their short links, although that could be said for all those services.


Why not use QR codes more? Outside of print media, length doesn't matter: just auto-replace them with the word "link" and the user doesn't have to see the ugly code. Character-limited microblogs can make an exception for links or store them separately as an attachment.

Android should allow OCR and QR input as a keyboard feature.

Link shorteners exist partly because of missing features elsewhere.


I doubt this is significant enough to matter, but shorter URLs reduce the required size of the QR code at a given error correction level, which could make them slightly easier/faster to scan with worse cameras/from further away/etc.
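A back-of-the-envelope sketch of that effect in Python, using the two example URLs from the SMS comment elsewhere in this thread. The capacity numbers are the byte-mode limits for QR versions 1 to 5 at error correction level M as I know them from the standard capacity table; only those five versions are included, so treat this as an illustration rather than a full encoder.

```python
# Byte-mode capacity (in bytes) for QR versions 1-5 at error correction level M.
CAPACITY_M = {1: 14, 2: 26, 3: 42, 4: 62, 5: 84}


def smallest_version(data: str) -> int:
    """Smallest QR version (1-5, level M, byte mode) that fits `data`."""
    size = len(data.encode("utf-8"))
    for version, capacity in sorted(CAPACITY_M.items()):
        if size <= capacity:
            return version
    raise ValueError(f"{size} bytes does not fit in versions 1-5 at level M")


def modules_per_side(version: int) -> int:
    """A version-N QR symbol is (17 + 4*N) modules wide."""
    return 17 + 4 * version


long_url = "https://mydomain.com/path/to/resource?maybe=with&some=query&params=included"
short_url = "https://mydomain.com/s/X3s5321"
for url in (long_url, short_url):
    v = smallest_version(url)
    print(len(url), "bytes -> version", v, "->", modules_per_side(v), "modules per side")
```

Under these assumptions the long URL needs a 37x37 symbol and the short one a 29x29 symbol, so each module can be printed noticeably larger in the same space.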


You can share any QR code with Google Lens and it'll give you the ability to copy the text encoded in it/open the link/etc.

QR codes aren't very information dense, though.


They can be: a single code has something like 4k of "storage", and links are but one of the many use cases.


> QR codes aren't very information dense, though.

Neither are shortened URLs.


Are there any alternative codes which are information dense?


QR codes exclude text-only services.


I've come to think that a more malleable history is seen as a feature, not a problem, by rather a lot of people.


In a different context. News articles and academic papers probably should be permanent in most cases.

My social media feed - personal data in general - probably should not. For that, there are simply too many bad actors on the internet that will go to great lengths to piece together a complete dossier on you, should you ever look like a juicy target. That goes for people that know they have an attack surface and should probably practice good hygiene online, but also for unwilling participants in Kiwifarm-esque witch hunts (read: neurodivergent and queer people). Not to mention what this "osint" can achieve in the hands of more powerful actors - state actors[1] even.

[1]: https://www.huffpost.com/entry/texas-transgender-database-dr...



On the main subject of URL shortening services, I think they work well especially if you set up your own. Although mass-marketed URL shortening services like bitly and tinyurl have also been abused by bad actors I still think they are good services to use both as an end/customer user and as a start/creator(?) user.

On the secondary subject of broken URLs and historical archiving in a sense, I think there is no issue as is. I type this because I think in many cases archiving information on the internet for the sake of strictly keeping historical data and such is a feat in itself that I have yet to see anyone competent and prolific enough to do. If anything, I would advocate for anyone and everyone as an active internet user to partake in their own individual archiving of what is worth keeping. Simply put, it is and continues to be a free for all in that sense. I do commend and am a fan of what archive.org and the archive team (two separate entities) are doing but I also don't rely on them wholly. I have my own small setup. On that subject, I'm curious if anyone else here has?


For the likes of journal articles as mentioned in the article they should indeed use full links, but also download the entire website to include in the supplementary information. Data changes. There's no guarantee even an archive site will retain the original source over a long enough period of time. But the supplementary data will generally survive as long as the web version of the journal article.


I use my own link shortener when including links in images/gifs/videos. Easier to type out, especially on mobile. E.g. m8.fyi/darn and m8.fyi/pok


Complete tangent but I was struck by the typo in this sentence

> Millions of links around the web - including many on the Grauniad itself - are all now broken.

Which made very large positional changes but to just the right combination of letters to produce another plausible-sounding name out of "Guardian".



It’s a common joke:

https://www.macmillandictionary.com/dictionary/british/graun...

Something to do with frequent typos in the paper, I think.


Fascinating, thanks!


I agree. URL shorteners are just bad. It doesn't matter how they're implemented, or by who.


Something weird I noticed: the author keeps misspelling "Guardian" but does it in a different way every time

There's the following: Gaurdian, Grauniad, Guarrdian

What's the joke here? Is he referencing something?


Wouldn't this also be the case if you sold your primary domain? History would potentially be rewritten by the new owner. I think it's just a general problem with link decay.


$2.5 million is not bad given how few two-letter .com there are


My guess is it goes for $15 million or more. Lots of ways to use it in Chinese.


The Guardian changes to the Gaurdian then to the Grauniad and then, finally, the Guarrdian.

Innocent typos, or is there some joke I missed?

Edit: as per other comments this seems to have been a joke.


The Guardian used to have a reputation (deserved or not) for frequent typos. Private Eye (a satirical British magazine which has always taken a critical eye on the UK press) adopted the practice of always referring to The Guardian as The Grauniad as one of its many in-jokes, stylistic quirks, and running gags (see Eyes passim, ed.). As a result the nickname The Grauniad is a common epithet for The Guardian even if they no longer have any such reputation for typographical or copywriting errors.


Here highlighted in a nice letter to the editor:

https://twitter.com/arusbridger/status/1399300205925834753


The Grauniad is an old Private Eye joke from when the Guardian had legendarily poor quality control and typos, and it's kind of stuck.


On a tangent, I am loving how the author is using Mastodon as a commenting service.


Links in general are bad. We need to switch to using content based addresses. Something like IPFS (https://en.m.wikipedia.org/wiki/Content-addressable_storage)
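For readers who haven't seen content addressing before, a minimal sketch in Python of the core idea: the address is derived from the bytes themselves, so whoever serves the content cannot silently swap it. This is a simplification with a made-up `ContentStore` class; IPFS itself uses its own content identifier format and a distributed network, not a bare hex SHA-256 and a local dict.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key is the SHA-256 of the data."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Store `data` and return its address (hex SHA-256 digest)."""
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        """Fetch by address and verify the bytes still match the address."""
        data = self._blobs[key]
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
address = store.put(b"the article text as originally published")
print(store.get(address) == b"the article text as originally published")
# → True
```

The flip side is that any edit, even a typo fix, produces a different address, which is why mutable pointers still need some naming layer on top.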


It would be nice, but my IPFS experiments haven't left me very hopeful of using them for any practical purposes. People don't want to wait 2 minutes for content to load and you need a lot of visitors for the network to saturate with your content.

The only advancements/alternatives I've seen in this space are cryptocurrency-based, which is even worse in my opinion.


I doubt that will ever happen even though it does look like a neat concept/solution.


What we are currently doing by using links is essentially saying “go to the library, 4th floor, 10th shelf, 5th book from the left” when sending a link. What we should be doing is saying “go to the library and find the Lord of the Rings book”


You're describing an index or a search machine. If every resource locator were content based, you'd have to wade through hundreds of pages of SEO spam and vaguely relevant content before finding the one you were looking for, because your search terms would not match the (usually poor) description from the creator.


I know and I understand the sentiment but there’s 30 years worth of infrastructure and content you have to deal with and I don’t see how you can work around that.


The early design of the web basically set it up as hyperlinked microfiche. What you're looking for was done later as torrents.


For the TL;DR - People can do bad things with tools.

I don't think any of the examples in either this article or the companion piece it references (why URL shortening is bad for users and bad for the web) are compelling reasons to avoid using a URL shortening service, from a user perspective.



