
What's particularly insidious about a lot of these link shorteners is the use of non-semantic redirects. That is, redirects which are not based on HTTP Location: headers but things like meta http-equiv="Refresh". I assume this is done to allow these pages to be loaded with tracking scripts.

Of course this would be a completely broken way to implement a link shortener on its own, since it wouldn't work with non-browser tools such as curl. And sure enough, when I tried a t.co URL with curl it returned a Location: header, which means they're doing user agent sniffing. If you need user agent sniffing to make something work, that's generally a good sign you shouldn't be doing that thing.
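
For reference, with curl's default user agent you just get an ordinary redirect back. Roughly like this (output abbreviated and from memory, so the exact status line may differ; the short link is the one from the reply below):

  $ curl -sI https://t.co/88MpPkUoJg
  HTTP/2 301
  location: https://bbc.in/2yDY0F5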




You are correct:

$ curl -A "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.0 Safari/605.1.15" https://t.co/88MpPkUoJg

  <head><noscript><META http-equiv="refresh" content="0;URL=https://bbc.in/2yDY0F5"></noscript><title>https://bbc.in/2yDY0F5</title></head><script>window.opener = null; location.replace("https:\/\/bbc.in\/2yDY0F5")</script>
I had no idea they were doing it that way. How gross.


I assume it’s to remove the t.co page from the browser history, which of course is not relevant or useful for curl. There’s nothing in that response that looks malicious.


They already return different results based on the user agent header; they could easily be returning different results based on other HTTP headers, IP headers, etc.

Arguments that implicitly assume everyone receives the same data from a server are frighteningly common. This is extra strange when it happens on forums like HN that also regularly assume the same server might be A/B testing, or providing "targeted" advertising or prices that are unique for most users.

Any discussion about data from an unknown server should always include some sort of checksum. Without verification that everyone is receiving the same data, statements about a server's responses don't mean much.
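
As a sketch of what that could look like, each person posts the digest of the bytes they actually received and compares (the short link is the one used elsewhere in the thread, "Mozilla/5.0 ..." stands in for a full browser user-agent string, and on macOS the command is shasum -a 256):

  $ curl -s -A "Mozilla/5.0 ..." https://t.co/88MpPkUoJg | sha256sum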


Couldn't any site be sending different results based on any header? I guess I don't get how "they could easily be returning different results based on other HTTP headers, IP headers, etc" doesn't apply to literally every site.


As others have pointed out, the same thing can be accomplished with an HTTP redirect. The only purpose this kind of intermediate page serves is to hide the original HTTP Referer field and make the request look like it came from t.co. This ensures that only Twitter knows which tweet someone was coming from.


Of course there is also a standards-compliant way to do that: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Re...
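
i.e. something along these lines, either as a response header or a per-document meta tag (no-referrer is just one possible value; pick whatever policy you actually want):

  Referrer-Policy: no-referrer
  <meta name="referrer" content="no-referrer">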


That doesn't work for part of Twitter's audience, namely Edge users.


A normal HTTP redirect would accomplish the same thing.


If you enable '-v' you can see they set a cookie:

set-cookie: muc=4673c8f0-5aef-45eb-8e4b-ab06bc59944c; Expires=Wed, 14 Oct 2020 10:10:19 GMT; Domain=t.co


location.replace removes the back button interaction, i.e. history.

I have always preferred to use it (location.replace) within the same site. It also allows better control over browser cache policies.

Although it has been almost a decade since, I doubt much has changed.
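
To illustrate the difference (placeholder URL):

  location.assign("https://example.com/next");   // pushes a new history entry; Back returns to this page
  location.replace("https://example.com/next");  // swaps out the current entry; Back skips this page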


> location.replace removes the back button interaction, i.e. history.

A 301/302 redirect works just fine for this.


Yup. The RFC is literally full of phrases like "status code indicates that the target resource has been assigned a new permanent URI and any future references to this resource ought to use one of the enclosed URIs. Clients with link-editing capabilities ought to automatically re-link references to the effective request URI to one or more of the new references sent by the server, where possible." (emphasis mine).


What bothers me, as someone who works with web standards, is that these URL tracking services should have been rejected outright: they're only used for tracking and are completely unnecessary.

Additionally, they're not actually part of web technology due to Twitter's ToS...

I run a web crawling company (http://www.datastreamer.io/) and we license data to other companies based on what we crawl.

This really opens up some weird situations for us...

If a URL is copied and shared OUTSIDE of Twitter but still behind a t.co URL, you can't access it without agreeing to their ToS, even though the link might be to the nytimes or some other service.

I was initially upset about the GDPR but I'm starting to see the light of day here.

You can't have your cake and eat it too. You can't both be on the Internet but then put up an insane ToS claiming you have rights that restrain Internet users.

It's like standing on the street corner and yelling and then saying everyone around you owes you royalties because they're hearing your copyrighted speech.


They might be unnecessary in this case, but not always. For example, I used to work with e-learning materials that linked out to other materials, which might change or live on services not under our control. Being able to manage the link endpoint without having to republish the materials is a big win for time/effort, and sometimes republishing just isn't possible.


As an end-user of similar types of materials: no, I will emphatically state that using a link shortener makes the material worse. If you don't update the link and it points to a broken page, at least the URL normally has enough information for me to Google the underlying material. A shortened link loses all of that context.


> You can't have your cake and eat it too. You can't both be on the Internet but then put up an insane ToS claiming you have rights that restrain Internet users.

Can you explain that further? You pretty much have to have a ToS for any moderately sized website if you don't want to get sued to death. The WWW is not a complete free-for-all.


I just wish these awful link shorteners/trackers were faster. On lower-end network connections you have to sit there staring at "waiting for t.co" for two or three seconds before you actually get the link you want.


In my experience it's the target that takes the most time to load; the shortener itself is usually quick.


I believe that some of the weirder redirect methods are aimed at preventing the browser from forwarding “Referrer” headers to the destination site.


That can be achieved with a much simpler <a rel="noreferrer" ...>


Not for IE 11 on Windows versions below 10, or for IE versions below 11 on any OS. Source: https://caniuse.com/#feat=rel-noreferrer


This works with JS disabled:

https://caniuse.com/rel-noreferrer


I think the http-equiv="Refresh" redirect is done so that the http referer header is from t.co, and not twitter.com (or wherever the user clicked the link from).

(I don't think rel='noreferrer' is fully supported by all browsers)


I don't understand...

I used wget to fetch both a t.co link and the original link (from the sibling comment), and diff showed no differences between the fetched pages.

--edit--

So HN is not a discussion site then?


They detect curl (and wget) and serve up a "real" redirect. You need to spoof a real browser user agent, like I did in my comment above.


I remember reading a HN thread a few years ago where this was suggested as the cheapest way to create short links. Instead of running a server with routes on it, you just generate one static page with this meta tag for each link, and then it's always there. Could it be that Twitter folks were simply trying to be efficient?
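
i.e. one tiny static file per short code, something like this (hypothetical shortener domain and destination):

  <!-- served at e.g. https://short.example/Ab3xYz -->
  <meta http-equiv="refresh" content="0;URL=https://example.com/destination">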


Um, they have to run a lookup to find the 'value' for the given 'key' regardless...? I cannot think of any positive value for the user here -- it's non-standard & slower. 3XX redirects have been around a long time and basically every single client out there knows how to use them, and those that don't can look the status codes up to see how they should handle them if they want to.

AFAICT this is purely to allow for the 'pseudo injection' of the third-party JS, presumably for tracking purposes...

Only question I'd have is why they can't read the cookie server-side instead, but I'm guessing there are cookies on other domains that their JS is looking for? Haven't done web stuff in a few years so I'm behind on CORS-ish pros/cons/knowledge.


> I'm guessing there are cookies on other domains that their JS is looking for?

Nah, the browser doesn't let you do that. This SO answer suggests it's to pass the Referrer header so that the destination site knows the user came from Twitter:

https://softwareengineering.stackexchange.com/a/343667


> Could it be that Twitter folks were simply trying to be efficient?

That wouldn't explain the User-Agent sniffing (curl gets a proper HTTP redirect).



