How MTA shut down my app for Penn Station commuters (medium.com/alexkharlamov)
168 points by zavulon on July 12, 2017 | 118 comments



I can see why the MTA wouldn't want you telling people to go to a track before it is actually assigned -- if the train ends up on a different track, now you need to get all those people back up, off that platform and on to the new one, clogging stairs that people who actually wanted that platform might be trying to use too. Obviously it would be nice to assign tracks earlier, so that people can head straight to the right platform, but sending people to potentially the wrong platform seems even worse.

EDIT: Penn is extremely platform/track constrained -- NJ Transit, LIRR and Amtrak are all sharing a fixed number of platforms, some of which are too short.

To maximize platform utilization, they have to wait until the last minute to finalize track assignments -- if you reserve one too early and the train ends up late, you're wasting an empty platform. Once you send a horde of people to a platform, moving them to a different one is a challenge (stairs/bottlenecks, communication, etc).


It's more serious than that even, it's a safety issue. The platforms are fairly narrow and have no railings, and they aren't designed to hold two full train loads of people at the same time.

If a full train pulls up and the platform is already full people are going to end up on the tracks. I haven't seen this at Penn but I have seen the issue happen in the subway when there are train delays. When this happens the MTA has to hold the arriving train in the station with the doors closed and clear the platform before opening the doors to let people off.

It's a nightmare. Their concerns here seem completely reasonable.


> The platforms are fairly narrow and have no railings, and they aren't designed to hold two full train loads of people at the same time.

It seems pretty unlikely that 100% of the people on the outbound train would use the app and trust the historical data to go to the platform early.


It doesn't have to be 100% - a significant number of waiting people will still present a safety issue.

The platforms are very narrow for the LIRR tracks, so holding 200% of a train (outbound and inbound) would be sheer pandemonium, and would practically guarantee someone falls off the platform. Even holding 100% of a train (just inbound, disembarking passengers) is already straining the platforms and methods of egress to their limits.

Holding even just 110-120% of a train load is very much a safety issue. I'm with the MTA on this one - the "mad dash" is horrifyingly inefficient, but is the safest course of action.

Of course, the correct fix to this is to fix the platforms such that passengers can wait at track-level without safety issue. But, of course, that's a multi-billion dollar problem nobody seems willing to touch.


People in that station actually talk to each other, so a few % of app users in the crowd might result in a large number of people knowing.

I'd hate to imagine what would happen during peak if a train came in on an unexpected track AND a different train on the expected track - it is not unusual for multiple trains to be announced at once. Then you'd have a pile of people trying to go up those narrow stairs while a pile of people were trying to go down.


Once people see others walking towards a platform (particularly if they just checked something on their phones), they tend to do the same.


But there are lots of trains leaving at any given time going to different places. There's no way to know which train somebody is heading to without asking them.


There aren't that many leaving simultaneously - plus they tend to be slotted into the same group of platforms.


Yes there are. That's the whole problem. It's complete chaos. There are several hundred people huddled in one area looking at one giant screen and there are several trains being called in sequence. How would you know that a given person was waiting for the 4:30pm Acela to Union Station to be called and not the 4:33pm Northeast Regional to Boston?

As somebody who commuted for a long time at peak rush hour back and forth from Penn Station to D.C., I can tell you that following people who "seem to know where they're going" would not be a reliable strategy.


There are dozens of platforms going to different places. Picking a random stranger and following them isn't going to get you where you're going.


So what you mean to say, with "safety issue", is that government-provided infrastructure is so badly designed and ill-equipped to be used at the necessary scale that attempts to improve matters must be stopped!


All infrastructure is a matter of trade-offs. Sure, they could have built the platforms 3 times wider, with funnels leading to the doors, and gates that open/close to prevent people from falling onto the tracks. But that would be an extreme increase in cost to build, and wouldn't lead to a significantly larger number of people able to take the train. This app doesn't improve matters: it causes an additional "tragedy of the commons" situation. By being the first one on the platform, you may get to claim your seat, at the common cost of making it harder for everyone else to get out of/on to the train.

Yes, at peak times there is congestion (just like at peak times there's highway congestion), and the existing infrastructure (built mostly before 1930) was built to be significantly safer and more efficient than existing systems. And 90 years later, it still is usually the most efficient and safest mode of transport in NYC. But given the record number of subway riders and record age of the infrastructure, it takes everyone pitching in a little to not make the situation intolerable. Part of that "pitching in" is "Don't go onto the platform before the train's riders have left".


Except that (correct me if I'm wrong) the platform is displayed 10 minutes before the train arrives, so there are already people waiting when the train arrives.


It is displayed 10 minutes before the train leaves. If a train is being turned over it will have already arrived and unloaded at that point.


In my experience after commuting for a few months, this is never the case. Even when the track listing is displayed 10 minutes prior to departure time, the train usually won't arrive for another few minutes. You get crowds of people where the doors will be. It's really not safe, although I've never seen someone fall so it can't be that bad.


The app isn't really an improvement if it has mass adoption, you'd just move the rush for seats from the station hall to the track platform.


It's midtown, in New York City. Infrastructure is space-constrained, a problem we've had since soon after the Dutch left. Shouldn't be a shock to anyone.


This app is a classic tragedy of the commons. If everyone used the app, it would be worse for everyone.


Yeah, it seems obvious that they purposely don't assign the track until the last minute for a good reason, so of course they'd want to shut down any software to try and circumvent that system.


Have you ever taken a rush hour train at Penn Station? The alternative (i.e. current system) is everyone congregates on the concourse staring at the platform screens, and when a new one pops up hundreds of people flood in the direction of that platform in a mad dash. It's a mess.


It is exactly this alternative (the current system) that Amtrak/MTA/LIRR prefer for safety reasons. Amtrak, which also has the same procedure at Penn, states this here:

http://www.slate.com/blogs/moneybox/2013/07/17/amtrak_s_unpe...

Although it is indeed inefficient and the MTA is completely inept, the platforms for LIRR trains are very narrow, they are shared by two trains and the escalators to get up to the main floor are also very narrow and crowded already.

So although the "mad dash" is in fact a mess, it's a safer mess than overcrowding the platforms on the tracks.


I wouldn't call the MTA inept. I'd say what's inept is the fact that the United States as a whole doesn't invest a substantial amount of money in improving the transit systems of the nation's most important metropolitan area.

I certainly don't support cutting pensions or wages, which means money needs to come in, and Albany consistently constrains the MTA's budget requests and needs. There's probably waste but this alternative is ridiculous and Cuomo's billion is a drop in the bucket. We should probably be realistic as a nation and invest a trillion dollars in the system.


I think this is also dehumanizing.


Mass transit is dehumanizing. Cities are dehumanizing. A significant amount of the way that giant infrastructure works is making assumptions and rote; Turning the human problem of "how do I get here" into a mechanical process of "There are two ways to do so, X and Y, each of which follow these steps".

This is great for efficiency - that's why it's done. It's terrible for humanity. It's stagnant, it kills creativity, it leads to these incredible tragedies of the commons when so much relies upon the commons.


> It's terrible for humanity. It's stagnant, it kills creativity

Your other points may be valid but I'm not so sure about this one. Artists tend to cluster in big cities, often with mass transit (and sometimes without, like LA). I sincerely doubt there's any sort of correlation between mass transit and creativity.


Mediocre artists cluster. Good artists hermit. Great artists do both - But don't maintain city apartments.


Source?


I agree with your points. I am curious about your use of the phrase "tragedy of the commons" though, specifically how it relates to mass transit.

I would think the "tragedy of the commons" would be people who elect to drive a car during rush hour rather than taking a similarly viable mass transit option if one exists.

Or are you using the term just in the context of overpopulation? Thanks.


I'm really not as convinced that mass transit is a good deal. Certainly, there are some places where it is, and some huge benefits. Less time/money/space spent on parking, for example. But the oft-cited statistics about energy-use-per-passenger-mile, while true, are far from the whole story - The human cost in additional time spent commuting and inflexibility of schedule are very real. It takes me twice as long to get to work by transit as by car. It's just barely on the cusp of not worth it - and significantly because SF lacks in parking. It means that I've got to time my comings and goings carefully - I used to work 40 hour weeks as 2x12-14 and then short days the rest, but I can't anymore, because if I stick around till midnight coding I can't get home. If you look at it purely from a "how much does commute cost" perspective, sure, mass transit looks cheap. But it costs humans real time to do. Time is money, and the hour a day I spend in additional commute time, trapped in a small box with smelly homeless dudes, is really not worth my time. It's only because of 'opportunity cost' that I bother with it - I couldn't spend that time making money, and its value to me in additional entertainment is significantly discounted from my normal rate only because... well, I think I've just talked myself into quitting and getting a job literally anywhere else, because I'm utterly sick of the commute. And at 45 minutes each way, it's really not that bad!

To a certain extent, because it's so cheap to aggregate, we lose out on opportunities to differentiate. Rather than spreading out so we work close to home, we build giant thoroughfares that make for clear divisors - And then do insane things on top of that. Look at the commute times in and out of SF - the traffic is really bad flowing both into and out of the city, because lots of peninsula-working persons want to live in the city and lots of peninsula-living persons want to work in the city. (I say this with great hypocrisy, living on the peninsula and working in the city, but at least I'm in Millbrae and not Sunnyvale).

The same thing has happened in manufacturing - It's so cheap to produce mass-market products, and they do enough in most cases, that it's basically impossible to find semi-niche products in many areas. You can buy cheap Chinese goods or pay 20x the cost for high quality American made, but there's no mid-market anymore. Every so often a product comes around that is mid-market, but when people flock to it, it inevitably goes down in quality. My example is Lands End jeans - In the 90s, they were great. Then they got popular, got bought out, and are now just another branding for cheap goods.

When working from home becomes more acceptable, cities are going to look like things of the past - Full of only collectivists who can't live without someone to praise them at every corner, and the poor and downtrodden who've gotten stuck in the ghettos. This is, of course, cyclical, and been given several names - "White Flight" being the current pejorative for one of the major cycles. But I'm looking forward to being a Solarian[1]; VR for interaction, Automated cars when I need to be somewhere or to transport goods, and high-bandwidth network links for everything else. I don't see anyone else arguing for lining fiber everywhere we run power to, but when I do, I will gladly vote for them.

[1]https://en.wikipedia.org/wiki/Solaria#Isolationists


I see, so the overcrowded cities (and by extension the mass transit) are the "commons" from which the individual is trying to reap the biggest reward. Thanks.


Yet, at the same time, people living in cities with mass transit vs suburban sprawl (admittedly still not an 'ideal') prefer the cities for the increased sense of community, higher sense of dynamism, and more 'creative' things going on than what they might get in their non-mass-transit suburban counterparts.


That's definitely debatable; I'd guess that many people globally live in cities because that's where the work is, not for any of the other aspects you mentioned.


I mean, I don't think your point is wrong but I would hardly call cities, at least good ones, terrible for humanity or killing creativity.


Following some rules is dehumanizing?


Being treated like cattle shuttled down a narrow pathway. Being forced to wade into a compressed crowd of people who apparently forget all manners and will push and shove you.

It's basically one of, if not the worst parts of every day. And there is nothing you can do to avoid it.


The alternative is having people actually end up squeezed in front of a killing machine (train), which I would think is more like "being treated like cattle" in the part that I care about than "I have to walk down a small hallway".

Also, we treat astronauts like cattle. Squeezed into small areas, down tiny hallways. The Hugh-manatee.


There's always Port Authority!


The fact that they don't have more platforms and more trains is dehumanizing because it causes overcrowding.


This mad dash is a great secondary tourist must see.

The island's population doubles (1) each workday.

This view of everyday life for many is worthwhile to observe.

As well as the "don't talk on the train" rule - but you have to ride at commuter times

(1) nyt 6/3/13 commuters from the other boroughs and outside the city nearly double Manhattan’s population, from 1.6 million to 3.1 million.


This is absurd. After commuting for less than a year I knew which track to go to for any of the trains I ended up on. And more importantly where to stand on the platform to be at a door when the train pulled in.

The idea that somehow it's unknown or occult knowledge as to which track to go to is silly. When they announce the track for the train over the PA, they sometimes say 'This is a track change', and the few hundred people who are on the platform already have to trample back up to the concourse and then back down to the new track.


You must be using the LIRR. Things are not as consistent on the NJTransit side of the house. I know with about 90% certainty which of two platforms to go to for my train. Sometimes, like Monday, they throw us a curve ball and board us on the other side of the concourse. Occasionally we even board on track 13 in the LIRR area.


I think the idea is that having an app encourages more people to do it, not that it can't be done without an app.


I've only done the much less hellish Grand Central version, and I think Metro North publishes track assignments much earlier.

That said though, just imagine how much worse it would be if that mass rush was between platforms, fighting opposing traffic, instead of just from a (mostly) open waiting area.


>"I've only done the much less hellish Grand Central version, and I think Metro North publishes track assignments much earlier."

There is so much more space to wait for track assignments, as well as space on the grand stairwell to get down to the track, in Grand Central (even with tourists) that I don't think they are comparable at all.

If the Metro North does publish assignments much earlier it's likely because it's much safer to do so. Although I don't think I've ever seen people waiting on the platforms for Metro North trains in Grand Central.


Yeah, that's fair. If I have to take LIRR, I'll generally use Atlantic and switch at Jamaica if needed -- anything to avoid Penn.


If you're trying to get a seat on a train that originates at Penn, changing at Jamaica doesn't help, however.


I don't use the LIRR tracks often, but many of the Amtrak trains arrive at the same track or 1-2 tracks. Veteran travelers know where to go, casual travelers are stuck in this maze that is very difficult to navigate.


Thing is though, they could be providing that data MUCH earlier both on the displays and to a phone app. It's not like they don't decide where a train is going until it gets to the station, that would be ridiculous, they'd have to know a while in advance to make sure the track is available, clear, operating correctly, and do all the necessary switching (and more importantly, do so in such a way that doesn't impact other trains).

MTA's appeared a number of times on this website and by all accounts it's where tech innovation goes to die, most of their service is still based on switch wire systems that were built in the 60's. Whether that's due to bureaucratic inertia, inadequate funding, or unions or whatever I have no idea, but that's the real problem.


Based on the other comments here, I'm guessing they don't post train numbers until the arriving train has disembarked so its passengers can clear the platform before the horde arrives. It probably has nothing to do with technology.


Given the other comments on this issue, it's not so much that MTA hates innovation, but that those who would be "disruptors" don't actually understand the entirety of the problem.


For about a year a decade and a half ago, I commuted into and out of Penn Station with my father who had been doing so since the late 70s.

Not only did he know which track any of a half dozen trains that he might take would come in on but he knew where the doors would open for each of the tracks. And he wasn't the only one either. If you went down before they announced the tracks you'd see little clusters of people waiting apart from each other on an otherwise empty track. We'd usually go to the same car in order to reduce the distance on the other side. Other commuters would do likewise and so trains would have a contingent of regulars.

There's a fascinating kind of micro-expertise that develops when you do the same thing over and over again.


same thing on the subway, with an extra dimension of starting on the local, switching to the express, then transferring at another station. Pretty quickly you get to know where to stand to get on the most efficient end of the train to be closest to the stairway up to the transfer platform, etc. Makes an otherwise tedious trip somewhat interesting as a puzzle.


And on some systems, one can usually tell if an inbound train is running faster/slower than normal and where to adjust position accordingly to end up at the right door/position/etc..


The London version is now an app. I think it used to be available as some kind of map.

http://www.tubeexits.co.uk/


It's also a feature in Citymapper. Only front/middle/back but the integration means I use the information more often

https://citymapper.com/news/726/where-to-get-on-the-train


I mean, everyone I know that takes the LIRR lines up where I do knows where the train (usually) stops. When it's a short train is when we panic...


As an interesting, although unrelated aside: platform data is available in the UK's National Rail's feeds, but the terms and conditions[1] explicitly prohibit displaying platform numbers early, as mentioned in their developer guidelines[2].

I guess the wording doesn't technically ban you from displaying historical platform information, but that would likely be a bad-faith use of the data anyway...

[1] http://www.nationalrail.co.uk/static/documents/Terms_and_Con...

[2] http://www.nationalrail.co.uk/static/documents/Developer_Gui... "Occasionally Time-Bound Data will become available through NRE feeds before it is ready to be published to the public. [...] One such example of Time-Bound Data is platform numbers. Early display of platform numbers, particularly at origin and destination stations, can lead to platform overcrowding and/or staff not having sufficient time to prepare the train for oncoming passengers. In some instances, platform numbers will be available in Darwin before being displayed on screens in stations."


UK rail data is quite open in comparison to this story, for instance check out these live track diagrams (of London Waterloo & many other areas) which show real-time movements of trains. This is how I often find out the platform numbers prior to travel and I also get information before staff during disruption.

http://www.opentraintimes.com/maps/signalling/wat#T_WAT


There are maps of many more routes at traksy https://traksy.uk/live/M+2+CARLILE


Great to see another good implementation of rail maps. I do however prefer the opentraintimes layout, as it's easier to scroll through while reflecting the look of operational systems.


Once upon a time, the platform used to be displayed in the National Rail app, even when it wasn't shown on the departure boards…For a long time I used that to good effect to get a seat on rush-hour trains at Paddington.

That changed a few years ago, interesting to see the formal policy behind it, and an idea for a side-project :-)


Would have been interesting to see if you could reliably crowd-source the data from commuters themselves.


Was going to post this - have users submit the track data! That also might make it easier to expand to other stations where you aren't as familiar with the track layout.


Google almost certainly already could, without even asking the commuters, since they have real time location for so many Android phones. But there's a chicken and egg problem: it's hard to get the data before you're able to offer anything in return.


Chicken and egg yes, but he had the recent historical data and a bunch of paid-up users.

I'm just amazed this app was even necessary in a first-world transport system. For all their faults, the UK rail network is pretty good at providing timely information (and as some other comments have mentioned have solid APIs for this stuff). Same with the LTA[1] in Singapore.

[1] https://www.mytransport.sg/content/mytransport/home/dataMall...


Thank you! I definitely considered doing something like that, however — the amount of work required to crowd source the data would be significant. There would be a “chicken & egg” problem — the app wouldn’t be useful to first users, and without the first users, there wouldn’t be enough data to make it useful for everyone else — so this would require a fairly large marketing campaign to get off the ground. But — one day I may still do it :)


There's hardly a chicken & egg problem if you already had a bunch of happy users. Just display "no data available" on the first day, and have a banner in the app explaining the situation, and ask them to contribute today's data. You'll most certainly get a handful of users who want to help. After 3 or so days of this, the problem will be solved.


This is something that Clever Commute does and it works really well!


For Penn Station in particular, one can beat the crowd by heading downstairs and looking at the small CRT under the stairs. You will see the platform assignment at the same time as the crowd, but you are closer to the platforms and don't have to use the same stairs as everyone else.


There is a similar issue with the MBTA commuter rail in Boston. At South Station, the signage in the main waiting area and at the platform only announces the train 10 minutes before departure.

However, there are a couple of tools to get around this. For starters, each revenue service train has a number (like 508). In a push-pull system like the MBTA the first coach in the consist is nearest to the platform. This coach has a number like 1827. Luckily, the MBTA publishes which trainsets will be assigned to which departures. This makes it easy to know that 1827 is for train 508. All you need to do is walk out to the platform and see if 1827 is sitting there. If so, it's your train. You can get this mapping of trainset to train via a bunch of apps.

This has been further expanded upon by micro-social apps like "MBTA Rail Tracker" which has a comment section for every train. The whole thread is basically "which platform" followed by a bunch of responses and then snarkiness on why the trains are horribly late all the time.


I've been collecting Penn Station NJT and LIRR track assignment data for almost a year now. I've been intending to use ML to predict track assignment, but haven't gotten there yet. I should post it to Kaggle or somewhere similar.


Do you think the track assignment is something that's learnable? I feel like there is some schedule where, under normal operations, the train will end up. But then there are unpredictable events that cause this to change.


I think it is. Like in the article, you can see that given certain parameters like train number and time, you predictably end up with a certain track assignment. If you can also consider track assignments leading up to this time, you can discount those tracks. For example, the 8:02PM Long Beach train always leaves on track 18, except when any 7:56 train left on track 18. As a novice to ML it's currently beyond my ability to model, but my intuition is that it should be doable.
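
For what it's worth, the simplest version of this doesn't need ML at all. Below is a rough sketch, not anything the app or the MTA actually does: count how often each train has used each track, then pick the most common track that an earlier departure hasn't already taken. The class name, the train label and the sample history are all made up for illustration.

```python
from collections import Counter, defaultdict


class TrackPredictor:
    """Frequency-based guess at a train's track (hypothetical helper)."""

    def __init__(self):
        # train_id -> Counter of tracks that train has been observed on
        self.history = defaultdict(Counter)

    def record(self, train_id, track):
        """Store one observed (train, track) assignment."""
        self.history[train_id][track] += 1

    def predict(self, train_id, occupied=()):
        """Return (track, share_of_past_runs), skipping tracks in `occupied`.

        Returns None when there is no usable history for this train.
        """
        counts = self.history.get(train_id)
        if not counts:
            return None
        candidates = {t: n for t, n in counts.items() if t not in occupied}
        if not candidates:
            return None
        total = sum(counts.values())
        # Iterate in sorted order so ties resolve to the lowest track number.
        best = max(sorted(candidates), key=candidates.get)
        return best, candidates[best] / total


# Made-up history: the train used track 18 nine times and track 19 once.
predictor = TrackPredictor()
for _ in range(9):
    predictor.record("8:02 Long Beach", 18)
predictor.record("8:02 Long Beach", 19)

usual = predictor.predict("8:02 Long Beach")
fallback = predictor.predict("8:02 Long Beach", occupied={18})
```

Here `occupied` stands in for "a 7:56 train already left on track 18". A real model would also have to handle delays and track changes, which this ignores entirely.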


Interestingly enough, I have had this same exact idea for an app, also being one of those commuters who rely on LIRR trains, also being someone who noticed that my peak train home arrives almost always on the same track, and also being one of the (few) commuters who go down to the platform before the train is called.

It is a shame that this had to happen to OP. I personally don't see the app as an issue, because the amount of commuters that would actually use such a thing is rather low (seriously, stand in the concourse and look around at people, a large amount of them are not using their phones).

My original idea was to scrape the data from their webpage, or see if there was a way to get the data from the Train Time app's Arrival Countdown page, but according to the post, it has been removed from their website, so there goes that idea.

The lesson to me is clear though: don't try to make an app that would make commuters' lives easier. The MTA does not seem to want that, especially during the track repairs. I find it interesting that it was shut down so close to the start of the Penn Station track repairs...


Why didn't he switch to a crowd-sourced data model like GasBuddy?


Was thinking the exact same thing. One possible reason would be wrong info, but you could use some sort of weighted algorithm that would need more than one submission to match and/or trust level per submitter based on historical correctness.
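
A minimal sketch of what I mean, with everything here (function names, thresholds, the 0.5 starting trust) being my own assumption rather than how any existing app works: each report is weighted by the submitter's trust score, a track is only shown once enough distinct users agree, and trust is nudged up or down once the real track is known.

```python
from collections import defaultdict


def consensus_track(reports, trust, min_reporters=2, min_share=0.6,
                    default_trust=0.5):
    """Pick a track from crowd reports, or None if agreement is too weak.

    reports: list of (user, track) in arrival order.
    trust:   dict mapping user -> weight in [0, 1].
    """
    latest = {}
    for user, track in reports:
        latest[user] = track  # only each user's most recent report counts

    weight = defaultdict(float)
    reporters = defaultdict(int)
    for user, track in latest.items():
        weight[track] += trust.get(user, default_trust)
        reporters[track] += 1

    total = sum(weight.values())
    if total == 0:
        return None
    best = max(sorted(weight), key=weight.get)
    if reporters[best] < min_reporters or weight[best] / total < min_share:
        return None
    return best


def update_trust(trust, user, was_correct, rate=0.2, default_trust=0.5):
    """Move a user's trust toward 1 after a correct report, toward 0 otherwise."""
    old = trust.get(user, default_trust)
    target = 1.0 if was_correct else 0.0
    trust[user] = old + rate * (target - old)
    return trust[user]


# Hypothetical users: two reliable reporters outvote one unreliable one.
trust = {"alice": 0.9, "bob": 0.8, "mallory": 0.1}
agreed = consensus_track([("alice", 18), ("bob", 18), ("mallory", 21)], trust)
```

A single report never publishes a track on its own, which is the "more than one submission to match" part. The thresholds would need tuning against real data.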


Exactly. Like what Yelp / Foursquare / Google Places use.


See my comment above - it would require a significant effort to get off the ground, as the app without data would not be useful to first users, and without the data provided by those first users, it wouldn't be useful to others.


They already had an app and users. They could have extended it with this functionality.


How do you get people to input the data in time for it to be useful?


Build a reward system… reputation might be enough?


I wonder if there's any correlation, or any way to know where a train is headed based on where it's arriving from. If that were the case, you could potentially either look for that data, or if it isn't available, have a Waze-like app where commuters can report which train they are on and which track they are pulling into.

As an aside, I wonder whether the OP considered charging a hefty price (>$20 or so) for the app. This would lessen the number of users and page requests.

Having travelled from Penn many times and having had to deal with the massive stampede that ensues when they suddenly reveal the track number, I would pay dearly to have this info available to me. Even if I used it once every year, not having to fight crowds or look forever to find an available seat, that would be money well spent. Additionally, many of the peak travelers are business travelers who probably have more money to spend on an app. I realize this isn't the most democratic solution, but it could be a way to lessen the crowds nonetheless.


It's a great explanation of the need for this app, and how zavulon fixed a problem.

But organizations change how they work with customers/the public all the time, and I couldn't help but say Waaah.

Things change; tell us about the alternatives you tried to move around this obstacle.


OK, I understand that the railway authority doesn't want people to guess / predict platform numbers for various reasons (problems with platform maintenance, safety, last-minute changes, etc.).

But what I don't understand is why they can't change the seat allocation process so that people can reserve remaining available seats on the trains for free once they have arrived in the station lobby. This could reduce the rush to grab seats and help people make a more informed decision about whether to wait for the next train.


it takes longer to load a train with assigned seating -- that's why say, Amtrak might do it, but commuter rail trains, that at peak run minutes apart and want to minimize dwell-times at highly contended platforms, optimize for loading speed.


Crowdsourcing the track data based on where commuters' phones travel, something like waze for LIRR should help.

Instead of asking users to report the track info, it can be automated to a great extent with the use of BTLE beacons on each platform (but that again needs permission from MTA / Penn station authorities unless a long lasting BT 4.0 beacon can be sneaked in somewhere).

Sad that MTA is not cooperating. Can understand why the app had to be shut down!


Why don't you sue them? It's a government agency, and I believe that the data should be available under New York's public records laws (I only have limited experience with NY's laws and IANAL). From personal experience, being nice to a government agency never got me anywhere, but our lawyer has a 95% win rate against them.


There is no law that NY government information has to be available through a realtime API.


Why would you want to know the train more than 10 minutes before departure? Surely you'd just get on the immediate next train?

Are tickets only for a specific time of train?

Or are trains less frequent than every 10 minutes?


The next train for your destination may be 30 minutes later, or over an hour away. LIRR has about 125 stations in 8 zones. It's not called Long Island because it's small and round.


www.njtwizard.com

We do this for NJ Transit with about 90% accuracy. We spent at least two months figuring out how it all works, though. It's non-trivial to do right.


How do you do it?


Whoever decided to close his access to the API should be fired.


Sad to see you downvoted. I believe that government employees who actively prevent access to publicly-owned data should be fired immediately.


It sounds like the API changed. But same sentiment -- that was a shit design decision.


It sounds like locking down the API was an entirely deliberate decision made at the same time as removing platform data from the website, which means Chesterton's fence[0] applies - we should work out why that decision was made before criticising it.

Discussion upthread is about commuters being pushed onto the tracks, which sounds like a good reason not to display data that isn't 100% certain.

[0] https://en.m.wikipedia.org/wiki/Wikipedia:Chesterton's_fence



I've never heard of "Chesterton's Fence" before and appreciate the introduction to the term. That said, I don't think your application is correct. The OP in this case is not the agent of change — he did not "reform" anything and does not have the capacity to act, he just analyzed/synthesized existing data.

(Also, on reflection, "Chesterton's Fence" is an unnecessarily complicated and overly verbose way of saying: "The MTA probably has a reason for the change.")


> we should work out why that decision was made

Good luck with that. I'd be amazed if you got a response.


The API didn't change, they just made it private "for MTA use only"


Why didn't the OP continue scraping the website, then? I don't think there would be any legal issue if he did that.


> They also stopped publishing the track number on the website — so going back to scraping was not an option either.

So the only way to gather the data would be manually from the station.


Carefully placed raspberry pi + OCR?


I'd imagine that placing a "homemade" electronic device in a densely-populated, public space is going to attract a lot of unwanted attention.


With the right 3D-printed case you can make a Raspberry Pi look pretty benign.


From all those military guards that parade up and down the station


You joke, but Penn Station and Port Authority are effectively occupied territory. There are at least four different law enforcement agencies present and patrolling, stopping and frisking anybody who "looks out of place", hassling homeless people, and generally making you feel "safer".


Just add the ability in the app to take a photo of the screen and use OCR to identify the train/platform.

All he'd need is one or two users to take a photo of the screen and he'd have his source of historical platform data.
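A minimal sketch of the "historical platform data" part, assuming the OCR step has already turned each photo into a (train, track) pair. The `TrackHistory` class and the sample train name are hypothetical, made up for illustration; this is not how the author's app worked, just one way the reports could be tallied:

```python
from collections import Counter, defaultdict


class TrackHistory:
    """Tallies user-reported track assignments per train (hypothetical helper)."""

    def __init__(self):
        # train identifier -> Counter of how often each track was reported
        self._counts = defaultdict(Counter)

    def record(self, train_id, track):
        """Store one observation, e.g. one OCR'd photo of the departure board."""
        self._counts[train_id][track] += 1

    def most_likely(self, train_id):
        """Return (track, share of past reports), or None if the train is unseen."""
        counts = self._counts.get(train_id)
        if not counts:
            return None
        track, seen = counts.most_common(1)[0]
        return track, seen / sum(counts.values())


history = TrackHistory()
for reported_track in (19, 19, 17, 19):
    history.record("6:05 Babylon", reported_track)

print(history.most_likely("6:05 Babylon"))
```

The share of past reports is only a rough confidence signal; it says nothing about last-minute track changes, which is the concern raised upthread.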


Ahhh.. Missed that. Thanks!


"They also stopped publishing the track number on the website — so going back to scraping was not an option either."


Scraping the MTA site against the TOS, and intentionally obfuscating requests to limit the chance of the MTA identifying and shutting it down? Yeah, no surprises here.

EDIT: Oops, missed that he didn't end up going live with the scraping.

It would still be great for the MTA to publish this data intentionally; but I'm sure the potential for trains to switch tracks and the associated backlash when they do is what prevents them from doing it.

Perhaps they could adopt the airline model, where there's a best effort to reach a particular pre-announced track, with notifications ringing out when they can't.


The author didn't do that, though. As far as MTA is concerned the app was always using an API he had been given a legitimate key for.

As an aside, fuck websites that prohibit scraping. If you send me some bytes I'm gonna do whatever I damn well please with them.


In that case screw your API, if you allow DDOS I may as well do it for the lulz.

Isn't it the same approach? If you can abuse a site you will, because they opened themselves up to the public?


That's an illogical extreme. I don't think anyone thinks that; rather, the question is whether this app was an abuse of MTA's system. They claim wide distribution of the app is a hazard. Really, it's not like MTA is stopping anyone with the inclination to do this themselves. If it's as reliable as the author says, you could do it without any automation if you were patient enough. Downvoting because that kind of exaggerated strawman reduces the quality of discussion with no benefit.


Thanks for explaining the downvote. But this is neither exaggerated nor a strawman. It mirrors the extreme "screw your website" reaction and investigates whether the logic for such an extreme disregard holds.

So a website allows you to "download some bytes". Yes you can do anything you want with THOSE bytes. Does this mean you should make an app to let people scrape the website in an automated way and possibly overload it?

If airlines or restaurants give something away free or discounted, does that mean you can make an app to systematically let millions of people take advantage of the arbitrage?

XKCD even has a comic for exactly this situation: https://xkcd.com/1499/

I wasn't the one who said "FUCK WEBSITES". I used a nicer word and questioned the logic. Yet I get downvoted while the parent comment with expletives is upvoted.


Pretty sure that if you read a little further, you will find that the author decided NOT to go with the TOS-violating process, and instead used an official MTA API.


> Scraping the MTA site against the TOS

Where does it say it was against the TOS?


They could publish the log of passed trains, though. This will hurt nobody.



