
His post is chock full of BS:

> If its possible and legal for Google, then why not for any and everyone else to also index and offer access to the same data.

Because Craigslist isn't suing Google. If they didn't want Google indexing their stuff, Google would comply.

> Google doesn't get special secondary property rights to privatize public data to the exclusion of anyone else.

They can and do. That's why it's beneficial to set your user agent to Google Bot when browsing news sites.

> Equal access to exchange data and search data is a principle in parallel to the notions of net neutrality.

Full of shit. It takes effort to provide the service that Craigslist does. Claiming that you have a right to that data is wrong, and moreover has already been decided by case law (you can find links in other comments on this page).

> And look at what a 3rd party application (craiggers.com) can do in recreating the whole of Craigslist in a format that gives access to data in a way that is not remotely possible in the legacy Craigslist offering.

There is a reason Craigslist doesn't make that data available - it would reduce the utility for sellers of their website, reducing their revenue and potentially drying up their business. Trying to squeeze it out of Google's cache is still copyright infringement.

> Note, Craiggers does NOT disrupt the existing Craigslist revenue model for Craigslist.

Yes, it does. By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there.




All of your points rely upon the assumption that Craigslist "owns" all of the posts submitted. I'm not saying that's right or wrong, but if that is true then wouldn't that extend to Facebook owning all content submitted to their service, Twitter owning all tweets, Flickr owning all hosted photos, and Stack Overflow owning all submitted answers?


Craigslist owns the unique compilation of their listings. That's what is at stake here.


And there's no copyright infringement as long as your compilation, based on their publicly available data, is also unique, which PadMapper's is. This has lots of precedents dating back to services derived from phonebooks.


I am interested in this. Can you point me to one or more of these precedents? Thanks in advance.



Is it?

In my interpretation it is the content of the listings, not the compilation. He could compile the data from several different sources and present the same result.


What constitutes a "unique compilation" is subject to interpretation in a court of law.

Adding, removing, or modifying items in a dataset, or sufficiently changing their arrangement or display, yields a dataset distinct from the one Craigslist offers, even if it is largely derived from the Craigslist dataset.

Craigslist could argue a trespass to chattels tort or file a ToS civil suit, but there isn't much they can do to protect a dataset.

The reason Facebook makes Facebook content largely available only to those who are logged in is to hide behind their ToS and prevent scraping whether centralized or distributed.


As far as I know, in the US they don't.


So what you're suggesting is that if a service put "noindex" in its robots/metatags, they would somehow be overstepping the bounds of what they can do with their users' content?


Not at all... that's a bit of a strawman since robots.txt/metatags are used by search engines and not the public in general.

Not wanting your data on Google != Denying access to said content
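To illustrate the distinction, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.org URL are made up for illustration and are not Craigslist's actual rules. It shows a rule addressed to one crawler leaving every other user agent unaffected; robots.txt is also only advisory, so it does not by itself deny anyone access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (an assumption, not a real file).
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def can_crawl(user_agent: str, url: str) -> bool:
    """Return True if the hypothetical rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    listing = "https://example.org/apa/123.html"
    # The crawler named in the file is asked to stay out...
    print("Googlebot:", can_crawl("Googlebot", listing))
    # ...while any other agent falls through to the wildcard entry.
    print("Mozilla/5.0:", can_crawl("Mozilla/5.0", listing))
```

Nothing here blocks a request; a client that ignores the file can still fetch the page, which is why "noindex" and "deny access" are different things.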


> By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there.

This is exactly what Craigslist will argue, but it's hard to imagine how that could be true if the eventual downstream destination of the data always sends the user back to Craigslist to finish the process.


I really don't get the whole argument and all this Craigslist bashing. It looks like someone built his house on somebody else's land and now tries to rally the internet because the owner of the land found out about the illegal house.

Is it not just like asking "why doesn't Google, eBay etc. let me scrape their database"? Just because someone found a nicer way to display, sort, and relate the data, that gives them neither a legal nor a moral right to use the data.


The startups who are Craigslist bashing just have a vested interest in getting access to Craigslist content without any ToS. It is more like demanding that a property open up because there is gold on it and they are demanding that it be made available to them because they can serve gold to the market better than the land owner. From what I am reading though, the businesses are offering nothing in the way of compensation and this is more of a property grab.


I'm not disagreeing with you. I'm disagreeing with the foregone conclusion that a more liberal ToS would spell doom for Craigslist. A nicer way to display, sort, relate the data doesn't confer rights, but--as long as the end destination is Craigslist--I don't think providing access is necessarily a losing proposition for either party.


It doesn't matter. It's a copyright issue. Craigslist has the right to protect their copyright, it can be proven that they have a unique database, and it can also very likely be proven that the scrapers knew they were doing something illegal (e.g. willful infringement) when the scraping took place.


I agree, it doesn't matter, but I think the reasoning is probably wrong.

It would have been better for everyone if Craigslist gave away the data minus whatever is needed to finish the transaction, and included in the ToS that you must send the user back to Craigslist to actually finish whatever they're trying to do.


> It doesn't matter. It's a copyright issue. Craigslist has the right to protect their copyright, it can be proven that they have a unique database, and it can also very likely be proven that the scrapers knew they were doing something illegal (e.g. willful infringement) when the scraping took place.

As far as I know, there is no database copyright in the US.


And you'd be proven wrong by a simple google search on "database copyright"... first hit [1] shows a compilation of laws and court rulings that support copyrighting collection of records... and this stuff has been around for a decade or more.

[1] http://www.bitlaw.com/copyright/database.html


Like Amazon's and Yelp's reviews, Craigslist would probably claim copyright on the posts themselves, not just the aggregation thereof.


Yes, but as far as I can tell, that gives them exactly bupkis on PadMapper.


I think you mean "bupkis." Dick Butkus is a hall-of-fame NFL linebacker.


Thanks. I want to get stuff like that right.


> They can and do. That's why it's beneficial to set your user agent to Google Bot when browsing news sites.

Jonathan, you can't be serious about this one. Why on Earth would you go on public record and suggest that someone illegally impersonate another company? Don't you know it may be a jailable offense, if Google (or any other company, for that matter) decides to go after you??

> It takes effort to provide the service that Craigslist does. Claiming that you have a right to that data is wrong [...]

It is equally wrong to claim you have no right, just because it takes effort to provide Craigslist-level service.

> There is a reason Craigslist doesn't make that data available - it would reduce the utility for sellers of their website, reducing their revenue and potentially drying up their business. Trying to squeeze it out of Google's cache is still copyright infringement.

Uhm, providing one company data in a different template would reduce said company revenue? Excuse me, but under which rock have you been living for the last 20 years? You ever heard of "social sharing"? "like" button? Does "API" ring any bell?? Why do you think most companies provide those?? Most would die for other developers to actually spend their time and effort to build front gates based on their data. I hope you are not working on any startup bro, because you seriously have a shitty point of view!

> Yes, it does. By scraping craigslist in an attempt to undermine their platform [...]

What do you mean by "undermine their platform"?? Why would you automatically assume that all the OP wants to do is to kill or undermine someone's business?? Further, he would have to be retarded to do so! Why would he work on a project whose core is built on external data and at the same time want to... kill that data source?? You lack logic here, again.

> By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there

Yes, definitely -- a website where I could see all the pictures of furniture posted on Craigslist on one page, which gives me easy access to scroll down in less than 2 minutes (to actually see what I want to buy), versus spending 45 minutes clicking on each and every post on the original Craigslist website -- yes, definitely, because that evil website stealing Craigslist data saved me 40 minutes, I will never use Craigslist as either buyer or seller ever again.

What bullshit!


Holy shit, is this satire? If so, this is a perfect imitation of some of the commenters who go around here pretending that the entire world agrees with their minority viewpoint that neither property nor effort have any established value.

If this isn't satire, well, all I can say is I hope you build something awesome one day. I presume your above manifesto can be construed as an invitation to insert myself between you and the userbase you invested so much to accumulate?


If I have a product and a set of users, and you have a way of increasing both my revenue and my users' satisfaction by inserting yourself between us, please do. The OP's point is that undermining Craigslist makes no sense for Padmapper, since that's where the data comes from. If people who list apartments get them rented faster because of Padmapper, they like it. If people looking for apartments find them faster because of Padmapper, they like it. If it increases Craigslist's revenue, they like it. If all of these points are positive, it's hard to understand why Craigslist would be against it.

Please recognize the distinction. I'm not saying they're outside their rights to deny this access or that Padmapper is somehow within theirs by violating the ToS, just that if it's a win for everyone, shutting it down doesn't make sense.


Disintermediation -- getting your users used to coming to my site to look at your data is the first step to making you irrelevant and forgettable. PM knows who the renters and the listers are; it's a small step to convince some listers to list with PM first. Maybe PM will agree to repost on CL as well, but as long as listers come to PM first, the relationship between PM and CL is flipped.

Eric might protest that all he wants is to save apartment hunters from minutes of work. The reality is that PM's long-term viability depends on him getting access to some of that listing fee cash, which almost definitely means reducing CL's revenue. Unless you think he's going to convince agents to pay twice to get in front of an overlapping set of viewers.


> PM knows who the renters and the listers are; it's a small step to convince some listers to list with PM first.

You need to learn a little bit about Craigslist. Every single section of the site has been copied over and over again, including by big names like Angie's List, eBay, etc. Yet after so many years, Craigslist's traffic and revenue continue to grow.

CL has such an incredibly solid network effect that most users, myself included, couldn't care less how awesome the website for quickly browsing CL photos gets. The moment they go solo and decide to disconnect is the moment I come back to CL to post/browse. Simple.


I still think it would be possible to work this out with a more creative ToS. Problem one could be managed by making it part of the ToS that you won't aggregate from other sites. Problem two could be managed either by making it part of the ToS that you don't post your own listings, or by making you pay to have them listed at Craigslist at the same time.



