GFS: Evolution on Fast-forward (acm.org)
65 points by joeshaw on Aug 12, 2009 | 12 comments



I think I independently came up with the same idea to solve the distributed gfs master problem: use a separate bigtable/gfs1 cluster for master metadata for gfs2.

I'm glad that I'm not crazy :)


Also, does gfs1 still have a single master, so that gfs2 has multiple Bigtable tablets serving as its distributed masters? Is this the reason for "In fact, it just makes the bottleneck limitations of the system’s single-master design more apparent than would otherwise be the case.", as stated in the article?


gfs1 is still single-master, but the workload is much simpler in this case: it serves the gfs2 master Bigtable cluster exclusively. Most of the documented gfs master failures are due to misbehaving map-reduce clients. Also, the gfs1 master can be down for an extended period of time without affecting gfs2 master operations, due to the nature of the cluster (you're unlikely to create a million files per second, which would result in a lot of compactions and splits in the metadata tablets).

The quote you mentioned actually means that if you use Bigtable on top of gfs1, the single-master failure is more apparent because of the low-latency requirements of the applications that use Bigtable.
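
To make the "metadata in a Bigtable-like store" idea concrete, here is a toy Python sketch. It is only an illustration under assumptions, not Google's actual design: the class names (`Tablet`, `MetadataTable`), the split keys, the file paths, and the chunk handles are all made up. It shows file metadata range-partitioned across several tablets, so that lookups for different paths land on different shards instead of one master.

```python
import bisect


class Tablet:
    """One shard of the metadata table, covering a contiguous key range."""

    def __init__(self):
        self.rows = {}  # file path -> list of chunk handles


class MetadataTable:
    """Toy range-partitioned table standing in for a Bigtable-like store.

    Hypothetical structure for illustration only.
    """

    def __init__(self, split_keys):
        # N sorted split keys define N + 1 tablets.
        self.split_keys = sorted(split_keys)
        self.tablets = [Tablet() for _ in range(len(self.split_keys) + 1)]

    def tablet_index(self, path):
        # A key equal to a split key is routed to the tablet on its right.
        return bisect.bisect_right(self.split_keys, path)

    def put(self, path, chunks):
        self.tablets[self.tablet_index(path)].rows[path] = list(chunks)

    def get(self, path):
        # Returns None when the path has no metadata row.
        return self.tablets[self.tablet_index(path)].rows.get(path)


table = MetadataTable(split_keys=["/logs", "/users"])
table.put("/crawl/page-0001", ["chunk-a", "chunk-b"])
table.put("/logs/2009-08-12", ["chunk-c"])
table.put("/users/alice/inbox", ["chunk-d"])
```

In this toy setup each of the three paths ends up on a different tablet, which is the whole point: metadata traffic is spread by key range, and the underlying single-master file system only has to serve the tablets' own storage.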


Is this vacaya related to the vacaya of hypertable? :-)


Aha, we came up with this same approach several months ago in China...


Google file system, nothing to do with evolution.


Why hasn't Google open-sourced GFS?


Google's culture isn't to open-source things out of a moral belief in software freedom. It open-sources them when doing so gives it a business advantage.


Even though there's a good bit of info out there about it, it's still a significant advantage for them. It could happen eventually, but I doubt it'll be any time soon.


That, and who but their rivals would use it? It's highly tuned to their uses and well-maintained in-house, so there's no percentage in opening it.


I don't agree; as the article mentions, GFS is used for hundreds of different tasks within Google which encompass a massive range of use patterns. It's a general-purpose system, even if it didn't start as such.

It's no coincidence that the entire Google stack (GFS/MapReduce/Chubby/BigTable) has been replicated as open source: it's a broadly useful set of tools for doing large-scale work with data. It would be useful for thousands of companies if it were open-sourced.

I'd wager that the reason Google isn't open-sourcing GFS is that it's a key part of the "secret sauce" which powers a lot of Google innovation: the sauce that allows a single engineer to slap together a massively scalable, useful application in his spare time. I think that separates Google from their competition more than any specific app does.


We seem to be in violent agreement. :)

Thousands of companies might use it if it were open-sourced, but I rather think many of the serious users are competing with Google, or could be competing in the future. It being part of the secret sauce is exactly why there's no percentage.



