Scaling Early (slideshare.net)
18 points by nreece on Nov 14, 2008 | 8 comments



Great read :-) Hooray for another Perl fan as well as for the simplicity and power of BerkeleyDB/other similar dbs. Have you looked at Tokyo Cabinet / Tokyo Tyrant ( http://tokyocabinet.sourceforge.net/index.html )?

Also, what is the URL of the project/site/start-up you're scaling? I'm very interested in seeing the seeds of this effort.


Tokyo Cabinet looks intriguing. One major limitation of Berkeley DB is that it cannot reliably store its files on a network filesystem (NFS or CIFS or what have you) because of unreliable POSIX semantics. It's a tough problem, but any idea if Tokyo Cabinet tackles it?


Using NFS doesn't seem like a good idea for this situation - from the operations perspective, NFS in production is going to be hell anyway.

A better way would be (since Tokyo Cabinet is thread safe) to have a separate thread/process that receives messages (using a message queuing system) and applies the updates to the database files (à la MySQL replication).
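A minimal sketch of that single-writer idea, in Python for brevity. Everything here is an assumption for illustration: the dict `store` stands in for the Tokyo Cabinet file, `queue.Queue` stands in for a real message queuing system, and the names `writer`, `updates`, and `run_demo` are made up.

```python
import queue
import threading

# Hypothetical stand-in for the database file; a real setup would hold
# a Tokyo Cabinet (or Berkeley DB) handle here instead of a dict.
store = {}

updates = queue.Queue()  # in-process stand-in for a message queue
_STOP = object()         # sentinel that tells the writer to exit


def writer():
    """Single writer: the only code path that modifies the store."""
    while True:
        msg = updates.get()
        if msg is _STOP:
            break
        key, value = msg
        store[key] = value


def run_demo():
    t = threading.Thread(target=writer)
    t.start()
    # Producers (e.g. web workers) only enqueue; they never write directly.
    for i in range(5):
        updates.put((f"visitor:{i}", i * 10))
    updates.put(_STOP)
    t.join()
    return dict(store)
```

Because only one thread ever writes, the database files never need to be shared over NFS; other machines just send messages.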

(Or you can split out the db layer as a separate network service and partition the writes/reads so they go to specific servers based on the key. Tokyo Cabinet even includes a memcached-compatible network layer for this: Tokyo Tyrant.)
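Roughly, the key-based partitioning could look like the sketch below. The server addresses and the `server_for` helper are hypothetical, and hashing the key modulo the server count is just one simple way to pick a server, not something the slides prescribe.

```python
import hashlib

# Hypothetical addresses of the db-layer servers.
SERVERS = ["db1:1978", "db2:1978", "db3:1978"]


def server_for(key, servers=SERVERS):
    """Map a key to one server so reads and writes for it always agree."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Usage would be something like `server_for("visitor:42")` before every get or set. One caveat with plain modulo: adding or removing a server remaps most keys, which is the problem consistent hashing is meant to reduce.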


The first slide says that the presentation is by Mark Maunder with an email address at http://feedjit.com so I'm guessing that's the site being optimized.

The content in the presentation does make sense for what feedjit does.


Pretty nice.

Hate to be a nitpicker but it's one of my pet peeves:

What the slides describe isn't scaling but performance tweaking. Scalability != performance.

A lot of people get that wrong, still annoys me, silly me... :D


Well, there's two ways to look at scalability:

- The ability to handle lots of load at all, which comes down to a stability and fault-tolerance issue.

- The performance of the system under load.

Both are pretty valid.


No, only the first is valid. :) Some people _believe_ it's #2 but it isn't, sorry. :D "performance of the system under load" is just that, performance.

See the Wikipedia article: http://en.wikipedia.org/wiki/Scalability


No, the latter does matter.

If a system's performance significantly degrades under load, it doesn't scale.

Quoting the Wikipedia article: 'its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged.'

The first clause is performing well under load, hence the performance requirement.



