Hadoop filesystem at Twitter (blog.twitter.com)
50 points by r4um on Oct 1, 2015 | 3 comments



Hate to be the one to pick fly shit out of pepper, but why "/user" and not "/temp"? I'm sure you'd get used to it over time, but for the sake of consistency, if we're going to reconstitute vowels in the Unix conventions, shouldn't we do it uniformly?

Anyhow, seeing an inconsistency like this from Twitter makes me feel a little better about my own shortcomings.


My reading is that /user is for data associated with Twitter users/accounts; it isn't meant to hold the kind of things that go in /usr, which implies system-related stuff. /user is meant to be distinct from /logs and /tmp, not to indicate a global, common utility/data directory.

    /user holds Twitter user data
    /logs holds timestamped aggregated data
    /tmp  is for ephemeral data; it has a truncated name because that's
          what people are used to for this purpose

I wouldn't name it /usr if it held information about users either.


We delete anything in our HDFS /tmp after two weeks. It's very convenient for one-off job output which you know doesn't have to be around long. /user is for important, longer-lived stuff.
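
For anyone curious what a two-week /tmp sweep like that might look like, here is a rough sketch of just the selection step. It is not how the commenter's cluster actually does it: the function name, the (path, mtime) input shape, and the example paths are all made up for illustration. It assumes you already have a directory listing with modification times from whatever HDFS client you use.

```python
from datetime import datetime, timedelta, timezone

# "Two weeks", per the comment above. Everything else here is hypothetical.
RETENTION = timedelta(days=14)


def expired_paths(entries, now, retention=RETENTION):
    """Return paths last modified strictly before (now - retention).

    entries: iterable of (path, mtime) pairs, where mtime is a
             timezone-aware datetime (assumed input shape).
    """
    cutoff = now - retention
    return [path for path, mtime in entries if mtime < cutoff]


if __name__ == "__main__":
    now = datetime(2015, 10, 1, tzinfo=timezone.utc)
    # Made-up listing of one-off job output under /tmp.
    listing = [
        ("/tmp/job-output-a", datetime(2015, 9, 10, tzinfo=timezone.utc)),
        ("/tmp/job-output-b", datetime(2015, 9, 25, tzinfo=timezone.utc)),
    ]
    for path in expired_paths(listing, now):
        print(path)  # only /tmp/job-output-a is older than two weeks
```

The selection is kept separate from the delete on purpose, so you can dry-run the list before handing it to something like `hdfs dfs -rm -r`.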



