This is off topic a bit, but it tells my story about building something on top of git.
Just several weeks before Dropbox came to light, I had finished building a prototype of what turned out to be a Dropbox clone on top of git (in fact, I had it working with Mercurial and Bazaar as well; I designed it to be platform independent) and thought I had in my bare hands the potential for a great startup (demos were working smoothly: auto-syncing files between clients, a web viewer, etc.). It was a side project that I worked on late at night and on weekends, and I got what I consider great results without much effort.
Yet one morning I opened my browser, pointed it, as usual, to HN and saw a post saying Dropbox had raised this and that money from Sequoia Capital. I was eager to know what this Dropbox was and what they did, and was terribly shocked to find out they were doing the same shit I was, but for longer, with more and smarter people and with funding.
Soon after, I dropped my project (BTW, it was named StoreAge), and I could not use Dropbox or hear anything about it for a very long time.
Today I am a happy Dropbox user and git user, waking up every morning wondering what my next startup will be.
I think Git owes a ton of its success to github.com. Without this extremely well-designed, central repository for repositories, the uptake of Git would have been much slower, and faced much more resistance in the wild. Git is a great tool on its own, but having a centralized place for people to learn and use Git has been huge.
In addition to github, you have the Linux kernel's switch to git, which really put the project ahead of the other distributed revision control systems such as Darcs and Mercurial.
On the other hand, one could also attribute github's success to git's success. I suppose it's most likely symbiotic: github contributed to git's success and vice versa.
There are two kinds of articles about DVCSs: (1) git, hg, bzr are all better than whatever you're using now; (2) git is completely revolutionary, a whole new paradigm.
This is, of course, the latter sort. I'm wondering why hg, etc. are never called "a whole new paradigm". I've used all three. Clearly, hg & bzr are extremely similar, while git works a little differently (the staging area, the differing semantics of "add", etc.). But I don't see that git has significantly more awesomeness than the other two.
The second type of article is usually written by converts from legacy VC systems (cvs/svn) who have never experienced a DVCS before. They attribute the obvious improvement in usability to Git, never venturing to try anything else; possibly due to the same fundamental lack of curiosity or self-improvement that leads to using CVS/SVN at all (post 2005).
What's more, the migration from a single-branch to multiple-branch model is so empowering that such users tend to view all of Git through rose-tinted glasses forever after. Compared to that experience, relatively minor usability improvements like Bazaar's "every working copy has its own tree" or Mercurial's queues seem insignificant and/or unimportant.
> ... possibly due to the same fundamental lack of curiosity or self-improvement that leads to using CVS/SVN at all (post 2005).
I (and my successful small business) still use SVN, and have done so since 2007. This choice has nothing to do with "fundamental lack of curiosity or self-improvement." It's a choice born of careful reasoning regarding the suitability of using DVCS in a centralized organization.
I toyed around with Hg for some time before Git. Indeed, a DVCS by itself implies a high degree of flexibility and whatnot.
However, I'd still like to single Git out, because of its storage format. In Hg, for every file in your repo, you get one storage file holding revisions -- plus one master file with changelogs. It's just boring files all the way down.
In Git, every object (a file, a tree of files, a commit with a tree of files and parent commit(s)) is identified by a hash of its content. The data itself is stored somewhere, in an interchangeable format (currently two formats are used: loose objects and packs with an index). Storage is somewhat decoupled from the toolkit. You can even fix a broken repo by literally copying in file(s) with the proper content.
But the true power comes from somewhere else: you can envision defining your own data types, beyond files, trees and commits, and plugging them into Git, and having them play along with the usual ones.
A general content-addressable storage :D
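To make the content-addressing idea above concrete, here is a minimal Python sketch. The id calculation (SHA-1 over a "<type> <size>\0" header followed by the content) is how git names its objects; the `ObjectStore` class and the `wiki-page` type are hypothetical, made up only to illustrate plugging a custom type into such a store, and are not part of git.

```python
import hashlib
import zlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the SHA-1 id git would assign to an object of this type and content."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


class ObjectStore:
    """Toy in-memory content-addressable store (hypothetical, not git's on-disk layout)."""

    def __init__(self):
        self._objects = {}  # object id -> zlib-compressed "header\0payload"

    def put(self, obj_type: str, payload: bytes) -> str:
        oid = git_object_id(obj_type, payload)
        header = f"{obj_type} {len(payload)}".encode() + b"\0"
        # Storing the same content twice lands on the same key: free deduplication.
        self._objects[oid] = zlib.compress(header + payload)
        return oid

    def get(self, oid: str):
        raw = zlib.decompress(self._objects[oid])
        header, _, payload = raw.partition(b"\0")
        obj_type, _size = header.decode().split(" ")
        return obj_type, payload


# The empty blob has the same well-known id in every git repository.
print(git_object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

store = ObjectStore()
page_id = store.put("wiki-page", b"hello")  # "wiki-page" is an invented type
print(store.get(page_id))
```

Because the id is derived from the content, "fixing" a broken store really is just a matter of putting the right bytes back under the right name.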
(Too bad Venti [1] was invented a bit earlier)
I believe Fossil (the DVCS [2], not the FS [3]) comes pretty close to that, too.
He doesn't seem to be claiming that Git is more revolutionary than Mercurial/Bazaar/&c. as a VCS.
I think it's fair to say that Git makes it easier to build some system other than a VCS from it, since all of the low-level commands for directly manipulating the repository are exposed in the shell. Contrast that with Mercurial, which doesn't even have a stable, officially supported Python API.
(For the record, I have actually built such a system on top of hg, and I'm pretty familiar with git.)
It's kinda sad that people have kinda forgotten about darcs (at least in the mainstream) which is still an interesting project despite some of its shortcomings (and one of the first good DVCSs). That said I still prefer git.
I think it boils down to difficult to quantify factors: most notably, love. In spite of all of git's warts (or perhaps even because of them), it's very clear that git is a tool that was created by the people who would be stuck using it. I'm not saying that isn't true about the other ones. I'm just saying it just doesn't shine through as much.
What he's going on about is that git is an efficient implementation of a purely functional data structure on disk. He's advocating using it as one if you ever need one. This is tangential to its role as a version control system.
The vast majority of these comments seem to misunderstand the article as saying that Git is revolutionary for managing code.
That's wrong. The article is about using Git for managing data. The examples cited are using it as a backend for a distributed filesystem or a wiki. The OP is talking about Git as a revolutionary new kind of datastore and network protocol, not a revolutionary new kind of VCS.
I say this as someone who absolutely loves git and thinks it's the best thing to happen to VCSes in a long time: no it isn't. This kind of advocacy is worse than useless: when people realize that git isn't a revolutionary concept that will change computing and is really "just" a VCS (albeit a very good one), there will be a big backlash.
I'm the author of the OP. The comments here make me think (again) that the article wasn't clear enough: admittedly, when I first wrote it, I was just discovering git, as some of the comments said. The difference between me and perhaps many recently-baptized git fanboys is that now, three years later, I still believe exactly what I wrote. I just now also know why it came across the wrong way.
Here's what I was trying to get across at the time: git creates a whole new set of nouns and verbs for computer science that almost none of us have experienced before. Yes, it steals a lot of concepts from programs like darcs and monotone, and there are other things that do the same things that git does from a VCS point of view - but my focus is on the nouns and verbs. git exposes the plumbing of these new concepts directly to you, which is both scary and intensely powerful.
git isn't the next Unix because it will replace Unix: git is the next Unix because its concepts represent the next mind-shifting change in computer science. I mean that git is the next Unix in the same way you could say "Unix is the next Lisp" or "Dynamic Languages are the next Static Languages." Not that the new thing replaces the old thing: they have totally different uses. But that's the point: the new thing's uses are really new. Stuff that was hard is now easy.
It's hard to imagine the world before Unix pipes (and the Unix sh in general) were invented, but I used it, and IT SUCKED. The whole Unix paradigm (yikes, now I've used that word) really changed the face of computing. Even if you don't use Unix, you got changed by Unix.
git's new nouns are blobs, trees, commits, and refs. The new verbs are push, pull, merge, tag, etc. You can apply these nouns and verbs to a lot more than just source code version control.
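As a rough illustration of those nouns applied to something other than source code, here is a small Python sketch of a versioned wiki. It is a simplified model, not git's real object format: objects are serialized as JSON, and the `put`, `get` and `commit` helpers, the `main` ref and the page names are all invented for the example.

```python
import hashlib
import json

objects = {}  # object id -> serialized object, never modified once written
refs = {}     # mutable names pointing at commit ids


def put(kind, body):
    """Store an object under the hash of its serialized form and return that id."""
    data = json.dumps({"kind": kind, "body": body}, sort_keys=True).encode()
    oid = hashlib.sha1(data).hexdigest()
    objects[oid] = data
    return oid


def get(oid):
    """Load the body of a stored object."""
    return json.loads(objects[oid])["body"]


def commit(ref, pages, message):
    """Snapshot a {page name: text} dict and move `ref` to the new commit."""
    tree = put("tree", {name: put("blob", text) for name, text in pages.items()})
    refs[ref] = put("commit", {"tree": tree, "parent": refs.get(ref), "message": message})
    return refs[ref]


first = commit("main", {"Home": "welcome", "About": "v1"}, "create wiki")
second = commit("main", {"Home": "welcome", "About": "v2"}, "edit About")

old_tree = get(get(first)["tree"])
new_tree = get(get(second)["tree"])
print(old_tree["Home"] == new_tree["Home"])    # unchanged page: same blob id
print(old_tree["About"] == new_tree["About"])  # edited page: different blob id
print(get(second)["parent"] == first)          # history is a chain of commits
```

The unchanged page is stored once and shared by both snapshots, while the ref is the only thing that ever gets overwritten.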
The naysayers in this thread all sound like 1990's programmers who don't understand the value of higher-order functions or dynamic typing or macros. You can survive without those things, but some problems are just so much easier with them than without them. git is like that. If you don't get it, you're living in the past.
One final clarification: my article was written to talk about git, but it's not about git's code or API or repo format at all. bup, the backup software I started writing about two years after that article, doesn't share any source code with git, but the amazing things it does are possible because it uses the new nouns and verbs popularized by git. When new distributed filesystems and databases and social networks and wikis and massively distributed collaborative text editors arrive, they will all be using these new nouns and verbs. If you don't care about that, then yeah, git isn't the next Unix for you. But if you want to build the next generation of networks in real life, then you'll either be taking advantage of the new nouns and verbs or you'll be painstakingly building the Windows of distributed systems.
> You can apply these nouns and verbs to a lot more than just source code version control.
Do you have any examples of people (other than you) who have actually done this? It would make your argument much more convincing.
Also, the fact that other DVCS's have different, incompatible models underlying them suggests that Git's nouns and verbs are not nearly as universal as Unix pipes. If Git's nouns and verbs were universal, Git could subsume other DVCS's (ie. you could implement other DVCS's semantics on top of Git with performance as good or better than what they have already).
Many of the new "nosql" databases use a lot of similar concepts. But it's a new thing: I'm trying to see into the future here, not tell you what's already happened :)
As for different DVCSs, I think you're exactly backwards. Almost everyone commenting on these things seems to believe that git isn't special, it really does the same thing as every other DVCS, etc. And on a fundamental level this is true: it's very easy to convert from one DVCS to another, because fundamentally the models are so similar. git just exposes the model in a more obvious way. The insides are the same, the outsides are different.
I thought I was with you but I was just made a little more confused by that comment :\
I've yet to do a lot with Git, but your post has made me a lot more interested. After wikipediaing the features, I'm starting to see how one could save a lot of time by building an app on top of Git rather than recreating all this functionality from scratch.
Does anyone have experience with this setup: install your software in a git repository. That way you can say something like "version=1.23" at the top of the control file (such as a source file for a scripting language) and the system checks out that particular version, which is already on your hard drive. There are obvious drawbacks, such as where to put the intermediate working directory, but there might be solutions for these problems.
That way updates could be much less hassle. If, for example, you have a Python script running on your server doing important things, an update to Python might break the script and cause some trouble. But if you could say, for example, "uses python=x.y" and the system silently fell back to that version even when a newer version of Python is installed, the script would be more likely to keep running across upgrades.
When I last looked at tarsnap, I got the impression that it's a smart compressed rsync - apparently I was wrong.
bup has a FUSE frontend, that exposes every backup set as a complete file system (also through http and ftp, but the file system angle is the most useful in my opinion). Does tarsnap have something comparable? I'm going to look at it again.
No, Tarsnap is designed for backups rather than random access -- it does things like cryptographically signing archives, which is obviously only possible if you have a concept of "this archive" vs. "that other archive".
(You can extract subsets of files, as per normal tar functionality, though, and Tarsnap only downloads the files you want plus the 512-byte tar headers it needs so that it can figure out which files match your specifications.)
In that case, I did not just describe tarsnap, or at least - did not intend to.
bup _is_ designed for backup rather than random access. But it is easy enough to make those backups look like read-only file systems (bup includes an FTP server, HTTP server and FUSE module that expose a backup set through the respective protocol or as a filesystem).
But since bup builds on git, and a bup backup set is actually a git repository, you get all the git related stuff for free - e.g. bup supports cryptographic signatures in the repository by way of git's signing support -- although, for now, the "bup" command does not implement them (so, if you want to sign or verify the signature, you'll have to use git on the repository rather than bup)
Bup's deduplication is comparable to rsync's (and it reuses rsync's main trick for that, the rolling checksum). If you change a byte in the middle of a 100MB file, you'll likely need to transfer ~16k to or from the backup (compared to the other version of the same file). That's also true if a byte was inserted in the middle of the file. And if you back up 100 copies of a 100MB file, with just a few bytes changed in each copy relative to the others, you'll need less than 150MB of storage, rather than the 10GB or so without deduplication.
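Here is a minimal Python sketch of why an inserted byte costs so little with this kind of deduplication. It is not bup's actual code: the window size, the boundary mask and the `split` helper are assumptions chosen for illustration. The idea is that chunk boundaries are picked from an rsync-style rolling checksum over the last few bytes, so they depend on nearby content rather than on file offsets.

```python
import hashlib
import random

WINDOW = 48           # bytes covered by the rolling checksum (assumed value)
MASK = (1 << 10) - 1  # a boundary roughly every 1 KiB on random data (assumed value)


def split(data: bytes) -> list:
    """Cut data wherever the checksum of the previous WINDOW bytes ends in ten 1-bits."""
    pieces, start = [], 0
    a = b = 0  # rsync-style sums: a = plain sum, b = position-weighted sum of the window
    for i, incoming in enumerate(data):
        outgoing = data[i - WINDOW] if i >= WINDOW else 0
        a += incoming - outgoing        # slide the plain sum by one byte
        b += a - WINDOW * outgoing      # slide the weighted sum by one byte
        if (b & MASK) == MASK:
            pieces.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        pieces.append(data[start:])     # trailing piece without a boundary
    return pieces


def chunk_ids(data: bytes) -> set:
    """Hash every chunk; identical chunks collapse to a single id."""
    return {hashlib.sha1(piece).hexdigest() for piece in split(data)}


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(100_000))
edited = original[:50_000] + b"!" + original[50_000:]  # insert one byte mid-file

new_ids = chunk_ids(edited) - chunk_ids(original)
print(len(chunk_ids(original)), "chunks in the original")
print(len(new_ids), "chunks need storing after the one-byte insert")
```

With fixed-size blocks, the insert would shift every later block and nothing after it would match; with content-defined boundaries, only the chunk or two around the edit is new.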
I don't want to spam, but I built this: http://www.wipigi.com/, a wiki service on top of Git (and Django)...
But with GitHub's wikis it's a little useless for anyone but me. Nonetheless it was fun to build and an opportunity to learn a little more about Git (and Django).
I've been hacking at gollum (github's FOSS git-based wiki) for some time, and I might have some interesting uses for it in the education space that go well past a basic wiki.
I'm writing a gollum-like wiki atm. It's not finished yet, but it already works and is not far from a (stable) release: http://github.com/entropie/oy
My problem was that gollum did not work behind Apache/mod_proxy, so I was forced to write my own ;)
Yea that seems like a pretty powerful concept in particular. I wonder, though, if people'd really care about the ability to edit something offline when they could just go with some wiki that has private restrictions on it.
I do, however, really like the idea of downloading a complete wiki, knowing that I can resync at a later date if content in my version becomes irrelevant.
I've been thinking about this idea actually. Sometimes, it'd be really handy to have blog posts (aka essays) under version control so you can edit them in your favorite text editor offline. At the same time, that could be very restricting; imagine if you were required to use git + text editor to edit/publish all of your posts.
Having the wiki paradigm in mind, it didn't occur to me that one -would- use their preferred text-editor, but that's maybe a big benefit right there. What if it were a desktop app running in the background that monitors these text files and is smart enough to handle all git operations behind the scenes? And what -is- it about long form writing that is just inconvenient on the internet? haha
I love git, and use it to back up my life. It's great to know that you will still be able to check out what you did today many years from now. CVS can do this as well; it's just less attractive because it's much slower.
I don't remember if I've read this before, but I feel the same way.
The great thing about git is that it's easy to understand. Once you understand the concepts (and they're really (relatively) simple) and learn the vocabulary, it gives you tremendous power. At least, it makes you feel that you have power, hence it empowers you.
> Git is actually the missing link that has prevented me from building the things I've wanted to build in the past.
I'm having a hard time understanding the second point. I have never felt that my VCS prevented me from doing ANYTHING, even when I was forced to use completely shit systems like Visual Source Safe 6.0.
Are you saying that you wanted to build systems that integrated very tightly with version control, or that the difficulties you experienced with older version control tools prevented you from being more adventurous in your coding?
It's not the presence of a bad VCS. It's the lack of a decent one.
Without git, it's really hard to take bold steps in redesigning the project, because one would be afraid it won't work, and then you'll lose all the progress you had so far.
Git solves this problem. Just start a new branch and work in it. If it works, great, merge back to the master branch. If it fails, no big deal, discard this branch and go back to master.
Basically git encourages experimenting in a way no other system does.
> Just start a new branch and work in it. If it works, great, merge back to the master branch. If it fails, no big deal, discard this branch and go back to master.
How does that not describe more traditional VCSes like cvs and subversion?