With Virtual Tables you can expose any data source as a SQLite table -- then you can use every SQL feature that sqlite offers. You can just tell sqlite how to iterate through your data with a few functions, with an option to push down filtering information for efficiency.
You can also create your own aggregates, functions etc.
SQLite virtual tables for git would be phenomenal - you could join your git history against data from other sources! And you wouldn't need to run a huge MySQL/Postgres server process to do it.
A very cursory search suggests no one has built this yet.
This is pretty cool. Looks like it's local to the current repo, which makes sense for most usage. Having something like this across a swathe of repos would be useful in different ways (e.g. "What has Bob committed over all the repos for our projects that involves the string 'billing'?").
Minor off topic rant about the animated example: Who doesn't put a space at the end of their prompt after the $?! Ugh!
Nice. I can see a need for this as a lot of my projects are structured like that (multiple sibling repos). Running 'git map log --grep ...' seems particularly useful.
Quick eyeballing of the source, it does not handle whitespace in directory names properly. The for loop would treat them as separate, invalid, entries.
Of course I'd also fire someone on the spot that commits a project directory with whitespace in it...
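For what it's worth, the word-splitting problem disappears if each directory is passed along as a single argument instead of going through an unquoted shell expansion. A rough sketch of the same idea in Python follows; this is not git-map's actual code, and the helper names find_repos and git_map are my own. It assumes the repos are immediate subdirectories of one parent directory.

```python
import subprocess
from pathlib import Path


def find_repos(parent):
    """Return immediate subdirectories of `parent` that contain a .git entry."""
    return sorted(
        p for p in Path(parent).iterdir() if p.is_dir() and (p / ".git").exists()
    )


def git_map(parent, *git_args):
    """Run `git <git_args>` in every repo under `parent`.

    Each path is handed to git as one argv element via `git -C <path>`,
    so spaces in directory names are never split into separate words.
    """
    results = {}
    for repo in find_repos(parent):
        proc = subprocess.run(
            ["git", "-C", str(repo), *git_args],
            capture_output=True,
            text=True,
        )
        results[repo.name] = proc.stdout
    return results
```

Something like git_map("..", "log", "--oneline", "--grep", "billing") would then collect the matching log lines per sibling repo, whatever the directories are called.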
Thank you for suggesting git-map. I tried it and intend to include it in my workflow. I thought your suggestion was a good one so I tested it. It did not work for me due to: "dyld: Library not loaded: libgit2.21.dylib". I assume this is something about my setup (mac, zsh, other stuff) but if you got this working I'd like to know so I can keep trying with my setup. To clarify: both git-map and gitql work for me, I just can't seem to combine them.
git map is just an 8-line bash script, so if you use zsh most of the time, perhaps check that your paths are set up correctly for bash to find your git binaries.
You could submodule them all in an otherwise empty parent repo.
Bit of a hack, but could come in handy for other things (off the top of my head: "welcome to the team! Clone this one thing, it contains everything you need.").
> Who doesn't put a space at the end of their prompt after the $?! Ugh!
I've always had mixed feelings about that. Resolved it now, by using '>', no space, which doesn't look cluttered since there's only ~1px right next to the first char entered.
Thank you for sharing this. I was interested to know if this was a fork of gitql by cloudson. It is not. The following issue clarifies the relationship:
Mercurial has a somewhat similar concept predating this (added circa 2010): revision sets (https://www.selenic.com/mercurial/hg.1.html#revsets) for selection, and templates for formatting, though git has the latter built in, kind of, via log --format.
Mercurial's are completely general, though. Any Mercurial command that can accept a revision as an argument can also accept a revset expression. And templating isn't just for log, but for many other commands, such as grep or annotate (blame), and it's the same templating language for all of them. I also find hg templates a bit easier to read, because they're Djangoish/Jinjaish instead of being printf-ish like git's. Plus, you can save and compose Mercurial templates and revsets.
I was actually hoping that gitql had finally gotten inspiration from Mercurial and git would grow a general purpose query language, but it's read-only. :-(
Revsets are a wonderful feature, and it's something I wish git had. Just being able to say I want to see what has changed between this branch head and its latest common ancestor with trunk is an incredibly simple and useful thing to be able to do.
In this case git can do the same thing, but notice you can only do it because git provides a special command for getting that revision. Revsets are really general (a greatest common ancestor function is provided, but can be easily synthesised from more primitive building blocks) and can be used everywhere, so you can bisect over changes you made that touched files matching a pattern, or whatever. They aren't something I use every day, but they are really useful on occasion and allow for some pretty robust tooling to be written.
This requires using bash. I find it kind of cheating that git ships bash on Windows so that Windows users can rely on bash for composing git commands. I'm not sure if Windows users are generally that happy about typing bash commands, but I guess nobody really cares what you have to type in as long as it's high in the Google hits for whatever operation you want to perform.
Mercurial's API (i.e. the CLI) makes a point of being usable with powershell and cmd.exe, which I think some Windows users appreciate.
Another way to look at this is that hg needed to bake this into their core, whereas git didn't need to. There is a non-zero cost to all additional code, so leaning on the shell to do work is generally a smart move.
I believe the goal is to get a diff, not a list of commits, in which case you need to figure out an expression for getting the last commit common to master and experiment so you can diff it with experiment.
I just tried that, and it seems to be the same as `git diff master experiment`. We don't want to diff the two heads. The command in Mercurial is `hg diff -r 'ancestor(master, experiment)' -r experiment`. Comrade trolor seems to have found the correct git expression.
That works. Can you explain why? Does diff see this as a single commit or as set of commits? If it sees it as a set, how does it decide what two commits to diff from that set? If it's just one, what does it decide to diff?
I'm kind of confused because gitrevisions(7) says the triple dot is symmetric difference, but exchanging master and experiment does not produce the same output from diff.
Not so helpfully it has a different meaning in 'git diff' than in 'git log'. Basically, it means the difference in the second branch from the first common ancestor of the two branches.
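To spell out the difference, here is a toy model in Python of what the two commands do with the triple dot. It is only a sketch of the semantics as described in this thread, not how git is implemented: the commit graph is invented, "M2" stands in for the tip of master and "E2" for the tip of experiment, and the merge base is assumed to be unique.

```python
# Toy commit graph: commit -> list of parents. All names are made up.
PARENTS = {
    "A": [],
    "B": ["A"],
    "M1": ["B"],   # master continues from B
    "M2": ["M1"],  # tip of master
    "E1": ["B"],   # experiment branches off at B
    "E2": ["E1"],  # tip of experiment
}


def ancestors(commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def merge_base(a, b):
    """A common ancestor that is not an ancestor of any other common ancestor."""
    common = ancestors(a) & ancestors(b)
    bases = [
        c for c in common
        if not any(c != other and c in ancestors(other) for other in common)
    ]
    return sorted(bases)[0]  # unique in this toy graph


def log_triple_dot(a, b):
    """Model of `git log a...b`: commits reachable from exactly one side."""
    return ancestors(a) ^ ancestors(b)


def diff_triple_dot(a, b):
    """Model of `git diff a...b`: the two endpoints that actually get compared."""
    return (merge_base(a, b), b)
```

In this model log_triple_dot gives the same set whichever way round the arguments go, which matches the "symmetric difference" wording in gitrevisions(7). diff_triple_dot always compares the merge base against the second argument, so swapping master and experiment changes the right-hand endpoint and therefore the output.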
That makes it even more interesting, I mean the fact that this or something similar didn't get traction. I often have an idea of what I would like to know about my repo, but don't want to start hacking the answer together.
Sorry, I didn't notice the date. I just found it because I needed something like it and was about to start building it, but figured I'd better check whether someone else had already done the work for me.
Very cool. I always wanted to play around with a git provider for powershell. Powershell's syntax is great for queries and you could use everything that works on the normal file system with anything that has the abstractions implemented.
The syntax seems close enough that this could just replace it, though:
ls commits | where date -lt (Get-Date).AddDays(-4) | where message -like *foo* | select author, message, date -First 3 | ft
Imagine if we could just have this automatically for every program that generated text output. It doesn't seem beyond the realms of possibility that every tool could either a) structure its text output in a way that can guarantee simple command-piping to a general purpose query-language processing tool or b) in the presence of a "--output-json" flag, produce json which can then easily be queried.
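Option (b) is easy to picture. Suppose a tool really did emit one JSON object per line under some "--output-json" flag (hypothetical, as are the field names and sample records below). A general-purpose filter then needs no tool-specific parsing at all. A minimal Python sketch of roughly the query from the PowerShell one-liner:

```python
import json
from datetime import datetime, timedelta

# Hypothetical output of a tool run with an imagined --output-json flag.
SAMPLE = "\n".join(
    json.dumps(record)
    for record in [
        {"author": "bob", "message": "foo: fix billing", "date": "2024-01-01T10:00:00", "hash": "h1"},
        {"author": "alice", "message": "docs", "date": "2024-01-02T10:00:00", "hash": "h2"},
        {"author": "bob", "message": "more foo", "date": "2024-01-09T10:00:00", "hash": "h3"},
    ]
)


def query(json_lines, now, min_age_days, needle, limit):
    """Keep records older than `min_age_days` whose message contains `needle`.

    Returns at most `limit` records, projected to author, message and date.
    """
    cutoff = now - timedelta(days=min_age_days)
    selected = []
    for line in json_lines.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        when = datetime.fromisoformat(record["date"])
        if when < cutoff and needle in record["message"]:
            selected.append({key: record[key] for key in ("author", "message", "date")})
            if len(selected) == limit:
                break
    return selected
```

Calling query(SAMPLE, datetime(2024, 1, 10), 4, "foo", 3) keeps only the records that are more than four days old and mention "foo". The same function would work unchanged on any other tool that emitted records with those field names.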
It's the UNIX principle/tradition that line oriented text is the universal format. It's quite flexible. But I share your feeling, and I keep hearing nice things about powershell.
Wow. It's awesome. I have never seen a project like this before. It seems very useful. Anyway, I think it would be better if they demonstrated the example usage with asciinema [1].
If you actually want to query git data in production, it's really a better idea to copy all the data into a real SQL data warehouse. If you're using github, my company (Fivetran.com) has a connector that pulls from their API.
I would assume that the whole point of the project is to be able to do SQLish queries on a git repo, and that it was written by and for people who are familiar with SQL and have a preference for it over other query languages. And they probably do enjoy writing SQL statements on the command line, however uncommon that may or may not be.
You flagged the story because you don't like SQL on the CLI? Come on. Whether or not you enjoy writing SQL on the CLI, SQL is a fine language for querying data, and is probably more common than xpath. JQuery seems like a strange choice too.
I just fail to praise the attitude. "Let's faithfully reproduce the looks of 30 y.o. technology, pseudographic tables included, with make-believe over git".
Here's an article where the author exposes redis as a table within sqlite: http://charlesleifer.com/blog/extending-sqlite-with-python/