GitHub unveils its Licenses API (lwn.net)
113 points by Iuz on March 26, 2015 | 27 comments



A friend and I wrote a tool to check your repositories with this API:

http://put-a-license-on.it/
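For anyone curious what such a check might look like, here is a minimal sketch (not the code behind the site above). It assumes each repository object returned by the API carries a `license` field that is either null or an object with a `key` and, in current API versions, an `spdx_id`. The sample data and function names are made up for illustration.

```python
from typing import Any, Dict, List, Optional


def license_id(repo: Dict[str, Any]) -> Optional[str]:
    """Return the license identifier of a repository object, or None if none was detected."""
    lic = repo.get("license")  # null in the JSON when nothing was detected
    if not lic:
        return None
    # Prefer the SPDX id when present, fall back to the short key.
    return lic.get("spdx_id") or lic.get("key")


def unlicensed(repos: List[Dict[str, Any]]) -> List[str]:
    """Names of the repositories for which no license was reported."""
    return [r["name"] for r in repos if license_id(r) is None]


# Hypothetical sample data shaped like the API's repository objects.
sample = [
    {"name": "timelog", "license": {"key": "isc", "spdx_id": "ISC"}},
    {"name": "scratchpad", "license": None},
]
print(unlicensed(sample))
```

The functions only look at already-fetched JSON, so they can be tried without making any network requests.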


This is great, I was considering making something like this myself. A few suggestions:

- It seems to run out of quota very quickly; I get 403s after listing the repos of just a few users.

- The "Click here" for adding a license just plops me into a page to add a file to the repo. With some work, this could be replaced with a dropdown to choose the license, and it could send a Pull Request to add it (or ask for the user's permission to add it directly!)

- I don't see any links to the code for this, in case I want to check it out or send a Pull Request (to fix the issues listed above, for example).

But, great job so far, this should make it just a bit easier to add a license to your repos.
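On the quota point: the GitHub API reports rate-limit state in the `X-RateLimit-Remaining` and `X-RateLimit-Reset` response headers, so a client can back off instead of hitting 403s. A rough sketch, assuming the headers are available as a plain mapping (the function name is hypothetical, and a 403 can have other causes too):

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request, or 0.0 if quota remains."""
    # Header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    remaining = int(h.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    # The reset header is a Unix timestamp in seconds.
    reset_at = float(h.get("x-ratelimit-reset", "0"))
    current = time.time() if now is None else now
    return max(0.0, reset_at - current)


# Example with made-up header values: quota exhausted, reset 60 seconds away.
print(seconds_until_reset(
    {"X-RateLimit-Remaining": "0", "X-RateLimit-Reset": "1000"}, now=940.0
))
```

Authenticated requests get a much larger quota than anonymous ones, so asking the user to sign in would probably help more than any amount of waiting.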


If you type LICENSE into the filename box, Github provides a dropdown of licenses to choose from.

https://i.imgur.com/opCBKHT.png

With that in mind, it would be nice if the Put A License On It site could open to the create screen with the filename filled in with LICENSE.

This can be done using the following URL pattern: https://github.com/[user]/[repo]/new/master?filename=LICENSE
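A small sketch of building that URL, in case it helps. The function name is hypothetical, and the branch is a parameter because not every repository has a `master` branch:

```python
from urllib.parse import quote, urlencode


def new_license_url(user: str, repo: str, branch: str = "master") -> str:
    """Build the 'create new file' URL with the filename pre-filled as LICENSE."""
    path = "/".join([
        quote(user, safe=""),
        quote(repo, safe=""),
        "new",
        quote(branch, safe="/"),  # branch names may legitimately contain slashes
    ])
    return f"https://github.com/{path}?{urlencode({'filename': 'LICENSE'})}"


# "octocat" and "hello-world" are placeholder names.
print(new_license_url("octocat", "hello-world"))
```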


Thanks, cool tool! A suggestion: making the first column link to each repo's homepage would be useful.


For one of my repositories, which I created in 2013 and then pretty much forgot about without having written any code, the license file is called COPYRIGHT instead of LICENSE because back then I thought COPYRIGHT was a good name for the license file. Anyway, getting to the point:

1. While I personally have switched to naming my license files LICENSE rather than COPYRIGHT, I think others still do name their license file COPYRIGHT, so you might want to check for such a file as well (or is it github who should detect that?)

2. The link "Click here!" you provide leads me to https://github.com/erikano/pgviz/new/master where I am given a 404 by github. The reason being that I do not have a master branch in that repository but instead I have one called "devel".

3. I think you should change the link text from "Click here!" to "Put a license on it".
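Points 1 and 2 could be handled roughly like this. This is only a sketch: the list of accepted file names is my assumption, not GitHub's actual detection rule, and the helper names are made up. The `default_branch` field is part of the repository objects the API returns.

```python
import re
from typing import Any, Dict

# Assumed set of conventional names; a suffix such as ".md" or "-MIT" is allowed.
_LICENSE_NAME = re.compile(
    r"^(licen[sc]e|copying|copyright|unlicense)([.\-].*)?$", re.IGNORECASE
)


def looks_like_license(filename: str) -> bool:
    """True if the file name follows a common license-file naming convention."""
    return bool(_LICENSE_NAME.match(filename))


def branch_for_new_file(repo: Dict[str, Any]) -> str:
    """Use the repository's default branch, falling back to 'master' if absent."""
    return repo.get("default_branch") or "master"


print(looks_like_license("COPYRIGHT"))
print(branch_for_new_file({"name": "pgviz", "default_branch": "devel"}))
```

Linking to the default branch rather than a hard-coded `master` should avoid the 404 described in point 2.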

With that being said, I think the tool you wrote is nice and I also got a warm and fuzzy feeling from seeing that all the repositories it checked on my github returned "ISC" (aside from the one mentioned above and aside from the github pages repository which I have intentionally left without a license due to its special nature).

Off-topic sidenote to anyone reading this: Most of my repositories are old ideas I had but never executed on, so they are mostly empty and almost none of them have any code. During late 2014 and most of this year, I have been working actively on two projects. Neither of them is ready for use yet, but there is code. One is saas-by-erik/timelog (written in C) and the other is erikano/django-timelog (Python/Django), both licensed under ISC. So if, for whatever reason, you are going to look at anything on my GitHub accounts, please look at those two. And if you feel like it, read the code and tell me where you think there is a better way to do something. Note that for the one written in C I want to emphasize speed, whereas for the one written in Python using Django I currently do not worry about performance and will leave optimization (including the queries eventually executed by PostgreSQL through Django) until a later point in development. That being said, since I will optimize django-timelog at some point, if you notice something in my code that is obviously killing performance, I will be happy to receive issue reports telling me what you see. Sorry for this long off-topic note, but I felt I should add it; I hope nobody is too bothered by it.


It doesn't seem to do anything after it gets a 404 back from the github API on an incorrect username.


Doesn't spot that a few of my older repos have a license within the README.

Surprisingly doesn't recognise GPLv3.


This is super handy, but it only returned 8 repos out of scores for me!


Cool.. though it only seems to return 6 repositories...


It would be nice if I could choose the license in the tool and have the tool make a PR automatically.


It's actually been out for a couple of weeks, see the blog post from March 9th: https://github.com/blog/1964-open-source-license-usage-on-gi...



The quibble about GitHub not correctly handling files named COPYING is spot on. I have some LGPLv3 projects, and whenever possible I put the license into two files named COPYING (a verbatim copy of GPLv3) and COPYING.LESSER (an addendum that converts GPLv3 into LGPLv3). Because that's what the FSF recommends. I have no idea how GitHub's crawler will interpret this. Maybe it only sees the first file and thinks I'm using GPLv3?
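To illustrate why looking at only the first matching file is not enough for this layout, here is a hypothetical heuristic (not how GitHub's crawler works, which I don't know) that considers the whole set of file names before guessing:

```python
from typing import Iterable, Optional


def fsf_layout_hint(filenames: Iterable[str]) -> Optional[str]:
    """Return a rough hint based on FSF-style file names, or None if neither is present."""
    names = {n.upper() for n in filenames}
    if "COPYING.LESSER" in names:
        # The addendum modifies the GPL text in COPYING, so check for it first.
        return "LGPL (COPYING.LESSER present)"
    if "COPYING" in names:
        # COPYING alone says nothing definite; the text itself must be read.
        return "unknown (COPYING present, read its text)"
    return None


print(fsf_layout_hint(["README", "COPYING", "COPYING.LESSER"]))
```

A detector that stops at COPYING would report GPLv3 here, which is exactly the misreading described above.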

Anyway, the whole "scan the repository for anything that looks like a license" approach seems to be misguided from the beginning. What if the license is in a comment at the top of an ordinary source file, as I often do with short licenses like MIT? What if it's just a link to the FSF or opensource.org? What if it's a translation of a popular license into another language, or a link to a translation? What if the only file that contains a license is a library with a different license than what the owner intended? Approximations and second-guesses are good enough if you're just trying to pull some statistics, but open-source licenses have legal implications for everyone involved.

Just let me pick a license in the repository settings, not only for new repos but also for existing ones. And if I do so, please display my choice prominently in search results so that nobody will misunderstand my intentions. I don't really care about the API, I want to see the options in the official web interface. You are welcome to throw a warning if you detect something in the repository that seems to contradict my selection. You are welcome to suggest that I add a license file. But they should be suggestions, not prerequisites for GitHub to recognize a license in the first place. The owner, not some half-baked robot, should have ultimate authority over what the license is.

Bonus: Forks automatically inherit the license of the original repo, unless the forker explicitly picks a different one. First-time pull requesters are informed that their patches will be licensed under the same license as the repo, and by clicking "Submit", they agree.


I really like your bonus idea. But it's surprising that, after explaining how complex licensing scenarios can be, you still recommend forcing a very inflexible single-license-per-repository model. What about dual licensing? Differently licensed code in the same repository (e.g. vendored third-party code)?


Dual licensing is fine, I simply omitted it from my suggestion because it didn't occur to me at the time.

As for third-party code included in a repo, I don't think their licenses should be shown prominently on any listing. It's the job of the project maintainer to ensure that all of the licenses for third-party libraries are compatible with the main license(s) for the project, and if possible, enumerate them somewhere in the tree. Most third-party libraries probably have their own GitHub repos anyway.


I think the real reason so many lack licenses is that most projects on GitHub are just a few commits and then abandoned. They are created as a quick hack or experiment.

The popular ones with many users seem to all have a license.


I open-sourced a tool a few days ago that lets you do mass modification of git repos, including (but not limited to) adding license files: https://github.com/clever/gitbot. I've used it to add licenses to close to a hundred repositories at Clever. Would be great to see if others find it useful.


I collected the most popular repositories and their license. This post includes a CSV dataset which you can process yourself: https://npiccolotto.com/2015/03/licenses-of-popular-open-sou...


It will be interesting if they ever pursue providing a way to enforce licenses and IP.

If a user browses or pulls repository A and releases repository B, which is found to be violating A's license or IP, a log could provide evidence of culpability.


Does anybody know where the license texts come from? Copy & paste from the OSI website? In other words: Are they equivalent?

(The OSI is part of the SPDX workgroup, but that doesn't really answer the question.)


They seem to be using the same licenses they used in choosealicense.com, and those are hosted in this repo[1] and have source information embedded.

[1] https://github.com/github/choosealicense.com/tree/gh-pages/_...


The ISC licenses from choosealicense.com, the SPDX site, and the OSI site differ (very very slightly.)
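If anyone wants to check this kind of thing themselves, a word-level diff that ignores whitespace and line wrapping is enough to surface small wording differences. A sketch using Python's `difflib`; the two strings are made-up snippets for illustration, not the actual texts from those sites:

```python
import difflib
from typing import List


def word_diff(a: str, b: str) -> List[str]:
    """List the words that differ between two texts, ignoring whitespace and wrapping."""
    words_a, words_b = a.split(), b.split()
    # ndiff prefixes removed words with "- " and added words with "+ ".
    return [d for d in difflib.ndiff(words_a, words_b) if d[:2] in ("- ", "+ ")]


# Illustrative snippets only.
text_a = "Permission to use, copy, modify, and/or distribute this software"
text_b = "Permission to use, copy,\nmodify, and distribute this software"
print(word_diff(text_a, text_b))
```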



I never put a license on any of my small projects. I'm the kind of person that happily combines incompatible projects. I simply don't care, and neither does anyone else.

Once my projects gain traction, issue #1 is usually "add a license". I add whatever the reporter wants.

The bottom line is that any project with actual users has a license.


> one of the leading complaints being that it takes a lax approach to software licensing

I never understood this criticism. There's plenty of software I haven't bought a license for. I don't feel that just because somebody shares their code or archives it in public that I'm entitled to a free license.

That said, I've been approached on Github about licensing my code and I'm happy to grant one. For the most part, however, I just dump code to Github because it's a convenient way to back up, and dealing with licenses just creates friction. I'd rather know that somebody out there explicitly wants the code before dealing with it.


Of course nobody is entitled to a free license.

But if somebody posts code on a website where most of the public content is under free licenses and the TOS explicitly dictates that you grant certain licenses to other users for free, I think we can all have a reasonable expectation that the code in question will also be under a free license. And if the expectation is broken without a clear indicator, that's a recipe for confusion.


> most of the public content is under free licenses

Is it? I've read reports that all but a fraction of Github repos are single-committer code dumps.

> the TOS explicitly dictates that you grant certain licenses to other users for free

Where?

https://help.github.com/articles/github-terms-of-service/

The only stipulation I see is:

> By setting your repositories to be viewed publicly, you agree to allow others to view and fork your repositories.

Unfortunately, the TOS doesn't provide a clear legal definition of fork. Does it go beyond clicking the fork button and copying the repo across Github servers? Does it include cloning the repo to a local disk? Or running the code? Or maintaining a derivative project?



