Would you (do you?) trust your company's source code to github?
21 points by turbinemonkey on Nov 11, 2009 | 38 comments
We've always hosted our own SCM repos...it's not hard, so why not? Well, my company is now (partially due to me) putting more and more stuff "in the cloud" (AWS, et al.), and the git repos are next. It makes total sense from a resources perspective to use github, but I (and the other partners) are concerned about theft, loss, and/or leakage of our proprietary (gasp!) goodies. Are we (overly) paranoid, or is github actually no less risky than the disgruntled employee problem, the general hosting failure problem (e.g. if we were to just put our gitorious on a rackspace slice or something), or the problem of our in-house backups failing?



Hell yes I would.

My current boss would rather die and take us all with him before letting code out of our network. But he is actually slowly killing me and the other devs each time the svn webserver takes a dive, by having me maintain the ACLs, and by preventing us from using git until we build out our own infrastructure.

We're not in the business of SCM. If I were in charge, I'd pay the experts to do SCM, especially the ones like github that make tools that make developers very happy. Furthermore, I have more faith in github's security team and model than in the network and servers that the junior sysadmin who was let go 6 months ago put together.

As far as protecting intellectual property...

I know it seems like the world to a software company or a developer, but your raw code is actually worthless. Your team, and how they use, integrate, improve and sell the code, is where the value is. Not `server.py`.

Any employee can walk out any day with a copy of the repo and knowledge of how it can be put to use. But the chance of him putting this to work for himself, putting you out of business, is practically zero.

In short, I would do what's easiest for everybody and relax.


It's not completely accurate to say source code is useless. While I agree that it's not particularly useful to steal code with the aim of replicating functionality, security is another thing. All it takes is a few lines of rogue code slipped into your repo to, say, log everyone's personal info and send it to Estonia. (no offense to any Estonians on the board)


Oh, great call. The possibility of an unauthorized person injecting code could be disastrous.

Presumably, if you're security conscious, someone reviews all patches before they make it anywhere near shipping. But obviously that is not foolproof.

But then, what would happen if, say, someone went through the backdoor on github, patched a binary, and modified the commit log to cover his tracks?

Hopefully git would fail loudly when you pull?


I have to think so -- you can patch binaries and modify commit logs all you want, but patches are still being applied in sequence, locally, when you pull. If the hashes don't match, boom.

But then, can those hashes be swapped out? We need hashes on the hashes! :-P
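
To illustrate the hash point above, here is a minimal sketch, assuming a toy model of history. The blob id calculation matches what git does (SHA-1 over a short header plus the content), but `toy_commit_id` and `toy_history` are hypothetical helpers: a real commit also records a tree, author, committer, and dates. What the sketch shows is that each commit id covers its parent's id, so the "hashes on the hashes" are already there.

```python
import hashlib


def git_object_id(kind, body):
    """Return the SHA-1 id git gives an object: sha1("<kind> <size>\\0" + body)."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def toy_commit_id(content_id, parent_id, message):
    """Hypothetical, simplified commit: the id covers content and parent id."""
    lines = [f"tree {content_id}"]
    if parent_id:
        lines.append(f"parent {parent_id}")
    body = ("\n".join(lines) + "\n\n" + message + "\n").encode()
    return git_object_id("commit", body)


def toy_history(versions):
    """Build a chain of toy commit ids, one per file version."""
    ids, parent = [], ""
    for n, content in enumerate(versions):
        blob_id = git_object_id("blob", content)  # stands in for a tree id
        parent = toy_commit_id(blob_id, parent, f"change {n}")
        ids.append(parent)
    return ids


honest = toy_history([b"v1\n", b"v2\n", b"v3\n"])
tampered = toy_history([b"v1\n", b"v2 plus backdoor\n", b"v3\n"])

# The untouched first commit keeps its id; everything from the altered
# commit onward gets a new id, even where the file content is identical.
for good, bad in zip(honest, tampered):
    print(good[:10], bad[:10], "same" if good == bad else "DIFFERENT")
```

Swapping a hash to hide the change would alter the id of every later commit, which would no longer match the ids already sitting in everyone's local clones.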


This sounds like an accurate assessment from someone who doesn't own the code. You get paid to write the code, so it means nothing to you if a competitor steals the code and starts a similar business. Or what if a competitor steals the code and can more easily implement some of your features, or find flaws in the product and exploit them?

As an owner of code, I would never put the code in the cloud. I don't even put binaries in the cloud without obfuscating them. Every little bit helps.

And I also disagree that the code is worthless. Imagine the value of your company if all the code suddenly disappears! Not only do you have to rewrite everything from scratch, but you can't support your existing customers while you are doing it.

Final point, doing what is easy for everyone else is exactly the kind of thing that limits your competitive advantage. If all your competitors are using Github and github loses all their data, you win!


Like I said, my boss has the same attitude as you and I understand it and comply with it. It's definitely not a bad rule for a lot of businesses.

But I'm not sure what "owning code" means these days.

Almost all the software I get paid to write is based on open source software. I assume competitors are constantly looking at the same OSS projects I am, and do know about the features (and flaws) within.

Perhaps this puts us at a competitive disadvantage, but if we had to write everything from scratch in secret so we could "own it" and make sure nobody ever saw it, we wouldn't have a product yet. Actually we wouldn't be in business at all.

Also, there's almost zero possibility with git of not having a recent copy of the code somewhere, whether github is accessible or not, as a few other posts have noted.


As part-owner of the code in question, this is what one little guy on my shoulders is saying, very loudly.

Taking the other side for a moment: Really, no code in hosted environments (which is what I presume you meant by "the cloud")? In a production environment, user data is way more important than deployed code (compromise that and you may be looking at jail time in some jurisdictions, never mind ruinous consequences to the business' reputation)...is that encrypted before it hits the disk or something? Or, do you think that any code or data not stored on machines located on premises is tempting fate?


Nice run-down. I'm pretty relaxed, BTW, but making sure i's are dotted. :-)


I'm guessing github:fi is too expensive for your boss. How much would he be willing to pay? Is the per user/repo a factor?


Would you (do you?) trust your company's source code to Rackspace/Slicehost/Linode?

They're equally likely to suffer the kind of failure that would directly expose your code.


But far, far less likely to be the target of anyone looking to acquire an absolute ass-load of proprietary source code (and github is probably the largest concentration of it today).

(Just trying to continue to run the skeptic's argument, here. I agree with the point quite a bit.)


Maybe you should hope that some rich company does use the exposed source code, and then sue them for all they are worth. Make sure there is a copyright notice so the legal situation of the code is clear.

Just because you can play a CD does not mean you can make a fortune on covers without permission.


No. I obfuscate it all before I upload it. Call me paranoid. Not only is it a risk to have the code compromised, but if sufficient steps aren't taken to control access to trade secrets, then they can't be called trade secrets. IP has value.


I've never heard of someone losing trade secret protections because the security of their hosting provider was compromised. Have you?


No, but it is necessary to show that every effort has been made to protect the trade secrets and obfuscating code is one step in that direction.

I am no lawyer, but that's what the lawyers told me.


Fortunately, reliability isn't that much of an issue for distributed source control. On the few occasions when GitHub has had an extended outage, we just put up a temporary shared repository on a random server and everyone pushed/pulled from that until the outage was over.

For security, it depends on just how much security you think you need for your source code. What's the attack model? Do you have competitors who have so much to gain from reading your code that they'd risk industrial espionage? Is there sensitive data checked into your source control that would put you at risk if there were an accidental leak?

Personally, I don't think anyone has much to gain from reading my company's source code. GitHub has much more to lose from a privacy breach than most individual customers, so they have the best incentive to secure their systems. They also probably know more than I do about keeping the repositories secure. On the other hand, if I had an exceptional need for security, I'd want to hire an expert myself and keep full control over the servers and processes.


I wouldn't say we have an exceptional need for security, but we do have reservations about dropping the only thing that has any real value in our company right now into what looks like a helluva honey pot.

I'm not sure I can even properly enumerate the risks -- if I could, I'd be able to make a calculation pretty easily. Espionage seems absurd, but who am I to say that that's not a possibility?

That said, we're getting by by cutting back on our extraneous costs, which means exactly the opposite of "hire someone ourselves and keep full control".


Definitely seems overly paranoid to me. The real value that your company has is in your brains. The things you've learned about your customers can never be fully captured in source code. Especially considering there's no real potential for loss of the code, only exposure, I'd say the tradeoffs are worth it.


Your people and their knowledge have real value. The code alone has limited value to anyone else, without the associated expertise. And if it does leak, normal legal protections can mitigate the damage. (For example, the threat of a copyright or trade secret lawsuit may be enough to keep competitors from using or even looking at your code without permission, depending on who they are.) On the other hand, accidental breaches do happen (whether outsourced or self-hosted), so you should probably keep your secret keys and passwords even more protected than your source repository.
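
One common way to keep keys and passwords out of the repository is to read them from the deployment environment at runtime. A minimal sketch, assuming a hypothetical helper `load_secret` and a made-up variable name `DB_PASSWORD` (neither comes from this thread):

```python
import os


def load_secret(name, environ=None):
    """Fetch a secret from the environment instead of from source control.

    `environ` defaults to os.environ; passing a dict makes this testable.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    if not value:
        # Fail loudly rather than fall back to a default baked into the code.
        raise RuntimeError(f"missing secret {name}; set it in the deployment environment")
    return value


# Example with an explicit mapping standing in for the real environment.
password = load_secret("DB_PASSWORD", {"DB_PASSWORD": "example-only"})
```

With this arrangement a leaked repository exposes the code but not the credentials, which is the separation the comment above is recommending.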


Nobody cares about your source code - most people struggle to understand their own.


We recently migrated from self-hosted SVN repositories to GitHub. It's proven quite beneficial in terms of allowing access to developers at off-site locations. With GitHub's use of SSH keys for authentication I haven't had to worry much about security, and as far as disgruntled employees go it's easy to revoke access to the repository (i.e. set it up so everyone has their own GitHub logins and then you can add/remove access as needed).


Yes; github is a great service. I have been using it personally & professionally for quite a long time (since June 2008) and have never had any security issues.

There are the occasional downtimes, but as it's a DVCS that's not really a problem: just push using ssh to your sites instead (or wait 5 minutes, make a cup of tea, etc.)

Like other posters in the thread I agree the real value is the people writing the code, not the code itself.

I would say if you're that paranoid about code leaks, set up a gitorious server (http://gitorious.org/gitorious) on your network and save the monthly fee (and the worry).

You will, however, miss out on the other useful things github has to offer. Every repo now gets its own wiki & issue tracker, so no need for trac.

My current favourite feature is gh-pages (http://pages.github.com). You can make a named branch on your repo and it will be hosted as a website. If you have an API, that can be your public documentation branch; you can also use it for simple sites (my blog is a github page). All pages are hosted free of charge on their shiny new rackspace servers. These days they also support CNAME mapping.

Edited for typos...


never had any security issues

You don't know for sure that you haven't had any security problems. All you know is that you haven't noticed the results of any security compromise.


Do you actually intend to use other github features? (apart from git hosting itself) If not, then what exactly are you trying to achieve by putting your code there?

- as you said git is relatively easy to maintain locally (at least compared to the standard customer-facing stuff)

- there's no difference in reliability / backup security really imho - unless you trust your local hardware less than "some random host in the cloud"

- not sure what your code does / is, but if you don't publish it, you don't even have to think about information leak (apart from standard host security)

- does the hardware cost play that big role for you? you didn't mention any other gain

- tbh, I'd be more worried about someone discovering that you host your public services' code online and starting to look for security holes just for fun - normally you don't need to think about the security of some one-off utility that you commit, but when it's online, it can tell more about your internal arch. than you want to show

So if you have a dedicated host for git and want to get rid of it, sure - even the private hosting will be cheaper. But is it more important than the other issues?


User management, SCM visualization, and forking are all easier compared to gitosis + gitweb et al....which we live without, but would be nice to have.

Our real aim is to eliminate all in-house hardware. Sysadmin is definitely not our core competency, as they say.


doesn't github provide a private, off-site option for this reason?

edit: http://fi.github.com/


Yeah, which is outrageously, painfully expensive:

http://fi.github.com/pricing.html

I'm not trying to have my cake and eat it too -- I recognize that there's a different risk profile to outsourcing hosting of any service compared with doing everything in-house. I just want to make sure I'm not veering too far off the tracks in this case.


How much would you be willing to pay? How about $3600.00? Would that be too much? Is it the per user cost of fi that is too much or just the overall cost? What if the $3600 included as many users/repos as you wanted (given your hd space) and 1 year of free upgrades.

How about $999.99 (same as Adobe Photoshop so it should only require middle management approval)? What if you got a box like the yellow google search box (it could be called the "premium version")?

An unknown factor in all this is how many companies would even be interested in buying a github server in the first place. If it isn't that many, the income might be too low to sustain development. I am betting that for the GitHub guys it makes a ton of sense to sell the small private accounts on github (and only manage/fix 1 github version), and to sell the big guys a big package.


Fundamentally, we don't want any hardware on-premise at all. What we really want is some kind of real statement from the github guys that speaks to all of the issues raised here (encryption, theft, malicious injection, auditing, the "honeypot"/juicy target problem, etc?), as I suggest to PJ below.

I'm guessing that's not going to happen, so I suppose our options are the status quo, host in a less-conspicuous location and manage our own security (as best as one can in a hosted environment), or go with the crowd and seek safety in that quasi-anonymity.


We market (and price) GitHub:FI as our enterprise product specifically because we feel small to medium size companies should be using github.com.

If you and/or your partners need some help putting your mind at ease about hosting your code with us, feel free to email me directly at pj@github.com and I'll do my best to help.


I'd much rather see a public statement on these sorts of issues. The only thing I see on the site that is even remotely relevant is a one-liner on the plans page: "We make every possible attempt to never transmit your data unencrypted."

Presumably, the amount of proprietary code you will manage will only increase over time, perhaps remarkably so. It would be somewhat reassuring if I saw something that indicated that you take this stewardship seriously, rather than tossing off "best effort" one-liners.


You're absolutely correct, we have scraps of information about our security here and there, but no formal page spelling it out. Our security page will be located at http://github.com/security, expect it up within the next day or so.


I think you are being overly paranoid, especially because if you're using git and a disaster happens with your SCM server, chances are that everyone has a recent, full-history clone of your repo(s). For older repos, you can just back up to S3 or whatever you feel comfortable with.


We've been using it for almost a year now, and are pretty ok with it. There are occasional downtimes that are annoying, but as others said, not a big deal. We are also reasonably happy with the simple issue tracker; although it misses some _really_ important features, it is ok for our needs (but damn, I'd love attachments).


I wouldn't.


Thanks for clearing things up for me. :-P


What's the problem here? You asked, I answered.


It is somewhat expected for posters on HN threads such as this to explain their response in a concise fashion, giving other readers a clear handle on the pros / cons of the situation and, in this case, helping the OP come to a decision.



