I lead the Microsoft Open Source Programs Office team. I'm sorry this happened.
We have merged a pull request that restored the correct LICENSE file and copyright, and are in touch with the upstream author Leśny Rumcajs who emailed us this morning. We'll look to revert the entire commit that our bot made, too, since it updated the README with a boilerplate getting started guide.
The bug was caused by a bot that was designed to commit template files in new repositories. It's code that I wrote to try to prevent other problems we have had with releasing projects in the past. It's not supposed to run on forks.
I'm going to make sure that we sit down and audit all of our forked repositories and revert similar changes to any other projects.
We have a lot of process around forking, and have had to put controls in place to make sure that people are aware of that guidance. Starting a few years ago, we even "lock" forks to enforce our process. We prefer that people fork projects into their individual GitHub accounts, instead of our organization, to encourage that they participate with the upstream project. In this situation, a team got approval to fork the repository, but hasn't yet gotten started.
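To be as open as I can, I'd like to point to the bug:
- The templates we apply on new repositories live at https://github.com/microsoft/repo-templates
- The bug seems to be at this line of the new repository workflow: https://github.com/microsoft/opensource-management-portal/bl...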
- The system we have in place even tries to educate our engineers with this log message (https://github.com/microsoft/opensource-management-portal/bl...): "this.log.push({ message: `Repository ${subMessage}, template files will not be committed. Please check the LICENSE and other files to understand existing obligations.` });"
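For readers who want a concrete picture of the intended behavior (skip template files on forks, commit them on new repositories), here is a minimal sketch. It is not the portal's actual code: `RepoInfo`, `TemplateDecision`, and `decideTemplateCommit` are hypothetical names, and the only detail borrowed from the real world is that the GitHub REST API's repository object carries a boolean `fork` field. The message text is adapted from the log line quoted above.

```typescript
// Hypothetical shape; only the `fork` flag mirrors the GitHub REST API repository object.
interface RepoInfo {
  name: string;
  fork: boolean;
}

// Hypothetical result type: whether to commit templates, plus a log message.
interface TemplateDecision {
  commitTemplates: boolean;
  message: string;
}

// Decide whether template files (LICENSE, README, ...) should be committed.
// Forks already carry upstream's license, so they must be left untouched.
function decideTemplateCommit(repo: RepoInfo): TemplateDecision {
  if (repo.fork) {
    return {
      commitTemplates: false,
      message:
        `Repository ${repo.name} is a fork, template files will not be committed. ` +
        `Please check the LICENSE and other files to understand existing obligations.`,
    };
  }
  return {
    commitTemplates: true,
    message: `Repository ${repo.name} is new, template files will be committed.`,
  };
}
```

The point of returning a decision object instead of committing directly is that the check can be unit tested without touching a real repository.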
A lot of commenters are sharpening their pitchforks, but this comment, in my opinion, makes it very likely that it was an honest mistake. Amazing what taking personal responsibility and earnestly apologizing can do to restore trust and credibility!
Yeah, I think this is an example of addressing a mistake that other companies should take note of. I don’t trust Microsoft-sized corporations as a matter of principle, and I don’t typically give them the benefit of the doubt, but when one of their own engineers explains in human-readable terms what specifically happened—on a holiday, no less—I’m impressed enough to believe him. Some PR flack showing up with vague boilerplate about how Microsoft values the open-source community and they’ll look into it would only have encouraged more outrage.
I always appreciate communication that acknowledges I’m a person, not a data point or a customer. I wish more companies ditched the greasy PR approach and allowed folks like Jeff to do their talking for them.
I think a key reason that corps try to avoid admitting fault is that, in a civil lawsuit, the admission can be used as evidence of liability in some of the most slam-dunk ways.
If we remove this feature of common law legal systems, I think you will get far more admissions of fault like this one.
If you hurt people, organizations, etc for admitting their mistakes, they're going to stop doing it.
The key takeaway is that apologizing and admitting fault doesn't absolve one of liability. There are a number of "amnesty" laws on the books where admitting fault can serve to limit or reduce your sentence, especially with tax issues. I'm not sure how desirable such a thing would be in civil law among private parties, especially in cases where a tort is minor to one party but a big deal to the other because of disparate wealth.
E.g. if Microsoft burned your house down would an apology and explanation be enough to settle the matter? How could we encode this principle into law for minor things but not large things?
I don’t think accidentally removing credits from a software license is exactly on the level of burning someone’s house down.
In general, I wouldn’t say that causing someone harm should be dismissible with an apology, but in a situation where the harm seems pretty limited, easily reversible, and unintentional—and the apology seems genuine and even informative—I don’t see a particular benefit to causing hardship to someone who made a mistake. The communal reaction as it is should give Microsoft/Google/etc an idea of what the blowback would be if this sort of thing was a deliberate corporate practice.
The harm is depriving the author of their moral rights, and the name recognition from their work. Depending on how popular the library becomes this could deprive the author of substantial business opportunities.
And if this was a deliberate decision that Microsoft refused to undo, I’d be as outraged as everyone else here would be.
But it wasn’t, and they did undo it, and I found their response impressively civilized and professional. So I’m not really understanding why everyone seems to want to hold this guy accountable for all the shitty stuff Microsoft could have done, didn’t do, and apparently did in the past.
Alright... What is a proper response to an honest mistake in your book? We have links to a bug, explanation of what happened, a clear response (that they will check to see if this happened on other forks), they reversed the change...
Unsure why the bar would need to be higher. This is all reasonable given the circumstances.
Fully agreed. And yeah, when it comes to my expectations as far as how large companies respond to embarrassments, the bar is about as low as it gets. A tone-deaf non-apology that sounds like a robot wrote it is what I expect to hear, and I’m almost invariably right.
And that offends me. The idea that someone thinks I’d be convinced by such a response—that I’d find it persuasive and acceptable—is fucking insulting.
So when the guy who’s actually responsible for the mistake—and not some polished corporate drone—actually shows up in the comments section, explains what happened, and talks to me the way one of my colleagues would: yeah, that is above and beyond by most standards, certainly the ones I have for evil empires like FAAN(M)G companies, and I give major credit for that.
Agreed. I think it's mostly because the press can only really do bikeshedding, so the form of what people say is all it can comment on and amplify. As a result, we need professional spokespeople to keep up with the rules of what's okay to say, rather than more people to make better products.
How has Microsoft reformed? Windows 11 is more intrusive than ever, it resets default programs to its preferred ones more frequently, and it is so annoying to switch browsers. They are still the same old Microsoft.
> makes it very likely that it was an honest mistake.
I don’t have any doubt that it was an honest mistake. They also took accountability for the mistake, shared their steps to prevent it from happening again, and they’re in contact with the original repo author directly.
At this point, anyone digging for excuses to further demonize Microsoft isn’t interested in honest discussion about this issue. This is a textbook mistake followed by rapid resolution (on Christmas Day, no less), with great communication on top.
If you read the comments, most people believed it was an honest mistake. What we didn’t agree with was that it was an excusable mistake. Microsoft have since acknowledged it’s not excusable and thus will rectify that error. So as far as I see it, all parties, both for and against MS, should be satisfied with the outcome.
In a way I'm glad that this is not taken lightly. We all need to be informed and calm, but mega-corporations should be held to higher standards. They should feel the breath of an angry mob once in a while. I imagine that if the reaction to things like this were just "meh, it's probably just a mistake", it would not have been resolved as quickly, or it could simply have been ignored. But now that it is all cleared up, let's go home, and Merry Christmas.
Intent is a red herring, especially when the actions of an organization, rather than an individual, are being considered. More important are the actions' consequences and the organization's propensity for repeating them.
Web search engines claim fair use for snippets, and MS claims fair use for Copilot. However, a web search engine result page is only an aggregate of the excerpted pages, and it refers back to them; unlike software authored using Copilot, it does not draw those snippets together into a coherent purposeful whole, strip all reference to the original author, and claim complete originality.
That MS went ahead with training Copilot on open source code authored by people who are clearly not OK with it is why they are so short on goodwill.
Why is Copilot any different from a human doing essentially the same thing? Complete originality doesn't exist. I think the people who are complaining about Copilot have likely done the same thing themselves. I can't erase the memory of all the code I have read and written that somebody else 'owns' the copyright to, so my brain will subconsciously use copyrighted code when writing software.
Because Copilot is not a human and can do things that humans can't. Moreover, I encourage you to learn music pieces by listening to them and then sell or give away snippets. You will have big corporations on your heels fairly quickly even though you are a human.
Could you please stop breaking the site guidelines? You've unfortunately been doing it repeatedly in your HN comments. People are supposed to get the benefit of the doubt here.
What happened here was obviously a mistake. The thread is full of lurid accusations, because those are fun to write and talk about, but it shouldn't take even a minute's thought to see how dumb a heist this would have been.
The thread would have been a lot more fun if we could have spent it talking about what prompted your team to build this thingy, and bounce other people's approaches to the same problem off, and maybe share some war stories about dumb things bots have done on our behalf.
Thanks, regardless, for the information you've provided here. It's interesting.
> The thread is full of lurid accusations, because those are fun to write and talk about, but it shouldn't take even a minute's thought to see how dumb a heist this would have been.
Another good reminder that Hacker News is not above assuming the worst and gathering pitchfork mobs like any other social media.
The issue looked like a mistake from the start to anyone paying attention (committed by a bot, changes were consistent with a boilerplate LICENSE file being checked in).
If someone at Microsoft wanted to steal some code, forking it on GitHub and then publicly documenting the history of the code in the most visible way possible would truly be the dumbest way to do it.
Especially here, when the smoking gun is a change to an MIT licensed project. One of the torch-holding commenters remarked that developers at Microsoft ought to know enough about how licenses work to know what a big deal this was. Physician, heal thyself.
> Another good reminder that Hacker News is not above assuming the worst and gathering pitchfork mobs like any other social media.
I would go so far as to say almost no group is, they just have different preconceptions that encourage assumptions of malice in different ways and directions.
There is no substitute for more facts about the situation, no matter how much we'd like to assume the details that aren't given.
The obligation you're not living up to is to HN, not to Microsoft. If you want to participate here, the onus is on you not to be knee-jerk; avoiding knee-jerkism is the subtext of like 3/4 of the site guidelines.
This is veering into meta territory; however, one of the best things to come out of my HN experience is learning how to converse and discuss better. I actually strive hard not to be a knee-jerk person. I always take my time while writing something and try to back it up with actual events and/or facts.
The point I was trying to make is that for companies like Microsoft, reaching for the pitchfork is not always a knee-jerk reaction IMHO. For all the things they have done, my initial reaction is always "Oof, not again...", which is actually sad for a company of this size.
It's actually unfathomable to me that a company like Microsoft would not test these flows adequately and would allow this to happen.
For a change, I want to see a more open computing platform, a less intrusive Windows version, or a longer maintenance window for CentOS, but all we have is a brawl.
One big clue you have that the reactions here were knee-jerk is that they all turned out to be totally wrong. There's a lot of corncob dot gif happening in the aftermath.
I'm glad that my assumption was proven wrong, and by Microsoft itself, no less. However, is being wrong always the same as being knee-jerk? Doesn't the reputation of the company in question play a role here?
Or shall we be stateless, and evaluate every event without any prior experience? I don't think that holds a lot of water in the real world, either.
Another clue that it's knee-jerkism is that the accusations make no sense. It's hard to see what Microsoft plausibly stood to gain by modifying an MIT license. I suspect a lot of the accusations here are being made by people who simply don't know what an MIT license means. For that matter: even with a restrictive license like the GPL, it's hard to make sense of this as a heist, especially since it's right there in the public git log.
I don't think it's going to be possible to salvage the torch and pitchfork comments on this thread. For lack of a better way to put it: they're pretty dumb.
> It's hard to see what Microsoft plausibly stood to gain by modifying an MIT license.
Actually, not being able to find a plausible gain in five minutes doesn't automatically clear Microsoft (or any company) in my mind. Off the top of my head, I can list three technologies which I find suspicious in the long run: LSP, WSL, SecureBoot.
All in all, I just don't trust Microsoft, and I think about the worst of their actions first. They're the only company (ORACLE being a firm second) which evokes this reaction in me, and this is of their own making over the years.
If not trusting a company, assuming the worst of it, and saying so openly makes me a knee-jerk person, so be it.
As I said, I'm happy to be proven wrong, but I refuse to be stateless and to look at every action of this company with completely neutral eyes.
If you don't have anything well-considered to say about an event on HN, the best thing to do is not to say anything at all. Believe it or not, your personal distrust of a giant company isn't all that interesting to the rest of HN.
I think what he has to say is well-considered, and interesting. Meanwhile, the fact that you consider yourself qualified to speak for "the rest of HN" is... deeply fascinating.
I’m sure you meant to qualify that statement to say it is deeply interesting to you, rather than imply that your assessment of the interestingness is somehow objective or represents what other people think.
It doesn't matter how "dumb a heist this would have been". This is Microsoft and their track record with respect to illicit and/or illegitimate behavior needs to be continually scrutinized. They put themselves in this position. They have made plenty of intentionally dishonest "mistakes" over the years.
Track records matter and Microsoft doesn't have a clean one. Scrutiny of companies of their size should be the norm. Maybe they would do more to improve "mistakes" if more people held them accountable in a continuous fashion.
"Scrutiny" doesn't mean making up nonsense issues, which is what this is. There's no amount of weasel words you're going to apply that's going to make changing an MIT license in a public git repository part of a terrifying conspiracy --- or a conspiracy of any sort.
A multi-national software giant removing the original license from open source software and applying an incorrect one is a "nonsense issue"?
Hardly "weasel words" when you look at the action that took place. It wouldn't be OK for you to do it and it isn't OK for a bot that Microsoft deployed incorrectly to do either.
"The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."
I understand the license. Based on your response and Microsoft's attribution change I don't think you understand the argument. I won't patronize you however based on the non-response.
No, do go ahead, because I really don't understand any of the dudgeon being worked up in this thread, given the original license gave Microsoft the right to do virtually anything it wanted to with the software already. Obviously, Microsoft shouldn't take credit for other people's work; equally obviously, to me at least, that's not what they were trying to do, since all it took was a single GitHub link to show what had happened.
This is my favorite part of the culture change which is happening at Microsoft.
We are still working on it. And, it will take time.
Jeff's team, my team, the java team whose forked repo this unintentionally highlighted are working with dozens of other teams every week. Satya says we're all in on open source.
Hold Jeff and me and all the leaders of Microsoft to that vision. Keep us accountable. And know that this takes time. Let's make technology and humanity healthier and more sustainable in 2022.
I realize that Microsoft is composed of many pieces, but that doesn't prevent me from treating it like a single entity. And that entity has IMHO done horrible things to its own products and customers, so I'd rather they keep their hands off FLOSS as much as possible.
But if you treat it like a single entity, that means you also treat all the people working there like that entity. The janitor working at Microsoft doesn't deserve your ire, nor the secretaries, etc. But they work there, and carry around their logos on their backpacks and what not. Those people have feelings, and those feelings feel bad when the entity they work for is lashed out at. It does make one feel bad when one's company is raked over the coals, because it makes one feel responsible, even if one doesn't even know what's going on.
So, for the sake of the feelings of the innocent people who work there, I would ask everyone to please modulate the volume and intensity of ire that they project. Please speak truth to power, but don't be harsher than is needed to get your point across.
But as for some background: I am a PM for Microsoft Build of OpenJDK and from late last year to around May this year I made contributions to the gRPC_bench repo as a Microsoft employee for some experiments we have been working on, to evaluate and improve different ways of implementing gRPC exchange in Java. [1]
This fork was intended for newer experiments, one of them being about coding and running these benchmarks on GitHub Codespaces. For that, I needed the repo on an org where we, as employees, have Codespaces enabled.
The rest is HN history (back to Jeff's reply above [2]).
FYI: when linking to a line of code, simply press Y on your keyboard to have GitHub switch from the _branchname/path/to/file.xyz_ URL to the _sha1/path/to/file.xyz_ URL. The former can result in your URL pointing to unrelated code if lines are added or removed in future commits on the referenced branch.
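To illustrate what that keypress changes, here is a rough sketch of the same rewrite done by hand. `toPermalink` is a hypothetical helper, not a GitHub API; it assumes you already know the branch name and the commit SHA, and that the URL has the usual `/blob/<ref>/path` layout. The example URL and SHA in the usage note are made up.

```typescript
// Hypothetical helper: swap the branch segment of a GitHub blob URL for a commit SHA.
// The branch is passed in explicitly because branch names may contain slashes,
// which makes it impossible to parse the ref out of the URL unambiguously.
function toPermalink(url: string, branch: string, sha: string): string {
  const marker = `/blob/${branch}/`;
  const idx = url.indexOf(marker);
  if (idx === -1) {
    throw new Error(`URL does not point at branch "${branch}": ${url}`);
  }
  // Keep everything before the marker and everything after it (path and #L anchor).
  return url.slice(0, idx) + `/blob/${sha}/` + url.slice(idx + marker.length);
}
```

For example, `toPermalink("https://github.com/example/repo/blob/main/src/app.ts#L10", "main", "0123abc")` yields a URL whose line anchor keeps pointing at the same code even after `main` moves on.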
Really appreciate the prompt and transparent response, Jeff. Especially at a time when I assume you were not expecting to jump in on duty.
I hope that despite all the harsh words, you can appreciate that behind them are legitimate concerns and suspicion stemming from past bad behavior by various parts of your organization. Due to this, and Microsoft's position of power, you have a much, much lower budget for these kinds of mistakes compared to most other orgs.
Even if there was nothing intentionally malicious at play here, it would not be far-fetched for an outsider to interpret it as "implicit maliciousness through neglect".
Here's hoping that 2022 will be a year of bridging the divide and sincere alignment.
Yes, it would be far-fetched. The conspiracy theory here is self-evidently implausible. We need to stop pretending that the accusations here were made in good faith, or are anything more than wishcasting. People write about these things because it's a lot more fun for them than to write about what actually might have happened, even if what actually happened is probably a better, more lastingly valuable conversation to have.
So, I'm of the opinion that a collective (for example a company) can exhibit malicious behavior despite no malicious intention of any particular individual in it. It can be emergent, and moreso the larger the organization. Along similar lines of systemic discrimination, there does not need to exist any conscious conspiracy or malintent. For better or worse, the whole is greater than the sum of its parts.
This comment suggests that this is the case:
> I know Jeff personally and he's great. This happens all the time at Microsoft though. Teams try to do OSS themselves, haven't a clue how GitHub or licensing works (e.g. they think the CLA transfers copyright), and after a slap aside the head, I send them to Jeff for guidance and all is well.
(I mostly agree with your sentiment, though. I do get the impression that leadership is sincere in wanting to do right. It's just that it's not that black-and-white or easy. As another MS employee commented, this is something that has to take time and they need to be held accountable along the way. https://news.ycombinator.com/item?id=29684127)
If there was a plausible evil plan behind this licensing silliness, I'd shut up about it. But there isn't. The insinuations being made on this thread (still! continuously!) are embarrassing. It's not that I think Microsoft is incapable of putting together a solid heist; it's that observers here have such bad taste in heists.
Agree with you there. I see it as manifestations of fear, tribalism, and short-sighted oversimplification around complex issues. Without wanting to dive deeper into any of those topics, I believe there are similar mechanisms at play on a larger scale in COVID-conspiracy theories and the re-emergence of Western xenophobia/nationalism.
It can be hard to assume good faith when your counterpart clearly isn't. But I think it can be at least as important then. We need to come together and find common ground.
I know Jeff personally and he's great. This happens all the time at Microsoft though. Teams try to do OSS themselves, haven't a clue how GitHub or licensing works (e.g. they think the CLA transfers copyright), and after a slap aside the head, I send them to Jeff for guidance and all is well.
>I certainly understand how scandalous this looks to crowds like Hacker News.
I don't. I'm with the Microsoft employee who was pissed at how people think it's edgy to diss on Microsoft. What were the chances that Microsoft was openly doing that? Some dude who has now deleted his post said Microsoft was trying to "create a monopoly of web IDEs". These are clearly people who barely have a passing knowledge of how Microsoft works these days.
People think critical thinking means complaining endlessly. It doesn't. You can't think critically if you don't think clearly. And you can't think clearly if you're only looking for a reason to lift the pitchforks.
It took significant public outrage and press coverage before either of those were even acknowledged, a long time after.
> What were the chances that Microsoft was openly doing that?
After reading the above, is it really that edgy to be assuming the worst? If it's truly just recurring instances of different rogue employees, doesn't that speak to a systemic and/or cultural issue that needs to be addressed with additional internal safeguards and/or deterrents to prevent it from happening again?
To the credit of the relevant team here, today this was promptly addressed as soon as it got their attention. But it will take more than that to set to rest decades of precedent.
(I did not partake in the flaming and don't find it constructive or beneficial; just saying I have full understanding of the suspicion and understand that MS are still on probation)
Too late to edit or delete my comment above, but just to set it straight: I just learned that the whole lerna debacle linked above was a nothing-burger, aka fake news.
Thanks for the response. Mistakes happen, human or computer, not the end of the world, and I get it. I was just more curious to know if there is a manual process for this type of forking that was not being vetted via the bot, and to bring another example to your attention.
If you read the repo history carefully, you'll see a bot was responsible for rewriting the "LICENSE" file from an Apache license to the Microsoft (c) stamped MIT license. This human commit simply copied that same language to the file named "LICENSE.txt". It's unclear why they did that, but that human was not responsible for introducing the license text into the repo.
? Instead of a software bug, it was a human error. Is it really surprising that with a company of Microsoft's size, some employees fuck up? Likely the employee was trying to say that the contributions in this fork that were not present in upstream are covered by the new license, but failed to do so properly (by leaving the original license intact and identifying precisely which files the new license applied to and which it didn't).
Courts will take a far more generous view than you are here. If Microsoft is not profiting from the change, and fixes it promptly when pointed out, the courts will shrug at any case - no harm, no foul. It's not even clear to me that it's illegal to have the wrong license on GitHub, assuming the shipping product does not violate the correct license. As nobody has pointed to any infringing Microsoft product... What are we talking about?
Damn dude. You posted this comment on CHRISTMAS no less. Either your phone exploded or you are a VERY PASSIONATE hacker news fan. Either way ... props to you for trying your best to straighten this out immediately.
This feels like the sort of thing where straightening it out immediately, even on Christmas Day, was well worthwhile simply because it would let him enjoy his Christmas dinner without worrying about an inevitable building dramastorm.
That is not, however, a complaint - being smart enough and giving a damn enough to realise that straightening it out immediately was a really good idea is impressive and laudable in and of itself.
I saw your comment on the cups repo. Please don't feel horrible about it. An honest mistake is an honest mistake, as long as the issues are remediated. Merry Christmas
>SQL Operations Studio was built on the back of many open source projects that all use the MIT License for a reason: it's the right way to keep moving the community forward, empowering your users to do cool stuff and build useful things for the community.
>We're just asking SQL Operations Studio to use the same license that Visual Studio Code does.
Thanks for explaining. Out of curiosity, does Microsoft have any external-facing GPL-licensed projects? Are there any restrictions to using (i.e. open sourcing something developed internally or forking something from outside MS) GPL-licensed projects? Specifically, would teams be able to get approval to fork GPL repos?
Git for Windows comes to mind. Teams can absolutely get approval for any open source license; however, for a GPL project, we'd have their open source legal team work with them to brief them on the license obligations and requirements, such as publishing code to https://3rdpartysource.microsoft.com/.
Interesting, so that's specifically for GPL-licensed projects? Or am I misunderstanding and you would have dev teams work with Legal for any open source licensed project?
Copyleft has more process, since we absolutely need our engineers to understand the obligations we have, and for some of us, it may be the first time we're being introduced to open source communities and licensing, so we have to do more education in the GPL case.
Our process revolves more around _using_ open source than forking specifically.
Whenever a build runs at the company, we have a detection task that identifies the open source that is used, storing an inventory. We evaluate the open source licenses for that inventory, and have automation depending on the license that will help inform a team that has taken a new dependency with specific legal obligations - could be to get business and legal approval for something, to take training and learn about copyleft software and licensing, or that they need to post third-party buildable source. We're also able to use that inventory to help with incident response and blast radius analysis.
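As a rough illustration of the inventory-plus-automation idea described above, here is a minimal sketch. Everything in it is an assumption made for the example: the `Dependency` shape, the `Action` names, and the policy table are hypothetical and are not Microsoft's actual rules or tooling. Only the license identifiers are real SPDX IDs.

```typescript
// Hypothetical follow-up actions a team might be asked to take.
type Action = "copyleft-training" | "publish-source" | "business-legal-approval";

// Hypothetical inventory entry produced by a build-time detection task.
interface Dependency {
  name: string;
  license: string; // SPDX identifier, e.g. "MIT"
}

// Hypothetical policy table mapping licenses to required actions.
const POLICY = new Map<string, Action[]>([
  ["MIT", []],
  ["Apache-2.0", []],
  ["GPL-3.0-only", ["copyleft-training", "publish-source"]],
]);

// Licenses not in the table fall back to a human review.
function requiredActions(dep: Dependency): Action[] {
  return POLICY.get(dep.license) ?? ["business-legal-approval"];
}

// Return only the dependencies that need a team to do something.
function needsAttention(inventory: Dependency[]): Dependency[] {
  return inventory.filter((dep) => requiredActions(dep).length > 0);
}
```

Keeping the policy as data rather than code is one way to let automation handle the common cases while unknown licenses are delegated to a human.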
To scale, we need to make sure that our guidance and policies are in front of people, but we know that engineers want to get work done (or will find a way around what we have in place), and so need to be efficient and straightforward.
Not all situations will require a business or legal approval. Our motto has been "eliminate, automate, delegate" - eliminate onerous bureaucracy and policies - automate licensing compliance and inventory and approvals - and delegate to business leaders and others when there's a need for humans to be involved.
For those that don't know Polish it means Forest Rumcajs (https://en.wikipedia.org/wiki/Rumcajs), probably a joke of the author that wants to be anonymous.
I figured something like this was the cause. I must say I'm quite disappointed by all of the negative comments before anyone from MS had a chance to explain what happened.
Thank you for the full transparency. Sadly, we may hear people saying for years that Microsoft blatantly ripped off someone else's copyright, when in fact it was a bot meant to keep you guys from releasing code prematurely without a reasonable license to begin with (at least that's what it sounds like?). But that's probably okay; those people probably wouldn't use your projects for whatever reason anyway.
Thanks for clarifying. This also highlights the lack of sufficient testing and subpar processes, e.g., automation should be tested against critical paths, and changing a license should require human approval.
Agree. The lack of tests has been the biggest regret of mine for this project. It started as a hackathon project a long time ago, and as it grew up, it never got the testing investment it deserved. I imagine the code coverage is about 0.001% and there are no end-to-end tests in place.
Feels like the reverse Streisand effect at play here; a high-profile minor fuckup that helps popularize a positive thing. This was fun to meme on, but I'm glad Microsoft appears to be on an upward slant ethically. It also makes me more comfortable being a Microsoft customer.
And here I was ready to sharpen my pitchfork. Thanks for communicating the error and the correction. I know this time of year can be especially challenging.
So, you are adding licenses automatically? This seems pretty risky. Why not just prevent commits that don't have a license? Shouldn't there be a human somewhere in that loop?