Hacker News
US government commits to publish publicly financed software under FOSS licenses (k7r.eu)
773 points by dandelion_lover on April 3, 2016 | 104 comments



It seems like they've cut a wide swath for (probably) exempting the most interesting of the software that could have been developed in the open.

> Applicable exceptions are as follows:

> The release of the item is restricted by another statute or regulation, such as the Export Administration Regulations, the International Traffic in Arms Regulation, or the laws and regulations governing classified information;

> The release of the item would compromise national security, confidentiality, or individual privacy;

> The release of the item would create an identifiable risk to the stability, security, or integrity of the agency’s systems or personnel;

> The release of the item would compromise agency mission, programs, or operations; or

> The CIO believes it is in the national interest to exempt publicly releasing the work.

[1] https://sourcecode.cio.gov/Implementation/


The important thing is that this changes the default. Instead of the default being closed and needing to justify open-sourcing something, the default will be to open-source things and to have to justify keeping something closed. That's huge.


It pushes the incentives around as well, though. Unless there's some sort of marginal cost for each line of code they choose to keep proprietary, they'll inevitably do "the least work necessary" when they realize that some project is going to have components fitting these clauses, by boxing up the entire effort under one of these clauses.

Ideally, with some sort of force opposing propriety, the incentive would be such that the government would think it in its best interests to granularize their code into small libraries, programs, and network services, so that they could open-source as much as possible, leaving tiny "classified kernels" as wrappers for libraries or plugins for apps.

For an idea of such a force: maybe create a "propriety credit" that a government department would have to buy for each 1000 lines of code they refuse to release, whose funding goes to this effort to open-source everything else?


LOC-based metric tied to money... What could go wrong?


An overabundance of semicolons per line? A complete abandonment of python?


One-liners that would kill Perl masters by heart attacks half from envy, half from horror


I was using LOC as a hand-wave-y abstraction; substitute something like "total AST-node information-theoretic entropy measured via compressed size of language-canonicalized source" if you want to be picky.
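To make that hand-wave a bit more concrete, here is a minimal sketch of one possible reading of "compressed size of language-canonicalized source". It assumes Python source only, uses `ast.dump` as the canonical form and zlib as the compressor; the function name `canonical_size` is made up for illustration and nothing here is a proposed official metric.

```python
import ast
import zlib


def canonical_size(source: str) -> int:
    """Compressed size, in bytes, of a layout-independent form of Python source."""
    tree = ast.parse(source)
    # ast.dump leaves out line/column attributes by default, so whitespace,
    # semicolon packing, and comments do not change the canonical text.
    canonical = ast.dump(tree)
    return len(zlib.compress(canonical.encode("utf-8"), 9))


# The same program written as a one-liner and spread over several lines.
packed = "a = 1; b = 2; print(a + b)"
spread = "a = 1\nb = 2\n\n# a comment\nprint(a + b)\n"

print(canonical_size(packed) == canonical_size(spread))  # True
```

Because both spellings parse to the same tree, cramming statements onto one line buys nothing under this measure, which is the whole point of moving away from raw LOC.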


This is great thinking on this matter, much better than the blunt instrument we're looking at currently.

I still kinda think the blunt instrument is an improvement over the status quo, but it's hard to say.


That looks like PACER still falls under that.

I wonder if this would make it easier to build a replacement whose interface isn't utterly terrible. Of course, the hard part would be convincing the courts to use it. Based on a sample size of one Clerk of Court, who thinks PACER is absolutely fantastic, this will be hard. To be fair, it is state-of-the-art in terms of access to court documents in the US.


Could you imagine an API-only version of PACER where developers were free to build search, mobile, other front-ends on top of it? Sure, that's basically a high school project using a MongoDB or CouchDB back-end. But legal tech is horribly antiquated and govt-legal tech has a long way to go. It seems like the best move would be API-only so that individuals and companies could build services on top.

BTW - check out www.casetext.com for an idea of what private enterprise can do in terms of making public records useful.


Different branch, though. This policy only covers certain executive branch agencies.


Of course there are many exemptions. Did you expect them to release the F-22 source code or NSA PRISM architecture?


Operational stuff no, but GCHQ publish their graph database for example: https://github.com/GovernmentCommunicationsHeadquarters/Gaff...


Of course not. That's why encryption algorithms are always best kept proprietary. /sarcasm.


Sure it's wide, but it also seems pretty reasonable to me


My main disappointment is that I'm not sure it will be reasonably applied, as so many things could be interpreted as having national security implications when in reality they're only used with security-sensitive data.


Or alternately, every system that handles private information (which, for a government, is basically all of them) will be considered to be a security risk to open-source even though the dataset will still be private.

"Security through obscurity" will mean that nothing gets opened.


I agree with your comment, but I think it will be easier to pass this right now with some wide exemptions and potentially expand it somewhere down the line.


The first two look reasonable, but after that it goes downhill. "The release of the item would create an identifiable risk to the stability, security, or integrity of the agency’s systems or personnel" means "I don't feel like it and you can't make me."


I'm probably misreading it, but "the integrity of the agency's personnel" makes it sound like "the in-house developers wrote terrible code and people will criticize them for it".


Or the stability. Open sourcing certain code may literally lead to insanity! I think this was just poorly word-smithed and the "correct" interpretation would only apply the protection of security to personnel.


I believe the correct parse is:

                             / (the stability, security, or integrity of the agency’s systems)
  (an identifiable risk to) < (or)
                             \ (personnel)


This probably excludes pretty much everything except probably the most boring things - public information websites, etc. Anything within the DoD, for instance, will at minimum be tagged "for official use only", which already exempts it from FOIA requests, so I don't see this really opening much up at all.


DoD work can, in at least some cases, be open-sourced. We had some work with DARPA where open-sourcing was actually mandated in our contract (at our request), and that code was, for a while at least, in active use in the field. Although I guess research is a little different.

Whether or not it's feasible is entirely dependent on the project, though. Something that's ITAR/EAR (or classified) is never going to be open-sourced.


So they commit to nothing at all.


This doesn't include anything developed at DARPA, right?


Lots of DARPA sponsored projects have open source releases.


I presented on this three years ago to the state government here in Michigan. It makes far more sense for the states to do this than the Feds.

Why should there be fifty different software programs to register for a state campground? Or deal with the DMV? When the Feds pass a new law some contractor ends up writing the same software fifty different times for each individual state with small differences.

I believe the key is to have a plug-in architecture, like WordPress, that allows each state to adapt the program for their individual laws. I think the savings could be tremendous and it's entirely possible that, with multiple states collaborating, the software could be continually improved.
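A rough sketch of what such a plug-in point could look like, under heavy assumptions: the shared program defines one default rule, and each state registers its own override. Every name here (`register`, `campground_fee`, the state codes used as keys) and every dollar figure is hypothetical and only there to show the shape.

```python
from typing import Callable, Dict

# A fee rule maps a number of nights to a fee in dollars (hypothetical type).
FeeRule = Callable[[int], float]

_rules: Dict[str, FeeRule] = {}


def register(state: str) -> Callable[[FeeRule], FeeRule]:
    """Decorator that installs a state-specific rule into the shared core."""
    def decorator(rule: FeeRule) -> FeeRule:
        _rules[state] = rule
        return rule
    return decorator


def default_fee(nights: int) -> float:
    # Invented default: a flat nightly rate.
    return 20.0 * nights


def campground_fee(state: str, nights: int) -> float:
    """Core logic: use the state's plug-in if one exists, else the default."""
    return _rules.get(state, default_fee)(nights)


@register("MI")
def michigan_fee(nights: int) -> float:
    # Invented state rule: higher nightly rate plus a booking charge.
    return 25.0 * nights + 5.0


print(campground_fee("MI", 2))  # 55.0
print(campground_fee("OH", 2))  # 40.0
```

The shared core never changes when a state's rules do; only the registered function does, which is where the collaboration savings would come from.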

I failed to convince them that day but I think the tide is turning.


Intuitively makes sense until you realize that there are similarly a lot of line of business apps that are used and implemented again and again across federal agencies.

For example, every classified document has a classification level (Secret, Top Secret, with more details on it). And the functionality to roll up the classification is re-implemented by every contractor for every single system. So even within ONE agency, every system's implementation is done from scratch.

It's the perfect kind of thing that could've been an open source library (still in a classified network though), but still extremely valuable. It's something I built and recommended to my superiors ten years ago when I worked for a government contractor, but we didn't get much traction for various reasons. (Cynically, most contractors want the free money to reimplement it, so they aren't going to stop that gravy train.)

So there is a LOT of room even within each single agency, to share code that is common to their business practices.

You'll be hard pressed to find an area in any large organization (public or private) that couldn't benefit from the efficiencies of reuse.
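For readers wondering what "rolling up the classification" means, here is a deliberately simplified sketch: the overall marking of a document is the highest marking of any of its parts. It assumes only the four basic levels and ignores the extra details (caveats, compartments, and so on) that real markings carry; `Level` and `roll_up` are illustrative names, not any agency's library.

```python
from enum import IntEnum
from typing import Iterable


class Level(IntEnum):
    """Basic levels only, ordered so that a higher value is more restrictive."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def roll_up(levels: Iterable[Level]) -> Level:
    """Return the highest level among the parts; nothing at all is unclassified."""
    return max(levels, default=Level.UNCLASSIFIED)


paragraphs = [Level.UNCLASSIFIED, Level.SECRET, Level.CONFIDENTIAL]
print(roll_up(paragraphs).name)  # SECRET
```

Even this toy version shows why a shared library makes sense: the rule is the same everywhere, and only the hard parts left out here would need careful, reviewed code.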


Why is this limited to software?

Why not also publish publicly financed:
* legal agreements
* HR contracts
* management presentations, best practices, and training material
* marketing presentations and training material
* creative output such as artwork and music

so that the rest of the world can use them for free..


> creative output such as artwork and music

If I'm not mistaken, all intellectual property published by the US government is public domain by law. I may be wrong or you may be referring to something else.


Close. It's not publication by the government that is important. It's authorship.

If a work is a work prepared by an officer or employee of the United States Government as part of that person’s official duties, then it is not subject to copyright in the US.

It might be subject to copyright in other countries, though.

If the government pays for a work to be created and published, but it is not a work prepared by an officer or employee of the United States Government as part of that person’s official duties, then it will be copyrighted and the copyright owner will be whoever it would be normally. If there is an agreement between the author and the government to assign the copyright to the government, the copyright will still be valid, and owned by the government.


I suppose you're right, but isn't that simply an issue of contract negotiation? Often when a private actor hires a contractor to create IP, part of what they're buying is ownership of the IP itself. I have no idea if the US government does this or not.


Freely redistributable artifacts are one thing, but publication "in the preferred form for making modifications" is a pretty important part of FOSS.


For most documents, copy and paste into a text editor is relatively straightforward. Highly technical documents are one obvious exception.


We're talking about "artwork and music", not text.


(Next step, open source the government policy, procedure, etc.)

Laws like requiring all commits to be submitted by the author. Authors must disclose potential conflicts of interest, affiliations, etc. Submitting commits on behalf of another entity is treason; meaning if a funder wants a policy written, they post the funds, select another entity, selected entity assigns staff, staff submits policy, etc.;

All public, all real-time.


Licensing problems. For new projects this sounds great, put the code either in the public domain or they pick a license. Which one? MIT/BSD, GPL/LGPL, or something else? The former can generally be incorporated into the latter, but what if they pick something else that's incompatible? At least we can see what was done and learn from it I guess. And what if they require a particular license as part of the policy and it prevents government projects from extending existing OSS projects? That would be a shame.

The intent seems good, but the details will be everything.


GPL is discouraged in government software (speaking from DOE/NSF experience) but BSD or similar is encouraged. The path of least resistance is private/closed source, followed by open source BSD/MIT/Apache, followed by GPL/LGPL, followed by commercialization.

The reason why there is a bit of resistance with open source as a default is because funding has to be traced.


It's only the article stating (misleadingly) that the software will be Free Software; the actual draft never once mentions Free Software. So it's unlikely to be GPL/LGPL/etc.

It's also surprising to see that this question popped up only after 500+ points.


Heh. I submitted a FOIA request two weeks ago for code for a $100M system used for storing parking ticket data. They have 5 days to ask for an extension, but they haven't even responded with that.

Guess I'm going to lawyer up again..


What is the penalty for not responding in time?


For Illinois, two big things - 1. They can't reject the FOIA with "unduly burdensome" anymore, which is an incredibly common and frustrating response. 2. They can be sued for not complying, which then becomes a slow drawn out process. The upside is that it's something like 8k if the judge thinks the request was rejected/ignored intentionally or without proper followup.

Example - I did a FOIA that was rejected absurdly unfairly. Going through the attorney general took about 6 months to hear a response back, after which Chicago's lawyer used what was essentially a mistype to throw that request out. 8 months in total there. Fast forward 6 months and we have a court date, with a lot of small things in between then and now.


Why do these get rejected? Is it just people being lazy? Or people not understanding how to handle the request? Seems like a lot of work for them to go to court?


A few things -

1. Definitely a lot of laziness from the FOIA officers. Some are really great, but others not so much.

2. There's no standardization in how it's handled within the city. Departments in the same buildings have very different ideas of what should be rejected, their helpfulness, looseness of FOIA interpretation, etc.

3. They're truly not technical in the slightest. At one point I asked for a database dump, excluding sensitive columns and tables. The response was "Wikipedia defines a Data as:", so I had some fun explaining that one.

4. Fighting the process takes a very long time, so a rejection is an easy way to get someone to go away.

5. Not a reason, but still annoying.. They wait the maximum days to respond to your requests. Of the 50 or so I've submitted, I'd say 45 of them have been last minute. 25 times out of the 50, sure... but most?

It's a lot of work if it gets to that point, yep, but I've only ever needed my lawyer to file a suit. The average FOIA might take me 5 minutes to write, so it's not really that much time - especially considering what you get for it.

Here's an example one with some pretty neat data from it: https://github.com/red-bin/chitickets


That's fascinating. It sounds like mostly the requests are rejected because of lack of time / lack of skills / laziness then?


Yep. (This post got a lot longer than I was hoping. Hope you find it interesting!)

That ticket data is actually managed by IBM, so that even adds an annoying translation dynamic.

This current suit is because the city thinks it's 'unduly burdensome' to resolve city-owned phone numbers to names that were dialed by Chicago's mayor. Initially I could have understood that, but it became pretty clear pretty soon that they really didn't want me to have those records with these rejections:

Some more detail on the rejections I've been through this past year and 4 months:

-Original request asked for the mayor's records. Rejected for "no such records". Heh.

-Second was for the FOIA officer's phone records. Again, "No such records". (This was just a test to see if the mayor's phone was the only one without any records..... and it was kinda funny ;))

-Sent a request to Chicago's IT dept for their VOIP logs, which got me a file with the first 6 digits of all of non-government phones that were dialed. Government phones were completely redacted for "privacy" reasons. So, so many things wrong there.. especially since FOIA specifically disallows that sort of redaction.

-Sent a "Request For Review" to the attorney general's office, who took 6 months to agree with me and told the FOIA officer to give me the records. Chicago's lawyers then threw the VOIP request out, since they don't have VOIP. The assistant attorney general on the case and my lawyer both said (paraphrased) "They're right, but that's a total dick move and they know it."

-Submitted a cleaned up request, without "VOIP". They sent the same file from before. Same creation time and everything.

-Now we're waiting through the long court process...

About the attorney general's RFR process, though..

You can submit FOIA requests anonymously, but you can't submit RFRs anonymously! Their reasoning was that I need to have a valid signature with my name spelled out in plain text, despite ESIGN allowing pseudonyms. So - don't like that anonymous person? Just don't respond to their request and they'll either have to sue using their name or submit an RFR using their name. I fought that one for about a month and eventually submitted a FOIA request to the AG's office asking for the email chain that led to that decision. That got rejected with something like "deliberation discussions cannot be released."


That's really interesting. Sounds fairly hostile. Although I guess from the opposing party's perspective, there is no advantage to sharing information, only potential disadvantage... and it's not something you can undo.


Not sharing information gets in the way of progress and established laws. Politics really shouldn't have anything to do with it, honestly. But since it does, collectively as a group, we should be doing more besides just voting and walking on pavement ;)

I got involved with this after working with one of the old mayor candidates, Amara Enyia. The plan was to give her the parking ticket data which would be used for her campaign. The day we were supposed to meet, two of the candidates and an ex-senator effectively colluded to get her kicked out of the race by questioning her signatures. The guy who submitted it was related to Dock Walls and friends with Ricky Hendon, iirc. Her campaign manager then went to Bob Fioretti's campaign, where they invited me over, where I gave Fioretti's campaign manager and Amara's campaign manager the full data. Fioretti's manager called it "fucking golden"..... I never heard back. A bit later, he made parking tickets a big part of his campaign.

After seeing Dock Walls, Ricky Hendon and Willie Wilson colluding to get Amara out of the race (seriously, they talked about it on facebook, publicly), it seemed a little suspicious that Fioretti's team didn't bother responding, considering how much they liked it. The phone records request was just a probe to see if the collusion ran deeper. Whether it's true or not isn't the point since, 1. I don't care enough about politics. 2. It's the principle of the entire thing.

For what it's worth, Fioretti dropped out of the race and gave Rahm his support the weekend before the election. :)


I agree that not sharing is a problem. Sometimes when I describe the opposition's point of view it sounds like I am defending their position. I think in order to fight an opponent, you have to understand their point of view.

Your story is concerning, but I'm not surprised. In my opinion, the system itself filters for this sort of behavior - and it gets worse in the upper echelons of the competition because in the upper echelons it becomes less and less feasible to deviate from an "optimum approach". My assumption being that the "optimum" approach includes participation in unfair and antisocial techniques.


Sorry - didn't mean to come off as aggressive or combative.. There's just a surprising amount of hostility from those who like Rahm or think Rahm is a built-in in Chicago. "Why don't you just move?" and calling me unpatriotic were pretty interesting to hear.

The idealist in me thinks those behaviors can be fixed even though it's "turtles all the way down". Though, if enough people do enough tenaciously peaceful civics work, then hopefully things will get slightly better. By treating everything as a mystery as [0] and [1] talk about, things become manageable.

[0] http://www.smithsonianmag.com/people-places/risks-and-riddle...

[1] http://www.newyorker.com/magazine/2007/01/08/open-secrets-3




It's funny you say that because that's the one NSA did voluntarily:

http://www.informationweek.com/applications/nsa-submits-open...

https://accumulo.apache.org/

;)



Oh yeah, I did forget about that one. Page is kind of cluttered but a glance makes me think this was the tool that automates & checks hardening guides. Am I right?

And does it substitute for system monitoring/config tools that are popular or just complement them?



Hell yeah! That might be the one I was thinking of. I recall it had labeling or provenance tagging for security purposes. Both might have it, though.


Help us, spy you.


Heh, head over to Github and fork nsa/xkeyscore, set up a local instance on your network, and forward all interesting events straight to the NSA using their new API!


This has been a UK government stance for a while but the problem is that while the UK gov spends something like 8BN pa on software licenses the amount of new development is fairly low (outside of say the GDS - central cabinet office group pushing all this)

So for example, my local council wants to purchase software to help track customer feedback from dozens of sources (web sites, phone ins etc). Sounds perfect for a web era, message queue based project.

But my council is only interested in meeting its fiduciary duty in buying software that already exists, and has been proven to do the job required. Speculative development is out - even though pretty much every one of the 430+ councils, parishes, towns etc of the U.K. will want something similar in the next few years.

No one is willing to stump up the cash to start the open source alternative.

It's a tragedy of the not commons.

I think there ought to be solutions in some kind of development bonds or other financing arrangements - but it's not clear how.


> This has been a UK government stance for a while

As further confirmation, it's cemented as one of the ten design principles behind GOV.UK: https://www.gov.uk/design-principles#tenth


Yeah but still... No way to get to the majority of IT spend - even if that IT spend is wasted on licenses costing more than the cost of the capital build out


A great start. Now if only the same would be required of all code arising from federally funded research, we could massively accelerate research and development in America.


Obligatory mention of 18F (https://18f.gsa.gov) within the General Services Administration who develop federal government projects in the open (and have defaulted to open-sourcing since their inception).

https://18f.gsa.gov/2015/01/16/open-source-for-good-governme...


> The custom-build software will also be published to the general public either as public domain, or as Free Software so others can improve and reuse the software.

> However, it is encouraged to be retroactively applicable for the existing custom-build software developed by agency employees in the course of their official duties.

As much as I love FOSS, anything produced by the government — including software — is public domain, right? The government is not eligible for copyright — including FOSS licensing. While I understand that the government sometimes contracts stuff out, and those third-parties seem to enjoy a weird exception to the rule, the article seems to be including stuff developed directly by the government for the government. How do they think they have any rights to it at all?


Public domain in the USA, but automatically copyrighted in practically every other country in the world.


Several companies have been started by using software developed by people working in governments and it's only fair as it's all funded by the taxpayers

Because many governmental projects are politically sanctioned, done properly, this could become a great way to do R&D where the private market doesn't.


I hope this includes voting software.


There is an effort happening in San Francisco right now to do just this. See here, for example, for a recent March 24, 2016 hearing on the topic:

https://sfgov.legistar.com/LegislationDetail.aspx?ID=2567089...

"Hearing on San Francisco's efforts to fund, design, and implement an open source voting system, including budget costs and projected roll-out schedule; and requesting the Department of Elections and the Elections Commission to report."

It's currently up for consideration in the budget.

SF resident Brian Behlendorf recently endorsed (along with EFF, Code for San Francisco, California Common Cause, local GitHub CEO Chris Wanstrath, etc), and we're looking for more high-profile supporters to help influence the Mayor's budget decision:

https://twitter.com/cjerdonek/status/716448812181512192


I could be wrong, but in most cases the US government buys appliances from companies like Diebold and they don't develop or contract software for voting. The best that can be done is regulation to force audits of the appliances like what Nevada does with casino slots. Even that would be a huge step up.


Might be a reason to long-term require all software used be open source and for that matter make it easy for a user to confirm that the system in use real-time is the system that's open-sourced too.


How do you allow a user to confirm this in real time? Hash displayed on the screen?


Users wouldn't confirm this in real time just like users have no way of verifying in real time that their paper ballot was actually counted.

A better outcome for this type of system would be a pre-election and post-election verification of each machine (which isn't unreasonable since the number of machines is far less than the number of ballots) in which interested agents from each party/group could be present during the verification process (and use their own machine if they desire) to ensure that all the digital signatures match and there's no foul play.


Yes, that's correct, being able to download the code, hash it, and then compare it to the hash displayed real-time would be a first step towards enabling this.
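As a minimal sketch of the "download the code, hash it, compare" step, assuming the published source is a directory tree and the machine shows a hex SHA-256 digest: `tree_digest` and `matches` are made-up helper names, and the temporary directory stands in for a real source download. It only covers the comparison itself and says nothing about whether the displayed value can be trusted.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over every file under root, visited in a fixed (sorted) order."""
    files = sorted(
        (p.relative_to(root).as_posix(), p) for p in root.rglob("*") if p.is_file()
    )
    h = hashlib.sha256()
    for rel, path in files:
        name = rel.encode("utf-8")
        data = path.read_bytes()
        # Length-prefix names and contents so file boundaries are unambiguous.
        h.update(len(name).to_bytes(8, "big") + name)
        h.update(len(data).to_bytes(8, "big") + data)
    return h.hexdigest()


def matches(root: Path, displayed_hex: str) -> bool:
    """Compare the locally computed digest with the one shown on screen."""
    return hmac.compare_digest(tree_digest(root), displayed_hex.strip().lower())


# Demo with a throwaway directory standing in for the published source.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "tally.py").write_text("votes = {}\n")
    published = tree_digest(root)
    same_before = matches(root, published)
    (root / "tally.py").write_text("votes = {'rigged': 1}\n")
    same_after = matches(root, published)

print(same_before, same_after)  # True False
```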


the software can easily be modified to display the hash you are expecting


To start, something is better than nothing. Long-term, remote inbound two-part authentication combined with like measures for hardware via meta-programming would make bypassing this very, very hard for the average hacker.


perhaps use a Merkle hash to verify your vote was counted?
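For anyone unfamiliar with the idea, here is a small sketch of a Merkle inclusion proof: the election authority would publish one root hash, and a voter holding a short proof could check that their ballot record is among the leaves. This is an illustration under assumptions, not a voting protocol: the record format and function names are invented, duplicating the last node on odd levels is a simplification, and it does nothing for ballot secrecy or for checking that the tally over the leaves is right.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf(record: bytes) -> bytes:
    return _h(b"\x00" + record)  # prefix keeps leaves distinct from inner nodes


def _node(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(records):
    """Return every tree level, leaves first and the single root last."""
    if not records:
        raise ValueError("need at least one record")
    level = [_leaf(r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pair an unmatched node with itself
        level = [_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def inclusion_proof(levels, index):
    """Sibling hashes from leaf to root, each tagged with whether it sits on the left."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # this node was paired with itself
        proof.append((level[sibling], index % 2 == 1))
        index //= 2
    return proof


def verify(record: bytes, proof, root: bytes) -> bool:
    """Recompute the path to the root from one record and its proof."""
    node = _leaf(record)
    for sibling, sibling_is_left in proof:
        node = _node(sibling, node) if sibling_is_left else _node(node, sibling)
    return node == root


ballots = [b"ballot-%d" % i for i in range(5)]  # hypothetical ballot records
levels = build_levels(ballots)
root = levels[-1][0]
proof = inclusion_proof(levels, 4)

print(verify(ballots[4], proof, root))   # True
print(verify(b"ballot-X", proof, root))  # False
```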


There are actually foundations out there that do exactly that. If you're interested, OSET Foundation (http://www.osetfoundation.org/#welcome) has developed everything from ballot scanners to entry systems. It's all open source and they've been at it for about 10 years. They patented some of their systems and made the Secretary of commerce the owner of these patents, as far as I'm aware. Legislation is slowly moving towards adoption, given the potential reduction in cost.


In general the US Federal government does not operate polls. The various local governments operate elections while abiding by Federal and State laws and regulations. In practice, the highest level where software standards are promulgated is state governments, and there only in some states and with variation among them.


Unless you have a way to ensure the code actually running on voting machines is the code that was published, there is no point.


Yes there is, it significantly raises the bar for maintaining plausible deniability while enabling voting fraud.

Right now if there is a bug that creates incorrect vote totals it would be very hard to prove it was done intentionally. If the software was required to be open source, then just the fact that they were running different software than was published would be strong evidence that it was intentional.


The low tech solution tried in India is to have a paper trail on the electronic voting machines themselves.

>> ... will be fitted with a printer that will have a drop box to store the paper trail of the votes cast. This is expected to alleviate concerns — expressed earlier by the opposition led by BJP — over possible tampering of EVMs to favour a certain candidate.<<

http://www.business-standard.com/article/pti-stories/2019-ge...

http://timesofindia.indiatimes.com/india/EC-ready-with-new-a...


Wow, this is great. I hope it prevents another contractor-related debacle like healthcare dot gov.


There were a lot of things that went into that. There was a huge time crunch on the project and until the bill was signed there was only so much groundwork that could be laid because you could only make passable guesses at the final requirements.


If this actually works, wow


Floating points.... Floating points everywhere!


It exempts prior work from this clause though it says it encourages vendors to adhere to it. I have a hard time seeing that anything really interesting will be ever released because it will likely have national security exemptions in it and like taxes many companies will find a way to avoid it.


Yeah.. just about anything can be declared under the umbrella of "national security" if you reach hard enough.

For a true to life example, look at the NSA.


This is excellent as it puts something (sorta) in writing. But the government only has a limited ability to retain copyright. If there is a wholly in-house developed program, one _should_ be able to get the source under a FOIA request.


Let's start with open sourcing the effing voting machines.


And after the first data journalist wave of analysis reveals an average of 10,000 redundant projects worked on by agencies at any given time, open access will go dark again.


It is completely contrary to the notion of an open and free society that the government is permitted to spend taxed dollars developing anything meant for domestic use in secret, much less it being (more or less) the default policy.

Looking at the exemptions carved out of this rule (assuming this source is accurate), this looks more like it is codifying the secret nature of government operations than it is opening up anything substantial.

Hard to regard this as good news, although it may be another straw on the back of the government finally falling and the emergence of an actual open and peaceful world.


How exactly is it "contrary to the notion of an open and free society"? I think that's just nonsense. And you think this could only be good if it leads to the government failing and a failed government obviously leads to world peace?

I think some just let irrational hatred make them believe inane fairy tales.


The basis for transparency in government, at least in the common law tradition, is the understanding that without it, government actors will tend to act in their own interest rather than in that of the public.

Do you think that this is a reasonable belief?

If so, then it logically and obviously follows both that 1) development of products by the government (to the extent that this is even the reasonable purview of information age government at all) needs to be conducted in a profoundly public way and that 2) a failure to do this will result in a properly disaffected populace and shortened moral stature on the part of the government.

It is no fairy tale to suggest that secrecy in government is a substantial part of the formula of government failure more generally. The only assertion I'm making that might not yet be in evidence is that, amidst the information age, this aspect of good government is more important and that its effects will be realized far more quickly.


I want more State government to adopt this policy


I believe it'll be BSD licence


What percentage of US government software is currently developed as FLOSS?


One of the current issues is that government-written code is pure public domain, so any contractor (SAIC) can FOIA some code they know their DoD handlers want implemented (ViSTA, written by the VA) and charge $4B. That is a tragedy of the commons if I ever saw one.


I want the source code of government cryptography programs.


You do realize the security lies with the keys/passwords used to encrypt, and not in the program/algorithm itself?
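A small illustration of that point, with the caveat that it uses message authentication (HMAC-SHA256 from the Python standard library) as a stand-in, since the standard library ships no cipher. The algorithm is completely public; the only secret is the key, and a party with a different key cannot produce a tag that verifies. The message text and variable names are made up.

```python
import hashlib
import hmac
import secrets


def tag(key: bytes, message: bytes) -> bytes:
    """Public, well-known algorithm: HMAC-SHA256. Only the key is secret."""
    return hmac.new(key, message, hashlib.sha256).digest()


secret_key = secrets.token_bytes(32)    # known only to the legitimate parties
attacker_key = secrets.token_bytes(32)  # the attacker knows the algorithm, not the key

message = b"release build 1.0"
genuine = tag(secret_key, message)

# Verification with the right key succeeds; a tag made with another key fails.
ok = hmac.compare_digest(genuine, tag(secret_key, message))
forged = hmac.compare_digest(genuine, tag(attacker_key, message))

print(ok, forged)  # True False
```

Publishing the source of `tag` costs nothing here, which is the usual argument for why open-sourcing cryptographic code and keeping keys secret are compatible.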


Great news.


    >npm install xkeyscore

    npm ERR! 404 'xkeyscore' is not in the npm registry.
    npm ERR! 404 You should bug the author to publish it
Hm.


It probably depends on left-pad



