If Rackspace, given their major support for various opensource projects, were to provide POWER9 runners for, say, Gitlab CI, this could be a major help in porting software. Or they could, like IBM, provide SSH access to interested projects. But the CI part is important to ensure there are no regressions, and given the scarce availability of POWER9 (or even POWER8) hardware to the general public, let alone opensource developers, Gitlab CI integration sounds like the more practical service.
IBM sponsor a fleet of POWER-based systems at the OSU Open Source Lab[1].
Edit: You already said this ("Or, they could, like IBM provide SSH access to interested projects."), and if you'll excuse me, I'm going to go hide in shame.
In my day job we've interacted with an IBM team who are porting our entire buildpacks pipeline[2][3] (which uses Concourse) to run on ppc64le. We fall under the Cloud Foundry heading in the list of projects.
The eventual goal is that we will be able to run x86 workers (on a regular commercial cloud) and some POWER workers at OSU-OSL or SoftLayer, and build both kinds of binaries from the same pipeline.
I believe the eventual eventual goal is that all Cloud Foundry pipelines and products will be fully available across both x86 and ppc64le, including first-class integration with any pipeline producing binaries. Given that buildpacks represents the bulk of the binary volume, it makes sense to ensure our entire pipeline works on ppc64le.
Disclosure: I work for Pivotal, not IBM, and I can't commit either company to anything.
What would be the course of action for an opensource project to set up a CI worker there (ideally per-commit on X branches, not periodic) such that it could be integrated into a pre-merge check? I'm not bound to Gitlab CI runners, but it's the first thing that came to mind given the popularity of github and gitlab.
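Just to make concrete what I'm picturing: a minimal .gitlab-ci.yml along these lines, assuming someone had registered a runner on a POWER9 box and given it a tag (the "power9" tag name and the build commands are placeholders, not anything that exists today):

    # Hypothetical .gitlab-ci.yml sketch: route a per-commit test job to a
    # runner registered with a "power9" tag. Tag name and commands are
    # placeholders.
    stages:
      - test

    test-ppc64le:
      stage: test
      tags:
        - power9        # send this job to the POWER9 runner
      script:
        - ./configure
        - make
        - make check
      only:
        - branches      # per-commit on branches, not on a schedule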
I honestly have no idea. I imagine access is mediated by OSU, not IBM. The contact page (http://osuosl.org/contact) seems like the place to start.
One of the tricky parts about running PRs is that you're running arbitrary code, for which the main threat is the exfiltration of secrets. You need to lock down the workers fairly tightly to avoid unintended consequences. I'd be interested in reading more about how Travis, CircleCI, Gitlab et al do it -- some light googling didn't turn up any specifics.
That CircleCI post seems to talk about different issues than the ones I have in mind when I think about the safety of running random CI jobs.
It's a non-trivial problem to solve, especially with caching of artifacts involved. You'd probably want to run a sandbox inside a vm, secure the vm itself first, and attach only ephemeral storage. Barring a container escape via just the read/write/execve allowed inside the sandbox, which could probably also be used to escape the surrounding vm, there isn't much you can do if you support running random stuff in a CI job.
Actually, maybe CI needs to be limited to tools that can run on something like ZeroVM.
Limiting persistent state and spinning up machines (vm or bare metal) for each job, while having no permanently active job runners, sounds like another defense to consider.
That said, I very much doubt any of the CI services goes to such great lengths, given the limitations involved.
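To sketch what that "fresh machine per job" setup might look like, in rough Python, where every helper name is hypothetical and stands in for whatever provisioning API you'd actually have:

    # Hypothetical sketch: one throwaway worker per CI job, no long-lived runners.
    # provision_vm / run_job / destroy_vm / issue_job_scoped_credentials /
    # revoke_credentials are placeholders, not a real API.

    def run_ci_job(job):
        # Fresh, single-use machine with only ephemeral storage attached.
        vm = provision_vm(image="ci-worker", ephemeral_disk=True)
        creds = None
        try:
            # Only short-lived, job-scoped credentials ever reach the worker.
            creds = issue_job_scoped_credentials(job, ttl_seconds=3600)
            return run_job(vm, job, env=creds)  # the arbitrary PR code runs here
        finally:
            if creds is not None:
                revoke_credentials(creds)
            destroy_vm(vm)  # nothing persists: no cache, no secrets, no reuse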
> Limiting persistent state and spinning up machines (vm or bare metal) for each job, while having no permanently active job runners, sounds like another defense to consider.
I can imagine how I'd do this with Concourse, but it'd be confusingly meta in approach -- a pipeline which builds a new pipeline with a new worker for each PR.
I still think the exfiltration threat is the worst. Any secret injected into the environment of any tested codebase is vulnerable -- especially if your logs are public.
> I still think the exfiltration threat is the worst. Any secret injected into the environment of any tested codebase is vulnerable -- especially if your logs are public.
Fair point, though instead of worrying about that, I think the real solution is to have test-only keys and also make sure logs can be shared without fear of leaking data.
We (buildpacks team) get part of the way there by ensuring that all secrets in our logs are redacted -- we actually wrote a rough-and-ready tool (concourse-filter[0]) for this purpose. It works on a whitelist principle: any environment variable value that appears on stdout or stderr is redacted unless the variable is on a whitelist[1].
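The principle itself only takes a few lines; this isn't the actual concourse-filter code, just a rough Python sketch of the whitelist idea:

    # Sketch of the whitelist principle (not the real concourse-filter
    # implementation): any environment variable's value that shows up in the
    # output is masked, unless the variable name is explicitly whitelisted.
    import os
    import sys

    WHITELIST = {"HOME", "LANG", "PATH", "TERM"}  # example whitelist entries

    def redact(line):
        for name, value in os.environ.items():
            if name in WHITELIST or not value:
                continue
            line = line.replace(value, "[redacted " + name + "]")
        return line

    for line in sys.stdin:
        sys.stdout.write(redact(line))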
You're right that in the longer run, providing per-test keys will be the safest option. It's on our radar as part of the overall "3 Rs" effort[2].