If Rackspace, given their major support for various opensource projects, were to provide POWER9 runners for, say, Gitlab CI, this could be a major help in porting software. Or they could, like IBM, provide SSH access to interested projects. But the CI part is important to ensure there are no regressions, and given the scarce availability of POWER9 (or even POWER8) hardware to the general public, let alone opensource developers, Gitlab CI integration sounds like the more practical service.
IBM sponsor a fleet of POWER-based systems at the OSU Open Source Lab[1].
Edit: You already said this ("Or, they could, like IBM provide SSH access to interested projects."), and if you'll excuse me, I'm going to go hide in shame.
In my day job we've interacted with an IBM team who are porting our entire buildpacks pipeline[2][3] (which uses Concourse) to run on ppc64le. We fall under the Cloud Foundry heading in the list of projects.
The eventual goal is that we will be able to run x86 workers (on a regular commercial cloud) and some POWER workers at OSU-OSL or SoftLayer, and build both kinds of binaries from the same pipeline.
I believe the eventual eventual goal is that all Cloud Foundry pipelines and products will be fully available across both x86 and ppc64le, including first-class integration with any pipeline producing binaries. Given that buildpacks represents the bulk of the binary volume, it makes sense to ensure our entire pipeline works on ppc64le.
Disclosure: I work for Pivotal, not IBM, and I can't commit either company to anything.
What would be the course of action for an opensource project to set up a CI worker there (ideally per-commit on X branches, not periodic) such that it could be integrated into a pre-merge check? I'm not bound to Gitlab CI runners, but it's the first thing that came to mind given the popularity of github and gitlab.
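Just to make concrete what I'm picturing: a minimal .gitlab-ci.yml along these lines, assuming someone had registered a runner on a POWER9 box and given it a tag (the "power9" tag name and the build commands are placeholders, not anything that exists today):

    # Hypothetical .gitlab-ci.yml sketch: route a per-commit test job to a
    # runner registered with a "power9" tag. Tag name and commands are
    # placeholders.
    stages:
      - test

    test-ppc64le:
      stage: test
      tags:
        - power9        # send this job to the POWER9 runner
      script:
        - ./configure
        - make
        - make check
      only:
        - branches      # per-commit on branches, not on a schedule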
I honestly have no idea. I imagine access is mediated by OSU, not IBM. The contact page (http://osuosl.org/contact) seems like the place to start.
One of the tricky parts about running PRs is that you're running arbitrary code, for which the main threat is the exfiltration of secrets. You need to lock down the workers fairly tightly to avoid unintended consequences. I'd be interested in reading more about how Travis, CircleCI, Gitlab et al do it -- some light googling didn't turn up any specifics.
That CircleCI post seems to talk about different issues than the ones I have in mind when I think about the safety of running random CI jobs.
It's a non-trivial problem to solve, especially with caching of artifacts involved. You'd probably want to run a sandbox inside a vm, secure the vm itself first, and attach only ephemeral storage. Barring a container escape via just the read/write/execve allowed inside the sandbox, which could probably also be used to escape the surrounding vm, there isn't much you can do if you support running random stuff in a CI job.
Actually, maybe CI needs to be limited to tools that can run on something like ZeroVM.
Limiting persistent state and spinning up machines (vm or bare metal) for each job, while having no permanently active job runners, sounds like another defense to consider.
That said, I very much doubt any of the CI services goes to such great lengths, given the limitations involved.
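To sketch what that "fresh machine per job" setup might look like, in rough Python, where every helper name is hypothetical and stands in for whatever provisioning API you'd actually have:

    # Hypothetical sketch: one throwaway worker per CI job, no long-lived runners.
    # provision_vm / run_job / destroy_vm / issue_job_scoped_credentials /
    # revoke_credentials are placeholders, not a real API.

    def run_ci_job(job):
        # Fresh, single-use machine with only ephemeral storage attached.
        vm = provision_vm(image="ci-worker", ephemeral_disk=True)
        creds = None
        try:
            # Only short-lived, job-scoped credentials ever reach the worker.
            creds = issue_job_scoped_credentials(job, ttl_seconds=3600)
            return run_job(vm, job, env=creds)  # the arbitrary PR code runs here
        finally:
            if creds is not None:
                revoke_credentials(creds)
            destroy_vm(vm)  # nothing persists: no cache, no secrets, no reuse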
> Limiting persistent state and spinning up machines (vm or bare metal) for each job, while having no permanently active job runners, sounds like another defense to consider.
I can imagine how I'd do this with Concourse, but it'd be confusingly meta in approach -- a pipeline which builds a new pipeline with a new worker for each PR.
I still think the exfiltration threat is the worst. Any secret injected into the environment of any tested codebase is vulnerable -- especially if your logs are public.
> I still think the exfiltration threat is the worst. Any secret injected into the environment of any tested codebase is vulnerable -- especially if your logs are public.
Fair point, though instead of worrying about that, I think the real solution is to have test-only keys and also make sure logs can be shared without fear of leaking data.
We (buildpacks team) get part of the way there by ensuring that all secrets in our logs are redacted -- we actually wrote a rough-and-ready tool (concourse-filter[0]) for this purpose. It works on a whitelist principle: any environment variable value that appears on stdout or stderr is redacted unless the variable is on a whitelist[1].
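The principle itself only takes a few lines; this isn't the actual concourse-filter code, just a rough Python sketch of the whitelist idea:

    # Sketch of the whitelist principle (not the real concourse-filter
    # implementation): any environment variable's value that shows up in the
    # output is masked, unless the variable name is explicitly whitelisted.
    import os
    import sys

    WHITELIST = {"HOME", "LANG", "PATH", "TERM"}  # example whitelist entries

    def redact(line):
        for name, value in os.environ.items():
            if name in WHITELIST or not value:
                continue
            line = line.replace(value, "[redacted " + name + "]")
        return line

    for line in sys.stdin:
        sys.stdout.write(redact(line))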
You're right that in the longer run, providing per-test keys will be the safest option. It's on our radar as part of the overall "3 Rs" effort[2].