> what do people use to run benchmarks on CI?

Typically, you purchase/rent a server that does nothing but sequentially run queued benchmarks (the size/performance of this server doesn't really matter, as long as its performance is consistent), then sends each report somewhere for hosting and processing. This could be triggered by something running in CI, and the CI job could wait for the results, if benchmarking is an important part of your workflow. Or, if your CI setup allows it, you tag one of the nodes as a "benchmarking" node which only runs jobs tagged "benchmark"; I don't think many of the hosted setups allow this, though, I've mostly seen it in self-hosted CI setups.

But CI and benchmarks really shouldn't be run on the same host.
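As a rough illustration, here's a minimal sketch (in Rust, since that's the project under discussion) of what such a dedicated runner loop could look like. The queue directory, run-suite script, and report paths are all made-up placeholders, not anyone's actual setup:

    // Minimal sketch of a dedicated benchmark runner: drain a queue
    // directory of job files, run each benchmark sequentially, and
    // write a timing report. All paths and the benchmark command are
    // hypothetical placeholders.
    use std::fs;
    use std::process::Command;
    use std::time::Instant;

    fn main() -> std::io::Result<()> {
        loop {
            // Each file in the queue names a commit to benchmark.
            for entry in fs::read_dir("/var/benchmarks/queue")? {
                let job = entry?.path();
                let commit = fs::read_to_string(&job)?.trim().to_string();

                let start = Instant::now();
                let status = Command::new("/var/benchmarks/run-suite.sh")
                    .arg(&commit)
                    .status()?;
                let elapsed = start.elapsed();

                // Record the result; a real setup would ship this off
                // to a report host instead of writing it locally.
                let line = format!("{} ok={} {:?}\n", commit, status.success(), elapsed);
                fs::write(format!("/var/benchmarks/reports/{}.txt", commit), line)?;
                fs::remove_file(&job)?;
            }
            std::thread::sleep(std::time::Duration::from_secs(30));
        }
    }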

> What does the rust project use?

It's not clear exactly where the Rust benchmark "perf-runner" is hosted, but here, at least, are the specifications of the machine: https://github.com/rust-lang/rustc-perf/blob/414230abc695bd7...

> What do other projects use?

Essentially what I described above: a dedicated machine that runs benchmarks. The Rust project seems to trigger it via GitHub comments (as I understand https://github.com/rust-lang/rustc-perf/tree/master/collecto...), others have API servers that respond to HTTP requests made from CI/chat, and others have remote GUIs that trigger the runs. I don't think there is a single solution that most projects use.
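For the API-server flavor, the trigger side can be as small as an HTTP endpoint that enqueues a commit for a runner loop like the one sketched above. Again a hypothetical sketch; the port, URL shape, and queue directory are assumptions, not any real project's API:

    // Sketch of the "API server" variant: a tiny HTTP endpoint on the
    // benchmark machine that queues a run when CI (or chat) hits it.
    use std::fs;
    use std::io::{Read, Write};
    use std::net::TcpListener;

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("0.0.0.0:8080")?;
        for stream in listener.incoming() {
            let mut stream = stream?;
            let mut buf = [0u8; 1024];
            let n = stream.read(&mut buf)?;
            let request = String::from_utf8_lossy(&buf[..n]);

            // Expect e.g. "POST /bench/<commit> HTTP/1.1"; ignore anything else.
            if let Some(commit) = request
                .lines()
                .next()
                .and_then(|line| line.strip_prefix("POST /bench/"))
                .and_then(|rest| rest.split_whitespace().next())
            {
                // Enqueue the commit for the sequential runner to pick up.
                fs::write(format!("/var/benchmarks/queue/{}", commit), commit)?;
                stream.write_all(b"HTTP/1.1 202 Accepted\r\n\r\n")?;
            } else {
                stream.write_all(b"HTTP/1.1 400 Bad Request\r\n\r\n")?;
            }
        }
        Ok(())
    }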




Do I really need dedicated hardware? How bad is a VPS? I mean, it makes sense, but has anyone measured how big the variance is on a VPS?


Dedicated hardware doesn't need to be expensive! Hetzner has dedicated servers for around 40 EUR/month, and Vultr has them for 30 EUR/month.

VPSes kind of don't make sense because of noisy neighbors, and since the noise fluctuates a lot as neighbors come and go, I don't think there is one measurement you can take that applies everywhere.

For example, you could rent a VPS at AWS and start measuring variance, and it looks fine for two months until suddenly it doesn't, because that day you got a noisy neighbor. Then you try a VPS at Google Cloud and it's noisy from day one.

You really don't know until you allocate the VPS and leave it running, but that day could always come, and benchmarking results are something you really need to be able to trust.
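If you do want to quantify it before trusting a box, one approach is to leave something like this running for a while and watch the coefficient of variation over days. The workload and sample counts here are arbitrary; it's just a sketch:

    // Rough sketch for quantifying host noise: time a fixed CPU-bound
    // workload repeatedly and report mean, stddev, and coefficient of
    // variation. Quiet dedicated hardware tends to show a small, stable
    // CoV; a noisy VPS will show larger, drifting values.
    use std::time::Instant;

    fn workload() -> u64 {
        // Fixed amount of CPU work, kept live via black_box below.
        let mut x: u64 = 0;
        for i in 0..50_000_000u64 {
            x = x.wrapping_mul(6364136223846793005).wrapping_add(i);
        }
        x
    }

    fn main() {
        let mut samples = Vec::new();
        for _ in 0..30 {
            let start = Instant::now();
            std::hint::black_box(workload());
            samples.push(start.elapsed().as_secs_f64());
        }
        let n = samples.len() as f64;
        let mean = samples.iter().sum::<f64>() / n;
        let stddev =
            (samples.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / (n - 1.0)).sqrt();
        println!(
            "mean {:.4}s  stddev {:.4}s  CoV {:.2}%",
            mean, stddev, 100.0 * stddev / mean
        );
    }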


Is there something to be said for practicing how you play? If your real-world builds are going to be on VPSes with noisy neighbors (or indeed local machines with noisy users), I'd prefer a system that was built to optimize for that over one that works fantastically when there is zero contention but falls on its face otherwise.


Different things for different purposes. Measuring how real software behaves under real production workloads in variable environments is useful, but inherently high-variance. It doesn't let you track <1% changes commit-by-commit.
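To put rough numbers on that (mine, not from any benchmark suite): a standard two-sample power calculation shows how quickly the required number of runs blows up once the noise dwarfs the effect you're hunting:

    fn main() {
        // Assumed numbers: alpha = 0.05 two-sided (z = 1.96),
        // 80% power (z = 0.84), 5% run-to-run noise, 1% effect.
        let (z_a, z_b) = (1.96_f64, 0.84_f64);
        let (sigma, delta) = (5.0_f64, 1.0_f64); // noise and effect, in percent
        // Runs per group for a two-sample z-test:
        // n = 2 * ((z_a + z_b) * sigma / delta)^2
        let n = 2.0 * ((z_a + z_b) * sigma / delta).powi(2);
        println!("~{} runs per commit just to see a 1% change", n.ceil()); // ~392
    }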

Field work vs. lab work.



