BenchExec "uses the cgroups feature of the Linux kernel to correctly handle groups of processes and uses Linux user namespaces to create a container that restricts interference of [each program] with the benchmarking host."
Certainly better, but you’re always going to be better off lengthening the benchmark's runtime to the point where it swamps the other effects. Then do multiple runs and take the average.
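As a rough sketch of what I mean by multiple runs plus an average (plain Python, not BenchExec; `workload`, `bench`, the input size, and the run count are all made-up names and values for illustration):

```python
import statistics
import time


def workload(n):
    # Hypothetical stand-in for the code being benchmarked.
    return sum(i * i for i in range(n))


def bench(func, arg, runs=5):
    """Call func(arg) `runs` times and return each run's wall-clock seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        func(arg)
        times.append(time.perf_counter() - start)
    return times


# Pick an input large enough that a single run is long compared to timer
# resolution and scheduler noise; 200_000 is an arbitrary example value.
times = bench(workload, 200_000, runs=5)
mean = statistics.mean(times)
stdev = statistics.stdev(times)
print(f"mean {mean:.4f}s, stdev {stdev:.4f}s over {len(times)} runs")
```

If the standard deviation is still a large fraction of the mean, the runs are probably too short or the machine too noisy, and bumping the input size is the first thing to try.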
https://github.com/sosy-lab/benchexec