I believe (per the notes at the bottom of https://www.kernel.org/doc/Documentation/vm/overcommit-accou... ) that the kernel does the accounting of how much memory the new child process needs and will fail the fork() if there isn't enough. All the CoW pages should be in the "shared anonymous" category, so they get counted once per user (i.e. once for the parent process and once for the child), ensuring that the CoW copy can't fail if the fork succeeded.
As pm215 states, it doubles your memory commit. It's somewhat common for large programs/runtimes that may fork at runtime to spawn an intermediary process during startup and route runtime forks through it, avoiding the cost of CoW on memory, mappings, etc. where the CoW isn't needed or desirable; but redis has to fork the actual service process because it uses CoW to effectively snapshot memory.
Not if your goal is to make it such that OOM can only occur during allocation failure, and not during an arbitrary later write, as the OP purports to want.
It's not really wrong. For something like redis, you could potentially fork and the child gets stuck for a long time and in the meantime the whole cache in the parent is rewritten. In that case, even though the cache is a fixed size with no new allocations, every page gets touched, so the total memory used is double what it was before the fork. If you want to guarantee allocation failures rather than demand-paging failures, and you don't have enough RAM/swap to back twice the allocations, you must fail the fork.
On the other hand, if you have a pretty good idea that the child will finish persisting and exit before the cache is fully rewritten, double is too much. There's not really a mechanism for that, though. Even if you could set an optimistic multiplier for multiply-mapped CoW pages, you'd be back to demand-paging failures, although maybe it's still worthwhile.
> It's not really wrong. For something like redis, you could potentially fork and the child gets stuck for a long time and in the meantime the whole cache in the parent is rewritten.
It's wrong 99.99999% of the time, because the alternative is either "make it take double and waste half the RAM" or "lay out in-memory data in a way that allows snapshotting, throwing a bunch of performance into the trash".
A forked process would assume the memory is already allocated, but I'd guess the failure would then happen when writing to it, as when vm.overcommit_memory is set to 0 or 1.