That's fine if you actually want to fork, but it's pretty awful if your real goal is to immediately exec. What I would suggest in that case (with an MMU) is a limit on how many pages the new process can dirty before it execs. Then you only need enough memory to cover that limit, and it shouldn't be too hard to statically prove the limit is sufficient. (The parent process would be suspended in the meantime, so it can't cause any pages to be duplicated.)
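A minimal user-space sketch of the bookkeeping this would need (all names here are hypothetical; in a real kernel this would live in the copy-on-write fault handler):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-child state: a budget of pages the not-yet-exec'd
 * child may dirty, reserved up front so COW faults can never OOM. */
struct spawn_budget {
    size_t pages_reserved;  /* pages set aside at fork time */
    size_t pages_dirtied;   /* COW copies made so far       */
};

/* Conceptually called from the COW fault handler: charge one page
 * against the budget, or fail the fault instead of overcommitting. */
int charge_dirty_page(struct spawn_budget *b)
{
    if (b->pages_dirtied >= b->pages_reserved)
        return -ENOMEM;     /* budget exhausted before exec() */
    b->pages_dirtied++;
    return 0;
}

/* exec() would release whatever part of the reservation went unused;
 * returns the number of pages given back. */
size_t release_budget(struct spawn_budget *b)
{
    size_t unused = b->pages_reserved - b->pages_dirtied;
    b->pages_reserved = 0;
    b->pages_dirtied = 0;
    return unused;
}
```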
The fork+exec pair is sort of a special, common case already. There have been attempts to implement it explicitly, because even the current fork() has turned out to be too slow or too convoluted, as witnessed by things like vfork() and posix_spawn(). We could simply have separate fork() and fork_and_exec() syscalls.
The other common case is to actually fork() a child process. Even then it would suffice to reserve enough physical pages at fork time to make sure the child can't OOM-by-write, while only copying pages when they're actually written to, so you wouldn't copy a gigabyte of memory up front when most of it is never touched.
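A toy model of that accounting (hypothetical names; a real kernel tracks this per memory zone and per address space): fork() reserves pages eagerly, and each COW fault consumes from the reservation, so the copy itself stays lazy but can never fail.

```c
#include <errno.h>
#include <stddef.h>

static size_t free_pages = 1024;  /* toy global page allocator */

struct mm {
    size_t cow_pages;  /* shared pages that may need a copy */
    size_t reserved;   /* pages set aside at fork() time    */
};

/* fork(): reserve the worst case up front, so the child can never
 * OOM later when it writes to a shared page. */
int reserve_for_fork(struct mm *child, size_t cow_pages)
{
    if (cow_pages > free_pages)
        return -ENOMEM;           /* fail at fork(), not at a write */
    free_pages -= cow_pages;
    child->cow_pages = cow_pages;
    child->reserved  = cow_pages;
    return 0;
}

/* COW write fault: the copy consumes the reservation, never fails. */
void cow_copy_page(struct mm *child)
{
    if (child->reserved > 0)
        child->reserved--;        /* this page was already paid for */
}
```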
The kernel would allow applications to allocate at most physical memory size + swap size - whatever memory is reserved for the kernel. Thus you could still use the MMU to push the least used pages into swap, and you could run more or larger programs than would fit in your RAM, but when memory runs low these programs would always fail at the point of allocation, not at a later memory access.
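This is essentially strict commit accounting, which Linux approximates with vm.overcommit_memory=2. A sketch of the check at allocation time (names hypothetical):

```c
#include <errno.h>
#include <stdint.h>

/* Toy strict commit accounting: granted allocations may never
 * exceed RAM + swap - kernel reservation, so failure happens at
 * allocation time rather than as an OOM kill on first write. */
struct commit_state {
    uint64_t ram_bytes;
    uint64_t swap_bytes;
    uint64_t kernel_reserved;
    uint64_t committed;       /* sum of all granted allocations */
};

uint64_t commit_limit(const struct commit_state *s)
{
    return s->ram_bytes + s->swap_bytes - s->kernel_reserved;
}

/* The allocation-time check: grant or refuse, never overcommit. */
int try_commit(struct commit_state *s, uint64_t bytes)
{
    if (s->committed + bytes > commit_limit(s))
        return -ENOMEM;       /* malloc() reports failure here  */
    s->committed += bytes;
    return 0;
}
```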