Slightly off-topic, but... am I the only one who is troubled by the fact that explanations of "fork and exec" almost never, ever discuss the rationale for picking this design instead of "CreateProcess()" a.k.a. "just_launch_this_exe_file()", or even acknowledge this alternative design? A student would probably expect that to launch an app, there is a system call that does exactly that: you pass it the name of the application/executable file, and it launches. Instead, there is a small dance of the process duplicating itself and then re-writing one of the copies--try not to get tangled up in the legs while doing that!
The fork+exec has many unobvious advantages over CreateProcess, but also disadvantages, and one of them is that it's, in my opinion, a completely unexpected approach: I can't think of any other system object that can only be created by first copying another already existing object and then overwriting the newly created one if needed. Files of any kind (folder/pipe/socket/etc)? Memory mappings? Signal handling? Process groups/sessions? Users/groups? The processes seem to be unique in this regard. How did people come up with this approach? That would be an interesting discussion, I think.
Instead, it's just presented as completely self-justified, with a description of "zombies" thrown in at the end to boot; and zombie processes are pretty much an accidental artifact of early UNIX design: if fork returned not only a globally visible (and therefore unstable) PID but also an fd associated with the child process, neither the wait/waitpid syscalls nor the reaping duties of PID 1 would have been necessary.
From "Operating Systems: Three Easy Pieces" chapter on "Process API" (section 5.4 "Why? Motivating The API") [1]:
... the separation of fork() and exec() is essential in building a UNIX shell,
because it lets the shell run code after the call to fork() but before the call
to exec(); this code can alter the environment of the about-to-be-run program,
and thus enables a variety of interesting features to be readily built.
...
The separation of fork() and exec() allows the shell to do a whole bunch of
useful things rather easily. For example:
prompt> wc p3.c > newfile.txt
In the example above, the output of the program wc is redirected into the output
file newfile.txt (the greater-than sign is how said redirection is indicated).
The way the shell accomplishes this task is quite simple: when the child is
created, before calling exec(), the shell closes standard output and opens the
file newfile.txt. By doing so, any output from the soon-to-be-running program wc
are sent to the file instead of the screen.
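For the curious, a minimal sketch of that pattern in C (error handling omitted; the program and file names are the ones from the book's example):

    #include <fcntl.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {                      /* child */
            close(STDOUT_FILENO);            /* free fd 1 */
            /* open() returns the lowest free fd, which is now 1 (stdout) */
            open("newfile.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);
            char *argv[] = {"wc", "p3.c", NULL};
            execvp("wc", argv);              /* wc's stdout now goes to the file */
            _exit(127);                      /* reached only if exec failed */
        }
        waitpid(pid, NULL, 0);               /* parent waits, like a shell would */
        return 0;
    }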
Running code in the target process space before the target executable is loaded is not possible with CreateProcess. Configuration of the target process has to be done by the operating system, via an ever-growing list of attributes you can pack into STARTUPINFOEX. https://docs.microsoft.com/en-us/windows/win32/api/processth...
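To illustrate, a rough sketch of what one such attribute looks like on Windows: passing a single inheritable handle to the child through STARTUPINFOEX's attribute list. The function name spawn_with_handle is made up, the handle is assumed to have been created inheritable, and error handling/cleanup of the PROCESS_INFORMATION is omitted.

    #include <windows.h>

    void spawn_with_handle(HANDLE inherit_me, wchar_t *cmdline) {
        SIZE_T size = 0;
        InitializeProcThreadAttributeList(NULL, 1, 0, &size);   /* query size */
        LPPROC_THREAD_ATTRIBUTE_LIST list = HeapAlloc(GetProcessHeap(), 0, size);
        InitializeProcThreadAttributeList(list, 1, 0, &size);
        UpdateProcThreadAttribute(list, 0, PROC_THREAD_ATTRIBUTE_HANDLE_LIST,
                                  &inherit_me, sizeof(inherit_me), NULL, NULL);

        STARTUPINFOEXW si = { .StartupInfo.cb = sizeof(si), .lpAttributeList = list };
        PROCESS_INFORMATION pi;
        CreateProcessW(NULL, cmdline, NULL, NULL, TRUE,
                       EXTENDED_STARTUPINFO_PRESENT, NULL, NULL,
                       &si.StartupInfo, &pi);

        DeleteProcThreadAttributeList(list);
        HeapFree(GetProcessHeap(), 0, list);
    }

Every per-child setting needs its own attribute like this; with fork+exec the same thing is just ordinary code in the child.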
UNIX fork/exec is also significantly faster than CreateProcess, but I'm not sure how much of that is necessary rather than contingent, or the result of an ever-growing list of "security" programs insisting on a veto. https://stackoverflow.com/questions/47845/why-is-creating-a-...
Fork+IPC architecture - effectively multithreading but where the processes don't automatically share memory and can crash separately - also requires fork(). However this has very much fallen out of fashion.
I suppose you can emulate fork by calling CreateProcess with an empty executable and CREATE_SUSPENDED, then use various debug APIs to overwrite its memory map with your own, but this is very messy.
I know, but running code in the target's process space isn't necessarily required nor does the API need to take on an amazing slew of options. That the NT API did is merely indicative of its growth, not an indication that fork is the one true way.
These things are possible with APIs like CreateProcess or posix_spawn, but they involve the API ending up with a million little customizable parameters in the attempt to simulate all of the behaviors that can trivially be implemented if you just support "let me run some code in the target process before I replace its memory"... which is exactly what "easier to implement" means here: something is often easier to implement if it is a single, universal, general primitive that forms an elegant solution to a wide set of problems, rather than a myriad collection of hacks for various use cases. As an API--particularly on systems with copy-on-write pages--fork is amazing.
> Am I the only one who is troubled by the fact that explanations of "fork and exec" almost never, ever discuss the rationale for picking this design instead of "CreateProcess()" a.k.a. "just_launch_this_exe_file()"
It comes from a hack in Unix for PDP-11 systems. This was before paged virtual memory. "Fork" worked by swapping the process out to disk. Then it duplicated the process table entry, with one entry set to the swapped-out copy and one set to the in-memory copy. This was simple to implement, and would still work if you invoked something big enough that both sides of the fork could not fit in memory at the same time. Thus, a shell could invoke a big compiler. That's the real reason.
There are lots of other ways to do it. Some other systems have "run" as a primitive. Plan 9 offers all the options - share code, share file handles, share data. Windows has more of a "run" primitive. QNX has a system where a new process starts empty but attached to a shared object, with control starting in the shared object. The shared object then does the work of loading the program to be executed, so the OS doesn't have to.
> The fork+exec has many unobvious advantages over CreateProcess,
What are the advantages you see that are not rooted in the fact that it's very difficult to create any remotely sane declarative API in a language as imperative and primitive as C? In other words, which of these advantages would still apply if you wrote your OS in, say, OCaml?
Splitting fork and exec allows you to roughly hew the process environment of a clone of the parent into shape, process-state-wise, before you sacrifice it to birth the child process, which will inherit many of these desired (and generally a few undesired) traits. So one big advantage is that it allows you to re-use an existing (and growing!) set of imperative commands that modify aspects of a running process to specify the same aspects for a child process, and that piecemeal mutation is basically the only way to express anything of any complexity in C, particularly if you don't want to break API compatibility all the time.
The downside is that you generally end up inheriting a bunch of cruft that you really didn't want to, that the imperative and piecemeal nature opens up problems with race conditions, and that there is a lot of overhead, only partially mitigated by various complex hacks (COW, various more "lightweight" fork alternatives, special flags to general-purpose system calls that are only there to control behavior upon forking, ...).
So what would the ideal API look like, in your opinion? CreateProcess, as it exists now, has the lpStartupInfo->lpAttributeList parameter, which is a huge and growing list of various flags and knobs; what would an extensible, declarative API look like in an OS written in OCaml?
> you don't want to break API compatibility all the time.
Indeed you don't, unless you're okay with forcing your users to rewrite all the libraries and/or applications every couple of years. That's true of any programming environment, imperative or not.
> various complex hacks (COW, various more "leightweight" fork alternatives
Copy-on-write is as much a hack as persistent data structures are: it's just more coarse-grained.
What I hate about fork is that it fundamentally doesn't even make sense. You fork a process with a window, what happens to that window? Does it get duplicated? Do both control the same window? It has no sensible behavior in the general case without cloning the whole machine, and even then, your network isn't going to get forked. There are limited cases where it could make sense, but the fact that that's not true in general should make it fairly obvious that we need a different spawning primitive. It boggles my mind that people teach fork as if it could be the primitive.
I'm not saying a particular implementation is ambiguous as to what it does. Obviously any implementation will do something. I'm saying that fork as a concept is ambiguous as to what it should do.
You're forking a process, not the whole system. Sorry, but I fail to see the ambiguity. If the window is in the process as some framebuffer, it gets cloned too. That may not mean you'll get two windows on your monitor, because your display engine is a HW thing, and not a process.
It's because you're thinking "it does X! it's not ambiguous!" but still missing that the question is over why it SHOULD do X instead of Y, not over whether it does X or Y. To a user, it sure as heck doesn't make sense to fork a process and still end up with one window for both of them. For lots of processes the process is its windows as far as the user is concerned.
Usually because you've had the browser open for a while and are seeing the thread as it was after the parent comment but before the sibling comment was posted. At least that's usually why it happens for me; refreshing to see whether someone else has already said what you want to say is rarely uppermost in my mind when a reply comes to mind.
Another possibility (pretty clearly not the case here) is when there are lots of replies: either you just miss it among the others, or you just can't be bothered to scroll down half a mile.
Well, we do have a library call for launching a new process, posix_spawn(3), but it's complicated mainly because there are a lot of options you may want to configure when launching a process: which file descriptors you would like to close, which you would like to share with the newly created process, what uid the new process should have, what working directory it should have, etc. It's just a large and complicated call.
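A rough sketch of what that looks like with posix_spawn(3); the file name "out.txt", fd 3, and the wc invocation are made up for illustration, and error handling is omitted:

    #include <fcntl.h>
    #include <spawn.h>
    #include <sys/wait.h>

    extern char **environ;

    int main(void) {
        posix_spawn_file_actions_t fa;
        posix_spawn_file_actions_init(&fa);
        posix_spawn_file_actions_addclose(&fa, 3);        /* don't leak this fd to the child */
        posix_spawn_file_actions_addopen(&fa, 1,          /* redirect the child's stdout */
            "out.txt", O_CREAT | O_WRONLY | O_TRUNC, 0644);

        char *argv[] = {"wc", "p3.c", NULL};
        pid_t pid;
        posix_spawnp(&pid, "wc", &fa, NULL, argv, environ);
        posix_spawn_file_actions_destroy(&fa);
        waitpid(pid, NULL, 0);
        return 0;
    }

Every kind of setup needs its own dedicated file_actions/attr knob, which is exactly why the call keeps growing.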
But with fork/exec, all of that setup becomes just normal code you run after fork but before exec. You want the child not to have a certain file descriptor? Just close it after fork. You want to drop privileges when running the child? Same thing, just setuid/setgid/setgroups after fork. You want to set up resource limits for the child? Again just setrlimit after fork.
It avoids a lot of complexity in the system call itself. (Naturally, it adds some other complexity elsewhere.)
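A sketch of the same idea with plain fork/exec; the fd number, the uid/gid of 1000, the 1 GiB limit, and the "sort data.txt" command are all made up, and a real program would check every return value:

    #include <sys/resource.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        pid_t pid = fork();
        if (pid == 0) {
            close(3);                                   /* child shouldn't see this fd */
            struct rlimit lim = { 1 << 30, 1 << 30 };   /* cap the child's address space */
            setrlimit(RLIMIT_AS, &lim);
            setgid(1000);                               /* drop privileges: group first, */
            setuid(1000);                               /* then user */
            execlp("sort", "sort", "data.txt", (char *)NULL);
            _exit(127);                                 /* reached only if exec failed */
        }
        waitpid(pid, NULL, 0);
        return 0;
    }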
Yes, I know. But you only figure it out after you've done your fair share of IPC and management of worker processes by hand. When you're a fresh student, this "fork+exec" just seems like a pretty ridiculous way to organize things: why not just launch the executable you want to launch immediately? Nope, that's at best postponed until the chapter on IPC, and at worst it's never discussed at all, so you're left puzzled, with an "okay, I guess that's how things are done, if you say so..." feeling.
Oh, and by the way: we have open()/fcntl() for opening files and then fiddling with their settings instead of one open() call with 20 arguments; we could easily have had launch() with the same parameters as execve() that would launch the new process suspended, then we could use... I dunno, even fcntl() on the process's descriptor to configure all those things and then send it SIGCONT when we're done setting it up.
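Purely hypothetical sketch of that idea; launch(), these fcntl() commands, and kill_pfd() do not exist anywhere, they're just to show the shape of the proposed API:

    /* Hypothetical "spawn suspended, then configure" design, not a real API. */
    int pfd = launch("/usr/bin/wc", argv, envp);    /* child is created suspended   */
    fcntl(pfd, PROC_SET_STDOUT, out_fd);            /* hypothetical fcntl commands  */
    fcntl(pfd, PROC_SET_RLIMIT_AS, 1 << 30);
    fcntl(pfd, PROC_SET_UID, 1000);
    kill_pfd(pfd, SIGCONT);                         /* done configuring, let it run */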
When you are a fresh student, and when fork/exec is too advanced for you, just use system(3). Not recommended for production use, but that's indeed the easiest way to launch an executable.
Do we optimize our system call design for fresh students or for professionals?
We were talking about the presentation of "fundamentals" in Computer Science courses, right? The fork/exec design is highly non-trivial and so probably deserves more discussion, with examples of alternative designs and design considerations, than a mere "that's how new processes are created, nothing special here, let's move on".
If you want to look at it as ‘fundamentals’, fork only requires the concept of a task, whereas your top-level comment's suggestion also requires (a) secondary storage, (b) organized into files, (c) that can be named.
(That's not why Unix developed with fork, though.)
(b) and (c) are not really needed: early timesharing systems managed to launch processes straight from punched card decks, IIRC. And without (a), what do you even need the second, identical task for? I guess it could be used to parallelize crunching the numbers, but that's it; and that should probably be done from the system operator layer anyway? Like, "launch 5 copies of the program on this deck, but mark one specific slot in memory differently for each, so they know their ID".
Besides fork() and spawn() there's another option, which is to create an empty process, configure it as needed, then start it. IMO this is what simple, orthogonal primitives look like. This generally isn't possible in Unix, but AFAIK it's possible in Mach.
I think fork was a nice solution back when threads weren't common: you just fork, run as many syscalls as you want to modify the running environment, and then do exec.
Unfortunately, with threads, you're now stuck with locks that are potentially held by threads that no longer exist. So you can't call any function that may potentially acquire a lock, like printf, or your program may randomly hang. This feels... inelegant.
After a fork(2) in a multithreaded process returns in the child,
the child should call only async-signal-safe functions (see
signal-safety(7)) until such time as it calls execve(2) to
execute a new program.
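A minimal sketch of how that bites, assuming stdio takes an internal lock (as glibc's does): if another thread happens to hold the lock at the instant of fork(), the child inherits a locked lock with no thread left to release it.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    static void *spin(void *arg) {
        for (;;)
            printf("tick\n");        /* repeatedly takes stdio's internal lock */
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, spin, NULL);
        pid_t pid = fork();          /* child copies the lock's state, not the thread holding it */
        if (pid == 0) {
            printf("hello from child\n");   /* may block forever on the orphaned lock */
            _exit(0);                /* only async-signal-safe calls (or exec) are safe here */
        }
        return 0;
    }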
And since any process can be multithreaded, thanks to shared libraries being able to internally do whatever the hell they want... yeah. Python, for example, in its implementation of the subprocess module, had to shift a lot of work into the parent before the fork (such as disabling the GC and allocating all of the memory for exec()'s arguments), and to add an explicit "call_setsid" argument to reduce the usage of "preexec_fn"; and even then, it still has lovely comments such as
/* We'll be calling back into Python later so we need to do this.
* This call may not be async-signal-safe but neither is calling
* back into Python. The user asked us to use hope as a strategy
* to avoid deadlock... */
Well, hope is usually not a valid thread-safety strategy, but there is really not much else the Python implementation can do.