A few years ago, as a hobby project, I designed a programming language and wrote an interpreter for it in Java. (I've never released it or shared it with anyone; I was never quite happy with it and eventually moved on to other things.) As you can imagine, an interpreted language with the interpreter written in Java is going to be rather slow to start.
So, I implemented exactly the solution you describe here. I made my interpreter run as a daemon and listen for requests on a Unix domain socket (with a custom binary protocol). I then wrote a client in C, which opened that Unix domain socket and sent it a script to run. The C program also redirected stdin/stdout/stderr to and from the interpreter through the Unix domain socket; if connected to a tty, it could switch the terminal into and out of raw mode; and it notified the server when certain signals occurred, among other things. Using the interpreter this way, startup time was a lot faster, because the JVM and the interpreter were already loaded by the time a script arrived. This approach can be applied to any language which supports runtime evaluation of code.
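To make the shape of this concrete, here is a minimal sketch in Python (not the original Java/C code, which was never published). The 4-byte length-prefixed framing, the names `send_frame`, `recv_frame`, `serve_one` and `run_script`, and the use of `eval()` as a stand-in interpreter are all assumptions for illustration; the real protocol also carried stdio, tty mode and signals, which this leaves out.

```python
import os
import socket
import struct
import tempfile
import threading

HEADER = struct.Struct("!I")  # hypothetical framing: 4-byte big-endian length


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return bytes(buf)


def send_frame(sock, payload):
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_frame(sock):
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def serve_one(listener):
    """Daemon side: accept one client, evaluate its script, send the result."""
    conn, _ = listener.accept()
    with conn:
        source = recv_frame(conn).decode("utf-8")
        # eval() stands in for the real interpreter in this sketch.
        result = repr(eval(source, {"__builtins__": {}}))
        send_frame(conn, result.encode("utf-8"))


def run_script(path, source):
    """Client side: send a script to the already-running daemon."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(path)
        send_frame(client, source.encode("utf-8"))
        return recv_frame(client).decode("utf-8")


def demo():
    """Run daemon and client in one process, just to show the round trip."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "interp.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
            listener.bind(path)
            listener.listen(1)
            worker = threading.Thread(target=serve_one, args=(listener,))
            worker.start()
            reply = run_script(path, "6 * 7")
            worker.join()
    return reply
```

In a real setup the daemon would be a separate long-lived process and the client a tiny native binary, so that the only per-invocation cost is the connect and the round trip.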
A long-running background interpreter carries significant security risks, though. A process that lacks most Linux capabilities could use it to regain access to things it shouldn't have access to. A process could interpose itself and get access to data and code it shouldn't. If there are multiple users involved, one could bypass ACLs and similar. Essentially, you're removing a significant proportion of the security guarantees the OS provides around isolated processes.
Some of the concerns you raise can be addressed. For example, make the daemon and socket per-user and give the socket 600 permissions. Also, the SCM_CREDENTIALS message (on Linux) or LOCAL_PEERCRED/getpeereid (macOS/*BSD) can be used to validate that the caller has the expected UID.
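Roughly, and only for Linux, those two measures might look like the following Python sketch. The helper names (`bind_private_socket`, `enable_credentials`, `recv_with_credentials`, `is_same_user`) are made up for illustration; the socket options and constants are the standard Linux ones exposed by Python's `socket` module.

```python
import os
import socket
import struct

UCRED = struct.Struct("iII")  # Linux struct ucred: pid, uid, gid


def bind_private_socket(path):
    """Bind a listening socket whose file is created with mode 0600."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    old_umask = os.umask(0o177)  # mask out everything except owner rw
    try:
        listener.bind(path)
    finally:
        os.umask(old_umask)
    listener.listen(8)
    return listener


def enable_credentials(sock):
    """Ask the kernel to attach SCM_CREDENTIALS to messages we receive.

    This has to be set on the receiving side before the peer sends.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)


def recv_with_credentials(conn, bufsize=4096):
    """Receive data plus the sender's (pid, uid, gid), or None if absent."""
    data, ancdata, _flags, _addr = conn.recvmsg(
        bufsize, socket.CMSG_SPACE(UCRED.size)
    )
    for level, kind, payload in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_CREDENTIALS:
            return data, UCRED.unpack(payload[:UCRED.size])
    return data, None


def is_same_user(creds):
    """Accept only callers whose UID matches the daemon's own UID."""
    return creds is not None and creds[1] == os.getuid()
```

Setting the umask before `bind()` avoids the short window in which a `chmod` after the fact would leave the socket with looser permissions. Putting the socket inside a directory only the owner can enter is a common extra safeguard.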
The issue with processes that lack the normal capabilities is more difficult to address. One possibility, at least on Linux, would be to get the PID from the SCM_CREDENTIALS message, and then read /proc/$PID/status to check the capability bits. The daemon could default to denying access to processes with fewer capabilities than it itself has.
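As a sketch of that check (Python again, Linux only): the `Cap*` lines in /proc/$PID/status are hexadecimal bitmasks, so "fewer capabilities" can be tested with a bitwise comparison. The function names and the choice of comparing `CapEff` (the effective set) are my assumptions; a stricter policy might compare the permitted or bounding sets as well.

```python
import os


def parse_capabilities(status_text):
    """Extract the Cap* bitmasks from the text of /proc/<pid>/status."""
    caps = {}
    for line in status_text.splitlines():
        name, _, value = line.partition(":")
        if name.startswith("Cap"):
            caps[name] = int(value.strip(), 16)
    return caps


def read_capabilities(pid):
    """Read and parse the capability sets of a running process."""
    path = f"/proc/{pid}/status"
    with open(path, encoding="utf-8", errors="replace") as status:
        return parse_capabilities(status.read())


def client_may_connect(client_caps, daemon_caps, field="CapEff"):
    """Deny by default: the client must hold every bit the daemon holds."""
    missing = daemon_caps[field] & ~client_caps.get(field, 0)
    return missing == 0
```

The daemon would call `read_capabilities()` with the PID taken from the credentials message and compare against `read_capabilities(os.getpid())`. One caveat: the lookup is racy, since the client could exit and its PID be reused between the credential check and the read of /proc, so this is a mitigation and not a hard guarantee.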
(SCM_SECURITY can be used to pass the SELinux security label from client to server, which could also be used as a security measure; maybe the server could refuse access to processes running with a different label, or have a whitelist of allowed labels; if SELinux is being used to sandbox a process, that would prevent it from accessing the background interpreter unless that was explicitly allowed by whitelisting.)