Compared to WireGuard with MFA? As far as I can tell, WireGuard doesn't support MFA at all. Pure key auth is not acceptable in these contexts because keys and devices get stolen.
Rotate keys regularly through a secure channel. At some point these discussions devolve into a variant of "Which ammunition is least likely to kill me when I inevitably shoot myself?"
I think instead of firewall rules, it's better to use WireGuard pre-shared keys. WireGuard supports per-peer 256-bit pre-shared keys that get hashed with the ECDH result when setting up the session keys. If you've set up WireGuard without pre-shared keys, then behind the scenes the all-zero pre-shared key is used.
These pre-shared keys are set in the WireGuard server from userspace through the kernel's configuration interface (Generic Netlink on Linux, ioctls on OpenBSD). If the peer doesn't know its assigned pre-shared key, it's unable to complete the WireGuard key negotiation. So, you can use this mechanism to have a userspace daemon enable and disable WireGuard peers.
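A minimal sketch of how such a daemon might toggle a peer, assuming it shells out to the standard `wg` command-line tool rather than talking to the kernel directly. The function names, interface name, and file path are hypothetical; only the `wg set ... peer ... preshared-key/allowed-ips/remove` syntax comes from the tool itself.

```python
import subprocess
from typing import List


def enable_peer_cmd(interface: str, peer_pubkey_b64: str,
                    psk_path: str, allowed_ips: str) -> List[str]:
    """Build the `wg set` command that adds (or updates) a peer with a PSK.

    `wg` reads the pre-shared key from a file, so the key never appears
    in the process list.
    """
    return ["wg", "set", interface,
            "peer", peer_pubkey_b64,
            "preshared-key", psk_path,
            "allowed-ips", allowed_ips]


def disable_peer_cmd(interface: str, peer_pubkey_b64: str) -> List[str]:
    """Build the `wg set` command that removes a peer entirely."""
    return ["wg", "set", interface, "peer", peer_pubkey_b64, "remove"]


def run(cmd: List[str]) -> None:
    """Execute a command; requires root and an existing WireGuard interface."""
    subprocess.run(cmd, check=True)
```

The command builders are kept separate from `run` so the daemon logic can be exercised without root or a live interface.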
This morning, I was looking into a daemon to use a post-quantum algorithm and 2FA to set up a new WireGuard peer for 24 hours. Grover's quantum search algorithm effectively cuts the 256-bit pre-shared key to 128 bits, but if you use post-quantum algorithms to secure the pre-shared key generation, even a powerful quantum attacker has to do 2**128 work for every 24-hour session they want to break. The post-quantum algorithms are a bit computationally expensive, but they only need to be done every 24 hours to still give you forward secrecy against a quantum attacker.
The idea is the daemon exposes long-term Curve25519, NTRU, and Classic McEliece public keys. The client generates a new Curve25519 keypair to use as the 24-hour WireGuard peer's identity. The client sets up an encrypted session using the hash of all 3 key exchange mechanisms as the session's symmetric key. The username and a timestamp are sent to the auth server, which responds with the user's password salt, along with ephemeral Curve25519, NTRU, and McEliece values. The pre-shared key for the 24-hour WireGuard peer is the hash of all three ephemeral key agreements (using the same client-side ephemeral keys used in setting up the encrypted channel) concatenated with the Argon2 hash of the user's password and the RFC 6238 TOTP (Google Authenticator) time-based value. If the client responds with a correct Poly1305 MAC (using the auth session key) on the 24-hour pre-shared key, then the daemon will set up the 24-hour peer using that pre-shared key (through the userspace configuration interface of the WireGuard kernel module). After 24 hours, the daemon removes the new peer.
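A rough sketch of the pre-shared key derivation described above, under several assumptions: the three ephemeral key-agreement outputs and the Argon2 password hash are taken as opaque byte strings computed elsewhere, BLAKE2s is my choice of hash (the comment doesn't name one), and each input is length-prefixed so the concatenation is unambiguous. `totp` follows RFC 6238 with HMAC-SHA1; `derive_psk` is a hypothetical name.

```python
import hashlib
import hmac
import struct
from typing import Sequence


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """RFC 6238 TOTP using HMAC-SHA1 (the Google Authenticator default)."""
    counter = struct.pack(">Q", unix_time // step)
    mac = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def derive_psk(shared_secrets: Sequence[bytes], password_hash: bytes,
               totp_code: str) -> bytes:
    """Hash the key agreements, password hash, and TOTP into a 32-byte PSK.

    Assumption: BLAKE2s with length-prefixed inputs; the original comment
    only says "the hash of" these values.
    """
    h = hashlib.blake2s(digest_size=32)
    for part in [*shared_secrets, password_hash, totp_code.encode("ascii")]:
        h.update(struct.pack(">I", len(part)))   # length prefix
        h.update(part)
    return h.digest()
```

Both sides compute the same 32 bytes only if they agree on all three key exchanges, the password hash, and the current TOTP window, so a wrong second factor simply yields a PSK that fails the MAC check.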
Note that Grover's quantum search algorithm will cut the 256-bit pre-shared key and the ChaCha20-Poly1305 256-bit keys down to effectively 128 bits. Even if Curve25519 and one of the two post-quantum algorithms used are completely broken, you still get a secure channel. You lose perfect forward secrecy, and instead get forward secrecy at the granularity of 24-hour chunks.
If Curve25519, NTRU, and McEliece are all broken, then you lose forward secrecy (and the attacker can read the user names, but presumably traffic pattern analysis is pretty good at identifying the user anyway) and the password is the weak point, vulnerable to Grover's quantum search algorithm. The strongest password I've memorized and used in production was a wordlist passphrase generated from 128 bits from /dev/urandom. I'm weird; 99.9% of users would rebel at memorizing 160-bit to 256-bit passwords.
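To put the memorization burden in numbers, here is a small sketch that computes how many uniformly random words a passphrase needs to reach a given entropy, and draws one from the OS random source. The wordlist here is a synthetic placeholder (`word0000` ... `word2047`); a real deployment would load an actual list such as a 2048-word or 7776-word one.

```python
import math
import secrets
from typing import List


def words_needed(target_bits: int, wordlist_size: int) -> int:
    """Smallest word count whose entropy is at least target_bits."""
    return math.ceil(target_bits / math.log2(wordlist_size))


def make_passphrase(wordlist: List[str], target_bits: int) -> str:
    """Pick words uniformly at random using the OS CSPRNG."""
    count = words_needed(target_bits, len(wordlist))
    return " ".join(secrets.choice(wordlist) for _ in range(count))


# Placeholder wordlist for illustration only.
wordlist = [f"word{i:04d}" for i in range(2048)]
passphrase = make_passphrase(wordlist, 128)
```

With a 2048-word list (11 bits per word), 128 bits takes 12 words and 256 bits takes 24, which gives a sense of why most users would balk.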
In order to prevent user enumeration, if a user doesn't exist, instead of returning an Argon2 password salt, return some concatenated SipHash values (with secret keys) of the username (so repeated queries for non-existing users give consistent results). I've also worked out how to use key rotation to get non-existing users changing their password salts every 60 days (as would happen when real users change their passwords), but have the offsets uniformly distributed, so all of the fake users don't appear to change their passwords the same day.
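One way the fake-salt idea could look, as a sketch rather than the scheme I actually worked out: keyed BLAKE2b from the Python standard library stands in for SipHash (which `hashlib` doesn't expose with a caller-supplied key), and the 60-day rotation is modelled by hashing a per-user epoch number instead of rotating the key itself. `fake_salt`, `_prf`, and the `day_number` parameter (days since some fixed epoch) are all hypothetical names.

```python
import hashlib

ROTATION_DAYS = 60


def _prf(key: bytes, label: bytes, data: bytes) -> bytes:
    """Keyed hash with domain separation; stand-in for keyed SipHash."""
    return hashlib.blake2b(data, key=key, person=label, digest_size=16).digest()


def fake_salt(secret_key: bytes, username: str, day_number: int) -> bytes:
    """Deterministic 16-byte salt for a non-existent user.

    Each username gets its own offset in [0, 60), so fake users do not all
    appear to change their passwords on the same day.
    """
    name = username.encode("utf-8")
    offset = int.from_bytes(_prf(secret_key, b"offset", name), "big") % ROTATION_DAYS
    epoch = (day_number + offset) // ROTATION_DAYS
    return _prf(secret_key, b"salt", name + b"\x00" + epoch.to_bytes(8, "big"))
```

Repeated queries for the same unknown username return the same salt until that user's personal 60-day boundary passes, at which point the salt changes, just as it would for a real user rotating a password.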