They could use the description of what the software is supposed to do and an understanding of the current code to figure out how it should work and what edge cases need to be tested. They can also test random inputs and so on. When you write new software, you do this. When you add features, you do this. When you do a rewrite or refactor, you should also do this.
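To make the "random inputs" part concrete, here is a minimal sketch of a differential check in Rust, assuming the old binary is /usr/bin/sort and the rewrite sits at ./target/release/sort (both paths are assumptions, and the tiny PRNG is just there to avoid external crates). It feeds pseudo-random lines to both binaries and flags any output mismatch as a candidate edge case:

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// Tiny xorshift PRNG so the sketch needs no external crates.
fn xorshift(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

// Run a binary, feed `input` on stdin, and capture its stdout.
fn run(bin: &str, input: &[u8]) -> Vec<u8> {
    let mut child = Command::new(bin)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("failed to spawn");
    child.stdin.as_mut().unwrap().write_all(input).unwrap();
    child.wait_with_output().unwrap().stdout
}

fn main() {
    let mut seed: u64 = 0x9E3779B97F4A7C15;
    for case in 0..1000 {
        // Random input: a handful of short lines with arbitrary bytes.
        let mut input = Vec::new();
        for _ in 0..(xorshift(&mut seed) % 20) {
            for _ in 0..(xorshift(&mut seed) % 8) {
                input.push((xorshift(&mut seed) % 256) as u8);
            }
            input.push(b'\n');
        }
        let old = run("/usr/bin/sort", &input);          // existing implementation
        let new = run("./target/release/sort", &input);  // hypothetical rewrite path
        if old != new {
            eprintln!("case {case}: outputs differ for input {input:?}");
        }
    }
}
```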
Is it?
I hope I won't step on somebody else's toes:
GenAI would greatly help cover existing functionality and prevent regressions in the new implementation.
For each tool, generate multiple cases, some based on the documentation and some from the LLM's understanding of the util. Classic input + expected-output pairs.
Run each case with both the old GNU impl and the new Rust impl.
First, cases where expected, old, and new are all identical should go into the regression suite.
Now a HUMAN should take a look in this order:
1. Cases where expected and old are identical, but the Rust impl is different.
2. If time allows, cases where expected and Rust are identical, but the old impl is different.
TBH, after #1 (expected+old vs. Rust) I'd ask the GenAI to generate more test cases in those faulty areas.
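A minimal sketch of that triage, assuming GNU `sort` at /usr/bin/sort and a rewrite at ./target/release/sort (the paths, the `Case` shape, and the bucket names are all my assumptions, not anything from the actual projects):

```rust
use std::io::Write;
use std::process::{Command, Stdio};

// One LLM-generated test case: input plus the expected output.
struct Case {
    input: Vec<u8>,
    expected: Vec<u8>,
}

enum Bucket {
    Regression,      // expected == old == new: goes straight into the regression suite
    NewDiffers,      // expected == old, Rust differs: a human looks at these first
    OldDiffers,      // expected == Rust, old differs: a human looks if time allows
    ExpectedSuspect, // neither impl matches expected: the generated case itself is suspect
}

// Run a binary, feed `input` on stdin, and capture its stdout.
fn run(bin: &str, input: &[u8]) -> Vec<u8> {
    let mut child = Command::new(bin)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .expect("failed to spawn");
    child.stdin.as_mut().unwrap().write_all(input).unwrap();
    child.wait_with_output().unwrap().stdout
}

fn triage(case: &Case, old_bin: &str, new_bin: &str) -> Bucket {
    let old = run(old_bin, &case.input);
    let new = run(new_bin, &case.input);
    match (old == case.expected, new == case.expected) {
        (true, true) => Bucket::Regression,
        (true, false) => Bucket::NewDiffers,
        (false, true) => Bucket::OldDiffers,
        // Covers both "all three disagree" and "old == new but the generated
        // expected output is wrong"; either way a human needs to look.
        (false, false) => Bucket::ExpectedSuspect,
    }
}

fn main() {
    // One hand-written case against GNU `sort` and a hypothetical rewrite path.
    let case = Case {
        input: b"b\na\n".to_vec(),
        expected: b"a\nb\n".to_vec(),
    };
    match triage(&case, "/usr/bin/sort", "./target/release/sort") {
        Bucket::Regression => println!("add to the regression suite"),
        Bucket::NewDiffers => println!("review first: the new impl differs"),
        Bucket::OldDiffers => println!("review if time allows: the old impl differs"),
        Bucket::ExpectedSuspect => println!("the expected output itself looks wrong"),
    }
}
```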
"You have to catch everything" is much easier said than done, but "add at least one new test"? Nominally the people doing the rewrite should understand what they are rewriting.
Usually, the standard a rewrite is held to is "no worse than the original," which is a very high bar.