I think this is indirectly a great argument for automated, test generation or equivalence checking. The reason is that these translations might change the function of the code. Automated testing would show whether or not that happened. It also reveals many bugs.
So, they should solve total, automated testing first. Maybe in parallel. Then, use it for equivalence checks.
So, they should solve total, automated testing first. Maybe in parallel. Then, use it for equivalence checks.