Using Aider with o3 in architect mode, with Gemini or with Sonnet (in that order) is light years ahead of any of the IDE AI integrations. I highly recommend anyone who's interested in AI coding to use Aider with paid models. It is a night and day difference.
With aider and Gemini Pro 2.5 at least I constantly have to fight against it to keep it focused on a small task. It keeps editing other parts of the file, doing small "improvements" and "optimizations" and commenting here and there. To the point where I'm considering switching to a graphical IDE where the interface would make it easier to accept or dismiss parts of changes (per lines/blocks, as opposed to a per file and per commit approach with aider).
Would you mind sharing more about your workflow with aider? Have you tried the `--watch-files` option? [0] What makes the architect mode [1] way better in your experience?
I use o3 with architect mode for larger changes and refactors in a project. It seems very suited to the two-pass system where the (more expensive) "reasoning" LLM tells the secondary LLM all the changes.
For most of the day I use Gemini Pro 2.5 in non-architect mode (or Sonnet when Gemini is too slow) and never really run into the issue of it making the wrong changes.
I suspect the biggest trick I know is being completely on top of the context for the LLM. I am frequently using /reset after a change and re-adding only relevant files, or allowing it to suggest relevant files using the repo-map. After each successful change if I'm working on a different area of the app I then /reset. This also purges the current chat history so the LLM doesn't have all kinds of unrelated context.
I use Gemini in VScode via Cline and also in Zed. I like Aider, but I'm not sure how it's "light-years ahead IDE AI integrations" ubless you only mean stuff like Cursor or Windsurf.
Aider has a configuration for each supported LLM to define the best diff format for each; so for certain ones they're best at diff format, Gemini is best at a fenced-diff format, Qwen3 is best at whole file editing, etc. Aider itself examines the diff and re-runs the request when the request when the response doesn't adhere to the corresponding diff format.
Edit: Also the Aider leaderboards show the success rate for diff adherence separately, it's quite useful [1]