> I've had Claude Code write an entire unit/integration test suite in a few hours (300+ tests) for a fairly complex internal tool. This would take me, or many developers I know and respect, days to write by hand.
I'm not sure about this. The tests I've gotten out in a few hours are the kind I'd approve if another dev sent then but haven't really ended up finding meaningful issues.
Just to be clear, they weren't stupid 'is 1+1=2' type tests.
I had the agent scan the UX of the app being built, find all the common flows and save them to a markdown file.
I then asked the agent to find edge cases for them and come up with tests for those scenarios. I then set off parallel subagents to develop the the test suite.
It found some really interesting edge cases running them - so even if they never failed again there is value there.
I do realise in hindsight it makes it sound like the tests were just a load of nonsense. I was blown away with how well Claude Code + Opus 4.5 + 6 parallel subagents handled this.
I keep seeing posts like this so I decided to video record all my LLM coding sessions and post them on YouTube. Early days, I only had the idea on Saturday.
I'm not sure about this. The tests I've gotten out in a few hours are the kind I'd approve if another dev sent then but haven't really ended up finding meaningful issues.