This makes me wonder. Is anyone practicing TDD with genAI/LLMs? If the true valu...

quantadev · 2025-01-20T04:18:27 1737346707

I'd sort of invert that and say it's better to use LLMs to just generate tons more test cases for the SQL DBs. Theoretically we could use LLMs to create 100s of Thousands (unlimited really) of test cases for any SQL system, where you could pretty much certify the entire SQL capability. Maybe such a standardized test suite already exists, but it was probably written by humans.

diggan · 2025-01-20T12:09:06 1737374946

At that point, you'd get a ton more value from doing Property Testing (+ get up and running faster, with less costs).

If I'd had to have either code or tests generated by a LLM, I'd manually write the test cases with a well-thought out API for whatever I test, then have the LLM write tests that implements what I thought up, rather than the opposite which sounds like a slow and painful death.

quantadev · 2025-01-20T15:12:22 1737385942

I hadn't heard of "Property Testing" if that's a sort of term of art. I'll look into it. Anyway, yeah in TDD it's hard to say which part deserves more human scrutiny the tests or the implementations.

serbuvlad · 2025-01-20T12:54:04 1737377644

Are you sure that LLMs, because of their probabilistic nature, would not bias against certain edge cases. Sure, LLMs can be used to great effect to write many tests for normal usage patterns, which is valuable for sure. But I'd still prefer my edge cases handled by humans where possible.

quantadev · 2025-01-20T15:10:12 1737385812

I'm not sure if LLMs would do better or worse at edge cases, but I agree humans WOULD need to study the edge case tests, like you said. Very good point. Interestingly though LLMs might help identify more edge cases us humans didn't see.

XorNot · 2025-01-20T03:58:40 1737345520

TDD suffers from being inflexible when you don't fully understand the problem. Which on software is basically always.

Everytime I've tried it for something I make no progress at all compared to just banginf out the shape that works and then writing tests to interrogate my own design.

iSnow · 2025-01-20T10:55:52 1737370552

Happy that it's not just me. I tried it a couple of times, and for small problems, I could make it work, albeit with refactorings both to the code and tests.

But for more complicated topics, I never fully grasped all the details before writing code, so my tests missed aspects and I had to refactor both code and tests.

I kinda like the idea more than the reality of TDD.

necovek · 2025-01-20T12:30:53 1737376253

TDD is supposed to teach you that refactoring of both the code and tests are "normal": iow, get used to constant, smallish refactors, because that's what you should be doing.

Now, the issue with badly defined problems is not that it's just badly defined, it's also that we like to focus on the technical implementation specifics. To do TDD from scratch requires a mindset shift to think about actual user value (what are you trying to achieve), and then go for the minimum from that perspective. It's basically an inverse from common architecture approach, which is design data models first, and start implementing next. With TDD, you evolve your data models along with the code and architecture.

And it is freaking hard to stop yourself from thinking too far ahead and letting tests drive your architecture (code structure and APIs). Which is why I also frequently prototype without TDD, and then massage those prototypes into fully testable code that could have been produced with TDD.

diggan · 2025-01-20T12:10:53 1737375053

I think in general people tend to overdo TDD if they do TDD, aiming for a 100% test coverage which just ends up doing what you and parent mentions, solidifies a design and makes it harder to change.

If instead every test is well intentioned and focus on testing the public API of whatever you test, not making assumptions about the internal design, you can get well tested code that is also easy to change (assuming the public interface is still OK).

necovek · 2025-01-20T12:34:07 1737376447

It's extremely hard to really do TDD and get code that's hard to change. If you persevere with a design that's hard to change, every single change in your failing-test-fix-implementation TDD cycle will make you refactor all your tests, and you'll realise why the design is bad and reduce coupling instead.

What really happens is that people write code, write non-unit "unit" tests for 100% coverage, and then suffer because those non-unit tests are now dependent on more than just what you are trying to test, all of them have some duplication because of it, and any tiny change is now blocked by tests.

hitchstory · 2025-01-20T14:24:25 1737383065

You can get 100% coverage by focusing on testing the public API too. These two things are completely orthogonal.

vrighter · 2025-01-21T07:40:03 1737445203

dude, if you have the llm write the tests, then you have no confidence it's testing what you think it is. Making the test worthless

CSSer · 2025-01-21T07:49:44 1737445784

Dude, I was suggesting that you, not the LLM, write the tests in this scenario.