
I still haven't found what an actual working product in AI agents could be, and I write about my journey through capabilities, frameworks, and restrictions here: https://jdsemrau.substack.com/

Initially I thought there was a use case in finance, but the barriers to entry are incredibly low and the value-add is not that large.

Currently, there seems to be a lot of traction in code generation (Cursor, Lovable, et al.), but I have not seen that work on a usable code base/workflow.



From our observations on why: you need an extremely tight validation loop on everything you do for AI agents to be useful. They also need a ton of highly specific instructions and context. This requires either a deep understanding of the platforms and tooling or a highly standardized way of working (as in coding).

This is why tools like Cursor work so well: they're able to work in a super tight feedback loop with the compiler, linter, and tests. They operate in a well-known, well-documented environment.
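
A rough sketch of what that loop looks like (propose_patch and run_checks are hypothetical stand-ins, not any particular tool's API):

    # Minimal sketch of a compiler/linter/test-driven agent loop.
    # propose_patch() and run_checks() are hypothetical helpers.
    def agentic_fix(task, repo, max_attempts=5):
        feedback = ""
        for _ in range(max_attempts):
            patch = propose_patch(task, repo, feedback)   # LLM call
            repo.apply(patch)
            result = run_checks(repo)                     # compile, lint, test
            if result.ok:
                return patch                              # validated change
            repo.revert(patch)
            feedback = result.errors                      # feed errors back in
        return None                                       # give up, ask a human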

If we can replicate the same thing on business systems, that's when the magic happens. It's just very hard to do without deep knowledge of both those platforms and agentic AI, because everyone does stuff differently in each org. The overlap of people with skills in both AI and specific business ops areas is absolutely tiny.

An example of where we're using this is a fully AI-native CRM (part of SynthGrid - see https://mindfront.ai). We don't even have any way to interface with it outside of AI, but we'd never want to go back anyway because the efficiency gains are so huge for us.
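
Roughly, that means the CRM's operations are exposed to the model only as tool definitions rather than screens - a toy example of the shape (simplified, not our actual schema):

    # Toy example: CRM operations exposed to the model as tool definitions.
    CRM_TOOLS = [
        {
            "name": "create_contact",
            "description": "Add a new contact to the CRM.",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "company": {"type": "string"},
                },
                "required": ["name", "email"],
            },
        },
        {
            "name": "log_interaction",
            "description": "Record a call, email, or meeting against a contact.",
            "parameters": {
                "type": "object",
                "properties": {
                    "contact_email": {"type": "string"},
                    "summary": {"type": "string"},
                },
                "required": ["contact_email", "summary"],
            },
        },
    ]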

The Pareto frontier will continue to advance inexorably, dragging even the complex or non-standardized domains along with it. For those tightly integrated business systems, we'll probably see huge gains in utility, if not in function, from improved underlying models combined with excellent tools. Be sure to try out Claude 4 Opus hooked into some systems if you haven't already!


The tighter the scope/validation loop, the closer the "agent" gets to a narrow, traditional machine learning business case. Those traditional cases are, in comparison, significantly cheaper to implement and maintain. Consider, as an example, traditional scorecards + policy rules in credit underwriting vs. an agent that "reasons" over the loan application context.
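
For contrast, the traditional approach is often little more than an additive scorecard plus hard policy rules - roughly this shape (weights and cutoffs are made up for illustration):

    # Illustrative scorecard + policy rules for credit underwriting.
    # All weights and thresholds here are invented for the example.
    def underwrite(applicant):
        # hard policy rules first
        if applicant["age"] < 18 or applicant["dti"] > 0.45:
            return "decline"
        # simple additive scorecard
        score = (
            300
            + 2.0 * applicant["credit_history_years"]
            + 150 * (1 - applicant["dti"])
            - 50 * applicant["recent_defaults"]
        )
        if score >= 520:
            return "approve"
        if score >= 450:
            return "manual review"
        return "decline"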


Yeah for sure - that's exactly why we use that approach - it's unsurprising, simple and definitely works.

One difference is that you don't necessarily need structured data going in, just output validation from the LLM. This is a big difference from traditional ML because you don't have to worry as much about complex data engineering - or at least it solves the annoying ingestion problem in many cases.
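
Concretely, the output-validation part is usually just checking the model's reply against a schema and retrying on failure - a rough sketch with pydantic (call_llm is a hypothetical wrapper around whatever model you use):

    # Output validation: unstructured text in, schema-checked data out.
    # call_llm() is a hypothetical wrapper, not a real library function.
    from pydantic import BaseModel, ValidationError

    class Invoice(BaseModel):
        vendor: str
        total: float
        currency: str

    def extract_invoice(raw_email: str, max_retries: int = 3) -> Invoice | None:
        prompt = f"Extract vendor, total and currency as JSON:\n{raw_email}"
        for _ in range(max_retries):
            reply = call_llm(prompt)
            try:
                return Invoice.model_validate_json(reply)   # reject bad output
            except ValidationError as e:
                prompt += f"\nYour last answer was invalid: {e}. Try again."
        return None   # escalate to a human instead of guessing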

Another observation is that most businesses don't have any ML engineering capability in-house - they're pretty much willing to pay a premium because, unlike a bespoke ML solution, you can just do it with an off-the-shelf system (provided it's designed with the right validation loops).

The agent is in some ways an abstraction that just enables use and adoption - even if it's orders of magnitude worse than a conventional ML solution, it's competing against no solution, not against ML-based ones.

Last thing is around what level of autonomy people expect from these things. You can go pretty far, but like flipping N coins, the more you flip, the greater the chance that something goes awry. Agents still need a lot of guidance, and it's up to the system builders to provide it, either by connecting humans or through very tightly integrated, well-designed tools.
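
Back-of-the-envelope on the coin-flip point, assuming each step independently succeeds with probability p:

    # Chance an N-step agent run completes with no error at all.
    p = 0.98
    for n in (5, 20, 50):
        print(n, round(p ** n, 2))   # 5: 0.9, 20: 0.67, 50: 0.36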



