Hacker News

> We're trying to make building code agents easy and cheap.

What is your plan to beat the performance and cost of first-party models like Claude and GPT?



Hey -- good question! We're focused on a narrower task right now that aims to save frontier tokens (both input & output). Our merge + retrieval models are simply smaller LLMs that save you from passing in too much context to Sonnet, and allow you to output fewer tokens. These are cheap for us to run while still maintaining or improving accuracy.
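A minimal sketch of the token-saving idea described above: rank candidate code chunks against the query and pass only the top few to the frontier model. The keyword-overlap scorer here is a toy stand-in for a small retrieval model (the real system presumably uses a learned model); all names are illustrative.

```python
# Toy context trimming before a frontier-model call.
# score() is a keyword-overlap stand-in for a small retrieval LLM.

def score(query: str, chunk: str) -> float:
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / (len(q) or 1)

def trim_context(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    # Keep only the top_k most relevant chunks, cutting frontier input tokens.
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:top_k]

chunks = [
    "def parse_config(path): ...",
    "class UserSession: handles login tokens",
    "README: project overview and setup",
]
print(trim_context("where is login token handling", chunks, top_k=1))
```

The point is only the shape of the pipeline: a cheap ranking pass in front of the expensive model, so accuracy rides on the ranker while the frontier model sees a fraction of the codebase.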


I can load my entire codebase into Gemini and get far more than a nuanced similarity score by way of agent guidance.

What’s the differentiator or plan for arbitrary query matching?

Latency? If you think about it, it's not really a huge issue. Spend 20 seconds and a million tokens mapping an entire plan with Gemini for a feature.

Pass that to Claude Code.

At this point you want non-disruptive context moving forward, and presumably any new findings would be largely redundant with what's already in the long context.

Agentic discovery is fairly powerful even without any augmentations. I think Claude Code devs abandoned early embedding architectures.


Hey, these are really interesting points. The question of agentic discovery vs. one-shot retrieval really depends on the type of product.

For Cline or Claude Code, where there's a dev in the loop, it makes sense to spend more money on Gemini ranking or more latency on agentic discovery. Prompt-to-app companies (like Lovable) have a flood of impatient non-technical users coming in, so latency and cost become a big consideration.

That's where a more traditional retrieval approach becomes relevant. Our retrieval models are meant to work really well with non-technical queries on these vibe-coded codebases. They're more of a supplement to the agentic discovery approaches, and we're still figuring out how to integrate them in a sensible way.
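The tradeoff above can be sketched as a simple router: dev-in-the-loop tools with a generous latency budget get agentic discovery, while impatient prompt-to-app traffic gets a fast one-shot retrieval pass. The function, names, and threshold are all hypothetical, purely to make the tradeoff concrete.

```python
# Hypothetical routing between the two strategies discussed above.
# The 10-second threshold is an illustrative assumption, not a real figure.

def choose_strategy(dev_in_loop: bool, latency_budget_s: float) -> str:
    if dev_in_loop and latency_budget_s >= 10:
        return "agentic_discovery"   # tolerate latency for deeper exploration
    return "one_shot_retrieval"      # impatient users, tight latency budget

print(choose_strategy(dev_in_loop=True, latency_budget_s=30))   # agentic_discovery
print(choose_strategy(dev_in_loop=False, latency_budget_s=2))   # one_shot_retrieval
```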



