
For a few uni/personal projects I noticed the same about LangChain: it's good at helping you use up tokens. The other use case, quickly switching between models, is still a very valid reason to use it. However, I've recently started playing with OpenRouter, which seems to abstract the model nicely.



If someone were to create something new, a blank slate approach, what would you find valuable and why?


This is a great question!

I think we now know, collectively, a lot more about what’s annoying/hard about building LLM features than we did when LangChain was being furiously developed.

And some things we thought would be important and hard turned out to be very easy, like getting GPT to give back well-formed JSON.

So I think there’s lots of room.

One thing LangChain is doing now that solves something that IS very hard/annoying is testing. I spent 30 minutes yesterday re-running a slow prompt because 1 in 5 runs would produce weird output. Each tweak to the prompt, I had to run at least 10 times to be reasonably sure it was an improvement.
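To make the "run it at least 10 times" loop concrete, here is a minimal sketch of a pass-rate harness. It is not LangChain's testing feature or any library's API; `pass_rate`, `is_json_object`, and `make_flaky_model` are hypothetical names, and the "model" is a seeded stub standing in for a real LLM call that fails roughly 1 run in 5.

```python
import json
import random
from typing import Callable


def pass_rate(generate: Callable[[str], str],
              prompt: str,
              is_valid: Callable[[str], bool],
              runs: int = 10) -> float:
    """Call generate(prompt) `runs` times; return the fraction of valid outputs."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if is_valid(generate(prompt)))
    return passed / runs


def is_json_object(text: str) -> bool:
    """Example check: the output must parse as a JSON object."""
    try:
        return isinstance(json.loads(text), dict)
    except json.JSONDecodeError:
        return False


def make_flaky_model(failure_rate: float, seed: int = 0) -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM call that sometimes returns 'weird' output."""
    rng = random.Random(seed)

    def generate(prompt: str) -> str:
        if rng.random() < failure_rate:
            return "Sure! Here is the answer you asked for"  # not JSON
        return '{"answer": 42}'

    return generate


if __name__ == "__main__":
    model = make_flaky_model(failure_rate=0.2)
    rate = pass_rate(model, "Return the answer as JSON.", is_json_object, runs=10)
    print(f"pass rate over 10 runs: {rate:.0%}")
```

Swapping the stub for a real client call keeps the rest unchanged. Ten runs is still a small sample: if the true failure rate is 1 in 5, ten clean runs in a row happen about 11% of the time (0.8^10 ≈ 0.107), so a single clean batch is weak evidence that a tweak helped.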


It can be faster and more effective to fall back to a smaller model (GPT-3.5 or Haiku): the weakness of the prompt will be more obvious on a smaller model, and your iteration time will be shorter.


great insight!


How would testing work out ideally?


Use a local model. For most tasks they are good enough. Mistral 7B Instruct v0.2, say, is quite solid by now.


Do different versions react to prompts in the same way? I imagined the prompt would be tailored to the quirks of a particular version rather than naturally being stably optimal across versions.


I suppose that is one of the benefits of using a local model: it reduces model risk. That is, with the weights pinned, a given prompt should keep producing the same behavior over time. With a hosted model, you don't have that operational control over model risk.


What are the best local/open models for accurate tool-calling?



