For a few uni/personal projects I noticed the same about LangChain: it's good at helping you use up tokens. The other use case, quickly switching between models, is still a valid reason to use it. However, I've recently started playing with OpenRouter, which seems to abstract the model away nicely.
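For what it's worth, OpenRouter exposes an OpenAI-compatible API, so switching models is mostly a matter of changing one string. A minimal sketch (the model ids are examples; check OpenRouter's catalog for current names):

```python
# Sketch: OpenRouter speaks the OpenAI wire protocol, so the
# official openai client works if you point it at a different base_url.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Switching providers is just a different model string (example ids).
for model in ["openai/gpt-4o-mini", "anthropic/claude-3-haiku"]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in one word."}],
    )
    print(model, "->", resp.choices[0].message.content)
```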
I think we now know, collectively, a lot more about what’s annoying/hard about building LLM features than we did when LangChain was being furiously developed.
And some things we thought would be important and not easy turned out to be very easy, like getting GPT to give back well-formed JSON.
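(OpenAI's API now has a JSON mode that constrains output to a valid JSON object. A minimal sketch; note the API requires the word "JSON" to appear somewhere in your messages when you use it:)

```python
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    # JSON mode: the model must emit a single valid JSON object.
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Reply in JSON with keys 'name' and 'year'."},
        {"role": "user", "content": "Who wrote Dracula and when?"},
    ],
)
data = json.loads(resp.choices[0].message.content)  # parses cleanly
print(data)
```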
So I think there’s lots of room.
One thing LangChain is doing now that solves something that IS very hard/annoying is testing. I spent 30 minutes yesterday re-running a slow prompt because 1 in 5 runs would produce weird output. After each tweak to the prompt, I had to run it at least 10 times to be reasonably sure the change was an improvement.
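Even a crude harness helps here: run the prompt N times and count how often the output passes a validity check. (A sketch; `is_well_formed` is a placeholder for whatever check matters to you.)

```python
# Rough sketch of a repeated-run prompt check, using the openai client.
# is_well_formed() is a hypothetical validator you supply yourself.
import json
from openai import OpenAI

client = OpenAI()

def is_well_formed(text: str) -> bool:
    # Example check: output must parse as JSON. Replace with your own.
    try:
        json.loads(text)
        return True
    except ValueError:
        return False

def pass_rate(prompt: str, model: str = "gpt-4o-mini", n: int = 10) -> float:
    ok = 0
    for _ in range(n):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        if is_well_formed(resp.choices[0].message.content):
            ok += 1
    return ok / n

print(pass_rate('Return {"answer": ...} for: what is 2+2?'))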
It can be faster and more effective to fall back to a smaller model (GPT-3.5 or Haiku): the weaknesses of the prompt will be more obvious on a smaller model, and your iteration time will be faster.
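e.g., with a pass-rate harness like the one in the parent comment, iterate against the cheap model and only confirm on the expensive one (model ids and threshold are illustrative):

```python
# Sketch: iterate cheap, confirm expensive. `check(prompt, model)`
# is a hypothetical pass-rate function like the one sketched above.
DEV_MODEL = "gpt-3.5-turbo"   # fast/cheap: surfaces prompt weaknesses
PROD_MODEL = "gpt-4o"         # slow/expensive: final sanity check only

def tune(prompt_variants, check):
    best = max(prompt_variants, key=lambda p: check(p, DEV_MODEL))
    assert check(best, PROD_MODEL) > 0.9  # illustrative threshold
    return best
```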
Do different versions react to prompts in the same way? I'd have imagined a prompt gets tailored to the quirks of a particular version rather than being stably optimal across versions.
I suppose that is one of the benefits of using a local model: it reduces model risk. I.e., given a certain prompt, it should always reply in the same way. With a hosted model, you don't have that operational control; the provider can swap or update the model underneath you.
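A sketch with llama-cpp-python, assuming a local GGUF file (the path is a placeholder): pin the weights, the seed, and greedy sampling, and repeated runs should match, modulo threading nondeterminism in some builds:

```python
# Sketch: reproducible local inference with llama-cpp-python.
# Assumes a GGUF model file on disk; the path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf", seed=0, verbose=False)

out1 = llm("Q: What is the capital of France? A:", max_tokens=16, temperature=0.0)
out2 = llm("Q: What is the capital of France? A:", max_tokens=16, temperature=0.0)

# With fixed weights, a fixed seed, and temperature 0 (greedy decoding),
# the two completions should be identical.
assert out1["choices"][0]["text"] == out2["choices"][0]["text"]
```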