
benjamincburns · on Feb 5, 2025

Disclosure: I'm an engineer at LangChain, primarily focused on LangGraph. I'm new to the team, though - and I'd really like to understand your perspective a bit better. If we're gritting the wheels for you rather than greasing them, I _really_ want to know about it!

> Every time I've tried to apply general purpose RAG tools to specific types of documents like medical records, internal knowledge base, case law, datasheets, and legislation, it's been a mess.

Would it be fair to paraphrase you as saying that people should avoid using _any_ library's ready-made components for a RAG pipeline, or do you think there's something specific to LangChain that is making it harder for people to achieve their goals when they use it? Either way, is there more detail that you can share on this? Even if it's _any_ library - what are we all getting wrong?

Not trying to correct you here - rather stating my perspective in hopes that you'll correct it (pretty please) - but my take as someone who was a user before joining the company is that LangChain is a good starting point because of the _structure_ it provides, rather than the specific components.

I don't know what the specific design intent was (again, new to the team!) but just candidly as a user I tend to look at the components as stand-ins that'll help me get something up and running super quickly so I can start building out evals. I might be unusual in this, but I tend to think that until I have evals, I don't really have any idea if my changes are actually improvements or not. Once I have evals running against something that does _roughly_ what I want it to do, I can start optimizing the end-to-end workflow. I suspect in 99.9% of cases that'll involve replacing some (many?) of our prebuilt components with custom ones that are more tailored to the specific task.
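To make the evals-first workflow above concrete, here is a minimal sketch in plain Python. It does not use LangChain or LangGraph APIs; the corpus, the eval set, and both retriever functions are hypothetical stand-ins invented for illustration. The point is only that the same eval harness scores the generic component and its custom replacement, so a swap can be judged by a number rather than by feel.

```python
import re
from typing import Callable, Dict, List, Tuple

# Hypothetical eval set: (question, substring the retrieved text must contain).
EVAL_SET: List[Tuple[str, str]] = [
    ("What is the max supply voltage?", "5.5 V"),
    ("Which pin is ground?", "pin 4"),
]

# Toy corpus standing in for real documents (e.g. a datasheet).
DOCS: Dict[str, str] = {
    "power": "Absolute maximum supply voltage is 5.5 V.",
    "pinout": "Ground is pin 4; VCC is pin 8.",
}


def tokens(text: str) -> set:
    """Lowercased word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def generic_retriever(question: str) -> str:
    """Stand-in for a prebuilt component: ignores the question entirely."""
    return next(iter(DOCS.values()))


def custom_retriever(question: str) -> str:
    """Stand-in for a task-specific replacement: naive keyword overlap."""
    q = tokens(question)
    return max(DOCS.values(), key=lambda doc: len(q & tokens(doc)))


def run_evals(retriever: Callable[[str], str]) -> float:
    """Fraction of eval questions whose retrieved text contains the answer."""
    hits = sum(expected in retriever(q) for q, expected in EVAL_SET)
    return hits / len(EVAL_SET)


if __name__ == "__main__":
    print("generic:", run_evals(generic_retriever))
    print("custom: ", run_evals(custom_retriever))
```

In a real pipeline the eval set would be much larger and the scoring less crude than substring matching, but the shape stays the same: fix the evals first, then swap components underneath them.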

Complete side note, but for anyone looking at LangChain to build out RAG stuff today, I'd advise using LangGraph for structuring your end-to-end process. You can still pull in components for individual process steps from LangChain (or any other library you prefer) as needed, and you can still use LangChain pipelines as individual workflow steps if you want to, but I think you'll find that LangGraph is a more flexible foundation to build upon when it comes to defining the structure of your overall workflow.

benjamincburns · on Jan 11, 2023

Kaggle or HuggingFace

benjamincburns · on Sept 22, 2022

Relevant: https://slack.com/blog/news/dear-microsoft

benjamincburns · on April 13, 2020

A lot of the other responses here are too complicated and specific. Here's my attempt to put it into easier-to-digest terms:

In a traditional web stack you have a backend and a frontend. The frontend is the stuff the browser runs, and, simplifying a bit, the backend is everything else.

Ethereum smart contracts basically let you replace your backend logic and database with code that runs on the Ethereum blockchain network. Depending on your application, you can decide to run only a few parts of your backend on Ethereum, or the entire backend.

It's very slow when compared to traditional backends like Node.js, etc., but it has the benefits of censorship resistance and excellent availability. Better still, you don't need to run or maintain servers to support it if you don't want to (although there are benefits to doing so).

In this case, they're using Ethereum's replacement for DNS, ENS.

benjamincburns · on May 10, 2019

Amazon is becoming more and more like AliExpress with respect to product accuracy and quality. As an example, I recently bought tomato seeds for my garden. I searched for rainbow tomatoes, and the bulk of the results were clearly photoshopped photos made to look like they'd grow tomatoes with colors that'd get lost in a ball pit.

Fakes like that are easy to spot, but for products where corners can be cut less visibly, I definitely worry.

discreditable · on May 10, 2019

A while back I bought habanero seeds on Amazon. They sprouted and did well, but they put out regular sweet peppers. I've bought catnip seeds that were fine. Anymore, I only buy seeds from the store though.

benjamincburns · on Jan 24, 2019

Thanks to cumulative windowing, it does have a dropped packet signal: https://en.wikipedia.org/wiki/Transmission_Control_Protocol#...
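For anyone unfamiliar with how that signal arises, here is a rough sketch, assuming the comment refers to TCP's cumulative acknowledgments. The receiver always ACKs the next in-order segment it is still missing, so a gap makes it repeat the same ACK, and the sender treats three duplicates as a sign of loss (the usual fast-retransmit threshold). Whole segment numbers are used instead of byte sequence numbers to keep it short, and the function names are made up for illustration.

```python
from typing import Iterable, List, Optional


def cumulative_acks(arrivals: Iterable[int]) -> List[int]:
    """Receiver side: after each arriving segment, ACK the lowest
    segment number that has not yet been received in order."""
    received = set()
    next_expected = 0
    acks = []
    for seq in arrivals:
        received.add(seq)
        # Advance past every segment we now hold contiguously.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


def detect_loss(acks: Iterable[int], dup_threshold: int = 3) -> Optional[int]:
    """Sender side: return the segment inferred lost once the same ACK
    has been repeated dup_threshold times, or None if that never happens."""
    last = None
    dups = 0
    for ack in acks:
        if ack == last:
            dups += 1
            if dups >= dup_threshold:
                return ack
        else:
            last, dups = ack, 0
    return None


if __name__ == "__main__":
    # Segment 2 is dropped in transit; 3, 4 and 5 still arrive.
    acks = cumulative_acks([0, 1, 3, 4, 5])
    print(acks)               # the receiver keeps asking for segment 2
    print(detect_loss(acks))  # the sender infers segment 2 was lost
```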

benjamincburns · on Nov 29, 2017

> to actually make a transaction on the blockchain

On which blockchain, exactly? On the Ethereum Mainnet you can transact for a fraction of a cent USD and it'll typically be verified (mined into a block) within a minute (often faster).

On the Bitcoin chain you choose your own transaction fee, but if you're not keeping up with market rates your transaction might take quite a long time to be verified (again, mined into a block).
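A toy model of why a below-market fee means a longer wait, under loud assumptions: every transaction is the same size, miners fill each block purely by highest fee rate, and no new transactions arrive. The mempool contents and the function name are hypothetical, and real miner behavior is more involved.

```python
from typing import Dict


def blocks_until_mined(mempool: Dict[str, float], txid: str, txs_per_block: int) -> int:
    """Number of blocks until txid is included, if each block takes the
    txs_per_block highest fee-rate transactions still waiting."""
    ranked = sorted(mempool, key=mempool.get, reverse=True)
    position = ranked.index(txid)
    return position // txs_per_block + 1


if __name__ == "__main__":
    # Hypothetical fee rates; only their ordering matters here.
    mempool = {"a": 50.0, "b": 40.0, "c": 30.0, "d": 2.0}
    print(blocks_until_mined(mempool, "a", txs_per_block=2))
    print(blocks_until_mined(mempool, "d", txs_per_block=2))
```

On a live network higher-fee transactions keep arriving, so a transaction priced under the going rate can be pushed back repeatedly, which is the "quite a long time" case described above.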

benjamincburns · on Aug 23, 2017

> If you KNOW someone is wrong, why not debate them? Why go after them personally?

Alternatively, just let them be wrong.

lurrr · on Aug 23, 2017

Absolutely. I was trying to say that silencing someone just because they have an opinion which you disagree with is not the way to approach things.

zaptheimpaler · on Aug 28, 2017

Duty calls... https://xkcd.com/386/

benjamincburns · on Aug 23, 2017

I don't disagree that anonymity offers protection against situations like this, but bear in mind that anonymity is very much a factor in the lack of civil discourse online. I'm not suggesting for a second that stripping anonymity is a solution to any of these problems, but personally I feel like I'm less likely to engage in the sort of behavior which might invite an angry mob while posting under my real name.

hueving · on Aug 23, 2017

> but personally I feel like I'm less likely to engage in the sort of behavior which might invite an angry mob while posting under my real name.

The problem with this is the 'sort of behavior' you are referring to is posting any statements that disagree with the worldview of the majority of the people in your circles.

Without anonymity you will also lack civil discourse when all of the sane people on the minority side know not to speak up because they fear retribution.

Anyone who publicly considered voting for Trump was accused of being a sexist, racist, xenophobic, Islamophobe. So instead of any sane discourse, there was just a surprise upset when Trump won because everyone was convinced the strategy of accusing all Trump supporters of 'isms' until they shut up was working.

benjamincburns · on Aug 23, 2017

FWIW, most of what you're saying here is my rationale for the first half of the sentence you quoted. Specifically, there are times when it makes sense to speak up, and being able to do so anonymously definitely quashes fear-driven self censorship.

While generally I'd agree that self censorship is a bad thing, I'd also agree that an ability to control one's impulsiveness is just the opposite. Impulse control is what I was referring to above as my general reasoning for why I default to using my real identity online, not self censorship.

flukus · on Aug 23, 2017

> I don't disagree that anonymity offers protection against situations like this, but bear in mind that anonymity is very much a factor in the lack of civil discourse online.

We might lose some civility but we also gain a lot of honesty which is absent everywhere else now due to PC culture. So the question is whether we prefer sometimes uncivilized but honest discussion or a civil veneer over what people really think.

benjamincburns · on Aug 23, 2017

To be clear I was speaking in terms of likelihood, not extremes. I personally feel I derive a benefit from identifying myself online, but YMMV.

Also I really hope I didn't give the idea that I think all discourse must be kept PC, or even civil. That'd be a rather sad world to live in.

benjamincburns · on Aug 6, 2017

I agree that those who participate in a process like this shouldn't circumvent its rules -- that just makes the whole problem much more difficult to manage. However, I think you could just as easily argue that the process itself is scummy for requiring the job posting when a well-qualified candidate has already been identified.

Why is a candidate's country of citizenship the test for whether a role must be posted publicly before it can be filled? Wouldn't it be best for both the economy and the company if this rule were followed (without circumvention) for 100% of job openings? If your answer to that is "no," then why require it in any circumstance?