
> ModernBERT with the extended context has solved natural language web search. I mean it as no exaggeration that _everything_ google does for search is now obsolete.

The web UI for people using search may be obsolete, but search itself is hotter than ever: every AI needs it, both web-scale and local, because models don't have recent information in them and can't reliably quote from memory.
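
To make the quoted claim concrete, here's a deliberately minimal sketch of dense retrieval with ModernBERT: embed the query and the documents, then rank by cosine similarity. It assumes the answerdotai/ModernBERT-base checkpoint; a real system would use a retrieval-fine-tuned variant and an approximate-nearest-neighbor index rather than brute-force scoring.

    # Minimal sketch: ModernBERT embeddings + cosine ranking.
    # answerdotai/ModernBERT-base is the base checkpoint (8192-token
    # context); a retrieval-tuned variant would score far better.
    import torch
    from transformers import AutoTokenizer, AutoModel

    tok = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
    model = AutoModel.from_pretrained("answerdotai/ModernBERT-base").eval()

    def embed(texts):
        batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**batch).last_hidden_state   # (batch, seq, dim)
        mask = batch["attention_mask"].unsqueeze(-1)    # ignore padding
        vecs = (hidden * mask).sum(1) / mask.sum(1)     # mean-pool tokens
        return torch.nn.functional.normalize(vecs, dim=-1)

    docs = ["ModernBERT extends the context window to 8192 tokens.",
            "Classic BERT models are capped at 512 tokens."]
    scores = embed(["long context encoder"]) @ embed(docs).T
    print(docs[scores.argmax().item()])                 # best match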

And models often make reasoning errors. Many users will want to check that the sources actually substantiate the conclusion.

The point is that the secret sauce in Google's search was better retrieval, and the assertion above is that the advantage there is gone. While crawling the web isn't a piece of cake, it's a much smaller moat than retrieval quality was.

Eh, I don't really see that.

Crawling the web is itself a huge moat, because a large number of sites block 'abusive' crawlers, with exceptions for Google and possibly Bing.

For example, just try to crawl a site like Reddit and see how long it takes before you're blocked and served a "please pay us for our data" message.
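
A quick way to see the two walls a new crawler hits: robots.txt disallows unknown agents, and the server rate-limits or blocks the rest. A small sketch; the user-agent string here is hypothetical, and big sites also block by IP reputation, not just robots.txt.

    # Check robots.txt first, then see what the server actually returns.
    import urllib.robotparser
    import requests

    UA = "MyNewSearchBot/0.1 (+https://example.com/bot)"   # hypothetical

    rp = urllib.robotparser.RobotFileParser("https://www.reddit.com/robots.txt")
    rp.read()

    url = "https://www.reddit.com/r/programming/"
    if not rp.can_fetch(UA, url):
        print("robots.txt disallows this agent")
    else:
        r = requests.get(url, headers={"User-Agent": UA}, timeout=10)
        print(r.status_code)   # blocked crawlers typically see 403 or 429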


My experience running a few hundred very successful shops (hundreds of thousands of orders per month) is that there's no need for quotes around 'abusive'.

95% of our load is from crawlers, so we have to pick who to serve.
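
"Picking who to serve" can't rely on the User-Agent header alone, since it's trivially spoofed. One common approach, which Google itself documents for Googlebot, is a reverse-then-forward DNS check; a sketch (the IP is just an example from a published Googlebot range):

    # Verify a claimed Googlebot IP via reverse + forward DNS lookup.
    import socket

    def is_real_googlebot(ip: str) -> bool:
        try:
            host = socket.gethostbyaddr(ip)[0]       # reverse lookup
            if not host.endswith((".googlebot.com", ".google.com")):
                return False
            # forward-confirm: the name must resolve back to the same IP
            return ip in socket.gethostbyname_ex(host)[2]
        except (socket.herror, socket.gaierror):
            return False

    print(is_real_googlebot("66.249.66.1"))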

If they want our data, all they need to do is offer a way for us to send it. We're happy to increase exposure, and shopping-aggregation-site updates are our second-highest-priority task, after price and availability updates; see the sketch below.
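
What that amounts to is a push feed, the model Google Merchant Center and most shopping aggregators already use for product data. A hedged sketch; the endpoint, token, and schema here are all hypothetical:

    # Push model: export a product feed and send it whenever price or
    # availability changes, instead of being crawled.
    import json
    import requests

    feed = [{
        "sku": "A-100",
        "title": "Example Widget",
        "price": "19.99",
        "currency": "EUR",
        "in_stock": True,
    }]

    resp = requests.post(
        "https://aggregator.example.com/v1/product-feed",  # hypothetical
        headers={"Authorization": "Bearer <token>",
                 "Content-Type": "application/json"},
        data=json.dumps(feed),
        timeout=30,
    )
    resp.raise_for_status()                                # 2xx or raise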


Crawling may be tricky, but it's a piece of cake compared to doing good retrieval.
