We launched Metaphor earlier this morning! It's a search engine based on the same sorts of generative modeling ideas behind Stable Diffusion, GPT-3, etc. It's trained to predict the next link (similar to how GPT-3 predicts the next word).
After GPT-3 came out we started thinking about how pretraining (for large language models) and indexing (for search engines) feel pretty similar. In both you have some code that's looking at all the text on the internet and trying to compress it into a better representation. GPT-3 itself isn't a search engine, but it got us thinking, what would it look like to have something GPT-3-shaped, but able to search the web?
This new self-supervised objective, next link prediction, is what we came up with. (It's got to be self-supervised so that you have basically infinite training data – that's what makes generative models so good.) Then it took us about 8 months of iterating on model architectures to get something that works well.
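To make the objective a bit more concrete, here's a toy Python sketch of how next-link-prediction training pairs could be built from a crawled page. The function and data are purely illustrative, not the actual pipeline:

```python
# Toy sketch of a "next link prediction" training objective: given the
# text preceding a hyperlink, the model learns to predict which link
# comes next. Everything here is illustrative, not the real pipeline.

def make_training_examples(page_text, links):
    """Split a page into (context, next_link) pairs.

    `links` is a list of (char_offset, url) tuples marking where each
    hyperlink appears in the page text.
    """
    examples = []
    for offset, url in links:
        context = page_text[:offset]      # everything before the link
        examples.append((context, url))   # model learns: context -> url
    return examples

page = "My two favorite blogs are SlateStarCodex and "
pairs = make_training_examples(page, [(len(page), "https://example.com/blog")])
```

At query time the same model runs in reverse, so to speak: you type the context yourself, and the model fills in the link.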
And now you all can play with it! Very excited to see what sorts of interesting prompts you can come up with.
Yeah exactly. That's why you can't really do it with a language model like GPT-3, you have to bake into the architecture the concept of a "link" as a first-class object.
This is interesting! I wonder how different the results are from just indexing the contents of the page and semantically searching them (vs. trying to predict the next link). Have you tried anything like that?
Maybe an "unfiltered" machine learning model trained on real-world user-generated content shows something different and "unexpected" compared to what the mainstream "approved" search engines would show you... Hmmm, who would have guessed it? And you can't even argue that it was gamed SEO showing you these results.
Alternately, disinformation that is often shared with that sort of phrasing will be surfaced by that sort of phrasing.
People showing actual sources rarely say "the real truth" because it is implicit that no one source has all of "the real truth" _and_ the phrase is a dog whistle.
It's not a "dog whistle". By that same logic, both "dog whistle" and "disinformation" are dog whistles too. People who distrust the MSM tend to use phrases such as "real truth", and people who read and repeat the MSM tend to use phrases like "dog whistle" and "disinformation".
This isn't a flaw in the search engine either. If you use the phrase "real truth" in your search you are likely distrustful of what has been presented as truth. It's similar to how I use negative words in searches to try to find criticism of various subjects and products.
You're hitting on a big question about the role of search engines.
Should a search engine surface the information a user most wants to hear? Or the most objectively accurate information?
Yes, yes, "objective" can be disputed in some cases, but not all. The COVID vaccines do not, in objective fact, have 5G microchips in them. They just don't.
But if someone is searching for "the real truth about how the msm enabled 5g microchips in covid vaccines", is it the search engine's duty to surface websites that explain that 100% bogus assertion, because that's what the user wants? Or should a search engine return sites that debunk the theory?
Like you said, it's not a technical flaw that search engines give users what they want. But is it a mission flaw?
If a tool that is supposed to be helping you find information gives you wrong information (e.g. sends you to a link claiming that COVID vaccines have 5G chips), then I'd say that is a failure of the tool.
I tend to agree. But if I want to find wrong information and the tool doesn’t give me what I want, isn’t that also a failure of the tool? At least in the “product does not meet user expectations” sense of failure?
I very much agree. But from a UX point of view, that's a failure. This is a case where the user's desire (find my evidence of something I believe) is not aligned with what IMO should be a search engine's mission (surface the most relevant and accurate information for a query).
One search string that really illustrated the problems with modern-day Google for me is "best things to do in hawaii". Try it and see what I mean. It's just link after link of blogspam. You get extremely long pages filled with ads and generic stock photos of Hawaii, but which are bereft of any actual content. I just want a single person's account of how they went to Hawaii and what they liked/didn't like, but it's impossible to find, even though I'm sure it's out there on the internet somehow.
The best thing to google if you want an answer to this question is something like "reddit best thing to do in hawaii" which gets you actual accounts from actual real people who actually went to Hawaii and have interesting things to say about it.
I tried this with metaphor.systems as well, using their prompting language - "My favorite place to go in Hawaii is:". Unfortunately, I still didn't get great results, though some of them showed some promise.
With these kinds of generative systems, the way you prompt it is often really important. You have to kind of talk as though you've stumbled upon someone talking about the thing you're looking for. Not quite sure what you were looking for, but I seemed to have better luck with a prompt that looked like this:
"I'm thinking about traveling to Hawaii next month. Does anyone have recommendations for what to do?
I just came back from Hawaii and loved it. If you're going you have to check this out"
I believe that is not a problem with Google, or at least it would be the same problem for any large search engine. There is money to be made when your search result ranks high, thus a lot of useless affiliate marketers do their absolute best to rank high without providing any value. I don't believe Google is interested in promoting those useless articles, but it's a cat-and-mouse game.
By the way, most non-technical people have never heard of Reddit.
I’m not sure exactly what you mean. If it would be a problem for all equivalently sized search engines, that just means it’s part of the problem space. The onus is on Google or any other search engine provider to weed out low quality sites.
I've been using Metaphor for a few weeks now and have almost entirely switched from Google and other search engines. Keyword based search simply doesn't come close when it comes to getting the _right_ results. While I have to sift through a few pages of results on Google and then maybe find what I'm looking for, on metaphor, there's almost no SEO spam or Wikipedia-style links dominating the top results. It directs you to sources that are relevant to your search query. I don't know how they did this (probably a lot of very specific and targeted tricks), but Alex and team have created a marvelous product and I'm excited to see where this goes! Congrats on the launch!
I used to work on Google search but it was a long time ago so hopefully I am not too biased here.
I think it would really help the UI to have better snippets, i.e. the text that appears below the blue link in a set of search results. In Google search results the key words are often bolded, as well. It helps you skim through and see which of the results are going to be a good fit.
Maybe there is some fancy AI thing you can do to generate snippets, or tell me more about the page. For example one of the search results for your sample query is:
Online resources in philosophy and ethics
sophia-project.org/
That doesn't really tell me anything without clicking on it. Is it good? I don't know... I usually don't click on that many results from a Google search, people often decide after only selecting one or two, based on the snippet.
How will you afford to keep the search engine up to date without expensive retraining of the entire model? My understanding is that fine-tuning will not result in the same accuracy as a full retrain.
Surely expanding the index is not the only sort of change that needs to occur over time, though. Like for the example "My two favorite blogs are SlateStarCodex and", the model not only needs to have an up-to-date list of blog URLs in the index, it also needs to have an up-to-date understanding of what SlateStarCodex is. If SlateStarCodex changes to AstralCodexTen after the model has been trained, does that prompt still work?
EDIT: It looks like the answer is "no". Substituting in AstralCodexTen gives a bunch of weird occult and esoteric blogs, not rationalist blogs. These are the top results:
Or maybe it's because the new name is less relevant and not used as much anyway? If you want to hardcode in these updated names and make judgements about what is REALLY relevant, you should stay on Google.
Going from the supposedly curated examples, the Wikipedia page for the "most Jackson Pollock-like", the "most Dalai Lama-like" and the "most Elon Musk-like" figure from the 2nd century is Secundus the Silent.
Given that his name is Secundus and his Wikipedia short blurb mentions twice that he lived in the 2nd century AD, I think your AI has decided that he is just the most 2nd century figure.
Congrats on launching! I found myself using this more than I expected in the closed beta. I used it most for opinionated prompts (e.g. "the PG essay I gave my parents to help them understand startups was..."), but also had some luck with finding content by its description (e.g. "I really like the intuitive explanation of [college math topic] at ...").
Apart from the gains of the civil rights movement, other factors have been at play, such as African American communities suffering from very high imprisonment rates among young males.
This worked really well when I tried using it to search for papers. I didn't record the details, but it was something related to converting a mesh to a boundary representation.
Someone else posted a link to their for-profit search engine. I find having to log in to use the product a bit disturbing. What if I don't want my data collected?
When I read the title, I expected a search engine that finds metaphors based on my text input. Too sad that there still isn't anything like this :(
What a bizarre complaint. Are you under the impression that normal search engines only return "accurate" results? Why would an "AI for search engines" magically gin up a consistent and reliable way of determining accuracy?
My complaint is not about search engines in particular. I agree the quality of search engines is lacking across the board.
That said, with a traditional search engine, you can poke into its algorithm and find out what made it decide that a spam link was more relevant than a useful link. You can examine what makes your implementation more or less accurate, and devise potential changes that would make it more accurate.
With the current state of AI, you're crossing your fingers and hoping for the best. You're combining several black boxes that have been trained on arbitrary subsets of internet data, and hoping that they've inferred something from that training data about the current shape of the internet that leads to useful results.
From what I understand from the demo on the website, it's not a large language model.
Here's how I think it works:
They are probably using a diffusion model, conditioned on the input prompt, to organize the space of links.
Search engines in the deep learning era usually embed responses (here, links) and queries (here, the text prompt) in some joint space.
And to get the response, they usually do an approximate nearest-neighbor search.
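The joint-embedding retrieval described here can be sketched in a few lines of Python. The 2-d vectors and the brute-force cosine scan are toy stand-ins for learned embeddings and a real approximate-nearest-neighbor index:

```python
import math

# Toy stand-in for joint-embedding retrieval: queries and links live in
# one vector space, and search is a nearest-neighbor lookup. The 2-d
# vectors are made up; a real system would use learned embeddings and
# an approximate index instead of this brute-force scan.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

link_vecs = {
    "https://site-a.example": [0.9, 0.1],
    "https://site-b.example": [0.1, 0.9],
}
query_vec = [0.8, 0.2]

# the best link is the one whose embedding is closest to the query's
best = max(link_vecs, key=lambda url: cosine(link_vecs[url], query_vec))
```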
Here they probably replace this nearest-neighbor search with a diffusion process.
This is akin to building a learned index. The diffusion process is an iterative process that progressively gets you links closer to your query; it's kind of a learned hierarchical navigable small world.
Because you need your response to be an existing link at the end of the diffusion process, you must project to the discrete space of existing links. There are two schools of thought here: if you did your diffusion in a continuous space, you can do an approximate nearest-neighbor search in the surrounding buckets to do this projection. Alternatively, you can stay in discrete space and do your diffusion along the edges of a graph, something akin to training your network to play Wikipedia speedrun, but on the whole internet.
But diffusion models can be more powerful if you don't embed queries and links in the same space (you can still do that, but you can do something more powerful).
The problem with embedding in the same space is that the embedding process defines what a relevant answer is, instead of learning relevancy from the data.
With a diffusion generative model, one thing you can do instead, to build your database, is: for each link, read the associated page and use GPT-3 to generate n queries that would be appropriate for that document (or portion of a document). Then you use the diffusion model to learn the query-to-link mapping from these generated (query, link) pairs.
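That data-generation step might look something like this in Python, where `generate_queries` is a hypothetical stand-in for a GPT-3 call (not a real API):

```python
# Sketch of the synthetic-data idea: for each link, generate n queries
# that would plausibly lead to its page, and train on the resulting
# (query, link) pairs. `generate_queries` is a hypothetical stand-in
# for a language-model call.

def generate_queries(page_text, n=3):
    # A real system would prompt GPT-3 with the page contents and ask
    # for n plausible search queries; this stub just fakes it.
    return [f"query {i} about: {page_text[:30]}" for i in range(n)]

def build_training_pairs(corpus):
    """`corpus` maps link -> page text."""
    pairs = []
    for link, text in corpus.items():
        for query in generate_queries(text):
            pairs.append((query, link))   # train: query -> link
    return pairs

pairs = build_training_pairs({"https://example.com": "An example page"})
```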
Diffusion models solve the mode collapse problem, i.e. one query can have multiple different responses, weighted by how often they appear in the training data. That makes them a natural candidate for building a search engine.
Let's take the wiki-speedrun diffusion approach, which is easier to understand.
At query time: you embed the input prompt into a vector.
You start from the Wikipedia home page. You use your diffusion network (which takes as input the current page you are on and the input prompt vector) to predict which link to follow (or whether you should stop because you have arrived). That takes you to a new current page, and you use your diffusion network again to pick the next link to follow. After doing this n times (~20?), you have arrived at the relevant page.
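A toy version of that query-time walk, with a trivial keyword-overlap scorer standing in for the learned network and a three-page graph standing in for the web:

```python
# Toy version of the "wiki speedrun" walk: starting from a home page,
# repeatedly ask a (here, trivially scored) model which outgoing link
# to follow until it decides to stop. The graph and scoring function
# are made up for illustration; a real system would use a learned net.

graph = {
    "home": ["philosophy", "sports"],
    "philosophy": ["ethics", "home"],
    "ethics": [],
}

def score(page, query):
    # Stand-in for the learned network: count query words that appear
    # in the page name.
    return sum(word in page for word in query.split())

def walk(start, query, max_steps=20):
    page = start
    for _ in range(max_steps):
        candidates = graph.get(page, [])
        if not candidates:
            break  # no outgoing links: stop here
        nxt = max(candidates, key=lambda p: score(p, query))
        if score(nxt, query) < score(page, query):
            break  # model "decides to stop": no better neighbor
        page = nxt
    return page

result = walk("home", "ethics in philosophy")
```

The walk goes home -> philosophy -> ethics and stops once the target page has no better neighbor to move to.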
Concerning the signals you can use: you have a lot of design freedom. You can basically add any information that you can embed as input to the diffusion network. One such signal could be, for example, a context vector that represents your previous queries.
Similarly, when organizing the link space, you have a lot of freedom in defining what counts as a relevant query. You can score it semantically (if you don't have a lot of real user signal), or, if you already have plenty of users making plenty of queries, you can learn the mapping from (successful user query, link) pairs.