> Who chooses the white list, and why should I trust them? Is it democratically chosen?

You could have user-compiled lists of sites to show in search results.

Let the users pick the lists they want to see, and communities can create and distribute lists within themselves.
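
As a rough sketch of what that could look like, assuming each community just publishes a plain text file of domains and users merge the lists they subscribe to (the list URLs below are made up):

    # Hypothetical sketch: each community list is a plain text file with one
    # domain per line; a user merges the lists they subscribe to.
    from urllib.request import urlopen

    SUBSCRIBED_LISTS = [
        "https://example.org/lists/science.txt",          # made-up URLs
        "https://example.org/lists/retro-computing.txt",
    ]

    def load_allowlist(urls):
        domains = set()
        for url in urls:
            with urlopen(url) as resp:
                for line in resp.read().decode("utf-8").splitlines():
                    line = line.strip()
                    if line and not line.startswith("#"):
                        domains.add(line.lower())
        return domains

    allowlist = load_allowlist(SUBSCRIBED_LISTS)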




Great idea, but why build a search engine at all in this case? You can use DDG + your filter and see only the results from your whitelist.

Could easily be implemented for any current search engine.

To a large extent, this is what you already do when you view a page of search results: you filter them based on your understanding of which sites and results hold value.
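
Roughly, that post-filtering step could be as simple as something like this, assuming you already have a list of result URLs out of whichever engine you use (how you fetch them varies per engine and is left out):

    # Hypothetical post-filter: keep only results whose host is on the allowlist.
    from urllib.parse import urlparse

    def on_allowlist(url, allowlist):
        host = urlparse(url).hostname or ""
        # Accept the listed domain itself or any subdomain of it.
        return any(host == d or host.endswith("." + d) for d in allowlist)

    def filter_results(results, allowlist):
        return [r for r in results if on_allowlist(r["url"], allowlist)]

    # Example with made-up results:
    results = [
        {"url": "https://blog.example.org/post", "title": "A post"},
        {"url": "https://ads.example.com/landing", "title": "An ad"},
    ]
    print(filter_results(results, {"example.org"}))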


> why build a search engine at all in this case?

On a public scale, you could make an argument for tighter integration/better privacy with the lists. For example:

    Browser -----Request-to-SE-----> Search Engine
      ^                                   |
      |                      Unfiltered Results (In YAML/JSON)
      |                                   |
      |                                   V
      |--Desired Results------ Local Filtering/Rendering

On a private scale, if you are only crawling sites on the allow list, then you have a much better chance of maintaining a local database of the sites that show up in the search results.
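
As a toy sketch of what that local database could be, here is a minimal in-memory inverted index; the pages would come from whatever crawl of the allow-listed sites you run (the URLs and text below are made up):

    # Hypothetical sketch: a tiny in-memory inverted index (term -> URLs).
    from collections import defaultdict

    index = defaultdict(set)

    def add_page(url, text):
        for term in text.lower().split():
            index[term].add(url)

    def search(query):
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(index[terms[0]])
        for term in terms[1:]:
            hits &= index[term]          # keep pages containing every term
        return hits

    # Example with made-up pages from allow-listed sites:
    add_page("https://example.org/a", "static site generators compared")
    add_page("https://example.org/b", "search engines and crawlers compared")
    print(search("compared crawlers"))   # -> {'https://example.org/b'}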

Edit: This could possibly make it easier to set up distributed search as well, since each node could index a given list and then distribute it similarly to DNS. I don't really know how well that would work though, just an idea.
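
One very rough way that assignment could look, purely illustrative (the node names are made up, and a real system would want consistent hashing plus replication):

    # Hypothetical sketch: deterministically map each community list to the
    # node responsible for indexing it, so any client knows whom to query.
    import hashlib

    NODES = ["node-a.example", "node-b.example", "node-c.example"]  # made up

    def node_for_list(list_name, nodes=NODES):
        digest = hashlib.sha256(list_name.encode("utf-8")).digest()
        return nodes[int.from_bytes(digest[:4], "big") % len(nodes)]

    print(node_for_list("science"))
    print(node_for_list("retro-computing"))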


Aside from this, a big reason to build it is that it seems a lot simpler than writing a giant web crawler à la Google, and is thus a good target for an open-source solution. The lack of one is the biggest problem with DuckDuckGo.


Do you have thoughts on implementing the distributed search?

I'm thinking about playing around with this in my spare time, but that part seems the hardest to do.


Don't start from scratch; take a look at yacy (https://yacy.net/), which already does most of what is discussed here.


I would think everyone would run their own “crawler”, but maybe you could use a ledger and delegate sites to different workers. If you’re whitelisting sites, you could maybe only crawl a link or two deep.

It’d take a lot of cycles while the network is small, but if you get it growing you could even have sub-networks with their own whitelist additions (and every user has a blacklist).
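
A bare-bones sketch of that depth-limited crawl, restricted to allow-listed domains, using only the standard library; a real crawler would also need robots.txt handling, politeness delays, and better deduplication:

    # Hypothetical sketch: crawl allow-listed sites at most `max_depth` links deep.
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    self.links.append(href)

    def allowed(url, allowlist):
        host = urlparse(url).hostname or ""
        return any(host == d or host.endswith("." + d) for d in allowlist)

    def crawl(seeds, allowlist, max_depth=2):
        seen, frontier = set(), [(u, 0) for u in seeds]
        while frontier:
            url, depth = frontier.pop()
            if url in seen or depth > max_depth or not allowed(url, allowlist):
                continue
            seen.add(url)
            try:
                html = urlopen(url, timeout=10).read().decode("utf-8", "replace")
            except (OSError, ValueError):
                continue                  # skip unreachable or malformed URLs
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                frontier.append((urljoin(url, href), depth + 1))
        return seen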


That’s what directory sites offered once upon a time. It was a pretty good way to discover new content back then. I spent a lot of time on dmoz when I wanted to find information about various topics.



