Hacker News

As I’m professionally working on a niche search engine, let me offer this: it’s a notoriously hard problem that seems simple at first, but requires catering to a bazillion different edge cases; every optimisation you do makes another case worse.

Having said all that: I also hate how shitty search almost everywhere is. It’s hard, but not that hard.



I’d be happy if it catered to exactly one edge case: “Show me all emails that contain this word”


…which is the problem I was referring to: by optimising for that—your—use case, those of other people will invariably suffer.

We only have a single text field as the input; how are we supposed to guess whether you want to find an exact match of the phrase, a fuzzy match, at least one of the words provided, or any other possible variation? Also, are you interested in the content, the subject, the recipient, the sender address you used, a header field, an attachment, what have you? Do you want them ranked by the frequency of the word, or by its position from the start of the text? Should occurrences in quoted passages of earlier mails further down the thread be counted multiple times? What if it’s a stop word?
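To make the ambiguity concrete, here is a minimal sketch of four ways to read the same two-word query. Everything in it is made up for illustration: the sample mails, the function names, and the fuzzy cutoff of 0.75 are assumptions, and a real engine would use an index rather than scanning every mail.

```python
import difflib
import re

def tokenize(text):
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())

def exact_phrase(query, text):
    # The query words must appear consecutively, in order.
    q, t = tokenize(query), tokenize(text)
    n = len(q)
    return n > 0 and any(t[i:i + n] == q for i in range(len(t) - n + 1))

def all_words(query, text):
    # Every query word must appear somewhere in the text.
    return set(tokenize(query)) <= set(tokenize(text))

def any_word(query, text):
    # At least one query word must appear in the text.
    return bool(set(tokenize(query)) & set(tokenize(text)))

def fuzzy(query, text, cutoff=0.75):
    # Every query word needs a similar-looking word in the text.
    # The cutoff is an arbitrary choice for this sketch.
    words = tokenize(text)
    return all(
        difflib.get_close_matches(w, words, n=1, cutoff=cutoff)
        for w in tokenize(query)
    )

# Hypothetical mail bodies, keyed by a made-up id.
EMAILS = {
    "a": "Please review the quarterly report before Friday.",
    "b": "The report for this quarter is attached.",
    "c": "Quartely reprot attached, sorry for the typos.",
    "d": "Report on quarterly sales figures.",
    "e": "Lunch on Friday?",
}

def search(query, matcher):
    return sorted(k for k, body in EMAILS.items() if matcher(query, body))

for matcher in (exact_phrase, all_words, any_word, fuzzy):
    print(matcher.__name__, search("quarterly report", matcher))
```

Each reading returns a different set of mails for the same input, and the single text field gives no hint which one was meant.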

There are of course sensible ranking solutions and heuristics for these questions. I just want to highlight that it’s not as trivial as it first sounds. Most mail clients probably don’t ship with a Lucene index, though they should.


You could always... you know... ask?

I use Thunderbird and it's approximately 100x better at searching for emails than Excel. I just tell it if I'm looking in the subject, in the body, in the sender, whether it's fuzzy, etc., and then it pulls up the emails.
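As a rough sketch of what "just ask" could look like in code: the caller names the field, so nothing has to be guessed. This is not how Thunderbird implements it; the `Email` class, the `search` function, and the sample inbox are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Email:
    sender: str
    subject: str
    body: str

SEARCHABLE_FIELDS = ("sender", "subject", "body")

def search(emails, term, field="body"):
    # The user states which field to look in; no guessing involved.
    if field not in SEARCHABLE_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    needle = term.lower()
    # Plain case-insensitive substring match on the chosen field.
    return [e for e in emails if needle in getattr(e, field).lower()]

# Made-up sample data.
INBOX = [
    Email("alice@example.com", "Invoice for March", "Please find the invoice attached."),
    Email("bob@example.com", "Lunch?", "Alice says the invoice can wait."),
    Email("invoice-bot@example.com", "Reminder", "Your payment is due."),
]

print([e.subject for e in search(INBOX, "invoice", field="subject")])
print([e.subject for e in search(INBOX, "invoice", field="body")])
print([e.subject for e in search(INBOX, "invoice", field="sender")])
```

The same word gives three different result lists depending on the field, which is exactly the information a lone search box never gets.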

Whereas Excel doesn't ask shit and, in return, doesn't have a working search.


Outlook on the other hand has an extremely powerful search


Having only a single box is a fully self-imposed leg wound



