brianhempel on Sept 29, 2014 | on: Postgres full text search is good enough

For full text search, the only global frequency information Postgres uses is the stop-word list. It does not do TF-IDF ranking. [1]

For example, if you search for "Bob Peterson", Postgres will rank these two documents the same:

"I saw Bob."

"I saw Peterson."

In contrast, an IDF-aware search would notice that "Peterson" occurs in fewer documents than "Bob" and score "I saw Peterson" higher for that reason.
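To make the difference concrete, here is a minimal sketch in Python. The four-document corpus, the tokenizer, and the function names are all made up for illustration, and `match_count_score` is only a stand-in for ranking without global frequency information; it is not the formula Postgres actually uses in its ranking functions.

```python
import math

# Hypothetical corpus: "bob" appears in three documents, "peterson" in one.
corpus = [
    "I saw Bob.",
    "I saw Peterson.",
    "Bob went home.",
    "Bob called me.",
]

def tokenize(text):
    """Lowercase and strip trailing punctuation (a crude stand-in for a real parser)."""
    return [w.strip(".,!?").lower() for w in text.split()]

docs = [tokenize(d) for d in corpus]

def idf(term):
    """Inverse document frequency: log(N / number of documents containing the term)."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df) if df else 0.0

def match_count_score(query, doc):
    """Score using only per-document term counts, with no corpus-wide statistics."""
    return sum(doc.count(t) for t in query)

def tf_idf_score(query, doc):
    """Score each matching term by its count in the document times its IDF."""
    return sum(doc.count(t) * idf(t) for t in query)

query = tokenize("Bob Peterson")

# Compare the two documents from the example above.
flat = [match_count_score(query, d) for d in docs[:2]]
weighted = [tf_idf_score(query, d) for d in docs[:2]]

print("match count:", flat)      # both documents tie
print("tf-idf:     ", weighted)  # "I saw Peterson." scores higher
```

With this corpus the match-count scores tie, while the IDF-weighted score puts "I saw Peterson." ahead because "peterson" occurs in only one of the four documents.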

[1] http://en.wikipedia.org/wiki/Tf%E2%80%93idf

[2] http://stackoverflow.com/questions/18296444/does-postgresql-...

rorydh on Sept 29, 2014

TF–IDF ranking doesn't seem to be too complex a thing to implement. Maybe this is an opportunity for someone here to contribute to the open source project.

jeltz on Sept 29, 2014

The concurrency aspects of this seem a bit tricky. How do we ensure that a bloated index does not screw our results too much?
