jfbaro · on Sept 14, 2023

Congratulations to all PG community!

jfbaro · on June 21, 2023

Can this be used in a way similar to the “supercomputers” proposed for Haskell?

jfbaro · on May 31, 2023

CoW? So devs can have their ephemeral databases in seconds at no cost? Only the reads and writes they use and the “delta” storage of their thin clone?

Bitemporal support?

Smart anonymization (like Tonic.AI)?

Looking forward to hearing the announcements

jfbaro · on March 15, 2023

Even if a GPT-X is able to take my description of a system and, because it understands the industry lingo and practices, create extremely optimized Rust code, why would we need systems like apps and webapps (like internet banking) in a world where we all have access to GPT-X?

It would be like programming a KUKA robot to manufacture typewriters in 2023.

What will "computer systems" look like?

jfbaro · on Oct 14, 2022

Congrats PG team and community

jfbaro · on Sept 7, 2022

Interesting. I am personally more interested in precise and diverse health sensors that can measure my health in real time.

jfbaro · on July 11, 2022

I have seen this presentation from a company that used PG Full Text search for a pretty complex use case. Interesting -> http://matheusoliveira.s3-website-us-east-1.amazonaws.com/pr... (updated thanks to ddevault)

ddevault · on July 11, 2022

Please don't use URL shorteners.

http://matheusoliveira.s3-website-us-east-1.amazonaws.com/pr...

jfbaro · on July 11, 2022

Updated! Thanks

RowanH · on July 11, 2022

Good link, thanks. Have to say we're moving to pgSearch after implementing SOLR alongside a PG/Rails backend. Makes the stack simpler, fewer components to worry about (fewer headaches with gem versioning dependencies for SOLR). Lot to be said for it after reading through what's available now...

sandGorgon · on July 11, 2022

are you planning to do relevance ? this is one of the blockers for us - TFIDF vs BM25 or something.

From what i know - PG fulltext search does not implement relevance.

mamcx · on July 11, 2022

An alternative is to just layer the FTS on top of vanilla SQL to get the extra stuff (this is what I do for my eCommerce backend), so it is pretty simple to have something like:

  SELECT ..
    -- Get the fts 
    IN ( FTS QUERY)
  ORDER BY
     -- The relevance is hardcoded? Maybe in another table that stores the rankings?
     (
      Products, 
      Inventory,
      Invoices,..
      )

I found it is much easier and more predictable if I code the "rankings" based on the business logic instead of letting the FTS engine guess them. You can store that stuff as normal columns or use the "sources" (ie: products, inventory) as ways to know what could be more important to pull first.

This has the nice property that our search results are very good and, even better, never return nonsensical stuff (like searching for an apple in the store and getting a blog post)!
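
A minimal sketch of the "rank by source" idea above, written in Python rather than SQL so it can be run standalone. The `SOURCE_PRIORITY` table, the `rank_matches` helper, and the sample rows are all hypothetical; the `matches` list stands in for whatever rows the FTS query has already returned.

```python
# Hypothetical, hardcoded business ranking: lower number = shown first.
SOURCE_PRIORITY = {"products": 0, "inventory": 1, "invoices": 2}


def rank_matches(matches):
    """Order rows that already matched the FTS query by source priority.

    Ties within the same source are broken alphabetically by name, so the
    ordering is fully deterministic and independent of any FTS score.
    """
    return sorted(matches, key=lambda m: (SOURCE_PRIORITY[m["source"]], m["name"]))


# Stand-in for the rows returned by the FTS query for "apple".
matches = [
    {"source": "invoices", "name": "Invoice 1001: apple crate"},
    {"source": "products", "name": "Apple, Granny Smith"},
    {"source": "inventory", "name": "Apple stock, warehouse A"},
]

ranked = rank_matches(matches)
for row in ranked:
    print(row["source"], "|", row["name"])
```

The FTS engine only decides what matches; the order comes entirely from the priority table, which is why a source that is not listed (a blog post, say) would have to be added explicitly before it could ever show up.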

sandGorgon · on July 12, 2022

i think we are talking about slightly different things. it doesn't matter whether the variables are your business logic or some other variables.

Given a certain variable, tf-idf/bm25 will order by relevance and not by match. So it answers the question, what if the name match was off by two characters and the inventory number is less than 200.

tf-idf does not tell you what to order by...but it takes care of all the edge cases of ordering.

now if ur not using text match anywhere and only using business variables...then this entire thread is not for u. But FTS and lucene attack full text search primarily, and that's where the relevance vs ordering discussion comes from

rocmcd · on July 11, 2022

This has been my understanding of the state of Postgres full-text search. It's great if your search requirements are fairly vanilla, but I haven't seen any solutions for more advanced search needs, such as boosting, relevance, scoring, etc.

SahAssar · on July 12, 2022

Why does ts_rank not work for you? https://www.postgresql.org/docs/current/textsearch-controls....

sandGorgon · on July 13, 2022

ts_rank functions use the term frequency within that document, not a global term frequency (which is why u need a separate index like what elasticsearch does).

this is important, cos if a word is too common, it's considered less significant for a document match. When we calculate IDF, it will be very low for the most frequently occurring words such as stop words (“is” is present in almost all of the documents, and tf-idf will give a very low value to that word).

there's someone who implemented this, it's pretty cool. but definitely performance takes a hit versus a separate elasticsearch cluster. https://codebots.com/crud/How-to-efficiently-search-text-usi...
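
To make the IDF point above concrete, here is a minimal textbook TF-IDF sketch in plain Python. It is not how ts_rank or Elasticsearch compute scores; the `tf_idf_scores` function, the whitespace tokenizer, and the three sample documents are assumptions made up for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query_terms):
    """Score each document against query_terms with a plain TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Global document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if df[term] == 0:
                continue  # term appears nowhere in the corpus
            # IDF is log(N / df): zero when the term is in every document.
            idf = math.log(n_docs / df[term])
            score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


docs = [
    "postgres is a database",
    "lucene is a search library",
    "bm25 is a ranking function for search",
]
scores = tf_idf_scores(docs, ["is", "search"])
print(scores)
```

Because "is" occurs in every document, its IDF is log(3/3) = 0 and it adds nothing to any score; only "search", which occurs in two of the three documents, separates them. Computing `df` needs a pass over the whole corpus, which is the global statistic a per-document term frequency does not have.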

jfbaro · on July 7, 2022

Is this the start of a future where we can write high-level code (Idris, Agda, Coq) and the resulting code will run as fast (and as safely) as Rust? Interesting.

jfbaro · on June 15, 2022

1. Shorter cold starts 2. Secure environment 3. Heavily tested in production

These are good PROS of Isolate for Serverless Computing, IMHO.

jfbaro · on May 23, 2022

Great list. As I can add anything here, I will say:

- Bitemporal support OOTB (storage would be more expensive, as temporal data needs more disk space)

- CoW capabilities OOTB, so it would be super easy (fast and cheap) to create ephemeral databases for development purposes.

- Charge per request (ms of reads, ms of writes) - for the sake of being more specific about serverless.

- AI capabilities that detect how the database is used and suggest indexes or other tweaks to make the database as fast (and cheap) as possible, even if the schema changes, the database size increases or query patterns change

- PostgreSQL support (and all its extensions... I know that's a hard one as PS is based on MySQL)

- OOTB capabilities for Masking and/or anonymizing of data (PCI, PII, etc)

Thanks