
We[1] create "Units of Thought" from PDFs and then work with those for further discovery, where a "Unit of Thought" is any paragraph, title, or note heading: something that stands on its own semantically. We then build a hierarchy of objects from that PDF in the database for search and conceptual search, all at scale.

[1] https://graphmetrix.com/trinpod-server https://trinapp.com




I'm tempted to try it. My use case right now is a set of documents which are annual financial and statutory disclosures of a large institution. Every year they are formatted and organized slightly differently, which makes it enormously tedious to manually find and compare the same basic section from one year to another. They are consistent enough, though, that analogous sections from different years can be recognized, because they often reuse verbatim quotes or highly specific keywords each time.

What I really want to do is take all these docs and just reorder all the content such that I can look at page n (or section whatever) scrolling down and compare it between different years by scrolling horizontally. Ideally with changes from one year to the next highlighted.
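To make that concrete, here is a rough sketch of what I mean, assuming the PDFs have already been extracted to plain text with blank lines between sections. It uses only Python's standard `difflib`; the helper names (`split_sections`, `best_match`, `align`, `word_diff`) and the 0.5 similarity threshold are my own placeholders, not anything from your product.

```python
import difflib

def split_sections(text):
    """Split plain text into sections separated by blank lines."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def best_match(section, candidates, threshold=0.5):
    """Return the candidate most similar to `section`, or None if nothing
    reaches `threshold` (an arbitrary cutoff chosen for this sketch)."""
    best, best_score = None, threshold
    for cand in candidates:
        score = difflib.SequenceMatcher(None, section, cand).ratio()
        if score >= best_score:
            best, best_score = cand, score
    return best

def align(year_a, year_b):
    """Pair each section of year_a with its closest analogue in year_b."""
    secs_b = split_sections(year_b)
    return [(s, best_match(s, secs_b)) for s in split_sections(year_a)]

def word_diff(old, new):
    """Mark removed words as [-word-] and added words as {+word+}."""
    a, b = old.split(), new.split()
    out = []
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if op == "equal":
            out.extend(a[i1:i2])
        else:
            out.extend(f"[-{w}-]" for w in a[i1:i2])
            out.extend(f"{{+{w}+}}" for w in b[j1:j2])
    return " ".join(out)
```

Running `align` on two years and then `word_diff` on each matched pair would give the side-by-side view with changes marked. Character-level similarity is crude and quadratic in the number of sections, so this is only meant to illustrate the idea.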

Can your product do this?


Probably without too much difficulty. If you have a sample to confirm, that would be great. frederick @ graphmetrix . com



