Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

How do you crawl or backfill the existing data?

Like if you wanted to implement a search engine or recommendation system it’s not clear to me how you retrieve a training set.




There is a page[0] in the Bluesky docs on this, though it effectively boils down to “fetch each user repository”.

[0] https://docs.bsky.app/docs/advanced-guides/backfill


This is a great question and I am looking for the answer as well.




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: