
I'm unfamiliar with the parquet format and trying to understand - are you storing the raw scraped data in that format or are you storing the result of parsing the scraped data?



We store the parsed result of the scrape as Parquet. I would advise storing the raw data as well, in a separate S3 bucket. The database should hold only the data it needs and not act as general storage.



