
OP is missing that a correct implementation of Databricks or Snowflake will have those instances running inside the same AWS region as the data. That's not to say R2 isn't an amazing product, but the egress costs aren't as egregious as claimed, since egress is $0 on both sides.



Author here. It's true that transfers within a region are free, and if you design your system appropriately you can take advantage of that, but I've seen accidental cases where someone tries to access data from another region, and it's nice to not even have to worry about it. Even that can be handled with better tooling/processes, but the bigger point is wanting your data to be available across clouds to take advantage of their different capabilities. I used AI as an example, but imagine you have all your data in S3 and want to use Azure because of the OpenAI partnership. That's the use case R2 enables.
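(To make the cross-cloud point concrete: R2 exposes an S3-compatible endpoint, so roughly the same client code can read a bucket from compute in any cloud without egress fees. A minimal sketch with boto3; the account ID, bucket name, and credentials are placeholders, and the endpoint format is the one Cloudflare documents for R2's S3 API.)

    import boto3

    # R2 speaks the S3 API, so this same client code can run on AWS, Azure, or GCP
    # compute. Account ID, bucket, and credentials below are placeholders.
    r2 = boto3.client(
        "s3",
        endpoint_url="https://<ACCOUNT_ID>.r2.cloudflarestorage.com",
        aws_access_key_id="<R2_ACCESS_KEY_ID>",
        aws_secret_access_key="<R2_SECRET_ACCESS_KEY>",
        region_name="auto",
    )

    # Reading the object incurs no egress fee regardless of which cloud this runs in.
    obj = r2.get_object(Bucket="my-data-lake", Key="events/part-0000.parquet")
    data = obj["Body"].read()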


Yeah, for greenfield work building on R2 is generally a far better deal than S3, but if you already have a massive amount of data on S3, especially if it's small files, you're going to pay a massive penalty to move it. Sippy is nice, but it just spreads the pain over time.


> Sippy is nice but it just spreads the pain over time.

That egress money was going to be spent with or without Sippy. It's not "just spreading" the pain; it's avoiding adding any pain at all.


I could be mistaken, but I believe AWS would still charge for one direction of the traffic between S3 and a Databricks/Snowflake instance/cluster.


AWS S3 egress charges are $0.00 when the destination is within AWS in the same region. When you set up your Databricks or Snowflake accounts, you need to correctly specify the same region as your S3 bucket(s); otherwise you'll pay egress.
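(One way to avoid that mistake is to confirm the bucket's region programmatically before picking the warehouse region. A minimal sketch with boto3; the bucket name is a placeholder.)

    import boto3

    # Confirm where the bucket actually lives before choosing the Databricks/Snowflake
    # region. get_bucket_location returns None for us-east-1 (legacy behavior),
    # hence the fallback.
    s3 = boto3.client("s3")
    location = s3.get_bucket_location(Bucket="my-data-lake")["LocationConstraint"]
    region = location or "us-east-1"
    print(f"Bucket is in {region}; deploy the compute in the same region to keep egress at $0.")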



