Dropbox hasn't dropped AWS; they've moved things off AWS where it made sense to. The article is talking about two things: the move of file storage and a network backbone. Neither was done recently.
The file storage move from S3 to Magic Pocket is detailed in the blog posts linked downthread.
Early SSL/TLS termination is there to reduce latency; the longer-lived connections from PoPs to Dropbox datacenters are over TLS 1.2 with PFS. See an earlier blog post[1]:
> We use TLS 1.2 and a PFS cipher suite at both our origin data centers and proxies. Additionally, we’ve enabled upstream certificate validation and certificate pinning on our proxy servers. This helps ensure that the edge proxy server knows it’s talking to our upstream server, and not someone attempting a man-in-the-middle attack.
(N.B.: I work on security at Dropbox, and consulted on this design)
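For anyone curious what upstream certificate validation plus pinning looks like mechanically, here is a minimal Python sketch of the general pattern (the hostname, CA file, and pin value are all invented for illustration; this is not Dropbox's actual proxy code):

```python
import hashlib
import socket
import ssl

ORIGIN_HOST = "origin.example.com"        # hypothetical upstream, not a real host
PINNED_SHA256 = "<expected-sha256-hex>"   # pin of the origin's DER-encoded cert

# PROTOCOL_TLS_CLIENT enables certificate validation and hostname checking.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.load_verify_locations("origin-ca.pem")  # trust only our own CA

with socket.create_connection((ORIGIN_HOST, 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname=ORIGIN_HOST) as tls:
        der_cert = tls.getpeercert(binary_form=True)
        if hashlib.sha256(der_cert).hexdigest() != PINNED_SHA256:
            # Validation passed but the pin didn't match: treat as a MITM.
            raise ssl.SSLError("certificate pin mismatch")
        # Safe to forward the user's request upstream from here.
```

Real edge proxies typically express the same policy in configuration (e.g. nginx's proxy_ssl_verify and proxy_ssl_trusted_certificate) rather than hand-rolled code.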
I'm not a lawyer, but I did work on one of the previous Transparency Reports[1]. From our most recent one:
> Between July and December 2016, Dropbox did not comply with any non-US government legal process unless issued by a US court as a result of the Mutual Legal Assistance Treaty process.
... if that helps answer what you're getting at :)
I don't know if they use SSL/TLS to their upstreams; I'm just saying terminating at the edge doesn't mean that's the end of all SSL/TLS. It is totally normal to terminate SSL/TLS at the edge; pretty much anyone using an HTTPS load balancer or CDN does it, but the LB or CDN can still use SSL/TLS to the upstreams and verify the upstreams' certificates.
The 'value' cloud services provide is purely for experimental services an organization doesn't want to commit physical assets to. Those deluded into believing cloud services provide any value beyond test service deployments are propaganda poster boys for today's tech sucker awards.
Surely, there are people out there who overpay for AWS resources. But the truly deluded are those who purport to understand the sheer breadth of organisations and their wildly different requirements and priorities well enough to brand them suckers.
You are extremely wrong. The Dropbox story shows that if the bulk of your value comes from selling a commodity (storage) then you need to improve your margins by moving away from a provider that also makes the bulk of their money selling the same commodity.
For companies where the value lies in the utility of a service that can't be easily replicated, you have pricing power to make the convenience of AWS worth the expense.
Pretty sure the value of Dropbox doesn't come from the storage. It comes from their software: the magic thing that replicates all your data seamlessly across as many computers and phones as you want, without the need to click any button or understand what the word storage even means.
Actually, since the servers themselves are physical assets with a useful life, and indeed could be leased independently from the business model during that useful life, they'd be modeled as depreciation not amortization. But yes, EBITDA would be increased either way.
Realistically, the valuation multiples on EBITDA for mature SaaS businesses are different from those for datacenter operators, and someone building a financial model for Dropbox would now likely take a blend of these into consideration. So EBITDA is by no means the end-all be-all here.
EDIT: And taxation is a different story altogether, as in either the cloud or own-hardware case, Dropbox can write off expenses or depreciation respectively.
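To make the EBITDA mechanics concrete, here is a toy model with entirely invented numbers (not Dropbox's actual financials):

```python
# Toy model (all figures invented) of how moving from cloud opex to owned
# hardware shifts cost out of EBITDA and into depreciation.
revenue = 1_000_000_000
other_opex = 700_000_000

# Cloud case: the whole bill is an operating expense, so it reduces EBITDA.
cloud_bill = 90_000_000
ebitda_cloud = revenue - other_opex - cloud_bill          # 210M

# Owned case: capex is depreciated over the hardware's useful life, and
# depreciation sits below the EBITDA line (the "D" in EBITDA).
hardware_capex = 240_000_000
useful_life_years = 4
annual_depreciation = hardware_capex / useful_life_years  # 60M/yr
ebitda_owned = revenue - other_opex                       # 300M

print(ebitda_cloud, ebitda_owned, annual_depreciation)
```

The owned-hardware case reports a higher EBITDA even before any actual cost savings, which is the parent comment's point about valuation optics.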
Good point. I always wonder how well such accounting changes reflect the true financial performance of the company. On one hand it's a lot of upfront cost and risk for Dropbox, but all of that won't show up fully if the cost is depreciated over the long run.
Nearly every company does that. It's amazing how little many companies actually own. I've worked in offices where everything from the building to the furniture to even the plants was leased. The company ran with close to zero assets (services business). Pretty sure this wouldn't make any sense if it weren't for tax purposes.
I don't think this article is news. The editorialized title on HN is unfortunate, and I wish a mod could change it. The article talks about the reasons Dropbox decided to move 500PB of storage into our own data centers. Also, as the article mentions, "They still use AWS for some workloads".
Yeah, and that's why it amazes me that Netflix uses Amazon, a direct competitor that takes the money it makes from Netflix and uses it to fund Amazon Prime.
Netflix is probably in a better position to do the math than random people on HN. I think they have done their homework and decided that it is overall cheaper to keep *some* of their infra on AWS.
Yes! Thank you! Every time a piece like this is posted, 'experts' arise here who know everything better. As if there's a bunch of idiots working at Netflix who are happily burning money...
I hear this line of thinking a lot but I don't really buy it.
You could just as easily say it's amazing that Amazon allows a competitor like Netflix to run on its platform. The reality is that the relationship is more complex than that.
Netflix is going to exist regardless of whether Amazon lets them run on AWS.
Amazon letting them run on AWS is brilliant as they get a piece of Netflix's pie. So even if Netflix beats Amazon Prime, Amazon gets to dip into their pot via revenues from AWS. It's a great hedging of bets.
They're totally separate businesses. The people that treat Netflix like their customer aren't the same people making strategic decisions about Amazon Prime. Amazon is a ridiculously large company.
Migrations and self-hosted infrastructure don't do themselves. Netflix can get quite a lot done on AWS for the $200k+ it would take to hire one single employee.
What's the largest cloud provider that does NOT offer a Netflix-like streaming video service? Amazon/AWS, Microsoft/Azure, and Google Cloud all offer premium streaming content of some type. IBM Cloud?
My point was that Netflix uses AWS exclusively instead of dividing its workloads between multiple providers. Although nowadays they probably get really deep discounts from Amazon to stay on AWS because of the competition, and it's probably a lot cheaper for them than building and maintaining their own servers.
Netflix built their own CDN ages ago. Given that their core business is basically static content delivery, what's left on AWS is probably a lot less than commenters in this thread seem to think.
AWS doesn't even let you do BGP. If you want to use them for a CDN, you're locked into their network and their blend of ridiculously overpriced bandwidth.
They might be using AWS for the canonical store of data, but the sanity of using "the cloud" goes out the window the second you need to ship a lot of traffic to the public internet.
While I don't know what pricing AWS has for high bandwidth customers since it's not public, I do know that the public prices are really high.
The thing is, even for smaller companies it's easy to get far better deals on bandwidth. I'm currently using a provider that runs OpenStack and I'm paying less than $0.009/GB from the first byte (no commitment). I'd say that is very cheap compared to AWS pricing at $0.05/GB above 350 TB.
With colocation, or renting dedicated servers and IP transit, you can get far lower prices. A dedicated 1 Gbps link is about 300 USD/month with my provider, and the price drops sharply for dedicated 10, 40 and 100 Gbps uplinks.
Never used OVH myself, but they offer 2 Gbps at £340/month for their highest-quality bandwidth, and £64/month for bandwidth they think is good enough for downloads.
I think using cloud providers for somewhat high-bandwidth services is basically throwing money out the window. Until prices really drop (if they ever do), I'll continue thinking that clouds are for temporary workloads or prototyping with the ability to easily scale at a high cost.
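A quick back-of-envelope using the per-GB rates quoted above (the 500 TB/month workload is hypothetical):

```python
# Monthly egress cost for a hypothetical 500 TB/month at the quoted rates.
gb_per_month = 500 * 1000

aws_rate = 0.05      # AWS public pricing above 350 TB, per the comment above
cheap_rate = 0.009   # the OpenStack provider quoted above

print(f"AWS:   ${gb_per_month * aws_rate:,.0f}/month")    # ~$25,000
print(f"Other: ${gb_per_month * cheap_rate:,.0f}/month")  # ~$4,500
```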
Below $10k/mo bandwidth plays are irrelevant - the cost of the support infrastructure is just too high.
At $10k/mo, what you do is plop a rack in one of the well-connected buildings in NYC, VA, SF, SJC, or CHI and get either flat-rate 10Gs or 1G commits over 10Gs. You should be paying between $0.55 and $0.75 per Mbit/s at the 95th percentile on a 1G commit. You put your edge nodes, the ones actually pushing the traffic out, into the same cabinet, and use AWS or GCP for your compute workloads.
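A rough sanity check of that pricing, taking $0.65/Mbps as the midpoint of the quoted range and assuming a perfectly flat traffic curve:

```python
# 1 Gbps commit billed at the 95th percentile, $0.65 per Mbps.
committed_mbps = 1_000
monthly_cost = committed_mbps * 0.65                     # $650/month

# A flat 1 Gbps sustained over a 30-day month moves roughly:
seconds = 30 * 24 * 3600
tb_moved = committed_mbps * 1e6 / 8 * seconds / 1e12     # ~324 TB

print(f"${monthly_cost:,.0f}/mo for ~{tb_moved:.0f} TB "
      f"= ${monthly_cost / (tb_moved * 1000):.4f}/GB")   # ~$0.002/GB
```

That effective ~$0.002/GB is why commit pricing looks so different from cloud egress rates.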
To be fair, Netflix only runs some of their infrastructure on AWS. Basically, if I understand correctly, their applications run on AWS and their delivery is through their own CDN built on top of various ISPs[1].
Correct, but "some" is probably an understatement. They pretty much run everything on AWS except what you cited (e.g. video processing and content delivery). Netflix is the biggest AWS user.
> We rely on the cloud for all of our scalable computing and storage needs — our business logic, distributed databases and big data processing/analytics, recommendations, transcoding, and hundreds of other functions that make up the Netflix application. Video is delivered through Netflix Open Connect, our content delivery network that is distributed globally to efficiently deliver our bits to members’ devices.
They have a completely different use case that benefits from being able to scale up and down quickly. I don't see them moving away from AWS anytime soon.
Please, how fucking stupid. Of course they moved as much as possible off AWS, and only use AWS for new and experimental services. The markup on cloud servers is immoral, and the brainwashing that cloud services are a good value at all is a testament to how many people simply do not do the math.
I am not sure what you are talking about. They moved the 500PB data store out because, *at that scale*, it is cheaper to run it on your own hardware. That implies a few things, like having engineers with skills that are really not easy to find, for starters, and a bunch of other challenges you need to solve.
What does AWS do when someone moves 500PB of storage off their systems? Does that capacity sit idle till some other big customer comes along, or is their growth so phenomenal that a 500PB move-off doesn't even slow new (storage) deployments and they just keep up their pace?
My guess (and it's a terrible guess) is this may be about 4 months of growth... and they might take the opportunity to retire older, lower-capacity drives while just keeping up the build-out.
That's a good point, but they didn't start with 500PB; when they first went to AWS they probably used only a few PB and grew "organically" over the years. Still, they likely had some contractual provisions for this kind of scenario.
12 TB disks are a pretty recent phenomenon; they weren't around 3 to 4 years ago when Dropbox did this exercise. Backblaze's numbers are a good source. So, adjusting your calculation, that should be a 3-month no-storage-buyout scenario.
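For a rough sense of the physical scale being discussed (raw counts only; replication or erasure-coding overhead would multiply these):

```python
# Raw spindle counts for 500 PB at different drive densities.
petabytes = 500
for tb_per_drive in (4, 8, 12):
    drives = petabytes * 1000 / tb_per_drive
    print(f"{tb_per_drive:>2} TB drives: ~{drives:,.0f}")
# 4 TB: ~125,000    8 TB: ~62,500    12 TB: ~41,667
```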
I had a chance to compare all of the file services a few years ago at a large corporation, and I thought that out of all of them, the Dropbox engineers were the strongest. Microsoft second.
I am curious about the hardware build-out used for the storage nodes. Some of the major issues I have seen with all the storage appliances out there are the following:
1. Network throughput on the appliance is fast, but at enterprise scale the 10GigE cards become a bottleneck for transactions because of how the software hypervisor scales the data.
2. Power consumption of the appliances in a rack-mount environment is too high, leaving rack space that has to stay empty because of per-rack facility power limits.
3. The software hypervisor scales the stack vertically and relies on the software to load-balance horizontally. Performance in a highly transactional environment becomes dependent on the software to scale, instead of the natural horizontal distribution that can be set up on the hardware out of the box. Standard multi-purpose storage arrays scale horizontally with very little overhead from traditional software storage management. I found only one company whose software does not force the stack to be vertical, but they fail to meet reasonable performance in network/power.
Streaming petabytes of data to maintain a dynamic constant (a static overall storage requirement whose data life cycle changes via retention rules) becomes very hard with premade hardware.
Does anyone have recommendations, or has anyone attempted a similar exodus from S3 that they can share?
To fit 500 petabytes on 1500 tapes, you'd need to put 333TB on a tape. Do any currently available tape formats support that much data on a single tape?
Granted, I stopped paying attention to tape drive capacities a while ago, but the upcoming LTO-8 standard will "only" support 13TB/tape, so you'd need 38,000 of them to hold 500 PB.
I've seen some higher density announcements in the 200 - 300TB/tape range, but couldn't find any products available.
AWS does offer a "Snowmobile" [1] product, that can hold 100PB on disks in a tractor trailer.
"I have a few qualms with building storage data centers. For a Linux user, you can already build such a system yourself quite trivially by putting 1500 tapes into the back of your minivan and then mounting them locally via curlminivantapefs."
I recently misplaced one of the 32GB microSD cards I was using to hold an alternate image for my Raspberry Pi. So I just bought another one for the price of a half-decent lunch.
And I did have a flashback to my first hard drive, all 220 glorious megabytes of it and a great deal more expensive than "a decent lunch", and got a sweet hit of that "yeah... I am in the future, aren't I?" feeling.
What will you tell your customers, exactly, while you go on a road trip with their data? And what happened to the data between the time you load Tape 0001 and Tape 1500?
Moving 500PB on a live application is not a trivial task.
Dropbox is cool and all, but I hate their pricing model. I just want to keep about 50-100 GB there (I don't hoard stuff). I don't wanna pay $10/month for that.
At Backblaze I pay less than $0.50/month. I could pay quadruple that, since Dropbox is a lot better service (and quite a different one, too). But not 20 times as much.
I know that Dropbox doesn't care about my tiny dollars and all. But why not let customers pay for what they use? This "constant growth" bullshit is probably the reason they don't care.
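The "20 times as much" figure roughly checks out, assuming the Backblaze number refers to B2's metered $0.005/GB-month rate and a ~$9.99/month Dropbox plan (prices may have changed since):

```python
# Storing 100 GB: B2's metered pricing vs. a flat Dropbox plan.
gb_stored = 100
b2_monthly = gb_stored * 0.005        # $0.50/month
dropbox_monthly = 9.99                # flat, regardless of using only 100 GB
print(dropbox_monthly / b2_monthly)   # ~20x
```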
What bugged me about their pricing model (perhaps it's changed?) is that things people share with me count toward my cap. I had a client share stuff with me via Dropbox, so I signed up. I get "2 gig free!" Yay. Another client shares stuff with me. Then another. Then another. Then I'm at my 2 gig cap with just other people's stuff. They all think it's great, because none of them are paying for it. I would have needed to pay money to support their free use of the service, and I declined. Just one more $x/month service I didn't need to get hooked on. Maybe the pricing model has changed what they count now?
It was largely the principle of the thing. It's yet another $x/month, on top of other services being paid for to help service/manage those clients. Example: PM tools. I'm already paying for project management tools. Put the files there, damnit. "Oh, but I like dragging to my Dropbox!" Oh, and someone else uses box.com, and so on.
Again, what gets my goat is partially the payment, but mostly the "it's free for them, but I end up needing to pay to accommodate their use of the service" angle. If they - the other parties - were paying for it first, it would bother me less. But raving about how 'free' a tool is when I have to pay for it to work with them (and they're using it precisely because it's free) bugs me.
Getting sucked into a few more $x/month things here and there 1) dilutes where stuff is supposed to live (wrt files) and 2) just becomes a drag on finances. A handful of services can end up going from several hundred dollars into the thousands if you don't keep tabs on things.
Maybe I'm an HN failure because saving several hundred dollars per year matters to me? I guess my skills must suck - most people can apparently rustle up $300/hr Rust/Go/Elixir project work simply by starting to formulate the idea that they're considering taking project work. I'm not that skilled.
> “While cost is always something that we consider, our goal with our network expansion was to improve performance, reliability, flexibility and control for our users — which we have succeeded in doing,” a company spokesperson told TechCrunch.
Who believes this?
It was, as anyone can easily see, about reducing cost. Which is fine, everyone does it, so why not admit that it was the primary impetus?

If I were running a larger company, I would not want to outsource control over it to other, even bigger companies.
Performance and reliability definitely affect subscription rates. Revenue growth is actually more important than decreasing costs (as long as your costs don't start increasing disproportionately).
I can see how you would think it is only about cost.
However, once you've had Amazon tell you "no" when you try to get more resources, and you've had your business go offline for hours while you wait for news from Amazon, you may want more control.
Three data centers "biult" by only the dozen people on the infrastructure team? Not possible. I wish articles like this wouldnt hype small teams where it is obvious that most all of the work was outsourced. It would seem that dropbox here was still operating as customer: ordering rather than physically biulding much of anything. Those dozen dropbox people were not running cables.
The article is unfortunately not very specific, and I cannot find much information with a quick search, but:
It's very possible that they did this all on their own, provided they rented cabinets/cages in an existing facility like Equinix, which is extremely common. Then they do not need to manage power/generators, fiber into the building, or any other data center necessities.
It does not take very many people to do the wiring inside a cage, especially considering Dropbox has been doing this over the span of a few years. If you've ever been physically inside a data center (I've visited a few Equinix facilities myself), it's mostly one person from a company working in a cage, wiring everything. I've rarely seen multiple people doing the wiring, and that same person will come back every day thereafter to continue working until the job is done.
What they are doing sounds entirely feasible. If money is no object with regard to equipment, then with 3 data centers I'd honestly say you only need 4 competent people to get the job done.
3 on-site. 1 remote/office.
From there, the more the merrier. A guy to lift the equipment up as well is nice sometimes - can be pretty heavy!
"We’re talking about a company that had 1500 employees, with just around a dozen on the infrastructure team" - what the rest 99% of the company is doing, marketing?
Yes, marketing. Also software development, web development, design, customer services, corporate services, support, sales, PR/comms, human resources, finance & accounting, legal, and so on.
1500 sounds like a lot, but keep in mind how many customers they have, and in how many countries they sell their services.
The wording got mixed up a bit. "Infrastructure" was a much larger organization, but there was an infrastructure team of about 12 people who worked on Magic Pocket at the time.
I was also thinking that those numbers are really weird for a company whose business is infrastructure (file servers) in the cloud. Obviously some devs are doing software for various platforms, plus business-side folks and so forth. But 1500:12 is a really weird ratio, given the core function of the business. If they're not spending on infra staff... what staff are they spending on?
I am reading it from a desktop, and I had to do a double take when I noticed this was TechCrunch but didn't have the epically annoying fixed sidebars and delayed-loading ads messing with my scroll bar. I came here to suggest we add /amp to the end of all TechCrunch submissions. Title, content, and a single static ad at the end; if all news sites were like this, I would not complain.
Wait, but using the non-AMP version is a decision too. Why do you get to make that one instead? Shouldn't the person posting the link get some discretion?
Principle of least surprise. techcrunch.com has a default experience; AMP is secondary. Since we can't all read each other's minds, the default behavior should be to use the... default.
Why is there always so much hate for AMP on HN? I'm on a PC and I vastly prefer these pages to the huge, bloated default pages offered by so many publishers. Using the TC page linked in the original post as an example: if I turn on Ghostery, it blocks 22 trackers! https://imgur.com/a/iTnlT
It's not all about privacy and trackers; it's about a search monopoly that might well abuse its power to become a content-provider monopoly as well, using the fact that they have the fastest CDN.
Though it's less of a problem with self-hosted AMP, it's still a problem, because it's still tech from a company that could choose to improve the web, and could very well do so given Chrome's market share, but instead decided to replace it with its own thing.
I have good reasons not to trust Google, because over the years they've provided a worse experience for Firefox on their own services, and it's still annoyingly true on Android.
Oh wow, I did not even notice this until you pointed it out.
So the solution is SO bad that the "amp" tag even becomes part of the URL?!
Now THAT is really weird ... well, aside from me disliking AMP anyway, that ... hmmm. We can send people to the moon, probably soon elsewhere, but ... we have to denote that ... we use AMP by ... an appended string called 'amp'.
Um, what is this crap? When did they actually move? How long did it take?
The title of this HN link leads you to believe this is recent, but the article makes it seem like a multi-year effort that, in fact, could have finished a long time ago.
The timeline is really, really significant here. Did they initiate this back in 2013? 2015?
My company has seen a remarkable shift in trust in online services over the last two years. Was Dropbox ahead of this curve? Were they one of the ones who brought this change in mentality about?
This article is crap because, without a basic sense of _when_, you can't tell whether they were riding the coattails of other industry leaders or leading the charge.
A good one. People should realize that building your own DC is an art in itself, and very few companies can do it at the same quality bar as AWS. HN does not take jokes, btw. :)
Dropbox has moved user data from S3 to its own colocated data centers over the past few years, and is doing compute in those data centers too. The compute actually existed for quite a long time; in the past you'd be talking to a Dropbox-run server which would connect back to S3 to retrieve the data.
Dropbox is definitely still an AWS customer, just not a major S3 or EC2 customer anymore. For example, all transactional email uses SES, and DNS is hosted on Route 53.
The file storage move from S3 to Magic Pocket is detailed in these blog posts:
https://blogs.dropbox.com/tech/2016/03/magic-pocket-infrastr...
https://blogs.dropbox.com/tech/2016/05/inside-the-magic-pock...
https://blogs.dropbox.com/tech/2016/07/pocket-watch/
The network backbone is talked about here:
https://blogs.dropbox.com/tech/2017/09/infrastructure-update...