Migrating Dropbox from Nginx to Envoy (dropbox.tech)
427 points by SaveTheRbtz on July 30, 2020 | 237 comments



Also note that we’ll cover the open source version of Nginx, not its commercial version with additional features.

It always kills me when very successful companies don't buy software from other companies.

I remember being at a lunch with a prospective client that really loved our technology. About halfway through, he said he would really love to purchase our software, but the CEO doesn't allow them to use anything but OSS. What do they make? Non-OSS software.

Just blows my mind.


In a business context, I'd definitely consider paid support for an Open Source product. But I'm not interested in a proprietary version that I can't modify or get third-party support for or otherwise work with in a pinch; I'm certainly not going to make a business dependent on it. Push the proprietary version hard enough and I'll reconsider whether I even want to use the Open Source version, or if it might be on more tenuous ground that might get undermined in the future (pushing back on improvements to the Open Source version to maintain differentiation, or worse, deciding to switch to a non-FOSS license in the future).


> Push the proprietary version hard enough and I'll reconsider whether I even want to use the Open Source version

Qt is pushing hard for commercial licensing (which I heard prevents you from using the open-source version), putting L/GPL FUD on their websites, and trying to track users of their installers more.


The model of "copyleft if your project is open, pay us if your project isn't open" is one where I have no problems or concerns, and will happily use the open version and recommend that people building something proprietary purchase a paid license. Nor will I typically worry about the motives or future of the project unless I have some other reason to. And the KDE Free Qt Foundation means I never have to worry about Qt going proprietary.


Does anyone know a good license for that? Maybe the Prosperity license?

EDIT: I've just realized that I want a revenue-limited trial, rather than a time-limited one. I basically want the Prosperity license, but with the ability to say "you have to pay me if your company makes more than $100k in annual revenue". Is there a license like that?

I've emailed the License Zero people, hopefully they'll do something for that.


> Does anyone know a good license for that?

The "that" in question was "copyleft if your project is open, pay us if your project isn't open". For that, try the AGPL or GPL, depending on your use case and customers, and then sell alternate licenses for people who don't want to make their own code open.

> Maybe the Prosperity license?

That isn't an open source license, despite its efforts to be ambiguous on that front. That and the even worse "commons clause" are exactly the kind of license that motivated the latter half of my original comment at https://news.ycombinator.com/item?id=24005833


The Prosperity license seems like a better fit to me, as the GPL/AGPL has a different set of constraints (e.g. customers who want to keep their code closed).

What's wrong with a license that's non-OSI "open" but gets developers paid from large companies to develop otherwise open software?


The comment you originally responded to said:

> The model of "copyleft if your project is open, pay us if your project isn't open" is one where I have no problems or concerns, and will happily use the open version and recommend that people building something proprietary purchase a paid license. Nor will I typically worry about the motives or future of the project unless I have some other reason to. And the KDE Free Qt Foundation means I never have to worry about Qt going proprietary.

Software under a proprietary license is something I can't build other Open Source software on top of and expect other developers to use and collaborate on. I don't want to have a forked ecosystem of proprietary-with-source-available software, I want to actually collaborate on Open Source software.

With Open Source, I'd feel confident that if we had to manage it ourselves, or fork it and add a patch, or get a third-party to develop a patch, or work with a dozen others with the same needs we have and collaborate on it, we can do so. It's reasonable to build an ecosystem or community or company around. You cannot replicate that with any non-open license; by the time you're done granting all the necessary rights, what you'd have is a non-standard-but-Open-Source license, at which point you'll get more traction if you use an existing well-established Open Source license.

I don't really care about encouraging the development of more proprietary software, whether or not it happens to have source available. There are already well-established models for getting people to pay for proprietary software. If someone is looking for a funding model for Open Source, and what they find is "turn it proprietary and generate FUD that it's as good as open", that's a failure. And when people are looking for Open Source and they find proprietary-with-source-available, it undermines decades of careful work and explanations by the FOSS community, and generates substantial confusion.

It's your software, and ultimately your choice how to license it. Various companies have tried the "source-available but still proprietary" model. Just please don't call it open or contribute to the FUD around proprietary-with-source-available licensing.

Speaking from experience, when encountering software under a proprietary-but-source-available license that tries to act like it's open, the response from companies that actually deal in Open Source is not "Ah, OK, let's pay them and use it", it's "yikes, get this away from us, how did this even end up in consideration, let's either use something Open Source or use something actually proprietary, not something by someone who is either confused about the difference or hopes other people will be". (The set of engineers and lawyers who deal with software licensing professionally, at various companies, tend to talk about it with each other.)


That makes sense. Unfortunately, as someone else in the thread has mentioned, the ways of monetizing OSS misalign the incentives between user and developer, and they haven't really been very successful anyway.

I develop multiple popular libraries that thousands of people use, yet I've never seen a single cent from them, which is fine for me because I don't develop them to make money. However, it's really hard to foster an ecosystem when companies who extract millions of dollars of value from FOSS don't feel like they need to give back.


> Unfortunately, as someone else in the thread has mentioned, the ways of monetizing OSS misalign the incentives between user and developer, and they haven't really been very successful anyway.

Many ways of monetizing it don't misalign incentives. As a user, I don't value support less just because software is more reliable; on the contrary, I trust the software in higher-value contexts because it's reliable, and in those contexts I need the support more. I don't value "host this for me" less just because the software is easy to install and configure (because I still don't want to be the system administrator if I don't have to). And "please develop this feature for me" has great alignment of incentives.

> However, it's really hard to foster an ecosystem when companies who extract millions of dollars of value from FOSS don't feel like they need to give back.

You're a lot less likely to get paid for software that's under an all-permissive license (e.g. MIT or Apache or BSD). It's unfortunate that so much of the ecosystem has settled around permissive licenses; with such licenses, your best strategy for making money may be "use the software to get hired somewhere or to build reputation for consulting". There's a reason companies love permissive licensing, and subtly (or unsubtly) discourage copyleft. Strong copyleft gives you more options to monetize something, either now or in the future.

That said, I also do agree that there need to be more good ways of funding Open Source.


Maybe you're right, I do tend to choose MIT/BSD usually. I think I'll switch to GPL, though I don't entirely agree with your first paragraph (I am less likely to pay for software if I can host it myself, for example, though possibly not by much).


Well, once I paid the fees to license our logo from subcompany A and the rent on all my servers to subcompany B, we didn't make any money.


Yes, this is an example of a company I wouldn't want to charge for my software, at least until you're into six figure revenues.


Not sure but I think maybe they were mixing up profit vs. revenue and pointing out how a company can ensure it has zero profit. But I guess you'd also have to be careful about which entity or sub-entities are covered by a license and the revenue limits.


Ah, I see, by "subcompany" they meant their own subsidiaries. Yes, that's why I said "revenue" in the original post, thank you for the clarification.


Unreal engine charges a 5% royalty on sales after the first $1M in revenue. And source is all on github, so it's basically zero friction until you get big.

https://www.unrealengine.com/en-US/faq


Also have a look at polyform licenses

https://polyformproject.org/licenses/


Ahh, yes! I think I want a cross between the Prosperity and the Small Business license.


Qt is more and more trying to get away with restricting the open source version as much as they can without triggering KDE's escape rights.


And nowadays, most people use Electron instead of Qt.


I’ve been using Qt/C++ for a few months here and I’m really really impressed. It’s really nicely done, and the performance is fantastic. Honestly, I’d have probably considered Electron, but the software runs in a somewhat resource constrained environment and has pretty significant soft real-time processing requirements (the C++ part)


which is deeply unfortunate due to how resource hungry Electron is.


What if it were a single source-available version (that also allowed you to, say, get third-party support/customizations)?


If the licensing model asks for companies to pay for the "Enterprise" features which are still OSS, it would resolve this problem for me at least (not sure about the OP).

In general, it's been my experience that the closed source, enterprise-only crap that most companies push is exactly that: crap. I suspect it's because those features are treated as a business expense, and thus built to keep costs low. Almost every time, those features are underwhelming and buggy. If it's OSS, at least I can contribute a patch; if the thing is popular, likely someone else has already fixed it.

Enterprise support is a fucking joke; they will delay, delay, and delay. If you push hard enough, they say "it's on the roadmap" without giving any guarantee of when it will be fixed. The only time enterprise support has really worked well is in my org, which got the best support package for GCP. GCP's support for urgent issues and product feature requests has been somewhat reliable and predictable. Much more so than literally any other enterprise vendor (I'm looking at you, Okta).


> GCP's support for urgent issues and product feature requests has been somewhat reliable and predictable.

Can you give an example of a product feature request that succeeded via that support channel?


- we had requested some way of being able to remove IAM permissions that went stale. GCP introduced a “recommendations” feature that indicates which IAM roles have been unused for a long time and can safely be removed.

- we requested a way to prohibit the provision of ILBs on shared VPC subnets without explicit grants; GCP introduced a role for that

- we had issues with the number of compute instances being too high on our VPC and reaching limits because of the number of VPC peerings. In the past, every GKE cluster created a new peering. We have a bunch of GKE clusters, and as we added more, the max number of instances we could provision was reduced significantly. GCP introduced and fast-tracked a feature that enabled all GKE clusters to use a single peering rather than create a peer per GKE cluster.

There’s a bunch more. But on this front I have been a happy customer.


Thanks for the details!


If you're saying "source-available" as opposed to Open Source: complete non-starter. I'm looking for Open Source, not a faux knock-off of it; don't try to give me a subset of Open Source license terms that you think would placate me while not actually being open.


the free software people say the same thing about open source


I think a part of this is that engineers in particular have a preference to use software which can be treated as a transferable skill if/when they move on. They would rather use the OSS version of Nginx or Envoy, because they know they will have access to it in the future. I think there is some aversion to becoming familiar with the features, functionality, and characteristics of a piece of software that your current employer is paying a non-trivial amount for, when you know that chances are your next employer will refuse to pay for it. This may not be in the best interest of the current company, but it's a bias that I think impacts a lot of engineers.


It is a generational thing; back when I started, the only free-as-in-beer software was my own.

Even for code listings, I had at the very least to buy the medium they came on.


You can purchase the commercial version of Nginx and not use its specific paid feature subset. Alternatively, "I am familiar with this, and we can do x% of what we need with the OSS version, but we had to pay to get the last part."


I see what you're saying, but this is basically suggesting that Dropbox should have made a donation to F5 (a public company with an $8B market cap).

I think there is a valid point you're making for smaller companies that are providing both open-source and commercial versions of software, but I don't think Nginx is a great example of that.


Why does it blow your mind? Various obvious and sane reasons for this, including cost. I bet many of those companies you have in mind “buy” Windows and macOS, and if they are sufficiently big, most certainly buy Oracle or SAP for their corporate operations, finances, and accounting. It’s usually only the production side of sufficiently large internet-scale companies that is biased towards building. Most of the time you can easily explain it with “it’s cheaper than the contract with supplier”. Often, it is strategic ownership over your fate and not being locked into the vendor that comes into play as well. The vendor positioning in the market and its leverage may also change over time and pose a risk down the road (getting acquired, changing focus, abandoning the product, going out of business, etc.) Many times it is just that their problem is so unique to their scale that the generic solution does not technically work for them or the pricing model is not designed to be a fit.

In the particular case of nginx, I can tell you their reputation is not great in adapting to the users’ needs.


I think the point was that a financially successful company not contributing to an open source project even after making a bunch of money just seems unethical? Maybe I'm old-school, but I still think we should be supporting each other in this type of situation, especially if one of us strikes it big? Sure - move away from Nginx, but maybe throw some $ their way for the service they provided, even if you don't legally have to ...


I did not infer that from the OP’s comment, but in any case, I don’t know the specifics of their arrangement or whether they had been a commercial customer or not. Last time I checked Nginx had been doing fine selling itself for half a billion.

But more abstractly, I don’t actually agree with that sentiment. I see more of a responsibility to give back in the form of patches and collaboration than by throwing $ at the problem. I see the nginx approach to open source simply as a business tactic, no different from Windows Home/Pro customer segmentation, except Home is free for tactical reasons to kill off other competition. It is a calculated business move; if your business model sucks (which it obviously did not in nginx's case), that does not imply others are acting less than ethically or that they should pay you out of pity. (That said, it might be strategically important for them to keep your head above water so you survive for their own benefit as their vendor, but that’d be a different angle.)

I suppose the difference between free software vs open source is also relevant to this discussion, and I could relate to your sentiment when facing the former much more than the latter.


All fine and dandy, except that those patches and collaboration don't pay bills.


Often big companies employ people directly to work on open source projects that they use heavily. That does pay the bills.


Big companies like Dropbox....


> patches and collaboration don't pay bills.

implying that just because your open source project is being used, that it is entitled to fund the bills of the project maintainers.

I think patches and contributions are a form of bill paying.


Patches and contributions take a non-negligible amount of time and resources to review, test and integrate, as well as adding to the ongoing maintenance burden. They might be welcome, but they are absolutely not cost-free and I wouldn't consider them a "form of bill paying", the benefit (if any) is far too indirect and it doesn't directly help the bottom-line in any way.


Something should pay for the bills and if the oss project creates lots of real value then I would prefer to live in a world where some of that value goes to pay the bills. The alternative world simply discourages oss since devs would have to work other jobs. There is qualitative difference when you have a dedicated core team vs just everyone contributing patches.


When supermarkets and landlords start accepting them, then yes.


Could it be that OSS is simply not a sustainable business model for the long haul and it was simply successful in a period of history when vast money was made quickly by landgrab expansion of technology to consolidate/provide many basic services and the code itself wasn’t the competitive differentiator? I don’t know but that’s a possibility too. I question why one would be concerned in keeping OSS alive, as a business, assuming it cannot survive on its own feet. There’s no inherent reason OSS should somehow forcefully live. It’s already changing its character via AGPL and Mongo license-style things in the face of AWS cloud simply deploying and milking cash.

(The above is assuming the concern that it is funding that’s a problem today; I don’t quite see it that way [for instance, I strongly suspect Nginx to have made more money than DBX so far, so who are we to say who’s been more successful; market cap ain’t everything], but that’s a hypothetical to think about.)

Moreover, supporting a project does not equate to supporting its existing maintainers. It could mean taking some partial ownership, including the review side, and having some developers on your own payroll. Seems like that's how the big projects are done most of the time. The open-core model we are focusing on is a niche and arguably more akin to freemium products than to free software as a thing with communal ownership.


> OSS is simply not a sustainable business model

OSS is not a business model; it more closely matches charities and non-profits, and runs on the donations and altruism of its users.

I find it annoying that people here keep saying that a company _should_ pay for their open source software usage just because they have the money to do so. They don't have an obligation. They could donate - and some do - but it is in no way required of them, regardless of how much value they derive from using said OSS.

Open-core projects, which have a somewhat useless core and a paid-for 'enterprise' version, are not, in my eyes, proper OSS projects, but instead a way to market a proprietary product.


The serendipity was GPL getting uptake thanks to Linux and GCC.

Linux via the ongoing lawsuit with BSD back then, and GCC because UNIX vendors started charging for their compilers, with GCC being the only alternative available.

However, everyone needs to pay their bills, hence the push for non-copyleft licenses; thus, in a couple of years GPL-based software will either be gone or under dual licenses.

You already see this happening with BSD/MIT-based alternatives to Linux in the IoT space: NuttX, RTOS, Azure RTOS, Zephyr, Google's Fuchsia, ARM's mbed, Arduino, ...


>> "There’s no inherent reason OSS should somehow forcefully live"

What on earth does this tirade even mean? Every business lives 'forcefully' and fights for survival. Sometimes it comes with values, e.g. we don't use child labour in DRK to mine thallium, fairtrade, organic, etc. OSS is one of those values.

Is there a business that lives 'effortlessly'?


FOSS doesn't have anything to do with values. Or do you refuse FOSS software tainted by contributions from corporations that don't share your values?

Because then there is going to be a very thin selection available.


FOSS software absolutely has value in and of itself, and I will take it even if it comes from Satan himself.

As Churchill once said: "If Hitler invaded hell I would make at least a favourable reference to the devil in the House of Commons."


Get that “implying” crap out of here, this isn’t 2008 4chan.


For what’s it worth, many companies have a hard time justifying “unnecessary” expenses to their boards or shareholders. Depending on company structure, their hands may be somewhat tied.

Not all companies, of course; and to be clear, I think such a company structure is a problem itself and agree with you.


I think you'd be hard pressed to find a board or shareholders who think 'support contract for essential component of our infrastructure' is 'unnecessary'.


To clarify, I meant that it can be hard to convince the board to donate money to an OSS project when there is no "need."


Just to be clear: the product I was selling was not OSS and the product they were building was not OSS. That's why it blew my mind.


The subject of monetizing open-source software is a tricky one. Some companies pursue the open-core principle; others monetize through consulting services or cloud infrastructure.

As for investing into opensource, Dropbox is trying to do that when possible, for example we (along with Automattic) did sponsor HTTP/2 development in Nginx.


Personally I think that monetisation of open-source goes against the consumer of the OSS in practically all cases.

- Open-Core: Features are not added to the core, as they want people to upgrade.

- Consulting: Ease of use is ignored, since if it's too easy people won't need consultants.

- Sponsoring goals: Software is almost held to ransom until goals are reached.

The best way to help open-source software is to donate or contribute code... if you're trying to maximise profits, then just make it proprietary.


> - Consulting: Ease of use is ignored, since if it's too easy people won't need consultants.

Some problems can only be made so easy. Some problems require custom work. Sometimes you need paid support not because the product is low-quality but because you need to know that you can call someone at 3am because your service is down. There are lots of reasons to have consulting.

> - Sponsoring goals: Software is almost held to ransom until goals are reached.

You're assuming the work would get done one way or another. Sometimes people have many other things they could be doing, and they need to justify spending more time on a project than they already do. Or sometimes, people have a fixed amount of time but they're happy to prioritize things people want and will pay for.

(No argument about open-core; that definitely has problems.)

Other great approaches include hosting the software as a service. Depending on the nature of a project, many people may want a service whose primary value proposition is "we'll host this for you so you don't have to maintain and administrate it".


From my (admittedly) limited experience, paid support isn't consulting.

Paid support surely is, as you say, about calling someone at 3am and having them look into an incident.

From my experience, it's not about helping you get the most out of the product or lending a hand in tailoring it to your needs - that's the consulting part, and it is usually paid for separately (and at much higher rates).


You can just charge companies that make over a certain amount of annual revenue. Then OSS and small companies can use your software fine, but when they get big they have to pay you.


Do you have an example of sponsorship goals actively gating software development? I haven’t seen this one. “I’m not patching this zero day until I get to $1,000,000!”


Isn't that what RHEL is, in different words, obviously?


I suppose, but you know what you're buying when you sign up for Red Hat. I was trying to imagine a scenario where a free OSS project does that, like Kubernetes or React.


Disclaimer: I work for Red Hat but opinions are my own

I totally disagree. Red Hat patches/maintains things regardless of whether people pay for it. Everything is always available open source. There are numerous derivatives of RHEL that get these for example.

The money you pay for Red Hat stuff is for support. There are always free-as-in-speech and free-as-in-beer alternatives to Red Hat products.


It might very well be my misunderstanding, so I apologize, but doesn't RHEL get security updates that are unavailable to the "rest of us" for a bit?


Since when does Blender 3D offer consulting?


It's not necessarily about money. An engineer can burn through tens of thousands of dollars per month in cloud spend because they have access to the AWS [or] GCP console, but that same engineer may not have the first idea about how to get the CFO's sign-off to purchase a license that will facilitate a halving of that spend. And that same CFO can institute a policy against using credit cards for recurring payments that prevent that engineer from expensing the purchase through a corporate card. And the software company may not offer a bill-via-invoice option — or they may only offer it for amounts greater than the amount the engineer wants to spend.

So much of what happens in sufficiently large organizations has nothing to do with profit maximization. Think confederacy of dunces, not a conspiracy of greedy evil geniuses.


Exactly my thoughts when I read the article - a hugely successful company not contributing to an open source project which enabled them to succeed in the first place ...


There are different paths companies take. Some buy, and it really works for them and their business, since overhead is small and everything just works. The other set of companies have more sophisticated requirements: they want to have full control over what is going on, understand what the code is doing to better optimize everything else around it, faster shipping cycles and being able to implement what they want without waiting for the next shipping cycle of commercial software, a community and knowledge base around it, etc.


> they want to have full control over what is going on, understand what the code is doing to better optimize everything else around it, faster shipping cycles and being able to implement what they want without waiting for the next shipping cycle of commercial software, a community and knowledge base around it, etc.

I'm a bit confused by this - I work for HAProxy Technologies and we do have an enterprise product. Many of our customers contribute code directly into the community and we backport those features into the latest enterprise stable version. This means they do not have to wait until the next shipping cycle to take advantage of a new feature. There's also a large community & knowledge base around HAProxy.

Your reasoning may be right when dealing with "closed source enterprise software" but it doesn't line up when we start talking about open source/open core.


The shipping cycle is one of the reasons mentioned. And as you can imagine, unfortunately, contributing to Nginx open source is not an easy thing (but they have a great product for sure). If HAProxy is different in terms of contributions - that is great!


What stops them buying commercial licenses for Nginx and then using the OSS version? They're not obligated to, certainly, but I hardly think Nginx would say "you must use only the commercial version".


Just speculation, but there are motives to not use the closed source version beyond a purely profit-driven point of view. One of the prime benefits of OSS is that you have the power to change it whenever necessary. If something is breaking badly, you might not have the luxury of waiting on support to track down and fix your problem. If you don't need the features of the paid version, then using the paid version is actually limiting your options.


I don't see what's surprising -- companies that earn money selling product can make even more money by cutting costs. And if OSS software gives them an equivalent (or even better) solution, why wouldn't they use it? For any sizable production deployment, the cost of nginx licenses could be applied to hire a number of engineers to help maintain the OSS software.

I don't know what their volume licensing is like, but at $2500/server list price, costs add up quickly.


Isn't this simple economic reasoning?

If you buy something, or worse, have to pay license fees on a regular basis, your earnings will be smaller.

We live in a world that is driven by economic growth so the ultimate goal is to maximize profit.

Of course this has a moral aspect to it as well, and I see it, but in this case I think it is not outrageous enough to be something on the scale of a scandal.

Many businesses use ideas or products for free to start a successful enterprise that earns a lot of money.


In Germany it's the opposite: no free software in production! Only software with enterprise support!

We Germans are very risk averse (I hate that sometimes).


That's true in many US companies as well. People like having a vendor they can fire when things go awry, rather than they themselves getting fired.


Reminds me of private companies who profit from public resources.

Like selling tap water in bottles.


Is tap water not sold for commercial use at market rates? The public resource steward is leaving money on the table if they aren't.


This also rubbed me the wrong way. As an individual I think that shows selfish and opportunistic behaviour and it raises a red flag about that organisation in my mind.

However, for profit companies are not here to do what’s “correct” they’re here to make money for its investors. If I had decision making abilities at Nginx I’d be conducting a comprehensive review of the free OSS offering and redacting the features and overall value with extreme prejudice.

Dropbox never paid because it COULD not pay. If you have an enterprise, paid version of your OSS product it has to be impossible for an enterprise to use it for free.


> If you have an enterprise, paid version of your OSS product it has to be impossible for an enterprise to use it for free.

Why? Most enterprises, especially ones that aren't tech firms, are going to shell out for enterprise support even if there are no additional features. Crippling the community version doesn't necessarily help enterprise sales; it can reduce overall mindshare, reducing enterprise traction, or, worse yet, mean that a third-party downstream edition with richer open-source features becomes dominant and its creator gets “your” enterprise support contracts.


> shell out for enterprise support even if there are no additional features.

I don't feel this to be true.

Also, an enterprise that's large would want some features that are irrelevant to a small shop. For example, single-sign-on integration with various providers.


I do though ... Imagine a manager being dependent on a system he ~~bought~~ installed for free, which he didn't buy support for, and is now malfunctioning. It's his fault. And this is what I've seen in practice as well.

Just think about the commercial success of SUSE Linux?


> However, for profit companies are not here to do what’s “correct” they’re here to make money for its investors.

While partially true, this is overly reductive. Companies can and often do take actions that serve goals beyond "increase upcoming quarterly profits".


You can't redact the features that are already open sourced.

And besides, if that were to happen people would just get behind some other open source web server and push that.


This ain't mind-blowing by any means IMO.

If said company has an unknown track record, then doing business with them is risky.

What if the company goes out of business in the near future? Or gets acquired (actually, I think a lot of infra companies' end goal is to get acquired)? What if they raise the price all of a sudden? How extensible/customizable is their solution?

Trust is the key here. If I am in the position to buy software and cost isn't the primary concern, the money would go to a known/stable figure in the industry.


In the case of buying from a small company this can make sense. If they fold it is good to know that the software will still be around.


I’m increasingly concerned about being screwed by non-OSS vendors. Imagine a use case like Slack. Say you have an employee that goes to visit a family member in Venezuela & connects to the company Slack. Slack has been given a mandate to terminate accounts for people in Venezuela by the Trump administration, and now your key employee is cut off from communication, or perhaps your Slack account gets flagged.


HR / the company should be providing advice on going to "at risk" areas.

Also, if you're going to China, take a disposable phone and a laptop that is clean and can be wiped on return.


That's not a non-OSS issue. That's a SaaS issue.

Even if your SaaS were OSS, they could still deny you access, as you're inhabiting their server, not your own.


Fair point.


Business is motivated to avoid anything which is a tax. Said another way, they are motivated to avoid or escape from anything that grows in line with earnings. If their infrastructure grows, their bill from nginx will grow, modulo the skills and efficiency of their infrastructure teams and the speed of whatever servers they are buying.


In my company, thousands of CentOS servers were running; we still had the support license, though.


You have a good problem. What sucks is when you sell a FOSS solution and they want paid support and an SLA, but the FOSS maker does not want free money in the form of closing out issues/bugs/features they might work on anyway without getting paid for it.


I fully agree. One other good point with paid software is that it is more long-term. It will be supported as long as there is money involved.

Just look at the JS ecosystem. Everything is for free. But also shitloads of crap. A lot of libraries left unmaintained.


Not sure what nginx is like, but in my experience, the developer/operator experience of commercial software tends to be subpar. For instance, when I worked at a shop that used a ton of Red Hat software (millions of $$ per year in licensing), the commercially-supported versions often were a pain, with requirements like phone-home (that didn't play well with the mandatory corporate proxy), documentation behind a paywall and hard to discover (yes, we had login accounts, but Google couldn't index it), and other disadvantages. The OSS equivalents were easier to access, had better (or at least better-indexed) documentation, and we didn't need to worry about per-seat licensing (again, we were paying for it, but we still had to track it).

If you're going to sell software that has an OSS variant, make sure the commercial experience actually outshines the free one.


I agree, we (at Red Hat) try so hard to make awesome documentation but then put it in hard-to-reach places. I really wish we didn't do that. I'd like to see us publish it all widely.

That said you'd be amazed at how much of man pages is written by Red Hat but isn't attributed, so nearly everybody on every distro benefits from our documentation without realizing it.


Makes sense actually. Your motives are conflicted so you can't see it.

Also, if I can ask, is your product also closed source (in any way at all), but made with open source components?


I feel so old now. There was a time when I used to argue with senior engineers @ Yahoo! for using Nginx over Apache. Nginx was the hot thing, popularizing C10k [1]. Now, in my current team, I have junior devs pushing for Envoy over an HAProxy/Nginx setup.

Is this trend happening primarily because devs are pushing for gRPC over REST? What benefits does Envoy offer over Nginx if you're still a REST-based service? I am not fully convinced about the operational overhead that Nginx supposedly brings.

[1] https://en.wikipedia.org/wiki/C10k_problem


The sibling comments point towards the difference in configuration if you take the "out of the box" product. But there is also a vast difference in how code is organized, in case you ever have to touch it.

From my point of view Nginx feels "old". It's a C codebase without a great amount of abstractions and interfaces; instead it has a bunch of #ifdefs here and there. Unit tests and comments are not to be found. It builds via autotools.

Envoy looks as modern as it gets for a C++ codebase - apart from maybe the lack of using coroutines which could be interesting for that use-case. It uses C++14, seems to be extremely structured and documented, has unit-tests, uses Bazel for builds, etc.

So I think the experience for anyone being interested in working on the code will be very different, and people that prefer the style of project A will have a very hard time with the style of project B and the other way around.


I looked around at the code in Envoy.

"As modern as it gets"? Very, very far from it. Everywhere I looked it was all-over public virtual functions. It looked, more than anything, like Java, which is essentially, more or less, C++92 with some bells on.

The code might be OK, but, as with typical Java code, everywhere I looked was boilerplate, hardly any of it doing any actual work. I would hate for somebody to look at Envoy and think that was what good, modern C++ code looks like.

Virtual functions are a good answer to certain problems that come up, once in a while--in C, for such a problem, you would use function pointers. Inheritance is a pretty good answer to certain problems that come up a little more often.

But neither is a good answer to any organizational need, and a big project that reaches for virtual functions and inheritance as first resort makes me shiver.


> uses Bazel for builds

Is this unanimously good? I've heard both praise and horror, never used it myself.


One of the senior engineers once said to me that "Bazel is like a sewer: you get back what you put in."

Bazel requires a lot of upfront effort but the power of (a programmatically accessible/modifiable) dependency graph and a common build/test system across all the languages is very hard to underestimate.


> very hard to underestimate

Are you sure?


It's good for Dropbox, since they use Bazel.


The operational overhead shifts to more API stuff, so people can write 100 lines of code instead of modifying 1 line of config, it feels like.

This is never going to end as more things shift towards being core APIs that allow you to write code instead of configure things. It's not even configuration-as-code, it's just code managing configuration files.

edit: I think my comment comes across as maybe kinda rude. My beef with Envoy is that the documentation is _extremely_ complex. I've repeatedly asked 'How do I get started with xDS?' and been pointed to the spec, which took some time to read through, and when I asked others how to set up LDS/RDS/CDS/SDS I was met with something like 'what are these things...? just use xDS,' which led me to a lot of frustration. This has been my experience each time trying to approach Envoy and xDS.


I think the problem with xDS is that their example go-control-plane repository is completely useless. It's overly complicated with frightening-sounding details that don't matter to someone experimenting ("you MUST MUST MUST CACHE THIS how to do so is an exercise left to the reader").

I ended up reading the specs and found them very clear, and wrote my own xDS implementation: https://github.com/jrockway/ekglue/blob/master/pkg/xds/xds.g... I did this after reading the source code for the most popular xDS implementations and finding myself horrified (you know the popular xDS implementation I'm talking about). Now I have a framework for writing whatever xDS server I desire, and it can be as simple or as complex as I want it. For example, for my use cases, I'm perfectly happy with a static route table. It is very clear what it does, so I have that. What annoyed me was having to configure the backends from Kubernetes for every little service I wanted to expose to the outside world. So I wrote ekglue, which turns Kubernetes services and endpoints into Envoy clusters and Envoy cluster load assignments. This means that I never have to touch the tedious per-cluster configs, and still get features like zone aware load balancing. And I don't have to take on complexity I don't want -- the woefully under-specified Kubernetes Ingress standard, service meshes, etc. (I also plan to use ekglue for service-to-service traffic because xDS is built into gRPC now... just haven't needed it yet. It's great to use the same piece of software for two use cases, without having to maintain and read about features I don't need.)

TL;DR: take a look at the spec. It's really well thought out and easy to implement. Just don't cut-n-paste from Istio because they got it really wrong.


On that note, the gRPC spec specifically calls for load balancing that doesn't actually do the proxying, but instead hands out assignments, with the server passing its current load back to the load-balancer service. It sounds like in this case the gRPC client is using some array of xDS, but the server is using xDS along with...?


I feel like xDS is a relatively new addition to gRPC. I think there is another parallel implementation inside gRPC of external load balancing, which may convey server load information back to the gRPC client.

I looked up the current state of the xDS code, and there's a lot more of it than I remember. The EndpointDiscoveryService based gRPC balancer is here: https://github.com/grpc/grpc-go/blob/master/xds/internal/bal.... It appears to balance similarly to Envoy; locality-aware with priorities.

(That doesn't surprise me because I don't remember any field in the ClusterLoadAssignment proto that sends load information back to the client. Health, yes; load, no. But I could easily not remember it being there because it hasn't been something I've tried to implement.)

But yeah, the way to look at endpoint discovery is like DNS. DNS can return multiple hosts, and clients will spread the load by picking one at random (sometimes, if you're lucky). EDS is similar to this, but is a streaming RPC protocol instead of connectionless UDP, so it's theoretically easier to operate and monitor.
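
To make the analogy concrete, here is a rough sketch of the kind of resource an EDS server hands back, written against what I believe are the go-control-plane v3 generated types (the import paths and field names are my assumptions, not something from the article); it is essentially a DNS answer with two A records, pushed over a stream:

    package main

    import (
      "fmt"

      corev3 "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
      endpointv3 "github.com/envoyproxy/go-control-plane/envoy/config/endpoint/v3"
    )

    // lbEndpoint wraps a plain host:port into the LbEndpoint message that a
    // ClusterLoadAssignment expects.
    func lbEndpoint(host string, port uint32) *endpointv3.LbEndpoint {
      return &endpointv3.LbEndpoint{
        HostIdentifier: &endpointv3.LbEndpoint_Endpoint{
          Endpoint: &endpointv3.Endpoint{
            Address: &corev3.Address{
              Address: &corev3.Address_SocketAddress{
                SocketAddress: &corev3.SocketAddress{
                  Address:       host,
                  PortSpecifier: &corev3.SocketAddress_PortValue{PortValue: port},
                },
              },
            },
          },
        },
      }
    }

    func main() {
      // The EDS "answer" for cluster "backend": two endpoints, much like a DNS
      // response with two A records, except pushed over a gRPC stream.
      cla := &endpointv3.ClusterLoadAssignment{
        ClusterName: "backend",
        Endpoints: []*endpointv3.LocalityLbEndpoints{{
          LbEndpoints: []*endpointv3.LbEndpoint{
            lbEndpoint("10.0.0.5", 8080),
            lbEndpoint("10.0.0.6", 8080),
          },
        }},
      }
      fmt.Println(cla.GetClusterName(), len(cla.GetEndpoints()[0].GetLbEndpoints()), "endpoints")
    }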

The other xDSes do more things -- CDS lets you discover services (that EDS then gives you endpoints for). RDS lets you make routing decisions (send 10% of traffic to the canary backend). SDS distributes TLS keys (and other secrets). ADS aggregates all of these xDSes into one RPC, so that you can atomically change the various config versions (whereas requesting each type of stream would only be "eventually consistent"; doing it atomically is good where route table changes add new clusters: the client is guaranteed to see the new cluster at the same time the route to that cluster becomes available).

It is all somewhat complicated but very well designed. This reminds me that I want to look more deeply into gRPC's support of xDS and add some integration tests between gRPC and ekglue.


Yep, gRPC is the new toy for distributed computing, after everyone realised that DCOM, CORBA, RMI, Remoting actually made sense instead of parsing XML and JSON text formats all the time.


I had to chuckle as well when I read that article and the part about gRPC. Seems like the pendulum is swinging into the other direction again - back to where we've already been ten or twenty years ago. New name of course, but same concepts.

One really starts to feel old at such occasions.


It certainly does look like that. I do think though that we've learned a number of central lessons in the process:

- treat messaging as a first class concept, not something to hide & abstract away.

- do not attempt to implement polymorphism in a messaging protocol. Do not bind your messaging protocol to a programming language's type system (they serve different purposes).

- bake fundamental monitoring & maintainability concepts into the protocol (e.g. intermediaries must be able to understand what responses are errors).

- have a well understood, simple backwards and forwards compatibility story.

- etc.

All of this is stuff we didn't understand in RMI or CORBA or SOAP etc. REST was a great wakeup call, both in simplicity and some of the messaging protocol concepts (such as error modelling). It is missing the application level binding - there's just no good reason why you wouldn't have a statically checkable method/request/response type binding.

I am a bit wary about whether gRPC will go overboard again in complexity. We'll see.


Sure we did understand it; that is why they used an IDL, independent of any programming language's type system.

Apparently DCE IDL now comes in proto files.

What newcomers did not bother to understand is why we were using those formats in first place.

Rest assured, maybe in 20 years we will be introducing this cool RPC protocol based on YAML or something. Thankfully by then I should be retired.


I think IDLs are an important point among many others (I attempted to list some above). I think we might be missing substantial improvements if we'd just say "oh so we're back to a static IDL description, all old is new again".

And even within IDLs, we've made major progress. Compare the mess of SOAP's data type system, various attempts at inheritance and polymorphism in SOAP and CORBA, pointers in CORBA etc.


What's the beef with polymorphism?


I read that part as:

Protocol Buffers are good enough to make us forget the traumas caused by CORBA.


Totally get it! The team (@veshji and @euroelessar) struggled a bit in convincing me that the new Envoy way is a simpler one. I do not regret giving in.

Operationally, there are many differences (esp. around Observability) but if I were to distill it down to one thing it is a clean separation between data- and control-plane. This basically means that it was designed to be automated and the automation layer (xDS) itself runs just like any other normal service in production.


This is just the software industry. Maybe it’s because we’re so young. Maybe it’s because software is relatively easy to change and experiment with.

Who knows. All I know is, it’s exhausting, and ultimately it’s terrible for the end user. We have no idea what we’re doing when we pull in a new dependency like this. There’s tiny corner cases we don’t think about, and those get passed on to the user.

Innovating is fun, but exhausting in aggregate.


Envoy is a lot more configurable and rivals nginx on performance (especially throughput). The codebase is a lot more manageable (but that's my personal preference). It runs circles around nginx on observability features.


Really great post. I'm glad the post in particular mentioned community, because I think in the end this is the huge advantage Envoy has over NGINX. NGINX could, in theory, resolve all the technical issues raised in the post. But the fundamental tension between the open source and commercial versions cannot be resolved.

(Disclosure: We use Envoy as part of Ambassador, and so of course we're big fans!)


I know some people might find it a little controversial, but I’m super excited about our load balancing future and that we probably have the biggest Envoy deployment in the world now. When we moved most of Dropbox traffic to Envoy, we had to seamlessly migrate a system that already handles tens of millions of open connections, millions of requests per second, and terabits of bandwidth. This effectively made us into one of the biggest Envoy users.


Well, a single server doesn't really need to do more than 10Gbps or 100k connections. Going above is a "simple" matter of managing horizontal scaling.

What I wonder about is how do you distribute the traffic on the higher level? I imagine there are separate clusters of envoys to serve different configurations/applications/locations? How many datacenters does dropbox have?

I was running a comparable setup in a large company, all based on HAProxy, there was a significant amount of complexity in routing requests to applications that might ultimately be in any of 30 datacenters.


We had a large rundown of our Traffic Infrastructure some time ago[1]. TL;DR is:

* The first level of load balancing is DNS[2]. Here we try to map the user to the closest PoP based on metrics from our clients.

* The user-to-PoP path after that mostly depends on our BGP peering with other ISPs (we have an open peering policy[3], please peer with us!)

* Within the PoP we use BGP ECMP and a set of L4 loadbalancers (previously IPVS, now Katran[4]) that encapsulate traffic and DSR it to L7 balancers (previously nginx, now mostly Envoy.)

Overall, we have ~25 PoPs and 4 datacenters.

[1] https://dropbox.tech/infrastructure/dropbox-traffic-infrastr... [2] https://dropbox.tech/infrastructure/intelligent-dns-based-lo...

[3] https://www.dropbox.com/peering [4] https://github.com/facebookincubator/katran


Katran - nice! Any issues with it at all? Do you use it with XDP-capable hardware or just normal driver offload?


It works beautifully. We use driver offload (i40e on the Edge.)


Cool to see someone using Katran in production. Really interesting stack you have there.


Actually, all the props for that go to Katran's author himself. When we hired Nikita V. Shirokov (tehnerd), the first thing he did was replace IPVS with the XDP/eBPF-based Katran, which improved our Edge servers' throughput by 10x, from ~2Mpps to ~20Mpps.

He also contributed a lot to the Envoy migration, migrating our desktop client to it and adding perf-related things like TLS session ticket lifetimes to SDS.


Great. Exactly what I was looking for =)


@SaveTheRbtz

"we have an open peering policy"

That's a bit of a lie given you have a minimum 50Mbps requirement before you even consider a peering request.

I would call that Selective, not Open!


It's interesting that almost no web server provides an easy way to deal with multi-tenant, multi-domain architectures in a way that includes automatic SSL.

Caddy is the closest, but still not near enough.

There is this small segment of the market that we operate in that requires thousands of TLS connected domains to be hosted behind a dynamic backend. It's services like Tumblr, Wordpress.com, or any other hosting service where you can get a "custom domain" to point to your own blog or site.

NGINX - No.

Apache - Nope.

Caddy - Can do (but need lots of workarounds)

Envoy - Nope.

Everyone focuses on a few hand-coded domains and no automatic TLS. Maybe this part of the market is too small anyway. Sigh.


Several companies use Caddy for exactly this purpose. Fathom Analytics for example uses it for their custom domains feature. Caddy can even reactively provision certs during TLS handshakes. It's a native feature. Why does it require lots of workarounds?


Yeah I'm not sure what they're getting at, I've used Caddy as well for similar "custom domain" features, it was super easy. Thanks for creating it!


Yes. Caddy is what we use, since not much else can do it as easily as Caddy can. And it's our go-to tool for several projects that require custom domains. And we really, really, appreciate it!

I'm just saying that it's not something that is documented well or purpose built for that scenario.


Is there any mature integration to achieve this with Kubernetes?



You can definitely “lazy load” TLS certs into Envoy.

The SDS (Secrets Discovery Service) supports this, and is touched on in TFA: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overv...

You provide a gRPC service that can return the keypair needed for any host, with host config also being dynamic.

https://www.envoyproxy.io/docs/envoy/latest/configuration/se...
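
For a feel of what that gRPC service returns per host, here is a rough sketch built with what I believe are the go-control-plane v3 generated types (import paths, field names, and the PEM plumbing are assumptions for illustration, not a description of any particular setup):

    package main

    import (
      "fmt"

      corev3 "github.com/envoyproxy/go-control-plane/envoy/config/core/v3"
      tlsv3 "github.com/envoyproxy/go-control-plane/envoy/extensions/transport_sockets/tls/v3"
    )

    // secretFor builds the SDS Secret for one customer domain. In a real control
    // plane the PEM data would come from a cert store or an ACME flow; here it is
    // just passed in as placeholder strings.
    func secretFor(domain, certPEM, keyPEM string) *tlsv3.Secret {
      return &tlsv3.Secret{
        Name: domain, // the name Envoy requests, e.g. the SNI host
        Type: &tlsv3.Secret_TlsCertificate{
          TlsCertificate: &tlsv3.TlsCertificate{
            CertificateChain: &corev3.DataSource{
              Specifier: &corev3.DataSource_InlineString{InlineString: certPEM},
            },
            PrivateKey: &corev3.DataSource{
              Specifier: &corev3.DataSource_InlineString{InlineString: keyPEM},
            },
          },
        },
      }
    }

    func main() {
      s := secretFor("blog.example.com", "-----BEGIN CERTIFICATE-----...", "-----BEGIN PRIVATE KEY-----...")
      fmt.Println("would serve secret:", s.GetName())
    }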


We are using OpenResty with lua-auto-ssl for exactly this purpose, and it works like a charm.


Lighttpd seems to have solutions for that. Did you have a look at it?

https://redmine.lighttpd.net/projects/lighttpd/wiki/Docs_SSL...

> A traditional problem with SSL in combination with name based virtual hosting has been that the SSL connection setup happens before the HTTP request. So at the moment lighttpd needs to send its certificate to the client, it does not know yet which domain the client will be requesting. This means it can only supply the default certificate (and use the corresponding key for encryption) and effectively, SSL can only be enabled for that default domain. There are a number of solutions to this problem, with varying levels of support by clients.

Then, the best approach seems to be the following:

> Server Name Indication (SNI) is a TLS extension to the TLS handshake that allows the client to send the name of the host it wants to contact. The server can then use this information to select the correct certificate.
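
As a rough illustration of the same idea outside any particular server, here is how a Go process can pick a certificate lazily during the handshake based on the SNI name (plain standard library; the cert paths and lookup are hypothetical placeholders):

    package main

    import (
      "crypto/tls"
      "errors"
      "fmt"
      "net/http"
    )

    func main() {
      cfg := &tls.Config{
        // Called during the TLS handshake, after the client has sent the SNI
        // name, so the right certificate can be chosen per domain.
        GetCertificate: func(hello *tls.ClientHelloInfo) (*tls.Certificate, error) {
          if hello.ServerName == "" {
            return nil, errors.New("no SNI provided")
          }
          // Placeholder lookup: a real setup would consult a cert store or an
          // ACME client here (and cache the result).
          cert, err := tls.LoadX509KeyPair(
            "/etc/certs/"+hello.ServerName+".crt",
            "/etc/certs/"+hello.ServerName+".key",
          )
          if err != nil {
            return nil, err
          }
          return &cert, nil
        },
      }

      srv := &http.Server{
        Addr:      ":443",
        TLSConfig: cfg,
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
          fmt.Fprintf(w, "hello, %s\n", r.Host)
        }),
      }
      // Empty cert/key file arguments are fine because GetCertificate is set.
      srv.ListenAndServeTLS("", "")
    }

Roughly speaking, Caddy's on-demand TLS and lua-auto-ssl are this callback plus an ACME client and a cache.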


Traefik makes it fairly easy (inasmuch as it makes anything easy). But it's just a proxy, not a web server.


Traefik can work, yes - the documentation is really terrible, though, for anything approaching that use case. We had to really futz around with it and eventually went back to Caddy. Our use case is several thousand client domains that just proxy to some backends.


Did you move to caddy 2?

Our initial use-case was ingress for Docker Swarm - after a foray into k8s with the "traditional" nginx ingress and its rather hackish Let's Encrypt contraption.

I briefly looked at caddy 2 - but wasn't able to find any out-of-the-box tricks for listening to docker messages and dynamically configuring sites in a sane way.

Do you use custom code and configure caddy via the api?


> I briefly looked at caddy 2 - but wasn't able to find any out-of-the box tricks for listening to docker messages and dynamically configure sites in a sane way.

Like this? (Am not a Docker user, but I know this is an insanely popular solution) https://github.com/lucaslorentz/caddy-docker-proxy

There's also a WIP ingress controller: https://github.com/caddyserver/ingress/


I would think the way to do this would be to run a separate TLS daemon that handles the certificates (including ACME challenges, presumably) and then pass the socket to your HTTP server, either by proxying it (preferably to a unix socket) or by actually passing the FD along with the session keys.

I don't think hitch (formerly stud) supports ACME challenges, but that's where I'd start.


Apache can do automatic TLS with mod_md.


One of my friends brought this post up to me in the morning. The post is awesome and inspirational (it caused a discussion in our chat group), though I can't agree with some trivial points.

> Nginx performance without stats collection is on par with Envoy, but our Lua stats collection slowed Nginx on the high-RPS test by a factor of 3. This was expected given our reliance on lua_shared_dict, which is synchronized across workers with a mutex.

The `factor of 3` seems quite large to me. Maybe you put all your stats in lua_shared_dict? You don't need to synchronize the stats every time. Since collection typically happens at a per-minute frequency, you can keep the stats in a Lua table and synchronize them once every 5-10 seconds.
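
The general pattern is just "buffer per worker, flush to the shared, mutex-protected store on a timer". Here is a sketch of that idea in Go, purely for illustration; in OpenResty the local buffer would be a plain Lua table and the flush a recurring timer writing into lua_shared_dict:

    package main

    import (
      "fmt"
      "sync"
      "time"
    )

    // shared plays the role of lua_shared_dict: a single mutex-protected map that
    // every worker ultimately writes into.
    type shared struct {
      mu     sync.Mutex
      counts map[string]int64
    }

    func (s *shared) add(deltas map[string]int64) {
      s.mu.Lock()
      defer s.mu.Unlock()
      for k, v := range deltas {
        s.counts[k] += v
      }
    }

    // worker buffers its counters locally and only takes the shared lock on a
    // timer, instead of once per request.
    func worker(s *shared, stop <-chan struct{}) {
      local := map[string]int64{}
      flush := time.NewTicker(5 * time.Second)
      defer flush.Stop()
      for {
        select {
        case <-flush.C:
          s.add(local)
          local = map[string]int64{}
        case <-stop:
          s.add(local) // final flush
          return
        default:
          // Simulated request handling: increment a local counter without
          // touching the shared lock on the hot path.
          local["requests_total"]++
          time.Sleep(time.Millisecond)
        }
      }
    }

    func main() {
      s := &shared{counts: map[string]int64{}}
      stop := make(chan struct{})
      var wg sync.WaitGroup
      for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() { defer wg.Done(); worker(s, stop) }()
      }
      time.Sleep(6 * time.Second)
      close(stop)
      wg.Wait()
      fmt.Println("requests_total:", s.counts["requests_total"])
    }

The point is simply that the hot path never touches the shared lock; only the periodic flush does.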

It looks like the Nginx being compared is configured as part of a system that has survived for years and is not up to date. The company I worked with used a single virtual server to hold all traffic and routed it dynamically with Lua code. The upstream is chosen by Lua code too. There is no need to reload Nginx when a new route/upstream is added. We even implemented an 'Access Log Service'-like feature so that each user can have her favorite access log (by modifying the Nginx core, of course).

However, I don't think this post is incorrect. Where Envoy surpasses Nginx is that it has a more thriving developer community. More features have been added to Envoy than to Nginx in recent years. Not only that, open discussion of Nginx development is rare.

Nginx is an old, slow giant.


We've made a note about how inefficient our solution was and what the plan was to fix it. Sadly, to get proper stats in nginx we needed two things:

* C interface for stats, so we would have access to them from C code.

* Instrument all `ngx_log_error` calls so we would have access not only to per-request stats but also various internal error conditions (w/o parsing logs.)

That said, we could indeed just improve our current stats collection in the short term (e.g. like you suggested, with per-worker collection and periodic lua_shared_dict sync.) But that would not solve the long-term problem of lacking internal stats. We could even go further and pour all the resources that went into the Envoy migration into nginx customizations, but that would be a road with no clear destination because we would be unlikely to succeed in upstreaming any of that work.


> The `factor of 3` seems quite large to me. Maybe you put all your stats in lua_shared_dict? You don't need to synchronize the stats on every request. Since collection typically happens at a per-minute frequency, you can keep the stats in a Lua table and synchronize them to the shared dict once every 5/10 seconds.

Any pointers on how to achieve this for someone just starting out with lua and openresty? I have the exact same thing (lua_shared_dict) for stats collection, would love to learn a better way.
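A minimal sketch of the pattern being suggested above (module, dict, and key names here are made up, and it assumes a `lua_shared_dict stats 10m;` directive in nginx.conf): counters accumulate in a plain per-worker Lua table, and a timer flushes them into the shared dict every few seconds, so the shared-dict mutex is taken once per flush instead of once per request.

    -- stats.lua (hypothetical module name)
    local _M = {}
    local counters = {}                  -- per-worker Lua table, no locking

    function _M.incr(key, value)
      -- called per request (e.g. from log_by_lua_block); just a table write
      counters[key] = (counters[key] or 0) + (value or 1)
    end

    local function flush(premature)
      if premature then return end
      local dict = ngx.shared.stats      -- lua_shared_dict stats 10m;
      for key, value in pairs(counters) do
        if value ~= 0 then
          dict:incr(key, value, 0)       -- incr with an init value
          counters[key] = 0
        end
      end
    end

    function _M.init_worker()
      -- flush the per-worker table into the shared dict every 5 seconds
      ngx.timer.every(5, flush)
    end

    return _M

Wire it up with `init_worker_by_lua_block { require("stats").init_worker() }` and `log_by_lua_block { require("stats").incr("status_" .. ngx.status) }`. The trade-off is that up to one flush interval of counters can be lost if a worker dies.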



nginx had a cold (by American standards) and conservative community to begin with; the commercial version and F5 ownership likely "closed" it even more

it's a pity that the community never evolved with nginx's growth and success


> C++14 is not much different from using Golang or, with a stretch, one may even say Python.

That's... definitely a stretch.


If anyone was wondering, this is solely for proxying, not for old-school web server functionality, e.g. static file serving.


We've actually started experimenting with converting our static file serving to "proxying to S3 + caching." This is simpler from deployment and development perspectives (for companies that do not have a distributed filesystem, like Google with its GFS):

* for deployment we do not need to maintain a pool of stateful boxes with files on them and keep these files in sync.

* for development, engineers now have a programmatic interface for managing their static assets.


I'm positively surprised that Dropbox (at least from what I understood from the post) didn't require lots of changes or patches on top of the upstream codebase of Envoy to migrate their traffic!


We did require some of them[1]. Esp. painful were Transfer-Encoding quirks, and some dances around old HTTP/1.0 backends and request buffering.

Compared to NGINX though, it was relatively easy to push these fixes upstream. Community is very welcoming to outside contributions.

[1] https://dropbox.tech/infrastructure/how-we-migrated-dropbox-...


We do have some local patches as well (mostly for integration with our own infrastructure - stats collection, some RPC-specific stuff). As SaveTheRbtz mentioned, we encountered some issues with non-RFC clients and corner cases which were not exposed when Envoy is used in a "trusted" environment, etc., but all our fixes are now in upstream, so the next migrations will be way easier both for us and for other Envoy users.


I did not quite get how they configure Envoy. Did they write their own control plane? Use Ambassador/Istio/Gloo?


We have a mix of static and dynamic configuration. We started with almost everything defined in the static configuration and implemented our control plane only for the endpoint discovery service. Over time we implemented more and more features there (certificates, TLS tickets, route and vhost configuration, etc). We decided to write our own control plane implementation - the core part is actually pretty simple and easily extensible.


We have built our own control plane in Golang, tightly integrated with our existing infrastructure (service discovery, secrets/certificates management, config delivery, feature gating, and so on).


Did you consider using commercial nginx? If so, what made you decide against it?


Sadly, it would probably be as hard to maintain as the open source version. We really want to have access to the code to make sure we can fix it, troubleshoot it, and understand it fast...

Things that might have helped:

-- Configuration definition (e.g. protobufs.)

-- More focus on observability: error metrics (instead of logs), tracing, etc.

-- gRPC control plane.

-- C++ module development SDK.

-- (ideally) bazel.

-- Some dataplane features like gRPC JSON transcoding, gRPC-Web, and HTTP/2 to backends.


Don't any of the major commercial open source vendors offer custom terms to give access to the commercial source? I'd imagine they'd contemplate it for big deals. Seems like one of the only ways to keep some of these sophisticated customers onboard.


The open source argument is valid -- most enterprise software vendors do provide source code access (under NDA.) The rest of the arguments stand though: as it is right now, it is way more developer/operator friendly to use Envoy in our production.


The price is really insane for Nginx commercial.


As an Enterprise software vendor myself, I can assure you: everything is negotiable at Dropbox’s scale including very deep discounts.


About $2000 per host.


How does Envoy compare to Caddy 2 ? https://caddyserver.com


To tell you the truth, we didn't consider it. From what I can gather from the architecture docs[1], it can be a decent platform for apps, but might not be the best choice for a general purpose ingress/egress proxy (at least for now.)

[1] https://caddyserver.com/docs/architecture


It is a great choice for a general purpose proxy. (That's kind of the point.)


But they mentioned that they wanted to use C++ instead of go to get even that extra performance out.

I use Caddy a lot and it's perfectly fine for my scale, but at dropbox's scale, maybe go wouldn't be enough for the ingress part?


I just lament the increasing deployments of programs written in memory-unsafe languages to the edge, in general.

I am more curious what makes the author think Caddy "might not be the best choice for a general purpose ingress/egress proxy" (there were no other qualifications to that statement, but no evidence to support it either).


Yeah, to its credit, the article brought it up but then kinda hand waved away "envoy had many more security issues than nginx". Having a huge load of C library dependencies in a user-facing service seems like a bug these days.

Part of reducing dependencies in my own software was a conscious decision to minimize future CVE exposure.


> C++ instead of go to get even that extra performance out

Might as well use C then (with some hand-written asm sprinkled in where the compiler gets confused and doesn't see an obvious optimization) for that. And I'm not even being sarcastic here (I wish I was though)



Sensing a bit of a trend here. Didn't another major player recently make the same switch?


I think the best slice of who's migrating to Envoy can be observed via EnvoyCon talks[1][2]:

* Lyft (of course)

* Spotify

* Stripe

* Square

* eBay

* Yelp

* Pinterest

Plus the support from major cloud providers: Google, Microsoft, and Amazon.

[1] https://envoyconna18.sched.com/ [2] https://envoycon2019.sched.com/


They must all be gRPC users. Developers are pushing gRPC and protobuf pretty hard in companies. The next step down the road is to move to Envoy as the load balancer. Otherwise these protocols don't work well over traditional HTTP infrastructure.


So, seems like nginx is fine until your company reaches the "we are worth billions now" scale?


nginx is never fine for load balancing; they put basic features like metrics behind the paid edition. It's not sane to operate in production.

https://thehftguy.com/2016/10/03/haproxy-vs-nginx-why-you-sh...


I work for a billions-scale company. Nginx is still fine. You'll need to be prepared to pay for better operations, management, visibility, and protocol support. You can either pay them or build it in house, but you will want to pay.


Discord is also on that list - although we have not spoken much about it yet!


It may actually become a trend. For well known reasons:

- Community

- Nginx served us well for almost a decade. But it didn’t adapt to current development best-practices

- Operationally Nginx was quite expensive to maintain

- C++

- Observability and monitoring

etc...


I'd add another reason: so many people only use nginx as a reverse proxy, and the proxy configuration feels duct-taped on sometimes. Envoy being written as a proxy first makes it a better interface IMHO.


Is C++ generally considered to be "better"?

I've always looked at it (esp. with the STL) as kind of a "Swiss-Army-Chainsaw" where you were going to shoot your eye out. Maybe that view is old and things are better - but I learned a while back that sending a young gun into a C++ application's code-base would lead to a world of pain.

Maybe that learning is no longer accurate? What do you think?


When we are comparing C and Lua (Nginx) with C++ (Envoy), yes, C++ is better :).


Honest question: In what way?

Platform wars are over-ish. We have the same compile targets. What they call "Undefined Behavior" is relegated to ... well ... platforms we are not supporting.

C is fast - simple - easy(ish) to learn, and easy to "fuzz" in testing.

I can't speak to Lua - but C++ looks like a mine-field (to me).

Why do you declare that C++ is "better"? (Seriously interested - I don't even know enough these days to have a debate. I just gave up on the C++ hell-hole years (decades?) ago, and maybe should have kept up)


There are many reasons which lead to a cleaner codebase; some of them are: RAII, smart pointers, const types, reusable containers, the standard algorithms library, a cleaner way to define interfaces, etc.

Overall our experience is that C++ code is smaller, simpler to write/read, and has a smaller chance of mistakes than equivalent logic written in C.

Of course many of these points are relevant only for relatively modern C++ (C++11/C++14 or newer); before that the cost/benefit ratio was much less clear.


Fair enough. Thank you.

In my case - C (as a language) had a smaller footprint, and if the targets were limited, it was easier to learn, to lint and to code-inspect.

Admittedly, this was mostly before C++14. I guess this might be a case of "once bitten, twice shy".

Thank you.


I'm not declaring that C++ is better for everything. In our case it is better because it makes this part of the infrastructure more sustainable: there are more engineers who can code well in C++ vs C in our company and in the industry overall. Also, it is easier to code in C++ as it is a general-purpose programming language with a lot of libraries available, open source projects, a community around it, etc.


Maybe we program in different verticals. I have not found it to be so. (I have still upvoted your comment).


I value your opinion.


Well, that is very gracious - esp. here.

I value yours.

You have different experiences than I do - so our conclusions will differ.

That said - I "feel" as if C++ is a dangerous serpent of a language. Maybe I need to spend 6 months re-acquainting myself in complex environments with more developers than just me, and re-evaluate that presumption on a medium-size project.

Thank you.


Your happiness is important to me.


I think what they're trying to say is that developing plugins for nginx in Lua is not great. It's not a rant about languages.

This has to do with nginx not having the required features (features locked behind the paid edition, or gRPC support nonexistent), forcing you to develop plugins in Lua (the only supported language) to compensate, and Lua is too slow for this sort of stuff (processing Gbps of traffic in real time is no joke).


That makes sense. Thanks. (I still prefer C - but I am admittedly getting old and C++ sucked in the mid-90s or so)

EDIT: BTW -- I am not going to argue with Lua throughput. I'm still not sure what the thinking was there (maybe time-to-prototype?) - but C plugins run faster than Apache's do. By, like, a lot. (And I like Apache! ...Having used it since 1996)


It depends on the team. C++ can be better. It can also be worse.


It's a drop in the bucket compared to Nginx usage.


That is indeed true. But I remember the time when we were rolling out nginx back in the 2000s, and exactly the same thing was said about Apache.


Not if the people switching are the cool crowd, which is exactly what I think is happening here.


HAProxy is pretty popular too.


Because most nginx usage is different?

Of course, you can serve static assets using Envoy, and maybe even connect a FastCGI app without very much hassle. But it's quite a bit less straightforward.


Slack announced they were going to switch.


https://h2o.examp1e.net/

This is also a good web server. Configuration is done in YAML. Also, it claims to be very fast.


A shame they picked nginx in the first place; it has all the stats and critical features behind the paid edition. HAProxy is always a better choice for load balancing.

Besides that, it looks like the move was significantly driven by gRPC and protobuf. No surprise here; gRPC really doesn't work well over traditional HTTP infrastructure. Once a company starts using the Google stack, they have to move to more of the Google stack to make it usable.


Our technology stack is very gRPC-friendly, so the developer experience is actually better with it than without (though this is very subjective.)

As for the middleboxes, using gRPC-WEB[1] allowed us to switch Desktop Client App to gRPC even behind firewalls/IDSes that do not speak HTTP/2 yet.

As for HAProxy, Dropbox used to use it (circa 2013) specifically for load balancing, but we eventually replaced it with our Golang proxy. That said, recent HAProxy improvements (v2.0+) make it quite an awesome dataplane and an excellent load balancer!

[1] https://github.com/grpc/grpc-web


Thank you for the thorough comparison. Could anyone chip in on whether a recent HAProxy version would be a better choice than nginx and/or Envoy in a similar case?


Can somebody speak to why dynamic upstreams included in a file paired with `sudo service nginx reload` for prod deploys stopped scaling?


Nginx configuration is bound to its workers. When you reload nginx, new workers are created (and start responding to new connections) and the old ones are drained. The draining finishes when the last connection is finished or a timeout is reached. In OSS nginx every upstream change requires a configuration reload. If you have lots of upstream changes and don't want to terminate connections prematurely, this can quickly require lots of RAM as you have many workers. A stock nginx worker is around 150MB, but issues with OpenResty integration (they mention Lua usage) can bloat this to > 1GB.


It is easy enough for simple cases (and we used it for quite a while, until we moved to using Lua for that.) For more complex scenarios you will have new `server` blocks, certificates, TLS tickets, log files / syslog endpoints, so the automation will end up interacting not just with a single dynamic upstream file but with a rather large number of system interfaces. The control plane ends up being distributed between config generation, filesystem state, and service configuration (e.g. syslog.)

On a more practical note, each nginx `reload` will double the number of workers, almost doubling memory consumption and significantly increasing CPU usage (we need to re-establish all TCP connections, re-do TLS handshakes, etc.) So there are only so many reloads you can do in an hour.
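For readers who haven't seen the Lua alternative to reload-based upstream updates, a rough sketch of the pattern (names are hypothetical, and a real setup would refresh the peer list from service discovery via ngx.timer instead of hard-coding it) using `ngx.balancer` from lua-resty-core:

    # nginx.conf (abridged)
    upstream dynamic_backend {
        server 0.0.0.1;   # placeholder; never actually used
        balancer_by_lua_block {
            local balancer = require "ngx.balancer"
            -- in a real deployment this table would live in a module and be
            -- refreshed by a timer from service discovery; hard-coded for brevity
            local peers = { {"10.0.0.1", 8080}, {"10.0.0.2", 8080} }
            local peer = peers[math.random(#peers)]
            local ok, err = balancer.set_current_peer(peer[1], peer[2])
            if not ok then
                ngx.log(ngx.ERR, "failed to set peer: ", err)
                return ngx.exit(500)
            end
        }
    }

Picking peers in Lua avoids the reload entirely, but as noted above it only covers upstreams; new server blocks, certificates, and the rest still need a reload or a different mechanism.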


nginx is not well suited for constantly reconfiguring your infrastructure on very hot servers. This is a problem when you expose such infrastructure configuration to users (think Cloudflare), but otherwise you can just mitigate this problem by having a sane deployment strategy.


One nice thing about OpenResty (nginx) and its Lua support is that it plugs in at TLS negotiation. Does Envoy?


Can you describe your use-case?

If you are talking about the ability to select a certificate on the fly via `ssl_certificate_by_lua_block`[1], we are not aware of such functionality. If you are missing something, I would highly encourage you to discuss it with the community on GitHub!

From Oleg Guba, Traffic Team TL, co-author, and person driving the deployment:

* ListenerFilters + NetworkFilters are flexible enough that some of the custom logic can just be moved to the config.

From Ruslan Nigmatullin, our head Envoy developer:

If you are talking more about custom verification code, there are already a couple of ways to do that:

* Client TLS auth Network Filter: https://www.envoyproxy.io/docs/envoy/latest/configuration/li...

* Alternatively, if you are writing C++ extension you can use Network::ReadFilter, Network::ConnectionCallbacks.

[1] https://github.com/openresty/lua-nginx-module#ssl_certificat... [2] https://github.com/openresty/lua-resty-core/blob/master/lib/...
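For context, the OpenResty hook being referenced looks roughly like this (a sketch only; `load_pem_for` is a made-up helper standing in for whatever cert store/cache you use, and the statically configured fallback ssl_certificate stays in place for clients without SNI):

    ssl_certificate_by_lua_block {
        local ssl = require "ngx.ssl"

        local name = ssl.server_name()        -- SNI hostname sent by the client
        if not name then
            return                            -- fall back to the static ssl_certificate
        end

        -- hypothetical helper: fetch PEM cert/key for this domain from a
        -- shared dict, lru-cache, database, etc.
        local cert_pem, key_pem = load_pem_for(name)
        if not cert_pem then
            return
        end

        ssl.clear_certs()                     -- drop the statically configured cert

        local der_cert = assert(ssl.cert_pem_to_der(cert_pem))
        assert(ssl.set_der_cert(der_cert))

        local der_key = assert(ssl.priv_key_pem_to_der(key_pem))
        assert(ssl.set_der_priv_key(der_key))
    }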


WordPress and others use this to load certs on the fly. When you are a multi-domain host this matters a lot.

You don’t just load up a million certs as files and restart the server (though I do know a company that does something like this, but man, quite brittle).


Is Condoleezza Rice still working for Dropbox?


She’s on the board of directors: https://www.dropbox.com/about.


She has never worked for Dropbox.

Dropbox works for her :D


And finally we got nginx as legacy now, lul.


Seems like they could have switched to openresty instead and saved quite a lot of effort in their migration, but oh well, they probably just couldn't handle the 1-indexing /s


Seems like they were already using openresty. Having used openresty professionally, I appreciate that it provides ways to write code to solve a lot of the problems outlined in TFA, but solving the problems out of the box is significantly better.


Who does Dropbox compete with these days? They have pretty much the highest prices for the least amount of value. The only reason I see them mentioned here frequently is their connection with Y Combinator.


I noticed on social media a lot of negativity toward Dropbox and sometimes even on HN. The negative sentiment appears to come from tech circles who feel One Drive offers a better price point or iCloud works great for them, so Dropbox shouldn't exist.

Personally, I prefer Dropbox. I found problems with One Drive. The Google Drive client was always hit and miss and I could not rely on it. iCloud is not cross-platform (afaik). Dropbox has worked wherever I needed it.

Dropbox is more expensive but I prefer to have my files in Dropbox (as a separation of concerns) rather than have a single tech company control every aspect of my life.

My experience with the 'average Joe' is that Dropbox is easy and it works. Yes, they might save a couple dollars switching to OneDrive, but Dropbox still offers a good product. Will Dropbox survive long term? I certainly hope so. I have no affiliation, aside from being a customer.


Insync solves a lot of Google Drive issues (not all! - the fundamental organization (and ability to search - at Google!) is horrible), Box.com is not bad for auditability and observability, but One Drive keeps trying to suck you in (You think you are out! But you are not!).

For the extra buck-or-two per user per month - I just like the fact that it "just works" for most people and requires little tech support. (Although I do miss the RSS feed on events that they removed that helped me keep track of all of the "stuff" "the people" were doing with "all the files". I'm sure there was a reason - but that was actually the only feature that made me think that they and Box.com might be comparable in that area)


I don't use Google Drive per se, since I run Linux. I primarily use rclone and previously had issues with Dropbox throttling uploads as well. Currently I pay $12 a month and get unlimited storage with G Suite. On top of all the other G Suite features, Dropbox doesn't offer anything close in terms of price or features.


I've admired Drew and the early Dropbox team for getting things done and shipped even when compiled Python GUIs were as edgy as the initial Rails version of Twitter was. But they shipped and validated the market. Now, adding all the fancy and cool tech mentioned in the blog post will increase the complexity by a lot, but it's not clear what the real benefits are. Does a decreased number of machines running really justify the migration and addition of complexity? Maybe they have some new products in the pipeline that build upon the new stack. Or maybe they are wasting their time. We will see.


> Does a decreased number of machines running really justify the migration and addition of complexity?

Of course it does. I am not sure why you think it doesn't.


servers are a commodity and cheap compared to runtime complexity


This attitude gives us ever less efficient software that endlessly gobbles up resources for no good reason.


Yes, but it's a lot of servers.


How many? 5000 servers? As a public company it should not really matter at that size. Developers are way more expensive.


An effect of being VC funded: the engineers hired later on only care about engineering problems (plus the company nametag & money), not user-level problems. You would think Dropbox would be concerned with providing the best value for the buck, but nah. Plenty of other examples: FB, Google - Google software quality sucks for a place that employs 100s of engineers.


There are a lot of things they do that I'm not happy with, but it is still very valuable for the price point. Nowhere else can you get unlimited storage that is fast for so cheap, and they are the only cloud provider that doesn't charge egress to those services.


With that move they are actually decreasing complexity and improving manageability and sustainability.


If you change all the tires of a running car, you are increasing complexity and risk. The blog post describes the fallacy/anti-pattern of starting from scratch. It's usually a people problem: people don't want to understand the source of the current state and also don't want to fix it. A green field looks so much nicer than dirty plumbing. But it's wrong.

Another sign of this anti-pattern is the hyper-focus on the green-field solution without thinking about simpler solutions (better DevOps tooling, hiring C++ developers to rewrite slower Lua code, etc.)


Google and Apple, to start.


They don't compete though. Google offers 17+ GB free with email, an office suite, unlimited photos, Drive, and Voice with a free number for unlimited calling and texting. Dropbox has much stricter bandwidth limits as well.

For $9.99 you get all that plus 2TB of storage with Google One. Dropbox has a minimum of 3 users for their business plan, but with 1 user on G Suite for $12/mo I get unlimited storage and all the goodies I mentioned before.


I'm learning to believe that there are various sub-sects of the HN crowd: the people who would rather pay $3/month to host a slow-ass VPS, and those for whom a $10/month delta is a "win" as long as their workflow is optimized.

In fact, I think that drives a lot of these "what is better" debates in threads here. Some people go "Google Drive is better, because I get 2TB for a flat fee that bundles the other services" (I do that plan too - I just subscribe to everything - to match the client's workflow. Where it really sucks is that I commercially subscribe to 3 (4?) video conferencing systems).

I am not going to choose to save $5 when it stands in the way of me making $100. I find that thinking impoverishing, and time-wasting, and frankly stupid.

Even if I have to pay a designer $1000 to remake my slides after the content is settled for a client trying to pay me $15K, so that they can raise $2.5M, I'll gladly pay it! That doesn't seem to be the mindset here? (Or maybe I'm just coming across the people that shill for Vultr over Digital Ocean (or God-forbid - AWS!) - instead of focusing on velocity of earnings. Maybe I just come across the wrong posts)

But, that's me - and I suspect I am not the majority here.


I think the real split is between 'actually making money using these tools' (you) and 'personal projects but want to make money' (lots of others) (discounting the 'not in the USA' folks here, which is, for sure, also an issue - $10 in USA for a dev is nothing, but might be a huge deal for a dev in another country).

When you cost your business hundreds of $k/yr, who cares about $5-100? That's less than the cost of the free coffee and snacks!


> $10 in USA for a dev is nothing, but might be a huge deal for a dev in another country

Not to mention that, USD 10 is a variable amount of money. Today, USD 10 is 25% more expensive than it was last year (and a couple of months ago, it was almost 50% more expensive than it was last year). And there's also the hassle of paying in a foreign currency (and not having common payment methods like boleto bancário available), and the annoying tendency of transactions from another country being blocked as suspicious by the credit card provider.


I am with you; I also feel that HN has kind of turned into the Slashdot of the early days.

On a website that started out discussing business ideas and how to turn them into profitable endeavours, it is a bit surreal that every time someone brings up commercial products, one gets endless posts about free-beer alternatives and how such software is doing a disservice to the community.



