Microsoft cURLs too (haxx.se)
268 points by TXCSwe on Jan 13, 2018 | 103 comments



One thing I am always sure to share with colleagues when we discuss curl is the fact that you can generate the underlying C code straight from the command line.

This is pretty useful when creating a CLI for pretty much any app, and I've used it regularly for exactly that.

My post on how to do it: http://austingwalters.com/export-a-command-line-curl-command...
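For anyone curious, the flag is --libcurl; a minimal sketch (example.com as a placeholder):

    # perform the transfer and also write the equivalent libcurl C source to request.c
    curl --libcurl request.c https://example.com/

The generated file contains the curl_easy_setopt() calls matching the command-line options you used.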


There is one for generating Go code as well: https://github.com/mholt/curl-to-go. The code it produces doesn't depend on libcurl, either.


That's amazing. Never knew that. Learned something today.


It wasn't until 2006 that curl added HTTP/1.1 pipelining support. Hence I always used netcat instead of curl, because I utilised pipelining heavily for text/html retrieval (IME, most servers supported it).

Imagine something like this with curl:

   curl << eof
   http://example.com/a.htm
   http://example.com/b.htm
   eof
where curl only opens a single connection.

Alas, AFAIK, pipelining is still not enabled in the curl binary.
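For reference, here's roughly the netcat approach (a sketch: two pipelined requests over one TCP connection, assuming a plain-HTTP server on port 80):

    printf 'GET /a.htm HTTP/1.1\r\nHost: example.com\r\n\r\nGET /b.htm HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n' \
      | nc example.com 80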

As I understand it, the --libcurl option only generates code for what is possible with the curl binary, e.g., curl_easy_init(), curl_easy_setopt(), etc.

As such, it will not generate code using curl_multi_init(), curl_multi_setopt(), etc.

I have to automate the code generation myself.


If you invoke "curl http://example.com/a.htm http://example.com/b.htm", curl will only use a single connection.


Yes, but it will wait for Response A before sending Request B.


It probably isn't too obvious here, but yeah I know how it works =)

(I'm Daniel, who wrote the blog post discussed here and leads the curl development...)


To reproduce:

   curl --libcurl 1.c http://example.com/a.htm http://example.com/b.htm

   grep curl_multi_init 1.c
https://curl.haxx.se/mail/archive-2008-02/0036.html


Here is how RFC 7230 defines "HTTP/1.1 pipelining":

"6.3.2. Pipelining

A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response)."

AFAIK, the curl binary does not do pipelining by this definition.

And, AFAIK, it will not generate code to do pipelining by invoking it with --libcurl.


Brilliant, thanks for sharing.


This is amazing! Thanks for sharing


Amazing tip!


Curl is alright, and congratulations on this massive and very impressive step forwards, but the CLI is not exactly user friendly. httpie[1] is a great tool if you find curl invocation somewhat arcane.

1. https://httpie.org/


That seems like a tool designed for a very specific use-case (JSON APIs), rather than the general-purpose protocol interaction of cURL. The fact that it appears to be doing some things "behind your back" to be "helpful" like reordering headers or otherwise manipulating the data would certainly be unwanted in a general-purpose tool.


True. But for those of us who work with JSON APIs more often than not, it's very good. Being able to simply pipe in a file as a payload saves me a lot of time.

I know cURL does 1001 other things too, so the two tools aren't really in competition. HTTPie is more akin to Postman or Insomnia.


Use the "jq" command line tool to format json output. It's a life changer.

curl 'https://raw.githubusercontent.com/LearnWebCode/json-example/... | jq '.'
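It goes well beyond pretty-printing, too; a quick sketch against a hypothetical endpoint:

    # pull one field out of a JSON array response
    curl -s https://api.example.com/users | jq '.[0].name'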


HTTPie is great, but sometimes you need curl. Curl supports a lot more protocols.


Plus, it's easy to forget that httpie is SLOW. Sometimes when testing an API the timings are worrying, until you remember and switch over to cURL to confirm everything is right with the world.


I don't know why you are getting downvoted. That's true: starting a Python interpreter is a relatively expensive operation. There is a noticeable lag with most CLIs written in Python.

There has been some work recently to make the interpreter start faster. I hope we will see the result in Python 3.7.


Yes, but MS seems to disable many of them.


curl is much more than http.


Well, I am not sure I would call it arcane. There are only two things which I find a little user-unfriendly:

1. By default curl doesn't follow redirects and I think most use-cases (not all) require that behavior (at least from the cli).

2. Similar to wget, many users who start using curl do it to download something, probably a file. But unlike wget, curl doesn't write to a file but to stdout. Actually, I find curl's behavior much more UNIX-style, but it is probably the first obstacle every user has to tackle. Nevertheless, in the end this makes curl easier to use, because you do not have to remember which parameter sets the output file name; you can just use the universal unix operator '>', as in the example below.
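For example (placeholder URL), the wget-like behaviour is just:

    # follow redirects (-L) and choose the output file yourself (-o)
    curl -L -o file.tar.gz https://example.com/file.tar.gz
    # or the stdout-redirect style described above
    curl -L https://example.com/file.tar.gz > file.tar.gz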


curl may not have the greatest CLI, but it's a huge step up from doing the equivalent tasks with powershell.

Not to mention this brings Windows closer and closer to *nix. Making Windows more familiar to *nix users increases the number of ops folks willing to transition from *nix shops to Windows, increasing Windows' share of the server market.

Plus, as others have pointed out, curl handles protocols besides HTTP.


Yeah, but can you export an httpie command from the browser?


No, and that's a real shame :(


For debugging on remote servers this is pretty handy (at least, once it makes its way into the server versions).

Now I have hope they'll put in a text editor that understands unix line endings.


It's like they're trying to get a record by not doing that. For years I've been baffled why Notepad at the very least couldn't understand unix line endings.


It's possible they lost the source code and literally can't. They've done it before:

https://www.bleepingcomputer.com/news/microsoft/microsoft-ap...


Equation Editor was developed by a third party and MS probably never had the source code in the first place: https://support.microsoft.com/en-us/help/4057882/error-when-...


Notepad is literally a window with the standard Windows Edit control inside it, so they certainly have the source code.

My guess as to why they don't care to support \n-only is that there's been very little need to; anyone who needs more advanced editing isn't going to use Notepad anyway.

As the sibling comment mentions, WordPad (which is similar but with a RichEdit control) does support \n-only.


If you want notepad-with-LF-support, ReactOS, the open-source Windows NT clone, has this.


"It's like they're trying to get a record by not doing that. For years I've been baffled why Notepad at the very least couldn't understand unix line endings."

Probably the same reason that the CMD shell is so bad, and is only now being fixed up: since Windows is proprietary, nothing can be improved unless either the Microsoft decision-makers that are responsible for the product choose to spend budget on it, or there is a directive from higher-up in the company.


It's because Windows line endings are actually the technically correct ones.

Ever tried raw output to terminal?


I go back and forth on this.

On the one hand, you're right in that you need both a line feed and a carriage return to actually "start a new line" (+1 to Windows). On the other, it seems wasteful to have two characters do a job that, in text files, could be just as easily done by one; editing and storing text is not the same as printing it, especially when the print incantation is a vestige of mostly-obsolete hardware (+1 to Unix).


Truthfully I have the exact same view as yours (upvoted), but people rant without understanding. They are creating tomorrow's "evil" monopolies with their bandwagoning.


Maybe according to the defined behavior of those characters, but that can't be why Notepad doesn't support Unix line endings, as it still doesn't actually line feed when it sees a line feed character. I have never tried, but I also suspect it won't do a carriage return for just a carriage return character (to be fair, that wouldn't really make sense in a text editor anyway).


WordPad has supported Unix line endings since Windows 95 ;)


> They ship 7.55.1, while 7.57.0 was the latest version at the time. That’s just three releases away so I consider that pretty good. Lots of distros and others ship (much) older releases.

Indeed. I am running latest High Sierra and:

    curl --version
    curl 7.54.0 (x86_64-apple-darwin17.0) libcurl/7.54.0 
    LibreSSL/2.0.20 zlib/1.2.11 nghttp2/1.24.0


On a side note, for all my daily download tasks (other than debugging some web API) I've settled on aria2[0]. It seems to support every protocol used on the modern Internet and has plenty of flexibility (connection multiplexing, bandwidth control, etc.). It can even serve as a 24/7 torrent client managed via a remote API.
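A typical invocation might look like this (placeholder URL; -x caps connections per server, -s sets how many pieces the download is split into):

    aria2c -x4 -s4 https://example.com/big.iso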

0. https://aria2.github.io/


BTW, another program (and library) that should, IMHO, be made a standard component of every modern OS (except those that would choose to exclude it for a practical reason, e.g. some extremely lightweight and heavily specialized embedded ones) is SQLite3.


Cool. I've been installing wget on every Windows PC under my authority since the days of Windows 98SE. Every net-enabled operating system should have such a tool installed by default.


Now, just ~1000 other commands to go


But they're _all_ already there via WSL (Bash on Ubuntu on Windows) if you want them.

Not to mention that the PowerShell equivalents of a lot of *nix commands are _much_ better. "Everything is an object" is a brilliant philosophy and it's a joy to use.


Regardless of its merits, they blew it in terms of marketing. PowerShell is destined to be a dead-end technology used only by Windows sysadmins.


I don't think MS envisioned PS as anything other than a sysadmin tool. If they did, it doesn't seem to come through in the design.


Doubt that. You think in the PowerShell design phase, the team would have shrugged if some MS decision maker somehow mentioned "BTW, 10 years from now, for developers, we're going to be promoting installing a Linux emulation layer and using Bash"? I bet the PS folks would have been dismayed.


Just install busybox.exe from https://frippery.org/busybox/ to some directory in your %PATH% or in C:\ and run 'busybox sh'.


Cool, now I don't need to remember the arcane incantation to download a file with Powershell.

Do you think we'll see things like the good old "curl <some url> | bash" for Windows now? They still have no package manager worth using.


Windows 10 has a "package manager", of course, and it's quite good, but user culture, developer disinterest, and years and years of inertia keep it from being used for enough things to cross the network-effect threshold of being "worthwhile".

The 'download | execute' paradigm, on the other hand, is the complete antithesis of package management, where your desire to obtain and execute the code trumps your willingness and patience to wait until it has been vetted by your preferred package manager and installed in a less haphazard way. I fail to see your point.


If you are referring to the package management stuff in PowerShell, that's just a unified interface to several different package management systems, each of which handles specific types of packages. Windows has MSI, VSIX, PowerShell modules and probably some others as well, all of which are separate from Microsoft Update.


> curl <some url> | bash

That's still not a package manager worth using.


Have you tried chocolatey? I’ve really enjoyed using it


MSYS2 now comes with pacman! It's great.


Hasn't that been popular as iwr <some url> | iex for a while?


scoop(1) is a pretty cool package manager

1) https://github.com/lukesampson/scoop


> the good old "curl <some url> | bash"

I would be thankful if the habit of trusting a random IP with control of one's shell could die, forever.


Why would you be using a random ip? That incantation is usually used with a URL where you would have just downloaded and installed the software manually anyway.

There are various arguments against the curl-piped-to-shell idiom but "random ip" doesn't seem like a valid one.


A URL and an IP are not equal.

Building a script that acts differently for a web browser, a normal download, and curl, is trivial, and I've seen it happen. Here's a proof-of-concept[0] someone else wrote.

Manually downloading is safer; at least you can review the script before running it. Curling straight into a shell is inherently unsafe.

The better option is still a package manager, but curling straight to a shell is very unsafe.

[0] https://jordaneldredge.com/blog/one-way-curl-pipe-sh-install...


Why would you be using an IP and not a URL?

And why wouldn't you trust the source of the install script as much as any other installer?

Do you audit the binary installers you use as well?

I don't disagree about a minor difference between the methods, but I definitely disagree with piping to sh being "very unsafe". If you trust the site/author enough to run their code on your computer at all, the install method risk difference is but a tiny drop in the bucket.


> If you trust the site/author enough to run their code on your computer at all

Piping curl means you can't be sure it came from the author's site.

It means you can't be sure you're getting the same software you've been considering installing.

It means a broken connection is a broken install, with no cleanup and no idea what it has changed.

> Do you audit the binary installers you use as well?

Don't install random binaries either. The security implications of that should be fairly obvious.


> Piping curl means you can't be sure it came from the author's site.

Up to the trustworthiness of the CA system yes you can. If the author's site is serving malicious downloads to the curl UA then you're probably hosed either way. It would be easier to just slip malicious code in the software itself.

> It means a broken connection is a broken install, with no cleanup and no idea what it has changed.

This is the real draw of package management. The argument surrounding curl|bash should really focus on this rather than hand-wavy security.

> Don't install random binaries either

Nobody who is running curl|bash is installing a 'random' binary; they're downloading an installer from a source they trust.


> Up to the trustworthiness of the CA system yes you can. If the author's site is serving malicious downloads to the curl UA then you're probably hosed either way. It would be easier to just slip malicious code in the software itself.

Only if they have HSTS; otherwise you might end up using plain ol' HTTP by accident. Like over at surge.sh, but at least they use a package manager.

> Nobody who is running curl|bash is installing a 'random' binary; they're downloading an installer from a source they trust.

But you can't trust it, because most shell scripts out there are woefully inadequate. So you're one broken connection, one WiFi drop, from corrupting your system. At least a binary needs to be complete to run.

Example: Heroku's CLI [0]

If it breaks on the echo, you could end up overwriting your entire source list.

e.g. It breaks to:

    echo "deb https://cli-assets.heroku.com/branches/stable/apt ./" > /etc/apt/sources.list

instead of the intended:

    echo "deb https://cli-assets.heroku.com/branches/stable/apt ./" > /etc/apt/sources.list.d/heroku.list

And it'll work too, because the entire thing runs as the root user.

> The argument surrounding curl|bash should really focus on this rather than hand-wavy security.

They're the same thing. A broken connection with curl | sh is a security problem. As is downgraded https, because of an accidentally misconfigured host. As is running without even the basic check of seeing if you get the complete file before executing it.

Everything about curl | sh is inherently untrustworthy.

[0] https://cli-assets.heroku.com/install-ubuntu.sh


> Everything about curl | sh is inherently untrustworthy

Nope. Only one item is of minor concern (which I've covered many times in this thread) and it has an easy and known solution.

The rest of your objections are not specific to curl-piped-to-sh and are irrelevant to this discussion.

The sky is not falling, so I'm not sure what your agenda really is.


You didn't answer the 'random ip' question.

> Piping curl means you can't be sure it came from the author's site.

Assuming you take the same precautions you'd take with any other software download (like using https), there's no difference between curl-piped-to-sh and clicking on a link to a rpm, deb, exe, or anything else.

> It means a broken connection is a broken install

Not if the shell script is written correctly. And if you can't trust the source of your software to get that right, then you can't trust them to get the regular installers right either, so there's no difference here either.

> Don't install random binaries either

No one's advocating for random binaries, but you do have to install binaries from time to time, no? Or are you getting your CPU microcode updates in source form too?

I get it, it's a knee-jerk cargo-cult reaction to flame folks who don't see the huge issue with piping a https URL from the software's main web site to the shell, but if you actually think about it, it does not have the major flaws that you claim it does.


> You didn't answer the 'random ip' question.

Just because you have a URL, doesn't mean it is connecting to the expected IP. Not everyone uses https yet. Not everyone uses HSTS to protect against downgrading.

> Assuming you take the same precautions you'd take with any other software download (like using https), there's no difference between curl-piped-to-sh and clicking on a link to a rpm, deb, exe, or anything else.

And that assumption isn't a guarantee. Remember who these kinds of installs target: developers with little experience. You can't ensure they'll notice a missing 's'. You can't ensure a worn-out admin will either.

> Not if the shell script is written correctly. And if you can't trust the source of your software to get that right, then you can't trust them to get the regular installers right either, so there's no difference here either.

No. I can't. Have a glance over Heroku's Ubuntu script.[0] It's not fenced; if that echo breaks, it could cause some chaos. In fact, none of the commands are even checked for success, except the su.

> No one's advocating for random binaries, but you do have to install binaries from time to time, no? Or are you getting your CPU microcode updates in source form too?

If you look one level up, it's specifically binary installers. And secondly, I use a package manager, which has some review of this kind of thing.

I wouldn't curl a microcode update. That's asking for trouble.

[0] https://cli-assets.heroku.com/install-ubuntu.sh


> Just because you have a URL, doesn't mean it is connecting to the expected IP.

Ignoring how weak of an argument that is, I don't see how that is any different of a risk between curl-pipe-to-sh and regular software downloads.

> Remember who these kinds of installs target: developers with little experience.

I think that's a little condescending. I imagine these kinds of installers target folks who want to get up and running quickly and conveniently, regardless of their experience. And I imagine, on average, folks pasting this into their shell have more than average experience already, since they (1) went out of their way to try this software and (2) know how to open a shell and copy commands into it.

> You can't ensure they'll notice a missing 's'. You can't ensure a worn-out admin will either.

You think it's more likely they will notice the missing 's' in the click-to-download-the-installer scenario than in the paste-a-command-into-the-shell scenario? I find that hard to believe.

> Have a glance over Heroku's Ubuntu script.[0] It's not fenced; if that echo breaks, it could cause some chaos.

So file an issue. If their normal installer has bugs in it, things would break too. I don't see the difference. Buggy installers are buggy, which is just an argument against buggy installers, not against different install methods.

> I wouldn't curl a microcode update.

You missed the point. You can download it any way you like; unless you have its source, though, you can't audit it at all. So your claim that binary software is untrustable falls short in the practical world.


> So your claim that binary software is untrustable falls short in the practical world.

Where the hell do you think I made that claim?

> So file an issue.

Are you just trying to ignore everything I say? I responded to a claim that shell installers would be written correctly, with evidence that a fairly sizeable company doesn't get it right.

Your response to that is that it doesn't matter.

> You think it's more likely they will notice the missing 's' in the click-to-download-the-installer scenario than in the paste-a-command-into-the-shell scenario? I find that hard to believe.

A giant green bar is a little bit more than a single letter.

---

You really don't seem interested in a conversation about the shortcomings that exist. You seem interested only in picking holes and saying that you are correct.

I have no interest in responding to that kind of conversation.


> Where the hell do you think I made that claim?

Right here:

https://news.ycombinator.com/item?id=16144785

You said:

"Don't install random binaries either. The security implications of that should be fairly obvious."

> a fairly sizeable company doesn't get it right.

Fairly sizable companies mess up a lot of things. That still isn't a good argument against piping to shell though, since it isn't exclusive to that method.

> A giant green bar is a little bit more than a single letter.

Ah but you don't get the green bar for the file you are downloading, you only get it for the page that linked to it. So that's not good enough either.

> You really don't seem interested in a conversation about the shortcomings that exist.

You aren't raising many valid ones; it isn't my fault that the holes are so easy to find.

If you ease up on the hostile language and come back with some arguments that are a bit stronger, maybe you won't feel like this conversation is so one-sided.

I am truly interested in some serious arguments against piping to shell, if there are any besides the one I raised, since all I hear are these bogus ones any time this topic comes up. I have no horse in the race, but cargo cult shunning of a popular install method isn't right. One ought to have real arguments which stand up to a little scrutiny.


> I am truly interested in some serious arguments against piping to shell

Then here goes one last shot.

---

> You said:

> "Don't install random binaries either. The security implications of that should be fairly obvious."

I also said to use a package manager. So far as I'm aware, most package managers install binaries. I'm not an advocate against binaries.

In fact, I've said they're safer than curl | sh, because they don't execute when they're incomplete.

Package managers also check binaries to see not only that they're intact, but that they have a level of correctness to them.

Claiming I've said binaries are inherently unsafe could not be further from the truth.

---

> Fairly sizable companies mess up a lot of things. That still isn't a good argument against piping to shell though, since it isn't exclusive to that method.

It is a good argument against piping to shell, because only shell can break a system when the file is incomplete; other methods of streaming executables straight into execution are nearly non-existent.

And though you can protect against this, nobody has. If a frequently installed tool has the problem, then it is a problem, and can't just be ignored by saying that you can do it better.

Heroku have actually had the issue reported in the past, by myself as well as others. They've ignored it, because they don't think it's worth the effort.

They've also had SSL downgrading issues in the past, allowing MITM attacks trivially.

Docker has actually taken efforts to isolate their get script from this, but again, they aren't sure that they've actually plugged that hole.

If a company screws up their shell script, there's usually no recourse unless they're interested in listening to you.

If a company screws up their package, then they're required to act on it, or potentially have their package made inaccessible.

---

> I am truly interested in some serious arguments against piping to shell, if there are any besides the one I raised, since all I hear are these bogus ones any time this topic comes up.

There are exactly four problems I have with curl | sh.

* SSL downgrading. Every major organisation has faced it, and if curl | sh is your normal method, then there is probably a bot network waiting to step in and ruin a user's day. Most binary installers can be checked to see if they're broken, which mitigates most of the cloud of bots out there, because they don't tend to be sophisticated.

* Partial file execution. This is unique to the curl installer method, and can cause all sorts of havoc, and has. I've fixed systems that have had directories removed, and system files overwritten, and had $PATH destroyed.

* You can't mitigate curl | sh trustworthiness by checking in your browser. What you see in the browser and what curl fetches can be, and have been, different. I've seen browser downloads giving the latest stable, and curl giving the latest from git, which hasn't always been stable.

* Root. A lot of scripts will demand root at the start, so you don't get continually prompted. Unfortunately, this means they have the power of root when something like partial file execution leaves you with a line like: rm -rf $TMP, instead of rm -rf $TMP_DIR.

---

Piping to shell is unique in allowing partial execution, and it can wreak havoc on a system.

Allowing a package manager to use the experience of several decades to prevent edge cases when things go bad is best. (And submitting to most package managers isn't an ordeal; it generally involves crafting a file with fewer than thirty simple lines in it.)

Downloading an installer is less preferable, but not vulnerable to partial execution, and most installer frameworks that get used also do some checksumming, which is again better than nothing.

The problem with curl | sh is simple: it is easier to do the wrong thing, than the right thing.


> I also said to use a package manager.

Package managers are outside the scope of this discussion though. I agree they are great when they can be used, but for various reasons they can't be used (no root access, the software or the version or the configuration or the architecture one wants isn't available, etc).

This discussion really is about the relative differences in security between curl-piped-to-sh and downloading software some other manual way (opaque installer, tarball full of source, etc).

> Claiming I've said binaries are inherently unsafe could not be further from the truth

I'm not so sure. One of your arguments against curl-piped-to-sh is that it's hard to audit (which I disagree with in general, but that's irrelevant for this point), but if that really was a concern for you, then I don't think it's unreasonable to infer that anything hard to audit would be just as objectionable, like binaries in various forms (and therefore this argument wouldn't apply only to curl-piped-to-sh).

> They've also had SSL downgrading issues in the past, allowing MITM attacks trivially

Which applies to more than curl-piped-to-sh, so this is irrelevant here. This is an argument against Heroku, not against this installation method.

> If a company screws up their shell script, there's usually no recourse unless they're interested in listening to you. If a company screws up their package, then they're required to act on it, or potentially have their package made inaccessible.

The quality of the installation method is not a function of the method itself, but a function of its popularity. If you are arguing that less popular installation methods may have more bugs, then I agree with you, but to then argue that curl-piped-to-sh is therefore inherently buggier than installers or packages doesn't follow. I'd bet if a company's most popular installation method was curl-piped-to-sh and there was some problem with it that affected their users, they'd fix it just as quickly.

So this isn't specific to curl-piped-to-sh, and is irrelevant.

> There are exactly four problems I have with curl | sh.

> SSL downgrading

Not specific to curl-piped-to-sh. Irrelevant.

> You can't mitigate curl | sh trustworthiness by checking in your browser

Not specific to curl-piped-to-sh. Irrelevant.

> Root

Not specific to curl-piped-to-sh. Irrelevant.

> Partial file execution / only shell can break a system when the file is incomplete

This is the only actual argument related to the difference between the two, and it's one we've already hashed through a few times. I agree it's a difference, but I disagree it's a major flaw, cause for concern, or a reason to abandon the curl method.


So... You disagree that I'm allowed to say what I say, because apparently in your world it means something different...

And you consider everything else irrelevant because you want it to be.

You're not interested in a discussion.


Care to point out where you think we're getting our meanings crossed? (In a way that isn't snarky would be nice).

And you should also explain why you think my claims of irrelevance are solely based on my whims, when I've provided my reasons right along side my claims.

If you were interested in a discussion, for example, you'd provide reasons why you think your arguments against curl-piped-to-sh don't also apply to, say, downloading software installers. If you can't do that, then those arguments are irrelevant in a discussion where curl-piped-to-sh is being singled out.


Precisely. Why anyone would do that is beyond me.


If you're still using Windows PowerShell proper instead of PowerShell Core, you still need to delete the curl alias or call curl explicitly as curl.exe in order to use it.

Also, take a look at Chocolatey for package management. Yeah it's not a built-in thing, but it's pretty decent.


I hated Chocolatey. Those binary shims seemed to produce slow, unstable binaries. There also still isn't a fully-usable Chocolatey backend for the PowerShell package manager.

The Programs & Features pane works just fine for me as a package manager.


Yeah, “iwr” sure was hard to remember.


I'm pretty sure curl is also an alias for iwr.


Curling some URL and executing it is a massive security risk.


In what way is it more of a risk than downloading and executing an installer from the same site?

The only reason I can think of is if the script partially downloads and only half executes. Doesn't seem "massive" though...


As opposed to downloading in IE and then double-clicking the executable? At least on Windows that battle has already been lost.


What would be the reason behind disabling all those protocols?


Probably to reduce exposure to security issues. The more code you ship, the more code you're responsible for keeping secure.

Looking briefly at the list at https://curl.haxx.se/docs/security.html I see issues for FTP (x2), IMAP, and TFTP in 2017 alone. These protocols, which are outside of curl's core competency of HTTP, are likely to get less scrutiny and have more bugs. While FTP shouldn't be removed from curl, I don't think a protocol like TFTP or gopher is crucial, and I wouldn't mind too much if it got the axe in a distribution I used.
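As a side note, you can check exactly which protocols a given build enables:

    curl --version
    # the "Protocols:" and "Features:" lines show what was compiled in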


HTTP is NOT curl's core competency; "transferring data with URLs" is. HTTP just happens to be the most-used protocol in the world, and thus in curl.


I don't like Microsoft, but I still hope Linux on Windows has a bright future.


Either the Curl developers are at fault somewhere, which I somehow doubt, or distributions are really special snowflakes, which I also doubt, or software distribution in the Open Source world is, in my opinion, flawed:

> Finally, I’d like to add that like all operating system distributions that ship curl (macOS, Linux distros, the BSDs, AIX, etc) Microsoft builds, packages and ships the curl binary completely independently from the actual curl project.

Why would everyone rebuild it? There are some security considerations (matching source and binary; disabling "dangerous" stuff) and some feature considerations (disable stuff you don't need to reduce resource usage - maybe), but conceptually this seems so wrong to me.

Conceptually I'd want downstream packagers to talk to upstream developers so that upstream has reasonable defaults and settings and I'd want packagers to just package and make the package follow distribution conventions. But rebuild seems overkill.

Maybe I'm missing something obvious?


Almost all Linux distributions rebuild upstream software from source -- this ensures everything is built from the same toolchain (gcc/libc etc), and that the binaries distributed match the source.

It also allows for ease of patching in a stable release -- generally it's preferred to just fix specific high-impact bugs rather than moving to a new upstream version, which might introduce regressions.

(Context: I'm a Debian developer and on the Ubuntu MOTU team)


And then there is the whole dependencies shitstorm, where far too many upstreams have a bad habit of breaking APIs etc. as they see fit.

There are ways to work around it, but it gets messy quickly. And rather than clean up their act they start championing things like Flatpak, that is basically a throwback to the DOS days of everything living in their own folder tree with a bit of souped up chroot thrown on top.

I really expect that if the likes of Flatpak become mainstream in the Linux world, a flaw found in some lib will produce a stampede of updates, because every damn project crammed in its own copy to make sure it was present.


"Almost all Linux distributions rebuild upstream software from source"

On Gentoo et al, the end user does the building. OK, yes the ebuilds are recipes but I've lost count of the times I've used epatch_user (https://wiki.gentoo.org/wiki//etc/portage/patches). You have a near infinite choice of ways to destroy your system, what with USE flags, mixed stable/unstable and all the other crazy stuff. Despite that, my Gentoo systems have been surprisingly stable.

In winter an update world session on a laptop keeps you (very) warm 8)

(wrt "Context": Ta for your work)


In order to make sure you actually have the freedom to modify the software, you want packagers to have the ability to rebuild the package from source (preferably using just their OS as a build environment, not a black-box build environment from upstream like a Docker image configured just so) and to confirm they end up with an equivalent binary. Even if there's nothing to patch today, there might be something to patch tomorrow.

I have seen many times Debian packagers try to build something from upstream and find that it just does not build anywhere other than the maintainer's computer. The fact that Debian requires that every package it ships can be built by anyone in a generic environment is immensely valuable to free software, even if nobody used the binaries that Debian built. (And to be clear, other distros do the same; I'm just most familiar with Debian.)

I'd agree that in an ideal world, all the patches would be upstream and the binary would not just be equivalent but bit-for-bit reproducible. Some practical reasons why it wouldn't be are that various dependencies are of slightly different versions (e.g., one distro manages to get a new libc uploaded a little bit before another), that downstream conventions are different in different distros (e.g., Red Hat-style distros use /lib64 and Debian-style distros use /lib/x86_64-linux-gnu), or that a dependency has some shortcoming which many but not all distros patch in the same way, and the patch impacts things that use the dependency (e.g., upstream OpenSSL <1.1 does not have symbol versions, but most Linux distros patch them in). Yes, in an ideal world, all these things would be resolved, but there are going to be so many tiny things like this that come up that having infrastructure to accommodate them is the right plan.


Especially in the Linux world, you can't expect upstream to supply binaries for all possible architectures and configuration options. For example, you might be running on armv5 with libressl. Or you may be running on sparc64 with openssl. Or you may be running on 32bit windows with WinSSL. An upstream is unlikely to have access to build all possible configurations and provide binaries every time a security patch is announced.

Also, as a distro provider, you will want to be sure you can build the application yourself, because you might want to ship an updated library dependency that is ABI-incompatible and so you must be able to rebuild the consumers of these libraries. For example curl, in the case of openssl.


Have you ever been an upstream? What you're suggesting is awfully wasteful. You'd get endless requests for "please add a build for XXX" if you ever distribute any binary, where XXX is anything from arbitrary versions of arbitrary distributions to any kind of non-Linux OS. And all that for i686, x86_64, various ARMs... Who has time for that? If you state you only distribute source code and perhaps a binary for Windows, you will not get bothered ever again.

Better to let the building be done by people who do builds all day en masse for a single arch, or who have infrastructure set up to do builds for multiple archs easily, than to expect every upstream to have this setup.


I can think of a few reasons

(1) Use of a different libc (alpine linux with musl)

(2) most builds are not reproducible, rebuilds are needed for security reasons

(3) non-rolling release distros (-> most distros) fork the upstream projects to backport fixes for their older releases

(4) Different filesystem structure (Gobo Linux)

(6) Most distros want to use the build system associated with their own package manager

There are probably many more than that. Most distros don't even try to stay close the upstream repo and instead maintain a lot of patches.

In the future we will most likely have a distro-specific basic system build and container apps (snap, appimage, flatpak) built directly by upstream on non-server systems.


> In the future we will most likely have a distro-specific basic system build and container apps (snap, appimage, flatpak) built directly by upstream on non-server systems.

I sure hope not. A better system would be something like Nix/Guix or even GoboLinux, which give us a single package as today, but with the option of installing multiple versions in parallel if upstream has screwed up the API (again).

Flatpak and the like will just be an excuse for upstream to bundle everything and the kitchen sink, resulting in bloat and having to update a mass of paks rather than individual libs when a flaw is found.


You've got good points and I admit I'm split on this issue.

In theory I would rather have everything handled by the package manager, but container apps provide a few useful advantages:

  - sandboxing/isolation out of the box
  - being able to report bugs directly to upstream
  - It's easier to distribute a small app to all distributions (for some value of all)*
  - much easier to distribute proprietary applications (subjective advantage)
  - much easier to install old versions of an application (sometimes needed)
  - it's possible to record the bandwidth/ram/cpu usage; with a standard package that's quite difficult


7) CPU architecture,

8) output binary format (though these days people generally only use ELF or PE),

9) compile time options (eg some packages will allow you to choose which TLS library to use at compile time)

10) hardware specific optimisations

Basically just a plethora of reasons


> Maybe I'm missing something obvious?

There are many different package managers depending on which Linux you're running.

Debian and derivatives use apt, Red Hat uses yum or dnf, SuSE uses yast, Gentoo has emerge, Arch has pacman... So each package manager needs to build their own package and it's easier to recompile from source than slice and dice a binary.

Also, distributions will install the binaries, shared libraries, man pages, etc. to different locations (some to /usr, some to /usr/local, etc.), which is also easier to define at configuration time, since most autoconf/make files support this already.

Finally, they might want to add patches for distribution-specific features or quirks. Or maybe they compile with uClibc instead of glibc.

There are many valid reasons why distributions would and should take the upstream source and build/package it themselves.


Really, why would everyone not rebuild it?

People have pointed out tons of reasons why distros do their own builds, but really I don't understand what would possibly be a reason not to!?

The build system itself also is just a piece of software that gets distributed along with the source of the software to be built, and just as you can download and run curl and get a predictable result (the download of some URL), you can download and run the curl build system and get a predictable result (a cURL binary).

So, in that regard, what does it matter what execution of the cURL build system your cURL binary came from?

In particular with the trend towards reproducible builds, where the build result will be bit-identical between different runs of the build system if it's using the same compiler (version), I just don't see why you care!?

Yeah, distros shouldn't just modify the software they package willy-nilly, but that's completely orthogonal to whether they should do their own builds. There are many reasons for applying small changes to enable integration with the distro, and in particular it's just unreasonable to expect that every one of the thousands of upstream authors of Debian packages, say, operates machines of all ten CPU architectures that are currently supported by Debian so they can provide binaries for all architectures. So, if you want to be able to distribute software written by people who don't happen to have a System z or a MIPS machine in Debian, maintainers have to prepare packages in such a way that binaries for those architectures can be built, and if they do that, it's trivial to also build all the other architectures from the same source. Adding special cases for when a binary is already available for some architecture from upstream would be just completely pointless complexity.


Apart from various technical reasons, the workflow of essentially all Linux package managers is built on the notion that the binary package is built by some well-defined (and repeatable, at least in the "does the same thing" sense) process from some kind of source package. Also, the source package format usually contains a mechanism that allows distributions to comply with copyleft licenses while also cleanly documenting which modifications were made to the package relative to the upstream version (it is a sort of special-purpose versioning system).


Rebuilding is necessary for things as trivial as changing the default install path. It's absolutely standard; a Linux distro that doesn't rebuild packages somewhere would almost not count as a distro.


Hmm. All the various Ubuntu derivatives (e.g., Hannah Montana Edition or Christian Edition, or even Kubuntu and MATE editions) that change some defaults without rebuilding packages... don't count as distributions?

This is somewhat tongue-in-cheek, but it's actually a question I don't have a solid yes or no answer to. I can see it both ways.


If your users are pulling software direct from the upstream distribution then you're not a separate distribution of Linux, you're a 'spin', 'edition', or an installer.

This does get a little murky when some of the packages are distributed directly but others are pulled from upstream like Antergos but the changes are so minor that I would still consider it an Arch spin.


For Windows, at least, the big reason to build the binaries would be to sign them with the Microsoft certificate, so that you know that the binary is authentic.



