I have had this experience many times when I tried to use a library where every function was tailored for a very specific concrete situation. The perfect example here is finding an extremely complicated mass of code to read a cert from a file, but no way to read a cert from any other source, not even from a buffer in memory. So if you read it from some other location, or read it from a file yourself, or constructed it yourself in memory as part of a test, you're out of luck. People who write code like that tend to be pretty incredulous and insulting when you explain what you're trying to do. "I mean, if you want to read a cert that isn't in a file, obviously I need to educate you until you stop wanting to do such a stupid thing. Not that it's stupid in itself, but you are probably solving your problem in such a dumb way that you should stop right now and start over when you understand what you're doing. If you're super-polite I might even help you." That kind of thing.
This isn't the first time I've read that there's no way to read a certificate from memory in OpenSSL, but it's not true. Here's how to do it: http://gist.github.com/574388
I agree with the opinion, though, because it took me a lot of time to figure this out.
There's a school of API design that believes you can guide users to reflect on what they are doing by deliberately excluding certain things, without explaining why they are absent.
This springs from a prescriptive viewpoint that presumes the reader will inevitably arrive at the same conclusions as the author about the right way to do things.
Since no one is always right, and APIs are too often designed without considering what actual application usage would be like, this can be extremely frustrating.
The best APIs, the ones that somehow hit the sweet spot of minimal, efficient, and obvious, are always extremely impressive.
Perfect recent example. Plex v9 doesn't allow you to delete a file after you watch it. v8 had the feature and LOTS of us used it. The forums are abuzz with people complaining, and the devs are just now coming around to the idea that they might implement it (though when is anyone's guess). In the meantime, a lot of us have gone back to v8.
The good old "I really know better than you. I even know better than you what it is that you want/need to do. Do it the way I tell you to, because that's how I would do it, and that matters more than what you want/need to do."-personality so incredibly prevalent in a certain sphere of programmers we are all very familiar with, who gladly spend 5 minutes pushing you down instead of spending 20 seconds to just kindly, sociably and helpfully answer your question.
The problem is that in many cases, there is not an answer that easy. And instead of admitting that the current design is too crippled to allow such a thing or makes such a thing pretty horrible and difficult, they try to defend the design by declaring that thing as something you don't want to do anyway normally.
I have to wonder - if the API was so bad, but he didn't have to do anything that isn't available from openssl "the tool"... why did he write the code at all? Excerpt from openssl x509 help:
His requirement was writing an application that provided certificate authority services. You could write a crapload of messy wrapper functions that call external programs, or link in a library and avoid adding glue to your app.
The annoying answer is that it's "cleaner"; the realistic answer is that you don't have to account for as many potential problems as you might encounter while executing applications on any given system. The sysadmin in me would use the tool but the programmer in me would use the API. (Oh, and depending on how heavily this application would get used, the API might be significantly faster and reduce load on the server...)
If you're generating new CAs in such volume that the (in)efficiency of a UNIX pipe becomes a problem, you should just stop right there. That should be a red flag.
It's not the pipes, it's fork() and exec() and re-main()-initialization. Try doing that a few ten thousand times and see how long it takes compared to doing everything in-process, where you only initialize once.
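To put a rough number on the spawn cost, here is a minimal sketch that does the same small job two ways: in-process, and by starting a fresh child process for every call. It is not an OpenSSL benchmark. The SHA-256 digest is only a stand-in workload, and the names digest_in_process, digest_via_child and time_calls are made up for this illustration.

```python
import hashlib
import subprocess
import sys
import time

# Stand-in for "the external tool": a child interpreter that hashes stdin.
CHILD_CODE = (
    "import hashlib, sys; "
    "sys.stdout.write(hashlib.sha256(sys.stdin.buffer.read()).hexdigest())"
)


def digest_in_process(data: bytes) -> str:
    """Do the work with a library call; no new process is created."""
    return hashlib.sha256(data).hexdigest()


def digest_via_child(data: bytes) -> str:
    """Do the same work by spawning a child process for every call."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        input=data,
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("ascii")


def time_calls(func, data: bytes, repeats: int) -> float:
    """Return the wall-clock seconds taken by `repeats` calls of func(data)."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    payload = b"certificate signing request"
    repeats = 20  # kept small; spawning is the slow part
    print("in-process:", time_calls(digest_in_process, payload, repeats), "s")
    print("via child: ", time_calls(digest_via_child, payload, repeats), "s")
```

The absolute numbers depend entirely on the machine and on how heavy the child's startup is, so treat the output as an indication of the per-call overhead rather than a measurement of anything OpenSSL-specific.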
While I'm not a fan of premature optimization, I don't see anything wrong here, even if I like the Unix philosophy of "Write programs that do one thing and do it well". Why run an external program when the same thing should be possible by calling functions from a (well written) library?
While I generally agree that the OpenSSL library is somewhat horrendous, this specific use-case addresses a very tiny subset of what the library does internally to do some real magic.
I used to work for a very large public-facing certificate authority and there was a period in which one of the public CA systems used OpenSSL, so we had to be able to call several of the signing functions via a JNI extension. Why? Because OpenSSL has fairly remarkable support for internal context objects that allow you to designate PKCS#11-interfaced hardware security modules for private key operations.
Yes, the levels of indirection are painful to read through, and when you get down to the levels where you can actually do a signing operation you have multi-level pointers (something) of strange types, but it works almost magically when you need it to.
It's also combined with the intersection of handling/reading certificates. The IO interface has to correctly handle PEM and DER information. Even if you can parse those structures, all you're left with is ASN.1 annotated structures, and ASN.1 is an exercise in pain.
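For anyone who hasn't looked at the two formats: PEM is the base64 text of the DER bytes between BEGIN/END marker lines, and DER itself is a nested tag-length-value encoding of ASN.1. A minimal sketch of that relationship, assuming a certificate-style PEM label and only single-byte tags (der_to_pem, pem_to_der and read_der_header are illustrative names, not OpenSSL functions):

```python
import base64
from typing import Tuple

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def der_to_pem(der: bytes) -> str:
    """Wrap DER bytes as PEM: base64 in 64-column lines between markers."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    return "\n".join([PEM_HEADER, *lines, PEM_FOOTER]) + "\n"


def pem_to_der(pem: str) -> bytes:
    """Strip the marker lines and decode the base64 body back to DER."""
    body = pem.strip()
    if not (body.startswith(PEM_HEADER) and body.endswith(PEM_FOOTER)):
        raise ValueError("not a PEM certificate block")
    b64 = body[len(PEM_HEADER):-len(PEM_FOOTER)]
    return base64.b64decode("".join(b64.split()))


def read_der_header(der: bytes) -> Tuple[int, int, int]:
    """Return (tag, content_length, header_length) of the outermost element.

    Assumption: single-byte tags only (tag numbers below 31).
    """
    if len(der) < 2:
        raise ValueError("truncated DER")
    tag, first = der[0], der[1]
    if first < 0x80:
        # Short form: the length fits in the low 7 bits of this byte.
        return tag, first, 2
    count = first & 0x7F  # long form: number of length bytes that follow
    if count == 0 or len(der) < 2 + count:
        raise ValueError("indefinite or truncated length")
    return tag, int.from_bytes(der[2:2 + count], "big"), 2 + count


if __name__ == "__main__":
    sample = bytes.fromhex("3003020105")  # SEQUENCE { INTEGER 5 }
    pem = der_to_pem(sample)
    print(pem)
    print(read_der_header(pem_to_der(pem)))
```

This only peels off the outermost layer. A real certificate nests many such elements, and interpreting them (OIDs, time formats, string types, extensions) is where most of the pain mentioned above comes from.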
So a library that can read a broad format of single-bit sensitive data, correctly decode and preserve the structure of ASN.1 entries, perform crypto operations, and then somehow make sense of it all across however many years is most likely going to be complicated as hell. Or at the very least, given my proficiency with C, I think I would have a hard time coming up with a library that worked on as many platforms as transparently for as many years.
I don't understand why he persisted in using OpenSSL. There may not have been a lot of alternatives ten years ago, but we developers are practically spoiled with choices now. I heartily recommend Botan, which is BSD licensed, written in C++, and has a clean API. I loved it so much I donated to the developer.
You're only seeing the very first steps, but not noticing where the intent is.
Building an entirely new replacement implementation means fighting the battle of adoption. By initially wrapping an abstraction around the existing code, you can then go back and replace the original while keeping everything that relies on it working. If you see Marco's initial work as a "shim" to enable replacement, it makes a lot more sense.
After creating roughly twelve hundred lines of C and a hundred or so lines of command language to get OpenSSL working for network connections and to get a CA and signed certs a few weeks back, I can well appreciate the author's frustration with the OpenSSL library.
And at the same time this was being developed, one of the associated OS platforms went through an incompatible-API OpenSSL upgrade; you got to find and rebuild everything that was built against it, or you saw, um, cryptic failures.
The only documentation I found useful at all was the RFC. It seems that if you want to use OpenSSL, you need to read both the RFC and the source. It's also preferable to have examples of its usage in other OSS code.
"In 2004, the head flying height was equivalent to a
Boeing 747 airliner flying at 0.05 cm above the ground
and travelling at 92 Km/h (7200 RPM drive)." That was
in 2004. What would that translate into these days?
The... same? At least, if you're using a 7200 RPM drive, but 10k and 15k drives existed in 2004 as well.
Bit density has increased dramatically over the last six years, but rotational speed has remained constant, thanks to some hard physical limits; and now that SSDs are no longer a howling black hole of suck on a dollar/GB basis, we won't see any more breakthroughs done with mechanical hard drives. Solid state drives are just better.
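The quoted speed follows directly from RPM and track radius, which is why it hasn't moved for 7200 RPM drives. A quick sketch of the arithmetic: the radius below is not given in the quote, it is back-computed from the quoted 92 km/h and 7200 RPM, and the function names are made up for this example.

```python
import math


def linear_speed_kmh(rpm: float, radius_m: float) -> float:
    """Speed of the platter surface passing under the head, in km/h."""
    metres_per_second = 2.0 * math.pi * radius_m * (rpm / 60.0)
    return metres_per_second * 3.6


def implied_radius_m(rpm: float, speed_kmh: float) -> float:
    """Track radius (metres) that gives speed_kmh at the given RPM."""
    return (speed_kmh / 3.6) / (2.0 * math.pi * (rpm / 60.0))


if __name__ == "__main__":
    # Assumption: radius inferred from the quoted figures, not measured.
    radius = implied_radius_m(7200, 92.0)
    for rpm in (7200, 10_000, 15_000):
        speed = linear_speed_kmh(rpm, radius)
        print(f"{rpm} RPM at r = {radius * 100:.2f} cm: {speed:.0f} km/h")
```

The implied radius comes out at roughly 3.4 cm. The 10k and 15k rows assume the same radius, which probably overstates them, since faster drives often use smaller platters.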
Complex code is never good for security. In 2006, a small change was made to the Debian port of OpenSSL to do something like remove a few compiler warning messages. Unfortunately, it had the side effect of making the random number generator predictable. This security flaw wasn't discovered for over a year and in that time many weak keys were generated and used. See here for more info: http://wiki.debian.org/SSLkeys
If there's one thing we don't hear enough in the programming world, it's "all this code I didn't write sucks!"
Oh wait, that's about all we ever hear. It makes for boring reading (and boring comments), so I'm flagging the article. When you write about how you've rewritten OpenSSL to be more flexible and featureful, I will enjoy reading about that. This? It just makes me mad.
(If you want to whine about a particular issue with a particular library, at least try to generalize it to something meaningful. The lesson here is "be careful writing a library that only one app uses and then calling it a library". A lot of C programs make this mistake; sure, they have a small app that calls libapp to do all the real work... but the functions that do the real work are things like read_config_file_and_then_handle_the_applications_foo_command, which is really unhelpful to everyone.)
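A minimal sketch of the alternative, assuming a toy "key = value" config format (the format and the names parse_config, read_config and read_config_file are hypothetical): put the real work in a function that takes data already in memory, and make the file-reading entry point a thin wrapper over it.

```python
import io
from typing import Dict, TextIO


def parse_config(text: str) -> Dict[str, str]:
    """Core primitive: parse 'key = value' lines from text already in memory."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def read_config(stream: TextIO) -> Dict[str, str]:
    """Convenience layer: any readable text stream (file, socket wrapper, StringIO)."""
    return parse_config(stream.read())


def read_config_file(path: str) -> Dict[str, str]:
    """Convenience layer: the one case the application itself happens to need."""
    with open(path, encoding="utf-8") as handle:
        return read_config(handle)


if __name__ == "__main__":
    # A test or another application can skip the filesystem entirely.
    print(parse_config("host = example.org\nport = 443  # TLS\n"))
    print(read_config(io.StringIO("mode = strict\n")))
```

The application still gets its one-call convenience function, but callers with data from a test, a database or the network are not locked out, which is the same complaint made about the certificate-from-file case.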
Anyway... OpenBSD guy whining about code he didn't write? Big surprise. On HN, I want valuable articles that show me something cool or teach me how not to be uncool. This is just a rant that has no real value.
Personally, I found it educational that a vital piece of security infrastructure like OpenSSL is a tangled piece of sh... paghetti. When my coworkers and I write something security-critical in our code (not that I've ever touched anything as important as OpenSSL; I suspect OpenSSL is more critical to our security than any of the code we've written ourselves) we make it as dumb and simple and clear as possible and document it very well. With OpenSSL I would have assumed (did assume) the whole project was implemented that way, bending over backwards for clarity to minimize the chances for a vulnerability, either in the code or due to misuse of the code. It's shocking and interesting to find out that it isn't.
I'm surprised that you're surprised. It's old code that nobody touches. It's written in C. Of course it's going to be bad. (Why? Because "good" changes every year or so, and if nobody has needed to touch the code for a year, it's going to be bad by definition.)
Sure, it's good to clean up old code to minimize the number of unanticipated failures. But this takes time and money; there aren't magic fairies that do things "because they should" or "because it's a critical piece of infrastructure". If it's boring and there is no money to be made, then it's not going to get done.
The author shows why; sometimes, it's just easier to hack around the problems than to fix them.
It's old code that nobody touches. It's written in C. Of course it's going to be bad.
Tcl/Tk is old code written in C. It's beautiful code; an absolute joy to work with. It was beautiful ten years ago, and it'll still be beautiful a hundred years from now when it is long forgotten by all but a handful of historians.
OpenSSL is a miserable codebase, and always has been.
Yes, bitching about other people's code is a popular hobby. Sometimes people bitch because it's always easier to write something new than to understand the logic behind something old. And sometimes people bitch because the code is an absolute nightmare to work with.
The definition of good C code hasn't changed that much. API styles change and compiler capabilities change (gotta love C code with 7-character function names inherited from ancient Fortran code) but undocumented and disorganized is never good. I've used C APIs that were procedural with no OO influence at all, each function taking dozens of parameters, many of which weren't actually used because it was easier to pass the same thirty parameters to each function than to try to remember a unique parameter list for each function consisting of the seventeen parameters it actually needed. I've used APIs like that and groaned and cried and pulled my OO-educated hair out, but some of those APIs were still well-documented, logically arranged, and cleanly implemented, and they were a lot easier to use correctly than code written poorly in a modern style.
It's written in C. Of course it's going to be bad.
You are being unfair. I did a little Samba hacking in recent memory, and found it quite easy to understand and follow. (Granted, I didn't delve into the low-level implementation of Microsoft protocols.) I'm sure other well-written C programs exist.
I've looked at his code. He didn't rewrite it, just wrote the wrappers to the API calls to make his usage cases more convenient for him, which is what every beginner would also do.
But I hope you understand that OpenSSL has on the order of hundreds of thousands of lines of code (don't make me count them now) and the guy wrote something like a hundred lines total, split across a few functions which simply call the OpenSSL API step by step to do the given task. It certainly can't be said that he "rewrote OpenSSL" and it's certainly not something that gave him any right to feel superior -- he's just a user of the API.
I disagree. How are people expected to learn from the mistakes made in OpenSSL if we mustn't be negative?
OpenSSL is, by any sensible code quality metrics, a bloody mess. It is a miracle that this code has not caused more disastrous problems than it has. It is not only legitimate to criticize it: it is essential that we do and that we articulate what mistakes were made so that others may learn from it.
I'm using MatrixSSL (http://www.matrixssl.org) in my product. It's cross platform, small footprint, has dual commercial / GPL license options and a sane API.
Shouldn't there be a debate about the technical merits of the idea? I want my libraries and tools to make unwise things hard to do, and the wise easy and obvious. It's a framing issue, one that I think Marco's right about. But I'd like to hear what OpenSSL has to say about this and other, other, other things.
I'm not going to defend or support OpenSSL, but I find your tone extremely frustrating. For better or worse OpenSSL is an open library for you to use, and you should be thankful for that.
Also, insulting the people working on it is no way to improve the situation.
Have you ever actually talked with any of the OpenBSD developers?
Marco is a fantastic human being and absolutely hilarious. On top of all that, he's also an amazing programmer. Yes, I've met him in person, and we've traded emails and packages for years. In fact, there's half a pallet of donated gear sitting behind me in need of being shipped out to him. --It should go without saying, but he's a friend and I have a strong bias.
Getting frustrated by widely deployed but poorly written software should be expected. Just voicing said frustrations solves nothing and wastes time, but voicing frustrations while providing an alternative is actually beneficial.
It wasn't intended as a comment on anyone's personal worth as a human being, or what they're like in person. I have spent a good amount of time following OpenBSD-related mailing lists, though, and I'd say "polite and collegial" is not the prevailing tone--- hyperbolically trashing other people's work and calling them stupid monkeys is more par for the course. Admittedly, they don't have a monopoly on that; plenty of GNU mailing lists are similar (esp. anything RMS or Ulrich Drepper regularly posts to).
And ponder it a bit, you'll see how it applies to open source projects, and interactions on mailing lists, or as the case may be, a homepage article by an open source developer.
For me at least, the more fascinating question is why open source projects eventually degrade into "sick systems" of interaction? --I wish I had an answer, but the only speculation I have is it's the result of frustration.
"Welcome to OpenBSD" is customarily spoken as "Fuck you moron" or "SMP is for retards and jackasses like you" or "Threads are for idiots and no, we don't care."
I love OpenBSD though. Use it every day. They don't pretend. What you see is what you get.
My favorite is the well-known flame by someone at the MIT AI Lab back in its heyday (early Lisp Machine development time)--I'm pretty sure it was RMS: "I've deleted all your <bleeping> code and also erased all the backups" (paraphrased).
It has been a few years since I had to work with OpenSSL, and I had many of the same reactions to the code.
OpenSSL has always been bad, so it is not likely that it will improve any time soon unless someone who has a talent for API design decides to spend an immense amount of time sanitizing the library. This is a crypto library, so it is code that requires a lot of scrutiny. You can't simply make changes willy-nilly. Undoing the damage is no simple matter of programming.
I think it is important to point out badly designed APIs and make an example of them so people can learn why it is important to care about API design. It doesn't matter if it is open source or not. That is completely beside the point. Lots of open source code gets worked on by people who get paid for it or whose companies benefit from it directly or indirectly, so let's just be grown-ups and not derail the discussion.
Something being open source is not an excuse for doing a poor job. Bad code is bad code and OpenSSL does deserve harsh criticism for being unnecessarily hard to use.
I find the thought that you should not be able to criticize someone for designing bad APIs just because a project is open source offensive.