But the code works. It supports lots of platforms through choice. Yes, you could make the code prettier if you dropped some platforms. Yes, you could refactor it to use some abstracting libraries that now exist. But the code works. If you rewrote the code, the best result you could end up with is the same functionality that still works. All other possible results are bad. There is nothing to be gained.
I made the page as a companion piece to a blog post[1]. The code works, but it's quite buggy. Also, it is almost certainly broken on outdated OSes. It's just that nobody uses the latest Vim on IRIX or VMS, so no bug reports get filed.
A quick glance shows some obvious errors in RealWaitForChar(). For example: with typical preprocessor defines, it uses gettimeofday() in a select() loop. This will break if the system clock changes, either from user intervention or ntpd. The correct solution is to use a monotonically-increasing timer, such as Linux's clock_gettime() or OS X’s mach_absolute_time().
Vim's UI is powerful, but its codebase is a nightmare. My blog post explains why in more detail.
I too have followed the vim mailing list for many years, and tried and failed to submit a patch to it. I mostly use neovim now, and have successfully submitted a patch to it. Based on these experiences I found your blog post unnecessarily incendiary. The thread of your patch submission (https://groups.google.com/d/msg/vim_dev/-4pqDJfHCsM/LkYNCpZj...) was actually quite civil. It didn't do what you want, but let's be precise with language here.
To state some blatantly obvious facts, people do open source on their own time, they have limited such time and many commitments, and they explicitly allow forks like neovim for the aspects they can't find time for. Vim and neovim have chosen very different design constraints, and there's no reason why they can't continue to exchange code where it makes sense. Putting one of the two down seems unproductive.
The leaders of large projects like Neovim are seldom critical of the competition, and when they express criticism it is measured and proportionate. We should all learn from them.
(Your OP here is still quite useful. Thanks for putting it together.)
I'm sorry if you thought the post was self-serving. I can only say with sincerity that it wasn't. I have no motive beyond my interest in software.
If you don't believe my claim of selflessness, believe my claim of competence: If I was trying to build a personal brand, I could do much better than delving into Vim's codebase for a one-off snippet. It's not difficult to create insubstantial content that is widely-shared. I can think of a dozen topics whose mere mention can drive immense traffic and waste absurd amounts of time. HN's front page attests to this.
It's a shame that many people scrutinize others for selfishness. Purely selfish behavior is an extreme rarity, yet it takes much longer to refute such accusations than to make them. I wish it were otherwise.
Before deriding others, please remember: Almost no one is evil. Almost everything is broken.[1]
>>I don’t consider myself very devoted to my work, but
>Don't try and pretend your problems are our problems. Struggling to survive, scrap by scrap. Fuck off.
If you're ever in the Silicon Valley we should "meet up". I'll be nicer to you.
You trawled through my writings to find one sentence in one blog post[1] to voice outrage at. I'm not sure how to react to that. I guess... congratulations on your hard work.
I live in the bay area, but I prefer not to hang out with people who treat others so callously. I hope you understand.
Edit: It appears roghummal has significantly revised his comment since I finished my reply.
>Edit: It appears roghummal has significantly revised his comment since I finished my reply.
It doesn't 'appear' that way, that's the way it is. My edit was, just guesstimating here, 25 minutes before your post? Edit was 2-5 minutes after my original post?
You've never said anything you quickly realized was indefensible?
So I don't confuse you with an edit and for my own gratification, I'd like you to know that I didn't 'trawl' through your posts to find something that supported mine.
Your titles are bait. Their substance is minimal. When you've had something to say it has been said before, better. If you care about the code as much as you claim you should let it go. (I'm questioning this last statement.)
You're a natural born CEO. Find your Woz.
Edit: And then someone to run your company. (I just saw "Judging from this comment and others, you seem uninterested in civility or truth. I doubt anything can change your mind on this issue.")
You delved[1] into the vim source and found this. You offered no solution except to use neovim, which happens to be 'supported' by the /amazing/ company you're the CEO of. You're very well practiced in false modesty and apologies but fuck you; I'm not sure you know what sincerity means.
I'm sorry if I don't think you're being sincere. Let's have a drink sometime! Let's network!
1. Anything as old as vim, continuously developed, will have any number of these frankenfunctions. They're not hard to find.
Judging from this comment and others, you seem uninterested in civility or truth. I doubt anything can change your mind on this issue.
Still, others read these comments, so I feel compelled to offer corrections: 18 months ago, I pledged $50 to Neovim's Bountysource. My company has not contributed a cent to Neovim. We maintain plugins for many text editors, including Vim and Neovim. That's it.
You keep slinging mud, but there is no ulterior motive at work here. I urge you to treat people more charitably in the future.
Civility works best with mutual respect. That respect might be hollow or fleeting but it's established and traded on. When two people meet for the first time there's a credit exchanged. That's the humanity-freebie.
Don't ask for sympathy ("They don't want civility or truth!!!"), ask yourself why you were treated "uncivilized".
I'm really sad neovim didn't call itself vi improved improved, or vii. That wouldn't just carry on vim's naming tradition; vii is also one more than vi.
"Vim is, without question, the worst C codebase I have seen."
From the example you give, I agree it must be bad. But if you want to see something worse, check out PHP. No, not the billions of programs written in the PHP language (which are, indeed, almost all terrible), but the C source to the PHP interpreter.
Obviously PHP was invented by someone who just doesn't care about creating a decent programming language. Everyone knows that; it has been pointed out so many times that it's a cliche. But if you look at the source to the implementation, you will find that it was developed by someone who is so incompetent that whether they care or not is beside the point. Dealing with that codebase (and thinking about how successful PHP has become) made me question my will to live.
The best-engineered technology is rarely the most popular. I'd be interested to see a list of well-written code bases that are also popular.
The interesting thing about PHP (the program) is that, in my experience, the code doesn't suck because it was hastily written or because it was written a long time ago. It sucks because the people who write it and work on it have bizarre, nonsensical philosophies about writing code. I've seen some talks by the PHP maintainers (as recently as 2013) that made me want to throw something at my monitor.
llvm and redis have good reputations. sqlite, I think. Lua. llvm in particular seems to owe its popularity to good structure, though it does seem to be an exception that highlights the common case.
I can't find the one I was thinking of, and I don't want to comb through literally hours of video to figure it out. This should illustrate my point, though:
I use Neovim as my daily editor. It definitely has its quirks compared to Vim (sometimes it takes over 10 seconds to write to certain files, locking up Neovim, and there are some bugs with the fuzzy finder I use that cause file corruption and aren't present in Vim), but on the whole it's pretty stable. I haven't encountered any issues yet that are more than a minor annoyance.
It's a third-party plugin, but it's definitely a Neovim issue because the file corruption doesn't happen with Vim. No idea why it happens. I'm fine living with it because I just check the file back out if it gets corrupted. It's only an issue when opening a new file, so it doesn't cause work to be lost.
I've been trying it for roughly the last month. I have run into a few quirks; for example, it uses non-blocking I/O (via libuv) for stdin, which can interfere with parent processes that expect blocking I/O. If you file a bug report, the issue usually gets fixed quickly.
I've kept vim installed as a fallback so I can easily use vim if I run into anything that is a show-stopper, at least until it's fixed in nvim. I still plan on submitting patches to both for the foreseeable future though.
I use it with true color support enabled, and a terminal that supports true color, and a patched version of tmux that supports true color, and finally all color schemes render as they do in gvim!
yeah, there is a good chance you'll not run into problems. however new, serious bug reports come in frequently. for example a lot of people have been seeing segfaults, and nobody seems to know where to look.
Although Neovim's codebase is significantly cleaner than vim's, it's still a far cry from what I would call good.
Been using it daily for several months now, haven't had much trouble with it. Worked seamlessly with over half a decade of accumulated vim plugins and config modifications.
That if statement's conditions span 17 lines and 4 different #ifdefs. All to call gettimeofday(). Amusingly, even the body of that statement has a bug: times returned by gettimeofday() are not guaranteed to increase. User intervention or ntpd can cause the system clock to go back in time. The correct solution is to use a monotonically increasing time function, such as Linux's clock_gettime() or OS X's mach_absolute_time().
ntpd guarantees (under the default settings now, though that's new... you can ask it to shoot you in the head, but I don't recommend that you use firearms in this way) that the clock will never go backwards. You're thinking of ntpdate, a utility provided with ntpd, but which is only ever supposed to run at system initialization time to get the time close enough for ntpd to take over.
If ntpd finds that the clock is ahead, it will slow the ticks, allowing it to gradually drift back to the correct time.
The system time should NEVER move backwards. Ever. Relying on the system time not moving backwards is not unreasonable for trivialities like how long you wait for a keystroke. If you reset your system time, you might have to hit a key to get vim to wake up. Shucks.
ntpd will step backward if error stays above a configurable threshold for a configurable period of time. We saw it happen after the recent leap second. 1 second exceeds the default step threshold of 128 ms. With Linux limiting the slew rate to 500 ppm, gradual correction would have taken 2000 seconds, and by default ntpd steps after 900 seconds.
> ntpd guarantees (under the default settings now [...]) that the clock will never go backwards.
Can you elaborate on what changed? I just downloaded the latest reference implementation, and the man page still seems to indicate that the default behavior is to step when error > 128ms for a prolonged period.
There is no such guarantee when using CLOCK_REALTIME. If you need this guarantee, use the monotonic clock. I've had issues with graphics engines written naively assuming nobody would change CLOCK_REALTIME. It's a bad assumption because many embedded systems will not use ntp and will instead rely on a user to set the clock. It may go backwards if the user sets it backwards. It may even do so while your software is doing something like presenting the user with a UI to set the time.
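For anyone who hasn't used it, here's a minimal sketch of the monotonic-clock approach using plain POSIX clock_gettime() (illustrative only, not vim's actual code):

#include <stdio.h>
#include <time.h>

/* Elapsed milliseconds since `start`, measured on CLOCK_MONOTONIC,
   which never jumps backwards when ntpd or a user resets the clock. */
static long elapsed_ms(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000
         + (now.tv_nsec - start->tv_nsec) / 1000000;
}

int main(void)
{
    struct timespec start;

    clock_gettime(CLOCK_MONOTONIC, &start);
    while (elapsed_ms(&start) < 2000)
        ;   /* a real program would poll()/select() with the remaining time */
    printf("waited %ld ms\n", elapsed_ms(&start));
    return 0;
}

On OS X, mach_absolute_time() fills the same role where clock_gettime() isn't available.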
A wrapper function could hide the extra #ifdef, taking it out of OP's count. A macro defined during compilation would be even better, or just harder to understand.
I don't know whether it's technically a bug or not, but vim is often a bit slow. For example, 'o' to open a new line in insert mode sometimes takes up to two or three seconds to work. Not a huge problem, but enough to make my mental flow stumble. I have a small vimrc that's mostly tab-related stuff, so I don't think it's that.
vim crashes or hangs for me about once every month. I've never bothered to find out why since I don't think it's worth the effort. The effort would be lower if the code was cleaner and didn't have so much bloat related to platforms that don't exist. It would definitely be easier to submit a patch if I didn't need to cater to all the platforms I've never seen.
If you run "stty -ixon", it will prevent Ctrl-S from pausing the terminal. I understand that feature in the days of slow connections, but it feels rather silly at this point.
That is fascinating to me, in a train-wreck sort of way.
We had a discussion a few days ago about the ways in which some interfaces (command line in particular) can be user-hostile. (https://news.ycombinator.com/item?id=9831429) Vim's Ctrl-S appears to be a function, keyboard-adjacent to several commonly-used functions, whose main effect for many users is "cause the program to fail immediately with no indication of how to fix it." I don't think I could make up a better example of user-hostile design if I tried.
Ah, my mistake. Same criticism applies, though; possibly more so, as a terminal emulator running within a GUI would find it even easier to display a useful status message or similar.
I've used vim pretty much every work day in the last 15 years or so, and I can count the number of times it's crashed on one hand. If you can actually verify that it crashed (i.e. core dump file), then you've got the source available, and you can send in the patch.
People have tried to add async plugin support to vim, but the patches have all been rejected. Meanwhile, the plugin support in Neovim is worlds better.
Microsoft Word doesn't even accept patches, which I do indeed hold against it.
The patches were rejected because they didn't provide a working solution for cross platform support. The plugin system in Neovim is only "better" for people who actually want async plugins. I am not one of them.
When I press ESC, it takes a while to register, and for the UI (e.g. the little "command in progress" buffer bottom right) to reflect this. CTRL+C doesn't suffer this problem, and is almost functionally interchangeable, so I've retrained myself to use that, now.
This is a bug, if you ask me. Or is that impossible for vim to fix?
In vim, set separate timeout and ttimeout values to get around this. Specifically, set the former high for regular commands and the latter low to avoid issues like slow Esc processing.
vim works fine on OpenVMS, as it's in use in some other OpenVMS windows right now. (But thanks for the reminder to upgrade to the most current — upgrades underway.)
I'm not particularly inclined to port neovim to OpenVMS right now, as there are a few other projects in the queue ahead of that.
Many companies (FactSet for example) still use VMS, and vim on those systems... I think folks would be surprised how many legacy systems are still out there.
How likely are these companies to upgrade to the latest vim, though? It is not like these companies running VMS are following the latest version; if vim dropped crufty old OSes, the code would be simpler and the old OSes would not notice.
There are a lot of companies with VMS environments that _must_ be regularly updated/patched/maintained with various security bits for SOX compliance or the like. There certainly are places with static installs that are carefully guarded, but that doesn't mean there isn't a poor admin/dev/user who appreciates an updated or patched version. Heck, he might be the guy doing the porting in his spare time to make his day job better. Often if you're seeing regular updates to continue support for an obscure system, it's a good bet someone using that environment is the key contributor for their own sanity.
This does raise the question, though: when is it appropriate to drop old platform support for any OSS project, and how do you even measure it? User surveys seem insufficient.
When I change jobs, I check which version of my editor the dev environments have installed. If it's too old, I build a local copy of the latest version of my editor. No worries asking IT or Ops to update a package for me.
When it comes to production servers, I'm more likely to build my editor in my home directory than use my sudo privileges to install my editor in the system.
So developers can be building newest versions of code on old systems. I do.
Lots of people know and contribute to vim. If they try to make life easier for you, they will probably make life worse for all those who are already familiar with vim code. It is similar to the situation of the Linux kernel, where some people would prefer to use C++, while everyone who already works on Linux also knows and prefers straight C. Given the vitality of vim, I think you should just learn it and stop trying to change what already works.
Right. So you recommend neovim, which is good, except that it won't install -- not on some unused OS, but on OS X 10.6.8 (Snow Leopard). Now I know that's a little behind, but really? Beautiful code that won't work on a 2010 OS is just awesome.
As an aside, I think it's a growing problem that the formulae in Homebrew are not updated to support 10.6.8.
And Vim? It just works. Maybe it's all those #ifdefs that make the code look ugly...
If you're going to use 'this is a companion piece to...' as a defense, you should link to the original article in the 'companion piece' before you have to use it as a defense.
"The idea that new code is better than old is patently absurd. Old code has been used. It has been tested. Lots of bugs have been found, and they've been fixed. There's nothing wrong with it. It doesn't acquire bugs just by sitting around on your hard drive."
EDIT: Joel isn't actually recommending "do nothing" - on the contrary - the article goes into some depth on ways one can improve software without throwing everything away and starting from scratch.
If I remember correctly, Joel wasn't opposed to refactoring old code. That article seemed to be mainly against wholesale rewrites, which I agree with in most cases. He is now paying the price for taking this rule to the extreme. I think sometimes rewrites are unavoidable, as the tech on which the old code is based dies while the current code needs to continue to be extended. In the case of Joel's company, they slowly ended up having to write an entire compiler and language to keep their old code running.
Relevant passage:
"First, there are architectural problems. The code is not factored correctly. The networking code is popping up its own dialog boxes from the middle of nowhere; this should have been handled in the UI code. These problems can be solved, one at a time, by carefully moving code, refactoring, changing interfaces. They can be done by one programmer working carefully and checking in his changes all at once, so that nobody else is disrupted. Even fairly major architectural changes can be done without throwing away the code. On the Juno project we spent several months rearchitecting at one point: just moving things around, cleaning them up, creating base classes that made sense, and creating sharp interfaces between the modules. But we did it carefully, with our existing code base, and we didn't introduce new bugs or throw away working code."
It doesn't acquire bugs just by sitting around on your hard drive.
Except when sometimes it does. That is, external factors reveal old bugs or introduce new ones. Like when you use it after installing a new package/driver/OS and it blows up. Or after years you give it to a new crazy user who somehow succeeds in hitting 20 keys at a time and your code doesn't handle that. I'm not saying occasions like those require a complete rewrite, just that code having been fine for 10 years doesn't mean it is bug-free and needs no more work ever. (Though in cases like this, switching to a proven lib for handling key input, if such a thing even exists, might be a solution - which then introduces new problems like more dependencies, and bugs in that lib making your own software not work properly, and so on and so on.)
I remember I once went to great trouble to use a succession of old computers to copy my favorite DOS game off of a 5 1/4" floppy into a 3 1/2" floppy, and then from there onto a network, and then onto my laptop, so I could take it to school with me. And in the end, I learned that the game ran so fast on my laptop that your character would move at light speed and then die as soon as you pressed a button, because my laptop was too fast.
There was nothing wrong with the program, it just didn't expect to still be in use 17 years later.
Bitrot means degradation of physical media containing digital data, not really anything to do with bugs appearing in code that is introduced into a new environment.
> The Jargon File, a compendium of hacker's lore, defines "bit rot" as a jocular explanation for the degradation of a software program over time even if "nothing has changed"; the idea being this is almost as if the bits that make up the program were subject to radioactive decay.
"And I've always been totally willing to hack things apart if I find a different way that fits better or a different partitioning. I've never been a lover of existing code. Code by itself almost rots and it's gotta be rewritten. Even when nothing has changed, for some reason it rots."
And your old code doesn't have a test suite you can run on the new code?
Well there's your problem. Testless code doesn't acquire bugs by just sitting around on your hard drive, but it doesn't lose any bugs that way, either, and without a test suite you can't afford to do anything but leave it sitting around.
I find it odd how many people overlook this fact. I think people reading tech blogs and sites sometimes live in a bubble.
I don't know if in most, but in many companies there are no automated tests at all, and very few "best practices" such as automated builds/continuous integration and even decent source control.
I don't want to name names, but I'll say this: I know for a fact at least one IT/support department in the local branch of a HUGE multinational energy company (guess a few names and you'll get it) doesn't do automated tests of any kind. They are similarly clueless about most things tech-minded folk would consider best practices of the last decade. This department doesn't work on core software, but instead on inventory/procurement systems, but still...
One of my favorite quotes when QA ‘let’ an obvious bug into production was:
“We stopped testing his code five years ago.”
You can take this in many ways, but in the end getting the right amount of testing is a really hard cost/benefit problem and there is no one-size-fits-all solution.
PS: From the same dev, "I don't trust QA." and "TDD/Code contracts/etc. sounds nice, but knowing what the output should be is really the hard part and tests don't help you with that."
Something like 60% of enterprise IT projects are considered failures, too.
If you don't write tests, then you're throwing away your investment. It's that simple. Telling me that the code works today tells me nothing about how much value the code will add to the business in the next year.
Sooner or later, you won't be able to predict whether a 'small' task will take 3 weeks or 3 years. Even if you don't change the software, some integer overflow might abruptly halt everything. ("It's the primary key for everything? And we don't have tests? Oh...")
Anyone can write code, but it's usually high-risk code with lots of hidden costs. The value that career software developers bring to the table is the ability to manage the software development process so that it's more-or-less sustainable, i.e. keeping costs visible and managing risk so that the resulting software retains its value over time.
If your publicly-traded company depends heavily on software that isn't properly tested, then ethically this should be listed as a risk factor in your 10-K forms. Sooner or later, shareholders are going to figure this out and start holding these enterprises' management accountable.
> ... should be listed as a risk factor in your 10-K forms. Sooner or later, shareholders are going to figure this out and start holding these enterprises' management accountable.
And on that day publicly traded companies will start to implement comprehensive test suites. Until then they are just extra costs as far as (most) management is concerned and management loves to 'trim the fat' and eliminate costs.
You are correct, of course - but back when this article was written, unit testing wasn't as widely adopted as it is today. Joel does recommend solutions for improving code, and I'll bet if he wrote that article today he would include unit testing as part of the solution.
I'm assuming that if it doesn't have regression tests, the old code is untouchable / legacy code, because you can't add features to it without risking breaking something. If it does have regression tests, then you can rewrite it and only switch to the rewrite once it passes all tests. Joel's argument goes away, because the regression tests document exactly what obscure edge cases you wouldn't have thought of.
He is talking about commercial software where the only thing that counts is money in and money out. That a free software developer should never clone and rewrite another developer's software is false. Lots of successful projects are rewrites or clones of existing software, like QMail over Sendmail and vim itself over vi.
Absolutely. As a vim user, one of my favorite things about it is how it is on virtually every platform -- it's similar to what the article says about Netscape. Is the code pretty? Probably not. But I would rather have my text editor be compatible with any OS I may need to use it on than it have a readable codebase.
Just because it's old, and you've fixed some bugs, doesn't mean that more bugs won't be discovered in the future. Perhaps the new code is in a language or paradigm that is less prone to certain classes of bug.
The "#ifdef maze" in OpenSSL was criticized by the OpenSSH/OpenBSD guys as a key part of the picture that allowed something like Heartbleed to happen: http://blather.michaelwlucas.com/archives/2071
Tons of ifdefs make it much more difficult to ensure that all permutations of flags are correct, or can even compile. I think I remember reading that OpenSSL wasn't even capable of being compiled with the standard, system malloc(). No one compiled it that way, so that build configuration broke without anyone knowing it.
Every time I deal with ifdef mazes I regret it. MonoGame is a familiar & nasty example, since it tries to be a portable reimplementation of a game framework (with ~a dozen platform targets/backends) but instead ends up being catastrophically buggy - as an end user or a maintainer you can't tell which functions are even implemented or syntactically valid.
I started using a SDL2-only fork of the same library (just one backend! near-0 ifdef count!) and suddenly all those nasty problems went away.
...is exactly the type of thinking that results in countless hours of wasted developer time wading through unmaintainable crap, which is never touched because "the code works" and everyone's too damn afraid to change it, because it's so excessively complex, because no one bothered to go back and refactor it for readability, maintainability, modern libraries, or anything else.
"There is nothing to be gained." Not completely true, you could argue that it would be more maintainable in the future but I don't think that outweigh the risk of making changes here while it's still fully functional and nothing really needs it to change.
Yes, maintainability / extensibility is good. But I'd be wary of refactoring code purely for the sake of potential maintainability / extensibility, as it's work for no gain. You're trying to predict the future for the code, and that's a losing game.
If someone has a reason to extend the code, then you have the choice of hacking in the new feature, or refactoring the code to make it nicer. Each case really needs to be considered on its merits. But IMO it's almost always better to prefer to keep the code as-is rather than making big rewrites. Tiny refactoring is the way to go.
I'm OK with that stance, I try to advocate the same myself but you have to be careful of the boiling frog issue of making small incremental refactoring that lead to dead ends.
Stepping back and thinking about the long term is valuable. That doesn't mean you have to do the work now for that potential future, but thinking about the potential drawbacks of the current design and its limitations may guide the decision as to when it's appropriate to do a major refactoring.
Also this particular example seems like something that will probably rarely need modifications. I'd be interested in seeing how often this code changes. My guess is the effort to refactor this will probably be equivalent cost to the future changes.
Your definition of "works" doesn't include "ability to easily absorb change as the world modernizes". The world of computing is changing each year and any software that works well needs to adapt (or else be trivially small like many unix utils so the world adapts around them or abandons them).
I love vim and have looked at contributing a number of times but the sheer amount of cruft creates a huge barrier.
How do you know it works? Did you test it on every platform?
What you would have to gain is maintainability. Maybe it's not worth it; maybe it is. But saying that there is nothing to be gained is just an opinion.
I bet that it contains fewer bugs than a rewrite of the code will have!
That code represents years of tweaks, fixes and obscure workarounds. There are countless problems that you will re-introduce with a code rewrite, because the subtleties in aged, thoroughly-used code are not immediately obvious.
Yeah, on a re-read, my comment is too sure of itself. Never say never! But while there are always edge cases, IMO code rewriting is something that is almost never worth it. Especially a maintainability rewrite just for the sake of maintainability.
I'm so sick of hearing this tripe. Rewriting code is almost always a good idea because after you re-write it, you have some snowball's chance in hell of understanding it.
Oh, but Jeff Atwood said . . . Whatever. And then Jeff Atwood said something very different.
Blah, blah, blah.
If the engineer that wrote the code isn't at the company anymore, and no one really gets it now, rewrite it. At least you will have some chance of understanding the new bugs instead of failing to understand the old bugs (and there are bugs in that old code that you, for whatever reason, can't read).
Are you trolling? This is one of the most wrong things I've ever read on HN. To take the most obvious point - how are you supposed to rewrite the code if you don't understand what it does?
What about being inclusive as an open-source code base? If you simplify the code, remove obsolete cruft and start making use of commonplace modern libraries, then all of a sudden a bunch of people who weren't interested before, because the code was way too complicated to get into, become potential contributors. And that's huge.
Yes, I don't think this is a WTF. What it really is, is an example of real-world code, that needs to work everywhere, not just in a specific app on a specific OS running on specific hardware... that of course makes it easy for things to be much more tidy.
Actually, best case is you'd have code that still works, but faster, with fewer bugs, and less likelihood of breaking with future changes. Code working is not the only metric by which code is judged. How WELL it works is more important, for example.
Unreadable code means the code doesn't work for one of its two purposes: conveying to the reader what it actually does. So it only accomplishes half its job.
Simple. Just imagine how this code evolved. First, they started with a simple 1-liner, that took at most 5 minutes to come up with. Then another platform was added, this took another 5 minutes, etc.
Abstracting, however, would take perhaps a day of work. A full library such as libuv would have taken maybe a couple of weeks to develop.
If they added libuv support, we'd have a post about "VIM's 430 line keyboard input routine" instead of a mere 400. At least for a major release or so before they could flush the legacy code. If they can.
In that best result you also end up with code that is smaller and easier to extend in the future. Considering that Vim is still extremely common that is a win.
And is more portable, maintainable etc. And criticizing code is not exactly the same thing as saying it should be re-written, given the time investment and low yield.
Yes, the code works, but don't write code like this ever again.
This reminds me of a talk at last year's CppCon about modernizing legacy code. In it, they talked about a core output function of the MSVC standard library that had accreted enough #ifdefs over the years that the function signature alone spanned over 100 lines. It's a great example of how repeatedly making the minimal necessary change can drive code into an unmaintainable mess.
When the team found that the function was impossible to modify to accommodate C99 format string changes, they undertook a lengthy project to factor out the #ifdef'd features using abstractions available in C++. Not only were they able to turn the code into something much easier to modify, but they also fixed multiple hidden performance bottlenecks, so the end result actually ran 5x faster than the C version.
It's also a case where making that investment is worth it given the core nature of the MSVC standard library which makes it integral to a universe of other software. The global impact of a 5x improvement in something in every app makes it a no-brainer. But there's a lot of debt out there that will be replaced before the payoff ever comes.
Interesting Fermi problem for sadistic interviewers: what would be the global energy saving per invocation of this vim function if a 5x performance improvement were to be found?
There's a deeper underlying problem: the bit of code lacks any architecture. Even though libuv and others didn't exist back then, that's no excuse to just pull random descriptors from various parts of the program and stuff them in a select, repeating variants of the same logic over and over. Even creating a consistent notification/callback interface would make the code much more readable, as the underlying pattern is always the same: watch a descriptor and call a function when an event occurs. It's just lazy and ugly.
Your solution sounds like Enterprise Java to me - instead of understanding one long simple solution, you have to understand 3 shorter, less simple solutions.
How much would that have cost in speed 10, 20 years ago? Function calls aren't free, and their cost was pretty high before Moore's Law all but erased it.
There is nothing wrong with supporting lots of platforms, but those #ifdefs need to be encapsulated in wrapper functions (or macros). This is a classic example of premature optimization, actually. You use one or two #ifdefs directly because you hate to pay the cost of function overhead just to make the code easier to read. (Even though there's practically no point in tiny optimizations just before the code is going to wait for keyboard input.) A few years go by, a few more situations are done via #ifdef because at least that way it's consistent. Eventually you have a nightmare function like this, where reading it forces you to read every possible version of the function simultaneously.
Or don't use them at all. Where the behavior must be different, create different source files with the same function signature. When building, link only the implementation appropriate for the target being built.
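A rough sketch of that approach (the file and function names here are invented, not vim's): each platform gets its own file providing the same entry point, and the build links exactly one of them, so callers see no #ifdefs at all.

/* wait_input_unix.c -- hypothetical POSIX implementation; a
   wait_input_win32.c etc. would define the same function, and the
   Makefile would compile only the file matching the target. */
#include <stdio.h>
#include <poll.h>

int wait_for_input(long timeout_ms)     /* shared prototype lives in wait_input.h */
{
    struct pollfd pfd = { .fd = 0, .events = POLLIN };   /* fd 0 = stdin */
    return poll(&pfd, 1, (int)timeout_ms) > 0;
}

int main(void)
{
    puts(wait_for_input(2000) ? "input ready" : "timed out");
    return 0;
}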
I don't think this was done for the purpose of optimization. Usually this happens when you want to add support for one small change, and it's easier to get a small patch in vs a big one.
It's not really the classic case of premature optimization either. At least not in my experience. That's more like 'I know I need a spatial partition here, time to research all the ways they can be implemented and their performance tradeoffs, and implement a really good one' when you should have just used a hash table and done the other stuff if spatial lookups even show up when profiling.
Stuff like this is way more damaging for both codebase complexity and productivity than anything else. The golden rule is to only do enough optimization to make it easy to do the optimizations you might need to do later, but no more.
To my knowledge, maintaining compatibility is one of the stated goals of Vim, even at the expense of performance. NeoVim is supposed to strip out some of these antiquated features.
Also, it just so happens that the people working on NeoVim are currently porting all IO to libuv (the cross-platform library mentioned in the article).
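For readers who haven't seen libuv, here's a minimal sketch of event-driven terminal input with it (illustrative only, not NeoVim's actual code; error checking omitted):

#include <stdio.h>
#include <uv.h>

static uv_tty_t tty;
static char slab[64];

static void on_alloc(uv_handle_t *h, size_t suggested, uv_buf_t *buf)
{
    buf->base = slab;
    buf->len = sizeof(slab);
}

static void on_read(uv_stream_t *stream, ssize_t nread, const uv_buf_t *buf)
{
    if (nread > 0 && buf->base[0] != 'q') {
        printf("read %zd byte(s)\r\n", nread);
        return;
    }
    /* 'q', EOF or error: stop reading so uv_run() can return */
    uv_read_stop(stream);
    uv_close((uv_handle_t *)stream, NULL);
    uv_tty_reset_mode();
}

int main(void)
{
    uv_loop_t *loop = uv_default_loop();

    uv_tty_init(loop, &tty, 0, 1);              /* fd 0 (stdin), readable */
    uv_tty_set_mode(&tty, UV_TTY_MODE_RAW);     /* byte-at-a-time input */
    uv_read_start((uv_stream_t *)&tty, on_alloc, on_read);
    uv_run(loop, UV_RUN_DEFAULT);               /* libuv picks epoll/kqueue/etc. */
    return 0;
}

The point being that the platform-specific polling details live inside the library instead of in a 400-line editor function.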
That said, how does one go about submitting a "fix" for such a widely deployed software? Like, before submitting, I need to make sure it works on Amiga, VMS, Irix... and then all the current platforms. That is, like, a pretty high barrier... who in the world has all these platforms handy for testing? You submit a patch and someone says that it has a problem on Xenix 2.37.4 -- what do you do?
This is a classic case of why legacy code is a nightmare to maintain. It's the OpenSSL situation all over again, and one can definitely see the appeal of going through there and ripping out all of the functionality that nobody uses anymore just to make maintenance less of a nightmare. Luckily vim doesn't run suid root on any sane system, so all of this legacy cruft is not as huge of a threat surface on the machine.
It is still an issue, since vi/vim gets used for sudo purposes all the time. Any time someone uses visudo to modify their sudoers file, for example, that's giving root privileges to vim.
Visudo specifically avoids this problem by creating a copy of the sudoers file, invoking the editor on it as a normal user, waiting for it to exit, checking the syntax, and then renaming the temporary file into the actual sudoers file.
When you visudo, you are subjecting vim to (1) your chosen input and (2) the contents of sudoers. If you are malicious, you could choose your input to be ":!", which gives you a root shell. If the contents of sudoers are malicious, they can give root access to whoever they want. So visudo is not a particularly dangerous scenario.
Of course, it's possible that some aspect of vim will, say, create a temp file insecurely as root, or read random shit out of /proc, or what have you, and be exploitable. But the historical security problems in vim (and there have been several!) have mostly had to do with editing maliciously-composed files with it.
Yep. There's nothing worse than maintaining code that doesn't change, or rarely changes. All that doing nothing because the code works, or works well enough, to avoid anybody noticing it. Yes, that's the classic example of why legacy code is a nightmare to maintain. When I think of maintenance, I think of constantly refactoring things that work.
While I see your point you are missing the point that Parent hits on: Maintenance. Sure if everything was working you wouldn't touch it but if you do need to make changes or add features at some point then having refactored it into something sane and readable makes a lot of sense. I doubt the Parent is suggesting we re-write everything just because it is old.
A slight nit-pick, though I was going to say the exact same thing - `|| 0`. If you do `|| 1` the compiler's going to optimize the entire 'or' expression out ;)
I was thinking that using `0` would cause a problem when you select none of those compilation options, but then I realized the code won't even compile if you don't select at least one (you get empty parentheses following the '&&' in the 'if' statement), so it doesn't seem like a big deal.
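For anyone puzzled by the exchange, a tiny illustration of the pattern under discussion (the feature names are made up): each conditionally compiled branch emits "cond ||", and a trailing 0, the identity element for ||, closes the chain harmlessly, whereas a trailing 1 would make the whole test constant-true.

#include <stdio.h>

#define FEATURE_A   /* comment out both defines and the test becomes if (0) */
/* #define FEATURE_B */

int main(void)
{
    int a_ready = 1, b_ready = 0;

    (void)a_ready;
    (void)b_ready;      /* silence unused warnings when a feature is disabled */
    if (
#ifdef FEATURE_A
        a_ready ||
#endif
#ifdef FEATURE_B
        b_ready ||
#endif
        0)
        puts("something needs handling");
    return 0;
}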
It's been a long time since I've done bare C, but are there now IDEs/editors that let you toggle the define statements you want to activate which then format the code without the ifdef/endif cruft to let you better conceptualize the actual source?
I've just realized that Vim has never exited with an error in my whole life, while pretty much all the other editors I've ever used have. While its code is old, it's certainly resilient.
To answer most questions here: Yes, nowadays there are cross-platform libraries you can use instead of implementing that yourself. And yes code that has grown for centuries and contains unnecessary things or patterns that aren't used nowadays can be refactored. Or at least that's how I'd interpret the article.
I really dislike the code review witch hunts you see on HN (or anywhere). On one side of the coin you have startups that merely want to get something thrown together with duct tape, working, and shipped so they can refactor later and clean things up. On the other hand you have people writing blog posts to humblebrag about their code reading and blogging ability.
No one knows the circumstances that created this original code. The developer may have been working on 10 projects and threw something together just so it'd work.
And if your developer time is better spent on producing more features than cleaning up your previously made code, you'll end up with 10.5k warnings. I'm sure that someone will freak out and cry, but as long as you can keep extending it and working with it, why fix it?
Of course, technical debt builds up, and eventually you're badly locked in until you refactor, so it's a balancing act.
You're only looking at it from a sort of incompetent-manager perspective. Those 10k warnings will produce exploits, and they make reasoning about the code much, much harder, costing time and money; or, even worse, nobody ends up understanding the code at all.
Totally on board with the balancing act. And additionally, in an open source project it's possible to refactor for code beauty, because you can choose to do so for reasons other than resource management.
It would be so amazing to be able to take all of those ifdefs and turn them into type variants and have the compiler enforce the safety and semantics of the system. And it would be phenomenal for old code to break at compile time when the environment changes, instead of waiting for a bug report to roll in.
So, paging T. Petricek...when are we gonna get it in F#? :) Even better if someone can get it into Rust! That is the perfect type of feature for a systems programming language.
Can't believe people are going mad over a code snippet when they don't know under what conditions it was written and under what conditions the code base is being maintained.
Does this have a noticeable performance impact for a typical vim end user, using it in an interactive file editing workflow? On any modern hardware?
I suppose there may be some negative performance impact if we need to use it for an automated/batch workflow (I can't think of many where something like 'sed' won't be better suited).
You can refactor it into multiple functions (#ifdefs around a line containing just "||"? seriously??), some of which are no-ops on some platforms / configurations, and let the compiler deal with inlining the functions that exist and optimizing out those that don't. (Not that you would have noticed the overhead anyway, probably.)
I agree with you. Not all of this can be refactored into separate functions, but the parts that can't can still be refactored to be much cleaner after you remove the #ifdefs made unneeded by the functions.
Generally speaking, a lot of the cross-platform stuff in the Linux kernel is handled this way, and done correctly there shouldn't be any overhead.
Probably with heavy code duplication though, as eg X clipboard management and X session management appear to be cross-cutting concerns. I mean, if you bury the ugly in just one function, I don't see how everyone is calling this horrid engineering.
edit: I read this [1] and maybe the ugly isn't localized =P
edit2: since you're reading this, thank you for ag. I use it every day.
The obvious thing to start with is to get rid of MAY_LOOP and arrange things so there's always an infinite loop, and there's always a finished flag, and the finished flag just never gets set to FALSE in the cases where you don't want to loop. Now you've lost several #ifdef...#endif clauses, and there's no #ifdef nesting.
One way of tidying up the inner parts a bit might be by splitting each FD handling code into an init part, and a part that checks for input and a part that checks for error. For example, here's the code for XSMP, whatever that is. This approach is pretty easy, because you can do it with copy and paste. That's exactly how I did it and that's how I can actually present you code:
(You could argue about this - for example, should ShouldCheckXSMP() maybe always just be ``(xsmp_icefd!=-1)''? - but the way the code is written, this puts all the details in one place, and the logic in another.)
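A hypothetical sketch of what such helpers might look like for the select() case (rfds, efds, and maxfd here stand in for the bookkeeping the function already has; none of this is actual vim code):

#ifdef USE_XSMP
/* Hypothetical reconstructions -- the names match the calls below,
   the bodies are guesses. */
# define ShouldCheckXSMP()  (xsmp_icefd != -1)
# define InitXSMP()         do { \
                                FD_SET(xsmp_icefd, &rfds); \
                                FD_SET(xsmp_icefd, &efds); \
                                if (maxfd < xsmp_icefd) \
                                    maxfd = xsmp_icefd; \
                            } while (0)
# define IsXSMPInput()      FD_ISSET(xsmp_icefd, &rfds)
# define IsXSMPError()      FD_ISSET(xsmp_icefd, &efds)
#endif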
Then the init code would have this bit:
#ifdef USE_XSMP
InitXSMP();
#endif
And after your poll/select code - which you'd similarly hide in a function or a macro, which I've here assumed sets a flag called `any_events' to say that there were any events that might need looking at - you'd do the business like this:
#ifdef USE_XSMP
    if (any_events && ShouldCheckXSMP())
    {
        if (IsXSMPInput())
        {
            busy = TRUE;
            xsmp_handle_requests();
            busy = FALSE;
            if (--ret == 0)
                finished = FALSE;   /* keep going if event was only one */
        }
        else if (IsXSMPError())
        {
            if (p_verbose > 0)
                verb_msg((char_u *)_("XSMP lost ICE connection"));
            xsmp_close();
            if (--ret == 0)
                finished = FALSE;   /* keep going if event was only one */
        }
    }
#endif
(usual disclaimers for forum post code apply.)
So: the actual logic is handled in one place, whether or not you're using poll or select, which would be my key criticism of the code as it stands. And I don't mind having code like this in a #ifdef, if it's only one level deep, particularly if it's somewhat formulaic, which this function would end up being if you approached it this way.
Then repeat for all the parts, and do a bit of work to declare the right variables at the top of the function (something I've just completely ignored).
If you'd prefer to be able to step through it in your average debugger - which tends to do a poor job with #defines - you could do the above with functions, but you'd probably need to move all the state into a struct so that you could pass it around more easily.
Perhaps you could have a mini wrapper for poll and select - for this sort of level of use I'd probably write something local to the file, since it's not so much a separate layer, or a library, or what have you, as just some little helper functions to stop the calling code becoming too awful.
You could always have some kind of extensible function pointer-based system whereby a given descriptor has a callback to be invoked if its FD had an error or has input, which would give you the opportunity to have each subsection of the code supply a low-level function in its own file. (For example, say xsmp_icefd is global only because this function needs to use it - now it could be static to the xsmp support file, which would need only expose a function that would be called from here when input was available or there was an error.)
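A hedged sketch of that idea (the types and names are invented here, not vim's): each subsystem registers its descriptor plus a pair of handlers, and the main loop dispatches without knowing who owns what.

#include <stdio.h>
#include <poll.h>

typedef struct {
    int   fd;
    void (*on_input)(void);
    void (*on_error)(void);
} fd_watch_T;

#define MAX_WATCHES 16
static fd_watch_T watches[MAX_WATCHES];
static int n_watches;

/* Called from each subsystem's own file, e.g.
   add_fd_watch(xsmp_icefd, xsmp_input_handler, xsmp_error_handler); */
void add_fd_watch(int fd, void (*on_input)(void), void (*on_error)(void))
{
    if (n_watches < MAX_WATCHES)
        watches[n_watches++] = (fd_watch_T){ fd, on_input, on_error };
}

/* One poll() covers every registered descriptor; the handlers do the rest. */
void wait_and_dispatch(int timeout_ms)
{
    struct pollfd pfds[MAX_WATCHES];
    int i;

    for (i = 0; i < n_watches; i++)
    {
        pfds[i].fd = watches[i].fd;
        pfds[i].events = POLLIN;
    }
    if (poll(pfds, n_watches, timeout_ms) <= 0)
        return;
    for (i = 0; i < n_watches; i++)
    {
        if ((pfds[i].revents & POLLIN) && watches[i].on_input)
            watches[i].on_input();
        if ((pfds[i].revents & (POLLERR | POLLHUP)) && watches[i].on_error)
            watches[i].on_error();
    }
}

static void stdin_input(void) { puts("stdin is readable"); }

int main(void)
{
    add_fd_watch(0, stdin_input, NULL);     /* demo: watch stdin */
    wait_and_dispatch(2000);                /* wait up to two seconds */
    return 0;
}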
And so on, and so on. I've worked on this sort of thing quite a lot over the years. There's always a way of doing things that doesn't involve a huge gnarly pile of nested #ifdefs and control structures inside #ifdefs. Either of those, let alone both together, are a good sign that you've taken a wrong turning somewhere.
(Some people are doctrinaire about never including any platform-specific #ifdefs anywhere in the first place. I'm not - but those people definitely do have a point.)
I assume portability is why the function declaration is old K&R style and not ANSI prototype style. I'm always surprised when I see K&R style in modern(ish) code. I do miss the ability to declare multiple parameters of the same type without repeating the type name, though. Oddly, several modern languages (like D) seem to think that's a feature.
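For anyone who hasn't run into the old style, here's the difference on a made-up function (the K&R form still compiles on most compilers, though C23 finally removes it):

#include <stdio.h>

/* K&R-style definition: parameter types are declared between the
   parameter list and the body, and can be grouped by type. */
static int
add_knr(a, b, c)
    int a, b, c;
{
    return a + b + c;
}

/* The equivalent ANSI prototype style. */
static int add_ansi(int a, int b, int c)
{
    return a + b + c;
}

int main(void)
{
    printf("%d %d\n", add_knr(1, 2, 3), add_ansi(1, 2, 3));
    return 0;
}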
I suppose it's because I've spent quite a lot of time in the GNU Readline source that this function doesn't seem that bad. Not that I'm wanting to endorse it as great code, of course.
I find it fascinating to look at how old but very well used software develops (often, but not always) in ways that seem completely ghastly.
JOE never tries to use poll or select (it uses multiple sub-processes writing to the same pipe to implement input multiplexing). This archaic way works on all versions of UNIX, even very old ones which don't have select() or poll(). No #ifdefs needed :-)
Output processing in JOE is interesting: when writing to the terminal, JOE periodically sleeps for a time based on the baud rate and amount of data. This is to prevent too much output buffering so that at low baud rates you can interrupt the screen refresh with type-ahead. If you don't do this at 1200 baud it's a big issue (you could wait for hours if you hit PageDown too many times).
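A stripped-down sketch of that many-writers-one-pipe trick (illustrative only; JOE's actual code surely differs): each input source gets a child process that forwards bytes into a shared pipe, and the parent just does one blocking read.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int pfd[2];
    char c;

    if (pipe(pfd) < 0)
        return 1;

    if (fork() == 0)            /* child: forward stdin into the shared pipe */
    {
        close(pfd[0]);
        while (read(0, &c, 1) == 1)
            write(pfd[1], &c, 1);
        _exit(0);
    }
    /* further children could forward other sources (a mouse fd, a timer) */

    close(pfd[1]);              /* parent keeps only the read end open */
    while (read(pfd[0], &c, 1) == 1)    /* one blocking read loop,   */
        putchar(c);                     /* no select()/poll() needed */
    return 0;
}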
It's ugly, but vim is one of the very few programs I've always seen ready and available on any machine I log into. Even the login shell used/available changes, but vim is always there. So they are doing something right.
Emacs also does this with a VM made from a DSL written in Lisp. Of course, I never use either Vim or Emacs ... I am mostly in VS land 99% of my time. I can't even try to imagine what THAT code must look like.
It would be interesting to run this through the preprocessor and see what it comes out as on a real Linux system or something. I can't really tell through all the #ifs, but it would certainly be much shorter.
Speaking of legacy platforms, anyone know of a vi/tinyvi/etc work-alike for CP/M?
I'm going to be doing some hacking on z80 assembly via Yaze, and it would be great if I could have the editor I'm used to, natively on the platform (rather than having to edit in OSX or whatever).
There is no excuse for this. It's trivially refactorable into something easier to maintain with fewer #ifdefs if you are willing to include/exclude files by platform. (And if not, just wrap whole file contents in single #ifdefs.)
this is a monster that has grown over time without any proper love.
To be very frank, I've seen much worse. "Functions" more than 400 lines long, perhaps even 1000 or more, riddled with more macros like in this sample than you can shake a stick at. I'd share a sample of this ghastly stuff, but it's closed source, alas.
I wonder how many examples there are of a more beloved UI (among coders) than Vim's keyboard input. For myself, its exceptional keyboard responsiveness is an essential part of what makes Vim feel so effortless, so hardwired to my brain.
> Vim tries to be compatible with every OS, including dead ones such as BeOS, VMS, and Amiga.
There are people working on a BeOS revival, Haiku OS; they would (rightfully) be quite annoyed if vim dropped support for the OS they're trying to resurrect.
What are some examples of open-source projects that are in daily use by millions of folks and don't have some spaghetti code? I am sure there are some out there, but I really can't think of any at the moment.
If I use vi, vim or neovim as a user, I will not look at the code first; I will look at availability. If I am working on a RedHat server for a customer of mine, I will use vi because it is already installed.
I'd like to see what the code looks like after it's been through the preprocessor in a modern Linux environment -- I'm guessing a hell of a lot shorter?
Wow that's rude. No need to insult someone's upbringing on a thread based on their musings about a snippet of code.
Also, you're wrong. Critique is useful in and of itself, not just as a direct means to getting something fixed. If person A doesn't have the time/skill to fix something, they can critique it and hope that person B sees the critique and goes on to fix it. Also, by critiquing it, it starts a discussion about what caused the problem and how it might be fixed, which is useful for third parties reading the critique: now they know what not to do in their own code.
If you're censoring all of your criticisms just because you don't have the resources to act on them yourself, you're hindering discussion and doing a disservice to the community. In a way you're bikeshedding.
Imagine if you were at a coffeeshop and the barista makes a racist remark about the customer in front of you. Do you sit there quietly and let it happen just because you don't have the power to fire him? Or maybe a less loaded example: suppose your friend gets scammed into paying $2000 for a $900 laptop, do you tell him? Doing so might make him feel stupid, but if you care for his well being you'll tell him so that he doesn't do the same thing again.
Criticism isn't rude, it's useful. Insulting someone based on two sentences they post on the internet isn't useful, it's rude.
The 'ol "raising awareness" argument is way too slippery to be nailed down and often leads people to finding exactly the interpretation w/r/t utility that they want.
Regardless, I think it's pretty reasonable to suggest that telling a community of developers "old code is a bit shit" isn't useful. Especially when the code in question is on a codebase that already has a serious modernization effort underway. And there are already numerous better presented articles online introducing the ways open source could use someone's contribution and calls the reader to action.
Sitting around and "providing criticism" ad nauseam is also a hindrance, disservice, and bikeshed.
The author of the blog post[1] associated with that code sample has, in fact, attempted to submit patches to the original vim codebase. Many people have. They were rejected. That's why neovim exists now.
So this is providing criticism in the service of providing a solution.