A 35-year-old bug in Patch found in efforts to restore 29-year-old BSD (bsdimp.blogspot.com)
238 points by fanf2 on Aug 17, 2020 | 32 comments



> 35-year-old bug

Long ago, someone was archiving magnetic tapes at MIT, containing Lisp Machine backups, from longer ago. Or maybe they were TOPS-20 backups... my memory has faded.

Archiving here means using an old 9-track tape machine with a custom driver, to copy data off now read-once 9- and 7-track tapes. Read-once tapes, because after the tape goes by the read head, the tape's plastic backing goes one way, and the tape's rust goes another. The backing is rewound, and the rust makes a scattered little pile. The original driver would back up and retry on error. Scrubbing back and forth, back and forth. Which here, would be bad. But back to our story.

On one such longer-ago backup, was a core dump file. A core dump, with a snapshot of the frame buffer. A frame buffer showing someone's screen, at that moment of the core dump, so long ago. And at that moment, that someone was being pranked. Pranked by a program which would draw, crawling across the screen, a little spider. A little bitmap sprite bug. A bug, trapped by chance in a core dump, and preserved in rust.


Never saw the spider, but I do remember "xroach", in which roaches would scatter from underneath where a window was after being closed.

Great fun from about 25-30 years ago.

EDIT: It looks like xroach still lives! https://www.freshports.org/games/xroach/


A bitmap spider figures into a display-hack demo in Open Genera that I could never get working. I think it may be more related to that.


That sounds right - tnx!


> xroach still lives!

Ha. I've an old AR head-pans-around-desktop hack. Xroach in AR might make fun video...


Usually when I hear "might make fun video..." I'm intrigued.

I'll pass on this though.


Fragile e-amber


There's a whole paper about crabs, it's surfaced up here on HN before and it's fantastic.

Let me know if this is the same thing you're talking about: https://news.ycombinator.com/item?id=22383205


My fuzzy memory is of a solitary bug. But my memory hook is "bug in amber", which could have pulled askew cardinality and subphylum.

A program like crabs could be seriously disturbing in VR and AR. A transparently declarative framework like A-Frame might permit 3rd-party object eating. Or more difficult, there's various "visible body" code... imagine watching your arms eaten down to muscle and bone. Or to gears, depending on avatar. Or the people you're chatting with. Eeep.


Were these tapes designed to be read once (if so, why), or had they degraded with time?


This is an interesting choice to make. One option risks incorrectly applying some corner case of well-formed patches; the other risks incorrectly applying malformed ones that are working silently today.

I can see wanting to avoid the former. The latter also seems bad though. Rarely is it ever made so clear that fixing bugs in a popular tool also leads to compatibility issues. People often cite that concern to avoid fixing bugs, but this is a legit case of that dichotomy.


On a related note, I've found it a bit harder than normal to find information on the patch command. Without digging into the code, I was trying to figure out why some people name things in the first two lines with an ending of .old. Some common search queries really have a hard time bringing this up. One link I found was here, but it didn't really go into much depth:

https://stackoverflow.com/questions/987372/what-is-the-forma...


If I'm understanding you correctly you are wondering why some filenames are named .old in the header of patch files.

If so, then that is likely an easy answer. No one creates patch files by hand, they are created by diff (or similar tools).

And diff, at least the Unix variant, is used to compare two files and produce an output that will change the first of the two into the second of the two (i.e., it outputs the 'differences' between the two files, hence the utility's name 'diff').

Well, to have two files, one of the files has to be given a different name. So if one is editing a single file, and is not using some kind of source control system that tracks changes, it is common to first do:

    cp file_to_be_edited file_to_be_edited.old

Then, edit "file_to_be_edited".

Then produce a patch by doing:

    diff -u file_to_be_edited.old file_to_be_edited > file_to_be_edited.diff

And since diff puts the filenames of the two files it compares into the header lines of the output unified diff format, you get a filename ending in .old showing up in the output patch file.
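
A minimal sketch of the workflow above, for anyone who wants to see the header lines for themselves. The file name `notes.txt` and its contents are made up for illustration; only the standard `cp`, `diff`, `sed`, and `cut` utilities are assumed.

```shell
#!/bin/sh
# Illustrative only: notes.txt and its contents are hypothetical.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf 'alpha\nbeta\ngamma\n' > notes.txt
cp notes.txt notes.txt.old                  # keep a pristine copy first
printf 'alpha\nBETA\ngamma\n' > notes.txt   # stand-in for editing the file

# diff exits 0 for "no differences", 1 for "differences found", >1 for trouble.
diff_status=0
diff -u notes.txt.old notes.txt > notes.txt.diff || diff_status=$?

# The first two lines of a unified diff name the old and the new file;
# anything after a tab (the timestamp, when present) is cut off here.
header_old=$(sed -n '1p' notes.txt.diff | cut -f1)
header_new=$(sed -n '2p' notes.txt.diff | cut -f1)
echo "$header_old"
echo "$header_new"
```

With GNU diff this should print `--- notes.txt.old` followed by `+++ notes.txt`, which is exactly where the .old name in a patch header comes from.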


Well I just want to stop you right there and say that I have sinned and written diffs by hand, and what’s more they were for a sendmail.cf, and if this conjures visions of a cantankerous sandal-wearing Unix admin then so be it


I just got flashbacks. My night is now ruined.


If you want to make a patch, you need a starting file and an end point. So you take a pristine tree and call it "foo.orig" or "foo.old" and then run diff -ur foo.old foo

Maybe younger developers don't know this experience because usually their patches are generated by git diff and the like. But before that stuff was as common as today, you would need to copy the old files off somewhere to generate a patch. At that point you need to name it something.


And for some projects, you might not have the disk space for a full 'clean' tree, so you copy just the 2 or 3 files you're editing into `file.old`. And when you're constructing the patch, you hand craft the order in which files are shown to make sure the code lines up to the narrative that accompanies the patch.
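
The hand-ordering described above could look something like this sketch. The directory `proj`, the file names, and the edits are all hypothetical; the point is only that per-file diffs can be concatenated in whatever order suits the write-up.

```shell
#!/bin/sh
# Illustrative only: proj/, helper.h and main.c are hypothetical.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
mkdir proj

printf 'int helper(void);\n' > proj/helper.h
printf 'int main(void) { return 0; }\n' > proj/main.c

# Save only the files about to be edited, not a whole pristine tree.
for f in proj/helper.h proj/main.c; do
    cp "$f" "$f.old"
done

# Stand-ins for real edits.
printf 'int helper(int verbose);\n' > proj/helper.h
printf 'int main(void) { return helper(0); }\n' > proj/main.c

# Concatenate the per-file diffs in the order the explanation follows:
# the interface change first, then its caller.
: > change.diff
for f in proj/helper.h proj/main.c; do
    # Exit status 1 just means "files differ"; anything higher is an error.
    diff -u "$f.old" "$f" >> change.diff || [ "$?" -eq 1 ] || exit 1
done

first=$(grep '^+++ ' change.diff | head -n 1 | cut -f1)
last=$(grep '^+++ ' change.diff | tail -n 1 | cut -f1)
echo "$first"
echo "$last"
```

Swapping the two names in the second loop is all it takes to present the caller before the interface change instead.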


As 2.11BSD code is, I think, at least partially prior to the settlement with AT&T, does anyone know what, if any, files in it are still encumbered?


The Version 6 (7?) UNIX code that BSD is based on was re-licensed to a 4-clause BSD license in 2002.

http://www.lemis.com/grog/UNIX/


It’s too bad Caldera didn’t have ownership to give it away; rather, they had rights to sublicense it.

Should have bought a $100 ancient Unix license instead.


Did not know that. Thank you.


2BSD is covered by the Ancient UNIX license, though the status of the license is a bit murky: https://virtuallyfun.com/wordpress/2018/11/26/why-bsd-os-is-...


The question too is whether UNIX is copyrighted due to the pre-Berne Convention distribution of the system w/o copyright notices... And there's also a question of laches as well should this ever be litigated (I suspect it won't, but I never anticipated the SCO suit, so there you go).


I cannot reproduce this with an installation of GNU patch 2.7.6.

I could be doing something wrong or misunderstanding the bug.

Input files:

  $ cat patch-bug-test
  How
  now
  brown
  cow?
  Now
  is
  the
  time
  for
  all
  good
  men.

  $ cat patch-bug-test-2
  How
  now
  brown
  cow?
  Now
  is
  the
  time
  for

Context diff:

  $ diff -c patch-bug-test patch-bug-test-2
  *** patch-bug-test 2020-08-17 11:39:03.056723058 -0700
  --- patch-bug-test-2 2020-08-17 11:41:46.683095324 -0700
  ***************
  *** 7,12 ****
    the
    time
    for
  - all
  - good
  - men.
  --- 7,9 ----

Apply in reverse to copy of patch-bug-test-2:

  $ cp patch-bug-test-2 patch-bug-test-3
  $ diff -c patch-bug-test patch-bug-test-2 | patch -R patch-bug-test-3
  patching file patch-bug-test-3

The operation is successful and the reverse-patched file is now identical to the original:

  $ diff patch-bug-test patch-bug-test-3 
  # no output

I did try it with different numbers of lines removed.

Looking at the code in the GNU version, the function is quite different. That block of code is found, with the comment intact, but there are differences. It looks like this in the GNU patch repository, as of this commit: http://git.savannah.gnu.org/cgit/patch.git/tree/src/pch.c?id...

     if (!chars_read) {
       if (repl_beginning && repl_could_be_missing) {
          repl_missing = true;
          goto hunk_done;
       }
       if (p_max - p_end < 4) {
         strcpy (buf, "  \n");  /* assume blank lines got chopped */
         chars_read = 3;
       } else {
         fatal ("unexpected end of file in patch");
       }
     }

The unpatched FreeBSD one referenced in the article:

   if (len == 0) {
     if (p_max - p_end < 4) {
       /* assume blank lines got chopped */
       strlcpy(buf, "  \n", buf_size);
     } else {
       if (repl_beginning && repl_could_be_missing) {
         repl_missing = true;
         goto hunk_done;
       }
       fatal("unexpected end of file in patch\n");
     }
   }

The order of the tests is reversed, which could make a difference (though obviously only in the case when repl_beginning and repl_could_be_missing are true and the goto is taken). In the oldest baseline that is in the GNU patch repo (2009-dated), it is already this way, so to find the commit which affected this code we would have to look to earlier GNU patch sources.


I don’t think GNU patch shares history with Larry Wall’s original. https://directory.fsf.org/wiki/Patch seems to say so: “GNU version of Larry Wall's program that takes "diff's" output and applies it to an original file to generate a modified version of that file”

I tried to verify by looking at their git repo, but that stops at “Import of patch-2.1.tar.gz” (https://git.savannah.gnu.org/cgit/patch.git/log/?ofs=450)


Nope. GNU Patch definitely is based on Larry's patch. But this bug has been known for a long time, since shar needs to work around it. It's just that GNU patch had it fixed very early on, while BSD and probably others did not. So shar still had to use the workaround.


Shar's workaround is for leading white space. This change is for not assuming trailing newlines.


It absolutely shared history with Larry Wall's patch.... It took the 2.0 sources and started improving them. I ran across a blurb about this somewhere. Also, the code structure is clearly derivative: variable names, function names, wording of messages.


> I don’t think GNU patch shares history with Larry Wall’s original.

Seems extremely unlikely to me that someone would independently write that if statement almost identically, with the same comment. These are very clearly descended from the same source.


... in a function having the same name, with a goto to an identical label name, and an error message string literal with exactly the same text, all in a file also named pch.c ...


Well, trivially:

    $ patch --version
    GNU patch 2.7.6
    Copyright (C) 2003, 2009-2012 Free Software Foundation, Inc.
    Copyright (C) 1988 Larry Wall

    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.

    Written by Larry Wall and Paul Eggert

But also, the GPL requires keeping a detailed ChangeLog, and if anyone's going to be a stickler about that, it's the FSF: https://git.savannah.gnu.org/cgit/patch.git/tree/ChangeLog-2...


True. I'd been looking at an older version of GNU patch when I thought it hadn't been patched to fix this... It looks like it was, a while ago...



