MIT develops new tool that can interrupt infinite loops (bostinnovation.com)
106 points by pgatzke on Aug 2, 2011 | 67 comments



This is a good line of investigation. I don't understand all the harrumphing here, except possibly as a means of self-congratulation as to why the harrumpher in question didn't bother inventing this tool themselves and how they understand the "halting problem" in a way not known at MIT.

As the economist famously joked in response to 'how are you'... "Compared to what?" The comparison here is to killing the program and losing all the data.

Obviously there are some big limitations on this - especially given that the state of the program isn't necessarily defined entirely in terms of the process itself, but also involves a series of possibly stateful network connections and OS resources.

Another big limitation is that a straight-up infinite loop at the level of the binary isn't necessarily the only way that a program could get stuck - we could be going around quite a complex loop at the binary level in response to an _interpreted_ loop, and killing the whole interpreter isn't necessarily going to return the program into a usable state.

That being said, what else are you going to do in this case? A number of the alternate solutions here seem to involve time travel.

Some fun possibilities - if you can replicate the OS state, and there's no important network state held open by the program, you can make a bunch of copies of the broken program and try lots of different approaches. You might also be able to fire up the program cleanly, get to the main event loop, identify what that looks like, and just magically splice that state back on top of the broken program in the hope that whatever is happening in global data structures is self-contained enough to recover from.

I understand that there are many reasons why this probably won't work... that being said, the baseline here is zero (especially if the users are made very aware that this program isn't a safety net of any kind and don't become more careless thinking that there's a magic 'fixit' program out there).


I'm one of the authors of this paper and this comment pretty much hits it on the head. Jolt is an investigation into developing tools to help users get more use out of their programs when the alternative is failure. Because the reality is that there will always be programs that fail.

Most of the comments here are pretty fair about where Jolt might not be applicable (busy loops) and how this approach could be integrated with development; we touch on many of these points in the paper (already linked by someone in the comments).

All of our emails are on the interwebs, so feel free to ping us if you want more information.


I've done this with MS Word and the VS debugger. Word crashed while I was editing a huge document. I simply skipped a few instructions down, returned to Word, saved the document, and restarted Word. When your options are either 1) crash, or 2) possibly recover to some working state, it's pretty easy to choose option 2.


I wish I could do that for Safari.


You probably can:

  $ gdb /Applications/Safari.app/Contents/MacOS/Safari $PID

iTunes, annoyingly, uses Apple's stupid little "please don't ptrace me" flag, which is a minor inconvenience, but fairly readily circumventable.


I wonder if that's to make it harder to crack FairPlay.


"Don't ptrace me, bro!"

Sorry, couldn't resist.


Not even a single mention of the halting problem?


I was surprised too. I expected a mention of it, and a link to the poetic and intuitive proof-story of why it's a problem would be good for the story too.

In case other HNers haven't read Scooping the Loop Snooper, I (re)submitted it:

http://news.ycombinator.com/item?id=2838488


The old thread with discussions on why the proof works: http://news.ycombinator.com/item?id=1321899


This wasn't written for "us". Look at the homepage: http://bostinnovation.com/


Noob question time: what's the halting problem?


The halting problem:

Given a program and an input, determine whether that program, given that input, will ever halt, or if it will loop infinitely.

One of the fundamentals of academic computer science is learning that it is mathematically impossible to solve the halting problem - which is why the grandparent comment simply notes "the halting problem" with no further explanation; it's a famous problem. (Don't feel bad, everyone has to learn something the first time!)

The poem linked elsewhere in this comment tree is a simple and pithy description of the mathematical proof.

For (many) more details, Wikipedia has plenty of information.
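
If you like code better than poems, here is a minimal sketch of the classic contradiction, assuming a hypothetical oracle halts(program, data) that always answers correctly (which is exactly what the proof rules out):

  def halts(program, data):
      ...  # hypothetical oracle: True iff program(data) eventually halts

  def troll(program):
      # Do the opposite of whatever the oracle predicts for program(program).
      if halts(program, program):
          while True:      # oracle said "halts", so loop forever
              pass
      else:
          return           # oracle said "loops forever", so halt at once

  # troll(troll) halts exactly when halts(troll, troll) says it doesn't,
  # so no correct halts() can be written.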


Thanks for the explanation! No shame here (I didn't study CS in undergrad) but I find it intuitive that a program can't detect an input that would loop indefinitely.


If that's intuitive for you already (great!), consider one step further: Rice's Theorem. Rice's Theorem in layman's terms is "all nontrivial properties of a program are undecidable in the general case". "Undecidable" is the difficulty class of the halting problem (there are actually harder problems!).

A couple of accessible examples:

1. Constant Propagation - we know that we'll never detect all compile-time constants, because of Rice.

2. Exception frequency - I can write a function that has "raise FooException()" in the code but never actually raises, and you won't be able to prove that it never raises.
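
To make (2) concrete, here is a hedged Python sketch; the names are made up, and strictly speaking the obstacle below is an open conjecture rather than formal undecidability, but it shows how a raise can be dead in practice yet effectively unprovable:

  class FooException(Exception):
      pass

  def is_prime(k):
      return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

  def f(n):
      # If the Goldbach conjecture holds, this raise never fires for any n,
      # but a static analyzer that proves it dead has settled the conjecture.
      for m in range(4, n + 1, 2):
          if not any(is_prime(p) and is_prime(m - p) for p in range(2, m)):
              raise FooException()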


Well, you won't always be able to prove that it never raises. In many cases, you can, which is why compilers can eliminate dead code. Sometimes the compiler can prove the code is dead, sometimes it can prove the code is live, and sometimes it just doesn't know. For any prover, there will be code that defeats it.


The halting problem is a profound bit of theoretical CS, well explained elsewhere in this thread.

However, the actual halting problem is moot in most of the references I see to it, as the halting problem is usually misused by the half-educated to make claims that 'nothing useful can ever be inferred about the behavior of any program, ever'.

For example, we built a tool to either estimate a reasonable ceiling on the stack usage (whole-program) or give up; it works very well and can detect when the structure of the program prevents this tool from operating correctly (nasty stack mischief or stack adjustments appearing in 'unusual' places). Quite useful for us, especially because we control the source code and know that we don't do naughty things to our own stack.

Entertainingly, on the way to building this tool, we encountered a couple of screeds about how this problem just plain can't be solved because it's equivalent to 'solving the halting problem'. Well, yes - in theory, but no - in practice.


My undergraduate thesis tackled static analysis in Ruby. One of the problems I attacked was "does a given method yield to its block?" I broke methods into a few classes, two of which are "Block-Required" and "Block-Optional."

I had to write software which proved a method optionally yielded, which isn't just undecidable, but unrecognizable. For non-CS folks, that means it's actually harder than the halting problem. Yet my software works on nearly all Ruby code I throw at it - it only really has difficulty on complex delegation patterns. Unsurprisingly, the code people write to create Block-Optional methods is highly structured.


That problem gets hard very quickly when you deal with recursion. How did you deal with that? (if you did)


We control the source code in question, so we deal with it by detecting unexpected recursion and screaming bloody murder.

We have a 'little bit' of recursion where we know that we won't go around the recursion more than once, and we've hand-hacked that. We have similar hacks to deal with indirect calls.

I'm not saying we've solved the problem for arbitrary code; I'm saying that solutions that fall very far short of solving the problem for arbitrary code are still enormously useful. Many compiler optimizations don't work for 'arbitrary code' either, for example, and know just enough to bail out when they see an irreducible flowgraph or a bizarre indirect jump, etc.
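
For the curious, the core of the analysis is roughly this (a minimal sketch with made-up frame sizes and call graph; our actual tool works on the compiled binary, not Python dictionaries):

  def stack_ceiling(frame_sizes, callees, root):
      # Worst-case stack depth reachable from `root`; bail loudly on recursion.
      def walk(fn, visiting):
          if fn in visiting:
              raise ValueError("unexpected recursion through " + fn)
          deepest = 0
          for callee in callees.get(fn, ()):
              deepest = max(deepest, walk(callee, visiting | {fn}))
          return frame_sizes[fn] + deepest
      return walk(root, frozenset())

  frame_sizes = {"main": 64, "parse": 256, "log": 128}      # bytes per frame
  callees = {"main": ["parse", "log"], "parse": ["log"]}
  print(stack_ceiling(frame_sizes, callees, "main"))        # 64 + 256 + 128 = 448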


It's mathematically impossible to write a program that always correctly determines whether an arbitrary piece of code will halt (rather than get stuck in an infinite loop). Making that determination is referred to as the halting problem.


Basically, the halting problem states that it is impossible to determine whether or not a given program halts, under the assumption that it is executed on a Turing-complete machine.

Because a program is a composition of programs, which at the lowest level are loops and instructions, this also means that it is impossible to determine whether or not a loop halts.


It's only impossible to write a program that can be fed any completely arbitrary program and input and still determine, every time, whether it will halt or not.

What few people realize is that most completely arbitrary programs are very uninteresting. More interesting programs have characteristics that very often allow one to statically prove that all of their loops that should terminate indeed will terminate. And even interesting loops for which that can't be determined at first often gain in clarity from being changed slightly so that the tool can do its work.

IIRC MS has a tool that does something like that for Windows drivers.


> Because a program is a composition of programs, which at the lowest level are loops and instructions, this also means that it is impossible to determine whether or not a loop halts.

It is easily provable that this is theoretically possible, but the required time is too long for it to be applicable. Suppose your memory has N states. Run your program through N+1 computation steps, and if it's not already finished, then it will never halt: by the pigeonhole principle some state must have repeated, and a deterministic machine that repeats a state is in a cycle.

Thing is, N is very large for real memory, so this straightforward approach is not applicable. However, there's much room to improve this algorithm and I assume that's what the MIT folks did.

The true halting problem is about computational devices with infinite memory.
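
To make the bound concrete, a minimal sketch, assuming a deterministic machine modeled as a step function over a state drawn from a set of known (tiny) size:

  def halts_within_bound(step, state, is_final, n_states):
      # Run at most n_states + 1 steps. By the pigeonhole principle, a
      # deterministic machine that hasn't halted by then has repeated a
      # state, so it will loop forever.
      for _ in range(n_states + 1):
          if is_final(state):
              return True
          state = step(state)
      return False

  # Toy example: a 3-bit counter that halts when it wraps around to 0.
  print(halts_within_bound(lambda s: (s + 1) % 8, 1, lambda s: s == 0, 8))  # True
  print(halts_within_bound(lambda s: s, 1, lambda s: s == 0, 8))            # False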


It should be made clearer that 'memory' here isn't the only thing in play. Register state, as well as the state of any I/O subsystems, also matters. If you're not looking at I/O, then a loop polling for an event driven by network traffic will be judged to never terminate if that event doesn't happen during the testing.


It also may not be applicable if you don't actually have access to all memory states involved. E.g. when communicating with a 3rd party computer over a network.


It is still applicable, as long as you have a bound on the number of unknown states; however, the number of computation steps in this case is so high that it is even less practical than the totally impractical algorithm I proposed in my first comment.

Let's consider both computers (yours and the 3rd party's) as a single computational device, with a state represented by a pair of symbols, where the first represents your computer and the second the 3rd party's. Suppose your computer has N states and the other M. It's enough to run N * M + 1 computation steps -- if your program hasn't ended by then, it never will, for there is some state (a, b) that must have occurred twice, by the pigeonhole principle.

Please keep in mind that this is all purely theoretical, and not at all practical. It's just a counterargument to another purely theoretical argument from the unsolvability of the halting problem.


  def is_collatz(n):
      s = set()
      while n not in (1, 2, 4):
          if n % 2:
              n = 3 * n + 1
          else:
              n //= 2       # integer division; `/=` would give floats on Python 3
          if n in s:        # revisited a value: a non-Collatz cycle
              return False
          s.add(n)
      return True

This is theoretically bounded far below modern memory limits. Feel free to tell me the first natural number for which it returns False. :3


Why would that be bounded far below modern memory limits? Why would that be bounded at all?

Secondly, you're making the logic error A => B implies not A => not B. If state spaces as large as modern memory limits allow are too large, that doesn't imply that state spaces far below modern memory limits are small enough to handle.

Thirdly, your program trivially terminates doing nothing.


Troll-feeding time!

Pretend that an OOM condition is the same as a False answer. :3

Here's a program, building on the previous one, which either terminates or doesn't.

  from itertools import count

  for i in count(1):
      if not is_collatz(i):
          break


So how is the state space then much smaller than modern memory limits? That still ignores my second point, namely that the logical reasoning xyzzyz somehow implied -- that programs with state spaces smaller than modern memory limits are in practice analyzable for halting/not halting -- is not sound.

I can tell you now that your code terminates. The set s is strictly increasing in size, inevitably going towards an OOM condition if the loop doesn't terminate by other means. Note that we are talking about determining halting or not halting, not with which value the program halts.

This is not a troll. You tried to respond with something witty. I just pointed out that in this case things like the Collatz conjecture don't apply, because it is a conjecture over all natural numbers. When you are working with finite memory, there is no way to encode this conjecture into a program (e.g. by making it halt if the conjecture is true and making it loop when the conjecture is false).


I don't understand your point. I could run it with memory constrained to, say, 100 KB, and it will quickly produce an OOM condition. I clearly said that the approach I sketched in my first comment is not practically applicable.

Real-life computers have bounded memory, so they don't have any more computational power than, say, regular expression matchers -- every computational task performed by a real computer can be represented by an appropriately complicated regular expression. That's why one needs to be especially careful when arguing from computability when talking about real-life problems.

The more serious issue is the treatment of I/O.


Does your computer have an infinite tape?


The halting problem doesn't apply.


Seems to me like this sort of thing could be pretty good for advanced users who understand its limitations and potential failure modes, but a nightmare waiting to happen for everyone else.

And of course I'm also inclined to suggest that advanced users would be better off attaching gdb to the process to get done what they need to, then coredumping and contacting the developer...


Aren't event loops infinite loops where memory might be unchanged for a long time?


Presumably the user would not activate Jolt in that case. And technically, unless the loop uses polling, it's not an infinite loop because the program isn't even running.


Even an event-driven loop needs to cycle now and then to service timers.


The only way this would be valuable for everyday use would be for programs that produce "close" output. Think of things like large-scale simulations where a sub-simulation might go infinite, but where interrupting it and changing that sub-simulation's output isn't likely to cause major changes in the macro-simulation.

Very little code is written that way, but when code is written that way, it allows much more interesting things to happen in software and hardware both.


I think you missed the obvious examples of content-creation applications like Word or Photoshop, where you could interrupt an infinite loop, save progress, then restart.


I didn't miss them. I think it will be much harder to reliably interrupt loops in those, especially around file I/O. The infinite loop was caused by bad state, and those programs (currently) don't handle bad state well.


A good autosave mechanism seems to be a much better solution than this. While that would mean the loss of a little bit of data, it seems better than saving a potentially corrupt file or incorrectly inferring that a loop is infinite.


This seems like the software equivalent of smacking a car starter with a hammer to get it to turn.


Well, more like smacking a Toyota's accelerator to get it to stop ;-)


First of all, solving this problem in general is computationally impossible, so this tool makes me feel really uneasy; I'd never use it -- it would actually make predictable, easily reproducible bugs impossible to report when the tool is activated. It makes the assumption that infinite loops are just stuck on a false condition and nothing else is affected, which I believe is fundamentally false.


It appears to take memory snapshots. If it detects a repeat in the snapshots it knows that the program will loop indefinitely. Of course it can't detect all infinite loops, but it can catch a sizeable subset.


In theory I believe it can actually detect all infinite loops, for an extremely generous definition of "can detect". On a finite-memory machine, a computation fails to halt iff it eventually repeats a state, which it must do in finite time, since there are only finitely many possible states (the finiteness is what makes the halting problem not apply). However, that finite time might be extremely large, perhaps longer than the age of the universe. A busy beaver function (http://en.wikipedia.org/wiki/Busy_beaver) gives an upper bound on how long it can be.

But yeah, the practical use is to tell you when a computation is already looping, for the large subset of infinite loops that result in states being repeated relatively quickly.
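
Roughly, the practical version looks something like this (a sketch of the idea as I read it, not Jolt's actual implementation; a real tool would compare full snapshots rather than trusting Python's hash, and external state spoils the inference):

  def detect_stuck_loop(step, state, max_iters=1_000_000):
      seen = set()
      for _ in range(max_iters):
          snapshot = hash(state)       # stand-in for a real memory snapshot
          if snapshot in seen:
              return True              # same state as an earlier pass: stuck
          seen.add(snapshot)
          state = step(state)
      return False                     # gave up; not proven stuck

  # A loop whose controlling state just oscillates never exits:
  print(detect_stuck_loop(lambda s: -s, 1))   # True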


> On a finite-memory machine, a computation fails to halt iff it eventually repeats a state

This is too strong of a statement. Lots of software simply loops, repeating the same state over and over, until an external interrupt occurs.


Good point; I guess I was thinking of the simplified textbook case of computations with no external interaction. I wonder if this MIT system is able to exclude the most likely false positives, like polling for input? On the other hand, maybe polling too long for input counts as an infinite---or at least, too long---loop for their use-case of interrupting user programs that seem to be hanging, so that the user can save.


The busy beaver function gives a very loose upper bound... A better bound: if the computer can be described with n bits of information, and a program has not terminated within 2^n + 1 operations, then the program has an infinite loop.

n is usually the memory plus CPU cache size.


Only if you take a very liberal definition of memory. The following probably isn't an infinite loop, even though it'd look like one under normal definitions of memory (either RAM or RAM+disk):

    $ while ping -W 2 -q -c 1 www.google.com; do sleep 10; done
Eventually, the network will go down.


It only takes memory snapshots to find the next end-of-loop instruction in the binary code and jump to it; the source equivalent being:

  loop_var = 0
  while True:
      if loop_var > 9000:
          break
      loop_var += 1

My main complaint: it might well detect a repeat in the snapshots, but I really doubt that it detects extraneous changes to loop variables that might be caused by the stray infinite-loop logic.

From a programming perspective it's better to crash hard and let the error be known than to fail silently and introduce more subtle errors.

From a user perspective, what is the point of saving a file in a program that is falling over, when you could save a corrupt file and, instead of retaining some of the work, end up with a blob of useless data? I guess you could do Save As... and then manually compare the changed data with the last save. Still, I'd be extremely suspicious of it.


> From a programming perspective it's better to crash hard and let the error be known than to fail silently and introduce more subtle errors.

> From a user perspective, what is the point of saving a file in a program that is falling over, when you could save a corrupt file and, instead of retaining some of the work, end up with a blob of useless data? I guess you could do Save As... and then manually compare the changed data with the last save. Still, I'd be extremely suspicious of it.

I am absolutely shocked that nobody else has even hinted at this. I wouldn't recommend this tool in almost any circumstance.


I think the better way to do this would be to throw a "JoltException" which you could catch in your program if you wanted to, like any other exception.

It seems like it'd be a huge mess to try to write code after the loop that handles the case where the loop didn't complete the way you expected it to.
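
Something like this, entirely hypothetically (JoltException and every other name here is made up for illustration; no such API exists):

  class JoltException(Exception):
      """Hypothetical: raised when an external tool forces an exit from a stuck loop."""

  def reflow_layout(paragraphs):
      # Stand-in for the work that might hang; imagine the tool raising
      # JoltException in here instead of silently jumping past the loop.
      return [p.strip() for p in paragraphs]

  def save_document(paragraphs, path):
      with open(path, "w") as f:
          f.write("\n".join(paragraphs))

  def edit_session(paragraphs):
      try:
          paragraphs = reflow_layout(paragraphs)
      except JoltException:
          # The loop was cut short: state may be inconsistent, so save a
          # recovery copy rather than overwriting the original document.
          save_document(paragraphs, "recovery.bak")
          raise
      save_document(paragraphs, "document.txt")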


OP looks like a spammer for bostinnovation.com: http://news.ycombinator.org/submitted?id=pgatzke


There's a fuzzy line between spammer & contributor.


I made the comment because Reddit has some bostinnovation.com spammers: http://www.reddit.com/user/joeymullette & http://www.reddit.com/user/chezral

I'm guessing they have employees who just post to social news sites.


My experience from debugging is that breaking out of a loop in this way is very hit and miss. Coupling that with my experience of common office-software failures being hard crashes rather than hangs, I can't imagine this tool being /that/ useful. I'd much rather have a tool that autosaves at a very regular interval - I've written many throwaway tools for that in the past, mostly when dealing with old versions of Excel being abused to solve database/code problems... still, this is better than nothing, I guess.

As for the nature of the halting problem meaning this can't work? That only displays a lack of understanding...

All in all, this article and the comments reinforce my view that academia diverges too far from the practical realities of software development. Sure, we need academic research and education, I won't deny it, but the noise-making should be reserved for after practical success has actually occurred.


It's common for the runtime of a game scripting language to have infinite-loop detection - for example, if a loop takes more than 5 ms and hasn't reached (or called) a specific point/mark, then it's reported, stopped or restarted.

But going to the next line...
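
A rough sketch of the watchdog idea above, in Python, using a tracing hook as a stand-in for the interpreter-level hook a real game runtime would use (ScriptTimeout and the budget value are illustrative):

  import sys
  import time

  class ScriptTimeout(Exception):
      pass

  def run_with_budget(func, budget_seconds=0.005):
      # Abort func if it runs past its time budget without returning.
      deadline = time.monotonic() + budget_seconds

      def tracer(frame, event, arg):
          if time.monotonic() > deadline:
              raise ScriptTimeout("script exceeded its time budget")
          return tracer

      old_tracer = sys.gettrace()
      sys.settrace(tracer)
      try:
          return func()
      finally:
          sys.settrace(old_tracer)

  def runaway():
      n = 0
      while True:
          n += 1

  try:
      run_with_budget(runaway)
  except ScriptTimeout:
      print("stopped a runaway script")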


Link to the full paper for those interested: http://people.csail.mit.edu/rinard/paper/ecoop11.pdf


TLDR: Power button or inconsistent results, now you can choose.


What if you wrote an infinite loop, and wanted it that way?

There are plenty of legitimate uses for an infinite loop: a REPL, a server, etc.


Oh look! There is a little note: "Do not use if you need an infinite loop not to stop."


The RunStop key?


Can't think of a use-case for this? .... maybe if you have some seriously "legacy" process running somewhere that loves to stall but patching is not an option for some obscure reason? ... space probes?


I thought the message at the end about people expecting full correctness was a bit off; after all, if we expected fully-correct programs, we would be proving them instead of writing C.


They're poking fun at the formal methods community, which spends a lot of time doing research in this direction. Percolation of these techniques back to industry has been... less than stellar.



