Self Healing Code with clojure.spec (cognitect.com)
100 points by espeed on Dec 25, 2016 | 28 comments



This seems like a very simple concept when compared with other systems using software transplants [1]. The most important thing I gleaned from this is the benefit of having 'type information' (clojure specs) available as first-class data. The reason this is so simple, which isn't a bad thing, is that clojure (and lisps in general) make it easy to get metadata and work with it.

[1] http://crest.cs.ucl.ac.uk/autotransplantation/downloads/auto...


The article clearly explains how self-healing code works. However, what I don't understand is why this is a good idea in the first place. If you can't correctly write a function that meets a given specification, what makes you think that some random recombination of code you've previously written will meet the specification?


> If you can't correctly write a function that meets a given specification, what makes you think that some random recombination of code you've previously written will meet the specification?

Absolutely. There are three additional issues I see with it.

First, most code in our codebase deals with business objects, not with mucking about with integers. Very little of that code would have similar signatures, which in many cases would leave an empty pool of candidates.

Secondly, the fact that function X crashes does not always mean that function X is wrong; it may just be a case of an upstream function W sending the wrong input. If you happen to have a "no-check" version of X, the "self-heal" will happily let invalid data get into your system.
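A minimal Python sketch of that failure mode, with every function name invented for illustration:

```python
# Hypothetical sketch of the "no-check" hazard above; all names are invented.

def process(x):
    if x < 0:
        raise ValueError("upstream sent bad data")  # fail fast on invalid input
    return x * 2

def process_nocheck(x):      # a "no-check" variant lying around in the codebase
    return x * 2

def self_heal(failing, candidates, samples):
    """Swap in any candidate that matches the failing function on sampled I/O."""
    for cand in candidates:
        if all(cand(s) == failing(s) for s in samples):
            return cand
    return failing

# Healing is triggered after process(-1) crashed, but the samples are drawn
# from *valid* inputs, so the no-check variant looks equivalent and wins.
healed = self_heal(process, [process_nocheck], samples=[0, 1, 2])
print(healed(-1))  # -2: the invalid data now flows silently downstream
```

The sampled inputs can never prove the candidates equivalent on inputs the original would have rejected.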

But the worst is that, while I don't know clojure.spec, I doubt it's able to describe side effects. Who wants a program that invokes arbitrary side effects while searching for a good candidate, because writeData(data: string) -> void crashed with an error?

It's possibly an interesting research topic, but practically, it seems either useless or plain nasty.


Machine learning algorithms do exactly that--they "write" a function that humans couldn't write by hand. What makes you think we couldn't eventually do the same here?


There's a qualitative difference between tuning parameters that are real numbers (exploring a continuous space with a meaningful notion of “marginally better”) and coming up with code on the fly (exploring a discrete space of syntax trees, which can only be done by enumerating them). Topological considerations can give you an idea of what kinds of things are worth trying.
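As an invented illustration of that discrete search: enumerate tiny expression trees over a small grammar and keep the ones matching target I/O pairs. There is no gradient here, only enumeration.

```python
# Invented illustration of the discrete search: enumerate tiny expression
# trees over {x, 1, +, *} and keep those that match the target I/O pairs.
import itertools

LEAVES = ["x", "1"]
OPS = ["+", "*"]

def exprs(depth):
    """Yield string-encoded expression trees up to the given depth."""
    yield from LEAVES
    if depth == 0:
        return
    for op, l, r in itertools.product(OPS, list(exprs(depth - 1)),
                                      list(exprs(depth - 1))):
        yield f"({l} {op} {r})"

target = [(0, 1), (1, 2), (2, 3)]  # I/O pairs for the unknown function x + 1
matches = [e for e in exprs(1)
           if all(eval(e, {"x": x}) == y for x, y in target)]
print(matches)  # ['(x + 1)', '(1 + x)']
```

The space grows combinatorially with depth, and "almost right" trees give no hint about which neighbor to try next.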


I wouldn't want self healing code in my codebase. Especially if I am on call.

Do you want reliable software? Fail fast, design with failure in mind and keep it tested by injecting failures every now and then.

I think the Erlang approach is very good, and it doesn't need esoteric approaches.


Good idea. I get that it was a demonstration of an application of spec, but for a real implementation of self-healing code, in my opinion, there needs to be a way to verify that the matched functions are actually solving the same problem, instead of relying on some sort of statistical confidence via I/O sampling.


Why? What's wrong with inducing the function?

(Genuinely curious.)


What's wrong with quantum mechanics? Same thing. If the problem isn't exactly the same, then the outputs will actually differ in "edge" cases. And the problem is that when trying to induce the function, we don't at first know to what degree of exactness we can make confirmations. So we either need to (a) take a long time to learn, or otherwise know, exactly what the problem is, or (b) have the degree of wisdom (eyesight) to confirm at a glance, by checking key features, that the induced function will have no unexpected outputs, despite not having directly sampled/computed all possible outputs.


This is probably a reflection of my own incomplete understanding of Erlang & BEAM, but doesn't Erlang have the capability to achieve this same end? Hot-loading new code etc?

(This comment is not meant as a criticism of the blog post btw, just a slightly related question)


Any programming language that can do metaprogramming has this capability, even C, but it's more pleasant to implement in certain languages.


It's still difficult for me to imagine a future where programs are organic instead of specific, dry machine instructions. I caught myself thinking of alternative solutions that don't require clojure.spec (e.g. why don't we make sure never to write candidate functions that aren't correctly "typed"? Then we could simply forget about checking the specs), but I think that defeats the spirit of this exercise: that tools like clojure.spec enable us to create more intelligent programs.


Interesting. Is there a way to do something like this in Python?


Yes. You can get pretty fancy with decorators, and there's always eval/exec if you need it. The stdlib has plenty of power tools for introspection.

The leg up that lisps have over Python is that Python's eval/exec operates on strings whereas lisps can easily manipulate the AST. (I'm not sure if I should call it abstract for a lisp.) Directly manipulating Python's AST can be done with the aptly named ast module, but personally I find it easier to do string templating and eval.
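As a hedged sketch of the decorator route: a toy return-value "spec", loosely imitating what clojure.spec's fdef plus instrument give you. The `fdef` name and predicate-as-spec design here are invented, not a real Python counterpart to clojure.spec.

```python
import functools

# Invented sketch: a return-value "spec" enforced by a decorator, loosely
# imitating clojure.spec's fdef + instrument. fdef is a made-up name here.
def fdef(ret_spec):
    def wrap(f):
        @functools.wraps(f)
        def checked(*args, **kwargs):
            out = f(*args, **kwargs)
            if not ret_spec(out):
                raise ValueError(f"{f.__name__} returned {out!r}, failing its spec")
            return out
        return checked
    return wrap

@fdef(lambda n: isinstance(n, int) and n >= 0)
def decrement(n):
    return n - 1

print(decrement(5))  # 4
# decrement(0) would raise ValueError: the spec requires a non-negative int
```

A healing layer could catch that ValueError and go hunting for candidates, which is roughly the shape of the article's approach.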


I don't think I'd want code to self-heal on the fly, even less in anything safety critical.

But maybe a similar method could be used as part of static analysis to help programmers. It could look through your functions and search for library functions you could import to replace them with. Or it could look for functions that do very nearly the same, but handle edge cases differently.


You are arguing for and against the same logic, but defending one level of abstraction as good and another as poor.

Functions you define take parametrized input, use predefined logical constructs, and produce some output. Self-healing code (diction demands a more appropriate word; perhaps "self-rectifying", since code isn't ontologically defined as holistic, though a reordering of ethics might serve that purpose one day) does the same: it works within the boundary conditions set by the logician who defines its instructions, and merely iterates through code chunks, testing against user-defined output until it succeeds.

The big-O cost of achieving fairly complex goals this way makes implementation very manageable. I presented a paper on this at my alma mater a few years ago and was a bit surprised it hadn't received much traction. Every parameter can be included in the logic to ensure space and time complexity constraints are met, as well as code style.

Would you argue against a medical procedure that heals your cells instead of healing your organ at the tissue level?


How can this be used as a part of static analysis, when it works by running your code?


Because you can divide your runtime into multiple stages: if program A runs and outputs program B, then from B's point of view, A was able to perform a static analysis of B's behavior.

You can always view the term and type levels of a programming language as two separate but intertwined languages. In this case, both the term and type level would be identical instead of the type level being specialized with logic programming-esque constructs as usual.
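A toy Python illustration of that staging view, with the source string, the "no function calls" rule, and all names invented for the example:

```python
import ast

# Toy illustration of staging: program A parses and checks program B's source
# before B ever runs, so from B's perspective the check was static.
source_b = "def double(x):\n    return 2 * x\n"

tree = ast.parse(source_b)           # stage one: A inspects B without executing it
assert all(not isinstance(node, ast.Call) for node in ast.walk(tree))

namespace = {}
exec(compile(tree, "<program-b>", "exec"), namespace)  # stage two: run B
print(namespace["double"](21))  # 42
```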


That's not what's happening in this case. You're fixing a program that has just run if it doesn't pass certain validations. This is pretty much by definition not possible in a static analysis, whose purpose is to determine whether a putative program is in fact worth running in the first place.

On the other hand, in a typed language, when you check whether a term “t” has type “T”, it's a precondition that “T” is a valid type. If you haven't established that “T” is a type, you have to establish it first.


Haven't we been there already? Systems that would run code, no matter what? The result was that systems could be hacked by committing flawed code, which would execute completely differently than intended without ever throwing an error message.

Wasn't that the mistake that brought forth the fail-early, fail-loud approach?


Did you happen to read the article _and_ have basic knowledge of spec at the same time?


Yep, read it, and its flaw is the assumption that "correct" code transferred from another application into the error region results in "correct" code. In other words, it ignores the context. And such self-healing systems have been around: I worked with "syntax-healing" compilers and code that, upon error, linearly interpolates to an approximate solution from nearby earlier results. So dividing by zero basically becomes dividing by the nearest value, 1 (the identity operator), with a minus in front if you divide by a negative number. And that is just a solution for flawed code that returns numerical results.

You cannot switch context that easily without erecting a whole system of meta-information for every function. So instead of debugging, we start to fill out forms again, hoping, but not knowing for certain, that the result will be self-repairing. And then the idea dies, to be reborn again one generation into the future. Because self-repair systems, even in nature, are often flawed and kill the repaired organism.

If time and resources didn't matter, you would stand a better chance of solving this by having a result fitness function (aka a unit test) and a genetic algorithm modifying the code.
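That suggestion, a unit test acting as a fitness function over candidate implementations, can be sketched in Python. The candidates and test cases are invented, and a real genetic algorithm would also mutate and recombine the candidates rather than just rank a fixed pool:

```python
# Invented sketch: a unit test as a fitness function over candidates.

def fitness(fn, cases):
    """Fraction of (input, expected) pairs the candidate gets right."""
    score = 0
    for x, expected in cases:
        try:
            if fn(x) == expected:
                score += 1
        except Exception:
            pass  # crashing candidates simply score lower
    return score / len(cases)

cases = [(0, 0), (1, 2), (3, 6)]      # the "unit test": f(x) should be 2 * x
candidates = [lambda x: x + x, lambda x: x * x, lambda x: x]

best = max(candidates, key=lambda f: fitness(f, cases))
print(fitness(best, cases))  # 1.0  (x + x passes every case)
```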


Not intending to detract from the elegant and progressive thrust of the ideas here: when we consider software failure in the context of distribution and higher-level systems, we note that both CAP and the impossibility results in that domain apply, and at best we can hope for episodic health checks and healing of distributed software systems; specifically, only at AP consensus points, or during CP system downtime for out-of-band repairs.


Is this what happens when self-healing code starts to post on Hacker News? :)


So basically, error-handling code? I apologize, but I don't see what's new in this.


All of your error-handling code rewrites functions in live code, replacing failing function calls with calls to functions that would succeed?


No, because that would be stupidly expensive. Besides, it would introduce a second problem: fixing the healing code itself, more specifically the parts that validate the candidate functions, which can be buggy as well, as anyone who has written automated tests knows.

But I guess that saying that your system died from autoimmune disease is way more "exciting" than saying that it died from a segfault.


"system died from autoimmune disease" - this is gold!

I can't imagine why anyone would want an untested set of semi-buggy functions just to swap them in at will. However, I can imagine the tinkering-code approach, not "self-healing": we expect ALL functions to be good; we just don't know which one fits the situation better. People have actually done this all the time since the beginning of history, but perhaps the discussed approach can introduce a more abstract framework for it.



