
While it's relatively easy to define a grammar for a severely restricted subset of a natural language that's both unambiguous and, given appropriate "vocabulary", sufficient to describe a wide variety of programming tasks — AppleScript is a readily available example — that's only a small part of the problem.

For instance, you also need to let users define their own "vocabulary words", including some way to import them from one or more external libraries that weren't necessarily designed well, or designed to work together. To remain unambiguous without sacrificing capability or becoming unnecessarily verbose, you end up needing the same sorts of features for scoping, renaming, and qualified names you'd have in a more traditional language, and you inherit versioning problems as soon as you allow anything beyond fully qualified, explicitly imported vocabulary terms. AppleScript does little to prevent terminology conflicts (if anything, certain aspects of its design seem to encourage them), leading to unexpected behavior and hard-to-find bugs.
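
To make the conflict concrete, here's a rough Python sketch, not AppleScript. The two "libraries" and the `import_unqualified` helper are made up for illustration; the point is only that a flat, unqualified import lets one definition of a term silently shadow another, while qualified names stay unambiguous.

```python
from types import SimpleNamespace

# Two hypothetical vocabulary libraries that both define the term "count".
text_lib = SimpleNamespace(count=lambda s: len(s.split()))  # counts words
list_lib = SimpleNamespace(count=lambda xs: len(xs))        # counts items


def import_unqualified(*libs):
    """Merge every term into one flat scope; later libraries silently win."""
    scope = {}
    for lib in libs:
        scope.update(vars(lib))
    return scope


flat = import_unqualified(text_lib, list_lib)

# Unqualified: "count" now means list_lib's version, so a string is
# measured in characters rather than words.
unqualified_result = flat["count"]("a b c")

# Qualified: the caller says which "count" is meant.
qualified_result = text_lib.count("a b c")
```

Swap the import order and the unqualified result changes without any error, which is roughly the kind of hard-to-find bug I mean.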

Also, you may as well introduce "unnatural" syntax like parentheses for regrouping arithmetic and conditional expressions, as all the ostensibly "natural" alternatives I can think of (requiring a series of assignments to often arbitrarily chosen variables to force subexpression evaluation, say) seem worse.
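
A small Python sketch of the trade-off, with made-up variable names: the parenthesized form says the grouping directly, while the "natural" alternative forces you to invent a name for every subexpression.

```python
price, tax, quantity = 10, 2, 3

# "Unnatural" but direct: parentheses regroup the arithmetic.
total_grouped = (price + tax) * quantity

# "Natural" alternative: assign the subexpression to a temporary first.
# The name unit_cost is arbitrary and exists only to force evaluation order.
unit_cost = price + tax
total_named = unit_cost * quantity

# Without either device, default precedence gives a different answer.
total_ungrouped = price + tax * quantity
```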

Furthermore, even the simplest side-effecting subexpressions can easily lead to hopelessly "unnatural" situations. Suppose I have two objects, "Alice" and "Bob", each with an integer property "age", defined as follows:

    Bob's age is twice Alice's age.
and

    Alice's age is 35 minus the number of previous calls to Alice's age.
Then, assuming we have not previously asked for Alice's age,

    the sum of Alice's age and Bob's age
is either, if Bob's age is evaluated first,

    (2 * 35) + 34
or, if Alice's age is evaluated first,

    35 + (2 * 34).
Integer addition is naturally commutative, so a reader has no reason to expect the order of the operands to matter. Unambiguous "natural language" therefore seems to require some way to express explicit sequencing, whether by sequential assignment to temporaries or by some special syntactic device along the lines of

    the sum of first getting Alice's age, then getting Bob's age.
This is also necessary when any other observable side effects differ: perhaps Bob is always 70 and Alice is always 35, but each "age" is appended to a shared log file immediately before it's returned.
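
Here's a minimal Python sketch of both cases. The class and function names are mine, and a plain list stands in for the shared log file. Sequencing is made explicit with temporaries, which is exactly the device the "natural" phrasing would need some equivalent of.

```python
class Alice:
    """Alice's age is 35 minus the number of previous calls to age()."""

    def __init__(self):
        self.previous_calls = 0

    def age(self):
        value = 35 - self.previous_calls
        self.previous_calls += 1  # the side effect
        return value


class Bob:
    """Bob's age is twice Alice's age, so asking for it also calls Alice."""

    def __init__(self, alice):
        self.alice = alice

    def age(self):
        return 2 * self.alice.age()


def sum_alice_first():
    alice = Alice()
    bob = Bob(alice)
    first = alice.age()   # 35
    second = bob.age()    # 2 * 34
    return first + second


def sum_bob_first():
    alice = Alice()
    bob = Bob(alice)
    first = bob.age()     # 2 * 35
    second = alice.age()  # 34
    return first + second


def logged(name, value, log):
    """Record the name in the shared log immediately before returning."""
    log.append(name)
    return value


def constant_ages(alice_first):
    """Ages are fixed; only the order of log entries depends on sequencing."""
    log = []
    if alice_first:
        total = logged("Alice", 35, log) + logged("Bob", 70, log)
    else:
        total = logged("Bob", 70, log) + logged("Alice", 35, log)
    return total, log
```

`sum_alice_first()` gives 103 and `sum_bob_first()` gives 104, matching the two expansions above. In the logging variant the total is 105 either way, but the log reads Alice, Bob in one order and Bob, Alice in the other. Python happens to define left-to-right operand evaluation, which is one way of settling the question, but it's a language rule rather than anything "natural".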

And so on. Seems to me that starting with a less ad hoc natural language would, at best, only help with the easiest parts.



