
I want a language where you can express time and constraints on time within the language and the type system. I want to be able to guarantee program correctness, pre-calculate fine-grained energy usage, optimize for energy/power-saving mode usage, interface with asynchronous signals, and whatnot -- all with respect to program execution at the ISA-level within some hard real-time constraints.

Compilers make optimizations using detailed instruction timing information, but as far as I know, these details do not bubble up to the surface in any programming language.

It may be wise to keep these details under the surface at the compiler level, but for 8-bit architectures, it would be awesome to have a language where you have explicit control of time.




Part of the reason most languages obscure this is that it's a moving target.

If a language let you say, "this chunk of code here should run in 7 cycles", what happens when a new optimization finds a way to reduce that, or a new architecture comes up where that operation gets slower but lots of others get faster?

I'm not arguing against your desire, just explaining that it's not unreasonable that we're where we are now. We've gotten so used to language portability that it's good to remember how painful it can be to lose that. It's no fun having to rewrite all your code every time a new chip comes out.


This could only ever be doable with extremely simple architectures anyway. Off the top of my head, add in just one of branch prediction, micro-op fusion, out-of-order execution, pipelines and pipeline stalls, or cache misses, and this becomes impossible. Of course, this assumes you even know which specific CPU you are targeting and its instruction latencies.

That's already an extremely niche set of processors. Further, for the few bits of code where you're likely to care about this kind of extremely precise timing, you'll either examine the emitted assembly or just hand-write the ASM yourself.

It seems like a huge amount of effort for an extremely niche scenario. Remember, the ISA is still just an abstraction, after all.


To add to that, there's also the difference between the cycles spent executing an instruction and how many of those instructions can be executed at once in the pipeline. So there is a difference between executing a set of instructions once versus executing them millions of times.
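
You can see that gap on an ordinary desktop CPU with a toy benchmark like this Rust sketch (sizes and iteration counts are arbitrary; build with --release). Both loops do the same number of floating-point adds, but the first is one long dependency chain while the second gives the pipeline four independent adds to overlap:

    use std::time::Instant;

    // One long dependency chain: each add must wait for the previous one.
    fn single_chain(xs: &[f64]) -> f64 {
        let mut acc = 0.0;
        for &x in xs {
            acc += x;
        }
        acc
    }

    // Four independent accumulators: several adds can be in flight at once.
    fn four_chains(xs: &[f64]) -> f64 {
        let (mut a, mut b, mut c, mut d) = (0.0, 0.0, 0.0, 0.0);
        for chunk in xs.chunks_exact(4) {
            a += chunk[0];
            b += chunk[1];
            c += chunk[2];
            d += chunk[3];
        }
        (a + b) + (c + d)
    }

    fn main() {
        let xs = vec![1.0f64; 32_768]; // small enough to stay in cache
        let rounds = 10_000;
        let t = Instant::now();
        let mut s1 = 0.0;
        for _ in 0..rounds {
            s1 += single_chain(&xs);
        }
        let t1 = t.elapsed();
        let t = Instant::now();
        let mut s2 = 0.0;
        for _ in 0..rounds {
            s2 += four_chains(&xs);
        }
        let t2 = t.elapsed();
        println!("single chain: {:?} (sum {}), four chains: {:?} (sum {})", t1, s1, t2, s2);
    }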


In these times of dynamic frequency scaling, even the temperature of the room the computer is sitting in is going to affect the performance.

In practice I think hard real-time systems use extremely conservative estimates of cpu performance.


This is actually why you want the compiler to track it.

You write an algorithm that seems reasonable. And you encode timing constraints into the thing. Now you re-target to a different machine, and the compiler re-checks those constraints. A much cheaper way of dealing with weird timing bugs than doing a post-mortem of why the fuel valves didn't quite close on the day of the rocket test.
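
Concretely, a minimal sketch of what that re-check could look like, here as a plain compile-time assertion in Rust (this is not a feature of any real toolchain; the cycle estimates, clock rate and deadline are invented, and in practice a WCET tool or build script would supply the per-target constants):

    // Hypothetical per-target worst-case cycle estimates for the routine.
    #[cfg(target_arch = "arm")]
    const CLOSE_VALVE_WCET_CYCLES: u64 = 540;
    #[cfg(not(target_arch = "arm"))]
    const CLOSE_VALVE_WCET_CYCLES: u64 = 700;

    const CPU_HZ: u64 = 16_000_000;     // hypothetical clock rate
    const DEADLINE_NANOS: u64 = 50_000; // 50 us budget

    // If re-targeting (new estimate, new clock) blows the budget, the
    // build fails here instead of on the test stand.
    const _: () = assert!(
        CLOSE_VALVE_WCET_CYCLES * 1_000_000_000 / CPU_HZ <= DEADLINE_NANOS,
        "valve-close routine no longer meets its 50us deadline on this target"
    );

    fn main() {}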

But this only works if the CPU is knowable to the compiler. Which is why it is only even remotely feasible in the embedded world (where it is also the most useful).


The compiler can only help so much as many of the timing parameters are variables and runtime dependent.


This is one of the most actively researched fields in computer architecture/engineering. Hard real-time requirements have been extensively researched for ISAs, compilers, networks-on-chip, concurrency models, etc.

Personally, I was briefly involved in the T-CREST[0] project, where a complete ISA, network-on-chip and compiler were developed in tandem to support deterministic timing and well-understood worst-case execution times (WCET).

As an example, the ISA explicitly defines multiple caches with different write rules: the stack cache is write-back to lower the WCET.

The whole project is available as a VM image[1] where the HDL can be compiled to an emulator and/or synthesized to hardware.

[0]: http://www.t-crest.org/page/overview [1]: http://patmos.compute.dtu.dk


When I was a student in the late '90s, folks at the Uni wanted to do that stuff with Ada. I suspect that some limited-but-useful version of this does exist by now in niche sub-branches of safety-critical embedded programming. (Though the commercially used ones are as likely to be static analysers for C as they are to be an Ada compiler.)

I mean it's nice for compilers to count instructions, but that is far more likely to work on some "primitive" embedded CPU than it is to work on the modern spectre-pipelined goliath CPUs that we use in personal computers and datacentres.


So there are languages that can do these things individually:

Dedalus is an evolution of Datalog, incorporating the notion of time in the type system. You're better off working with an abstract version of time, for portability reasons - the compiler can look at the timing specs to optimize resource usage. Otherwise you're into VHDL territory, where your only recourse is simulating your system to reason about it, due to the complexity.

Fine-grained energy usage would be specific to, say, the SoC you're using, e.g. for the CC1310 there are specific modes for low power. The compiler could potentially take care of this, but you would probably need an abstract state machine for it to optimize, e.g. to identify periods to go into low-power mode.
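
To illustrate the kind of abstract state machine I mean, here's a tiny Rust sketch (the states and thresholds are invented, not the actual CC1310 power modes); a timing-aware compiler could in principle derive the idle window itself and insert the transitions:

    // Pick a power state from how long we expect to be idle before the
    // next deadline.
    #[derive(Debug)]
    enum PowerState {
        Active,  // keep running, not worth sleeping
        Idle,    // clocks gated, RAM retained
        Standby, // deepest sleep, wake-up timer only
    }

    fn next_state(micros_until_next_deadline: u64) -> PowerState {
        match micros_until_next_deadline {
            0..=100 => PowerState::Active,
            101..=10_000 => PowerState::Idle,
            _ => PowerState::Standby,
        }
    }

    fn main() {
        println!("{:?}", next_state(50_000)); // Standby
    }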

Hard realtime nearly precludes many languages due to their need for GC - there are a few realtime GCs around, but really it's better to use a language that has managed memory without GC. So after all that, you're down to a handful of suitable languages for realtime eg. Rust or Composita.

Lastly, guaranteeing program correctness is something you need to do in a high-level language, to be able to capture the semantics you're trying to prove, i.e. you need a language that can express higher-order logic. Whether this proof step is intrinsic to the language is up to you, but you then need to map the higher-level model to your low-level language. Several projects do this, e.g. CakeML, but the full self-verified OS doesn't exist as yet.

Interestingly there is a linguistic solution for all of this which I'm working on, just not got anything concrete to share yet - maybe in 6 months :)


I'm very curious about your solution! If there's a place where I could follow the development, I'd be grateful for a link or, if it's not too much of a hassle, a ping to the email in my profile when you have something you want to show off :)


I'm grateful for the interest! Not really thought about a location - I'll probably use my github account when I've got some material, and ping you when it's there.

As an aside, most of the stuff I've got on my account is higher level, e.g. https://github.com/alexisread/noreml deals more with a new method of visually programming cross-platform UIs, i.e. using nodered dashboards and flow-based programming in a drag-and-drop fashion.

Main influences at the moment are Maru, Composita, Ferret and Maude:

http://piumarta.com/software/maru/ http://concurrency.ch/Research/Composita https://github.com/nakkaya/ferret http://maude.cs.uiuc.edu/overview.html


Thanks, and good luck! Followed you on GH. noreml sounds interesting too, I'll check it out when I have better internet access!


Are you suggesting a language where you could hypothetically look at your IDE and it would tell you via static analysis, "this code block takes x CPU units. On a Y core this is 0.95ns"?

Never thought about that. That would be so cool.


There are lots of attributes that are provably impossible to statically determine about a computer program. Whether the program halts is the obvious example, but Rice's theorem generalizes this to effectively any attribute of a computer program you would be likely to care about.

https://en.wikipedia.org/wiki/Rice%27s_theorem


Impossible to prove for any arbitrary program doesn't mean impossible to prove for programs that we actually care about, and it especially doesn't mean impossible to prove for a program where the programmer is intentionally trying to write a program where certain attributes are provable.

Any statically typed language necessarily has to prevent some well-formed programs from being represented. This language would just need to reject programs where it can't prove the attributes that it is supposed to prove, in addition to programs that are actually malformed.
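
Rust's borrow checker is a familiar example of that trade-off: the function below is fine at run time (on the path that mutates the map, the earlier immutable borrow is already dead), but the current pre-Polonius checker can't prove it and rejects it, so you rewrite it into a form it can prove:

    use std::collections::HashMap;

    // Rejected by today's borrow checker (an E0502-style error), even
    // though no aliasing violation can actually occur at run time.
    fn get_or_insert(map: &mut HashMap<u32, String>, key: u32) -> &String {
        if let Some(v) = map.get(&key) {
            return v; // the borrow of *map is treated as live for the rest of the body
        }
        map.insert(key, String::from("default"));
        map.get(&key).unwrap()
    }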

It is really ergonomics that prevents a language like this from taking off, not any theoretical concerns. Languages that let you prove a lot of complex properties at compile-time take too much work to use, but they are not impossible to create.


F-Star [0] and Idris [1] are two great languages which give you a large variety of static analysis, and yes, provide ways to prove totality.

Idris, for instance, disallows recursion unless it can prove an argument gets "smaller" in the recursive call. For example:

    fib : Nat -> Nat
    fib Z = Z
    fib (S Z) = (S Z)
    fib (S (S n)) = fib (S n) + fib n

In this case, the function pattern matches on natural numbers (Z stands for zero and S for the successor, i.e. one plus some other natural number). Each recursive call of `fib` has an argument which is smaller (closer to the base case), and thus this program must halt. Idris will not compile unless it can prove the program will halt. It therefore rejects all programs which would have halted but whose termination it could not prove.

[0] https://www.fstar-lang.org/

[1] https://www.idris-lang.org/


Rice's Theorem generalizes this to any implementation-independent attribute (i.e. properties of the idealized function on natural numbers that a program represents). However, certain implementation-specific details such as run time (i.e. performance) are fair game. In particular, it is quite simple to at least calculate lower bounds on run time: simply choose a lower bound, run the program, and see if it halts before then.
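
A toy version of that "run it and see" check in Rust (the workload and the candidate bound are arbitrary): run the computation on another thread, wait out the bound, and see whether it finished. If it didn't, its run time is at least that long; if it did, the candidate was not a lower bound.

    use std::sync::mpsc;
    use std::thread;
    use std::time::Duration;

    // Some computation whose run time we want to bound from below.
    fn busy_work() -> u64 {
        (0..200_000_000u64).fold(0, |acc, x| acc ^ x.wrapping_mul(2_654_435_761))
    }

    fn main() {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            let _ = tx.send(busy_work());
        });
        let bound = Duration::from_millis(10);
        match rx.recv_timeout(bound) {
            Ok(r) => println!("finished within {:?} (result {}), so that is not a lower bound", bound, r),
            Err(_) => println!("still running after {:?}, so run time is at least {:?}", bound, bound),
        }
    }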


If you can do that with your program then you can probably just precompute the actual result at build time. Many, if not most unexpected performance problems occur with unexpected or unconsidered inputs.


There are far less wasteful ways of calculating implementation-specific details. My "run until completion" was just an existence proof that general techniques exist. Things like symbolic execution, type systems, and other static analysis tools could in theory help here.

That being said I have personally yet to see non-toy static analysis tools for performance (although I suspect they exist at least in some limited fashion or operating in some constrained domains). This is probably due to the nature of certain pathological inputs and code paths as you point out.

There may nonetheless be hope if you guide the way users can write programs as a sibling comment points out.


There are many such properties, but we can still prove many things about a program despite non-deterministic behaviour, using hidden logic: https://cseweb.ucsd.edu/~goguen/pps/hherb.ps

As such, a proof system which incorporates behavioural specification is highly desirable.


It turns out that it is possible to optimally minimize the number of beta reductions in a lambda calculus program. See this excellent blog post: https://medium.com/@maiavictor/some-functions-may-have-negat...


Approaches like the Abstract Algorithm (that you link to) turn out to just shift the costs elsewhere; in particular to the "book keeping" (IIUC nodes act differently depending on which "path" we're traversing, and keeping track of that exactly cancels out the benefits of optimal sharing).

For example, see links at https://www.reddit.com/r/haskell/comments/2zqtfk/why_isnt_an...


This is already possible to a degree within existing type systems - here’s a small experiment tracking execution time of list comprehensions on the type level in Purescript [0].

[0] https://twitter.com/pkamenarsky/status/961666713900249088?s=...



