
I have a question for those with a better understanding of such things. Is it difficult to calculate time or memory complexity of a function through static analysis?

It seems kind of basic, but I still find myself manually checking code instead of relying on a tool.



For a completely arbitrary program, it is not possible to calculate a tight bound on the time or memory complexity of that program via static analysis. See Rice's Theorem[0].

It is possible to determine interesting properties of programs if you weaken your requirements to approximations and/or add significant constraints. You can maintain properties by construction if you only allow yourself to use certain kinds of building blocks in assembling programs. The less powerful your computational model, the easier it is to prove things about it.

The core idea of a type system is to form a symbolic representation of the semantics of your program which is simpler than the program itself, so that you can prove properties of that simpler system instead of the intractable problem of doing so for the actual program. (If you aren't careful, your type system could end up sufficiently powerful that it is itself undecidable, and you're back where you started.)
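
To make the "by construction" point concrete, here is a toy sketch in Python (all the names are invented for illustration): if you only allow yourself to build programs from combinators that carry their own cost bound, the bound for the composite program falls out automatically, with no analysis of arbitrary code required.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Bounded:
        run: Callable                # the actual computation
        cost: Callable[[int], int]  # upper bound on steps, as a function of input size n

    def seq(f: Bounded, g: Bounded) -> Bounded:
        # sequential composition: bounds add
        return Bounded(run=lambda x: g.run(f.run(x)),
                       cost=lambda n: f.cost(n) + g.cost(n))

    def loop(body: Bounded) -> Bounded:
        # apply body to each of n elements: bound multiplies by n
        def run(xs):
            return [body.run(x) for x in xs]
        return Bounded(run=run, cost=lambda n: n * body.cost(n))

    step = Bounded(run=lambda x: x + 1, cost=lambda n: 1)
    prog = seq(loop(step), loop(step))
    print(prog.cost(1000))   # 2000, i.e. O(n), known without ever running prog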

[0] https://en.wikipedia.org/wiki/Rice's_theorem


> For a completely arbitrary program, it is not possible to calculate a tight bound on the time or memory complexity of that program via static analysis. See Rice's Theorem[0].

That theorem proves that no general algorithm exists. It doesn't mean it can't be done for a particular program. The real challenge is that modern compilers and processors execute code in ways that are hard to predict from the source.


The question was whether it is hard to determine the space/time complexity of ‘a’ function, which places it firmly in Rice's theorem territory.


If you need precision good enough for practical applications, it’s very hard to do. All modern CPUs are pipelined, so the time spent on each instruction depends on neighboring instructions and the data dependencies between them. All modern CPUs have multi-level caches, so a static analyzer would also have to simulate which data sits in which cache level; for many practical problems RAM access dominates the computation, and L1 cache is ~2 orders of magnitude faster than main RAM. All modern CPUs have very sophisticated power management: they scale frequency based on temperature and selectively turn blocks on and off, and this too affects timing. All modern CPUs have multiple cores, the L3 cache is often shared across them, and all caches pay a heavy penalty when a line is shared across cores. And so on.

People don’t do it not only because it’s hard, but also because running the function and measuring time and memory is much easier, and much more precise.
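
E.g. in Python the measuring side is a couple of lines with the stdlib timeit and tracemalloc modules (my_function and sample_input here are just placeholders for whatever you care about):

    import timeit, tracemalloc

    def my_function(xs):           # placeholder workload
        return sorted(xs)

    sample_input = list(range(100_000, 0, -1))

    elapsed = timeit.timeit(lambda: my_function(sample_input), number=10)
    print(f"avg time per run: {elapsed / 10:.4f} s")

    tracemalloc.start()
    my_function(sample_input)
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"peak extra memory: {peak / 1e6:.1f} MB")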

Even if you don’t care about actual time and memory and only care about those useless O(N) figures, it’s still very hard, because complexity depends on the input data. E.g. bubble sort is O(N^2) in the worst case, but it can be O(N) if the data is already mostly sorted and only a single element needs to be moved.
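
Concretely, the classic early-exit bubble sort (throwaway sketch):

    def bubble_sort(a):
        # worst case O(N^2), but bails out after one O(N) pass if nothing moved
        a = list(a)
        n = len(a)
        for i in range(n):
            swapped = False
            for j in range(n - 1 - i):
                if a[j] > a[j + 1]:
                    a[j], a[j + 1] = a[j + 1], a[j]
                    swapped = True
            if not swapped:
                break
        return a

    print(bubble_sort([1, 2, 3, 5, 4, 6]))   # one misplaced element: ~2 passes, O(N)

Nothing in the source tells you which of those two behaviours you will actually get; that is a property of the data.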


The undecidability of the Halting Problem shows you can't even determine whether the complexity is finite!


Yes and no - if the function operates on a balanced tree, you might need some way of encoding the constraints the tree follows, and even just describing what “n” is. But if you’re doing merge sort, the function may take n as an integer, and recurse with n/2 and n-n/2, and that is tractable. But generally, no.
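
For the merge sort case, the shape a tool would have to recognize is roughly this (sketch):

    def merge_sort(a):
        n = len(a)
        if n <= 1:
            return a
        mid = n // 2
        left = merge_sort(a[:mid])       # recurse on n//2
        right = merge_sort(a[mid:])      # recurse on n - n//2
        out, i, j = [], 0, 0             # O(n) merge
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

The recurrence T(n) = T(n//2) + T(n - n//2) + O(n) solves mechanically to O(n log n); the hard part is recognizing that the code really has that shape, and deciding what "n" even means for messier data structures.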


Yes it's difficult, but people try anyway. The term to Google for is WCET analysis:

https://en.wikipedia.org/wiki/Worst-case_execution_time#Auto...


I would argue it isn't tractable.

f(n) = f(n-1) + f(n-1) has runtime O(2^n); f(n) = f(n-1) + f(n-2) has runtime O(~1.62^n); f(n) = f(n/2) + f(n/2) has O(n log n); "for i = 1 to f(n)" has runtime f(n)... good luck determining a function's output through static analysis.
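
You can see the gap just by counting calls (throwaway Python; count_calls is made up for the demo):

    def count_calls(recurse):
        calls = 0
        def f(n):
            nonlocal calls
            calls += 1
            if n <= 1:
                return 1
            return recurse(f, n)
        def run(n):
            nonlocal calls
            calls = 0
            f(n)
            return calls
        return run

    a = count_calls(lambda f, n: f(n - 1) + f(n - 1))     # ~2^n calls
    b = count_calls(lambda f, n: f(n - 1) + f(n - 2))     # ~1.62^n calls
    c = count_calls(lambda f, n: f(n // 2) + f(n // 2))   # ~n calls
    print(a(20), b(20), c(20))   # 1048575 21891 31

Nearly identical source, wildly different growth.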


Both of the first two have runtime O(2^n), since big-O is only an upper bound. Normally speaking, this response would be pure pedantry [0], but for static analysis it might be good enough. In most cases, the naive answer will be good enough, and the tool can direct you to pay attention to areas of particular concern.

Getting static guarantees about performance would require carefully constructing your code with the tool in mind.

[0] Although I still find it outrageous that CS departments describe their algorithms class as "math" when they do not accept this answer.


They’re asking for the tightest upper bound.


There's the master theorem though...
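
Right, for recurrences of the form T(n) = a*T(n/b) + Theta(n^d), the answer falls out mechanically. Toy sketch (only the three textbook cases, nothing exotic):

    import math

    def master_theorem(a, b, d):
        c = math.log(a, b)               # critical exponent log_b(a)
        if d < c:
            return f"Theta(n^{c:.2f})"
        if d == c:                       # exact float compare is fine for a toy
            return f"Theta(n^{d} * log n)"
        return f"Theta(n^{d})"

    print(master_theorem(2, 2, 1))   # merge sort: Theta(n^1 * log n)
    print(master_theorem(1, 2, 0))   # binary search: Theta(n^0 * log n) = Theta(log n)

Though it only applies when the recurrence actually has that divide-and-conquer shape; f(n-1) + f(n-2) doesn't fit.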


Given a sufficiently restricted syntax, any static analysis is possible. So the question isn't especially useful with the current wording.



