> “K programs routinely outperform hand-coded C. This is theoretically impossible. K compiles into C. Every k program has a C equivalent that runs exactly as fast. Yet it is true in practice. Why? Because it is easier to see your error in four lines of code than in four hundred lines of C.”

How true is this? I hear this claim a lot but haven't seen any real benchmarks. E.g., https://benchmarksgame-team.pages.debian.net/benchmarksgame/... doesn't have K or any related languages.

I'm guessing K is fast due to autovectorization for certain cases, but are there benchmarks that provide hard numbers? The existence of a benchmark prohibition clause in the kdb license makes me skeptical of its performance claims.

https://tech.marksblogg.com/benchmarks.html has a kdb benchmark, but due to the use of Xeon Phis, it can't be compared with other benchmarks there.




He wrote solutions for Shootout (predecessor to that Debian page) problems a few years ago; a lack of k in benchmarks is not due to a lack of k being volunteered: http://www.kparc.com/z/comp.k

k is really fast. Half of the things on the Shakti mailing list are just Arthur getting really excited about how significantly he's beating x or y or z in performance and giving numbers for it. `grep`ping it now I see 40 in half a year that explicitly contain the word "benchmark," though not all of these are comparing to other things (some are just comparing to different k releases), and there are more comparisons without that word.

Arthur doesn't work at Kx anymore, by the way. He's at Shakti now. Shakti has a different (but still draconian/non-(A)GPL) license. It probably doesn't have the benchmark clause, but I don't care enough to check (I prefer J to k and don't have a proprietary k on my system).


> He wrote solutions for Shootout (predecessor to that Debian page) problems a few years ago; a lack of k in benchmarks is not due to a lack of k being volunteered

That's 6 of the programs; there were at least 4 others ;-)

I lacked and still lack the knowledge to figure out if those snippets are doing what they should.

For example, do those scripts set arg n from the command line and read from stdin? Do those scripts write correctly formatted output to stdout?

What a pity that page does not show measurements for those scripts, and a comparison with some of the C programs written for the benchmarks game.


Thanks for that link. Do you know if the results are posted anywhere?

https://shakti.com/license.php says "Customer shall not... distribute ... any report regarding the performance of the Software benchmarks..." which I would need to agree to if I want to download the binaries at https://shakti.sh/benchmark/


You could check archive.org to see if it was ever posted on the Shootout site, but when I checked, coverage was spotty. The point of these benchmarks is everyone being able to run them themselves, though.


> was ever posted

No, K was never installed.

> The point of these benchmarks is everyone being able to run them themselves, though.

Exactly.

And compare with programs in some other language:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


Have you looked at any of the opensource k impls? (Kona, ok, klong, ngn/k...)


If you look at my post history, it's actually primarily free software APL implementations! ngn/k is amazing. John Earnest's work with his own variants is really fantastic. I dislike klong and Kona.


How has your experience been with J for on-disk time series DBs?


Not a developer, just a lone data wrangler, but FWIW Jd has made me much faster at exploring data. I now use Jd to create my datasets and use R just for the models. IMO, it's far better than using R to tidy things.


> How true is this? I hear this claim a lot but haven't seen any real benchmarks ...

I think you're trying to read into the statement something other than what Arthur meant. The last part of the quote is just as important as the first. Allow me to try to explain.

It's pretty easy for an experienced C programmer to beat K at some things, for example:

    int i,n;for(i=n=0;i<1000000;++i)n+=i;
is faster (with gcc -O2) than the "equivalent":

    +/!1000000
which is more literally:

    int N=1000000,*a=malloc(N*sizeof(int)),i;
    for(i=0;i<N;++i)a[i]=i;
    int n=a[0];for(i=1;i<N;++i)n+=a[i];
    free(a);
but even an experienced C programmer will experience some fatigue trying to convert a K program to C in this way, and the K implementation is certainly faster than a literal translation (largely because it doesn't use malloc, but yes also because of careful vectorisation and parallelising of many of the operators).
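
To make the comparison above concrete, here is a minimal sketch of both versions as compilable C functions. The names `sum_direct` and `sum_literal` are made up for illustration, and the 64-bit accumulator is my own change: the sum of 0..999999 does not fit in a 32-bit int. This only illustrates the "direct loop" versus "materialise the vector, then reduce" distinction; it says nothing about how any actual k implementation evaluates `+/!1000000`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Direct loop: sum 0..n-1 without ever building the sequence. */
int64_t sum_direct(int64_t n) {
    assert(n >= 0);
    int64_t total = 0;
    for (int64_t i = 0; i < n; ++i) total += i;
    return total;
}

/* Literal translation of +/!n: build the vector (!n), then reduce it (+/).
   Returns -1 if the allocation fails. */
int64_t sum_literal(int64_t n) {
    assert(n >= 0);
    if (n == 0) return 0;
    int64_t *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return -1;
    for (int64_t i = 0; i < n; ++i) a[i] = i;        /* !n */
    int64_t total = a[0];
    for (int64_t i = 1; i < n; ++i) total += a[i];   /* +/ */
    free(a);
    return total;
}
```

Both return 499999500000 for n = 1000000; the second pays for an allocation and an extra pass over memory to get there.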

It's my experience that this difference adds up faster than anything else, and that's why a 20kloc C++ implementation of a HIBP checker was beaten 10x by a 5-line k/q solution:

https://news.ycombinator.com/item?id=22467866

But make no mistake, I don't use k because it's fast, but because it makes me fast: five lines of code ain't squat to get right over 20k lines, so I'd prefer the k solution even if it were "only" just as fast as someone's C++ solution.

It does help tremendously though, that it's usually much much faster.


I think you're missing the point. K certainly isn't faster than C. If you have a C guru, they should always be able to write faster code than K.

The difference is that it should be a lot easier for someone like me (not a C guru) to write a K program that does something useful. I'll also be able to see the whole program, as it will likely be a few lines long. If I wrote some C code, it would take me forever to write and forever to debug. I see this as similar to any scripting language (e.g., Python).


Absolutely. The advantage is terseness of expression.

Less code means fewer bugs.


Arthur's new company, Shakti, has some of the benchmarks they've talked about[1][2].

[1] https://shakti.sh/

[2] https://shakti.sh/benchmark/about?eula=shakti.com/license


That shows kdb is faster than BigQuery, Spark & other similar systems. But it doesn't compare K to C.


What would be interesting, and perhaps informative, is if we could share some of our favorite k programs and then people could try to write faster implementations in C.

I like benchmarks that I can easily run on any computer myself.

For me, a major part of k's appeal is the (original) author's appreciation for small interpreter size and language terseness. Generally, I think k will always beat C, and probably any scripting language you can think of, in that department.

The issue of verbosity is so polarising amongst the vocal programmers who comment online that even if k were faster than whatever language (and free, open source, etc.), I would bet many of them still would reject the idea of using it.


My guess is that K is faster in some situations because the nature of the language requires programmers to use the most direct solution to a problem, getting all the code and data in the cache. C being a verbose language makes this harder to achieve, unless you spend some time to optimize the memory layout of your data.
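
One way to picture the memory-layout point is row-oriented versus column-oriented storage in C. This is only a sketch of the guess above: the `TradeRow` and `TradeColumns` types and both function names are hypothetical, and it is not a description of how any k implementation lays out data. Summing one field over rows drags every other field through the cache, while summing a column touches only contiguous doubles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record, row-oriented (array of structs). */
typedef struct {
    long long id;
    double price;
    double qty;
    char tag[40];   /* unrelated payload that still occupies cache lines */
} TradeRow;

/* The same data, column-oriented (struct of arrays). */
typedef struct {
    long long *id;
    double *price;
    double *qty;
    size_t n;
} TradeColumns;

/* Strided access: each price is sizeof(TradeRow) bytes from the next. */
double total_price_rows(const TradeRow *rows, size_t n) {
    double total = 0.0;
    for (size_t i = 0; i < n; ++i) total += rows[i].price;
    return total;
}

/* Contiguous access: prices sit next to each other in memory. */
double total_price_columns(const TradeColumns *t) {
    assert(t != NULL);
    double total = 0.0;
    for (size_t i = 0; i < t->n; ++i) total += t->price[i];
    return total;
}
```

Both functions compute the same total; whether the layout makes a measurable difference depends on the record size and the hardware, so measure before drawing conclusions.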


There are lies, there are damn lies, and there are benchmarks.

Also, consider how few comments there are in this thread. What does that usually mean on HN?



