Well no, not in a non-array programming language. In any language that has a semi-decent type/object system and some kind of functional programming support, `avg a+b` would just be `avg(a, b)`, which is not any easier or harder, with an array type defined somewhere. Once you make your basic array operations (which have to be made in q anyway, just in the stdlib), you can compose them just like you would in q and get the same results. All of the bounds checking and for-loops are unnecessary; all you really need are a few HKTs that do fancy maps and reduces, which the most popular languages already have.
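
For example, with numpy standing in for "an array type defined somewhere" (a rough sketch, the arrays are made up):

  import numpy as np

  a = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
  b = np.array([9.0, 2.0, 6.0, 5.0, 3.0])

  # the "fancy maps and reduces": once the basic array ops exist,
  # they compose the same way they do in q
  np.mean(np.abs(a - b))           # mean absolute difference
  np.cumsum(a)[np.argsort(b)]      # a scan composed with a permutation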

A very real example of this is Julia. Julia is not really an array-oriented programming language; it's a general language with a strong type system and decent functional programming facilities, plus some syntactic sugar that makes it look a bit array-oriented. You could write any Q/k program in Julia without it being any more complex. For a decently complex program Julia will be faster, and in every case it will be easier to read and modify and no harder to write.




Why would it be avg(a, b)?

What if I want to take the average difference of two arrays?


mean(a - b)


I don't know what you mean by the q array operations being defined in the standard library. Yes there are things defined in .q, but they're normally thin wrappers over k, which has array operations built in.


I don't consider an interpreted language having operations "built-in" to be significantly different from a compiled language having basic array operations in the stdlib or calling a compiled language.


Hmm, why not? Using K or a similar array language is a very different experience to using an array library like numpy.


It is syntactically different, not semantically different. If you gave me any reasonable code in k/q I'm pretty confident I could write semantically identical Julia and/or numpy code.

In fact I've seen interop between q and numpy. The two mesh well together. The differences are aesthetic more than anything else.


There are semantic differences too: a lot of the primitives are hard to replicate exactly in Julia or numpy. That's without mentioning things like tables and IPC, which pandas/polars/etc don't really come close to in ergonomics, to me anyway.


Do you have examples of primitives that are hard to replicate? I can't think of many off the top of my head.

> tables and IPC

Sure, kdb doesn't really have an equal, though it is very niche. But for IPC I disagree. The facilities in k/q are neat and simple to set up, but they don't offer anything better than what you can do with cloudpickle, and the lack of custom types makes effective, larger-scale IPC difficult without resorting to inefficient hacks.


None of the primitives are necessarily too complicated, but off the top of my head things like /: \: (encode, decode), all the forms of @ \ / . etc, don't have directly equivalent numpy functions. Of course you could reimplement the entire language, but that's a bit too much work.

Tables aren't niche, they're very useful! I looked at cloudpickle, and it seems to only do serialisation; I assume you'd need something else to do IPC too? The benefit of k's IPC is that it's pretty seamless.

I'm not sure what you mean by inefficient hacks; generally you wouldn't try to construct some complicated ADT in k anyway, and if you need to you can still directly pass a dictionary or list or whatever your underlying representation is.


> None of the primitives are necessarily too complicated, but off the top of my head things like /: \: (encode, decode), all the forms of @ \ / . etc, don't have directly equivalent numpy functions. Of course you could reimplement the entire language, but that's a bit too much work.

@ and . can be done in numpy through ufunc. Once you turn your unary or binary function into a ufunc using foo = np.frompyfunc, you then have foo.at(a, np.s_[fancy_idxs], (b?)) which is equivalent to @[a, fancy_idxs, f, b?]. The other ones are, like, 2 or 3 lines of code to implement, and you only ever have to do it once.
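
Roughly, with made-up data (np.add.at stands in for foo.at here, and the lambda is just an arbitrary verb):

  import numpy as np

  a = np.array([10, 20, 30, 40])
  idx = [0, 2]
  b = 5

  # q-style amend @[a; idx; +; b]: apply + with b at those indices, in place
  np.add.at(a, idx, b)          # a is now [15, 20, 35, 40]

  # for an arbitrary binary verb, wrap it as a ufunc first
  foo = np.frompyfunc(lambda x, y: x * 10 + y, 2, 1)
  foo(a[idx], b)                # elementwise application over the selection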

vs and sv are just pickling and unpickling.

> Tables aren't niche,

Yes, sorry, I meant that tables are only clearly superior in the q ecosystem in niche situations.

> I looked at cloudpickle, and it seems to only do serialisation, I assume you'd need something else to do IPC too? The benefit of k's IPC is it's pretty seamless.

Python already does IPC nicely through the `multiprocessing` and `socket` modules of the standard library. The IPC itself is very nice in most use cases if you use something like multiprocessing.Queue. The thing that's less seamless is that the default pickling operation has some corner cases, which cloudpickle covers.
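
A minimal sketch of what I mean, stdlib only (the payload is made up):

  from multiprocessing import Process, Queue

  def worker(q):
      # anything picklable goes over the queue: dicts, lists, custom objects
      q.put({"sym": "ABC", "px": [1.0, 2.0, 3.0]})

  if __name__ == "__main__":
      q = Queue()
      p = Process(target=worker, args=(q,))
      p.start()
      msg = q.get()     # blocks until the worker sends something
      p.join()
      print(msg)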

> I'm not sure what you mean by inefficient hacks; generally you wouldn't try to construct some complicated ADT in k anyway, and if you need to you can still directly pass a dictionary or list or whatever your underlying representation is.

It's a lot nicer and more efficient to just pass around typed objects than dictionaries. Being able to have typed objects whose types allow for method resolution and generics makes a lot of code much simpler in Python. This in turn allows a lot of libraries and tricks to work seamlessly in Python and not in q. A proper type system and colocation of code with data make it a lot easier to deal with unknown objects - you don't need nested external descriptors to tag your nested dictionary and tell you what it is.
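
As a toy sketch (the Quote class and its fields are made up):

  from dataclasses import dataclass

  @dataclass
  class Quote:
      sym: str
      bid: float
      ask: float

      def mid(self) -> float:
          # the behaviour lives with the data; no external descriptor needed
          return (self.bid + self.ask) / 2

  q1 = Quote("ABC", 99.5, 100.5)
  q2 = {"sym": "ABC", "bid": 99.5, "ask": 100.5}   # the untyped alternative

  q1.mid()                         # 100.0
  (q2["bid"] + q2["ask"]) / 2      # same number, but the schema is implicit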


Again, I'm not saying anything is impossible to do, it's just about whether or not it's worth it. 2 or 3 lines for every type, every overload, every primitive, etc. adds up quickly.

I don't see how k/q tables are only superior in niche situations; I'd much rather (and do) use them over pandas/polars/external DBs whenever I can. The speed is generally overhyped, but it's significant enough that rewriting something from pandas often ends up much faster.

The last bits about IPC and typed objects basically boil down to Python being a better glue language. That's probably true, but the ethos of array languages tends to be different, and less dependent on libraries.



