
Multi-indexes definitely have their place. In fact, I got involved in pandas development in 2013 as part of some work I was doing in graduate school, and I was a heavy user of multi-indexed columns. I loved them.

Over time, and after working on a variety of use cases, I personally have come to believe the baggage introduced by these data structures wasn't worth it. Take a look at the indexing code in pandas, and the staggering complexity of what's possible to put inside square brackets and how to decipher its meaning. The maintenance cost alone is quite high.
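To illustrate the square-brackets point for readers who haven't run into it, here is a minimal sketch of how one `[]` on a DataFrame dispatches to different behavior depending on what goes inside. The frame and its labels are made up for illustration, and this covers only a few of the cases.

```python
import pandas as pd

# Hypothetical toy frame, used only for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=["x", "y", "z"])

col = df["a"]            # a string selects one column and returns a Series
cols = df[["a", "b"]]    # a list of strings selects columns and returns a DataFrame
rows = df[0:2]           # an integer slice selects rows by position
lab = df["x":"y"]        # a label slice selects rows by label, end-inclusive
mask = df[df["a"] > 1]   # a boolean Series filters rows

print(type(col).__name__, list(rows.index), list(lab.index), list(mask.index))
```

The same syntax switches between column selection, positional row slicing, label-based row slicing, and filtering, and that is before `.loc`, `.iloc`, or a MultiIndex enter the picture.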

We don't plan to ever support multi-indexed rows or columns in Ibis. I don't think we'd fare well _at all_ there, intentionally so.




> Take a look at the indexing code in pandas

As the end-user, not quite my concern.

> and the staggering complexity of what's possible to put inside square brackets and how to decipher its meaning

I might not be aware of everything that's possible, but my own usage doesn't give me an impression of staggering complexity. In fact, I've found the functionality quite basic, and I've been using the pd.MultiIndex.from_* constructors extensively for anything slightly more advanced than selecting a few values at some level of the index.
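For context, the kind of usage described here might look roughly like the sketch below. It builds an index with `pd.MultiIndex.from_product` and selects values at one level. The level names and data are invented for illustration and are not taken from the commenter's code.

```python
import pandas as pd

# Hypothetical two-level index: (year, key). Names and values are illustrative.
idx = pd.MultiIndex.from_product(
    [["2023", "2024"], ["a", "b"]], names=["year", "key"]
)
s = pd.Series([1, 2, 3, 4], index=idx)

# Select everything under one value of the outer level; that level is dropped.
by_year = s.loc["2023"]

# Select one value of an inner level by name with a cross-section.
by_key = s.xs("b", level="key")

# Keep rows whose "key" level is in a set of values; both levels are preserved.
subset = s[s.index.get_level_values("key").isin(["a"])]

print(by_year.tolist(), by_key.tolist(), subset.tolist())
```

`from_tuples` and `from_arrays` follow the same pattern when the level combinations don't form a full Cartesian product.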


> As the end-user, not quite my concern.

Complicated code is (probabilistically) slow, buggy, infrequently updated code. By all means, if it looks like a good enough tool for the job (especially if the alternatives don't) then use it anyway, but that's slightly different from it not being your concern.

I've seen enough projects need "surprise" major revisions because some team tried to sneak a dataframe into a 10M QPS service that my default is keeping pandas far away from anything close to a user-facing product.

I've also seen costs balloon as the data's scale grows beyond what pandas can handle, but basically all the alternatives suck for myriad reasons, so I don't try to push "not pandas" in the data backend. People can figure out what works for themselves, and I kind of like just writing it from scratch in a performant language when I personally hit that bottleneck.



