
You are conflating how something works right now with how it could work in principle. In this you very much resemble the CPython developers: they never attempt optimizations that go beyond what the Python C API can offer. This is very limiting (and it is why all sorts of Python JIT compilers can, in many circumstances, beat CPython by a lot).

The evidence of how absurd your claim is is right in front of you: Google's implementation of Protobuf uses std::map for dictionaries, and these dictionaries are exposed to Python. But, following your argument, this... shouldn't be possible?

To better understand the difference: a Python dictionary stores references to Python objects, but it doesn't have to. It could, for example, take Python strings and use C character arrays for storage, and then, upon querying the dictionary, convert them back to Python str objects. The same goes for integers, and so on.
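A rough sketch of that idea, in pure Python only to keep it short: the class name PackedStrIntMap is made up for illustration, the inner dict of bytes is just a stand-in for a native hash table, and array("q") stands in for C-level integer storage. A real implementation would live in C or C++; this only shows the conversion happening at the interface boundary.

```python
from array import array
from collections.abc import MutableMapping


class PackedStrIntMap(MutableMapping):
    """Hypothetical str -> int mapping that converts at the interface boundary."""

    def __init__(self):
        self._slots = {}           # UTF-8 bytes -> slot index (stand-in for a native hash table)
        self._keys = []            # slot index -> UTF-8 bytes, needed for deletion and iteration
        self._values = array("q")  # packed signed 64-bit integers, not Python int objects

    def __setitem__(self, key, value):
        raw = key.encode("utf-8")          # str -> bytes on the way in
        slot = self._slots.get(raw)
        if slot is None:
            self._slots[raw] = len(self._values)
            self._keys.append(raw)
            self._values.append(value)     # raises if value does not fit in 64 bits
        else:
            self._values[slot] = value

    def __getitem__(self, key):
        raw = key.encode("utf-8")
        if raw not in self._slots:
            raise KeyError(key)
        return self._values[self._slots[raw]]  # a Python int is created on the way out

    def __delitem__(self, key):
        raw = key.encode("utf-8")
        if raw not in self._slots:
            raise KeyError(key)
        slot = self._slots.pop(raw)
        last_key = self._keys.pop()
        last_value = self._values.pop()
        if last_key != raw:
            # Move the last entry into the freed slot to keep storage dense.
            self._keys[slot] = last_key
            self._values[slot] = last_value
            self._slots[last_key] = slot

    def __iter__(self):
        for raw in self._keys:
            yield raw.decode("utf-8")      # bytes -> str on the way out

    def __len__(self):
        return len(self._values)
```

Because it implements the mapping protocol, things like `dict(m)`, `m.items()` and `"key" in m` work without the caller knowing how the data is stored.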

Why is this not done -- I don't know. Knowing how many other things are done in Python, I'd suspect that this isn't done because nobody bothered to do it. It also feels too hard and too unrewarding to patch a single class of objects, even one as popular as dictionaries. If you go for this kind of optimization, you want it to be applied systematically and uniformly to all the code... and that's, I guess, how Cython came to be, for example.




Cython is pretty much Python without the bytecode interpreter, translated to C instead, but retaining the object model. That's why it's so slow.

And the reason why the object model is the way it is, is because it's an entrenched part of the Python ABI. Sure, if you break that, you can do things a lot faster - this isn't news, people have been doing this with projects like Jython and IronPython that can work a lot faster. But the existing ecosystem of packages is so centered on CPython that this approach has proven to be self-defeating - you end up with a Python implementation that very few people actually use.

So, no, it's not because people are "very confused" or "nobody bothered to do it". It's because compatibility matters.


Well, again, you are confused... and, most likely the CPython developers didn't bother.

No. You don't need the Python object model when implementing a Python dictionary. You have evidence right in front of you: std::map bindings are successfully used in its place.

Why even keep arguing about this?

In fact, you can implement your own dictionary, and if you expose the same mapping protocol, it will work the same as the built-in one. Do you have to use Python objects for this? -- absolutely not. You can convert at the interface boundary. Experience shows that this works noticeably better than using Python objects all the way. Why did the original CPython developers not do it? -- I don't know, I can only guess. I already wrote what my guess is. And, in all sincerity, CPython has a lot more and a lot worse problems. Compared to the rest of the codebase, the dictionary object is fine. So, if anyone would seriously consider improving CPython's performance they wouldn't touch dictionaries, at least not at first.


You don't need a Python object model when implementing a highly specialized dictionary that can only store very specific data types. But Python's dict is a generic collection type that is designed to store any Python object (or rather, a reference to such, since Python has reference semantics for everything).
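To make the distinction concrete, here is a small sketch. Point is a made-up user-defined type, and array("q") is used only as a convenient example of typed, non-object storage; neither comes from CPython's dict implementation.

```python
from array import array


class Point:
    """Hypothetical user-defined type, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


def dict_keeps_the_reference():
    p = Point(1, 2)
    d = {"origin": p, "fn": len, "nested": [1, 2]}  # any object can be a value
    p.x = 10                                        # mutate through the original name
    # The dict holds a reference to the same object, so the change is visible
    # and identity is preserved.
    return d["origin"] is p and d["origin"].x == 10


def typed_storage_rejects_objects():
    values = array("q")  # packed 64-bit integers only
    try:
        values.append(Point(1, 2))
    except TypeError:
        return True      # an arbitrary object cannot be stored here
    return False
```

A container that converts at the boundary can only accept types it knows how to convert, and it hands back new objects rather than the ones that were put in.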

And this part:

> if anyone would seriously consider improving CPython's performance they wouldn't touch dictionaries, at least not at first.

is just straight up nonsense, given how many times over Python's history dicts have been substantially rewritten. As it happens, I work on Python dev tooling, and the CPython team changing internal data structures for perf reasons has been a recurring headache for me, so I know full well what I'm talking about here.



