A Python internals adventure (2014) (flowerhack.dreamwidth.org)
101 points by luu on Dec 29, 2017 | 3 comments



The whole Unicode checking is done in Python 3 in order to guarantee Unicode support. I actually disliked the old half-string-half-binary approach, and almost from the start I enjoyed the clear distinction between str and bytes in Python 3.

That being said, the strings/bytes cleanup was also one of the few things that really broke backward compatibility with 2.x.
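
To make the str/bytes distinction concrete, here is a minimal sketch (the sample text and the helper name `try_concat` are made up for illustration). It shows that text must be explicitly encoded to get bytes, and that Python 3 refuses to mix the two implicitly:

```python
# str holds Unicode code points; bytes holds raw octets.
text = "naïve café"

# Converting between the two always goes through an explicit codec.
data = text.encode("utf-8")
round_trip = data.decode("utf-8")


def try_concat(s, b):
    """Hypothetical helper: attempt str + bytes and return the result or the error."""
    try:
        return s + b
    except TypeError as exc:
        # Python 3 raises instead of silently coercing, unlike Python 2.
        return exc


mix_error = try_concat(text, data)
```

The encoded form is longer than the text here because the two non-ASCII characters each take two bytes in UTF-8.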


The convention with the Python C API is to return a non-NULL pointer to a Python object on success, and to return NULL and set the exception global variable on error. Yes, global variables are also alive and well.

    PyObject *fout = _PySys_GetObjectId(&PyId_stdout);
    stdout_encoding = _PyObject_GetAttrId(fout, &PyId_encoding);
The Python equivalent of this is `sys.stdout.encoding`. The StringIO object was constructed without an encoding, so this is None.

    stdout_encoding_str = PyUnicode_AsUTF8(stdout_encoding);
This tries to convert None to a C string, which fails: None is not a str, so the call returns NULL with an exception set.
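
The same situation can be seen from the Python side. This is only a sketch: `stream_encoding` and its `"utf-8"` fallback are hypothetical, meant to show the kind of None check the C code was missing, and not how CPython actually resolved the bug:

```python
import io


def stream_encoding(stream, fallback="utf-8"):
    """Hypothetical helper: return stream.encoding, or fallback if it is missing or None."""
    encoding = getattr(stream, "encoding", None)
    return encoding if encoding is not None else fallback


# A StringIO standing in for a replaced sys.stdout.
fake_stdout = io.StringIO()

# StringIO works on text directly, so it reports no encoding.
raw_encoding = fake_stdout.encoding

# Guarding against None avoids passing a non-string on to code that expects one.
safe_encoding = stream_encoding(fake_stdout)
```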


The bug has been fixed since then:

https://bugs.python.org/issue8256



