For anyone wondering, that extra 0x20 (32-byte) offset is the size of the PyBytes object header on 64-bit CPython: 64 bits each for the reference count, type object pointer, item count, and cached hash.
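If you want to check the offset yourself, here is a rough sketch. It assumes CPython (where `id()` returns the object's address) and relies on `bytes.__basicsize__` covering the fixed header plus one placeholder byte for the inline data; the helper names `bytes_header_size` and `read_payload` are made up for illustration. On a typical 64-bit build the first print should show 0x20, but other builds may differ.

```python
import ctypes


def bytes_header_size() -> int:
    # CPython-specific assumption: __basicsize__ is the fixed header plus a
    # one-byte placeholder for the inline data array, so subtract 1.
    return bytes.__basicsize__ - 1


def read_payload(data: bytes) -> bytes:
    # CPython-specific assumption: id() is the object's memory address, and
    # the payload starts immediately after the header.
    return ctypes.string_at(id(data) + bytes_header_size(), len(data))


sample = b"hello"
print(hex(bytes_header_size()))        # header size in bytes for this build
print(read_payload(sample) == sample)  # payload read back at that offset
```

Reading the raw memory back at that offset and getting the original bytes is the quickest sanity check that the header really ends where the data begins.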
Thank you, because I was wondering if some Python developer found the same issue and decided to just implement the offset. It makes much more sense that it just happens to work out that way in Python.