The recursion limit exists simply to prevent the user-level code from exhausting the native C stack.
On the other hand CPython in fact does extra work because it also creates heap allocated frame objects (which essentially mirror the C stack) for debugging purposes.
Yes, but the point is that that extra work is still happening on the stack, not on the heap. Overflowing the C stack results in a segfault; exhausting heap memory doesn't.
When it performs a Python call, CPython performs a corresponding recursive call of the interpreter function. It uses both the C stack to manage interpreter state and heap-allocated stack frame objects to manage the interpreted program's state.
Your experiment shows that the C stack overflows faster than the heap. It does not show that no heap space is consumed at all.