This code attempts to create unboundedly many Python stack frames. If those frames were allocated on the heap, you would expect it to either throw an exception or else display symptoms of memory exhaustion. Instead, it segfaults.
The recursion limit exists simply to prevent the user-level code from exhausting the native C stack.
On the other hand CPython in fact does extra work because it also creates heap allocated frame objects (which essentially mirror the C stack) for debugging purposes.
Yes, but the point is that that extra work is still happening on the stack, not on the heap. Overflowing the C stack results in a segfault; exhausting heap memory doesn't.
When it performs a Python call, CPython performs a corresponding recursive call of the interpreter function. It uses both the C stack to manage interpreter state and heap-allocated stack frame objects to manage the interpreted program's state.
Your experiment shows that the C stack overflows faster than the heap. It does not show that no heap space is consumed at all.