Python and performance can be summed up as "just think about what it has to do to add two ints together". The I/O requirements of that vastly exceed the CPU requirements.
Basically any tight loop, heavy use of hash tables (so that's Python, Ruby etc.) or floating point (since this is a sore spot for ARM) is going to create an explosion in time used. C/C++/C#/Java can, by being careful about the types being used, be much faster.
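To make the hash table point concrete, here is a minimal sketch, assuming CPython and an ordinary class without `__slots__` (the `Point` class is made up for illustration). Instance attributes live in a per-object dict, so a plain `p.x` involves a hash lookup rather than a fixed-offset load.

```python
class Point:
    # Hypothetical example class; any plain class behaves the same way.
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1.0, 2.0)

# vars(p) returns the per-instance dict that backs attribute access.
attrs = vars(p)

# Writing through the dict changes the attribute, which shows that the
# attribute really is stored as a dict entry.
attrs["x"] = 5.0

print(type(attrs).__name__, sorted(attrs), p.x)
```

A C++ or Java field access compiles down to a load at a known offset, which is roughly the "being careful about the types" advantage mentioned above.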
I don't understand what I/O has to do with that. The issue is all the extra branching and copying with every opcode. Python is going to be 20x to 100x slower than C++ at pretty much anything. It's an inevitable consequence of the type system.
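For anyone who wants to see the per-opcode overhead, here is a rough sketch, assuming CPython (the `add` function is only an illustration). It lists the bytecode dispatched for one addition and the size of a single int object. Opcode names differ between versions (older releases use `BINARY_ADD`, newer ones `BINARY_OP`), so treat the exact output as version-dependent.

```python
import dis
import sys


def add(a, b):
    # One "simple" addition at the source level.
    return a + b


# The interpreter dispatches several bytecode instructions for it:
# load both operands, run a generic binary op that checks the operand
# types at runtime, then return the result object.
opnames = [ins.opname for ins in dis.get_instructions(add)]

# Each int is a full heap object with a header, not a bare machine word.
int_size = sys.getsizeof(1)

print(opnames)
print(int_size, "bytes for the int 1")
```

In C++ the same addition on two `int` values is typically a single machine instruction on registers, with no type check and no allocation.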