You need more connections than a single AWS CLI process opens to saturate the network on a big box. You can achieve the same with some "xargs | aws cli" trickery, but then error handling becomes harder.
Our S3 client just handles multiple worker processes correctly with error handling.
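To make the general idea concrete, here is a rough Python sketch of running several transfer processes in parallel while still tracking which ones failed. It is not our client's actual code, just an illustration of the pattern. The names `run_one` and `run_all` are made up for this example, and the commands in the demo are stand-ins so it runs without AWS credentials.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_one(cmd):
    """Run one command as a child process; return (returncode, stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stderr.strip()


def run_all(commands, max_workers=8):
    """Run commands concurrently, one child process each.

    Returns (succeeded, failed), where failed holds (cmd, reason) pairs,
    so a failed transfer is reported instead of silently lost.
    """
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(run_one, cmd): cmd for cmd in commands}
        for fut in as_completed(futures):
            cmd = futures[fut]
            try:
                code, err = fut.result()
            except OSError as exc:  # e.g. the executable was not found
                failed.append((cmd, str(exc)))
                continue
            if code == 0:
                succeeded.append(cmd)
            else:
                failed.append((cmd, err or f"exit code {code}"))
    return succeeded, failed


if __name__ == "__main__":
    # Stand-in commands. In practice each one might be something like
    # ["aws", "s3", "cp", "s3://example-bucket/some-key", "./some-key"],
    # where the bucket and key are hypothetical.
    ok = [sys.executable, "-c", "pass"]
    bad = [sys.executable, "-c", "import sys; sys.exit(3)"]
    done, errors = run_all([ok, bad, ok], max_workers=2)
    print(len(done), "succeeded;", len(errors), "failed")
    for cmd, reason in errors:
        print("FAILED:", cmd, "->", reason)
```

Threads are enough on the parent side here because each one only waits on its child process; the parallel network connections come from the children. Retrying the entries in `failed` would be a natural next step, but that is left out of the sketch.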
Can you please explain how you were able to improve on the performance of the AWS CLI?