I think if you're going to commit to an offering like this, you need a strong team and they should be able to debug such issues.
Your performance issues can really only come from the VM tech, the overlay network, the storage layer, the orchestrator settings, or Logstash itself.
Given that you can swap each of these in and out, you can isolate the problem and replace the broken part. You can also trace the app to see which syscalls are taking so long.
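To make the tracing part concrete, here's a rough sketch of how you could rank syscalls by time spent. It assumes you captured a trace with something like `strace -f -T -p <pid> -o trace.txt` (`-T` appends the time spent in each call as `<seconds>` at the end of the line). The function name `slowest_syscalls` and the file name are just placeholders I made up, and the parsing is deliberately simple: it skips signal lines and the split "unfinished"/"resumed" lines you get with `-f`.

```python
import re
from collections import defaultdict

# Matches strace -T lines such as:
#   1234 fsync(7) = 0 <0.250000>
#   [pid 1234] write(7, "abc", 3) = 3 <0.001000>
# The optional prefix is the pid that strace adds when following threads.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<name>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$"
)


def slowest_syscalls(lines, top=5):
    """Return up to `top` (syscall, total_seconds) pairs, slowest first."""
    totals = defaultdict(float)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:  # anything that doesn't look like a completed call is ignored
            totals[m.group("name")] += float(m.group("secs"))
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]
```

You'd call it as `slowest_syscalls(open("trace.txt"))`. If the top of the list is dominated by `fsync` or `write`, look at the storage layer; if it's network calls, look at the overlay network. That's only a first hint, not a diagnosis.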
You can have similar issues in pretty much any environment if your team can't debug it. If that's the case, you should probably go for a popular vendor-supported solution. But you'll be in a sad place when the vendor doesn't have the staff to debug their own solution, so pick carefully.
This post isn't meant to sound insulting to you or your ex-colleagues. I'm just pointing out that there is a gulf of knowledge between the guys who can get things to work and the guys who can tell you why something doesn't work, and that gulf only really shows itself when shit hits the fan.