I'm not sure I'd consider this a "failure," but related to GP, I have had a number of issues maintaining Elastic Beanstalk environments, including:
- The single container Docker platform (I'm not sure whether this affects other platforms) can cause the CloudWatch agent on the environment's EC2 instances to stop streaming logs to CloudWatch. This seems to happen when a Docker container fails, for example when the process it's running dies (e.g., a Node.js application throws an uncaught exception and exits). A new Docker container is started, but the new container's log file is sometimes not picked up by the CloudWatch agent (see the first sketch after this list for a possible workaround).
- The default CloudWatch alarms created by the environment can create a "boy who cried wolf" situation. For example, when you update an environment's application version, EB transitions the environment's state from "OK" to "Info" or even "Warning," depending on the deployment policy. This is a routine operation, but CloudWatch still sends an email to the designated notification address about the state change. If you monitor those emails for environment issues, this routine traffic causes alert fatigue, and you can end up ignoring the emails outright, which becomes a problem when the environment transitions into a genuinely bad state. You can create email client rules for this, but the structure of the alarm email doesn't make that easy, at least in Outlook 365.
An annoying example of this is when your EB environment scales out due to, say, an increase in traffic. When the auto-scaling policy later scales your instances back in (normal operation of the policy), you'll get an email that your environment has transitioned into a "Warning" state because one or more of its EC2 instances are being terminated. That looks scary in the CloudWatch email, and you have to learn that it's just the ASG doing its thing, terminating unused instances as it's been configured to do. The emails don't provide good context about what actually led to the "Warning" state (see the second sketch after this list for one way to route these notifications somewhere easier to filter).
- The way environments handle configuration files stored in your application's .ebextensions/ directory can leave application state inconsistent across existing and newly launched EC2 instances between version deployments. For example, if your auto-scaling policy launches a new EC2 instance, but your most recently deployed application version no longer includes some of the commands/settings that an earlier version's .ebextensions/ files applied to the existing instances, the new instance and the old ones end up configured differently. This can be solved by using the "Immutable" deployment type, but that's not the default (see the third sketch after this list). It's an edge case, but it's still something that requires you to SSH into your EC2 instances, and possibly terminate older instances by hand, once you eventually figure out what's going on.
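For the first bullet, one workaround I've leaned on is pointing the log agent at a wildcard path instead of a single file, so the newest container log file in the directory is the one being streamed after a container is replaced. This is only a sketch under my assumptions: the classic awslogs agent, Amazon Linux 2, and the default single-container Docker log directory; the log group name is purely illustrative.

```yaml
# .ebextensions/docker-stdout-logs.config
# Sketch, not an official fix: stream whatever the newest matching log file is,
# so a replacement container's file gets picked up. Paths, group name, and the
# service name are assumptions about the platform version.
files:
  "/etc/awslogs/config/eb-docker-stdout.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      [eb-docker-stdout]
      file = /var/log/eb-docker/containers/eb-current-app/*.log
      log_group_name = my-env-docker-stdout
      log_stream_name = {instance_id}
      initial_position = start_of_file

commands:
  01_restart_log_agent:
    # awslogsd on Amazon Linux 2, awslogs on the older platform
    command: systemctl restart awslogsd || service awslogs restart
```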
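For the notification noise in the second bullet, one option is to stop subscribing an inbox directly and instead point the environment's notifications at an SNS topic you control, then hang whatever filtering you want off that topic (a Lambda, a chat integration, a digest). A sketch, with a placeholder topic ARN:

```yaml
# .ebextensions/notifications.config
# Sketch: route environment notifications to a topic you manage; the ARN is a
# placeholder for a topic created outside this environment.
option_settings:
  - namespace: aws:elasticbeanstalk:sns:topics
    option_name: Notification Topic ARN
    value: arn:aws:sns:us-east-1:123456789012:eb-env-notifications
```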
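And for the configuration drift in the third bullet, switching the environment to immutable deployments is a single option setting. Every deployment then lands on freshly provisioned instances instead of being layered onto existing ones, at the cost of slower deployments and temporary extra capacity. A sketch:

```yaml
# .ebextensions/deploy-policy.config
# Sketch: make immutable deployments the default so new versions never mutate
# instances provisioned under an older configuration.
option_settings:
  - namespace: aws:elasticbeanstalk:command
    option_name: DeploymentPolicy
    value: Immutable
```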
Having said all of that, I think EB is still a reasonable choice for small/beginner workloads: it gives you a number of things (automated deployment, auto-scaling, load balancing, logs, etc.) that you could wire up yourself, but it lets you get to production quickly. For mature applications, I think you may be better off managing those individual services yourself (EB is mostly just wiring together a number of AWS services with a few deployment and monitoring agents running on each EC2 instance). If you're comfortable with the components EB is managing for you and you have a stable CI/CD pipeline, you'll get more flexibility than by bending EB against its will.