It assumes the rate at which items are added to the queue never changes. In real life it speeds up (and the queue grows toward unbounded) and slows down (and the queue drains toward empty). A more meaningful measurement would be something like the average number of items in the queue and the average time any given item spends in the queue.
If you feel like you have enough spare capacity and any given item isn't taking too long to process, then it doesn't matter whether the queue is ever empty.
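For what it's worth, here's a rough sketch of measuring both of those, assuming you log an arrival and a departure timestamp per item (the record format here is made up):

    from statistics import mean

    # Hypothetical per-item records: (arrival_time, departure_time) in seconds.
    items = [(0.0, 1.5), (0.4, 2.0), (1.0, 2.2), (3.0, 3.1)]

    # Average time in the queue for any given item.
    avg_time_in_queue = mean(dep - arr for arr, dep in items)

    # Time-averaged number of items in the queue: total item-seconds spent
    # queued, divided by the length of the observation window.
    span = max(dep for _, dep in items) - min(arr for arr, _ in items)
    avg_items_in_queue = sum(dep - arr for arr, dep in items) / span

    print(f"avg time in queue:  {avg_time_in_queue:.2f}s")
    print(f"avg items in queue: {avg_items_in_queue:.2f}")

The two are related anyway: the time-averaged queue depth equals the arrival rate times the average time in queue (Little's law), so measuring one plus the arrival rate gets you the other.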
You're on the right track, but personally I don't find averages to be a useful measure...on average.
Corner conditions are where I start to care:
* What is the worst-case latency? From the point at which an item enters the queue, how long until its work can start, and how long until it can finish?
* What is the worst-case number of items in the queue? How large does the queue need to be?
* What is the maximum utilization over some time period with some granularity? How spiky is the workload? A queue is a rate-matching tool. If your input rate exactly equals your output rate at all times, you don't need a queue.
* What is the minimum utilization over some time period? Another view on how spiky the workload is.
I find minimums and maximums much more illustrative than averages or even mean-squared errors. Minimums and maximums bound performance; an average doesn't tell you where your boundary conditions are.
In general you don't want your queue to fill up completely, but as the other poster said, it's a tradeoff between utilization and latency, and the two are diametrically opposed.
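If it helps, here's a rough sketch of pulling those corner numbers out of a trace. The event format (enqueue/start/finish timestamps per item), the single-worker assumption, and the 1-second window are all just assumptions for illustration:

    from collections import defaultdict

    # Hypothetical per-item trace: (enqueue_time, start_time, finish_time) in
    # seconds, for a single worker pulling from the queue.
    trace = [
        (0.0, 0.1, 0.6),
        (0.2, 0.6, 1.4),
        (0.3, 1.4, 1.9),
        (5.0, 5.1, 5.3),
    ]

    # Worst-case latency, measured from the point the item enters the queue.
    worst_wait_to_start = max(start - enq for enq, start, _ in trace)
    worst_wait_to_finish = max(fin - enq for enq, _, fin in trace)

    # Worst-case queue depth: sweep enqueue (+1) and start (-1) events in time order.
    events = sorted([(enq, +1) for enq, _, _ in trace] +
                    [(start, -1) for _, start, _ in trace])
    depth = worst_depth = 0
    for _, delta in events:
        depth += delta
        worst_depth = max(worst_depth, depth)

    # Max/min utilization: busy seconds per fixed window, divided by the window size.
    WINDOW = 1.0  # measurement granularity
    busy = defaultdict(float)
    for _, start, fin in trace:
        t = start
        while t < fin:
            boundary = (int(t / WINDOW) + 1) * WINDOW   # end of the current window
            chunk_end = min(fin, boundary)
            busy[int(t / WINDOW)] += chunk_end - t
            t = chunk_end
    horizon = max(fin for _, _, fin in trace)
    utilization = [busy[w] / WINDOW for w in range(int(horizon / WINDOW) + 1)]

    print(f"worst wait until start:  {worst_wait_to_start:.2f}s")
    print(f"worst wait until finish: {worst_wait_to_finish:.2f}s")
    print(f"worst queue depth:       {worst_depth}")
    print(f"utilization per window:  max {max(utilization):.2f}, min {min(utilization):.2f}")

Utilization here is busy time per window divided by the window size, so a smaller WINDOW exposes the spikes and a long window smooths them away.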
Sure, I guess by "average" I just meant shorthand for some sort of measurement for making decisions about the resources allocated to your queues: median, mean, P99, whatever is useful.
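e.g., once you've collected per-item wait times, all of those are cheap to compute (these numbers are made up):

    from statistics import mean, median

    # Hypothetical per-item wait times in seconds.
    waits = sorted([0.4, 0.5, 0.5, 0.6, 0.8, 1.1, 1.3, 2.0, 4.5, 9.0])

    # Nearest-rank P99.
    p99 = waits[min(len(waits) - 1, int(0.99 * len(waits)))]

    print(f"mean {mean(waits):.2f}s, median {median(waits):.2f}s, P99 {p99:.2f}s")

With only a handful of samples the P99 is basically the max, which circles back to your point about maximums.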