OP here: I ran 580 model-dataset experiments to show that, even if you try very hard, it is almost impossible to know that a model is degrading just by looking at data drift results.
"In my opinion, data drift detection methods are very useful when we want to understand what went wrong with a model, but they are not the right tools to know how my model's performance is doing.
Essentially, using data drift as a proxy for performance monitoring is not a great idea.
I wanted to prove that by giving data drift methods a second chance and trying to get the most out of them. I built a technique that relies on drift signals to estimate model performance and compared its results against the current SoTA performance estimation methods (PAPE and CBPE) to see which technique performs best."
"In my opinion, data drift detection methods are very useful when we want to understand what went wrong with a model, but they are not the right tools to know how my model's performance is doing.
Essentially, using data drift as a proxy for performance monitoring is not a great idea.
I wanted to prove that by giving data drift methods a second chance and trying to get the most out of them. I built a technique that relies on drift signals to estimate model performance and compared its results against the current SoTA performance estimation methods (PAPE and CBPE) to see which technique performs best."