I couldn't agree more. I would add reproducibility to the list of important things, above everything else you mentioned. I looked into Vanna after seeing it on here because generating SQL code by only embedding the schema and business logic seems like a nice, quick middle ground that doesn't require embedding an entire database. However, 88% accuracy in generating the correct query isn't good enough for deployment at the organization-level. "Give me sales for the last quarter as of end of prior month" should return the same result for everyone, without exception.