In my case I had hundreds of invoices in a not-very-consistent PDF format which ...

In my case I had hundreds of invoices in a not-very-consistent PDF format which I had contemporaneously tracked in spreadsheets. After data extraction (pdftotext + OpenAI API), I cross-checked against the spreadsheets, and for any discrepancies I reviewed the original PDFs and old bank statements.

The main issue I had was it was surprisingly hard to get the model to consistently strip commas from dollar values, which broke the csv output I asked for. I gave up on prompt engineering it to perfection, and just looped around it with a regex check.

Otherwise, accuracy was extremely good and it surfaced a few errors in my spreadsheets over the years.