CSV works because CSV is understood by non-technical people who have to deal with some amount of technicality. CSV is the friendship bridge that prevents technical and non-technical people from going to war.
I can tell an MBA guy to upload a CSV file and I'll take care of it. Imagine I tell him I need everything in a PARQUET file!!! I'm no longer a team player.
Indeed, my main use is that most financial services will output your records in CSV, although I mostly open that in Excel, which sometimes gets a bit confused.
This is incorrect. Everyone uses Excel, not CSV. There are billions of people on this planet who know what to do with an .xlsx file.
Do the same with a .CSV file and you'll have to teach those people how to use the .CSV importer in Excel and also how to set up the data types for each column, etc. It's a non-trivial problem that forces you down to a few million people.
.CSV is a niche format for inexperienced software developers.
Among the shit I have seen in CSV: no " around strings, including those containing a return char; innovative separators, date formats, and number formats; no escaping for " within strings; rows related to the reporting tool used to export to CSV; etc.
True. But most of those problems are pretty easy for the non-technical person to see, understand, and (often) fix. Which strengthens the "friendship bridge".
(I'm assuming the technical person can easily write a basic parsing script for the CSV data - which can flag, if not fix, most of the format problems.)
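As a rough illustration of what such a basic script might look like, here is a minimal Python sketch. It only flags rows whose field count does not match the header, which tends to catch unquoted separators, stray line breaks, and footer rows added by a reporting tool. The function name `flag_bad_rows` and the sample data are made up for this example; a real script would need checks specific to the data at hand.

```python
import csv
import io


def flag_bad_rows(text, delimiter=","):
    """Return (line_number, reason) pairs for rows that look malformed.

    Hypothetical helper: the only check is that every row has the same
    number of fields as the header row.
    """
    # newline="" leaves line endings alone so the csv module can handle
    # line breaks that sit inside quoted fields.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    try:
        header = next(reader)
    except StopIteration:
        return [(1, "file is empty")]

    problems = []
    for row in reader:
        if len(row) != len(header):
            # reader.line_num is the physical line on which this row ended.
            problems.append(
                (reader.line_num,
                 f"expected {len(header)} fields, got {len(row)}")
            )
    return problems


# Made-up sample: the "Bob" row has an unquoted thousands separator,
# while the quoted "Smith, J" row is fine.
sample = 'name,amount\nAlice,10\nBob,1,000\n"Smith, J",5\n'
for line_number, reason in flag_bad_rows(sample):
    print(f"line {line_number}: {reason}")
```

The output is a list of line numbers and reasons, which is something the non-technical person can take back to their own file and fix.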
For a dataset of any size, my experience is that most of the time & effort goes into handling records which do not comply with the non-technical person's beliefs about their data. That data came from (say) an old customer database, and between bugs in the DB software and abuse by frustrated, lazy, or just ill-trained CSRs, there are all sorts of "interesting" things which need cleaning up.