This is pure gold! Thank you so much, Eugene and gang, for doing this. For the ones I have encountered, I agree 100%. This is fantastic! So many good insights.
If this is using OpenAI, which it seems to be, it is only sending the column headers / column names, not the data. If you are concerned about the column names, you could also mask them on the way out and unmask them on the way back in. If you are looking for an end-to-end database connect and query, please reach out to me.
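A rough sketch of the masking idea, in Python. Everything here is an assumption for illustration: the column names, the `col_N` placeholder scheme, and the hard-coded string standing in for what the model would send back. It also assumes column names are simple identifiers (letters, digits, underscores), since it relies on word boundaries.

```python
import re

def build_mask(columns):
    """Map each real column name to a neutral placeholder such as col_0."""
    forward = {name: f"col_{i}" for i, name in enumerate(columns)}
    reverse = {alias: name for name, alias in forward.items()}
    return forward, reverse

def substitute(text, mapping):
    """Replace whole-word occurrences of the mapping's keys in text."""
    if not mapping:
        return text
    # Try longer keys first so overlapping names resolve predictably.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)

# Hypothetical column names, for illustration only.
columns = ["patient_ssn", "salary", "diagnosis"]
forward, reverse = build_mask(columns)

# Mask on the way out: only placeholders would leave your machine.
question = "average salary by diagnosis"
masked_question = substitute(question, forward)
masked_columns = [forward[c] for c in columns]

# Stand-in for the SQL the model would return (no real API call here).
masked_sql = "SELECT col_2, AVG(col_1) FROM t GROUP BY col_2"

# Unmask on the way back in, before running the query locally.
sql = substitute(masked_sql, reverse)
print(masked_question)
print(sql)
```

One tradeoff: the model loses whatever hints the real names carry, so the generated SQL may get worse when names are fully masked.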
The GPT-3 model generates a SQL query. You can run sqldf on top of your data.table. We will be demoing at one of the events shortly. BTW, you could do something similar with other LLMs such as GPT-J and GPT-NeoX if you have worked with them.
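To make the flow concrete, here is a minimal Python sketch. It swaps in the standard-library `sqlite3` module for sqldf / data.table, and the table, the rows, and the "generated" SQL string are all made up. No model is called. The point is only that the prompt carries the schema while the query runs locally against the data.

```python
import sqlite3

def build_prompt(table, columns, question):
    """Compose a prompt that exposes only the schema, never the rows."""
    return (
        f"Table {table} has columns: {', '.join(columns)}.\n"
        f"Write a SQL query to answer: {question}\nSQL:"
    )

def run_sql(rows, table, columns, sql):
    """Load rows into an in-memory SQLite table and run the query locally."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
        placeholders = ", ".join("?" for _ in columns)
        conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

# Hypothetical table and data, for illustration only.
columns = ["region", "sales"]
rows = [("east", 10), ("west", 5), ("east", 7)]
prompt = build_prompt("orders", columns, "total sales per region")

# Stand-in for the model's reply; in practice this string comes from the LLM.
generated_sql = (
    "SELECT region, SUM(sales) FROM orders GROUP BY region ORDER BY region"
)
result = run_sql(rows, "orders", columns, generated_sql)
print(result)
```

In real use you would want to inspect or restrict the generated SQL before executing it, since nothing guarantees the model returns a read-only query.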
They are decently good; I could not find major differences for the cases I was trying. The key is to control the temperature. Make sure it is low, otherwise the randomness increases tremendously. In fact, you can feed the same input from OpenAI into NeoX and it generates results. There are many open NeoX playgrounds that allow you to control the temperature etc.
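For anyone wondering why a low temperature matters: in the usual formulation, the model's scores (logits) are divided by the temperature before being turned into probabilities. A small Python sketch with made-up logits is below. It is a generic illustration, not the internals of any particular service.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.1)   # nearly all mass on the top token
high = softmax_with_temperature(logits, 2.0)  # mass spread across the tokens

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At low temperature the top-scoring token wins almost every time, which is what you want when the output has to be valid SQL. At higher temperatures the alternatives get sampled far more often.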
2 chapters in. Very good, and it makes you think. A truly impressive way of teaching how pivotal data is. The introduction reminds me of Peter Norvig's talk. I also like the point below that data is code; it is very profound.