I was working with GPT the other day on a new (not programming) problem for the first time, and I asked it to download a Google Drive spreadsheet and take a look at it. I was quite surprised to get a flat response that it couldn't do that, because just the other day I had been working through a programming problem with GPT-3.5 (and I've never paid for it). At one point in that dialogue I asked it to generate a parser, and I suggested it would be really handy if there were some way I could provide it with example data for the parser to be tested against.
Wouldn't you know, it said this:
> I'm sorry, but as a text-based AI model, I cannot directly receive or process files or attachments. However, I can guide you on how to provide a sample PDF and assist you further with the script.
> If you'd like to share an example PDF, you can upload it to a file-sharing service (such as Dropbox, Google Drive, or any other file-hosting platform) and provide a link to the file. I can then guide you on how to modify the script based on the specific structure and content of the PDF.
Can it download the PDF, or is it faking? I had told it, as a first prompt, that it was a competent Ruby programmer, and that's the only reason I can think of for why I got such different results. Is it drawing a line between "reading the structure" and "processing a set of data files", or is it really just faking me out? Its next response after I shared a link:
> After reviewing the PDF, I'll provide an updated version of the script that should work well with the given example. Please allow me some time to analyze the PDF and make the necessary adjustments.
... OK, should I just wait here then? I thought about it for a minute, and then I provided it with the header row for the output CSV, which was the only thing I really needed it to get from the PDF file. We moved on after that, and I do not know if it actually ever read the PDF, or if it was pulling my leg supportively, "yeah I can't really download a file, but sure send me the file, I am a great Ruby programmer as you said before, so surely I will know what to do with it friend-o!"
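(For the curious: the kind of parser we were going back and forth on was roughly this shape. This is just a sketch of my own, assuming the pdf-reader gem and made-up column names; it is not what GPT actually produced.)

```ruby
# Rough sketch of a PDF-to-CSV parser of the sort discussed above.
# Assumptions: the pdf-reader gem for text extraction, and placeholder
# column names -- the real header row came from the actual PDF.
require "pdf-reader"
require "csv"

HEADERS = %w[date description amount].freeze # placeholder columns

def parse_pdf_to_csv(pdf_path, csv_path)
  reader = PDF::Reader.new(pdf_path)

  CSV.open(csv_path, "w") do |csv|
    csv << HEADERS
    reader.pages.each do |page|
      page.text.each_line do |line|
        # Naive row detection: split on runs of 2+ spaces and keep only
        # lines that yield the expected number of columns.
        fields = line.strip.split(/\s{2,}/)
        csv << fields if fields.size == HEADERS.size
      end
    end
  end
end

parse_pdf_to_csv("statement.pdf", "output.csv") if __FILE__ == $0
```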
Is it really just the presence or absence of the reassuring "you are a programmer and you know what you're doing" prompt at the beginning of my conversation that made the bot behave differently in the two cases? I was told that it would be important to do things like this. What's really happening here, I wonder? I would love to understand a bit better!
We did get to the end of the problem, and no, the parser did not work reliably (but it worked much better than I was honestly expecting; also, I did not try with GPT-4, so maybe that would have made the difference...)
There's a new thing called ChatGPT Browsing mode which CAN do this, but you need to pay for access to it and it makes it very obvious what is going on. There's a screenshot of that in my post (at the bottom).
> very convincing imitation of being able to do so
So that's really what was happening! Then it's even more impressive how close it came to parsing the PDF file correctly (or I guess I gave it more direct hints than I remembered). Thanks for the link!
> Please allow me some time to analyze the PDF and make the necessary adjustments.
That should have been the giveaway. It's not sitting there pondering your problem in the background. But it sure sounds like something your helpful robot buddy would say.
Yeah, I was about to ask it "OK, how long should I wait, or do you need me to provide more input?" Then I remembered: I'm not talking to a being. There's no reason to ask it questions like that. Keep talking about the problem; it's trained to talk about a solution (don't spoil the illusion! Just let it pretend it saw the PDF file, and see how far this goes).
I still think things might have gone differently if I had provided it with the PDF as base64-encoded content. That's me still not understanding what kind of machine a GPT/LLM is, isn't it. It can write a parser, but it can't test and run the parser. It can pretend we did, though, and it can sure apologize when I call it out for lying!
The things it says aren't based on knowledge or truth. It doesn't even know whether it can download things, much less can it tell you the truth about that capability. It just says words.