You get an acceptable answer maybe 60% of the time, and that's assuming most of your questions are really simple. The other 40% of the time it's complete nonsense dressed up as a reasonable answer.
In my experience I get acceptable answers to more than 95% of the questions I ask. In fact, I rarely use search engines now. (BTW, I jumped off Google almost a decade ago and have been using DuckDuckGo as my main search engine since.)
I think the problem is that they can't communicate that they don't know something, and instead make up some BS that sounds somewhat reasonable. That's probably due to how they're built. I notice this regularly when asking questions about new web platform features, where there isn't enough information in the training data.
Yes, I (try to) use them all the time. I regularly compare ChatGPT, Gemini, and Claude side by side, especially when I sniff something that smells like bullshit. I probably have ~10 chats from the past week with each one. I ask genuine questions expecting a genuine answer; I don't go out of my way to "trick" them, but often I'll get an answer that doesn't seem quite right and then I dig deeper.
I'm not interested in dissecting specific examples because that's never been productive, but I will say that most people's bullshit detectors are not nearly as sensitive as they think they are, which leads them to accept sloppy, incorrect answers as high-quality factual ones.
Many of them fall into the category of "conventional wisdom that's absolutely wrong". Quick but sloppy answers are okay if you're okay with them; after all, we didn't always have high-quality information at our fingertips.
The only thing that worries me is how really smart people can consume this slop, somehow believe it to be high-quality information, and present it as such to other impressionable people.
Your success will of course vary depending on the topic and difficulty of your questions, but if you "can't remember" the last time you had a BS answer then I feel extremely confident in saying that your BS detector isn't sensitive enough.
> Your success will of course vary depending on the topic and difficulty of your questions, but if you "can't remember" the last time you had a BS answer then I feel extremely confident in saying that your BS detector isn't sensitive enough.
Do you have a few examples? I'm curious because I have a very sensitive BS detector. In fact, just about anyone asking for examples, like the GP, has a sensitive BS detector.
I want to compare the complexity of my questions to the complexity of yours. Here's my most recent one; it's one where I'm fully capable of judging the level of BS in the answer:
I want to parse markdown into a structure. Leaving aside the actual structure for now, give me an exhaustive list of markdown syntax that I would need to parse.
It gave me a very large list, pointing out CommonMark-specific stuff, etc.
I responded with:
I am seeing some problems here with the parsing:

1. Newlines are significant in some places but not others.
2. There are some ambiguities (for example, nested lists which may result in more than four spaces at the deepest level can be interpreted as either nested lists or a code block).
3. Autolinks are also ambiguous: how can we know that the tag is an autolink and not HTML which must be passed through?

There are more issues. Please expand on how they must be resolved. How do current parsers resolve the issues?
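For what it's worth, CommonMark resolves ambiguity 2 by measuring indentation relative to the enclosing container rather than the document margin. Here's a minimal sketch using the commonmark-java reference parser; the input strings are my own toy examples, not anything from the spec:

```java
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class IndentAmbiguity {
    public static void main(String[] args) {
        Parser parser = Parser.builder().build();
        HtmlRenderer renderer = HtmlRenderer.builder().build();

        // At the document margin, four leading spaces start an
        // indented code block.
        String topLevel = "    four spaces\n";
        System.out.println(renderer.render(parser.parse(topLevel)));
        // -> <pre><code>four spaces\n</code></pre>

        // Inside a nested list, the same four spaces are measured
        // relative to the inner list item's content column, so the
        // line stays list content instead of becoming a code block.
        String nested = "- outer\n  - inner\n    four spaces\n";
        System.out.println(renderer.render(parser.parse(nested)));
        // -> nested <ul>s; "four spaces" continues the "inner" paragraph
    }
}
```

That relative-indentation rule is exactly the kind of detail a hand-wavy answer will skip, which makes this question a decent BS probe.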
Today, I asked Google if there was a constant-time string comparison algorithm in the JRE. It told me "no, but you can roll your own". Then I perused the links and found that MessageDigest.isEqual exists.
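For anyone who hits the same question: MessageDigest.isEqual is a static method and, despite living on MessageDigest, it compares arbitrary byte arrays; modern JDKs document it as time-constant. A quick sketch (the token strings are made up):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ConstantTimeCompare {
    public static void main(String[] args) {
        byte[] expected = "s3cret-token".getBytes(StandardCharsets.UTF_8);
        byte[] provided = "s3cret-tokem".getBytes(StandardCharsets.UTF_8);

        // Time-constant comparison: examines every byte instead of
        // bailing out at the first mismatch (lengths must still match).
        boolean ok = MessageDigest.isEqual(expected, provided);
        System.out.println(ok); // false
    }
}
```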