I consistently have exactly the opposite experience. ChatGPT seems extremely willing to do a huge number of searches, think about them, and then kick off more searches after that thinking, think about it, etc., etc. whereas it seems like Gemini is extremely reluctant to do more than a couple of searches. ChatGPT also is willing to open up PDFs, screenshot them, OCR them and use that as input, whereas Gemini just ignores them.
I will say that it is wild, if not somewhat problematic that two users have such disparate views of seemingly the same product. I say that, but then I remember my own experience just from few days ago. I don't pay for gemini, but I have paid chatgpt sub. I tested both for the same product with seemingly same prompt and subbed chatgpt subjectively beat gemini in terms of scope, options and links with current decent deals.
It seems ( only seems, because I have not gotten around to test it in any systematic way ) that some variables like context and what the model knows about you may actually influence quality ( or lack thereof ) of the response.
> I will say that it is wild, if not somewhat problematic that two users have such disparate views of seemingly the same product.
This happens all the time on HN. Before opening this thread, I was expecting that the top comment would be 100% positive about the product or its competitor, and one of the top replies would be exactly the opposite, and sure enough...
I don't know why it is. It's honestly a bit disappointing that the most upvoted comments often have the least nuance.
How much nuance can one person's experience have? If the top two most visible things are detailed, contrary experiences of the same product, that seems a pretty good outcome?
Also, why introduce nuance for the sake of nuance? For every single use case, Gemini (and Claude) has performed better. I can’t give ChatGPT even the slightest credit when it doesnt deserve any
Chatgpt is not one model! Unless you manually specify to use a particular model your question can be routed to different models depending on what it guesses would be most appropriate for your question.
Yes but then what does the grandparent mean with “unless you specify a specific model” ? Do they mean “if you select auto, it automatically decides between instant or thinking” ?
Because neither product has any consistency in its results, no predictive behaviour. One day it performs well, another it hallucinates non existing facts and libraries. Those are stochastic machines
I see the hyperbole is the point, but surely what these machines do is to literally predict? The entire prompt engineering endeavour is to get them to predict better and more precisely. Of course, these are not perfect solutions - they are stochastic after all, just not unpredictably.
Prompt engineering is voodoo. There's no sure way to determine how well these models will respond to a question. Of course, giving additional information may be helpful, but even that is not guaranteed.
Also every model update changes how you have to prompt them to get the answers you want. Setting up pre-prompts can help, but with each new version, you have to figure out through trial and error how to get it to respond to your type of queries.
I can't wait to see how bad my finally sort-of-working ChatGPT 5.1 pre-prompts work with 5.2.
It definitely isn’t voodoo, it’s more like forecasting weather. Some forecasts are easier to make, some are harder (it’ll be cold when it’s winter vs the exact location and wind speed of a tornado for an extreme example). The difference is you can try to mix things up in the prompt to maximize the likelihood of getting what you want out and there are feasibility thresholds for use cases, e.g. if you get a good answer 95% of the time it’s qualitatively different than 55%.
No, it's not. Nowadays we know how to predict the weather with great confidence. Prompting may get you different results each time. Moreover, LLMs depend on the context of your prompts (because of their memory), so a single prompt may be close to useless and two different people can get vastly different results.
And I’d really like for Gemini to be as good or better, since I get it for free with my Workspace account, whereas I pay for chatgpt. But every time I try both on a query I’m just blown away by how vastly better chatgpt is, at least for the heavy-on-searching-for-stuff kinds of queries I typically do.
It’s like having 3 coins and users preferring one or the other when tossing it because one coin gives consistently more heads (or tails) than the other coin.
What is better is to build a good set of rules and stick to one and then refine those rules over time as you get more experience using the tool or if the tool evolves and digress from the results you expect.
<< What is better is to build a good set of rules and
But, unless you are on a local model you control, you literally can't. Otherwise, good rules will work only as long as the next update allows. I will admit that makes me consider some other options, but those probably shouldn't be 'set and iterate' each time something changes.
what I had in mind when I added that comment was for coding, with the use of .md files.
For the web version of chats I agree there is little control on how to tailor the way you want the agent to behave, unless you give a initial "setup" prompt.
They're not really capable of producing varying answers based on load.
But they are capable of producing different answers because they feel like behaving differently if the current date is a holiday, and things like that. They're basically just little guys.
Tesla FSD has been more or less the same experience. Some people drive 100s of miles without disengaging while others pull the plug within half a mile from their house. A lot of it depends on what the customer is willing to tolerate.
We've been having trouble telling if people are using the same product ever since Chat GPT first got popular. The had a free model and a paid model, that was it, no other competitors or naming schemes to worry about, and discussions were still full of people talking about current capabilities without saying what model they were using.
For me, "gemini" currently means using this model in the llm.datasette.io cli tool.
openrouter/google/gemini-3-pro-preview
For what anyone else means? If they're equivalent? If Google does something different when you use "Gemini 3" in their browser app vs their cli app vs plans vs api users vs third party api users? No idea to any of the above.
Same, I use chatgpt plus (the entry-level paid option) extensively for personal research projects and coding, and it seems miles ahead of whatever "Gemini Pro" is that I have through work. Twice yesterday, gemini repeated verbatim a previous response as if I hadn't asked another question and told it why the previous response was bad. Gemini feels like chatGPT from two years ago.
Are you uploading PDFs that already have a text layer?
I don't currently subscribe to Gemini but on A.I. Studio's free offering when I upload a non OCR PDF of around 20 pages the software environment's OCR feeds it to the model with greater accuracy than I've seen from any other source.
I'm sorry but I don't see what "knowledge cutoff" has to do with what we were talking about- which is using a LLM find PDFs and other sources for research.
I agree with you. To me, gemini has much worse search results. Then again, I use kagi for search and I cannot stand the search results from Google anymore. And its clear that gemini uses those.
In contrast, chatgpt has built their own search engine that performs better in my experience. Except for coding, then I opt for Claude opus 4.5.
I’ve done this as well. I’m only making twice what I was 30 years ago at 18. (Though to be fair, I was making a ridiculous amount of money for a 18 year old - my first job was in the Qwest network operations center.)
No no, spend three months in an untraceable maze of ICE holding facilities, then to jail, then deported to a 'shithole' country that you didn't come from in the first place.
Exactly. I give services like this - generally coded as someone's first "wow I know PHP now!" or the modern equivalent - approximately 5 years shelf life, at best.
Whereas I have notes-to-future-me on my calendar that I put there 30 years ago.
What calendar system have you been using for 30 years, that's survived that long?
I think I sent one of those "mails to the future" in the 90's, asking 2002 me how I am. I don't think it ever arrived, or the free email domain I was using ceased operating.
Sheesh, anyone old enough to remember the services offering a free email address with a choice of maybe 50 domains in a dropdown?
This was my first exposure to the Fourier transfer. I also highly recommend this book. It was recommended to me by the head of the math department at university.
Reminds me of Harry Turtledove's The Road Not Taken.
https://en.wikipedia.org/wiki/The_Road_Not_Taken_(short_stor...
reply