I think this is the journalist playing on the idea that we don't have universally accepted definitions of "life", "virus", "viroids", "mobile genetic elements", "plasmids", or the other words that describe what I view as the agents of evolutionary games. It is kind of a catchy way to raise that conversation though, eh!
I've been using Gemini Flash for free through the API using Cline for VS Code. I switch between Claude and Gemini Flash, using Claude for more complicated tasks. Hope that the 2.0 model comes closer to Claude for coding.
Agreed - tried some sample prompts on our data and the rough vibe check is that Flash is now as good as the old Pro. If they keep the pricing the same, this would be really promising.
Sure, average humans don’t do that, but this is Hacker News, where it’s completely normal for commenters to confidently answer questions and opine on topics they know absolutely nothing about.
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
So the result might not necessarily be bad, it's just that the machine _can_ detect that you entered the wrong figures! By the way, the answer is 7.
> Can you please give an example of a “completely illogical statement” produced by o1 model? I suspect it would be easier to get an average human to produce an illogical statement.
Following the trail as you did originally: you do not hire "ordinary humans", you hire "good ones for the job"; going for a "cost competitive" bargain can be suicidal in private enterprises and criminal in public ones.
Sticking instead to the core matter: the architecture is faulty, unsatisfactory by design, and must be fixed. We are playing with the partial results of research and getting some results, even some useful tools, but it must be clear that this is not the real thing - especially since this two-plus-year-old boom brought another horribly ugly cultural degradation ("spitting out prejudice as normal").
> For simple tasks where we would alternatively hire only ordinary humans AIs have similar error rates.
Yes, if a task requires deep expertise or great care, the AI is a bad choice. But lots of tasks don't. And in those kinds of tasks even ordinary humans are already too expensive to be economically viable.
Do you have good examples of tasks in which a dubious verbal prompt could be an acceptable outcome?
By the way, I noticed:
> AI
Do not confuse LLMs with general AI. Notably, general AI was also implemented in systems where critical failures would be intolerable - i.e., made to be reliable, or part of an ultimately reliable process.