But if you want to predict the future utility of these models, you want to look at their current intelligence, compare that to humans, and try to figure out roughly what skills they lack and which of those are likely to get fixed.

For example, a team of humans is extremely reliable, much more reliable than one human, but a team of AIs isn't more reliable than one AI, since an AI is already an ensemble model. That means even if an AI could replace a person, it probably can't replace a team for a long time. You still need the other team members there, so the AI didn't really replace a human; it just became a tool for humans to use.




I think this is a fair criticism of capability.

I personally wouldn't be surprised if we start to see benchmarks around this type of cooperation and ability to orchestrate complex systems in the next few years or so.

Most benchmarks focus on a single problem, not on multiple real-time problems while orchestrating third-party actors who might or might not be able to succeed at certain tasks.

But I don't think anything prevents these models from being able to do that.



