> AI can displace human work but not human accountability. It has no skin and faces no consequences.
Let's assume that we have amazing AI and robotics, better than humans at everything. If you could choose between robosurgery (completely automated) with 1% mortality for 5,000 USD and surgery performed by a human with 10% mortality and a 50,000 USD price tag, would you really choose the human just because you can sue them? I wouldn't. I don't think anyone thinking rationally would.
The relation won't invert, because it's very easy and quick to train a guy to pile up bricks, while training an architect is slow and hard. If low-skilled jobs pay much better than high-skilled ones, people will simply change jobs.
That's only true as long as the technical difficulty isn't covered by the tech itself.
Think of a world where software engineering itself is handled reasonably well by the LLM and the job of the engineer becomes just collecting business requirements and checking that they're correctly addressed.
In that world, the limit on scarcity might lie less in the difficulty of training and more in the willingness to bend your back in the sun for hours versus comfortably writing prompts in an air-conditioned room.
Right now there are enough people willing to bend their backs in the sun for hours that their salaries are much lower than those of engineers. Do you think the supply of such people will somehow drop once their wages are higher and employment opportunities in office jobs are much scarcer? I highly doubt it.
My argument is not that those people's salaries will go up until they overtake the engineers'.
It's the opposite: the value of office/intellectual work will tank, while manual work remains stable. The barrier to entry for intellectual work gets lower, if a position even needs to be filled at all, and the working conditions are much more comfortable.
That's completely wrong, and you are just making things up to fit your anticapitalist worldview. I know people who take dope to break 3 hours in the marathon; trust me, no one thinks breaking 3 hours will bring you any financial benefit.
Of course they are. I wouldn't expect otherwise :)
But the price we're paying (and I don't mean money) is very high, imho. We all talk about how good engineers write code that depends on high-level abstractions instead of low-level details, allowing us to replace third-party dependencies easily and test our apps more effectively, keeping the core of our domain "pure". Well, isn't it time we started doing the same with LLMs? I'm not talking about MCP, but rather an open source tool that can plug into either free and open source LLMs or private ones. That would at least allow us to switch to a free and open source version if the companies behind the private LLMs go rogue. I'm afraid, though, that wouldn't be enough, but it's a starting point.
To give an example: what would you think if you had to pay for every single Linux process on your machine? Or for every Git commit you make? Or for every debugging session you perform?
> an open source tool that can plug into either free and open source LLMs or private ones
Fortunately there are many of these that can integrate with offline LLMs through systems like LiteLLM/Ollama/etc. Off the top of my head, I'd look into Continue, Cline and Aider.
> I'm not talking about MCP, but rather an open source tool that can plug into either free and open source LLMs or private ones. That would at least allow us to switch to a free and open source version if the companies behind the private LLMs go rogue. I'm afraid, though, that wouldn't be enough, but it's a starting point.
There are open source tools that do exactly that already.
Ah, well, that's nice. But not a single post I read mentions them, so I assume they're not popular for some reason. Again, my main point here is the normalization of using private LLMs. I don't see anyone talking about it; we are all just handing over a huge part of what it means to build software to a couple of enterprises whose goal is, of course, to maximize profit. So yeah, perhaps I'm overthinking it, I don't know; I just don't like that these companies are now so ingrained in the act of building software (just like AWS is so ingrained in the act of running software).
Because the models are so much worse that people aren't using them.
Philosophical battles don't pay the bills and for most of us they aren't fun.
There have been periods of my life where I stubbornly persisted in using something inferior for various reasons - maybe I was passionate about it, maybe I wanted it to exist and was willing to spend my time debugging and offering feedback - but there are a finite number of hours in my life, and often I'd much rather pay for something that works well than throw my heart, soul, time, and blood pressure at something that will only give me pain.
> I'm not talking about MCP, but rather an open source tool that can plug into either free and open source LLMs or private ones.
Has someone computed or estimated the at-cost dollar value of utilizing these models at full tilt: several messages per minute with at least 500,000-token context windows? What we need is a Wikipedia-like effort to support something truly open and continually improving in quality.
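As a rough back-of-the-envelope sketch of how such an estimate would go (every number below is a hypothetical assumption, not a quoted price):

```python
# Back-of-the-envelope: cost of resending a full context window at a steady
# message rate. All numbers are hypothetical assumptions, not real prices.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, assumed
CONTEXT_TOKENS = 500_000               # full window resent per message (worst case)
MESSAGES_PER_MINUTE = 3                # "several messages per minute"

tokens_per_hour = CONTEXT_TOKENS * MESSAGES_PER_MINUTE * 60
cost_per_hour = tokens_per_hour / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS
print(f"{tokens_per_hour:,} tokens/h -> ${cost_per_hour:,.2f}/h")
# 90,000,000 tokens/h -> $270.00/h, before output tokens or prompt caching
```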
> Well, isn't it time we started doing the same with LLMs? I'm not talking about MCP, but rather an open source tool that can plug into either free and open source LLMs or private ones.
Almost all providers and models can be used through the OpenAI API, and swapping between them is trivial.
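For instance, a minimal sketch (assuming a local Ollama server, which exposes an OpenAI-compatible endpoint at /v1; the model name and prompt are placeholders):

```python
# Same client code either way; only base_url, api_key, and model change.
from openai import OpenAI

# Hosted proprietary model:
# client = OpenAI(api_key="sk-...")  # defaults to https://api.openai.com/v1

# Local open model via Ollama's OpenAI-compatible endpoint (assumed running):
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="llama3.1",  # placeholder: any model the local server has pulled
    messages=[{"role": "user", "content": "Explain this stack trace: ..."}],
)
print(resp.choices[0].message.content)
```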
None of that applies here, since we could all easily switch to open models at a moment's notice and at limited cost. In fact, we switch between proprietary models every few months.
It just so happens that closed models are better today.
They are a little better. Sometimes that little bit is an activation-energy level of difference. But overall, I don't see a huge amount of difference in quality between the open and closed models. Most of the time, it just takes a little more effort to get results out of the open models that match the closed ones.
The scope and creativity required for the IMO are much bigger than for chess/go. Also, the IMO is taken VERY seriously. It's a huge deal, much bigger than any chess or go tournament.
In my opinion, competitive math (or programming) is about knowing a set of tricks and then trying to find a combination of them that works for a given task. The number of tricks and the depth required are much smaller than in go or chess.
I don't think it's a very creative endeavor in comparison to chess/go. The search required is shallower as well. There is a challenge in processing natural language and producing solutions in it, though.
The creativity required is not even a small fraction of what scientific breakthroughs demand. After all, no task that you can solve in 30 minutes or so can possibly require that much creativity - just knowledge and a fast mind, things computers are amazing at.
I am an AI enthusiast. I just think a lot of the things done so far are more impressive than being good at competitive math. It's a nice result, blown out of proportion by OpenAI employees.
I'd disagree with this take. Math olympiads are some of the most intellectually creative activities I've ever done that fit within a one-day time limit. Chess and go don't even come close--I am not a strong player, but I've studied both games for hundreds of hours. (My hot take is that chess is not even very creative at all; that's why classical AI techniques produced superhuman results many years ago.)
There is no list of tricks that will get a silver, much less a gold, medal at the IMO. The problem setters try very hard to choose problems that are not just variations of other contests or solvable by routine calculation (indeed, some types of problems, like polynomial inequalities, fell out of favor as near-universal techniques made them too routine for well-prepared students). Of course there are common themes and patterns that recur--no way around it given the limited curriculum they draw on--but overall I think the IMO does a commendable job of encouraging out-of-the-box thinking within a limited domain. (I've heard a contestant say that IMO prep was memorizing a lot of template solutions, but he was such a genius among geniuses that I think his opinion is irrelevant to the rest of humanity!)
Of course, there is always debate over whether competition math reflects skill in research math and other research domains. There are obvious areas of overlap and obvious areas of difference, so it's hard to extrapolate from AI math benchmarks to other domains. But I think it's fair to say the skills needed for the IMO include quite general quantitative reasoning ability, which is very exciting to see LLMs develop.
What you are missing about chess and go is that those games are not about finding one true solution. They are very psychological games (at human level) and are about finding moves that are difficult to handle for the opponent. You try to understand how your opponent thinks and what is going to be unpleasant for them. This gives a lot of scope for creative and psychological warfare.
In competitive math (or programming) there is one correct solution and no opponent. It's just not possible for it to be a very creative endeavor if those solutions can be found in a very limited time.
>>(I've heard a contestant say that IMO prep was memorizing a lot of template solutions, but he was such a genius among geniuses that I think his opinion is irrelevant to the rest of humanity!)
So you have not only chosen to ignore the view of someone who is very good at it, but also assumed that even though the best preparation for him was memorizing a lot of solutions, it must be about creativity for people who are not geniuses like this guy? How does that make sense at all?
> They are very psychological games (at human level) and are about finding moves that are difficult to handle for the opponent. You try to understand how your opponent thinks and what is going to be unpleasant for them. This gives a lot of scope for creative and psychological warfare.
And yet even basic models that can run on my phone win this psychological warfare against the best players in the world. The scope of problems at the IMO is unlimited. Please note that the IMO is won by literally the best high-school students in the world, and most of them are unable to solve all the problems (even gold medal winners). Do you think they are dumb and unable to learn "a few tricks"?
>In competitive math (or programming) there is one correct solution and no opponent. It's just not possible for it to be a very creative endeavor if those solutions can be found in a very limited time.
That's absurd. You could say the same things about math research (and "one correct solution" would be wrong there, just as it is for the IMO). Do you consider math research something that's not creative?
>>Do you think they are dumb and unable to learn "a few tricks"?
They are just slow because they are humans. It's like in chess: if you calculate a million times faster than a human, you will win even if you're pretty dumb (old-school chess programs). Soon enough ChatGPT will be able to solve IMO problems at the international level. It still can't play chess.
>>That's absurd. You could say the same things about math research (and "one correct solution" would be wrong there, just as it is for the IMO). Do you consider math research something that's not creative?
Have you missed the other condition?
No meaningful math research can be done in 30-60 minutes (the time you have per IMO problem). Nothing of value that requires creativity can be done in a short time. Creativity requires forming a mental model, exploration, trying various paths, making connections. That requires time.
My point about math competitions not being taken as seriously also stands. People train chess or go for 10-12 years before they start peaking, and they often keep improving after that. That is a lot of hours every day. Math competitions aren't practiced for that many hours or years, and almost no one does them anymore once in college.
This means the level at those competitions must be low in comparison to endeavors people pursue professionally.
I have pipelines written in both frameworks.
Nextflow (despite the questionable selection of Groovy as the language of choice) is more powerful and enables greater flexibility in terms of information flow.
For example, Snakemake makes it very difficult, if not impossible, to create pipelines that deviate from a DAG architecture. In cases where you need loops, conditionals, and so on, Nextflow is a better option.
One thing that I didn't like about Nextflow is that all processes must run under either Apptainer or Docker; you can't mix and match Docker/Apptainer per process the way you can in Snakemake rules.
Can you describe a scenario that would be impossible to code for in a Snakemake paradigm? For example, at least with conditionals, I imagine you could bake some flags into the output filename and have different jobs parse that. I'm not sure exactly what you mean by loop, but if it's iterating over something, that can probably be handled with expand or lambda functions.
Here is a scenario which is relatively trivial in Nextflow and difficult to write in Snakemake:
1. A process that "generates" protein sequences
2. A collection of processes that perform computationally intensive downstream computations
3. A filter that decides, based on some calculation and a threshold, whether the output of process (1) should move on to process (2).
Furthermore, assume you'd like process (1) to keep generating new candidates continuously and independently until N candidates pass the filter for downstream processing.
That's not something you can do easily with Snakemake, since it generates the DAG before computation starts.
Sure, you can create some hack, or use checkpoints that force Snakemake to re-evaluate the DAG, and maybe add --keep-going so that one failure doesn't take down the other processes, but with Nextflow you just set up a few channels as queues and connect them to processes, which is much easier. A sketch of that dataflow follows.
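To make the control flow concrete, here is a minimal Python sketch of the generate/filter loop (all names are hypothetical stand-ins; in Nextflow, the generator, the filter, and the downstream steps would each be a process connected by channels rather than in-process function calls):

```python
# Sketch of the feedback loop: keep generating candidates until N of them
# pass a cheap filter, then hand survivors to the expensive downstream steps.
# All functions here are hypothetical stand-ins for real pipeline processes.

N_WANTED = 10

def generate_candidates():
    """Stand-in for process (1): yields candidate protein sequences forever."""
    seq_id = 0
    while True:
        yield f"protein-{seq_id}"
        seq_id += 1

def passes_filter(seq: str) -> bool:
    """Stand-in for the filter (3): cheap score checked against a threshold."""
    return hash(seq) % 3 == 0

accepted = []
for seq in generate_candidates():
    if passes_filter(seq):
        accepted.append(seq)  # in Nextflow, this would feed process (2) via a channel
    if len(accepted) >= N_WANTED:
        break                 # generation stops once N candidates have passed
print(accepted)
```

A static DAG can't express this loop, because how many candidates get generated is only known at runtime.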
Just make your N-candidates check generate an empty sentinel file once N is reached, and put that file as an input for the next job. For the threshold example you can do the same thing, or even bake the metric into a filename.
As I said, you can probably hack your way through Snakemake using DAG re-evaluation and filename tricks, but Nextflow allows it in a much more straightforward manner that's easier to follow, understand, and debug.
I've used both. I would say Nextflow is a more production-oriented tool. Check out the Seqera Platform to see if any of the features there seem useful. It can also be useful to get out of the wildcards/files mindset for certain workflows. Nextflow chucks the results of each step into a hashed folder, so you don't have to worry about unique output names.
That said, I do find Snakemake easier to prototype with. And it also has plenty of production features (containers, cloud, etc.). For many use cases, they're functionally equivalent.