"You need to take care so the synchronization isn't eating your performance wins though. I know you wrote "in a sensible manner", so you probably already thought of that."
Like I said, I haven't gotten too far into it, but another idea that keeps coming to mind is to have some sort of cost model in the language. One of my other I Have A Dream elements of a new language is the ability to make assertions about blocks of code, like "this code is inlined", or (in a GC'ed language) "this variable is stack allocated", or, in the case of crypto code, "this code is optimized only to constant-time operations". These assertions wouldn't make the asserted thing happen, because experience over the decades shows that that doesn't work (witness the way inline pragmas have gotten consistently weaker over the years, to the point that I believe they're no-ops in many environments). Instead, they'd be signs the compiler checks on the user's behalf: if the constraint is violated, you get a clear message explaining why.
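To make that concrete, here's a rough sketch of what the assertion side might look like, in C++-flavored syntax. The attributes are entirely made up (no compiler implements them); the function bodies are just ordinary code:

    // Hypothetical attributes: the point is that each one is checked,
    // not enforced. If the property doesn't hold after optimization,
    // compilation fails with an explanation of why.

    [[assert::inlined]]            // error if any call site isn't inlined
    inline int dot3(const int* a, const int* b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    [[assert::constant_time]]      // error if codegen branches on the data
    bool tag_equal(const unsigned char* a, const unsigned char* b, int n) {
        unsigned char diff = 0;
        for (int i = 0; i < n; ++i)
            diff |= a[i] ^ b[i];   // no early exit: same work for any input
        return diff == 0;
    }

The compiler would remain free to say no; the value is in the diagnostic, not in forcing the optimizer's hand.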
The idea could be adapted to something like "this block of code copies the data to the GPU only once", or some variant of that claim (each byte copied at most once).
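Sketching that in the same made-up attribute style (cudaMemcpy is the real CUDA runtime call; the attribute and process_on_gpu are stand-ins I've invented for illustration):

    #include <cuda_runtime.h>

    void process_on_gpu(float* dev, size_t n);  // stand-in kernel wrapper

    void pipeline(const float* host, float* dev, size_t n) {
        // Hypothetical: each byte of `host` may cross to the device at
        // most once within this block.
        [[assert::copies_to_gpu_once(host)]]
        {
            cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
            process_on_gpu(dev, n);  // first stage uses the device copy
            process_on_gpu(dev, n);  // second stage reuses it: fine
            // A second cudaMemcpy from `host` here would be rejected,
            // with a message pointing at both transfer sites.
        }
    }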
I think a similar thing could be used for some interesting network/cloud transparency, too: the language lets you wire up function calls/messages/whatever transparently, but you can use labeling to make compile-time (or possibly start-up-time) assertions about how expensive the result is going to be. I've also tossed around the idea that the function call doesn't have to be the fundamental primitive; maybe something weaker should be the cross-module default, like an Erlang message pass. One of the reasons networks are such a pain to work with in conventional programming languages is the mismatch between function calls and network operations; it would be interesting to expand on Erlang and see whether weakening the function call across modules could address that.
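One way to picture the weaker primitive: a cross-module "call" that desugars into a send plus an explicit wait, so the latency and failure mode are visible rather than hidden inside call syntax. A sketch in standard C++, with std::future standing in for the message-pass machinery (the function name is just for illustration):

    #include <future>
    #include <string>

    // Stand-in for a cross-module boundary: the caller writes what looks
    // like a call, but it lowers to send-then-wait, which is honest about
    // the cost whether the callee is in-process or across a network.
    std::future<std::string> lookup_user(int id) {
        return std::async(std::launch::async, [id] {
            return "user-" + std::to_string(id);  // maybe a network hop
        });
    }

    int main() {
        auto reply = lookup_user(42);     // the "send" half
        std::string name = reply.get();   // the wait is explicit, so a
                                          // cost label could attach here
    }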
Anyhow, like I said, I'll never get to any of this, but I do sort of feel like there's a surprising amount of room out there for a new language, and it doesn't get explored very well because people keep remaking Python, or something like D, over and over with slightly different spelling.
You probably already know of it, but you might find Halide interesting. It aims to separate the algorithm (what you compute) from the schedule (how you compute it: tiling, vectorization, parallelism), without giving up performance.
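For anyone who hasn't seen it, the canonical example from the Halide paper is a 3x3 blur where the two halves are stated separately (reproduced from memory, so treat the details as approximate):

    #include "Halide.h"
    using namespace Halide;

    int main() {
        ImageParam input(UInt(16), 2);
        Func blur_x, blur_y;
        Var x, y, xi, yi;

        // The algorithm: what is computed.
        blur_x(x, y) = (input(x - 1, y) + input(x, y) + input(x + 1, y)) / 3;
        blur_y(x, y) = (blur_x(x, y - 1) + blur_x(x, y) + blur_x(x, y + 1)) / 3;

        // The schedule: how it's computed. Changing these lines changes
        // speed, never the output.
        blur_y.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
        blur_x.compute_at(blur_y, x).vectorize(x, 8);
        return 0;
    }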
Also potentially PyCUDA, and the other things Google autocompletes when you type "pyCUDA vs".
None of them are exactly what you're talking about, of course.
This is one of those things on my list to try out one of these days. If I'm not mistaken, it's what Google uses to implement the image algorithms in their Google Camera Android app. Marc Levoy is one of the co-authors of the Halide paper [0] and now works at Google, so it's a natural fit. Do you happen to know how widely it's used in industry?
If you mean me (and I guess others aren't likely to come across this conversation):
I was very impressed by Halide, but I'm not involved and have no idea how widely it is used, sorry. I could imagine that many potential users enjoy hand-tweaking GPU and SIMD code enough, or have enough confidence in their results, that they never give it a serious look.