This would be very useful. I've written some OpenCL code (called from Python, but I wrote the kernel and set up the buffers in C myself) and it's full of pitfalls: much harder than just "here's an array, here's a procedure, run it in parallel for me."
I believe there are already ways to do this (IIRC, F#'s original demo showed off trivially parallelisable functions), but a simple interface (i.e. something that looks like a high-level language with a REPL, etc.) is key. As someone who's generally excited about anything that makes you write Scheme, or anything that integrates with Python, I'll be keeping an eye on Harlan!
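To make the "here's an array, here's a procedure" idea concrete, here's a rough sketch of what that kind of interface looks like using only Python's standard library. To be clear, this is not Harlan's API and it doesn't touch the GPU at all: `parallel_map` and `square` are names I made up for illustration, and it runs on CPU threads (so under CPython's GIL it won't speed up CPU-bound work). It only shows the shape of the call, with none of the buffer or queue setup.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(procedure, array, workers=4):
    """Apply `procedure` to every element of `array` on a pool of worker threads.

    Hypothetical helper for illustration only; not part of Harlan or OpenCL.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the inputs.
        return list(pool.map(procedure, array))


def square(x):
    # The "procedure": any function of a single element.
    return x * x


if __name__ == "__main__":
    print(parallel_map(square, [1, 2, 3, 4]))
```

The caller never sees devices, memory mapping, or workgroup sizes; that's the whole appeal, and presumably the hard part for a GPU backend to deliver.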
The reason it's harder is that you have to deal with device specifics as well as memory types and mapping, queues, workgroup sizes, etc. This framework might be cute, but all it's really doing is what compiler vectorizers have done for ages.