
I was really confused for a moment, because the article mentions CUDA a lot, which is an NVIDIA-specific API/framework/language. I guess that's mainly to appeal to the CUDA crowd? However, since Julia is based on LLVM, interfacing it with AMD GPUs should be quite doable:

> Much of the initial work focused on developing tools that make it possible to write low-level code in Julia. For example, we developed the LLVM.jl package that gives us access to the LLVM APIs. Recently, our focus has shifted towards generalizing this functionality so that other GPU back-ends, like AMDGPU.jl or oneAPI.jl, can benefit from developments to CUDA.jl. Vendor-neutral array operations, for example, are now implemented in GPUArrays.jl, whereas shared compiler functionality now lives in GPUCompiler.jl. That should make it possible to work on several GPU back-ends, even though most of them are maintained by only a single developer.

This is the takeaway for me: first-class access to GPU accelerators using the same syntax, regardless of the vendor :)
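To make that concrete, here is a minimal sketch of the vendor-neutral syntax (assuming CUDA.jl and AMDGPU.jl are installed; the array size is arbitrary):

    using CUDA        # NVIDIA back-end; swap in `using AMDGPU` for AMD

    xs = CuArray(rand(Float32, 1024))   # ROCArray(rand(Float32, 1024)) on AMD
    ys = 2f0 .* xs .+ 1f0               # broadcast compiles to a GPU kernel
    sum(ys)                             # vendor-neutral reduction from GPUArrays.jl

Everything past the `using` line is identical across back-ends, since both array types implement the GPUArrays.jl interface.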




Yes, it would be great if the .jl source code didn't even mention CUDA (but right now it does, with statements such as "using CUDA" and "CuArray(...)").
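Until then, one way to keep most of your own code free of CUDA references is to confine them to the top level and write everything else against AbstractArray. A sketch (`saxpy!` is a made-up name, not part of any of these packages):

    using CUDA

    # Generic code: never mentions CUDA, and also works on plain CPU arrays.
    function saxpy!(y::AbstractArray, a::Number, x::AbstractArray)
        y .= a .* x .+ y    # broadcast dispatches to the right back-end
        return y
    end

    x = CuArray(ones(Float32, 100))   # the only vendor-specific lines
    y = CUDA.zeros(Float32, 100)
    saxpy!(y, 2f0, x)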


Yes, that's fair. I focused on CUDA.jl because it is the most mature, easiest to install, etc. but as I mentioned we're actively working on generalizing that support as much as possible, and as a result support for AMD (AMDGPU.jl) and Intel (oneAPI.jl) GPUs is rapidly catching up.


This is a complete novice, ill-informed question, so forgive it in advance: why have an AMD-specific backend at all? Couldn't you just use AMD's HIP/HIPify tool on the CUDA backend and get an AMD-friendly version out?

https://github.com/ROCm-Developer-Tools/HIP

I realize these sorts of tools aren't magic and whatever it spits out will need work, but it seems like a really good, thin starting point for AMD support with a lower overhead for growth.

After the original CUDA bits can "cross-compile", the workflow is greatly reduced, right?

Workflow:

- Update the CUDA code

- Push it through the HIPIFY tool

- Fix what is broken (on the CUDA side, if you can)

After enough iterations, the CUDA code will grow friendly to HIPification...


> This is a complete novice, ill-informed question, so forgive it in advance: why have an AMD-specific backend at all? Couldn't you just use AMD's HIP/HIPify tool on the CUDA backend and get an AMD-friendly version out?

HIP and HIPify only work on C++ source code, via a Perl script. Since we start with plain Julia code, and we already have LLVM integrated into Julia's compiler, it's easiest to just change the LLVM "target" from Native to AMDGPU (or NVPTX in CUDA.jl's case) to get native machine code, while preserving Julia's semantics for the most part.
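To illustrate, here is roughly what that looks like from the user's side. A sketch; note that the index intrinsics are spelled differently per back-end (AMDGPU.jl uses workitemIdx/workgroupIdx, and @roc instead of @cuda), while the kernel body stays plain Julia:

    using CUDA

    function vadd!(c, a, b)
        i = threadIdx().x + (blockIdx().x - 1) * blockDim().x
        if i <= length(c)
            @inbounds c[i] = a[i] + b[i]
        end
        return nothing
    end

    a = CUDA.rand(Float32, 256)
    b = CUDA.rand(Float32, 256)
    c = similar(a)
    @cuda threads=256 vadd!(c, a, b)   # compiled through LLVM's NVPTX back-end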

Also, interfacing to ROCR (AMD's implementation of the Heterogeneous System Architecture or HSA runtime) was really easy when I first started on this, and codegen through Julia's compiler and LLVM is trivial when you have CUDAnative.jl (CUDA.jl's predecessor) to look at :)

I should also mention that not everything CUDA does maps well to AMD GPUs: CUDA's streams are generally in-order (blocking), whereas AMD's queues are non-blocking unless barriers are scheduled. Also, things like hostcall (calling a CPU function from the GPU) don't have an obvious CUDA equivalent.
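For example, with CUDA.jl the in-order behaviour means dependent operations need no synchronization between them; a sketch using only documented CUDA.jl calls:

    using CUDA

    a = CUDA.rand(Float32, 1024)
    b = a .+ 1f0     # enqueued asynchronously on the task-local stream
    c = b .* 2f0     # guaranteed to run after the previous op: streams are in-order
    synchronize()    # block the CPU until the queued work has finished

On AMD's HSA queues, that intra-queue ordering would instead require explicit barrier packets.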


Thank you for taking the time! I found this quite helpful.


Something that is hinted at, but not spelled out, in our posts is that AMD actively upstreams and maintains an LLVM back-end for their GPUs, so it really is just a matter of switching the binary target for the generated code, at least in theory :)
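In other words, at the LLVM level the vendor difference largely comes down to which target the generated IR is compiled for. The triples below are real LLVM targets; all the surrounding compiler plumbing (GPUCompiler.jl) is elided:

    # LLVM target triples for the two back-ends discussed here:
    const NVPTX_TRIPLE  = "nvptx64-nvidia-cuda"   # CUDA.jl / NVIDIA
    const AMDGPU_TRIPLE = "amdgcn-amd-amdhsa"     # AMDGPU.jl / AMD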



