Hello, I am the author of VkFFT, Tolmachev Dmitrii.
I remember VkFFT got a lot of initial traction thanks to Hacker News three years ago. Back then VkFFT was a simple collection of pre-made shaders for power-of-two FFTs.
Nowadays it is based on a runtime code generation and optimization platform that supports all the mentioned backends, has a wide range of implemented algorithms (some of which are not even present in other codes) to cover all system sizes, and can do things no other GPU FFT library can so far (like real-to-real transforms, arbitrary-dimensional transforms, zero-padding, convolutions and more).
If you have any questions about the library, design choices, functionality or anything else, I will be happy to answer them!
The way it works now, with run-time codegen, it should be called Fast Fourier Transform Fixed That For You (FFTFTFY), which also somewhat evokes a butterfly diagram.
Vulkan is a terrible standard. And by relation, DirectX is now too, considering they're identical. The amount of absolutely worthless boilerplate is through the roof.
I had this complaint myself not long ago, but Vulkan isn't targeted at new computer graphics programmers just learning about the graphics pipeline, or vertex and fragment shaders. It's certainly not optimised for the 'hello triangle' use case—your complaint is equivalent to someone saying 'why do I need #include <cstdio>, and a main() just to print "hello world"?' A lot of that boilerplate is write-once, meaning it's first-time setup that will never have to be done again after actually initialising the GPU. You'll find that extending that example to include texture mapping, mipmapping, supersampling, and even GPU compute will be a lot easier than the initial code.
Vulkan is targeted at graphics and game engine developers who want to extract the maximum possible performance from their GPUs and know the limitations of a global-state API like OpenGL. Vulkan allows extremely fine-grained pipeline management and synchronisation primitives, and adds hardware ray-tracing support natively without having to awkwardly bolt it on.
If all you want is to draw a three-coloured triangle, you can do that easier and faster with ShaderToy instead of fudging with Vulkan. If you want to write a fast, powerful, modern graphics engine, then you use Vulkan or D3D12.
People forget that GPUs are now massive slices of silicon with memory and power subsystems in their own right, and are obviously extremely powerful hardware. At some point, OpenGL itself becomes a bottleneck, or is difficult enough to program with that Vulkan becomes easier, and that's when the true utility and power of its extreme verbosity is displayed.
For the record, the boilerplate isn't '921 lines of nothing'—it's effectively setting the GPU up from scratch, similar to bootstrapping a CPU from 16-bit real mode.
Except that with the deprecation/stagnation of OpenGL, Vulkan is the only native 3D API from Khronos that new computer graphics programmers have left to learn.
Using middleware instead might be the answer.
Then we don't really need Vulkan, as the middleware already lets us use the best 3D API on each platform.
I'd like to see this tested on a 210/250 with ROCm 5.6. There are improvements in the latest release that might affect the benchmarks in a positive way.
To a first approximation, Kompute[1] is that. It doesn't seem to be catching on; I'm seeing more buzz around WebGPU solutions, including wonnx[2] and more hand-rolled approaches, and IREE[3], the latter of which has a Vulkan back-end.
Very impressive performance. I'd be happy to see a comparison with regular CPU performance... If you add up the GPU time plus GPU upload and download, is it faster than the CPU overall?
Did NVIDIA's ascent to a $1T company result primarily from their substantial software investments, or is there another element that AMD needs to focus on to achieve similar recognition and adoption in the realm of GPU compute?
It's mostly software. Nobody gives a damn about your AI chip even if it is better than what AMD has, and AMD's hardware is no slouch.
The other factor is that AMD's data center hardware is not available at any cloud provider, so nobody even has access to the supposedly supported hardware.
Right, I know, but what's the advantage of GPU vs. CPU for FFT, considering CPUs support some vectorization and you need to format the data and send it to the GPU and back?
As far as I understand, that's not a very meaningful question because it depends on which CPU and which GPU. So it's a bit apples to oranges and depends on the user's configuration. There is a benchmark at the very bottom: https://openbenchmarking.org/test/pts/vkfft
Also, maybe a bit obvious, but even if there is no huge benefit, sending compute to the GPU frees up your CPU/application to do other things, like keeping your application responsive :)
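To make the "GPU time + upload & download vs. CPU" question concrete, here is a rough back-of-the-envelope sketch in Python. It is not VkFFT code and not a benchmark: it uses the conventional 5·N·log2(N) operation count for a complex FFT, and every throughput and bandwidth figure passed in is a hypothetical placeholder you would replace with numbers measured on your own hardware. Real FFTs are usually limited by memory bandwidth, so treat the result as an order-of-magnitude guide only.

```python
import math


def fft_flops(n: int) -> float:
    """Conventional operation count for a complex FFT of length n."""
    return 5.0 * n * math.log2(n)


def estimate_times(n, cpu_gflops, gpu_gflops, pcie_gb_per_s, bytes_per_element=8):
    """Return (cpu_time, gpu_compute_time, transfer_time) in seconds.

    All rates are assumptions supplied by the caller (sustained GFLOP/s
    and GB/s). bytes_per_element=8 corresponds to single-precision complex.
    """
    flops = fft_flops(n)
    cpu_time = flops / (cpu_gflops * 1e9)
    gpu_compute_time = flops / (gpu_gflops * 1e9)
    # One upload plus one download of the whole array.
    transfer_time = 2 * n * bytes_per_element / (pcie_gb_per_s * 1e9)
    return cpu_time, gpu_compute_time, transfer_time


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the arithmetic.
    cpu, gpu, xfer = estimate_times(2**20, cpu_gflops=10, gpu_gflops=1000, pcie_gb_per_s=10)
    print(f"CPU: {cpu * 1e3:.3f} ms")
    print(f"GPU compute: {gpu * 1e3:.3f} ms, transfer: {xfer * 1e3:.3f} ms")
    print(f"GPU total: {(gpu + xfer) * 1e3:.3f} ms")
```

With these made-up inputs the transfer dominates the GPU side, which is why the answer depends so heavily on whether the data already lives on the GPU and on how many transforms you run per upload.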
But not backed by VkFFT. The implication of the comment is that it would make FFTs on various backends easier if it was implemented on VkFFT in the first place. Not sure that's true though, as I don't know how much code the various backends share.
Edit: as an example that I experienced first-hand, coremltools, which converts PyTorch to CoreML models, only gained FFT support very recently. It's also not really a PyTorch backend but a PyTorch code converter, though, so wouldn't benefit at all from PyTorch's FFT being backed by VkFFT. Still, good example that one shouldn't take FFTs for granted.