Beginners Guide to Vulkan (khronos.org)
248 points by deafcalculus on Aug 25, 2017 | 74 comments



Oculus Tiny Room is linked, but Valve also has a pretty good minimal Vulkan + OpenVR example in the OpenVR SDK called hellovr_vulkan (https://github.com/ValveSoftware/openvr/blob/master/samples/...).

On HN this might be attractive because people tend to complain about Windows-only software, and all of Valve's samples use SDL and in theory run on Mac OS X and Linux. The OpenVR example is a little bit more verbose, but it does have some extra functionality, like drawing render models (controllers) provided by the API.


Thanks for this!

I happen to have a Vive and have been wanting to get my game engine outputting to it. I am primarily focused on Linux.


Can we see your engine?


> It reminds me of a lot of the attitudes surrounding C++ when I was first learning it-- yes, it's more lines of code, but that doesn't mean it can't be a fine place for a beginner to start learning.

Sure, but Vulkan really feels like another level of complexity. OpenGL is simple in comparison.


Programming anything non-trivial in OpenGL that is intended to run efficiently on a variety of GPUs is a royal pain in the butt, because of the black box that OpenGL provides. The black box makes things simpler, but I'd rather deal with the complexity of the API than with the "hope and pray" style of programming that OpenGL accepts (encourages).


It may feel like it, but when you consider all of OpenGL's versions and extensions, which quickly become an ugly mess, that feeling disappears.

Vulkan feels more like a breath of fresh air in comparison, and it's still higher-level than the APIs we get on consoles.

I find it much simpler to architect engines with Vulkan in mind than OpenGL.


Well, libraries and frameworks have solved problems like that before and probably will again.


Low level graphics programming is tough, specialist work. Imagine a relatively simple stateful API layer over the top of this stuff, widely adopted and with an open spec. We could call it the Open Graphics Library, or OpenGL for short.


Well, as I understand it, the issue with OpenGL was that it was very high-level without an option to drop to a lower level on demand, since it was implemented in a complex GPU driver. This meant that drivers mostly had to guess what the intention was, and be optimized for specific games inside the driver. With Vulkan and a high-level library, you would have the option of dropping down to a lower level if you want and making those optimisations yourself. Keeping the complexity inside the library, not the driver, means you can modify it as you wish.


And a horribly painful codebase that is a horror to work with.

Nowadays people are more comfortable having objects that represent something and executing operations on them than binding an object to a global variable, executing a global procedure, and then unbinding it.


You think so? I think people like to call functions with comprehensible arguments.

I don't agree that people like to bind an object to a global variable, execute a global procedure, and unbind it. Does that sound like the API of anyone's dreams? The binding and unbinding sounds like pure overhead to me.

I understand that a very high-performance yet easy API may not be achievable, but seriously, the modern graphics APIs are hard to love.


> I don't agree that people like to bind an object to a global variable, execute a global procedure, and unbind it. Does that sound like the API of anyone's dreams? The binding and unbinding sounds like pure overhead to me.

Isn't that how OpenGL works?


Not in the old days, no.

    glColor3f( 1, 0, 0 );
    glBegin( GL_POLYGON );
      glVertex3f( 0, 0, 0 );
      glVertex3f( 1, 1, 0 );
      glVertex3f( 1, 2, 0 );
    glEnd();


That's exactly the global state I am complaining about.

The current primitive environment (opened by glBegin) is a global state variable.

The current vertex is a globally set variable (as you'll notice once you try to set textures and colors, because glVertex3f just sets a global variable), etc.

Try throwing two threads at this and you'll get hilarious results, and there's no way to properly fix this.

And OpenGL 1.0 to 1.5 and then 2.0 made it even worse.

You allocate a buffer, fine.

But to write to it, you first bind the buffer to a global variable, then use a call writing to a global variable.
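
For illustration, a minimal sketch of that bind-to-edit pattern (the GL 1.5-era buffer API; "vertices" is just some hypothetical client-side array):

    GLuint buf;
    glGenBuffers( 1, &buf );               // create a buffer name
    glBindBuffer( GL_ARRAY_BUFFER, buf );  // bind it to the global binding point
    glBufferData( GL_ARRAY_BUFFER,         // write through that global binding
                  sizeof(vertices), vertices, GL_STATIC_DRAW );
    glBindBuffer( GL_ARRAY_BUFFER, 0 );    // unbind again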


> The current vertex is a globally set variable (as you'll notice once you try to set textures and colors, because glVertex3f just sets a global variable), etc.

Bad example. glVertex3f is the one function (among the various glColor, glMultiTexCoord etc.) which doesn't set global state. It actually triggers a vertex to be sent (which takes its attributes from the global state you mentioned).

But all this is pretty moot, because it's not how OpenGL has been meant to be used in ages. Use the core profile, where these functions don't even exist any more.

> You allocate a buffer, fine.

> But to write to it, you first bind the buffer to a global variable, then use a call writing to a global variable.

Also outdated, though not by quite as much. Direct state access (DSA) has been a thing for a pretty long time, now.
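
For comparison, a rough sketch of the same upload via GL 4.5 / ARB_direct_state_access, with no binding involved (again assuming some client-side "vertices" array):

    GLuint buf;
    glCreateBuffers( 1, &buf );            // create and initialize the object
    glNamedBufferData( buf,                // write to it by name, no global bind
                       sizeof(vertices), vertices, GL_STATIC_DRAW );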


> Also outdated, though not by quite as much. Direct state access (DSA) has been a thing for a pretty long time, now.

If it just worked for everything...

Even in 4.5 there are many functions that have no direct state version yet, and many others were only introduced in 4.4 and 4.5 (meaning it won't really be compatible anywhere).

For me, the #1 advantage of Vulkan is having DSA for everything, just working.


> Even in 4.5 there are many functions that have no direct state version yet

Which ones are you missing in particular?


I think this might be a misunderstanding. I said that Vulkan (with complicated calls, but no shared state whatsoever) is far preferable to OpenGL (no complicated calls, but all procedures only operate on global state).


You are right! I'm sorry, I misread your sentence and missed the 'than' that changes the meaning! My bad.


I personally think the new OpenGL 4.5 functions, and Vulkan, are actually quite nice in terms of readability.

In contrast, older OpenGL with its global state is pure horror, and even 4.5 still has lots of that.


Except when all you have to debug it is a black screen.

Graphics work is tricky: you need everything to line up just right, or stuff renders in ways that don't lend themselves to understanding what went wrong. When you put a framework abstraction in between, it can be really difficult to find out exactly what went wrong.


I clicked cause I thought it was about Star Trek. Always forget there's a library with that name.


> Always forget there's a library with that name.

And that "Vulcan" is properly spelled with a "c" in Star Trek? (At least in English...I suppose it might be different in other languages).


Disappointed it's not on Khan Academy.


Yes, me too, I really love conlangs. For the Vulcan language see: http://www.vli-online.org/vlif.htm https://scifi.stackexchange.com/questions/1453/how-realized-...


Can someone give me a non-graphical use-case for learning Vulkan or just GPU-based programming in general? I've heard of hardware acceleration. Is it something like writing your routines in a language like Vulkan and offloading the computation to the GPU?


Massive SIMD works great on the GPU. Compute shaders are already a non-graphical use-case on the GPU.

You get high bandwidth but also high latency, so it's best for a few large batches rather than lots of small batches.

There is also overhead to uploading and downloading data to and from the GPU. So you want to save more time than you spend there.


The use cases for GPU compute are fairly narrow. You need something that is embarrassingly parallel and deals with almost nothing but floats. You also need something isolated enough that you're ok with paying the PCI-E bandwidth overhead to send it to the GPU & receive the results back from the GPU.


> You need something that…deals with almost nothing but floats.

This hasn't been true for years. GPU integer compute is quite good these days.


GPUs will handle ints just fine, but it's not what they are best at. They are best at fp32, and depending on the GPU the gap is rather substantial. The performance characteristics of integer ops are also kinda weird.


AMD GPUs actually have identical performance for int32 and fp32, except for full 32-bit integer multiplies and divisions. I think that's a big part of why cryptocurrency miners like them so much.


We can settle this by looking at the published numbers:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.h...

And indeed 32-bit integers and 32-bit floats are essentially the same, except for multiplication, where it's fuzzier but still quite fast.

Certainly modern GPUs are fast enough at integer ops that you shouldn't just assume your problem will be slow on the GPU just because it's integer based. Bitcoin mining (as much as I hate to bring it up) is an obvious counterexample, for instance.


WebGL now has integer types in shaders as well.


Anything that involves a massive amount of independent floating-point computations can easily be offloaded to the GPU. It is more complicated when each floating-point computation depends on the results of other floating-point computations, because it is not easy to tell the GPU about the computations' relationships with one another. And this cannot be done without slowdown on the CPU.


This is true, but there are cases where certain very branchy/interdependent problems can be pushed to the GPU (with enough effort). The Bullet physics engine's GPU-based rigid body physics pipeline is a good example of this working out pretty well.


Yep, recent advances surely make this easier, but if you're supporting older versions of OpenGL your options are limited.


Numerically solving PDEs at large scale.


For folks curious about WebGL in a Vulkan world, there's https://github.com/KhronosGroup/WebGLNext-Proposals/tree/mas... .


Not to be overly harsh about their work but that just screams "me too!" instead of solving an actual problem.

Vulkan is a low-level, highly verbose API designed to extract maximum performance and leverage multiple threads to deal with slower aspects of rendering. Then you chuck that into a single-threaded, slow (relatively) runtime where method calls to the actual implementation are particularly expensive (JavaScript -> native transitions ain't cheap), and that's supposed to be a good idea? Why? Who is supposed to use this for anything useful?

Instead of just porting native features to the web for the sake of porting them, make it easier to make use of the hardware's capabilities. Let me throw a GPU shader into a CSS3 transition animation or something like that, for example. That'd be cool and potentially useful, instead of a system where you can port some game to the web for the sake of porting it to the web, where nobody will ever use it because it sucks compared to the vastly superior native version of the game.


> Then you chuck that into a single-threaded, slow (relatively) runtime

Slow relative to C++, sure. But JS is very fast relative to pretty much any other widely used dynamic language. If we should expose Vulkan bindings to dynamic languages (and why not?) JS is the obvious first choice for a target, given its speed and popularity.

> JavaScript -> native transitions ain't cheap

They're actually very cheap nowadays, because so many benchmarks stress the DOM and C++-implemented builtins. Driver calls, even with Vulkan, far exceed the cost of JS-to-native transitions. glDrawElements() is probably at least 100x slower than a JS-to-native call, the latter of which has latencies measured in nanoseconds.

> Let me throw a GPU shader into a CSS3 transition animation or something like that, for example.

Please, let's not. This will neither be good for designers (modern GPU programming has an enormous learning curve compared to CSS) nor browser developers (shaders would break batching and would expose too many engine-specific internal details).


> If we should expose Vulkan bindings to dynamic languages (and why not?) JS is the obvious first choice for a target, given its speed and popularity.

Exposing it to JS and exposing it to web pages in a browser are two completely different things.

> They're actually very cheap nowadays, because so many benchmarks stress the DOM and C++-implemented builtins. Driver calls, even with Vulkan, far exceed the cost of JS-to-native transitions. glDrawElements() is probably at least 100x slower than a JS-to-native call, the latter of which has latencies measured in nanoseconds.

glDrawElements in Vulkan is a couple hundred lines of API calls. I don't think you're fully grokking the orders of magnitude of verbosity that Vulkan brings.

As for the transition cost there's varying levels of cheap, but at the end of the day it's not great. It's raw per-call overhead on an API designed around making an obscene amount of calls, and the overhead is significant. Jumping between these worlds is just not something you want to do very frequently if your goal is performance.

C++ builtins are a completely different class of problem, as intrinsics get to play by their own compiler rules compared to regular JS -> native bindings.

> Please, let's not. This will neither be good for designers (modern GPU programming has an enormous learning curve compared to CSS) nor browser developers (shaders would break batching and would expose too many engine-specific internal details).

So simple pixel shaders are too complex, but Vulkan is not?

If a pixel/fragment shader is too much, then so is the entirety of WebGL, and webvulkan would just be pure insanity from that perspective.

Also it doesn't break batching at all, I have no idea what you're talking about there. A pixel shader is just a function with a set of inputs and a color output. It's really quite simple, easily emulated for non-GPU fallbacks, and easily manipulated by the browser.


> Exposing it to JS and exposing it to web pages in a browser are two completely different things.

It would make little sense to expose Vulkan to JS and not to put those bindings in a Web browser.

> glDrawElements in Vulkan is a couple hundred lines of API calls. I don't think you're fully grokking the orders of magnitude of verbosity that Vulkan brings.

> As for the transition cost there's varying levels of cheap, but at the end of the day it's not great. It's raw per-call overhead on an API designed around making an obscene amount of calls, and the overhead is significant. Jumping between these worlds is just not something you want to do very frequently if your goal is performance.

It doesn't matter whether it's a couple hundred calls or not. The overhead, which again is measured in nanoseconds, really does not matter. Vulkan's performance demands on API boundary transitions are no worse than that of the DOM, which has been optimized for decades.

> C++ builtins are a completely different class of problem, as intrinsics get to play by their own compiler rules compared to regular JS -> native bindings.

No, they don't. They are one and the same in many JS engines. (I can't speak for all engines, but I'm familiar with SpiderMonkey, where a JSNative is a JSNative, whether a builtin or a DOM method.) SpiderMonkey nowadays even knows about things like purity of various DOM methods and will optimize accordingly. (bz used this to make Dromaeo really fast.)

> So simple pixel shaders are too complex, but Vulkan is not?

Fragment shaders are too complex for CSS. They aren't too complex for programmatic manipulation in JS. The reason is simple: CSS is a high-level declarative language intended to be accessible to designers, while JS is an imperative language mainly used by programmers.

> Also it doesn't break batching at all, I have no idea what you're talking about there.

As you know, switching shader programs can only be done in between draw calls.

> A pixel shader is just a function with a set of inputs and a color output.

With hundreds of pages of specification describing how all the different operations that that function can perform must behave.


You just stumbled into an argument the game industry has been dealing with for years: whether programmers or artists should own shader code. This boils down to whether or not you implement a visual node-based shader editor in your game engine. While I agree that expecting designers to know "modern GPU programming" is extreme, exposing fragment shader style functionality isn't a bad idea, if properly implemented. I also don't think it'd be super difficult for designers to grok.


> exposing fragment shader style functionality isn't a bad idea, if properly implemented.

I agree that functionality like fragment shaders is useful, as long as it's declarative and fits in with the rest of CSS. In fact, we already have it: the CSS filter property.


I don't think glDrawElements() goes into the driver, though (at least there is no sane reason for it that I can imagine; you only need to go into the driver to kick a command buffer at the end of the frame).

Nevertheless, even if some implementations actually do call the driver, the GP's point is that the whole point of Vulkan is getting rid of global state to allow parallel command buffer creation. JS is single-threaded, so no matter how cheap or expensive the calls are, you won't be able to take advantage of Vulkan since you are running just one thread.


Of course glDrawElements() has to go into the driver, because it needs to do hardware-specific work. This is obviously true in OpenGL(ES), but even if you were to implement it via a translation layer to Vulkan, you still have to call the various Vulkan-equivalent commands, at the very least vkCmdDrawIndexed. Here's an implementation to make it painfully obvious why that has to involve the driver: https://github.com/mesa3d/mesa/blob/d819b1fcec02be5e0cfc87b6...
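
For a rough idea, here is a sketch of the kind of commands such a translation layer would record on the application's behalf (assuming a pipeline and an index buffer have already been created; the variable names are hypothetical):

    // cmd is a VkCommandBuffer in the recording state
    vkCmdBindPipeline( cmd, VK_PIPELINE_BIND_POINT_GRAPHICS, pipeline );
    vkCmdBindIndexBuffer( cmd, indexBuffer, 0, VK_INDEX_TYPE_UINT16 );
    vkCmdDrawIndexed( cmd, indexCount, 1, 0, 0, 0 );

Recording these commands happens in user space; the kernel part of the driver only gets involved once the command buffer is submitted with vkQueueSubmit.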


No, you only need to go into the driver when you need to do work that cannot be done in userland. What you are talking about is just a shared library in the address space of your app, and the code you are showing is literally just writing bytes into a buffer.

Nobody forbids you from calling it a "driver", of course, but then the whole point of "going into the driver" does not make sense, since there is no syscall and it's just a regular function.


You're thinking of kernel-mode drivers.

What I've linked to is a driver. Everybody calls it that.

When you go download a driver for graphics cards, whether on Linux or Windows, that driver actually consists of multiple components, some of them running in kernel-mode and some of them running in user-mode. It's basically the exo-kernel principle, but without feeling the need of giving it a fancy name :)

There's a broader history of user-mode drivers not just for graphics, and not just in the obvious case of micro-kernels. User-mode USB drivers used to be a thing, for example (and I guess they still are for some more obscure hardware).


As I said, you can call it whatever you want: "driver", "kernel", or even "Linux". The point of "going into the driver" being expensive only makes sense if it's a syscall, which it is not, as we both seem to agree.


Not really. OpenGL in particular has to do a surprising amount of work on every call, to make sure the state hasn't changed and to update all sorts of things if it has, to manage the various buffers and make sure they're mapped in the right place, and then to go down through all the abstraction layers until you end up in the code that actually writes stuff into the command buffer. It's not going to be syscall-level overhead, though it may end up being that if stuff needs to get mapped into the GPU address space, but it's definitely going to be more than the dozen-instruction overhead of going from JITed JS to native C++ code.


"Not really" what? There is syscall? Then you say yourself there is not... I only argue that there is no syscall in glDraw* as well as the vast majority of the APIs. Sure, driver/opengl do whatever and some calls are more expensive than others but adding more overhead is not going to make it any better and it's already pretty bad without overhead. That's why they developed Vulkan in the first place.


You know, you're talking to somebody who writes graphics drivers for a living :)

If you don't believe me or crzwdjk, just go ahead and actually profile a system running an OpenGL application. The syscall overhead -- as in, the overhead of transitioning between user and kernel mode -- is laughably negligible compared to everything else. Also, the vast majority of driver CPU time is spent in user space building up command submissions. The final command submission itself isn't free of course, but clearly more time is spent processing precisely those glDraw*() calls that you seem to think don't matter.


> You know, you're talking to somebody who writes graphics drivers for a living :)

That's great. Why do you think you are the only one? And what exactly should I believe here? That there is a syscall in every OpenGL API call? If you are writing drivers, you know yourself that's not true. The syscall overhead is not laughable; it's tens of thousands of clocks.

> Also, the vast majority of driver CPU time is spent in user space building up command submissions.

Exactly. The OpenGL system (if you want to call it a "driver", be my guest; DirectX does not do that, for example, and neither do other APIs) works mostly in user space.

> The final command submission itself isn't free of course, but clearly more time is spent processing precisely those glDraw*() calls that you seem to think don't matter.

??? I don't even know what you are arguing here. Let's rewind. Someone said that "driver calls" are expensive. And that's true for people who understand drivers as part of the OS, not "user-mode drivers", which are just shared libs. I corrected them, saying that there is no actual driver call in the sense that people understand, i.e. there is no OS call or "syscall", since the OpenGL "driver" is mostly a shared library in user space. You seem to agree with me. Now, I am well aware that some calls are expensive. I even know why. Some are not, though. For some, the overhead of moving data from a managed language to the GPU will be much greater than the call itself, e.g. setting an index buffer.

It still does not make it true that there are syscalls in the OpenGL calls anyway.


The userspace part of the driver is still called the driver.

A driver does not mean a kernel module. It's often that, but it does not exclusively mean that. Userspace drivers are still drivers.

The library that gets loaded into the process is part of the driver. It's provided by the GPU vendor and it's specific to the hardware you're running. It maps API calls to hardware-specific commands. Aka, it's a driver. It just happens to be implemented as a userspace library for most of the work.


I imagine the next step is to provide a WebAssembly binding.

edit: "The API has to execute efficiently on WebAssembly and in multi-threaded environment. That means no GC allocations during the rendering loop in order to avoid the garbage collection pauses."


Well, WebAssembly is another thing I'd question the usefulness of. It's going to result in threading finally coming to JS, which is nice, but the rest of it is more like showing off technical infrastructure for the sake of it instead of helping apps with the problems they have.


I would argue that having to write apps in JS, or transpiling to the JS runtime, is a problem for many people.


Sure, but you could imagine something more like .NET or JVM bytecode instead, which would be a more practical target for transpiled web apps than wasm.

Instead we ended up with a stack machine & sbrk.


.NET and the JVM are in no way more suitable for compiling C/C++ and the like than wasm is.

The JVM doesn't even have unsigned integers!


Well of course, but I think running C/C++ on the web is a completely nonsensical waste of time. That's not a useful market to target, and critically it completely ignores the needs and problems of the current market.


.NET and the JVM are both stack machines. The .NET VM is the only one that has features that cater to C/C++-ish languages (i.e. linear memory access via instructions). WebAssembly's memory model (including sbrk-style allocation) is likely what a sandboxed, linear-memory-focused .NET VM would have wanted to go with anyway, in the interest of minimizing the performance impact of address range checking.


But in what sense would an extra VM layer be more practical? Maybe this is where we disagree. I'm not fond of VMs, especially those two.

C++ or Rust -> wasm bytecode is great. Soon we'll be writing directly to command buffers, no fuss.


If you want to do C++/Rust direct to command buffers why on earth would you bother with the pile of overhead that is a modern web browser?

But ~nobody wants to build UIs like that anyway, so what's your target audience?


Games and other 3D applications? Getting people in-game (say a lazily loaded demo with slightly less perf) with one click is huge.


No, it isn't. Games are already served by consoles first (which obviously won't run webasm), and Steam second. There's no market there, and it's already one click to launch Steam to the game in question, where it will then download through a medium suited to handling the downloading & updating of a game's assets (which, even for a demo, are in the gigabyte range - you aren't lazy-loading that).

As for 3D applications: what 3D applications? Do you really think Maya is going to be ported to a browser? Why would they bother? Why would they restrict themselves like that?

The web has no advantages in this space, and the needs of those markets are already being served by superior technology and infrastructure.


Have you ever played a flash game? Slither.io? There is a huge market there. Steam is not one click away from a tweet or a Facebook post.

Please have a nice weekend.


Flash games are dead, and even Facebook games are largely a thing of the past, as Facebook is now primarily used on mobile. The casual audience is on their phones in app stores, not on the web anymore.

I assure you those casual game companies are not going to want to go anywhere near Vulkan or anything similar, though, and they are generally fine with the performance scripting languages already give them (hence why they were in Flash instead of Java applets).

They want strong 2D graphics capabilities primarily, which is largely an ignored category. <canvas> has a 2D context, but it's pretty crappy.


Most web-based games will probably remain 2D (for multiple reasons, including development cost and accessibility), but 3D games on the web are certainly performant enough.

To give one example, bananabread is a tech demo showing off what can be done with asm.js and WebGL (performance is likely to get even better with WASM):

http://kripken.github.io/misc-js-benchmarks/banana/index.htm...

As for whether a VM has overhead compared to native, of course it does, but you're not going to convince anyone of your viewpoint by stating something that's already obvious to us all. The performance of web-based games doesn't have to be the best, it just has to be 'good enough'. I personally don't think we'll see fast adoption of WebVR, but I'm glad it exists, as it helps to have it as a goal to improve 3D performance, almost certainly leading to reductions in the performance overhead of browsers (as low latency is a key component of a good VR experience).


WebAssembly doesn't appear to be a way for threading to come to JS, unless you count adding a JS API that lets you invoke multithreaded code in a totally different format and memory/execution model as "threading coming to JS". I don't think anyone is itching to move V8/SpiderMonkey/etc. to concurrent garbage collection, so concurrent memory access in multithreaded JS code will likely be limited to SharedArrayBuffer/WebAssembly for the foreseeable future.


800 lines of code to render a triangle. OK.

I understand the need to be explicit, but why not include some reasonable defaults? When you need that last drop of performance, you could opt out of those defaults, but specifying everything by hand...

OK, OK, I understand. Perhaps Vulkan is a compiler target. It was never intended for someone to write Vulkan code by hand. Just code in some higher-level API or language and get everything compiled into low-level Vulkan calls. I surely hope so, for the sake of my sanity. Right? Right?
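
To give a flavour of the explicitness being complained about, here is a rough sketch of just allocating one vertex buffer in Vulkan (assuming a VkDevice "device" and a client-side "vertices" array already exist; chooseMemoryType is a hypothetical helper that picks a suitable memory type from vkGetPhysicalDeviceMemoryProperties):

    VkBufferCreateInfo bufInfo = {0};
    bufInfo.sType       = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
    bufInfo.size        = sizeof(vertices);
    bufInfo.usage       = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT;
    bufInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

    VkBuffer buffer;
    vkCreateBuffer( device, &bufInfo, NULL, &buffer );         // create the buffer object

    VkMemoryRequirements memReq;
    vkGetBufferMemoryRequirements( device, buffer, &memReq );  // ask how much memory it needs

    VkMemoryAllocateInfo allocInfo = {0};
    allocInfo.sType           = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
    allocInfo.allocationSize  = memReq.size;
    allocInfo.memoryTypeIndex = chooseMemoryType( &memReq );   // hypothetical helper

    VkDeviceMemory memory;
    vkAllocateMemory( device, &allocInfo, NULL, &memory );     // allocate backing memory
    vkBindBufferMemory( device, buffer, memory, 0 );           // attach memory to the buffer

And that's before mapping it, filling it, and eventually destroying both objects; the OpenGL equivalent is the two or three calls shown earlier in the thread.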


It's not intended for someone to use Vulkan to render a triangle. If you want to render a triangle (or merely a million triangles), just throw together some simple OpenGL code and you'll have no problems assuming basic optimizations (large batches).

If you really need Vulkan's advantages, you are committing yourself to a whole lot more than 800 lines of code, Vulkan or no. The setup code will be done once, then you'll move on to many thousands of lines of pipeline code that will take the vast majority of dev time. The GL vs Vulkan question is more about what you get in the end. GL takes care of a lot for you, but that makes writing a highly specialized pipeline more complicated. Vulkan makes everything complicated because specialization is presumed to be the goal.


Firstly, I am worried that OpenGL development seems to have stalled and it looks like it's going into maintenance mode, while Vulkan is where all the action happens. It is increasingly pushed as a replacement for OpenGL, implicitly or explicitly, and if it will be the only mainstream alternative, I am somewhat worried.

Secondly, I am worried that there are no mid-level 3D graphics toolkits. We have low-level (very, very low-level) APIs like Vulkan, or high-level stuff like Three.js and game engines (which are not what I need, as I write realtime scientific visualization code). What I want is something like imgui (https://github.com/ocornut/imgui), but for 3D. Simple, powerful, and bloat-free. However, I can't find anything like this.


Maybe https://github.com/bkaradzic/bgfx, or maybe http://www.ogre3d.org, depending on the abstraction level you are looking for.


Looks interesting, thank you.



