Lua-RTOS: a real-time operating system for ESP32 (github.com/whitecatboard)
171 points by lnyan on June 9, 2021 | 56 comments



This looks like a cool project but there's something I'm confused about.

Is the goal of this project to allow people to run realtime software? If so, isn't using Lua a problem because of its memory management causing GC stalls?

It doesn't appear to be addressed in the GitHub README. The other explanation is that I'm missing the point, and it's just using an RTOS as a small embeddable OS on which to run Lua. Is that the case?


"RTOS" has come to mean a library os for embedded microcontrollers. A lot of embedded projects don't care about realtime-ness at all.


I think anything truly real time would be interrupt driven and use C/assembly. Lua would act as the glue between pieces of that critical code. I don't feel like we've hit peak embedded development yet. I love the idea of rapid embedded development that deploys scripts over a network or RS232, but most scripting languages aren't great for this. I would like to see a statically typed actor system with no GC, or optional per-actor GC that is pre-allocated and can only cause a single actor to fail in a predictable way.


There are languages invented for this purpose, such as Ada. I'd look at that instead of C when you're doing real time systems. Although there are many different kinds of real time systems so it depends on the goal. Soft real time systems aren't as time critical as hard real time systems so the way you program and reason about them is a bit different.


How real time do you need to be? Some real time means you need to react within milliseconds or bad things happen; humans will notice a 10 ms lag in real-time audio. Some real time means you need to react within seconds: a machine might need several tens of seconds to get up to speed, so the motor controller only needs to update once a second or so for the speed to remain within tolerance. And everything in between.

Depending on where you are on the need for control, different technologies can work for you.


I'm probably wrong, but isn't real time not about how fast you can react, but about whether you can react within a defined time constraint? Or is there a time threshold beyond which it can't be considered real time any more, like 1 day? Edit: just like you said, 10 ms, 1 s, etc.


You're correct, the definition requires that operations be bounded, not necessarily "fast".


The time depends on your problem. Real time is more about bad things happening if you don't respond on time. If you click on a webpage and your browser times out loading it, that is a real time failure, but it isn't bad enough that anyone thinks of browsers as real time, because the failure isn't really harmful.


Maximum tick times for an RTOS are usually measured in tenths of a second at most. The range is typically from microseconds to about 100 ms.

edit: updated to reflect that I meant "upper bound"

edit2: completely reworded to be more clear.


Time constraints on an RTOS are what you need them to be for the domain.

On the RTOS I was the lead for, some constraints were measured in 100s of ns.

Edit to address the edit: our upper bound in that case was single digit microseconds.


My point was that once you pass the 100ms mark most would not consider it "real-time".


Ah, it reads backwards, as if you're asserting that no one actually deals with constraints tighter than 100 ms.


We're coming from different directions. My upper bound was "after this point it won't be rtos", but I've reworded it entirely to be more clear.


"Upper bound" is also a specific term in real time, unfortunately.

Might I suggest wording such as "upper bound constraints exceeding 100ms aren't typically addressed with real time methods".


From what I see in the discussion, it seems to be more a practical decision not to apply real time resources and techniques at much longer timescales. At 1 day, for example, trying to use real time would be a waste of development effort; you could use alerts and background jobs to fix issues. But as you reach shorter and shorter timeframes where bad things can happen, a real time approach starts to make more sense.

Now, a parent comment mentioned RTOS. Maybe for a real time OS there would be a practical hard upper bound. But for real time systems in general this upper bound would be totally domain specific.


Lua is definitely not the best option, but GC isn't an issue when a real-time-aware implementation is used.

https://www.ptc.com/en/products/developer-tools/perc

https://www.aicas.com/wp

https://www.microej.com/product/vee/


I don't think this is true. No current GC tech is fully hard real time. (I am happy to be corrected, as it'd make my life way easier)



Those examples were naval ships. When you can float a huge data center on water and dump all the waste heat you want, it doesn't matter that the Perc VM used is 2.5x slower than the Sun VM, or that it uses even more RAM than a normal Java VM, which is already a lot. This is running on a microcontroller.


Moving goal posts?

I thought we were talking about real time GC; besides, only the first example is a naval ship.


Both fair points. Hard real time is absolutely doable with GC if it is deterministic. It has throughput and memory penalties to get low and predictable latencies, but it is regularly done. I have written soft real time Java myself, and other than avoiding garbage, like you would with any low-latency code under GC, it was idiomatic. But that one change does increase the cognitive burden to the point that I didn't find it any more productive than C++. If I had needed reflection I might have felt differently.


> It has throughput and memory penalties to get low and predictable latencies

Hard realtime C++ does as well.


I've done lots of hard real time C++. It certainly has development-time overhead, but the memory usage was the same as idiomatic C++; just everything was preallocated at max usage. No throughput hit either; if anything it was faster because of no allocations.


Well, for the situations where it really, really matters, it is rather MISRA C++, and allocations are not allowed anyway.


I've done safety-critical work that wasn't MISRA. But you have reminded me that for years we left optimization off so we could verify full code and branch coverage in the assembly, at which point Java is almost certainly faster, though we never would have fit in memory with Java. Eventually we started turning optimization on, and it was harder to verify but not impossible.


How does any of that matter? The parent comment provided examples of GC being used in hard real-time when the claim was that it couldn't be done.


Not hard real time, but with a fixed, short bound:

https://researcher.watson.ibm.com/researcher/view_page.php?i...


That's the definition of hard real time, if I'm not mistaken. Is there an even stricter guarantee than that?


True, above that would be "live".


Lua's garbage collector can be driven 'manually' quite easily.

That is, one can start it, stop it, run it to completion, run a 'step', tune the size of the step and the aggressiveness of collection, all from within Lua.
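For instance, a minimal sketch of that kind of manual scheduling (assuming stock Lua 5.3/5.4, not code from this project):

  -- Take over GC scheduling: stop the automatic collector,
  -- then run small bounded steps only at points we choose.
  collectgarbage("stop")

  local function gc_slice()
    -- one incremental step; returns true when a full
    -- collection cycle has completed
    return collectgarbage("step", 1)
  end

  -- call gc_slice() from idle time in the application's main loop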

It's true that you can't get hard realtime guarantees while using Lua naively; you do have to be aware of what you're doing. If you need to be 'cycle perfect', probably use something else.

But there are an enormous number of applications where what Lua offers is just fine, and there's no reason a Lua program should have GC 'stalls', if that means unexpected and lengthy pauses in execution.

This is a really cool project imho.


All the real time guarantees would only ever happen in the libraries you are calling out to, not in the scripting language; that's just glue code. If the scripting language has the equivalent of eval(), there is no way it can be made real time anyway.


I can't speak for this project specifically, but you can do GC in a real time system. IBM's Metronome garbage collector is a real time garbage collector.


I do soft real-time in .NET5 without any problems.

I find that if I completely abduct a high-priority thread and never yield back to the OS, things on the critical execution path run incredibly smoothly. I am able to execute a custom high precision timer in one of these loops while experiencing less than a microsecond of jitter. Granted, it will light up an entire core to 100% the whole time the app is running. But, in today's world of 256+ core 1U servers, I think this is a fantastic price to pay for the performance you get as a result. Not yielding keeps your cache very toasty.

Avoiding any allocation is also super important. In .NET, anything touching gen2 or the LOH is a death sentence. You probably don't even want stuff making it into gen1. Structs and streams are your friends for this adventure.


I'm curious about this as well. And it's not just GC: just allocating memory is not real-time safe unless you're using something like a bump allocator. Lua seems very much like the wrong language for this.


Lua doesn't need to use malloc directly -- you can replace the memory allocation function with your own implementation which is real-time safe.


If your heap is on the order of 100kB the GC stalls may not be so bad. A bigger problem may be pulling your code from external SPI flash - typically you will need to put all your real time code in RAM and you have only so much of it.


You could disable the Lua GC and mostly manage C buffers with Lua functions. There's some precedent, given that Lua has been used a lot as an embedded language.


Can you disable the GC? In my last role we had a large C++ application with embedded Lua. I didn't touch it much, but I would have thought that while most of what it did was call out to our C++ API, the Lua objects, tables, etc. would still be created and need to be garbage collected as normal.

Can you entirely turn off Lua's GC?


Yes:

  collectgarbage("stop")
See https://www.lua.org/manual/5.4/manual.html#pdf-collectgarbag...

For such a dynamic language Lua is quite good about avoiding hidden dynamic allocation. Creating a new closure (or plain Lua function), coroutine, string, or table will, of course, allocate a new object. But all of those are either explicit or fairly obvious. Lua's iterator construct is careful to avoid dynamic allocation--I believe it's one reason why iterators take two explicit state arguments. And Lua has a special type for C functions (as opposed to Lua functions), allowing you to push, pass, and call C functions without dynamic allocation. Likewise for lightuserdata (just a void pointer), and of course numbers (floats and integers) and booleans--no dynamic allocation.

Nested function calls could result in a (Lua VM) stack reallocation. But Lua guarantees tail call optimization. And the default stack size is a compile-time constant.

Finally, Lua is very rigorous about catching allocation failure while maintaining consistent state. Well-written applications can execute data-processing code behind a protected call or a coroutine resume (which is implicitly protected) and still keep chugging along in the event of allocation failure anywhere in the VM. The core of the application, such as the event loop and critical event handlers, can be written to avoid dynamic allocation after initialization.
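A sketch of that shape (illustrative only; next_event and handle_event are hypothetical stand-ins for application code):

  -- The core loop itself allocates nothing after init; the
  -- per-event processing runs behind pcall, so an allocation
  -- failure inside the VM can't take down the loop.
  local function handle_event(ev)
    -- data-processing code that may allocate (tables, strings, ...)
  end

  while true do
    local ev = next_event()               -- hypothetical platform call
    local ok, err = pcall(handle_event, ev)
    if not ok then
      -- on "not enough memory" the VM state is still consistent;
      -- drop this event and keep going
    end
  end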


For a usage like that, no, as you'll probably rack up lots of allocations and need to GC eventually. But if this is your goal from the outset, there are ways to do it, as another post mentions. If you never create more than a fixed number of Lua objects, closures, or unique string values, you can certainly postpone the GC indefinitely.
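For example, a toy sketch (read_sample is hypothetical):

  collectgarbage("stop")               -- no automatic collection

  local point = {x = 0, y = 0}         -- allocated once, up front
  while true do
    -- mutate in place: numbers aren't collectable and no new
    -- objects are created, so no garbage accumulates
    point.x, point.y = read_sample()
  end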


Embedded != real time


Sure, I'm just saying embedded systems also care about GC cycles.


GC stalls aren't as bad on a low-memory chip, since there's much less memory to clean up.


> isn't using Lua a problem because of its memory management causing GC stalls?

Perhaps even more important is the increased memory use due to Lua. Some devices have very little memory to begin with.


While others, like the ESP32, have much more memory than the PCs our school computer club had for playing Defender of the Crown.

Yet MS-DOS had plenty of programming languages to choose from, when we weren't coding games or demoscene stuff.


The Lua interpreter is not a real-time interpreter and cannot give bounded response time.

This is a non-realtime application running on top of a scheduler that is capable of supporting realtime applications.

Just placing an application on top of an RTOS does not make it realtime.


Execution can be interrupted, though, through debug hooks. It could be set up to yield every N instructions. [1]

There are a few caveats, though, in that the hook will not be called if you've called into C code. That is, you will only yield while executing code in the Lua interpreter.

1. https://pgl.yoyo.org/luai/i/lua_sethook
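From the Lua side, debug.sethook exposes the same mechanism. A sketch of time-slicing a task with a count hook (assumes Lua 5.2+, where count hooks may yield; long_running_work is hypothetical):

  local task = coroutine.create(function ()
    -- yield to the host every 1000 VM instructions; per the
    -- caveat above, the hook won't fire while inside a C call
    debug.sethook(function () coroutine.yield() end, "", 1000)
    long_running_work()
  end)

  while coroutine.status(task) ~= "dead" do
    coroutine.resume(task)   -- scheduler regains control between slices
  end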


The project seems to have started in 2017 but it does not appear to be very popular. Whitecat IDE is still in alpha.

https://ide.whitecatboard.org/


I thought there was something wrong with that page: I wanted to try the IDE but just got this video filling the page and that single red button: "sign in with your google account".

Why does an IDE for microcontrollers require a Google account? Guess I'll never know.


Very interesting RTOS project; hopefully it can support the new RISC-V-based ESP32-C3 MCU [1].

I wonder if performance could be improved significantly if this were ported to the Terra language, a systems programming language meta-programmed in Lua [2]. It's going to reach a stable 1.0 version real soon.

[1] https://www.espressif.com/en/news/ESP32_C3

[2] https://terralang.org/


https://github.com/crmulliner/fluxnode follows the same idea, providing a JavaScript runtime for application development; it runs on the ESP32 and supports LoRa. It has fewer features, as it is a hobby project.


A language-centric OS isn't a compelling idea to users outside enthusiasts of that language.


Most RTOSes are 'language centric'; it's just that the language is C.


or Forth... :)


This is true until you realize all general purpose languages are the same and redundant. There is no reason to have more than one on a given system.


You mean like UNIX?



