Object-oriented techniques in C (dmitryfrank.com)
113 points by sea6ear on Sept 22, 2015 | 72 comments



This just seems to be replicating the functionality of C++ in C. The only reason he gives for not using C++ is the lack of C++ compilers for embedded CPUs. I think a better approach for the long term would be to use a C++ -> C transpiler (as much as I hate that word).


I'm saddened by the author going to this trouble. Embedded cpu selection doesn't happen in isolation - or at least, it shouldn't. The available tools are a large part of why you would pick chip A over chip B. If you don't have a C++ compiler, and you end up having to re-invent C++ in C, then maybe you should have picked the chip with decent tools.

If this project really did require re-inventing C++ in C, it must be justified by being fairly large. In which case, a low-end ARM (Cortex M based) microcontroller would have been entirely suitable. All ARMs have multiple C++ toolchains which support them.

If this is a microcontroller you end up lumbered with, and didn't choose, then fair enough, but this effort should be labeled for the hack that it is. This is not best or even good practice.


The embedded world is weird. Changing a hardware design is difficult and costly by itself, and the manufacturing process is limited by real-world constraints, so the largest "part of why you would pick chip A over chip B" is availability, both long-term and short-term: will I be able to source the chip in 5 years? How easy will it be to change the delivery schedule if my manufacturing schedule changes? Oh, and ARM looks like a solution until you have to run it on a battery.

The cost of software in embedded systems is quite often low compared to hardware costs. I ballpark the lower limit for part cost at $10. I ballpark software development cost at $100k/y or $50/h. Let's say it is reasonable to get the software done in a man-year. At 1M units of projected sales, that is $100k of software against $10M of parts, so software cost is 1% of hardware cost (not project cost). With a ballpark estimate like that, the software engineers' opinion is treated only as a humble request.

Of course this varies from project to project. An increasing portion of embedded systems are mostly software. A portion of projects have hard-realtime requirements. There, software engineers are benevolent dictators. But that is by no means the general case.


> If you don't have a C++ compiler, and you end up having to re-invent C++ in C, then maybe you should have picked the chip with decent tools.

So the software engineers are the only ones on the project who get a say as to what chips get used?


No (and that isn't what he said), but surely they should get a say, right?


pslam basically did in his next paragraph:

> If this project really did require re-inventing C++ in C, it must be justified by being fairly large. In which case, a low-end ARM (Cortex M based) microcontroller would have been entirely suitable. All ARMs have multiple C++ toolchains which support them.

ARM is not entirely suitable for any application just because you have complex (for some definition of complex) software.


No, but the software impact of choosing a particular piece of hardware should absolutely be part of the system design process. If it is not, then you are part of a project with bad leadership, or very, very old-fashioned architects.

It's not just ARM - there's plenty of other CPU architectures with modern Clang, GCC, or other toolchains with C++ support. Given that this is a large code base (justifying this amount of work), this cannot be a "tiny" microcontroller, as in 8 bit, and needing to run on microamps. I cannot believe that there were not C++-capable microcontrollers available which would have done the job, without breaking the bank.


Doing it at lowest cost is the point. Because even if you don't lower your cost, your competitors will.


Those low end cortex M's tend to have C++2003.

I am kinda-sorta happy to have C++ at all.

I think.


I work mainly on embedded systems, and I would much rather work with a different CPU, but we're stuck with the hardware we have.

The compiler I use does have an embedded C++ option, but it's limited and has some issues working with the RTOS that's supplied with the compiler. Also, with a total of 6K of RAM available, it's usually easier just to stick to writing plain C.


> use a C++ -> C transpiler (as much as I hate that word)

Why not just use the word "compiler"?


I've encountered this here recently as well, the belief that a 'compiler' has to target machine code. When really a compiler is just a translator from one program language to another.


Too many people that are locked into the JS mental institution all the time?


And even if compilers were only those that generate machine code (and they're not), what's wrong with "(source-to-source) translator"?


Honestly, creating an object-oriented version of C that's not C++ would likely result in a language better than C++. C++ is not a particularly good language.


Isn't there some way to emit C from C++? Wouldn't a capability like that and using C as a host language completely bypass the need for emulating OOP in C, by just letting you write C++? That seems like it would be far more manageable, and since it's compiled, not susceptible to some of the major downsides commonly associated with the technique (in JS).


That used to be the only way to compile C++: https://en.wikipedia.org/wiki/Cfront


The emitted code is heavily name-mangled, which is one of the reasons you wrap the C code in 'extern "C"': https://en.wikipedia.org/wiki/Name_mangling


That seems like a problem that has quite a few possible solutions: deterministic name mangling, hinting in the source as to how the name should be mangled, an external mapping of mangling exceptions and/or rules to be used by the transpiler. None of those are mutually exclusive, all could be used together.


It's more than that - there's extra information that's compiled alongside your runtime data structures whenever you write C++. For example, every class with virtual member functions has a vtable, and every object of such a class has a pointer to the vtable. Every time you access a virtual member function, it's indirecting through the vtable to find the particular address to call, and then calling it with the object itself as the first argument.

If you got rid of name mangling (and a few other C++ features that are beside the point), you could certainly call C++ from C. The thing is - your C calls would look exactly like what the article is suggesting. That's why it's important to learn this technique: it is what your C++ compiler is doing under the hood. Indeed, the very first C++ compilers were just preprocessors that transformed C++ syntax into the type of vtable + base class + first parameter indirection that you see here.
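
To make the "vtable + base class + first parameter" pattern concrete, here is a minimal sketch in C. All names (`shape`, `rect`, `shape_area`) are made up for illustration and are not taken from the article; real C++ compilers lay things out in implementation-specific ways, so treat this as a rough model only.

```c
#include <stddef.h>

struct shape;

/* One vtable per "class", shared by all objects of that class. */
struct shape_vtable {
    int (*area)(const struct shape *self);
    const char *(*name)(const struct shape *self);
};

/* Base "class": every object carries a pointer to its class's vtable. */
struct shape {
    const struct shape_vtable *vt;
};

/* Derived "class" (hypothetical): the base is the first member, so a
 * pointer to the base can be converted back to the derived struct. */
struct rect {
    struct shape base;
    int w, h;
};

static int rect_area(const struct shape *self)
{
    /* Valid because `base` is the first member of struct rect. */
    const struct rect *r = (const struct rect *)self;
    return r->w * r->h;
}

static const char *rect_name(const struct shape *self)
{
    (void)self;
    return "rect";
}

static const struct shape_vtable rect_vt = { rect_area, rect_name };

/* "Constructor": wires the object to its vtable and sets its fields. */
void rect_init(struct rect *r, int w, int h)
{
    r->base.vt = &rect_vt;
    r->w = w;
    r->h = h;
}

/* Roughly what a virtual call boils down to: load the vtable pointer,
 * load the function address, call it with the object as first argument. */
int shape_area(const struct shape *s)
{
    return s->vt->area(s);
}

const char *shape_name(const struct shape *s)
{
    return s->vt->name(s);
}
```

A caller would write `struct rect r; rect_init(&r, 3, 4);` and then pass `&r.base` to `shape_area`, never touching the vtable directly.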


Yes, so while this technique would be harder to adopt for a library, or at least harder on the users, for an application where you don't really need to worry about people using C calling into your code, it would be fairly useful. A short HOWTO on how to call from C using the necessarily included and emulated C++ vtable bits would make calling in from C possible and easier where required, but you could still reap the benefits of non-C features while sticking with a C toolchain at the lowest level.


Usually an 'extern "C" { .... }' declaration is easier if you control the C++ library code.


I see what's being talked about now. I misinterpreted meson2k's original reply, so haven't really been on the same page as you or him.

If current C++ to C transpilers don't handle naming well, that's a problem that should be worked on.


this approach is similar to the one that GObject uses, though stripped down a bit for embedded. i haven't used it in years, but it worked great with GTK+. the only downside i remember is the overhead of the virtual function calls (in contrast, the jvm is able to factor out the virtual call in many cases at runtime)

one advantage of using something like this is it forces you to really think about what makes something OO and gives you a deeper appreciation of it. i'm a java programmer now, but to this day i still prefer object oriented C to C++


I've found that, in the embedded world at least, the overhead this adds (in terms of flash/rom and ram) can be fatal. That said, there are some more implementations of this here if anyone is curious: https://github.com/ryanbalsdon/libr


> the overhead this adds (in terms of flash/rom and ram) can be fatal.

I know you list the resources, but how does this approach add overhead? (how does it use those resources more than a "C" approach would?) Most of the constructs in C++ only cost anything if you use them, so you only pay if you need the feature; further, the runtime costs of most C++ features are pretty much exactly what's required for that feature… so if you were writing C, and you needed the same functionality, how would you avoid paying the same cost?


EFnet's infamous Zhivago (from #c) has a great article about OOP in C - http://www6.uniovi.es/cscene/CS1/CS1-02.html


How badly does this impact optimizations? First you add an extra function pointer for any member, even if it's not virtual. Then since you're always calling via a function pointer -- do compilers notice when fps are assigned and never modified and do inlining and so on?

Also the first technique doesn't feel like OO in any meaningful way. Defining a state object and passing it to functions is exactly what you'd do in say, a functional programming language, or even in regular old imperative code.


> First you add an extra function pointer for any member, even if it's not virtual

Wrong. If some function is not virtual, there's no point in adding it to the vtable; so, it's just a regular function.

> Also the first technique doesn't feel like OO in any meaningful way.

Really? Of course I could be wrong, but for me, defining a state object and operating on it only through methods (i.e. functions that are given a pointer to the state object) is exactly what is called OO style.


Thanks for the correction. I had misread and thought all functions were called with member syntax.

I guess in simple cases like this the difference between OO and other styles is pointless. I'm just thinking that if I were to write it in a functional language, it'd probably be a very similar API, though possibly/probably returning a new state rather than modifying the existing one.


> Also the first technique doesn't feel like OO in any meaningful way

But in the simple case described in op (no inheritance), that is all a class is, with the exception of some syntactic sugar.


Object-oriented programming is a paradigm that can be followed in almost any language. If you're creating a conceptual 'object' that stores state and has associated methods then you're using OOP.


Also check out: Simply Object-Oriented C --> http://www.codeproject.com/Articles/22139/Simply-Object-Orie...


Unmaintainable and extra complexity just to emulate a paradigm that C wasn't designed for.


Again, if we use it correctly, the resulting thing is much more maintainable than the code with switch-cases instead of virtual methods, and so on.

By the way, a similar approach is used across the Linux kernel: see, for example, the book "Linux Kernel Development" by Robert Love, especially the discussion of the virtual filesystem.


Every solution feels great and neat in the beginning until you wake up one day and feel sick to your stomach from the mess it has created.

Most vtable-like implementations in the kernel are nothing more than a struct of function pointers, and almost exclusively so.


As nostrademons pointed out above (https://news.ycombinator.com/item?id=10261375), if you feel that some tool you've used has created a mess (so that you even feel sick to your stomach), it's most likely not because the tool is bad, but because you've used it in an inappropriate way.


As an example, SDL is a large C codebase that employs a very OO paradigm, and I find its APIs very well organized.

When I was (or am) coding C, a lot of the OO principles helped me write better C code, as I understood better what the higher-level constructs I was working with were (i.e., is there a base class essentially hiding among my structs that should perhaps be pulled out into a separate struct; do I need a vtable, or perhaps function pointers on the struct; how is a struct initialized, and how is it destructed, etc.).


I once met a software engineer who was able to write code that was, and remained, maintainable. Crazy, right.


"The danger in trying to force object-oriented concepts onto a C base is to get an inconsistent construction, impairing the software development process and the quality of the resulting products. A hybrid approach yields hybrid quality. This is why serious reservations may be voiced about the object-oriented extensions of C described in chapter 20. To benefit from object-oriented techniques, one must use a consistent framework, and overcome the limitations of languages such as Fortran or C which regardless of their other characteristics -- were designed for entirely different purposes." Bertrand Meyer, Object-oriented Software Construction, 1988.


This interesting article reminded me of this post by Rob Pike from some time ago: https://plus.google.com/+RobPikeTheHuman/posts/hoJdanihKwb.

While I've been a C++ programmer since my uni days, I think algorithms and data structures should come first, and I don't like the idea of "bending" programs to make them fit nicely within the OO paradigm.


Do NOT use these techniques please!

I have experience of an early nineties project that was entirely structured like this and trust me, it ends up as a disaster.

C is not meant to be OO; consequently, the tooling in editors and the like does not understand the links between classes and methods.

What you end up with is an impenetrable mess, with all the mechanics of C++ exposed but impossible to navigate.

Again... please do not do this in any project... for the sake of anyone who will follow you in maintaining your code!


Bah. It's a tool. Just like any tool, you should have it in your toolbox, and use it when appropriate.

When I hear about C programs that were a disaster because they used OO-like techniques, the operative word is usually "disaster" and not "C" or "OO". In other words, there were other things wrong with the project that had nothing to do with the selection of language or programming paradigm, like hiring inexperienced programmers or mandating a single programming paradigm for the whole codebase.

Good C code will have a lot fewer objects than say, Java, because most of the things you want to use C for, you can accomplish just fine with structs and functions. But you shouldn't shy away from a struct full of function pointers or an ADT with opaque pointer types and appropriately-namespaced member functions just because you're afraid a maintenance programmer won't get it. It's their responsibility to learn the language.
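
For readers who have not seen "an ADT with opaque pointer types and appropriately-namespaced member functions", a minimal sketch follows. The `int_stack` module is hypothetical; in a real project the first part would live in a header and the second in a `.c` file, so callers could never see the struct layout.

```c
#include <stdlib.h>

/* ---- would go in int_stack.h (hypothetical module) ---- */
typedef struct int_stack int_stack;   /* opaque: layout hidden from callers */

int_stack *int_stack_create(size_t capacity);
void       int_stack_destroy(int_stack *s);
int        int_stack_push(int_stack *s, int value);  /* 0 on success, -1 if full  */
int        int_stack_pop(int_stack *s, int *out);    /* 0 on success, -1 if empty */

/* ---- would go in int_stack.c ---- */
struct int_stack {
    size_t len, cap;
    int *items;
};

int_stack *int_stack_create(size_t capacity)
{
    if (capacity == 0)
        return NULL;
    int_stack *s = malloc(sizeof *s);
    if (!s)
        return NULL;
    s->items = malloc(capacity * sizeof *s->items);
    if (!s->items) {
        free(s);
        return NULL;
    }
    s->len = 0;
    s->cap = capacity;
    return s;
}

void int_stack_destroy(int_stack *s)
{
    if (s) {
        free(s->items);
        free(s);
    }
}

int int_stack_push(int_stack *s, int value)
{
    if (s->len == s->cap)
        return -1;
    s->items[s->len++] = value;
    return 0;
}

int int_stack_pop(int_stack *s, int *out)
{
    if (s->len == 0)
        return -1;
    *out = s->items[--s->len];
    return 0;
}
```

The `int_stack_` prefix stands in for a namespace, and because only the typedef is public, the representation can change without touching client code.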


Seconding the idea that you don't use something like this prolifically. The point is to only go slightly past the limits of the tooling and still primarily be writing C in the style that C is best at.

For similar reasons, "cute" macro-heavy C code tends not to survive long-term, because it breaks too much tooling.


Disagreed. The techniques explained in the article have worked quite well for me, and the resulting projects are well-structured and have stayed maintainable for years.

As ever, we have to apply common sense: as I mentioned in the article, 1-level inheritance is quite good, or 2 at most, but with more it gets too verbose. Luckily, in most cases 1 level is quite enough to create a common API and reusable modules.

Maybe you overuse it? Of course if you try to build huge hierarchies like this, you most likely will end up using typecasts in client code, and the entire thing will quickly become a mess. The article proposes solutions that don't have typecasts in client code.

As I said before, we should apply common sense. Engineering is all about tradeoffs. If I were you, I wouldn't argue against any approach as aggressively as you just did.


> and trust me, it ends up as a disaster. [...] please do not do this in any project... for the sake of any who will follow you in maintaining your code!

The Linux kernel has some OO techniques in C. 2-part story:

https://lwn.net/Articles/444910/

https://lwn.net/Articles/446317/

I don't follow the Linux development mailing lists so I don't know if Linus Torvalds and other respected contributors regard those techniques as "disasters". Maybe they do.


I would argue that the linux kernel is largely object oriented, especially in the realm of device drivers. Every subsystem that I have worked with has you fill out the equivalent of a vtable, and passes state through structs rather than globally.
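
A rough sketch of what "filling out the equivalent of a vtable" and passing state through structs can look like. This is not actual kernel code: `device`, `device_ops` and `counter_dev` are invented names, and the `container_of` macro below is a simplified stand-in for the kernel's own.

```c
#include <stddef.h>

/* Recover a pointer to the enclosing struct from a pointer to one of its
 * members (simplified; the kernel's macro adds type checking). */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct device;

/* The table of operations a driver fills out (hypothetical). */
struct device_ops {
    int (*open)(struct device *dev);
    int (*read)(struct device *dev, unsigned char *buf, int len);
};

/* Generic part: the only thing the subsystem knows about. */
struct device {
    const struct device_ops *ops;
};

/* A driver embeds the generic struct inside its own private state. */
struct counter_dev {
    int next;
    struct device dev;
};

static int counter_open(struct device *dev)
{
    struct counter_dev *c = container_of(dev, struct counter_dev, dev);
    c->next = 0;
    return 0;
}

/* Fills buf with consecutive values, continuing across calls. */
static int counter_read(struct device *dev, unsigned char *buf, int len)
{
    struct counter_dev *c = container_of(dev, struct counter_dev, dev);
    for (int i = 0; i < len; i++)
        buf[i] = (unsigned char)c->next++;
    return len;
}

static const struct device_ops counter_ops = {
    .open = counter_open,
    .read = counter_read,
};

void counter_dev_init(struct counter_dev *c)
{
    c->next = 0;
    c->dev.ops = &counter_ops;
}

/* Subsystem code: dispatches through the ops table, no globals involved. */
int device_open(struct device *dev)
{
    return dev->ops->open(dev);
}

int device_read(struct device *dev, unsigned char *buf, int len)
{
    return dev->ops->read(dev, buf, len);
}
```

Unlike the base-as-first-member layout, `container_of` lets the generic struct sit anywhere inside the driver's state, which is why no cast appears in code outside the driver.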


Most large-scale FOSS projects in C I've seen tend to be structured in OO-like ways. The C language actually lends itself well to a low-level, Nygaard-style OO when you reach a certain scale and complexity across modules.

Though, if your project is more of a toolkit, then a strictly procedural style is probably better.


Oh, nonsense. It works great for GTK+. Having worked on both GTK+ and Qt, I would take GTK+ over having to use Moc any day.


Weird, in the 90s we did a rather large, successful medical imaging product using techniques like these. If you had a CT or MRI scan in the 90s and early 00s, your images were likely rendered for the tech/radiologist using this type of code.


Yeah all that effort would be better spent on finding a platform that can actually support C++. Otherwise the base of code you end up with is hardly maintainable.


And C is also not meant to be functional, but yet, there's a 400+ page book on Functional C: http://eprints.eemcs.utwente.nl/1077/02/book.pdf


Fully agree.

Ended up doing a similar approach when forced to use C instead of C++ for a data structures university project.

Goal achieved, but not an experience to repeat ever again.


So how would you have done it the next time if you had to stick with C?


Just plain ADTs instead of trying to mock object semantics.

Edit: Forgot to mention that so far I only used plain C instead of C++ when obliged to do so.


Part of a series of articles on HN: How to %feature_from_c++% in C (when you could have used C++)


or when memory constraints prevent you from using libstdc++


You can get C++ OO without linking in a whole STL. IIRC it is the C++ equivalent of -ffreestanding in C. Given that popular toolchains like GCC and IAR offer C++ support to micro controllers as small as ATtiny 8-bit AVR micros, I don't see the necessity to reimplement C++ in a hamstrung C form.

I'm currently writing a binary parser library using C++ but without the STL. I'm able to use range-based for loops, zero-cost iterators, and other features without any dependencies on a C++ STL library. After optimization, the generated code is essentially what the C equivalent with all the boilerplate would compile down to. There are plenty of OS kernels written in C++, you just have to pick the appropriate C++ subset. ;-)

Regardless, I enjoy the stricter typing in C++ that generates essentially the same code as C with OO tacked on but with less boilerplate.


I don't feel like the crc32 example is a good one. Maybe the game entities example would have been better...


IIRC, MD5 (and probably the other hashes) are implemented in exactly this form in OpenSSL.

I can't really think of a more straightforward way to implement CRC32, frankly, especially assuming that maybe you don't want to force the entire buffer to be present. If you did, you could `uint32_t crc32(unsigned char * buffer, size_t len);` — but you can implement that function in terms of the interface in the article, and if you pick up a bit of inlining from an optimizer, I think it would end up being just as efficient, too. (If you don't, you'll end up with some extra pointer dereferences, but again, the streaming interface supports streaming.)

And perhaps to stress what some other commenters are missing: it's not "you should always OO in C"; it's that if you want an interface to compute, while streaming, the CRC32 of a stream — you need to store that state somewhere, and you need to act on that state somehow. That state is an object: it has a setup, and potentially a teardown, and two relevant methods: feed it data, and get the computed hash.
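
As a sketch of the streaming interface described above (setup, feed data, get the result) plus the one-shot helper built on top of it: the names `crc32_ctx`, `crc32_init`, `crc32_feed`, `crc32_result` and `crc32_oneshot` are assumptions for illustration and may not match the article's. The bitwise loop is the standard reflected CRC-32 (polynomial 0xEDB88320), chosen for brevity rather than speed.

```c
#include <stddef.h>
#include <stdint.h>

/* The state object: here just the running CRC register. */
typedef struct {
    uint32_t state;
} crc32_ctx;

/* Setup: CRC-32 starts from all ones. */
void crc32_init(crc32_ctx *c)
{
    c->state = 0xFFFFFFFFu;
}

/* Feed any number of bytes; can be called repeatedly as data arrives. */
void crc32_feed(crc32_ctx *c, const unsigned char *data, size_t len)
{
    uint32_t crc = c->state;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    c->state = crc;
}

/* Read out the checksum: the register with all bits inverted. */
uint32_t crc32_result(const crc32_ctx *c)
{
    return c->state ^ 0xFFFFFFFFu;
}

/* One-shot helper for a buffer that is fully in memory, implemented
 * purely in terms of the streaming interface. */
uint32_t crc32_oneshot(const unsigned char *buffer, size_t len)
{
    crc32_ctx c;
    crc32_init(&c);
    crc32_feed(&c, buffer, len);
    return crc32_result(&c);
}
```

Feeding "1234" and then "56789" gives the same result as one call over "123456789", namely the standard CRC-32 check value 0xCBF43926.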


CRCs or MD5 or other hash sums don't require inheritance or other OO tricks or even constructors or destructors (apart from zero-initialization). The state is just a single integer. For other hashes, the state might be slightly more complicated but there's still no need for fancy OO.

If you want to build a system where you can change the hash function at runtime, you might need some kind of interface or inheritance. But if you end up needing a virtual call per byte (or per a small chunk of bytes), you're killing your performance.

And besides, there's so few hash functions you end up needing in practice that a simple switch-case would do the same with less boilerplate and probably give better performance because compiler optimizations can take place (virtual calls tend to prevent many optimization tricks).

I agree with GP... The choice of CRC was a poor example for an article about OO in C. Something related to filesystems or device drivers would be vastly more practical.
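
The two dispatch styles being compared here can be put side by side. This sketch uses two toy checksums (`sum8`, `xor8`) as hypothetical stand-ins for real hash functions. Whether the switch is actually faster depends on the compiler and target and is not measured here; the sketch only shows that both variants can dispatch once per buffer instead of once per byte.

```c
#include <stddef.h>
#include <stdint.h>

enum checksum_kind { CHECKSUM_SUM8, CHECKSUM_XOR8 };

/* Toy checksum: sum of all bytes modulo 256. */
static uint8_t sum8(const unsigned char *p, size_t n)
{
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = (uint8_t)(acc + p[i]);
    return acc;
}

/* Toy checksum: XOR of all bytes. */
static uint8_t xor8(const unsigned char *p, size_t n)
{
    uint8_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc ^= p[i];
    return acc;
}

/* Variant 1: switch. Each callee is named directly, so the compiler is
 * free to inline it into this function. */
uint8_t checksum_switch(enum checksum_kind kind,
                        const unsigned char *p, size_t n)
{
    switch (kind) {
    case CHECKSUM_SUM8: return sum8(p, n);
    case CHECKSUM_XOR8: return xor8(p, n);
    }
    return 0;
}

/* Variant 2: function pointer chosen at run time, as a vtable would do.
 * The call site does not know the target statically. */
typedef uint8_t (*checksum_fn)(const unsigned char *p, size_t n);

checksum_fn checksum_lookup(enum checksum_kind kind)
{
    static const checksum_fn table[] = {
        [CHECKSUM_SUM8] = sum8,
        [CHECKSUM_XOR8] = xor8,
    };
    return table[kind];
}
```

In both variants the per-byte loop lives inside the callee, so the cost of choosing the function is paid once per buffer.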


> CRCs or MD5 or other hash sums don't require inheritance or other OO tricks

Given some of the other comments, I think this ought to be stressed though… OO doesn't strictly require inheritance, or vtables. A fair number of classes in the C++ STL don't use those features, for example.

It's not that the state is a single integer, either; it's that the concept of a state exists. Having a separate type (with the functions surrounding that type) to embody that concept can change a function signature from an ambiguous "what goes in this uint32_t arg?" to a very obvious md5_state_t.

> or even constructors or destructors (apart from zero-initialization).

Just a nit: MD5 requires initialization that is more than just zero-initialization.

> probably give better performance because compiler optimizations can take place (virtual calls tend to prevent many optimization tricks).

This is about the most concrete argument against vtables (but hardly against OO in general) thus far, and I'd still love more explanation: what makes a branch more optimizable than following a pointer?


That's what I meant - the state is a uint32_t that you pass to the crc32_calc_next_byte function...


This is probably a stupid question, but here it goes:

Embedded platforms tend to have size limits. Would this increase the size of the source code? Seems a bit verbose to me.


Surely it does have some overhead. And, by the way, passing arguments to a function has overhead as well, compared to globals. But how often would you recommend using globals?

The more interesting question is: how much overhead? And the answer is: it depends on the MCU. For example, some 8-bit PICs don't have silicon support for indirect function calls (i.e. calls through a function pointer), so they work around this by saving the function address to the stack and executing a 'return' instruction. That runs much slower and increases code size notably. I don't use these techniques on those chips.

But on the 16- and 32-bit MCUs that I have worked with, it works flawlessly. For most of our projects, the overhead matters much less than the maintainability we get with this approach. As I said in another comment, engineering is all about tradeoffs.


> don't have silicon support for indirect function call (i.e. by function pointer), so they work around this problem by saving function address to the stack and execution 'return' instruction.

While no doubt this is true, if you have an indirect function call in C++ OO code, how do you not have it in C? That you had it in C++ implies — I hope — that you needed it. The C code can't simply whisk that need away. (Or, if it can, so can the C++…)


The size increase by using these techniques would be far smaller than linking in libstdc++. libc is much smaller than libstdc++, especially when you need to support dynamic code and do a whole-archive on your libraries in the executable.


Hmm...those who don't understand Objective-C are doomed to reinvent it, badly? (Yes, have used OO C techniques of different sorts and implemented ObjC compiler/runtime).


Isn't this article's implementation of OO C based on C++?


"...badly" ;-)


Method hashes (ObjC) != Vtables (C++). The former allows runtime introspection and modification that the latter doesn't, and the latter allows performance that the former doesn't.
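
To illustrate the contrast, here is a deliberately simplified sketch of name-based dispatch in C. It is not how the Objective-C runtime is implemented (that uses hashed selector caches and mutable class structures); `dyn_class`, `dyn_lookup` and the `counter` methods are hypothetical. The point is only that the method is found by name at run time, so a caller can ask whether an object responds to a selector at all, at the price of a search instead of a fixed vtable slot.

```c
#include <stddef.h>
#include <string.h>

typedef int (*method_fn)(void *self);

struct method_entry {
    const char *selector;   /* method name, looked up at run time */
    method_fn fn;
};

struct dyn_class {
    const struct method_entry *methods;
    size_t count;
};

/* Linear search by name; returns NULL if the class has no such method.
 * A vtable call would instead index a slot fixed at compile time. */
method_fn dyn_lookup(const struct dyn_class *cls, const char *selector)
{
    for (size_t i = 0; i < cls->count; i++)
        if (strcmp(cls->methods[i].selector, selector) == 0)
            return cls->methods[i].fn;
    return NULL;
}

/* A hypothetical "class" with two methods. */
struct counter {
    int value;
};

static int counter_increment(void *self)
{
    return ++((struct counter *)self)->value;
}

static int counter_get(void *self)
{
    return ((struct counter *)self)->value;
}

static const struct method_entry counter_methods[] = {
    { "increment", counter_increment },
    { "get",       counter_get },
};

const struct dyn_class counter_class = {
    counter_methods,
    sizeof counter_methods / sizeof counter_methods[0],
};
```

Looking up "reset" on `counter_class` returns NULL instead of failing to compile, which is the kind of run-time introspection the vtable approach gives up in exchange for speed.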


Regardless of one's opinion on either, C++ and Objective-C are different OO languages. Simula 67 vs. Smalltalk and all that.



