> pseudo open-source (the ML model is still a closed black box).
?
I am confused by this statement. Sure, the ML model is a black box, but that's better than being completely closed source with no model at all. It's more realistic to build the software yourself than to train your own self-driving ML system.
The most fundamental part of the system, the one which makes driving decisions, is not open. I did not say anything about whether or not this was "better" than the product being fully closed source, only that it is not truly open, and I fully believe this. "Open source autopilot" implies to me that the autopilot is open - that an end-user can inspect, audit, and attempt to understand the decisions their vehicle is making. This is not the case for Comma - rather, it is an open-source CAN bus translation layer attached to a closed-source autopilot.
When you say not everything is open source, I assume you mean the training code is not open source? I'm curious what you would want to learn from that. You wouldn't be able to actually train a model, since you wouldn't have access to the data.
The end-user can inspect, audit and understand the decisions their vehicle is making. All you have to do is see how the neural network behaves for different inputs. That's the correct approach, whether you have access to the training code or not.
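To make "see how it behaves for different inputs" concrete, here is a minimal sketch of that kind of probing. It is not Comma's tooling: `toy_model` is a hypothetical stand-in for an opaque model, and `sensitivity` is an illustrative helper that nudges one input at a time and measures how much the output moves.

```python
from typing import Callable, List, Sequence


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for an opaque model: we only see input -> output."""
    return 0.8 * x[0] - 0.1 * x[1] * x[1] + 0.0 * x[2]


def sensitivity(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Central finite-difference estimate of d(output)/d(input_i) at point x."""
    grads = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps  # nudge input i up
        lo[i] -= eps  # nudge input i down
        grads.append((model(hi) - model(lo)) / (2 * eps))
    return grads


if __name__ == "__main__":
    point = [1.0, 2.0, 3.0]
    for i, g in enumerate(sensitivity(toy_model, point)):
        print(f"input {i}: sensitivity {g:+.3f}")
```

This only describes the model's local input/output behavior around one point; it says nothing about how the model was built or trained.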
Comma don't even say _how_ the model works! What layers are there? What learning strategies are they using? What do they do? It's literally a black box! "All you have to do is see how it behaves for different inputs" is just black box reverse engineering! Machine Learning is NOT a magic black box.
Comma have constructed a "stack" of models, just as you would connect a series of functions to make a kernel in the mathematics sense, or a series of algorithms or instructions to make a program. And that stack is entirely closed.
https://medium.com/@chengyao.shen/decoding-comma-ai-openpilo... here is an example of reverse-engineering the driving model. If Comma released this exact sort of documentation, including what ML modeling strategies they were using, what each input and output parameter affected, and how the model was trained, I could maybe consider the system open.
The models are now saved in ONNX format, which is the most readable format available. You can view the architecture of the model with a basic neural network viewer.
Again, I'm curious what you want to learn from the training code?
Chengyao's Medium post is great, but it is only possible because the models, the code that runs them, and the code that parses the outputs are fully open source.
My binary is saved in PE format, which is the most readable format available. You can view the architecture of the software by opening it in the basic Ghidra pseudocode decompiler. All Windows software is now "fully open source."
Chengyao's Medium post is advanced reverse-engineering work requiring a detailed knowledge of the appearance of specific ML algorithms saved in a binary format. And even with this knowledge, Chengyao was only able to _speculate_ about the behavior of the model and the desired response to certain inputs.
What would satisfy me from Comma, if they were aspiring to some kind of "open" label, would be a detailed document explaining each layer of the ML system and what its goals are - like Chengyao's Medium post, but without the need to reverse-engineer the system and attempt to infer its behavior!
Now, maybe Comma don't aspire to be truly open, which is fine. In that case, Comma is a closed model with an open-source CAN interceptor on top. So essentially, crowd-sourcing the tedious and high-liability parts (vehicle integration, driving video) while owning the valuable parts (training data and model architecture). Very cool!
What format would you rather the model be saved in? ONNX is the most cross-platform and standard format as far as I know, and it's also what we use internally.
It's not like PE, which is compiled from something higher-level.
I would still class this as 'open source'.