Except this isn't how statistical methods on vectors and matrices work.
In an ML dataset, the value "Mike" may actually be one-hot encoded into one of 500 columns (one column is 1, all the others are 0), because you have 500 distinct names in your dataset and each VALUE gets its own column.
It's a very different problem/solution than NULLs in databases.
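To make the one-hot idea concrete, here is a rough sketch in plain Python. The `one_hot_encode` helper and the sample names are made up for illustration, and treating a missing value as an all-zero row is just one possible convention I'm assuming here, not the only way to do it.

```python
def one_hot_encode(values):
    """Give each distinct value its own 0/1 column.

    Hypothetical helper for illustration. A missing value (None) is
    encoded as an all-zero row, which is an assumed convention.
    """
    # One column per distinct non-missing value, in sorted order.
    columns = sorted({v for v in values if v is not None})
    index = {v: i for i, v in enumerate(columns)}

    rows = []
    for v in values:
        row = [0] * len(columns)
        if v is not None:
            row[index[v]] = 1  # exactly one column is set for a real value
        rows.append(row)
    return columns, rows


# Made-up sample data: three distinct names plus one missing entry.
names = ["Mike", "Ana", None, "Mike", "Zoe"]
columns, rows = one_hot_encode(names)

for name, row in zip(names, rows):
    print(name, row)
```

With 500 distinct names you would get 500 columns instead of 3, and "Mike" is no longer a single cell that is either present or NULL: it is a whole row of numbers.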