Except this isn't how statistical methods on vectors and matrices work.

In an ML dataset, the value "Mike" may actually be one-hot encoded into one of 500 columns (one column is 1, every other column is 0), because you have 500 different names in your dataset and each VALUE gets its own column.
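
To make that concrete, here's a minimal sketch in plain Python. The function name `one_hot_encode` and the sample names are made up for illustration, and a real pipeline would more likely use a library encoder than hand-rolled code like this:

```python
def one_hot_encode(values):
    """Return (categories, rows): one column per distinct value, a single 1 per row."""
    # Each distinct VALUE becomes its own column; sorting just fixes the column order.
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)   # everything else is 0
        row[column_of[value]] = 1     # the column for this value is 1
        rows.append(row)
    return categories, rows


# Hypothetical sample data: 3 distinct names, so 3 columns (it would be 500 for 500 names).
names = ["Mike", "Ana", "Mike", "Zoe"]
categories, encoded = one_hot_encode(names)

for name, row in zip(names, encoded):
    print(name, row)
```

With 500 distinct names the same code would produce 500 columns, which is why the name stops being a single field the way it is in a database row.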

It's a very different problem/solution than NULLs in databases.




I was referring to jacques_chester’s comment. You’re bringing up a related topic, and one that is also well explored, since this situation is very common. https://link.springer.com/article/10.1007%2Fs00180-013-0468-...



