The example given by OP actually shows a taboo from the intersection of both sets.

Yes, English text in pretraining will necessarily have a similar distribution. But when it comes to alignment, the distributions will differ, since that data is typically not shared. The meta-point is that it is not realistic to expect completely uncensored models, in the East or in the West. The best you can do is apply critical thinking when consulting both.

