Models trained on data under open licenses (such as Creative Commons) would likely be much safer from copyright claims. I like to use the Debian Deep Learning Team's Machine Learning Policy to evaluate the openness of ML work.
Unless they ship with a library of attributions for every source image (every CC license except CC0 requires attribution), that safety comes mostly from anticipating that authors of CC-licensed works won't be too upset about people using their works.
https://salsa.debian.org/deeplearning-team/ml-policy