Many times a recommendation system is opaque and not adjustable because the people who made it don't know how to make it transparent and adjustable. A good example of such a system was given in Andrew Ng's old machine learning MOOC. I'll describe it below.

Suppose we want to do a movie recommendation system, and we have a bunch of data consisting of anonymous user identifiers and for each user a list of movies they have seen and their rating of that movie.

Imagine that we had a list of movie attributes that we thought might be relevant to whether or not someone would like the movie. These might be things like running time, whether it is funny, whether it has car chases, whether it has romance, whether there are horses in it, whether there is profanity, and so on. Let's say we've got 50 of these attributes.

Now suppose we had for each movie a vector of length 50 whose components were how well the movie fit each of those attributes, on a -1 to 1 scale.

If we had that movie attribute data, then we could try modeling each user as having a 50-component vector telling how important each attribute is to that user. For each user, we could go over their list of movie ratings and try to find a set of weights for the components of that user's vector such that the dot product of their vector with a given movie's attribute vector correlates with that user's rating.
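This isn't from the course, just a minimal numpy sketch of that fit for a single user, on made-up data (the variable names and the tiny 5-attribute setup are my own; the text uses 50). Finding the user's weight vector from known movie attributes is an ordinary least-squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)

n_attrs = 5    # 50 in the text; smaller here to keep the sketch readable
n_rated = 20   # movies this user has rated

# Known movie attribute vectors, each component on a -1..1 scale
movie_attrs = rng.uniform(-1, 1, size=(n_rated, n_attrs))

# Pretend the user really has some hidden preference weights,
# and their ratings are (noisy) dot products with those weights
true_prefs = rng.normal(size=n_attrs)
ratings = movie_attrs @ true_prefs + 0.1 * rng.normal(size=n_rated)

# Recover the weights by least squares: find w minimizing
# || movie_attrs @ w - ratings ||^2
w, *_ = np.linalg.lstsq(movie_attrs, ratings, rcond=None)

# The predicted rating for an unseen movie is then just a dot product
new_movie = rng.uniform(-1, 1, size=n_attrs)
predicted = new_movie @ w
```

With enough ratings per user this recovers the preference weights well, and each component of `w` is directly interpretable: it says how much that user cares about the corresponding attribute.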

That works pretty well, but there are two practical problems. First, we need to come up with that attribute list for the movies. Second, once we've got our attributes someone has to go through each movie and figure out the vector.

So forget about that approach for a moment. Suppose instead we came up with a list of attributes, somehow, but rather than scoring the movies and then inferring the user weights, we told the users the attributes, asked them how important each one was, and assumed that the users are actually right about what is important to them.

Then maybe we could take all the movies, and try to assign attribute weights to them that lead to consistent predictions of user scores!
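The same least-squares sketch as before works with the roles swapped (again my own made-up data and names, not from the course): now the users' self-reported preference vectors are the knowns, and one movie's attribute vector is the unknown being fit to that movie's ratings.

```python
import numpy as np

rng = np.random.default_rng(1)

n_attrs = 5
n_users = 30   # users who rated this one movie

# Self-reported user preference weights (one row per user)
user_prefs = rng.normal(size=(n_users, n_attrs))

# Pretend the movie really has some hidden attribute vector,
# and each user's rating is a noisy dot product with it
true_attrs = rng.uniform(-1, 1, size=n_attrs)
ratings = user_prefs @ true_attrs + 0.1 * rng.normal(size=n_users)

# Same least-squares fit, with the movie's attribute vector
# as the unknown this time
movie_vec, *_ = np.linalg.lstsq(user_prefs, ratings, rcond=None)
```

The symmetry between the two fits is exactly what sets up the trick described next.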

It turns out that works, but we still have the problem of guessing which attributes matter, and we depend on the users having at least some idea of what makes them like a movie.

So...if we knew the movie attribute weights we could infer the user preference weights, and if we knew the user preference weights we could infer the movie attribute weights.

The brilliant solution is to get rid of the attribute labels. All we decide on is the number of attributes! So we might decide that there are going to be 50 attributes, and we can assign the movies random weights for all those attributes. We can assign the users random preferences for the attributes.

Then you can go through an iterative process where you tune the movie attribute weights to better predict user scores, and you tune the user preference weights to better match the movies. This ends up converging to a set of movie attribute weights and user preference weights that do a good job of reproducing the users' scores for the movies, and makes good recommendations...
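The alternating scheme can be sketched in a few lines. This is my own toy version, not the course's code: it uses a fully observed synthetic rating matrix, whereas a real recommender would fit only the observed entries and add regularization. Each half-step is just one of the two least-squares fits above, applied to everyone at once:

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_movies, k = 40, 60, 5   # k = number of unnamed attributes

# Synthetic ground-truth ratings generated from hidden factors,
# purely so the demo has something to fit
U_true = rng.normal(size=(n_users, k))
M_true = rng.normal(size=(n_movies, k))
R = U_true @ M_true.T   # rating matrix (in reality mostly unobserved)

# Random initialization: nobody labels the k attributes
U = rng.normal(size=(n_users, k))   # user preference vectors
M = rng.normal(size=(n_movies, k))  # movie attribute vectors

for _ in range(20):
    # Fix the movies, refit every user's preference vector
    U = np.linalg.lstsq(M, R.T, rcond=None)[0].T
    # Fix the users, refit every movie's attribute vector
    M = np.linalg.lstsq(U, R, rcond=None)[0].T

# U @ M.T now reproduces the ratings closely
```

On this idealized fully observed, exactly rank-k data the reconstruction error drops essentially to zero within a few sweeps; with sparse real ratings convergence is slower but the alternating structure is the same.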

...and you have no idea when it is done what the attributes mean!
