Presumably you already know how Netflix does that and you're asking this rhetorically. If you are actually interested in how Netflix evaluated submissions, you can read all about it on their contest page.
I thought about the Netflix example, which is obviously related, while writing my post. Ultimately, the Netflix competition isn't for a recommendation engine; it's for a prediction engine. They use that prediction engine to produce recommendations, but that's a separate phase. The top N predicted ratings != the best N things to recommend at this moment, though obviously it's useful to know predicted ratings for movies if you're writing a recommender.
A recommendation engine should not just recommend the apps with the highest predicted ratings, conditional on the user downloading them. As an extremely simple example, even if you habitually rate games much higher than other apps, it shouldn't recommend only games. You probably want to see other things sometimes too. The perfect set of apps on your phone would not be entirely games; you do like to use Twitter, after all, even if you're not entirely fond of any of the Twitter apps out right now.
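To make the games example concrete, here is a minimal sketch of the difference between "top N by predicted rating" and a list that limits how many apps come from one category. Everything in it is made up for illustration: the app names, the ratings, the categories, and the per-category cap are assumptions, and a cap is only one simple way to get variety, not a claim about how any real recommender works.

```python
from collections import Counter


def top_n_by_rating(predicted, n):
    """Return the n apps with the highest predicted rating."""
    return sorted(predicted, key=lambda app: predicted[app], reverse=True)[:n]


def top_n_with_category_cap(predicted, category, n, max_per_category):
    """Pick apps greedily by predicted rating, skipping categories that are full."""
    picked = []
    used = Counter()
    for app in sorted(predicted, key=lambda a: predicted[a], reverse=True):
        if used[category[app]] < max_per_category:
            picked.append(app)
            used[category[app]] += 1
        if len(picked) == n:
            break
    return picked


# Hypothetical predicted ratings for a user who rates games very highly.
predicted = {
    "game_a": 4.9,
    "game_b": 4.8,
    "game_c": 4.7,
    "game_d": 4.6,
    "twitter_client": 3.4,
    "news_reader": 3.1,
}
category = {
    "game_a": "games",
    "game_b": "games",
    "game_c": "games",
    "game_d": "games",
    "twitter_client": "social",
    "news_reader": "news",
}

pure_prediction = top_n_by_rating(predicted, 3)
capped = top_n_with_category_cap(predicted, category, 3, max_per_category=2)
print(pure_prediction)
print(capped)
```

With these numbers the first list is all games, while the capped list swaps the third game for the Twitter client even though its predicted rating is lower.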
The perfect recommendation engine would be psychic; it would know the set of apps on your phone that would make you maximally happy. That's obviously not the same as predicting exactly what you'd rate each app (though presumably a psychic engine could do that easily too).
So if I take the set of apps a user has rated or downloaded and give the app recommender only half of it to base its recommendations on, is that a prediction engine or a recommendation engine?
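For reference, the split described in this question could be sketched roughly as follows. The function names, the random half split, and the hit-rate score are assumptions chosen for illustration; the recommender itself is left out, so `recommended` stands in for whatever list it would return after seeing only the visible half.

```python
import random


def split_history(history, seed=0):
    """Shuffle a user's app history and split it into visible and hidden halves."""
    apps = sorted(history)  # sort first so the split is reproducible for a given seed
    random.Random(seed).shuffle(apps)
    mid = len(apps) // 2
    return apps[:mid], apps[mid:]


def hit_rate(recommended, hidden):
    """Fraction of the hidden apps that show up in the recommendations."""
    hidden_set = set(hidden)
    if not hidden_set:
        return 0.0
    return len(hidden_set & set(recommended)) / len(hidden_set)


# Hypothetical history for one user.
history = ["game_a", "game_b", "twitter_client", "news_reader", "maps", "notes"]
visible, hidden = split_history(history)

# A recommender would be given `visible` and scored against `hidden`.
print(visible, hidden)
```

Note that this scores how well the recommender recovers what the user already chose, which is not necessarily the same as the set of apps that would make them happiest.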