Hacker News
A movie recommendation service that actually works (nanocrowd.com)
49 points by noaharc on April 15, 2009 | 24 comments



Well, it didn't work for me. If it is any consolation, no recommendation service usually does.

Here is the problem: recommendation engines usually try to match common variables between the movie you like and other movies - kudos to nanocrowd for doing this at a more sophisticated level than "it has the same actor in it" - but most fail to weigh heavily enough the quality of the movie (Netflix is notoriously bad at this). More to the point, the movies that people really love have some personal connection with them that I am not sure is open to crowdsourcing.

I will now give you an example: I love the movie Pitch Black. Why? Yes, there's the action/horror tension, and the fabulous spaceship crash, and the charismatic lead - but the reason I love that movie is because of its kernel which is a highly moral tale about the salvation of caring for someone other than yourself, and of the powerful human need to seek redemption.

Now I am not going to reproduce this here, but go put Pitch Black into nanocrowd and you will see the problem: it focuses on the superficial properties of the plot, and none of the movies recommended come even close.

What is the closest movie in feel that I have seen? The Station Agent. Now the day I type "Pitch Black" into a recommendation engine and get back "The Station Agent" is the day I am going to sell all my worldly goods to buy stock in that company.

(That said it is safe to say most people aren't as picky as me, and I am sure you can find some success with this model especially if you harness it to something people visit a lot anyway, like IMDB or Netflix).


I find Netflix's recommendations to be far superior. Participants in the Netflix Prize have found that content-based filtering (such as "nano categories") actually hurts relevancy (in RMSE terms) when working with large data sets. That is, when you have a copious amount of data, it is usually better to let the data speak for itself. Similarly, it has been noted by Google and others that the better your data, the less impressive your algorithms need to be. Most people think Google's success is mainly attributable to magic algorithms, but most of their effort of late has been in increasing the quality and breadth of their data.

http://anand.typepad.com/datawocky/2008/03/more-data-usual.h...
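
For readers unfamiliar with the metric: RMSE (root mean squared error) measures how far predicted ratings fall from the ratings people actually gave, so lower is better. A minimal sketch follows; the ratings and the two "models" are made up for illustration and have nothing to do with real Netflix Prize data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up star ratings, purely for illustration.
actual = [4, 2, 5, 3]
model_a = [3.5, 2.5, 4.5, 3.0]  # hypothetical predictions from one model
model_b = [3.0, 3.0, 3.0, 3.0]  # hypothetical "always guess 3" baseline

print(rmse(model_a, actual))
print(rmse(model_b, actual))
```

On this toy data the first model scores a lower RMSE than the constant baseline, which is the sense in which one approach "hurts" or "helps" relevancy in the comment above.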


I was under the impression that Netflix looked at your ratings, found people "similar" to you, and made recommendations based on that. Since signing up for Netflix, I've found that at least on the UI side, it's as you describe: driven by actor, genre, etc.

The beauty of the raw data approach is that it finds people with similar preferences to you. This is, I think, the real manifestation of O'Reilly's "Web 2.0." Rather than a semantic web with ontologies and categorization, the data do, indeed, speak for themselves.
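
A rough sketch of that "find people like you" idea, assuming a plain cosine similarity over co-rated movies and a single nearest neighbour. This is not Netflix's actual algorithm; the users, ratings, and function names are invented for illustration.

```python
import math

# Made-up ratings (user -> {movie: stars}); not real Netflix data.
RATINGS = {
    "you":   {"The Matrix": 5, "Pitch Black": 5, "Meteor": 1},
    "alice": {"The Matrix": 5, "Pitch Black": 4, "Meteor": 1,
              "The Station Agent": 5},
    "bob":   {"The Matrix": 1, "Pitch Black": 2, "Meteor": 5, "Rambo": 5},
}

def similarity(a, b):
    """Cosine similarity over the movies both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = math.sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = math.sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings):
    """Unseen movies rated by the single most similar user, best first."""
    mine = ratings[user]
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: similarity(mine, ratings[u]))
    unseen = {m: r for m, r in ratings[nearest].items() if m not in mine}
    return sorted(unseen, key=unseen.get, reverse=True)

print(recommend("you", RATINGS))
```

Note that nothing here knows about actors or genres; the only input is who rated what. A real system would use many neighbours and mean-centred ratings, which this sketch skips.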

Even if we were able to extract categorizations based on the preferences in the underlying Netflix data, it'd be difficult to map them to actual categories that we're familiar with. I'd envision it more like a PCA decomposition, where the principal components will be the strongest characteristics among each clique of like-minded movie watchers.
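
To make the PCA idea concrete, here is a toy sketch that finds the first principal component of a tiny ratings matrix by power iteration on the movie-by-movie covariance. The matrix is made up, the two "cliques" are an assumption baked into the data, and a real decomposition would use a numerical library rather than this hand-rolled loop.

```python
import math

# Made-up ratings: rows are viewers, columns are four movies.
# Viewers 0-1 and viewers 2-3 form two hypothetical cliques.
RATINGS = [
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
]

def first_principal_component(rows, iterations=100):
    """Leading eigenvector of the movie-by-movie covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Arbitrary start; power iteration assumes it is not orthogonal
    # to the leading eigenvector and that the covariance is non-zero.
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

component = first_principal_component(RATINGS)
print([round(x, 3) for x in component])
```

The first two movies load with one sign and the last two with the opposite sign, so the component separates the two taste groups. It does not come with a human-readable label, which is exactly the mapping difficulty described above.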

But alas, my Netflix home page is filled with crappy movies (the Matrix has a near-5 rating, but most other movies with its actors are complete crap). Instead I rely on friends with similar preferences for recommendations... which is what Netflix was supposed to offer. If that functionality is hidden somewhere, then they need to do a better job of exposing it.


Content-based data may not help with the Netflix Prize, but it does help people quickly drill down to what they want to see. You can pick a "nano category" based on the meaning of the words in it; rating data can be superior, but it cannot be used interactively in that way.

I've thought of doing a service like this movie search engine, maybe I still will. Search engines don't really leverage interactivity as much as they should. Consider a simple example: I choose Pi as a movie I like and Drama genre and get a search box to search within dramas similar to Pi. Providing context can go a long way towards getting what the user wants.


Same thing here. I put in Godfather II, which is one of my all-time favorites, and of course I got a suggestion for Godfather III, which I consider a totally stupid movie. Apart from this I got suggestions for a bunch of gangster movies. I couldn't care less about the gangster theme; it's the quality of directing, acting and dialogue that makes Godfather II great. Ironically enough, Godfather III is from the same director and has the same lead actor, which supports even more the point that similarity of properties is not a good predictive factor in movie recommendation.


This is such a perfect example of what I was talking about that I shall hereafter think of this as the Godfather II problem :-)

In a way the real triumph in this area is Pandora / Music Genome Project. It is interesting to think about whether and how such a thing could translate to movies - in other words a visual and verbal medium. I certainly don't think micro-genres is the right direction.


> Well, it didn't work for me.

Me neither, and "The Wrestler" was quite a glaring omission.

I also agree that the "micro-genres" feature is flawed. For example, I searched for "Gwoemul" and none of the micro-genres it returned matched the reason I liked the movie - the deliciously dark humour.

When I searched for Eight Legged Freaks though it returned "terrorize goofy carnage" which is a close enough, if over-simplified, description of Gwoemul.

I suppose it boils down to the audience size as well. More people (on that site) are likely to have seen Eight Legged Freaks because it's older, and because it's a Western movie (Gwoemul is Korean).


Ditto. I asked for movies like WarGames, military, strategic, nuclear, and it was way off. It gave me movies like Meteor. I was expecting things like By Dawn's Early Light, Threads maybe, things like that.

Sorry, complete miss.


The autocomplete is nice. The selection is lacking. There seem to be virtually no foreign films, for example. The most important implication of this is that I've already seen everything it recommends (but that at least means that the recommendation algorithm is on target).

Making the user click on a nanogenre after entering a movie is unnecessary - you could show at least a partial list of all of them instead (and maybe show more of a particular list if you click on it).

Overall, I like clerkdogs better, mainly due to the wider selection.


The problem I see is that it forces the user to pick a 'nanogenre', a set of 3 arbitrary characteristics - what if I'm looking for a mix of characteristics from within the different nanogenres? It won't let me choose the exact mix of characteristics I want and instead restricts me to the ones it displays.

I understand that recommendation engines need to work within certain predefined parameters, but that's exactly why they'll usually disappoint - you can't categorise a person's preferences into predefined parameters. Most of the time, there's no real reason why someone likes a movie and hates a logically related movie.

Personally, I prefer clerkdogs.com


I wonder if this suffers from the Napoleon Dynamite Problem:

http://www.nytimes.com/2008/11/23/magazine/23Netflix-t.html?...


Curious... From the end of the article:

Hastings is even considering hiring cinephiles to watch all 100,000 movies in the Netflix library and write up, by hand, pages of adjectives describing each movie, a cloud of tags that would offer a subjective view of what makes films similar or dissimilar. It might imbue Cinematch with more unpredictable, humanlike intelligence.

This appears to describe what the linked search engine is doing.


I searched on some of my favorite movies and was very impressed with the accuracy of the results. I've seen 95% of the movies it recommended, but I loved almost all of them. I think this would be good for finding movies that I would like to watch that are outside of my normal genres.


The site looks great, and your auto-complete is amazing! How can IMDB not have that?

However, I entered Sweeney Todd. I get a mostly blank page and am asked to pick a sub-genre, none of which really fits what I'm looking for (dark & musical). So I try the sub-genre thing, and it just isn't working. But then I see the left column with "movies most like". I'm assuming that is the main feature of the site. So why on earth do you not put that front and center, and let me pick a sub-genre afterwards if I want one?

Aside from that, I think the service is pretty good.


Strange. My first reaction is that the auto-complete was so bad that it, alone, would keep me from ever returning. I typed in "Ran" (the last film I saw), and then I had to click a tiny down-arrow about 25 times, and the down-arrow jumps around as you go. What takes "3 letters + return + click" on IMDB, took "3 letters + 25 precise clicks + return" on Nanocrowd. Autocomplete is usually an optional assist; here, it's more like a mandatory in-place search system with awkward controls.

I would have left a note about this (and other issues), but their only feedback mechanisms seem to be email or logging in to Blogger.

It may be the coolest recommendation algorithm ever, but from these first two things I tried, the interface seems fairly high-overhead. You need to hook me before I'll go for high-overhead. You need to convince me that you're more valuable than, say, simply listing other films by the same director. For "Ran", Nanocrowd recommends "Rambo" -- 'nough said. :-)


I think if they made their tag cloud (on the right in B&W) clickable, then you could select 'dark' and 'musical' and have the site suggest movies based on that. I think that would be a pretty sweet way to suggest movies.
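
A minimal sketch of what that clickable tag cloud could do behind the scenes: treat each click as adding a tag to a set and return the movies that carry every selected tag. The catalogue and tags below are hypothetical stand-ins, not the site's real data.

```python
# Hypothetical tag data; the site's real tags and catalogue are not known here.
CATALOG = {
    "Sweeney Todd": {"dark", "musical", "gothic"},
    "Amadeus": {"lavish", "musical", "historical"},
    "Pitch Black": {"dark", "action", "horror"},
}

def movies_with_tags(catalog, selected):
    """Titles whose tag set contains every selected tag, sorted by name."""
    wanted = set(selected)
    return sorted(title for title, tags in catalog.items() if wanted <= tags)

print(movies_with_tags(CATALOG, {"dark", "musical"}))
print(movies_with_tags(CATALOG, {"musical"}))
```

Requiring every tag (a subset test) narrows the list with each click; ranking by the number of matching tags instead would be a softer alternative when nothing matches all of them.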


It appears that the auto-complete is filtering from a local cache, thus initially downloading a list of all the movies in their database, rather than posting back to the server.
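
If that guess is right, the pattern would look roughly like the sketch below: the full title list is fetched once, and every keystroke is answered by filtering it in memory. The title list, the function name, and the shortest-first ordering are all assumptions for illustration, not the site's actual code.

```python
# Hypothetical stand-in for the title list the page downloads once.
CACHED_TITLES = ["Pitch Black", "Rambo", "Ran", "Ransom", "The Iron Giant"]

def suggest(titles, typed, limit=10):
    """Case-insensitive prefix match against the cached list, no server call."""
    prefix = typed.strip().lower()
    if not prefix:
        return []
    matches = [t for t in titles if t.lower().startswith(prefix)]
    # Shorter titles first, so an exact match such as "Ran" tops the list.
    return sorted(matches, key=lambda t: (len(t), t))[:limit]

print(suggest(CACHED_TITLES, "ran"))
print(suggest(CACHED_TITLES, "ra"))
```

The trade-off is a larger initial download in exchange for suggestions that need no round trip per keystroke.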


I guess it's Pandora's Music Genome Project for movies, which is cute, but presenting me with buckets doesn't really work.

http://nanocrowd.com/genre/nanogenre/id/3629 - The Iron Giant is all of these things, not just one.


Well I think that's the idea. It lets you choose related movies based on specific aspects of the seed, so that what it returns is as well-tailored to your appetite as possible.

Just giving a seed is much more difficult (see: Netflix Prize).


The UI doesn't make it clear, but if you don't want to use the Nanogenre thing, use the 'most like' bit on the left. I think the UI would be improved by making the Nanogenres and 'most like' more equal, as they are both useful but in different ways.


Wow! I think it indeed does. I clicked on the link thinking "sure... another recommendation engine" but I was surprised at the effectiveness of the method they use to make the suggestions: "3-word nanogenre." I searched for one of my favorite movies 'MirrorMask', clicked on 'fantasy, wondrous, surreal' nanogenre, and I got a list of films, many of which I loved:

http://nanocrowd.com/movie/genremovies/genreId/1814/movieId/...


There really hasn't been a good movie recommendation engine since LikeMinds got sold to IBM and their MovieCritic.com site was shuttered.

One of the original brains behind their collaborative filtering technology launched a similar site a few years ago at moviepig.com, but sadly the entire thing is done in Flash and the design is so awful that it eclipses the fact that it makes pretty solid movie recommendations after you rank order a couple dozen movies. Worth a try.


Hmm. Amadeus is a "lavish music musical" and a "lavish historic historical".


Nicely done.



