Friday, January 23, 2009

Art Genomes?

Structuralist and post-deconstructionist and neo-post-pseudo-intellectualist-structuralites have been advancing the idea that we can classify and thus, to some extent, interpret or evaluate works of art by their semiotics. Briefly, this means we think of these art works as systems of signs that stand for things, and then consider the way the signs are configured within a work of art, and how that might reflect something about the things the signs stand for. Language is a semiotic system ... we use words to stand for things, and in writing, the combinations and sequences of words say something about the subject we're writing about. At least, we hope they do. Much of the 20th-century philosophizing about this dealt with movies or, as it's more pompously known, the art of cinema.

That's all well and good, but now there's a very practical problem which is leading to a new way of thinking about this. Basically, the problem is that you've been listening to music through Pandora, or renting movies via Netflix, and you'd like to know what other music or movies would appeal to you.

These online content sources have tons of movies and music, but the problem is how to classify them so that you can identify which ones might appeal to audience members with certain tastes. Netflix, for example, asks you to rate each movie you rent, and over time, it builds up a database of what movies you've liked.

But there's still the similarity problem: If your favorite movies were Ishtar and Gigli, how do you decide which other movies are similar?

One promising answer is the movie genome. Basically, the idea is to identify a slew of properties that a movie can have, like plot, cast, awards, box office success, etc. Comparisons of movies on the basis of their genomes are likely to identify good matches and non-matches.

Now, you might think "Isn't this just a fancy way of comparing all these properties? Calling them a genome doesn't really change anything."

The answer is "Well, yes." It is just a fancy way of comparing these things. But treating all these properties as genes accomplishes two things:
  1. It makes the classification of movies more systematic, and
  2. It makes it possible to use some elaborate algorithms for doing the comparisons, finding near-matches, etc.
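
To make those two points a little more concrete, here is a minimal sketch of what comparing genomes might look like. Everything in it is an assumption made up for illustration: the movie titles, the three "genes" (genres, cast, box office), the weights, and the choice of Jaccard overlap for set-valued genes. It is not how any real service computes its matches.

```python
# Hypothetical genomes: each movie maps gene name -> value.
# Set-valued genes are compared by Jaccard overlap; numeric genes
# are assumed to be pre-scaled to the range 0..1.

def jaccard(a, b):
    """Overlap of two sets: 1.0 if identical, 0.0 if disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def gene_similarity(x, y):
    """Similarity of a single gene, in the range 0..1."""
    if isinstance(x, (set, frozenset)):
        return jaccard(x, y)
    return 1.0 - abs(x - y)  # numeric gene, assumed in [0, 1]

def genome_similarity(g1, g2, weights):
    """Weighted average of per-gene similarities."""
    total = sum(weights.values())
    return sum(w * gene_similarity(g1[k], g2[k]) for k, w in weights.items()) / total

def nearest(title, catalog, weights, n=2):
    """Titles of the n movies whose genomes are closest to `title`."""
    target = catalog[title]
    scored = [
        (genome_similarity(target, genome, weights), other)
        for other, genome in catalog.items()
        if other != title
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:n]]

# Invented catalog and weights, purely for illustration.
catalog = {
    "Desert Caper": {"genres": {"comedy", "adventure"}, "cast": {"actor_a", "actor_b"}, "box_office": 0.1},
    "Dune Buddies": {"genres": {"comedy", "adventure"}, "cast": {"actor_a", "actor_c"}, "box_office": 0.2},
    "Cold Harbor": {"genres": {"drama"}, "cast": {"actor_d"}, "box_office": 0.8},
}
weights = {"genres": 2.0, "cast": 1.0, "box_office": 1.0}

print(nearest("Desert Caper", catalog, weights, n=1))
```

With these invented numbers, the closest genome to "Desert Caper" is "Dune Buddies", since the two share genres and one cast member. The systematic part is that every movie is described by the same genes; the algorithmic part is that once they are, finding near-matches is just a ranking by similarity score.
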
Obviously this genome idea could also be applied to books, plays, paintings, etc. It may lead to some new ideas about how to classify and interpret works in these art forms.

Or not.


Phoebe said...

Interesting thoughts on the complexity of recommendations. Jinni recently opened in private beta and offers search and recommendations from the Movie Genome. Have a look and see what you think!

Phil H said...

The problem with using 'facts' as a genome is that it is ignorant of the quality of the film - whether the crew gel, the funding is cut, they had to get it done for Christmas release... You know all the information it uses (director, cast) before the film is made.

Netflix' algorithms compare films without knowing any of that information, because the maths allows you to separate out which films are similar and which users are similar - so films that people similar to you like are the best suggestions.
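
As a rough illustration of the "people similar to you" idea, here is a small sketch of neighbourhood-based collaborative filtering. To be clear about the assumptions: the users, films, and ratings are invented, the similarity measure (Pearson correlation over shared films) is just one common choice, and this is neither Netflix's actual algorithm nor Singular Value Decomposition. It only shows that recommendations can come from ratings alone, with no facts about the films.

```python
from math import sqrt

# Made-up ratings on a 1-5 scale: user -> {film: rating}.
ratings = {
    "you":   {"Film A": 5, "Film B": 4, "Film C": 1},
    "alice": {"Film A": 5, "Film B": 5, "Film C": 1, "Film D": 5, "Film E": 2},
    "bob":   {"Film A": 1, "Film B": 2, "Film C": 5, "Film D": 1, "Film E": 5},
}

def pearson(a, b):
    """Pearson correlation of two users over the films both have rated."""
    shared = [film for film in a if film in b]
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[f] for f in shared) / len(shared)
    mean_b = sum(b[f] for f in shared) / len(shared)
    dot = sum((a[f] - mean_a) * (b[f] - mean_b) for f in shared)
    norm_a = sqrt(sum((a[f] - mean_a) ** 2 for f in shared))
    norm_b = sqrt(sum((b[f] - mean_b) ** 2 for f in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(user, ratings, n=1):
    """Rank films the user has not rated by similarity-weighted average rating."""
    mine = ratings[user]
    totals = {}  # film -> (weighted rating sum, similarity sum)
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = pearson(mine, theirs)
        if sim <= 0:
            continue  # ignore users whose tastes are unrelated or opposite
        for film, rating in theirs.items():
            if film in mine:
                continue
            num, den = totals.get(film, (0.0, 0.0))
            totals[film] = (num + sim * rating, den + sim)
    ranked = sorted(totals, key=lambda f: totals[f][0] / totals[f][1], reverse=True)
    return ranked[:n]

print(recommend("you", ratings))
```

In this toy data, "alice" rates films much as "you" do and "bob" rates them the opposite way, so only alice's opinions count and her favourite unseen film, "Film D", comes out on top.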

In essence, the 'genome' of a film lacks any information about the quality of the film from the perspective of users like you. So how can it be as good at predicting?

For further info, look up the Netflix prize and Singular Value Decomposition.

Unknown said...

For one thing, the "genome" could include information about audience and critical responses to a film. That's as close to an objective measure of "good" as we can define, I think. Of course, you may disagree with audiences and critics. That's where the "people similar to you" angle comes in. People who like the same movies as you (over a large enough sample) may be a good predictor of what movies you'll like.