What constitutes a 4-star movie? What is a thumbs up? Can you compare two 4-star movies directly without some kind of qualifier?
The addition of any kind of hard metric into an exercise in personal perspective is often futile, resulting in apples-to-oranges comparisons. Does Ebert really put WarGames on the same playing field as Apocalypse Now? If pressed, he’d probably say no, and point to the fact that one is in his “Great Movies” pantheon, while the other is merely a somewhat-dated 4-star movie.
It is in the aggregate, though, that a ratings system becomes more useful, as Netflix, with their million-dollar prize for an improved recommendation engine, already knows. While reading the guts of a review is still the only way to find out for sure what a reviewer’s thoughts were (it’s difficult to fit subtlety into a thumb), you can get a feel for the types of movies that are likely to garner praise from him when you take an overview of his ratings and compare them against your own experiences. For instance, if a reviewer gives I Am Legend and Mr. & Mrs. Smith four-star reviews, while also giving Pride and Prejudice a two-star review, you can assume 1) that he won’t like Atonement and 2) that he needs his lithium dosage adjusted.
What this shows is that a rating metric can be useful, but only as regards a specific reviewer. Sites like Metacritic and Rotten Tomatoes attempt to find a weighted mean for reviews, converting a metric like a thumb or star from an individual review to a 100% scale, and then combining a multitude of these converted metrics for a given movie to obtain one all-knowing, all-seeing metric, in a process that no doubt involves higher math, advanced heuristics, and the witch-woman from Prince of Thieves. And really, all this is useful for is finding out that 80% of the reviewing public disagrees with you.
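Stripped of the witch-woman, the aggregation boils down to simple arithmetic. Here is a rough sketch of the idea: rescale each rating to 0–100, then take a weighted mean. The scales, ratings, and weights below are invented for illustration; the real sites keep their actual formulas to themselves.

```python
def normalize(rating, scale_max):
    """Convert a rating on an arbitrary scale to 0-100."""
    return rating / scale_max * 100

def weighted_mean(reviews):
    """reviews: list of (rating, scale_max, weight) tuples."""
    total_weight = sum(w for _, _, w in reviews)
    return sum(normalize(r, s) * w for r, s, w in reviews) / total_weight

# Hypothetical reviews of one movie, on three different scales:
reviews = [
    (3.5, 4, 1.0),   # 3.5 out of 4 stars
    (4,   5, 1.5),   # 4 out of 5 stars, weighted more heavily
    (1,   2, 1.0),   # one thumb out of a possible two
]
score = weighted_mean(reviews)  # roughly 74
```

The weights are where the "advanced heuristics" hide: decide that one critic matters more than another and the all-seeing metric shifts accordingly.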
No, the only context in which ratings are useful is the context of that reviewer’s œuvre of reviews, and even then only if that reviewer is consistent in his rating system. For instance, Roger Ebert has 40 years’ worth of well-written reviews, all on a four-star scale. That’s a lot of raw data to go by. Richard Roeper, on the other hand, has 20-some years’ worth of reviews, alternating between a five- and four-star scale, depending on whether he’s writing for his own personal blog or the Sun-Times. That’s less useful, because a five-star scale doesn’t fit neatly into a four-star scale, for reasons obvious to anyone with a basic grasp of spatial reasoning.
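For the spatially unreasonable, the mismatch is easy to demonstrate. Assuming a straight linear rescale (my assumption; there is no official conversion), most five-star ratings land between the four-star notches:

```python
def five_to_four(stars):
    """Linearly rescale a 1-5 rating onto a 1-4 scale."""
    return 1 + (stars - 1) * 3 / 4

for s in range(1, 6):
    print(s, "->", five_to_four(s))
# 1 -> 1.0, 2 -> 1.75, 3 -> 2.5, 4 -> 3.25, 5 -> 4.0
# Only the endpoints land on whole stars; 3 becomes a half-star,
# and 2 and 4 fall into no-man's-land between the notches.
```

So a Roeper four-out-of-five is neither a three- nor a three-and-a-half-star review; it's a 3.25, which exists on nobody's scale.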
As such, reviews on this site will be following a strict metric, making it easier for future generations interested in statistically estimating the exact point at which Western Civilization began to fail to pinpoint it (NOTE: the correct answer is Dogma). The metric will be based on the Netflix method of a five-star scale, with no halvsies. The reasoning behind this is that a rating will be applied to the Netflix queue anyway, so I might as well carry it over. As I said earlier, you can’t fit subtlety in a thumb, so the missing half-stars can be gleaned from the tone of the review.
There will be one nod to the innate human desire to contextualize everything. Reviews will be filed under various categories: Blockbuster, Arthaus, and General, as well as Home and Theater. This acknowledges that objectivity often goes out the window depending on the circumstances under which you see a movie.
Case in point, had I seen Transformers 2 alone in my living room (or even alone in a quiet theater), I doubt I’d have given it as glowing a review as I did. Instead, I saw it on a 30′ screen in a packed theater with a good sound system. The review is as much a review of the circumstances and expectations as it is of the movie, but then that is, or should be, the case with all reviews. Movies like TF2 are made to play on a big screen in front of a loud audience, and when a movie succeeds at what it was made to do, that movie should get a positive review.