By Sharon Begley
(Reuters) - In the escalating battle of big data vs. human experts, score another win for numbers.
The most accurate predictions of which movies the U.S. Library of Congress will deem "culturally, historically, or aesthetically significant" come not from critics or fans but from a simple algorithm applied to a database, according to a study published on Monday.
The crucial data, scientists reported in Proceedings of the National Academy of Sciences, are what the Internet Movie Database (IMDb.com) calls "Connections" - films, television episodes and other works that allude to an earlier movie.
For the 15,425 films in IMDb.com examined in the study, the measure most predictive of which ones made it into the Library of Congress's National Film Registry, which honors "significant" movies, was the number of references to a film by other films released many years later.
The 1972 classic "The Godfather," for instance, is referred to by 1,323 films and television episodes, which as recently as 2014 quoted the "offer he can't refuse" line, referred to the famous horse-head scene, or played the theme music. "Godfather" made the registry in 1990.
The number of references to a film more than 25 years after its release was a nearly infallible predictor of whether it would make the registry, topping 91 percent accuracy, said applied mathematician and study author Max Wasserman of Northwestern University.
Critics' judgments, Oscar wins, and box-office numbers did not come close.
Films are nominated for the registry by the public and chosen by the Librarian of Congress in consultation with a board of experts including critics, academics, directors, screenwriters and other industry insiders.
By the 25-year-lag rule, the 1971 box-office disappointment "Willy Wonka & the Chocolate Factory" should be in the registry: IMDb lists 52 long-lag citations to it, the 37th most in the Northwestern analysis.
In December, six months after the scientists submitted their paper, the Library added "Willy Wonka" to the list of 650 cinematic immortals, just as the research predicted.
"Experts have biases that can affect how they evaluate things," said physicist and co-author Luis A.N. Amaral of Northwestern. "Automated, objective methods don't suffer from that. It may hurt our pride, but they can perform as well as or better than experts."
Other movies identified by the Northwestern algorithm as likely to make the registry include "Dumbo," "Spartacus" and "The Shining."
Of course, humans are not entirely superfluous: flesh-and-blood creators must decide to refer to an earlier gem in order to establish the crucial IMDb "connections."
(Reporting by Sharon Begley; Editing by Nick Zieminski)