
eView: Why the Netflix Prize Is a Good Start to Personalized Recommendations

September 10, 2009 By Darren Erik Vengroff

Netflix created the $1 million Netflix Prize in 2006 as a way to reward developers of a next-generation film-rating prediction algorithm.

The idea sounded seductively simple: If developers could predict how users would rate a film, they could use that prediction to decide whether to recommend the film to those users.

The winner would be the first entrant to develop an algorithm capable of predicting user ratings of films at least 10 percent more accurately than Netflix’s internally developed Cinematch algorithm. Three years and 43,000 entries later, from more than 5,100 teams in 185 countries, a winner appears to have emerged: the contest is down to two teams, and the result will be officially announced this month.
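To make the 10 percent target concrete, here is a minimal sketch of how an improvement in prediction accuracy can be measured. It assumes accuracy is scored by root-mean-square error (RMSE), the metric the contest used. The function names and the ratings are invented for illustration and are not taken from the contest data.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def percent_improvement(baseline_rmse, candidate_rmse):
    """How much lower the candidate's error is, as a percentage of the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Made-up ratings on a 1-5 star scale (assumption, for illustration only).
actual = [4, 3, 5, 2, 4]
baseline = [3.0, 3.0, 3.0, 3.0, 3.0]    # stand-in for an existing system
candidate = [3.5, 3.0, 4.0, 2.5, 3.5]   # stand-in for a contest entry

gain = percent_improvement(rmse(baseline, actual), rmse(candidate, actual))
print(f"improvement over baseline: {gain:.2f}%")
```

An entry qualifies under this kind of rule when the computed improvement reaches the threshold, here 10 percent.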

Staking their claims
The data set provided to entrants was simple: it consisted of 100 million movie ratings given by 480,000 users to 18,000 movies over a seven-year period. Some users rated hundreds or even thousands of films; others rated just a handful.

For the movies, Netflix provided only the title and year of release. There was no information about stars, type of film (sci-fi, romance, comedy, etc.), director or Motion Picture Association of America rating.

Nonetheless, the large volume of data, and the fact that it came from a real live consumer-facing business, was incredibly exciting to computer science researchers. And the chance of winning $1 million probably didn’t hurt, either.

Thousands of teams entered to stake their claims to the prize. Some were teams of machine-learning Ph.D.s from prestigious institutions, while others were hobbyists working out of their basements. Press and blog coverage of the competition elevated some of the contestants into geek superstars.

The wrong direction
Much of the coverage, however, failed to fully understand how constrained the competition was. Consider how it compares to commerce in the offline world. If you walk into your local video store and ask a clerk for a recommendation, most likely the clerk would ask you what kind of films you like. If you like romantic comedies, he might recommend Hugh Grant’s latest opus. If you like action, "Transformers" would be the recommendation for you.

Netflix, however, didn't provide contestants with the kind of information that customers normally would give a video store clerk. Initially, contestants had to construct recommendations, or “mine” them, from the simple raw data. The teams employed a variety of mathematical techniques to identify sets of films that appeared to be similar.

If an algorithm could determine that films A, B, C, D and E are very similar, for example, then someone who rated A, C and D highly would probably also like B and E. Soon, entrants began pulling data from other sources, like the IMDb website, to help identify other attributes of the films that could prove useful.
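The similar-films idea can be sketched in a few lines. The example below uses cosine similarity between films' rating vectors, which is one common technique and not necessarily what any contest team used. The ratings, user IDs and function names are all hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical ratings (user -> {film: stars}), invented for illustration.
RATINGS = {
    "u1": {"A": 5, "B": 5, "C": 4, "D": 5, "E": 4},
    "u2": {"A": 4, "B": 5, "C": 5, "D": 4, "E": 5},
    "u3": {"A": 1, "C": 2, "F": 5},
    "u4": {"A": 5, "C": 5, "D": 4},  # has not rated B, E or F
}

def film_vectors(ratings):
    """Invert user -> film ratings into film -> {user: stars}."""
    vectors = defaultdict(dict)
    for user, films in ratings.items():
        for film, stars in films.items():
            vectors[film][user] = stars
    return dict(vectors)

def cosine(v1, v2):
    """Cosine similarity of two sparse rating vectors (missing ratings count as 0)."""
    dot = sum(v1[u] * v2[u] for u in v1.keys() & v2.keys())
    norm = (math.sqrt(sum(x * x for x in v1.values()))
            * math.sqrt(sum(x * x for x in v2.values())))
    return dot / norm if norm else 0.0

def recommend(user, ratings, liked_at=4):
    """Rank films the user has not rated by similarity to films they rated highly."""
    vectors = film_vectors(ratings)
    seen = ratings[user]
    liked = [film for film, stars in seen.items() if stars >= liked_at]
    scores = {
        film: sum(cosine(vec, vectors[fav]) for fav in liked)
        for film, vec in vectors.items()
        if film not in seen
    }
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("u4", RATINGS))
```

In this toy data, the user who rated A, C and D highly sees B and E ranked ahead of F, because B and E were rated by the same users in much the same way as A, C and D.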

On June 26, a team of researchers calling themselves BellKor’s Pragmatic Chaos made the first claim of winning the $1 million prize with a 10.05 percent improvement over Cinematch. This triggered a 30-day window in which other teams could take a final shot at the prize. On July 25, just as the window was about to close, a team called The Ensemble submitted an entry with a 10.09 percent improvement. The contest is now closed, and judges are doing their final evaluations of the two potential winners.

In search of relevant recommendations
The entrants, particularly the stronger teams, have made tremendous advances in their ability to mine very subtle relationships out of the data. However, their advances are still insufficient to deliver the holy grail: relevant, high-quality recommendations.

Most retailers have far more data about their products in their catalogs than what Netflix gave contestants. This data — the top sellers in each category, color and size of a shirt, or a customer’s historical purchase behavior — can be used to make even more accurate recommendations. Beyond catalog data, continuous feedback about how people react to recommended products (ignore, click or buy) can be invaluable in improving the quality of recommendations.

Real-time context is also critical. If a user who normally prefers historical biopics is browsing in the action and adventure category, then that bit of context can make a tremendous difference in the quality of the recommendations in that moment.
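One simple way to act on that kind of context is to re-rank a user's long-term recommendations using the category being browsed right now. The sketch below is only an illustration: the titles, the scores, the `boost` factor and the `rerank` function are all assumptions, and production systems weigh context in far more sophisticated ways.

```python
def rerank(candidates, browsing_category=None, boost=2.0):
    """Sort (title, category, base_score) tuples, boosting the category in view."""
    def adjusted(item):
        _title, category, score = item
        return score * boost if category == browsing_category else score
    # sorted() is stable, so ties keep their original relative order.
    return sorted(candidates, key=adjusted, reverse=True)

# Hypothetical long-term scores for a user who normally prefers biopics.
candidates = [
    ("Biopic One", "biopic", 0.9),
    ("Action One", "action", 0.6),
    ("Biopic Two", "biopic", 0.8),
    ("Action Two", "action", 0.5),
]

# Without context, the biopics lead; while browsing action, action titles move up.
print([title for title, _, _ in rerank(candidates)])
print([title for title, _, _ in rerank(candidates, browsing_category="action")])
```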

The Netflix Prize winners, whoever they may be, will be remembered as valuable contributors to the field of recommendations. But achieving what customers really want (the kind of relevant recommendations they’d get from a clerk or trusted friend) requires deeper analysis of the right contextual inputs and real-time feedback loops.

Darren Erik Vengroff is the chief scientist at richrelevance, a San Francisco-based provider of personalization and product recommendation tools for enterprise-class e-commerce sites. Reach Darren at vengroff@richrelevance.com.


 
