Gravity @ Netflix Prize

Our team started working on the Netflix Prize problem in late October 2006. Our first top-40 position dates back to 9th December 2006 (RMSE 0.9116, a 4.18% improvement over Netflix's Cinematch baseline). We were among the top 10 competitors on 28th December 2006 (RMSE 0.9017, improvement 5.22%), and reached the top on 18th January 2007 (RMSE 0.8887, improvement 6.59%). Apart from a few short breaks and one longer one (29th March, 6th April, 26th April-12th May, 15-16th May, 22nd May), we stayed at the top until 8th June 2007. Later we regained the lead with our joint team "When Gravity and Dinosaurs Unite" on 30th September 2007 (RMSE 0.8717, improvement 8.38%), unfortunately only for one day. We returned to the top on 7th February 2008 (RMSE 0.8691, improvement 8.65%) and stayed there until 1st March 2008.

From early January 2007 onward, our team was always among the top 5 individual teams. As detailed above, Gravity also participated in a few collaborative teams. We finished in first place with The Ensemble, in 3rd place with GPT, in 14th place with Gravity (5th individual team), and in 27th place with When Gravity and Dinosaurs Unite. The graph below summarizes our performance in the Netflix Prize contest. Here you can check out the leaderboard.

  

We have published a few papers on our algorithms:

  1. G. Takács, I. Pilászy, B. Németh, and D. Tikk. On the Gravity Recommendation System.
    In Proc. of KDD Cup Workshop at SIGKDD'07, 13th ACM Int. Conf. on Knowledge Discovery and Data Mining,
    pp. 22-30, San Jose, CA, USA, August 12-15, 2007. [Article, Bibtex]
  2. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Major components of the Gravity Recommendation System.
    ACM SIGKDD Explorations Newsletter, 9(2), pp. 80-83, 2007. [Article, Bibtex]
  3. G. Takács, I. Pilászy, B. Németh, and D. Tikk. A Unified Approach of Factor Models and Neighbor Based Methods for Large Recommender Systems.
    In 1st IEEE Workshop on Recommender Systems and Personalized Retrieval, Ostrava, Czech Republic, August 4, 2008. [Article, Bibtex]
  4. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Investigation of Various Matrix Factorization Methods for Large Recommender Systems.
    In 2nd Netflix-KDD Workshop, Las Vegas, NV, USA, August 24, 2008. [Article, Bibtex]
  5. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Matrix Factorization and Neighbor Based Algorithms for the Netflix Prize Problem.
    In 2nd ACM International Conference on Recommender Systems, Lausanne, Switzerland, October 25, 2008. [Article, Bibtex]
  6. G. Takács, I. Pilászy, B. Németh, and D. Tikk. Scalable collaborative filtering approaches for large recommender systems, Journal of Machine Learning Research, 10 (2009), 623-656. March 31, 2009. [Article, Bibtex]
  7. I. Pilászy and D. Tikk. Computational Complexity Reduction for Factorization-Based Collaborative Filtering Algorithms, EC-Web 2009, Accepted. [Article, Bibtex]
  8. I. Pilászy and D. Tikk. Recommending New Movies: Even a Few Ratings Are More Valuable Than Metadata, In 3rd ACM International Conference on Recommender Systems, New York, NY, October 22-25, 2009, Accepted. [Article, Bibtex]

Other resources:


Check out the source code and binaries of our program that calculates linear regression.

We evaluate our methods on a particular 1/10 of the Probe set, which we call Probe10. An interesting property of Probe10 is that methods trained on all data excluding Probe10 get almost the same RMSE on Probe10 as on Quiz. Here is a Perl script that creates Probe10 from probe.txt.
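For readers who want to try a similar hold-out evaluation, the following is a minimal Python sketch of the idea. It is not our Perl script and does not reproduce the actual Probe10 selection: the seeded random choice of one tenth of the pairs is an assumption made purely for illustration, and the function names `parse_probe`, `split_tenth`, and `rmse` are hypothetical. The parser assumes the probe.txt layout distributed with the contest data, that is, a `movie_id:` header line followed by one customer id per line.

```python
import math
import random


def parse_probe(lines):
    """Parse probe.txt-style lines into (movie_id, customer_id) pairs.

    Assumed layout: a 'movie_id:' header line, then one customer id per line.
    """
    pairs = []
    movie_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            movie_id = int(line[:-1])
        else:
            if movie_id is None:
                raise ValueError("customer id found before any movie header")
            pairs.append((movie_id, int(line)))
    return pairs


def split_tenth(pairs, seed=0):
    """Return (held_out, rest), with held_out a seeded random 1/10 of pairs.

    Assumption: the real Probe10 is a *particular* subset; this random
    selection only illustrates the mechanics of holding out one tenth.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    cut = len(shuffled) // 10
    return shuffled[:cut], shuffled[cut:]


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

In such a setup, a model would be trained on all ratings except the held-out pairs, and `rmse` would then be computed on the held-out pairs only.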