Rule-based Recommender Systems for the Web of Data


A challenge of RuleML 2015, August 2015 Berlin

This challenge has two focus areas:

  • rule learning algorithms applied on recommender problems
  • using the linked open data cloud for feature set extension

The challenge uses a semantically enriched version of the MovieLens dataset.
In addition to the standard metrics of recommender performance, the
challenge aims to assess the understandability of the rule set
generated by the participating rule-based systems.

Motivation and Objectives

Many modern recommender systems rely on machine learning algorithms to learn user preferences. While these generally provide the best results, they usually act as black-box solutions that do not provide a human-understandable explanation. However, the ability to explain a specific recommendation is a mandatory requirement in some application domains.

Rules are recognized as one of the most suitable and understandable forms to represent knowledge and relations in data. Rule-based recommender systems can thus provide a desirable balance between the quality of the recommendation and the understandability of the explanation for the human user.

Target Audience 

Researchers in the area of rule learning and machine learning in general.
Within the context of the challenge, a rule-based recommender is either a rule learning algorithm applied directly, or a generic machine learning algorithm whose output can be converted to a human-understandable set of rules.

Researchers in the area of the semantic web and feature extraction.
This challenge targets a new type of recommender problem that uses the Web of Data to augment the feature set. Data in the original dataset are automatically mapped to Linked Open Data identifiers, and additional features are then generated from public knowledge bases such as DBpedia or Freebase.


The participating systems are requested to find and recommend a
limited set of 5 items that best match a user profile.

  • The participants will be provided with a semantically enriched
    version of the MovieLens dataset.
  • It is mandatory that a participating solution either uses the Linked
    Open Data cloud to further extend the feature set or is a rule-based
    classifier. Solutions combining both options are preferred.
  • A scorer is provided by the organizers so that the participants can
    check their progress.
  • Challenge submissions will consist of the set of additional
    recommendations (top-5 movies) for each user from the train dataset
    and a file containing the rules that led to the predictions (rule-based
    classifiers only; the PMML RuleSet model is preferred but not
    mandatory).
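The requested output can be sketched as follows. This is a minimal illustration, not the prescribed method: the rule representation (condition/weight dictionaries), the scoring by summed rule weights, and all feature names are assumptions made for the example, since the challenge does not fix a rule format beyond preferring PMML RuleSet.

```python
# Hypothetical sketch of a rule-based recommender producing top-5 lists.
# Rule format and weighted scoring are illustrative assumptions only.

def fire_rules(user_features, item_features, rules):
    """Return a relevance score (sum of fired rule weights) and the fired rule ids."""
    score, fired = 0.0, []
    for rule in rules:
        # A rule fires if every condition matches a user or item attribute.
        if all(user_features.get(a) == v or item_features.get(a) == v
               for a, v in rule["conditions"].items()):
            score += rule["weight"]
            fired.append(rule["id"])
    return score, fired

def top5(user_features, items, rules):
    """Rank all candidate items by rule-based score and keep the best five."""
    scored = [(fire_rules(user_features, feats, rules)[0], item)
              for item, feats in items.items()]
    scored.sort(reverse=True)
    return [item for _, item in scored[:5]]
```

The fired rule ids returned by `fire_rules` are exactly the kind of per-item explanation the submission format asks for alongside the recommendation list.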

Evaluation Dataset 

MovieLens dataset - a dataset provided by GroupLens Research that contains movie ratings from the MovieLens website. MovieLens 1M, with 1 million ratings from 6,000 users on 4,000 movies, will be used for this challenge. The dataset is enriched with additional semantic information using DBpedia mappings provided by SisInf Lab.

Evaluation Metrics

  1. Recommender Performance
    • Relevance scores will be used by an evaluation service to form a top-5 item recommendation list for each user. This means that for each user only items in the evaluation set are considered when forming the top-5 recommendation list. The evaluation metric for this task is F-measure@5.
  2. Recommender Performance vs Compactness of Explanation
    • For each item in the recommendation list, an explanation is expected in terms of the rules fired to compute the recommendation. We will consider as good explanations the most compact ones, i.e. those involving a small number of easy-to-understand rules. This metric will be evaluated qualitatively by the PC members.
  3. Aggregate diversity
    • By looking at the top-5 recommendation lists for all users, the total number of distinct suggested items will be considered in order to evaluate how well the systems perform in terms of recommending diverse items. F-measure@5 and aggregate diversity will then be combined into an overall score representing a Pareto-optimal solution.
Since the MovieLens dataset is freely available, we ask the participants to use only the training/test split provided within this challenge. Any submission found not to comply with this rule will be disqualified.

Judging and Prize

A prize of 200 EUR for the best recommender performance will be given to the paper with the highest score in the evaluation. Authors of winning submissions may be asked to submit their code (compilable code or executable) so that the track organizers can verify the results on the test set (the complete MovieLens dataset is freely available).

How to Participate

  1. Register for the challenge
  2. Download datasets
  3. Submit results and check your position on the leader board
  4. Submit paper describing your approach
The following information has to be provided for the paper submission:
  • Abstract: no more than 200 words.
  • Description: It should contain the details of the system, including why the system is innovative. The paper should cover in detail the rule learning algorithm used and the additional feature expansion, if employed. The description should also summarize how the participants have addressed the evaluation tasks. Papers must be submitted in PDF format, following the style of Springer's Lecture Notes in Computer Science (LNCS) series, and be between 2 and 15 pages in length.
Rule Challenge 2015 proceedings will be published as CEUR Proceedings and indexed by SCOPUS.

RecSysRules 2015 Deadlines:

Dataset availability: 5.1.2014
Paper and result submission (extended): June 1, 2015
Author Notification (extended): June 18, 2015
Camera Ready (extended): July 19, 2015
Challenge: August 5, 2015


RecSysRules Challenge Chairs

  • Jaroslav Kuchař (Czech Technical University, Prague)
  • Tommaso di Noia (Politecnico di Bari, Italy)
  • Heiko Paulheim (University of Mannheim, Germany)
  • Tomas Kliegr (University of Economics, Prague)

Program Committee (to be completed) 

  • Martin Atzmüller, University of Kassel, Germany
  • Johannes Fürnkranz, TU Darmstadt, Germany
  • Frederik Janssen, TU Darmstadt, Germany
  • Florian Lemmerich, University of Würzburg, Germany
  • Václav Zeman, University of Economics, Prague, Czech Republic
  • Tomáš Horváth, Pavol Jozef Šafárik University in Košice, Slovakia
  • Alan Said, Recorded Future, Sweden