Tutorial on Benchmarking Algorithm Performance

October 29th at the 14th IEEE International Conference on eScience 2018, Amsterdam, the Netherlands



Benchmarking Algorithm Performance for Research

The tutorial was held on October 29th, 2018, in association with the IEEE International Conference on eScience, at the Mövenpick Hotel Amsterdam City Centre (Piet Heinkade 11, Amsterdam). The slides of the talks are available by clicking the titles of the talks in the program below.


One of the best-known benchmarks for algorithm performance is ImageNet. Many challenges have been organized using this database, with the latest challenge now running on Kaggle. In various scientific disciplines there is a growing interest in benchmarking algorithm performance on research data. Many algorithms are proposed in the literature, but there is a growing need to compare them on the same data, using the same metrics and ground truth, for a specific task. Organizing such open online benchmarks will not only increase insight into which algorithms perform best for a given task, but also open these tasks up to a wider audience to test their algorithms on, which could lead to new breakthroughs in the field.

This tutorial presented two research fields with a longer history of benchmarking algorithm performance: medical image analysis (Bram van Ginneken) and multimedia information retrieval (Maria Eskevich). Mike Lees talked about how benchmarking is being introduced for slum detection on satellite images, a field with strong restrictions on data sharing, and Kasper Marstal talked about a new concept in medical image analysis: continuous integration for grand challenges.

Before the coffee break, the EYRA benchmark platform, currently under development, was introduced. This is a joint initiative of SURF and the Netherlands eScience Center to support researchers in easily setting up benchmarks and applying their algorithms to benchmarks from various scientific disciplines. We ended the tutorial with a discussion on the features such a platform would need to serve various scientific disciplines.

Target Audience



To register, visit the IEEE eScience website. If you are not planning to attend the full conference, but only the workshop & tutorial day (October 29th), choose “Register for workshop & tutorial day”. You can attend another workshop in the afternoon; the full list of tutorials and workshops can be found here.


Bram van Ginneken Bram van Ginneken is Professor of Medical Image Analysis at Radboud University Medical Center and chairs the Diagnostic Image Analysis Group. He pioneered the concept of challenges (algorithm benchmarks) in medical image analysis by organizing the first two challenge workshops in 2007 at one of the leading international conferences (MICCAI) in the field, after which many followed. He initiated and maintains https://grand-challenge.org/, which currently lists over 150 challenges organized by various researchers in the field. He is the driving force behind the COMIC (Consortium for Open Medical Image Computing) platform for easily setting up websites for challenges in biomedical image analysis. At this tutorial, he shared his experiences with challenges in the biomedical image analysis field.

Maria Eskevich Maria Eskevich is Central Office Coordinator at CLARIN ERIC (European Research Infrastructure for Language Resources and Technology). Since 2010 she has been highly interested in running benchmarking evaluation campaigns for multimedia-related tasks, exploring how users can browse and discover interesting material within multimedia archives using both audio and visual features of the data. She has been organizer and co-organizer of many tasks (algorithm benchmarks) at MediaEval and TRECVID, including Rich Speech Retrieval (RSR) at MediaEval (2011), Search and Hyperlinking (S&H) at MediaEval (2012–2014), Search and Anchoring in Video Archives (SAVA) at MediaEval (2015), and Video Hyperlinking (LNK) at TRECVID (2015–2017). At this tutorial, she shared her experiences with shared tasks for multimedia evaluation.


SURF · Netherlands eScience Center

Associated with IEEE eScience