Tutorial on Benchmarking Algorithm Performance

Benchmarking Algorithm Performance for Research

The tutorial was held on October 29th, 2018 in association with the IEEE international conference on eScience, at the Mövenpick Hotel Amsterdam City Centre (Piet Heinkade 11, Amsterdam). The slides of the tarlks are available by clicking the titles of the talks below in the program.

About

One of the currently most well-known benchmarks for algorithm performance is ImageNet. Many challenges have been organized using this database, with the latest challenge now running on Kaggle. In various scientific disciplines there is a growing interest to benchmark algorithm performance on research data. Many algorithms are proposed in the literature, but there is a growing need to compare them on the same data, using the same metrics and ground truth to compare their performance for a specific task. Organizing these open online benchmarks, will not only increase insight into which algorithms perform best for a given task, but open up these tasks for a wider audience to test their algorithms on, which could lead to new breakthroughs in the field. This tutorial shows two research fields with a longer history in benchmarking algorithm performance, such as medical image analysis (Bram van Ginneken) and multimedia information retrieval (Maria Eskevich). Mike Lees talked about how benchmarking is being introduced for slum detection on sattelite images, a field with strong restrictions on data sharing, and Kasper Marstal talked about a new concept in medical image analysis, continuous integration for grand challenges. Before the coffee break, the EYRA benchmark platform that is currently under development was introduced. This is a joint initiative of SURF and the Netherlands eScience Center, to support researchers to easily set-up benchmarks and apply their algorithm on benchmarks from various scientific disciplines. We ended the tutorial with a discussion on the required features for such a platform from various scientific disciplines.

Target Audience

Algorithm developers with an interest to apply their algorithm to a benchmark
Scientists interested in setting up a benchmark
Anyone with an interest in how benchmarks can advance a scientific field

Program

8:30: Welcome
8:35: “Grand Challenges in Medical Image Analysis” (Prof.dr. Bram van Ginneken - Radboud UMC)
9:20: “Benchmarking Initiatives Testing the Algorithms of Natural Language and Multimedia Content Processing for Information Retrieval Purposes and Beyond” (dr. Maria Eskevich - CLARIN ERIC)
10:05: “The Enlighten your Research (EYRA) Benchmark Platform” (dr. Adriënne Mendrik - NL eScience Center)
10:30: Coffee Break
11:00: “Slum Detection on Satellite Images Challenge” (dr. Mike Lees - UvA)
11:20: “Continuous Integration for Grand Challenges” (Kasper Marstal - Erasmus MC)
11:40: “SURF Research Cloud” (dr. Ymke van den Berg - SURF)
12:00: Interactive discussion using menti.com (code: 58 69 72) on the EYRA benchmark platform for various scientific disciplines
12:30: Lunch

Registration

To register visit the IEEE eScience website. If you are not planning to attend the full conference, but only the workshop & tutorial day (October 29th), choose “Register for workshop & tutorial day”. You can attend another workshop in the afternoon, the full list of tutorials and workshops can be found here.

Speakers

Bram van Ginneken is Professor of Medical Image Analysis at Radboud University Medical Center and chairs the Diagnostic Image Analysis Group. He pioneered the concept of challenges (algorithm benchmarks) in medical image analysis, by organizing the first two challenge workshops in 2007 at one of the leading international conferences (MICCAI) in the field, after which many followed. He initiated and maintains https://grand-challenge.org/ that currently list over 150 challenges organized by various researchers in the field. He is the driving force behind the COMIC (Consortium for Open Medical Image Computing) platform, to easily set up websites for challenges in biomedical images analysis. At this tutorial, he will share his experiences with challenges in the biomedical image analysis field.

Maria Eskevich is Central Office Coordinator at CLARIN ERIC (European Research Infrastructure for Language Resources and Technology). Since 2010 she has been highly interested in running the benchmarking evaluation campaigns for multimedia related tasks exploring how users can browse and discover intersting material within multimedia archives using both audio and visual features of the data. She has been organizer and co-organizer of many tasks (algorithm benchmarks) at MediaEval and TRECVID, including Rich Speech Retrieval (RSR) at MediaEval (2011), Search and Hyperlinking (S&H) at MediaEval (2012-2014), Search and Anchoring in Video Archives (SAVA) at MediaEval (2015), and Video Hyperlinking (LNK) at TRECVID (2015-2017). At this tutorial, she will share her experiences with shared tasks for multimedia evaluation.

Organization

Surf eScience

Adriënne Mendrik (Netherlands eScience Center, Amsterdam, the Netherlands)
Mary Hester and Annette Langedijk (SURF, Utrecht, the Netherlands)

Associated with

IEEE eScience