[Seminar] Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference

The next LIVIA seminar will be held on Wednesday, December 14th at 12h30 in hybrid mode.

Title: Towards Practical Few-Shot Query Sets: Transductive Minimum Description Length Inference
by Ségolène Martin, Ph.D. candidate, CentraleSupélec, Université Paris Saclay, INRIA, France

Short bio:
Ségolène Martin received her M.Sc. degree in mathematics from the Ecole Normale Supérieure Paris-Saclay, France, in 2020. She also obtained the qualification for teachers (Agrégation) in 2019. She is currently a Ph.D. student in CVN lab at CentraleSupéelec, University Paris Saclay, France. Her research focuses on optimization methods for image processing and machine learning.

Abstract:
Standard few-shot benchmarks are often built upon simplifying assumptions on the query sets, which may not always hold in practice. In particular, for each task at testing time, the classes effectively present in the unlabeled query set are known a priori, and correspond exactly to the set of classes represented in the labeled support set. We relax these assumptions and extend current benchmarks, so that the query-set classes of a given task are unknown, but just belong to a much larger set of possible classes. Our setting could be viewed as an instance of the challenging yet practical problem of extremely imbalanced K-way classification, K being much larger than the values typically used in standard benchmarks, and with potentially irrelevant supervision from the support set. Expectedly, our setting incurs drops in the performances of state-of-the-art methods. Motivated by these observations, we introduce a PrimAl Dual Minimum Description LEngth (PADDLE) formulation, which balances data-fitting accuracy and model complexity for a given few-shot task, under supervision constraints from the support set. Our constrained MDL-like objective promotes competition among a large set of possible classes, preserving only effective classes that befit better the data of a few-shot task. It is hyperparameter free, and could be applied on top of any base-class training. Furthermore, we derive a fast block coordinate descent algorithm for optimizing our objective, with convergence guarantee, and a linear computational complexity at each iteration. Comprehensive experiments over the standard few-shot datasets and the more realistic and challenging i-Nat dataset show highly competitive performances of our method, more so when the numbers of possible classes in the tasks increase.
Paper: https://openreview.net/pdf?id=j9JL96S8Vl
Code: https://github.com/SegoleneMartin/PADDLE

* In person: ETS-LIVIA, room A-3600.
* Zoom link: https://etsmtl.zoom.us/j/84820130813