Unshuffling Data for Improved Generalization in Visual Question Answering

Teney, D.; Abbasnejad, E.; van den Hengel, A.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/136312

Type:	Conference paper
Title:	Unshuffling Data for Improved Generalization in Visual Question Answering
Author:	Teney, D.; Abbasnejad, E.; van den Hengel, A.
Citation:	Proceedings / IEEE International Conference on Computer Vision. IEEE International Conference on Computer Vision, 2021, vol.abs/2002.11894, pp.1397-1407
Publisher:	IEEE
Publisher Place:	Los Alamitos, CA
Issue Date:	2021
Series/Report no.:	IEEE International Conference on Computer Vision
ISBN:	9781665428125
ISSN:	1550-5499
Conference Name:	IEEE/CVF International Conference on Computer Vision (ICCV) (11 Oct 2021 - 17 Oct 2021 : Virtual Online)
Statement of Responsibility:	Damien Teney, Ehsan Abbasnejad, Anton van den Hengel
Abstract:	Generalization beyond the training distribution is a core challenge in machine learning. The common practice of mixing and shuffling examples when training neural networks may not be optimal in this regard. We show that partitioning the data into well-chosen, non-i.i.d. subsets treated as multiple training environments can guide the learning of models with better out-of-distribution generalization. We describe a training procedure to capture the patterns that are stable across environments while discarding spurious ones. The method makes a step beyond correlation-based learning: the choice of the partitioning allows injecting information about the task that cannot be otherwise recovered from the joint distribution of the training data. We demonstrate multiple use cases with the task of visual question answering, which is notorious for dataset biases. We obtain significant improvements on VQA-CP, using environments built from prior knowledge, existing metadata, or unsupervised clustering. We also get improvements on GQA using annotations of “equivalent questions”, and on multi-dataset training (VQA v2 / Visual Genome) by treating them as distinct environments.
Rights:	©2021 IEEE
DOI:	10.1109/ICCV48922.2021.00145
Published version:	https://ieeexplore.ieee.org/xpl/conhome/9709627/proceeding
Appears in Collections:	Australian Institute for Machine Learning publications

Files in This Item:

File	Description	Size	Format
hdl_136312.pdf	Submitted version	2.15 MB	Adobe PDF

Adelaide Research & Scholarship