BlendMask: Top-down meets bottom-up for instance segmentation

Chen, H.; Sun, K.; Tian, Z.; Shen, C.; Huang, Y.; Yan, Y.

Please use this identifier to cite or link to this item: https://hdl.handle.net/2440/132388

Type:	Conference paper
Title:	BlendMask: Top-down meets bottom-up for instance segmentation
Author:	Chen, H.; Sun, K.; Tian, Z.; Shen, C.; Huang, Y.; Yan, Y.
Citation:	Proceedings / CVPR, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, pp. 8570-8578
Publisher:	IEEE
Publisher Place:	online
Issue Date:	2020
Series/Report no.:	IEEE Conference on Computer Vision and Pattern Recognition
ISBN:	9781728171692
ISSN:	1063-6919; 2575-7075
Conference Name:	IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (14 Jun 2020 - 19 Jun 2020 : virtual online)
Statement of Responsibility:	Hao Chen, Kunyang Sun, Zhi Tian, Chunhua Shen, Yongming Huang, Youliang Yan
Abstract:	Instance segmentation is one of the fundamental vision tasks. Recently, fully convolutional instance segmentation methods have drawn much attention as they are often simpler and more efficient than two-stage approaches like Mask R-CNN. To date, almost all such approaches fall behind the two-stage Mask R-CNN method in mask precision when models have similar computation complexity, leaving great room for improvement. In this work, we achieve improved mask prediction by effectively combining instance-level information with semantic information with lower-level fine-granularity. Our main contribution is a blender module which draws inspiration from both top-down and bottom-up instance segmentation approaches. The proposed BlendMask can effectively predict dense per-pixel position-sensitive instance features with very few channels, and learn attention maps for each instance with merely one convolution layer, thus being fast in inference. BlendMask can be easily incorporated with the state-of-the-art one-stage detection frameworks and outperforms Mask R-CNN under the same training schedule while being faster. A lightweight version of BlendMask achieves 36.0 mAP at 27 FPS evaluated on a single 1080Ti. Because of its simplicity and efficacy, we hope that our BlendMask could serve as a simple yet strong baseline for a wide range of instance-wise prediction tasks.
Rights:	©2020 IEEE
DOI:	10.1109/CVPR42600.2020.00860
Published version:	https://ieeexplore.ieee.org/xpl/conhome/9142308/proceeding
Appears in Collections:	Computer Science publications
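
Illustrative sketch (not part of the repository record): the abstract describes a blender module that combines a small number of position-sensitive feature maps with per-instance attention maps, but it does not give the formula. The Python below shows one plausible reading, assuming the attention maps are normalized per pixel with a softmax across the feature maps and then used as weights in a sum. The function names `softmax_over_bases` and `blend` are hypothetical, and the cropping and resizing steps a real model would need are omitted.

```python
import math


def softmax_over_bases(attn):
    """Normalize K attention maps per pixel so the K weights sum to 1.

    attn is a list of K maps, each a list of rows of floats (same shape).
    """
    k_count, rows, cols = len(attn), len(attn[0]), len(attn[0][0])
    out = [[[0.0] * cols for _ in range(rows)] for _ in range(k_count)]
    for y in range(rows):
        for x in range(cols):
            # Subtract the per-pixel maximum for numerical stability.
            peak = max(attn[k][y][x] for k in range(k_count))
            exps = [math.exp(attn[k][y][x] - peak) for k in range(k_count)]
            total = sum(exps)
            for k in range(k_count):
                out[k][y][x] = exps[k] / total
    return out


def blend(bases, attn):
    """Assumed blending rule: per-pixel weighted sum of K feature maps.

    bases and attn must have the same K x rows x cols shape.
    """
    weights = softmax_over_bases(attn)
    k_count, rows, cols = len(bases), len(bases[0]), len(bases[0][0])
    return [
        [
            sum(weights[k][y][x] * bases[k][y][x] for k in range(k_count))
            for x in range(cols)
        ]
        for y in range(rows)
    ]


if __name__ == "__main__":
    # Two toy 2x2 feature maps standing in for one instance's cropped features.
    bases = [[[1.0, 2.0], [3.0, 4.0]], [[3.0, 2.0], [1.0, 0.0]]]
    # Uniform attention gives every feature map equal weight.
    uniform = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    print(blend(bases, uniform))
```

With uniform attention the result is the per-pixel average of the feature maps; attention that strongly favors one map makes the output approach that map alone.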

Adelaide Research & Scholarship