Improve your Instance Segmentation Boundary Results

Amal
3 min read · Jan 24, 2022

Tremendous effort has gone into instance segmentation, but mask quality is still not satisfactory. The boundaries of predicted instance masks are usually imprecise, due to the low spatial resolution of feature maps and the imbalance caused by the extremely low proportion of boundary pixels. To address these issues, the authors propose BPR, a conceptually simple yet effective post-processing refinement framework that improves boundary quality on top of the results of any instance segmentation model.

Following the idea of looking closer to segment boundaries better, BPR extracts and refines a series of small boundary patches along the predicted instance boundaries. The proposed BPR framework yields significant improvements over the Mask R-CNN baseline on the Cityscapes benchmark, especially on boundary-aware metrics.

  • The paper improves boundary quality through a crop-then-refine strategy. Specifically, given a coarse instance mask produced by any instance segmentation model, a series of small image patches is first extracted along the predicted instance boundaries. After being concatenated with the corresponding mask patches, the boundary patches are fed into a refinement network, which performs binary segmentation to refine the coarse boundaries. The refined mask patches are then reassembled into a compact, high-quality instance mask (Figure 1 of the paper). The proposed framework is termed BPR (Boundary Patch Refinement).
  • The proposed framework alleviates the aforementioned issues and improves mask quality without any modification or fine-tuning of the segmentation model. Since only regions around object boundaries are cropped, the patches can be processed at a much higher resolution than in previous methods, so low-level details are better retained. At the same time, the fraction of boundary pixels in each small patch naturally increases, which alleviates the optimization bias.
  • The proposed BPR framework significantly improves the Mask R-CNN baseline (+4.3% AP on the Cityscapes dataset) and produces substantially better masks with finer boundaries.
  • The benefit of the binary mask patch is that it accelerates training convergence and provides location guidance for the instance to be segmented.
  • Results are mainly reported on Cityscapes, a real-world dataset with high-quality instance segmentation annotations. Eight instance categories are involved: bicycle, bus, person, train, truck, motorcycle, car, and rider.
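The crop-then-refine idea can be sketched in a few lines. This is a hypothetical simplification, not the authors' code: the `boundary_pixels` helper, the 8×8 patch size, and the grid-snapping proposal step are all assumptions here (BPR's actual pipeline refines each patch with a segmentation network and filters overlapping proposals by IoU). It also illustrates the imbalance argument above: the boundary-pixel fraction inside a small patch is much higher than over the whole image.

```python
import numpy as np

def boundary_pixels(mask):
    """Foreground pixels with at least one background 4-neighbour."""
    p = np.pad(mask, 1, constant_values=False)
    interior = p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:]
    return mask & ~interior

def extract_boundary_patches(mask, patch=8, stride=8):
    """Snap each boundary pixel to a stride grid and collect unique patch corners."""
    ys, xs = np.nonzero(boundary_pixels(mask))
    corners = set()
    for y, x in zip(ys.tolist(), xs.tolist()):
        y0 = min((y // stride) * stride, mask.shape[0] - patch)
        x0 = min((x // stride) * stride, mask.shape[1] - patch)
        corners.add((y0, x0))
    return sorted(corners)

# Toy coarse mask: a filled square standing in for a predicted instance.
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True

corners = extract_boundary_patches(mask)
bnd = boundary_pixels(mask)
full_frac = bnd.mean()                                    # boundary fraction, whole image
patch_fracs = [bnd[y:y + 8, x:x + 8].mean() for y, x in corners]
```

On this toy square, the boundary fraction inside the extracted patches is several times higher than over the full image, which is exactly the class-balance effect the paper exploits; each (image patch, mask patch) pair would then be fed to the refinement network.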

Installation: https://github.com/tinyalpha/BPR/blob/main/docs/install.md

  • The authors also provide training support for the COCO dataset. To apply BPR to your own dataset, it is recommended to convert it to the COCO format first.

We assume that the folder structure of the COCO dataset is as follows:

BPR
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017

Training:

IOU_THRESH=0.15 \
sh tools/prepare_dataset_coco.sh \
mask_rcnn_r50.train.segm.json \
mask_rcnn_r50.val.segm.json \
maskrcnn_r50 \
70000

To train the Refinement Network:

DATA_ROOT=maskrcnn_r50/patches \
bash tools/dist_train.sh \
configs/bpr/hrnet18s_128.py \
4

Inference:

IOU_THRESH=0.25 \
IMG_DIR=data/coco/val2017 \
GT_JSON=data/coco/annotations/instances_val2017.json \
GPUS=4 \
sh tools/inference_coco.sh \
configs/bpr/hrnet18s_128.py \
hrnet18s_coco-c172955f.pth \
mask_rcnn_r50.val.segm.json \
mask_rcnn_r50.val.refined.json

IOU_THRESH is the IoU threshold used for NMS over the candidate boundary patches (see the paper for details).
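To see what that threshold controls, here is a minimal greedy NMS sketch over patch boxes. The `(y0, x0, y1, x1)` box format, the candidate ordering, and the `patch_nms` helper are assumptions for illustration, not the repo's implementation:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (y0, x0, y1, x1)."""
    iy = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ix = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iy * ix
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def patch_nms(boxes, thresh=0.25):
    """Keep a patch only if its IoU with every already-kept patch is <= thresh."""
    kept = []
    for box in boxes:
        if all(iou(box, k) <= thresh for k in kept):
            kept.append(box)
    return kept

# Three candidate 8x8 patches along a horizontal boundary; the middle one
# overlaps the first heavily (IoU = 0.6) and is suppressed at thresh = 0.25.
boxes = [(0, 0, 8, 8), (0, 2, 8, 10), (0, 7, 8, 15)]
kept = patch_nms(boxes, thresh=0.25)   # -> [(0, 0, 8, 8), (0, 7, 8, 15)]
```

A lower IOU_THRESH suppresses more overlapping patches (fewer, less redundant refinements); a higher one keeps more of them.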

IMG_DIR and GT_JSON indicate the image folder and the ground-truth JSON file of the COCO dataset.

configs/bpr/hrnet18s_128.py and hrnet18s_coco-c172955f.pth indicate the config file and checkpoint of the Refinement Network.

mask_rcnn_r50.val.segm.json is the coarse instance segmentation result to be refined.

mask_rcnn_r50.val.refined.json is the output file where the refined results are saved.
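The refined output is a standard COCO-style results file: a JSON list with one record per instance, which can then be scored with pycocotools (`COCO.loadRes` plus `COCOeval` in `'segm'` mode) like any other segmentation result. A minimal sketch of that shape, with placeholder values (the RLE `counts` string here is a stand-in, not a real encoding):

```python
import json
import os
import tempfile

# Hypothetical single refined instance in the COCO results format.
results = [{
    "image_id": 1,
    "category_id": 1,
    "segmentation": {"size": [427, 640], "counts": "<RLE string>"},
    "score": 0.98,
}]

path = os.path.join(tempfile.mkdtemp(), "refined.json")
with open(path, "w") as f:
    json.dump(results, f)

with open(path) as f:
    loaded = json.load(f)

# Every record needs these fields for COCO-style segmentation evaluation.
required = {"image_id", "category_id", "segmentation", "score"}
valid = all(required <= set(r) for r in loaded)
```

Checking the field names this way is a quick sanity test before handing a refined file to an evaluator.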

Overall Results:

Qualitative results on Cityscapes val. The proposed framework (2nd and 4th rows) produces substantially better masks with more precise boundaries than Mask R-CNN (1st and 3rd rows). Best viewed digitally and in colour.

Paper link : https://arxiv.org/pdf/2104.05239.pdf
