Agriculture Vision 2020

Results Summary

NOTE: All single-model scores are computed at a single scale (512x512) with a single feed-forward pass and no TTA. TTA denotes test-time augmentation (e.g. flip and mirror). Ensemble_TTA (ckpt1,2) denotes an ensemble of two models (ckpt1 and ckpt2) with TTA, and Ensemble_TTA (ckpt1,2,3) denotes an ensemble of three models with TTA.
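The following is a minimal sketch of flip-based TTA combined with model ensembling, written with NumPy so it runs without the project code. It is an illustration under stated assumptions, not the repository's implementation: the function names `predict_with_flip_tta` and `ensemble_predict` are hypothetical, each model is assumed to be a callable that maps a `(C, H, W)` image to `(K, H, W)` class scores, the augmentations are assumed to be horizontal and vertical flips, and scores are assumed to be combined by simple averaging.

```python
import numpy as np


def predict_with_flip_tta(model, image):
    """Average class scores over the original image and two flipped copies.

    `model` is a hypothetical callable: (C, H, W) array -> (K, H, W) scores.
    """
    # None = no flip, 2 = horizontal flip (width axis), 1 = vertical flip (height axis)
    flip_axes = [None, 2, 1]
    scores = []
    for axis in flip_axes:
        augmented = image if axis is None else np.flip(image, axis=axis)
        out = model(augmented)
        # A flip is its own inverse, so flipping the output again
        # realigns the prediction with the original image.
        restored = out if axis is None else np.flip(out, axis=axis)
        scores.append(restored)
    return np.mean(scores, axis=0)


def ensemble_predict(models, image):
    """Average the TTA scores of several models, then take the per-pixel argmax."""
    averaged = np.mean([predict_with_flip_tta(m, image) for m in models], axis=0)
    return np.argmax(averaged, axis=0)  # (H, W) map of class indices
```

With this layout, a call such as `ensemble_predict([model_a, model_b], image)` would correspond to a two-model Ensemble_TTA entry, assuming averaging is how the scores are merged.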

Models	mIoU (%)	Background	Cloud shadow	Double plant	Planter skip	Standing water	Waterway	Weed cluster
MSCG-Net-50 (ckpt1)	54.7	78.0	50.7	46.6	34.3	68.8	51.3	53.0
*MSCG-Net-101 (ckpt2)*	*55.0*	*79.8*	*44.8*	*55.0*	*30.5*	*65.4*	*59.2*	*50.6*
MSCG-Net-101_k31 (ckpt3)	54.1	79.6	46.2	54.6	9.1	74.3	62.4	52.1
Ensemble_TTA (ckpt1,2)	59.9	80.1	50.3	57.6	52.0	69.6	56.0	53.8
Ensemble_TTA (ckpt1,2,3)	60.8	80.5	51.0	58.6	49.8	72.0	59.8	53.8
Ensemble_TTA (new_5model)	62.2	80.6	48.7	62.4	58.7	71.3	60.1	53.4
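
The mIoU column appears to be the unweighted mean of the seven per-class IoU columns. This is inferred from the numbers in the table rather than stated in the text, and the short check below recomputes it for every row. The values are copied from the table; the helper name `mean_iou` is illustrative, and the 0.1 tolerance allows for the rounding of the published figures.

```python
# Per-class IoU (%) copied from the results table, in column order:
# Background, Cloud shadow, Double plant, Planter skip,
# Standing water, Waterway, Weed cluster
per_class_iou = {
    "MSCG-Net-50 (ckpt1)": [78.0, 50.7, 46.6, 34.3, 68.8, 51.3, 53.0],
    "MSCG-Net-101 (ckpt2)": [79.8, 44.8, 55.0, 30.5, 65.4, 59.2, 50.6],
    "MSCG-Net-101_k31 (ckpt3)": [79.6, 46.2, 54.6, 9.1, 74.3, 62.4, 52.1],
    "Ensemble_TTA (ckpt1,2)": [80.1, 50.3, 57.6, 52.0, 69.6, 56.0, 53.8],
    "Ensemble_TTA (ckpt1,2,3)": [80.5, 51.0, 58.6, 49.8, 72.0, 59.8, 53.8],
    "Ensemble_TTA (new_5model)": [80.6, 48.7, 62.4, 58.7, 71.3, 60.1, 53.4],
}

# mIoU (%) as reported in the same table
reported_miou = {
    "MSCG-Net-50 (ckpt1)": 54.7,
    "MSCG-Net-101 (ckpt2)": 55.0,
    "MSCG-Net-101_k31 (ckpt3)": 54.1,
    "Ensemble_TTA (ckpt1,2)": 59.9,
    "Ensemble_TTA (ckpt1,2,3)": 60.8,
    "Ensemble_TTA (new_5model)": 62.2,
}


def mean_iou(ious):
    """Unweighted mean of per-class IoU values."""
    return sum(ious) / len(ious)


for name, ious in per_class_iou.items():
    recomputed = mean_iou(ious)
    # Table entries are rounded to one decimal, so allow a 0.1 tolerance.
    agrees = abs(recomputed - reported_miou[name]) < 0.1
    print(f"{name}: recomputed {recomputed:.2f}, reported {reported_miou[name]}, agrees={agrees}")
```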

Model Size

NOTE: All backbones use ImageNet-pretrained weights, which can be imported and downloaded from the link. MSCG-Net-101_k31 has exactly the same architecture as MSCG-Net-101, but it is trained with an extra 1/3 of the validation set (4,431) in addition to the official training images (12,901).

Models	Backbones	Parameters (M)	GFLOPs	Inference time (CPU / GPU)
MSCG-Net-50	Se_ResNext50_32x4d	9.59	18.21	522 / 26 ms
MSCG-Net-101	Se_ResNext101_32x4d	30.99	37.86	752 / 45 ms
MSCG-Net-101_k31	Se_ResNext101_32x4d	30.99	37.86	752 / 45 ms