Frustratingly Simple Few-Shot Object Detection (TFA)


Introduction

backbone and RPN features are class-agnostic
- features learned from base classes are likely to transfer to novel classes without updates
separates feature representation learning and box predictor learning into two stages
outperforms previous meta-learning approaches and is more memory-efficient
able to address severe data imbalance without repeated sampling
- shown by the results on LVIS


create a small balanced training set with K shots per class
- includes both novel and base classes
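
The balanced K-shot set described above can be sketched roughly as follows. This is a simplified illustration rather than the paper's sampling code: it assumes annotations are plain `(image_id, class_name)` pairs with one instance per image, and the function name `sample_k_shot` and the toy class names are hypothetical.

```python
import random
from collections import defaultdict


def sample_k_shot(annotations, k, seed=0):
    """Pick at most k instances per class from (image_id, class_name) pairs."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for image_id, class_name in annotations:
        by_class[class_name].append(image_id)

    subset = {}
    for class_name, image_ids in sorted(by_class.items()):
        # Base and novel classes are treated identically: K shots each.
        subset[class_name] = rng.sample(image_ids, min(k, len(image_ids)))
    return subset


# Toy data: "dog" stands in for an abundant base class,
# "okapi" for a rare novel class (both names are made up).
annotations = [(f"img_{i}", "dog") for i in range(100)]
annotations += [(f"img_{i}", "okapi") for i in range(100, 105)]

balanced = sample_k_shot(annotations, k=3)
print({name: len(ids) for name, ids in balanced.items()})
```

Because every class contributes the same number of shots, the abundant base classes no longer dominate the fine-tuning set.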
assign randomly initialized weights to the box prediction networks for novel classes
fine-tune only the box classification and regression networks
- the entire feature extractor is frozen
use a smaller learning rate (reduced by a factor of 20 relative to the first stage)
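
A minimal sketch of the "freeze everything except the box predictor" rule and the reduced learning rate, in plain Python. The parameter names and the prefixes in `TRAINABLE_PREFIXES` are illustrative assumptions modeled on a Faster R-CNN style layout, and the base learning rate in the example is a made-up value, not one taken from these notes.

```python
# Hypothetical name prefixes of the box classification and regression layers.
TRAINABLE_PREFIXES = (
    "roi_heads.box_predictor.cls_score",
    "roi_heads.box_predictor.bbox_pred",
)


def split_parameters(param_names, trainable_prefixes=TRAINABLE_PREFIXES):
    """Return (trainable, frozen) parameter names for the fine-tuning stage."""
    trainable = [n for n in param_names if n.startswith(trainable_prefixes)]
    frozen = [n for n in param_names if not n.startswith(trainable_prefixes)]
    return trainable, frozen


def fine_tune_lr(base_lr, reduction=20):
    """Learning rate for stage two: the stage-one rate divided by `reduction`."""
    return base_lr / reduction


# Illustrative parameter names; a real model would have many more.
param_names = [
    "backbone.res4.conv1.weight",
    "proposal_generator.rpn_head.conv.weight",
    "roi_heads.box_head.fc1.weight",
    "roi_heads.box_predictor.cls_score.weight",
    "roi_heads.box_predictor.bbox_pred.weight",
]

trainable, frozen = split_parameters(param_names)
print("trainable:", trainable)
print("frozen:", frozen)
print("fine-tune lr:", fine_tune_lr(base_lr=0.02))  # base_lr is an assumed value
```

In a real framework the frozen names would have gradient updates disabled, and only the trainable ones would be passed to the optimizer.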
Cosine similarity
- uses a cosine similarity classifier in the second (fine-tuning) stage
- also adopts instance-level feature normalization
  - helps reduce intra-class variance and lessens the accuracy trade-off between novel and base classes
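
The cosine similarity classifier above scores proposal i against class j as s_ij = alpha * (f_i . w_j) / (||f_i|| ||w_j||), where f_i is the proposal feature, w_j the class weight vector, and alpha a scaling factor. A small standard-library sketch, assuming features and weights are plain lists of floats; the function names and the default `alpha=20.0` are illustrative choices:

```python
import math


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def cosine_logits(feature, class_weights, alpha=20.0):
    """Scaled cosine similarity between one feature and each class weight vector."""
    f_hat = l2_normalize(feature)  # instance-level feature normalization
    logits = []
    for w in class_weights:
        w_hat = l2_normalize(w)    # per-class weight normalization
        logits.append(alpha * sum(a * b for a, b in zip(f_hat, w_hat)))
    return logits


# Two toy class weight vectors and two features that differ only in magnitude.
weights = [[1.0, 0.0], [0.0, 1.0]]
print(cosine_logits([1.0, 0.0], weights))
print(cosine_logits([10.0, 0.0], weights))
```

Both calls give the same logits: normalization removes the dependence on feature magnitude, which is one way to see why it reduces intra-class variance.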