chester (@chesterzelaya)

2025-10-07 | ❤️ 270 | 🔁 29


< How to Architect a Model’s Neck >

neck? yes, a neck is what we call the intermediate step between the backbone and the head in computer vision models

backbone - in charge of extracting multi-scale features

neck - in charge of fusing/reshaping features together, priming the data for the head

head - in charge of producing task-specific outputs (logits, boxes, masks)

now, what are some of the ways you can fuse the low level features together?

concatenation - preserves info, increases channels/compute

addition - cheap, requires aligned channels; good default

element-wise multiplication - acts like a gate; can be fragile to scale

weighted summation - learnable mixing (e.g. BiFPN); best of both, slight overhead
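to make the four fusion options concrete, here is a minimal sketch in plain Python. assumptions: each "feature" is just the channel vector at a single pixel (a list of floats), the function names are made up for illustration, and the weights are fixed numbers where a real neck would learn them. the weighted sum follows the fast normalized fusion idea from BiFPN (clamp weights to non-negative, divide by their sum plus a small epsilon); treat it as an illustration, not a reference implementation.

```python
# Toy fusion ops on per-pixel channel vectors (plain lists of floats).
# Real necks apply the same ops to whole feature maps with a tensor library.

def concat(a, b):
    """Concatenation: keeps every channel, so the output gets wider."""
    return a + b

def add(a, b):
    """Addition: cheap, but both inputs need the same channel count."""
    if len(a) != len(b):
        raise ValueError("addition requires aligned channels")
    return [x + y for x, y in zip(a, b)]

def multiply(a, b):
    """Element-wise multiplication: b acts as a gate on a."""
    if len(a) != len(b):
        raise ValueError("multiplication requires aligned channels")
    return [x * y for x, y in zip(a, b)]

def weighted_sum(features, weights, eps=1e-4):
    """Weighted summation in the style of BiFPN's fast normalized fusion:
    clamp weights to be non-negative, normalize by their sum, then mix."""
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    channels = len(features[0])
    return [
        sum(wi * f[c] for wi, f in zip(w, features)) / total
        for c in range(channels)
    ]

low = [1.0, 2.0]    # hypothetical fine-scale feature
high = [3.0, 4.0]   # hypothetical coarse-scale feature

print(concat(low, high))                      # 4 channels out
print(add(low, high))                         # 2 channels out
print(multiply(low, high))                    # 2 channels out, gated
print(weighted_sum([low, high], [1.0, 1.0]))  # roughly the average
```

note how concat is the only op that changes the channel count, which is where its extra compute comes from downstream.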

very common out-of-the-box necks include:

  • FPN
  • BiFPN
  • NAS-FPN
  • PANet

all with pros and cons… focused on the balance between speed and accuracy
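as a rough picture of what the simplest of these (FPN) does, here is a toy sketch of its top-down pathway. assumptions: feature maps are 1D, single-channel lists, each level is half the length of the one before it, and channels are already aligned, so the 1x1 lateral convs and 3x3 output convs of a real FPN are left out. `upsample2x` and `fpn_top_down` are hypothetical names for this example only.

```python
def upsample2x(row):
    """Nearest-neighbor upsampling: repeat each value twice."""
    return [v for v in row for _ in range(2)]

def fpn_top_down(levels):
    """Fuse backbone maps ordered fine -> coarse.

    Starts at the coarsest level and walks toward the finest one,
    upsampling the running result and adding it to the next level.
    Returns the fused maps in the same fine -> coarse order.
    """
    fused = [levels[-1]]  # coarsest level passes through unchanged
    for fine in reversed(levels[:-1]):
        up = upsample2x(fused[0])
        if len(up) != len(fine):
            raise ValueError("each level must be 2x the size of the next")
        fused.insert(0, [f + u for f, u in zip(fine, up)])
    return fused

# hypothetical backbone outputs at three scales
c3 = [1.0, 1.0, 1.0, 1.0]
c4 = [2.0, 2.0]
c5 = [3.0]

p3, p4, p5 = fpn_top_down([c3, c4, c5])
print(p3, p4, p5)
```

the fusion op here is plain addition; swapping it for the weighted sum is the core of what BiFPN changes, alongside adding a bottom-up pass.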

