Paper Review. Align Deep Features for Oriented Object Detection @ IEEE Transactions on Geoscience and Remote Sensing, 2021

JooChan Park on Apr 14, 2021

Updated Jul 29, 2021

Abstract

Existing methods rely on anchors defined heuristically from scale, angle, and aspect ratio.
Misalignment between rotated anchor boxes and axis-aligned (non-rotated) convolution causes an inconsistency between classification and localization accuracy.
The Single-shot Alignment Network (S^2A-Net) consists of two modules.
- Feature Alignment Module (FAM): generates high-quality anchors and, through alignment convolution, performs convolution aligned to the anchor location.
- Oriented Detection Module (ODM): uses Active Rotating Filters (ARF) to encode orientation information and provide orientation-sensitive features, which partly resolves the inconsistency between classification and localization accuracy.
Achieves state-of-the-art (SOTA) results on the DOTA dataset.

To resolve the misalignment between anchor boxes and objects, the paper proposes the Single-shot Alignment Network (S^2A-Net).
Feature Alignment Module (FAM)
- Unlike other methods, it uses only one horizontal anchor.
- The Anchor Refinement Network (ARN) generates high-quality rotated anchors.
- Alignment convolution performs convolution aligned to the anchor location.
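
The anchor refinement step described above can be sketched as decoding predicted deltas onto a horizontal anchor. This is a minimal sketch, not the authors' code: the function name `decode_rotated_anchor`, the `(cx, cy, w, h, theta)` layout, and the delta parameterization (center shifts scaled by anchor size, log-scale width and height, additive angle in radians) are assumptions based on a common rotated-box encoding, and the paper's exact normalization may differ.

```python
import math

def decode_rotated_anchor(anchor, deltas):
    """Apply predicted deltas to an anchor given as (cx, cy, w, h, theta).

    Assumed (hypothetical) encoding: dx, dy are center shifts expressed in the
    anchor's own rotated frame and scaled by its size; dw, dh are log-scale
    factors; dtheta is an additive angle in radians.
    """
    cx, cy, w, h, theta = anchor
    dx, dy, dw, dh, dtheta = deltas
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Rotate the scaled shift back into image coordinates.
    new_cx = cx + dx * w * cos_t - dy * h * sin_t
    new_cy = cy + dx * w * sin_t + dy * h * cos_t
    # Width and height are rescaled multiplicatively.
    new_w = w * math.exp(dw)
    new_h = h * math.exp(dh)
    new_theta = theta + dtheta
    return (new_cx, new_cy, new_w, new_h, new_theta)

# A single horizontal anchor (theta = 0) refined into a rotated one.
horizontal_anchor = (10.0, 10.0, 4.0, 4.0, 0.0)
refined = decode_rotated_anchor(
    horizontal_anchor, (0.5, 0.0, math.log(2.0), 0.0, math.pi / 4)
)
```

Starting from one horizontal anchor keeps the anchor set small; the rotation and aspect ratio come from the predicted deltas rather than from a hand-designed anchor grid.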
Oriented Detection Module (ODM)
- Uses Active Rotating Filters (ARF) to encode orientation information and provide orientation-sensitive features, which partly resolves the inconsistency between classification and localization accuracy.
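
To illustrate what "orientation-sensitive" means here, the toy sketch below applies rotated copies of one kernel to a patch. It is a deliberate simplification and not the ARF implementation from Zhou et al.: it is restricted to four exact 90-degree rotations on a single patch, and every name in it (`rotate90`, `rotated_copies`, `orientation_responses`) is hypothetical.

```python
def rotate90(grid):
    """Rotate a square grid (list of rows) 90 degrees counter-clockwise."""
    n = len(grid)
    return [[grid[c][n - 1 - r] for c in range(n)] for r in range(n)]

def rotated_copies(kernel, num_orientations=4):
    """Return the kernel followed by its successive 90-degree rotations."""
    copies = [kernel]
    for _ in range(num_orientations - 1):
        copies.append(rotate90(copies[-1]))
    return copies

def response(patch, kernel):
    """Element-wise product summed over the window (one convolution tap)."""
    n = len(kernel)
    return sum(patch[r][c] * kernel[r][c] for r in range(n) for c in range(n))

def orientation_responses(patch, kernel):
    """One response per rotated copy of the kernel."""
    return [response(patch, k) for k in rotated_copies(kernel)]

patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]

original = orientation_responses(patch, kernel)
rotated = orientation_responses(rotate90(patch), kernel)
```

Rotating the input only shifts the responses cyclically across the orientation axis, so the vector of responses carries the orientation (useful for regressing the box angle), while a maximum over that axis stays the same under rotation (useful for classification).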

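Proposes the Single-Shot Alignment Network, which combines the Feature Alignment Module and the Oriented Detection Module, and achieves both speed and accuracy.
I liked the idea of feeding the predicted box information back into the anchor boxes to build learnable refined anchors.
When deformable convolution is used, the offset field is derived from the refined anchors, which is why it was effective.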
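
The way an offset field can be derived from a refined anchor may be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `alignment_offsets`, the `(cx, cy, w, h, theta)` anchor layout in image pixels, the `(x, y)` offset ordering, and the rotation convention are all hypothetical choices. The idea is to spread the k x k kernel grid over the anchor, rotate it by the anchor angle, and subtract the regular grid position.

```python
import math

def alignment_offsets(p, anchor, stride, kernel_size=3):
    """Offsets (dx, dy) for each kernel tap at feature-map location p.

    p           -- (px, py) location on the feature map
    anchor      -- (cx, cy, w, h, theta) in image pixels, theta in radians
    stride      -- feature-map stride relative to the image
    kernel_size -- side length k of the square kernel
    """
    px, py = p
    cx, cy, w, h, theta = anchor
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = kernel_size // 2
    offsets = []
    for ry in range(-half, half + 1):
        for rx in range(-half, half + 1):
            # Scale the regular grid point so the kernel covers the anchor,
            # expressed in feature-map units.
            lx = w * rx / (stride * kernel_size)
            ly = h * ry / (stride * kernel_size)
            # Rotate by the anchor angle and shift to the anchor center.
            sx = cx / stride + lx * cos_t - ly * sin_t
            sy = cy / stride + lx * sin_t + ly * cos_t
            # Offset = anchor-based sampling point minus the regular tap.
            offsets.append((sx - (px + rx), sy - (py + ry)))
    return offsets

# An axis-aligned anchor that exactly matches the regular 3x3 grid.
matching = alignment_offsets((5, 5), (40.0, 40.0, 24.0, 24.0, 0.0), stride=8)
# The same anchor rotated by 90 degrees.
turned = alignment_offsets((5, 5), (40.0, 40.0, 24.0, 24.0, math.pi / 2), stride=8)
```

When the anchor coincides with the regular grid the offsets are all zero and the operation reduces to a standard convolution; any rotation or rescaling of the anchor shows up as non-zero offsets, and these are computed from the anchor rather than learned freely.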
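Models seem to keep getting harder, and the paper explains things very briefly, so it is hard to follow without solid background knowledge.
Digging through the code gave me a clear picture of the model structure, and I plan to go through the loss in detail as well.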

Zhou, Yanzhao, et al. “Oriented response networks.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.