オムライスの備忘録 (Omurice's Notebook)

Notes on mathematics, statistics, machine learning, and programming.

[Multimodal] Dense Captioning

Index

Dense Captioning
Algorithms
- GRiT / 2022
- ControlCap / 2024
Datasets / Benchmarks
- Visual Genome / 2016
References
- Web pages

Dense Captioning

Multimodal #Summary
- Vision-Language
- yhayato1320.hatenablog.com

Algorithms

GRiT / 2022

GRiT: A Generative Region-to-text Transformer for Object Understanding
- [2022]
- arxiv.org

ControlCap / 2024

ControlCap: Controllable Region-level Captioning
- [2024]
- arxiv.org

Datasets / Benchmarks

Visual Genome / 2016

Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations
- [2016]
- arxiv.org
- paperswithcode.com

References

Papers with Code task page
- paperswithcode.com

Web pages

[DL輪読会]Dense Captioning分野のまとめ ([DL Reading Group] Survey of the Dense Captioning Field)
- Slides from Deep Learning JP
  www.slideshare.net