Index
Optical Character Recognition / OCR
Optical Character Recognition (光学式文字認識 in Japanese).
Also called Text Recognition.
A multimodal task that extracts text information from images.
- Multimodal #Summary
- Vision-Language
- yhayato1320.hatenablog.com
Algorithm structure
Text Detection
General-purpose algorithms such as U-Net and Mask R-CNN are also used.
U-Net
Mask R-CNN
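A segmentation model such as U-Net outputs a per-pixel text probability map rather than boxes, so a post-processing step is needed to get text regions. The following is only a minimal sketch of one simple way to do that (thresholding plus 4-connected component grouping); the function name `boxes_from_prob_map`, the nested-list map format, and the 0.5 threshold are assumptions for illustration, not taken from any of the papers listed here.

```python
from collections import deque


def boxes_from_prob_map(prob_map, threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) boxes for 4-connected regions
    whose text probability is at or above the threshold."""
    height = len(prob_map)
    width = len(prob_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob_map[y][x] < threshold:
                continue
            # Breadth-first search over one connected text region.
            queue = deque([(x, y)])
            seen[y][x] = True
            x_min = x_max = x
            y_min = y_max = y
            while queue:
                cx, cy = queue.popleft()
                x_min, x_max = min(x_min, cx), max(x_max, cx)
                y_min, y_max = min(y_min, cy), max(y_max, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and prob_map[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes
```

Axis-aligned boxes are the simplest possible output; the arbitrary-shape detectors listed in this section exist precisely because such boxes fit curved or rotated text poorly.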
TextSnake / 2018
- TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
- [2018]
- arxiv.org
Pixel Aggregation Network / PANet / 2019
- Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network
- [2019]
- arxiv.org
Progressive Scale Expansion Network / PSENet / 2019
- Shape Robust Text Detection with Progressive Scale Expansion Network
- [2019]
- arxiv.org
Differentiable Binarization Net / DBNet / 2019
- Real-time Scene Text Detection with Differentiable Binarization
- [2019]
- arxiv.org
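The core idea of DBNet is to replace hard thresholding of the probability map with a steep sigmoid, so that the threshold can be learned together with the network. A minimal scalar sketch of that function is shown here; the paper applies it per pixel with an amplifying factor k (50 in the paper), while the function and argument names are my own.

```python
import math


def differentiable_binarization(p, t, k=50.0):
    """Approximate binary value for probability p and threshold t.

    Computes 1 / (1 + exp(-k * (p - t))). A large k makes the curve close
    to a step function while keeping it differentiable.
    """
    return 1.0 / (1.0 + math.exp(-k * (p - t)))
```

With p and t both in [0, 1], the output is close to 1 when p is clearly above t, close to 0 when it is clearly below, and exactly 0.5 when they are equal.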
DBNet++ / 2022
- Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion
- [2022]
- arxiv.org
Deep Relational Reasoning Graph / DRRG / 2020
- Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
- [2020]
- arxiv.org
FCENet / 2021
- Fourier Contour Embedding for Arbitrary-Shaped Text Detection
- [2021]
- arxiv.org
Text Recognition
General-purpose algorithms such as MobileNet and ResNet are also used.
MobileNet
ResNet
Convolutional Recurrent Neural Network / CRNN / 2015
- An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- [2015]
- arxiv.org
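CRNN turns per-frame predictions into a character string with a CTC transcription layer. A minimal sketch of greedy (best-path) CTC decoding is given below: repeated symbols are collapsed, then blanks are removed. The function name and the use of "-" as the blank symbol are assumptions for illustration.

```python
def ctc_greedy_decode(frame_symbols, blank="-"):
    """Collapse consecutive repeats, then drop blanks.

    frame_symbols holds the most likely symbol for each time step.
    """
    decoded = []
    previous = blank
    for symbol in frame_symbols:
        # Keep a symbol only if it is not blank and differs from the
        # symbol at the previous time step.
        if symbol != blank and symbol != previous:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)
```

The blank between the two "l" runs in an input such as "hh-e-ll-lo-" is what lets the decoder keep a doubled letter instead of merging it.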
NRTR / 2018
No-Recurrence Sequence-to-Sequence Text Recognizer
- NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
- [2018]
- arxiv.org
SAR / 2018
Show Attend and Read
- Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition
- [2018]
- arxiv.org
Self-Attention Text Recognition Network / SATRN / 2019
- On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
- [2019]
- arxiv.org
MASTER / 2019
RobustScanner / 2020
- RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
- [2020]
- arxiv.org
ABINet / 2021
- Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
- [2021]
- arxiv.org
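The two stages above are usually chained: a detector proposes text regions, and a recognizer reads each cropped region. The following is a rough sketch of that flow, assuming axis-aligned boxes and an image stored as nested lists; `run_ocr`, `crop`, and the `detect` / `recognize` callables are hypothetical placeholders, not the API of any library listed on this page.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive
Image = List[List[int]]          # grayscale image as nested lists (assumption)


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max + 1] for row in image[y_min:y_max + 1]]


def run_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Run detection, then recognition on every detected region."""
    results = []
    for box in detect(image):
        results.append((box, recognize(crop(image, box))))
    return results
```

Keeping the two stages behind plain callables makes it easy to swap, for example, one detector for another without touching the recognizer.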
Libraries and APIs
MMOCR
- github.com
- MMOCR
Japanese OCR implementation
- github.com
- DetectionNet (R2U-Net) + MobileNetV1
Tesseract
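Tesseract can be driven from Python by calling its command-line tool. This is a minimal sketch, assuming the `tesseract` binary and the `jpn` language data are installed and on the PATH; the helper names `build_tesseract_command` and `run_tesseract` are my own.

```python
import subprocess


def build_tesseract_command(image_path, lang="jpn", psm=None):
    """Build the argument list for the tesseract CLI.

    Using "stdout" as the output base makes tesseract print the
    recognized text instead of writing a file.
    """
    command = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # Page segmentation mode, e.g. 6 for a single uniform block of text.
        command += ["--psm", str(psm)]
    return command


def run_tesseract(image_path, lang="jpn", psm=None):
    """Run tesseract and return the recognized text.

    Raises subprocess.CalledProcessError if tesseract exits with an error.
    """
    completed = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Usage would look like `text = run_tesseract("page.png", lang="jpn", psm=6)`, where `page.png` is a placeholder file name.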
References
- MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding
- [2021]
- arxiv.org
Websites
- paperswithcode.com
- Papers with Code task page
Books
- PyTorch ではじめる AI 開発 (AI Development Starting with PyTorch)
- Chapter 8: Character recognition in OCR
- Chapter 9: Completing the OCR