Omurice's Notes (オムライスの備忘録)

Notes on mathematics, statistics, machine learning, and programming.

[Multimodal] Optical Character Recognition / OCR

Optical Character Recognition / OCR

Optical Character Recognition (OCR, 光学式文字認識), also called text recognition, is a multimodal task that extracts text from images.

Algorithm structure
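
This page groups OCR algorithms into text detection and text recognition. As a rough illustration of how the two stages could be chained, here is a minimal sketch. The names `crop` and `two_stage_ocr`, the axis-aligned box format, and the list-of-rows image type are all assumptions made for this example; detectors for arbitrary-shaped text usually return polygons, and real libraries define their own interfaces.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration only.
Box = Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
Image = List[List[int]]           # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by an axis-aligned box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def two_stage_ocr(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], str],
) -> List[Tuple[Box, str]]:
    """Stage 1 finds text regions; stage 2 reads each cropped region."""
    results = []
    for box in detect(image):
        text = recognize(crop(image, box))
        results.append((box, text))
    return results
```

Any detector and recognizer with matching call signatures can be plugged in as `detect` and `recognize`, which is why the two stages can be studied and swapped independently.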

Text Detection

General-purpose segmentation and detection architectures such as U-Net and Mask R-CNN are also used.

TextSnake / 2018

  • TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes

Pixel Aggregation Network / PAN (called PANet in MMOCR) / 2019

  • Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Progressive Scale Expansion Network / PSENet / 2019

  • Shape Robust Text Detection with Progressive Scale Expansion Network

Differentiable Binarization Net / DBNet / 2019

  • Real-time Scene Text Detection with Differentiable Binarization
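
The core idea of differentiable binarization is to replace the hard threshold step with a steep sigmoid so that the threshold can be learned together with the rest of the network. A minimal scalar sketch follows, assuming a probability value P and a threshold value T in [0, 1] and the amplification factor k = 50 used in the paper. The function names are made up for this example, and a real implementation would apply this per pixel on tensors.

```python
import math


def hard_binarization(prob: float, thresh: float) -> float:
    """Standard binarization: a step function with no useful gradient."""
    return 1.0 if prob >= thresh else 0.0


def differentiable_binarization(prob: float, thresh: float, k: float = 50.0) -> float:
    """Approximate binarization: B = 1 / (1 + exp(-k * (P - T))).

    With a large k the sigmoid is close to the step function,
    but it stays differentiable with respect to both P and T.
    """
    return 1.0 / (1.0 + math.exp(-k * (prob - thresh)))
```

Far from the threshold the two functions agree closely, and exactly at `prob == thresh` the smooth version returns 0.5.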

DBNet++ / 2022

  • Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion

Deep Relational Reasoning Graph / DRRG / 2020

  • Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection

FCENet / 2021

  • Fourier Contour Embedding for Arbitrary-Shaped Text Detection

Text Recognition

General-purpose backbones such as MobileNet and ResNet are also used as feature extractors.

Convolutional Recurrent Neural Network / CRNN / 2015

  • An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
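
CRNN feeds convolutional features through recurrent layers and then transcribes the per-frame predictions with CTC. A minimal sketch of greedy (best-path) CTC decoding is shown below. The function name, the toy alphabet, and the convention that index 0 is the blank label are assumptions for this example, not details taken from the paper's code.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(
    frame_scores: Sequence[Sequence[float]],
    alphabet: Sequence[str],
    blank: int = 0,
) -> str:
    """Best-path CTC decoding.

    1. Take the highest-scoring label in every frame.
    2. Collapse runs of the same label into one.
    3. Drop the blank label.
    """
    best: List[int] = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    collapsed = [label for label, _ in groupby(best)]
    return "".join(alphabet[label] for label in collapsed if label != blank)
```

The blank label is what allows repeated characters: the frame sequence `a a - a b b` decodes to `aab`, whereas without the blank in between the two `a` runs would be merged.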

NRTR / 2018

No Recurrence sequence-to-sequence Text Recognizer

  • NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition

SAR / 2018

Show, Attend and Read

  • Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

Self-Attention Text Recognition Network / SATRN / 2019

  • On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention

MASTER / 2019

  • MASTER: Multi-Aspect Non-local Network for Scene Text Recognition

RobustScanner / 2020

  • RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition

ABINet / 2021

  • Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

Libraries and APIs

MMOCR

Japanese OCR implementations

Tesseract
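
Tesseract ships a command-line tool, so one simple way to try Japanese OCR is to call it from a script. The sketch below assumes the `tesseract` binary and the `jpn` language data are installed and on the PATH. The helper names and the file name `receipt.png` are made up for this example, and page segmentation mode 6 (a single uniform block of text) is only one reasonable default.

```python
import subprocess
from typing import List


def build_tesseract_command(image_path: str, lang: str = "jpn", psm: int = 6) -> List[str]:
    """Build the CLI call; 'stdout' as the output base prints the text instead of writing a file."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def recognize_with_tesseract(image_path: str, lang: str = "jpn", psm: int = 6) -> str:
    """Run Tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang, psm),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout.strip()
```

For example, `recognize_with_tesseract("receipt.png")` would return the text Tesseract reads from that image, and passing `lang="jpn+eng"` asks it to load both language models for mixed Japanese and English text.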

References

  • MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding
