A curated list of resources dedicated to scene text localization and recognition. Any suggestions and pull requests are welcome.
Papers & Code
Overview
- [2015-PAMI] Text Detection and Recognition in Imagery: A Survey
paper
- [2014-Front.Comput.Sci] Scene Text Detection and Recognition: Recent Advances and Future Trends
paper
Visual Geometry Group, University of Oxford
- [2016-IJCV, M. Jaderberg] Reading Text in the Wild with Convolutional Neural Networks
paper
demo
homepage
- [2016-CVPR, A Gupta] Synthetic Data for Text Localisation in Natural Images
paper
code
data
- [2015-ICLR, M. Jaderberg] Deep structured output learning for unconstrained text recognition
paper
- [2015-D.Phil Thesis, M. Jaderberg] Deep Learning for Text Spotting
paper
- [2014-ECCV, M. Jaderberg] Deep Features for Text Spotting
paper
code
model
GitXiv
- [2014-NIPS, M. Jaderberg] Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition
paper
homepage
model
CUHK & SIAT
- [2016-arXiv] Accurate Text Localization in Natural Image with Cascaded Convolutional Text Network
paper
- [2016-AAAI] Reading Scene Text in Deep Convolutional Sequences
paper
- [2016-TIP] Text-Attentional Convolutional Neural Networks for Scene Text Detection
paper
- [2014-ECCV] Robust Scene Text Detection with Convolution Neural Network Induced MSER Trees
paper
Media and Communication Lab, HUST
- [2016-CVPR] Robust scene text recognition with automatic rectification
paper
- [2016-CVPR] Multi-oriented text detection with fully convolutional networks
paper
- [2015-CoRR] An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
paper
code
github
AI Lab, Stanford
- [2012-ICPR, Wang] End-to-End Text Recognition with Convolutional Neural Networks
paper
code
SVHN Dataset
- [2012-PhD thesis, David Wu] End-to-End Text Recognition with Convolutional Neural Networks
paper
Others
- [2018-CVPR] FOTS: Fast Oriented Text Spotting With a Unified Network
paper
- [2018-IJCAI] IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection
paper
- [2018-AAAI] PixelLink: Detecting Scene Text via Instance Segmentation
paper
code
- [2018-AAAI] SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
paper
code
- [2017-arXiv] Fused Text Segmentation Networks for Multi-oriented Scene Text Detection
paper
- [2017-arXiv] WeText: Scene Text Detection under Weak Supervision
paper
- [2017-ICCV] Single Shot Text Detector with Regional Attention
paper
- [2017-ICCV] WordSup: Exploiting Word Annotations for Character based Text Detection
paper
- [2017-arXiv] R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection
paper
- [2017-CVPR] EAST: An Efficient and Accurate Scene Text Detector
paper
code
- [2017-arXiv] Cascaded Segmentation-Detection Networks for Word-Level Text Spotting
paper
- [2017-arXiv] Deep Direct Regression for Multi-Oriented Scene Text Detection
paper
- [2017-CVPR] Detecting oriented text in natural images by linking segments
paper
code
- [2017-CVPR] Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection
paper
- [2017-arXiv] Arbitrary-Oriented Scene Text Detection via Rotation Proposals
paper
- [2017-AAAI] TextBoxes: A Fast Text Detector with a Single Deep Neural Network
paper
code
- [2017-ICCV] Deep TextSpotter: An End-to-End Trainable Scene Text Localization and Recognition Framework
paper
code
- [2016-CVPR] Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
paper
- [2016-arXiv] COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
paper
- [2016-arXiv] DeepText: A Unified Framework for Text Proposal Generation and Text Detection in Natural Images
paper
- [2015-ICDAR] Object Proposals for Text Extraction in the Wild
paper
code
- [2014-TPAMI] Word Spotting and Recognition with Embedded Attributes
paper
homepage
code
Datasets
-
MLT 2017
2017
- 7200 training, 1800 validation images
- Bounding box, text transcription, and script annotations
- Task: text detection, script identification
-
COCO-Text (Computer Vision Group, Cornell)
2016
- 63,686 images, 173,589 text instances, 3 fine-grained text attributes.
- Task: text location and recognition
COCO-Text API
-
Synthetic Word Dataset (Oxford, VGG)
2014
- 9 million images covering 90k English words
- Task: text recognition, segmentation
download
-
IIIT 5K-Words
2012
- 5000 images from scene text and born-digital sources (2k training and 3k testing images)
- Each image is a cropped word image of scene text with case-insensitive labels
- Task: text recognition
download
-
StanfordSynth (Stanford, AI Group)
2012
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
download
-
MSRA Text Detection 500 Database (MSRA-TD500)
2012
- 500 natural images (resolutions of the images vary from 1296x864 to 1920x1280)
- Chinese, English or mixture of both
- Task: text detection
-
- 350 high resolution images (average size 1260 × 860) (100 images for training and 250 images for testing)
- Only word level bounding boxes are provided with case-insensitive labels
- Task: text location
-
KAIST Scene_Text Database
2010
- 3000 images of indoor and outdoor scenes containing text
- Korean, English (Number), and Mixed (Korean + English + Number)
- Task: text location, segmentation and recognition
-
Chars74k
2009
- Over 74K images from natural images, as well as a set of synthetically generated characters
- Small single-character images of 62 characters (0-9, a-z, A-Z)
- Task: text recognition
-
ICDAR Benchmark Datasets
| Dataset | Description | Competition Paper |
| --- | --- | --- |
| ICDAR 2015 | 1000 training images and 500 testing images | paper |
| ICDAR 2013 | 229 training images and 233 testing images | paper |
| ICDAR 2011 | 229 training images and 255 testing images | paper |
| ICDAR 2005 | 1001 training images and 489 testing images | paper |
| ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper |
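Several of the recognition datasets above describe a 62-character class set (0-9, a-z, A-Z) or case-insensitive labels. The following is a minimal Python sketch of what those two conventions might look like in practice. It is not taken from any of the listed datasets or their tooling: the class ordering (digits, then lowercase, then uppercase) and the helper names `encode` and `word_accuracy` are assumptions made for illustration.

```python
import string

# 62-class alphabet: 10 digits + 26 lowercase + 26 uppercase letters.
# The ordering here is an assumption; each dataset defines its own indices.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def encode(word):
    """Map a word to a list of class indices (hypothetical helper)."""
    return [CHAR_TO_INDEX[ch] for ch in word]


def word_accuracy(predictions, ground_truth, case_sensitive=False):
    """Fraction of words predicted exactly right (hypothetical helper).

    With case_sensitive=False, both strings are lowercased before
    comparison, which matches datasets with case-insensitive labels.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        return 0.0
    correct = 0
    for pred, truth in zip(predictions, ground_truth):
        if not case_sensitive:
            pred, truth = pred.lower(), truth.lower()
        correct += pred == truth
    return correct / len(ground_truth)
```

When comparing results across datasets, it is worth checking which convention a benchmark uses, since the same predictions can score differently under case-sensitive and case-insensitive matching.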
Blogs
- Scene Text Detection with OpenCV 3
- Handwritten numbers detection and recognition
- Applying OCR Technology for Receipt Recognition
- Convolutional Neural Networks for Object(Car License) Detection
- Extracting text from an image using Ocropus
- Number plate recognition with Tensorflow
github
- Using deep learning to break a Captcha system
report
github
- Breaking reddit captcha with 96% accuracy
github
- Scene Text Recognition in iOS 11
github
Original: https://github.com/chongyangtao/Awesome-Scene-Text-Recognition