
This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR).

Contributions are welcome, as is feedback.

Software

OCR engines

  • tesseract - The definitive Open Source OCR engine, Apache 2.0
  • EasyOCR - OCR engine built on PyTorch by JaidedAI, Apache 2.0
  • ocropus - OCR engine based on LSTM, Apache 2.0
  • ocropus 0.4 - Older v0.4 state of Ocropus, with tesseract 2.04 and iulib, C++
  • kraken - Ocropus fork with sane defaults
  • gocr - OCR engine under the GNU Public License led by Joerg Schulenburg.
  • Ocrad - The GNU OCR, GPL
  • ocular - Machine-learning OCR for historic documents
  • SwiftOCR - fast and simple OCR library written in Swift
  • attention-ocr - OCR engine using visual attention mechanisms
  • RWTH-OCR - The RWTH Aachen University Optical Character Recognition System
  • simple-ocr-opencv and its fork - A simple pythonic OCR engine using opencv and numpy
  • Calamari - OCR Engine based on OCRopy and Kraken
  • doctr - A seamless & high-performing OCR library powered by Deep Learning

Older and possibly abandoned OCR engines

  • Clara OCR - Open source OCR in C, GPL
  • Cuneiform - CuneiForm OCR was developed by Cognitive Technologies
  • Eye - an experimental Java OCR (image-to-text) application
  • kognition - An omnifont OCR software for KDE
  • OCRchie - Modular Optical Character Recognition Software
  • ocre - o.c.r. easy
  • xplab - A GTK 2 tool for pattern matching
  • hebOCR - Hebrew character recognition library (previously named hocr, see Wikipedia article), GPL

OCR file formats

hOCR

  • hocr-tools - Tools for doing various useful things with hOCR files, Apache 2.0
  • hocr-spec - hOCR 1.2 specification
  • ocr-transform - CLI tool to convert between hOCR and ALTO, MIT
  • hocr-parser - hOCR Specification Python Parser
  • hOCRTools - hOCR to ALTO conversion XSLT

ALTO XML

TEI

  • TEI-OCR - TEI customization for OCR generated layout and content information
  • TEI SIG on Libraries - Best Practices for TEI in Libraries
  • GDZ - METS/TEI-based GDZ document format

PAGE XML

  • PAGE-XML Schema - XML schema of the PAGE XML format along with documentation and examples
  • omni:us Pages Format (OPF) - XML schema very similar to PAGE XML that has some additional features.
  • py-pagexml - Python library for handling PAGE XML and OPF files.

OCR CLI

  • OCRmyPDF - Adds an OCR text layer to scanned PDF files, allowing them to be searched
  • Pdf2PdfOCR - A tool to OCR a PDF (or supported images) and add a text "layer" (a "pdf sandwich") in the original file making it a searchable PDF. GUI included. Tesseract and cuneiform supported.
  • Ocrocis - Project manager interface for Ocropy, see also external project homepage
  • tesseract-recognize - Tesseract-based tool that outputs result in Page XML format (docker image).

OCR GUI

  • moz-hocr-editor - Firefox add-on for editing hOCR files (discontinued)
  • qt-box-editor - Qt4 editor of tesseract-ocr box files.
  • ocr-gt-tools - Client-Server application for editing OCR ground truth.
  • Paperwork - Using scanners and OCR to grep paper documents the easy way.
  • Paperless - Scan, index, and archive all of your paper documents.
  • gImageReader - A simple Gtk/Qt front-end to tesseract-ocr.
  • VietOCR - A Java/.NET GUI frontend for the Tesseract OCR engine, including jTessBoxEditor, a graphical Tesseract box data editor.
  • PoCoTo - Fast interactive batch corrections of complete OCR error series in OCR'ed historical documents.
  • OCRFeeder - GTK graphical user interface that allows the users to correct characters or bounding boxes, ODT export and more.
  • PRImA PAGE Viewer - Java-based viewer for PAGE XML files (layout + text content). Also supports ALTO XML, FineReader XML, and hOCR.
  • LAREX - A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.
  • archiscribe - Web application for transcribing OCR ground truth from Archive.org. Deployed instance available at https://archiscribe.jbaiter.de/, results are available in @jbaiter/archiscribe-corpus.
  • nw-page-editor - Simple app for visual editing of Page XML files. Provides desktop and server docker-based versions.

OCR Preprocessing

OCR as a Service

OCR evaluation

OCR libraries by programming language

Crystal

Elixir

  • tesseract_ocr - Elixir library wrapping the tesseract executable.

Go

  • gosseract - Golang OCR library, wrapping Tesseract-ocr.

Java

  • Tess4J - Java Native Access bindings to Tesseract.
  • tess-two - Tools for compiling Tesseract on Android and Java API.

.NET

Object Pascal

PHP

Python

  • pytesseract - A Python wrapper for Google Tesseract.
  • pyocr - A Python wrapper for Tesseract and Cuneiform.
  • ocrodjvu - A library and standalone tool for doing OCR on DjVu documents, wrapping Cuneiform, gocr, ocrad, ocropus and tesseract
  • tesserocr - A Python wrapper for the tesseract-ocr API

JavaScript

  • ocracy - Pure JavaScript LSTM RNN implementation based on ocropus
  • gocr.js - JavaScript port (emscripten) of gocr
  • ocrad.js - JavaScript port (emscripten) of ocrad
  • tesseract.js - JavaScript port (emscripten) of Tesseract
  • node-tesseract-ocr - A simple wrapper for the Tesseract OCR package.
  • node-tesseract-native - C++ module for node providing OCR with tesseract and leptonica.

Ruby

  • rtesseract - Ruby library wrapping the tesseract and imagemagick executables.
  • ruby-tesseract - Native Tesseract bindings for Ruby MRI and JRuby
  • ocr_space - API wrapper for the free OCR service ocr.space. Includes a CLI.

Rust

  • tesseract.rs - Rust bindings for tesseract OCR.
  • leptess - Productive and safe Rust bindings/wrappers for tesseract and leptonica.

R

Swift

  • Tesseract OCR iOS - Swift and Objective-C wrapper for Tesseract OCR.
  • SwiftOCR - Fast and simple OCR library written in Swift. Optimized for recognizing short, one-line alphanumeric codes.

OCR training tools

  • glyph-miner - A system for extracting glyphs from early typeset prints
  • ocrodeg - Document image degradation for OCR data augmentation

Datasets

Ground Truth

  • Rescribe - Transcriptions of Caroline Minuscule Manuscripts, PDM 1.0

Literature

OCR-related publication and link lists

Blog Posts and Tutorials

OCR Showcases

  • abbyy-finereader-ocr-senate - Using OCR to parse scanned Senate Financial Disclosure forms.
  • cvOCR - An OCR system for recognizing résumé or CV text, implemented in Python and C and based on tesseract
  • MathOCR - A printed scientific document recognition system, pre-alpha

Academic articles

2011 and before

2012

2013

2014

2015

2016

2017

2018

Source: https://github.com/kba/awesome-ocr