Engines#
PanoOCR uses dependency injection for OCR engines. Provide any object with a matching recognize() method.
OCREngine Protocol#
OCREngine#
Bases: Protocol
Protocol for OCR engines (structural typing).
Any class with a matching recognize() method can be used.
No inheritance required.
recognize#
Recognize text in an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| list[FlatOCRResult] | List of FlatOCRResult objects with normalized bounding boxes (0-1 range). |
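The structural typing described above can be illustrated with a small, self-contained sketch. The names `Result`, `Engine`, `EchoEngine`, and `run` are hypothetical stand-ins and are not part of PanoOCR; the real `OCREngine` and `FlatOCRResult` definitions may differ.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@dataclass
class Result:
    """Hypothetical stand-in for FlatOCRResult, reduced to two fields."""
    text: str
    confidence: float


@runtime_checkable
class Engine(Protocol):
    """Hypothetical stand-in for the OCREngine protocol."""

    def recognize(self, image: object) -> list[Result]: ...


class EchoEngine:
    """Satisfies Engine without inheriting from it."""

    def recognize(self, image: object) -> list[Result]:
        return [Result(text=str(image), confidence=1.0)]


class NotAnEngine:
    """Has no recognize() method, so it does not satisfy the protocol."""


def run(engine: Engine, image: object) -> list[str]:
    # The engine is injected by the caller; only recognize() is required.
    return [r.text for r in engine.recognize(image)]


texts = run(EchoEngine(), "demo")
```

Note that an `isinstance` check against a `runtime_checkable` protocol only verifies that a `recognize` attribute exists; it does not validate the signature or the return type.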
MacOCREngine#
Uses Apple's Vision Framework for fast, accurate OCR on macOS. Requires the [macocr] extra.
MacOCREngine#
OCR engine using Apple Vision Framework via ocrmac.
This engine uses macOS's built-in Vision Framework for text recognition. It provides excellent accuracy for many languages on Apple Silicon.
Attributes:
| Name | Type | Description |
|---|---|---|
| language_preference | | List of language codes to use for recognition. |
| recognition_level | | Recognition accuracy level ("fast" or "accurate"). |
Example
# MacOCRRecognitionLevel is assumed to be exported from the same module.
from panoocr.engines.macocr import MacOCREngine, MacOCRLanguageCode, MacOCRRecognitionLevel
engine = MacOCREngine(config={
    "language_preference": [MacOCRLanguageCode.ENGLISH_US],
    "recognition_level": MacOCRRecognitionLevel.ACCURATE,
})
results = engine.recognize(image)  # image: a PIL Image loaded by the caller
Note
Requires macOS and the ocrmac package. Install with: pip install "panoocr[macocr]"
Initialize the MacOCR engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] \| None | Configuration dictionary with optional keys: language_preference (list of MacOCRLanguageCode values); recognition_level (a MacOCRRecognitionLevel value). | None |
Raises:
| Type | Description |
|---|---|
| ImportError | If ocrmac is not installed. |
| ValueError | If configuration values are invalid. |
Source code in src/panoocr/engines/macocr.py
recognize#
Recognize text in an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| List[FlatOCRResult] | List of FlatOCRResult with normalized bounding boxes. |
EasyOCREngine#
Cross-platform OCR supporting 80+ languages. Requires the [easyocr] extra.
EasyOCREngine#
OCR engine using EasyOCR library.
EasyOCR supports 80+ languages and can run on CPU or GPU. It provides good accuracy for many scripts including CJK.
Attributes:
| Name | Type | Description |
|---|---|---|
| language_preference | | List of language codes to use. |
| reader | | EasyOCR Reader instance. |
Example
from panoocr.engines.easyocr import EasyOCREngine, EasyOCRLanguageCode
engine = EasyOCREngine(config={
    "language_preference": [EasyOCRLanguageCode.ENGLISH],
    "gpu": True,
})
results = engine.recognize(image)  # image: a PIL Image loaded by the caller
Note
Install with: pip install "panoocr[easyocr]"
For GPU support, install PyTorch with CUDA.
Initialize the EasyOCR engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] \| None | Configuration dictionary with optional keys: language_preference (list of EasyOCRLanguageCode values); gpu (whether to use the GPU, default: True). | None |
Raises:
| Type | Description |
|---|---|
| ImportError | If easyocr is not installed. |
| ValueError | If configuration values are invalid. |
Source code in src/panoocr/engines/easyocr.py
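Because each engine constructor raises ImportError when its optional dependency is missing, a caller can try several engines in order and keep the first one that constructs. The following is a minimal sketch under that assumption; `first_available`, `broken_factory`, and `DummyEngine` are hypothetical names used for illustration and are not part of PanoOCR.

```python
from typing import Any, Callable, Sequence


def first_available(factories: Sequence[Callable[[], Any]]) -> Any:
    """Return the first engine whose factory does not raise ImportError."""
    missing: list[str] = []
    for factory in factories:
        try:
            return factory()
        except ImportError as exc:
            # Remember why this backend was skipped, then try the next one.
            missing.append(str(exc))
    raise ImportError("no OCR engine could be constructed: " + "; ".join(missing))


def broken_factory() -> Any:
    """Hypothetical factory simulating a missing optional dependency."""
    raise ImportError("optional backend not installed")


class DummyEngine:
    """Hypothetical engine that always constructs and finds no text."""

    def recognize(self, image: Any) -> list:
        return []


engine = first_available([broken_factory, DummyEngine])
```

In real use the factories would be callables such as `lambda: EasyOCREngine()`, listed in order of preference.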
recognize#
Recognize text in an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| List[FlatOCRResult] | List of FlatOCRResult with normalized bounding boxes. |
PaddleOCREngine#
PaddlePaddle-based OCR with optional V4 server model for Chinese text. Requires the [paddleocr] extra.
PaddleOCREngine#
OCR engine using PaddleOCR library.
PaddleOCR is developed by PaddlePaddle and supports multiple languages. It provides good accuracy and can optionally use the V4 server model for better results on Chinese text.
Attributes:
| Name | Type | Description |
|---|---|---|
| language_preference | | Language code for recognition. |
| recognize_upside_down | | Whether to use the angle classifier. |
| use_v4_server | | Whether to use the V4 server model. |
Example
from panoocr.engines.paddleocr import PaddleOCREngine, PaddleOCRLanguageCode
engine = PaddleOCREngine(config={
    "language_preference": PaddleOCRLanguageCode.CHINESE,
    "use_gpu": True,
})
results = engine.recognize(image)  # image: a PIL Image loaded by the caller
Note
Install with: pip install "panoocr[paddleocr]"
For GPU support, install paddlepaddle-gpu.
Initialize the PaddleOCR engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] \| None | Configuration dictionary with optional keys: language_preference (a PaddleOCRLanguageCode value); recognize_upside_down (enable the angle classifier, default: False); use_v4_server (use the V4 server model for better Chinese OCR); use_gpu (whether to use the GPU, default: True); model_dir (custom directory for V4 server models). | None |
Raises:
| Type | Description |
|---|---|
| ImportError | If paddleocr is not installed. |
| ValueError | If configuration values are invalid. |
Source code in src/panoocr/engines/paddleocr.py
recognize#
Recognize text in an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| List[FlatOCRResult] | List of FlatOCRResult with normalized bounding boxes. |
Florence2OCREngine#
Microsoft's Florence-2 vision-language model for OCR. Requires the [florence2] extra.
Florence2OCREngine#
OCR engine using Microsoft's Florence-2 model.
Florence-2 is a vision-language model that can perform OCR with region detection. It provides good accuracy across many languages and can detect text in various orientations.
Attributes:
| Name | Type | Description |
|---|---|---|
| device | | Device to run inference on (cuda, mps, or cpu). |
| model | | The Florence-2 model. |
| processor | | The Florence-2 processor. |
Example
from panoocr.engines.florence2 import Florence2OCREngine
engine = Florence2OCREngine(config={
    "model_id": "microsoft/Florence-2-large",
})
results = engine.recognize(image)  # image: a PIL Image loaded by the caller
Note
Install with: pip install "panoocr[florence2]"
For GPU support, install PyTorch with CUDA.
Initialize the Florence-2 engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] \| None | Configuration dictionary with optional keys: model_id (HuggingFace model ID, default: "microsoft/Florence-2-large"); device (device to use: "cuda", "mps", "cpu", or None for auto). | None |
Raises:
| Type | Description |
|---|---|
| ImportError | If dependencies are not installed. |
Source code in src/panoocr/engines/florence2.py
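The documentation above does not spell out how `device=None` is resolved. One plausible order (CUDA, then MPS, then CPU) is sketched below; `resolve_device` is a hypothetical helper, not the engine's actual logic, and it falls back to "cpu" when PyTorch is not installed.

```python
from typing import Optional


def resolve_device(requested: Optional[str] = None) -> str:
    """Pick "cuda", "mps", or "cpu"; an explicit request is returned as-is."""
    if requested is not None:
        return requested
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to probe, so report CPU.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps only exists on newer PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = resolve_device()
```

The resulting string could then be passed as the `device` key of the config dictionary when an explicit choice is preferred over auto-detection.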
recognize#
Recognize text in an image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| List[FlatOCRResult] | List of FlatOCRResult with normalized bounding boxes. |
TrOCREngine#
Microsoft's TrOCR transformer-based OCR. Requires the [trocr] extra.
TrOCREngine#
OCR engine using Microsoft's TrOCR model.
TrOCR is a transformer-based OCR model that excels at single-line text recognition. It does NOT provide bounding boxes - it reads the entire image as a single text line.
WARNING: This engine is experimental and may not work well for panorama OCR since it doesn't detect text regions. Consider using Florence2OCREngine or other engines that provide region detection.
Attributes:
| Name | Type | Description |
|---|---|---|
| model | | The TrOCR model. |
| processor | | The TrOCR processor. |
Example
from panoocr.engines.trocr import TrOCREngine, TrOCRModel
engine = TrOCREngine(config={
    "model": TrOCRModel.MICROSOFT_TROCR_LARGE_PRINTED,
})
# Note: returns a single result for the entire image
results = engine.recognize(cropped_text_image)
Note
Install with: pip install "panoocr[trocr]"
For GPU support, install PyTorch with CUDA.
Initialize the TrOCR engine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | Dict[str, Any] \| None | Configuration dictionary with optional keys: model (a TrOCRModel enum value or a model ID string); device (device to use: "cuda", "mps", "cpu", or None for auto). | None |
Raises:
| Type | Description |
|---|---|
| ImportError | If dependencies are not installed. |
| ValueError | If configuration values are invalid. |
Source code in src/panoocr/engines/trocr.py
recognize#
Recognize text in an image.
NOTE: TrOCR treats the entire image as a single text line and does not provide bounding boxes. This makes it unsuitable for most panorama OCR use cases. The result will have a bounding box covering the entire image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as a PIL Image. | required |
Returns:
| Type | Description |
|---|---|
| List[FlatOCRResult] | List with a single FlatOCRResult covering the entire image, or an empty list if no text is recognized. |
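Since the result always covers the whole input, one way to use TrOCR on a larger image is to run it on crops produced by a separate detector and then map each full-crop box back into the parent image. The sketch below shows only that coordinate mapping; `Box` and `to_parent` are hypothetical helpers, not part of PanoOCR.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Normalized box (0-1); a hypothetical stand-in for BoundingBox."""
    left: float
    top: float
    right: float
    bottom: float


def to_parent(inner: Box, crop: Box) -> Box:
    """Map a box normalized to a crop into the parent image's 0-1 coordinates."""
    crop_width = crop.right - crop.left
    crop_height = crop.bottom - crop.top
    return Box(
        left=crop.left + inner.left * crop_width,
        top=crop.top + inner.top * crop_height,
        right=crop.left + inner.right * crop_width,
        bottom=crop.top + inner.bottom * crop_height,
    )


# A result covering the entire crop maps onto the crop's own region.
FULL_IMAGE = Box(0.0, 0.0, 1.0, 1.0)
crop_region = Box(0.25, 0.5, 0.75, 0.75)
mapped = to_parent(FULL_IMAGE, crop_region)
```

Here `mapped` equals `crop_region`, which is the expected outcome for a recognizer that reports the whole crop as its bounding box.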
Custom Engines#
Any class with a compatible recognize() method works:
from panoocr import PanoOCR, FlatOCRResult, BoundingBox
from PIL import Image

class MyEngine:
    def recognize(self, image: Image.Image) -> list[FlatOCRResult]:
        # Return list of FlatOCRResult with normalized bounding boxes (0-1)
        return [
            FlatOCRResult(
                text="Hello",
                confidence=0.95,
                bounding_box=BoundingBox(
                    left=0.1, top=0.2, right=0.4, bottom=0.3,
                    width=0.3, height=0.1
                ),
                engine="my_engine",
            )
        ]

pano = PanoOCR(engine=MyEngine())
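Many OCR backends report boxes in pixels, so a custom engine usually has to convert them to the 0-1 range first. A minimal sketch of that conversion follows; `normalize_box` is a hypothetical helper, and the returned keys simply mirror the `BoundingBox` keyword arguments used in the example above.

```python
def normalize_box(
    left_px: float,
    top_px: float,
    right_px: float,
    bottom_px: float,
    image_width: int,
    image_height: int,
) -> dict[str, float]:
    """Convert a pixel-space box into normalized (0-1) fields."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")

    def clamp(value: float) -> float:
        # Keep boxes that slightly overshoot the image inside the 0-1 range.
        return min(max(value, 0.0), 1.0)

    left = clamp(left_px / image_width)
    top = clamp(top_px / image_height)
    right = clamp(right_px / image_width)
    bottom = clamp(bottom_px / image_height)
    return {
        "left": left,
        "top": top,
        "right": right,
        "bottom": bottom,
        "width": right - left,
        "height": bottom - top,
    }


fields = normalize_box(256, 128, 768, 256, image_width=1024, image_height=512)
```

Assuming `BoundingBox` accepts these keyword arguments as in the example above, the result could be passed as `BoundingBox(**fields)` inside `recognize()`, with `image.size` supplying the width and height.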