API Reference
Client API
The PanoOCR class is the main entry point.
PanoOCR
PanoOCR(engine: OCREngine, perspectives: Optional[Union[PerspectivePreset, List[PerspectiveMetadata]]] = None, dedup_options: Optional[DedupOptions] = None)
Pipeline-first API for panorama OCR.
This class provides a high-level interface for running OCR on equirectangular panorama images with automatic perspective projection and deduplication.
Example:

    from panoocr import PanoOCR
    from panoocr.engines.macocr import MacOCREngine

    engine = MacOCREngine()
    pano = PanoOCR(engine)
    result = pano.recognize("panorama.jpg")
    result.save_json("results.json")

Attributes:
| Name | Type | Description |
|---|---|---|
| engine | | The OCR engine to use for text recognition. |
| perspectives | | List of perspective configurations. |
| dedup_options | | Deduplication options. |

Initialize PanoOCR.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| engine | OCREngine | OCR engine implementing the OCREngine protocol. | required |
| perspectives | Optional[Union[PerspectivePreset, List[PerspectiveMetadata]]] | Perspective configuration: either a preset or a custom list of PerspectiveMetadata. When None, the DEFAULT preset is used. | None |
| dedup_options | Optional[DedupOptions] | Deduplication options. Uses defaults if not provided. | None |

Source code in src/panoocr/api/client.py
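The constructor accepts any engine that satisfies the OCREngine protocol rather than a specific base class. The sketch below illustrates that structural-typing idea with Python's `typing.Protocol`. The names `EngineLike`, `Detection`, `FixedTextEngine`, and the `recognize(image)` method are hypothetical stand-ins chosen for illustration; the methods the real OCREngine protocol requires are not listed on this page.

```python
from dataclasses import dataclass
from typing import List, Protocol, runtime_checkable


@dataclass
class Detection:
    """Hypothetical stand-in for a single OCR detection."""
    text: str
    confidence: float


@runtime_checkable
class EngineLike(Protocol):
    """Hypothetical protocol; the real OCREngine may require other methods."""

    def recognize(self, image: object) -> List[Detection]:
        ...


class FixedTextEngine:
    """Toy engine that ignores the image and returns one canned detection."""

    def recognize(self, image: object) -> List[Detection]:
        return [Detection(text="EXIT", confidence=0.9)]


def run_engine(engine: EngineLike, image: object) -> List[str]:
    # Any object with a matching recognize() method is accepted;
    # it does not need to inherit from EngineLike.
    return [d.text for d in engine.recognize(image)]


texts = run_engine(FixedTextEngine(), image=None)
is_engine = isinstance(FixedTextEngine(), EngineLike)
```

With a protocol, a custom engine only has to provide the expected methods, which keeps third-party engines free of any import-time dependency on a base class.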
recognize
recognize(image: Union[str, Image], panorama_id: Optional[str] = None, show_progress: bool = True) -> OCRResult
Run OCR on a panorama image.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Union[str, Image] | Path to panorama image or PIL Image. | required |
| panorama_id | Optional[str] | Optional identifier for the panorama. | None |
| show_progress | bool | Whether to show a progress bar. | True |

Returns:

| Type | Description |
|---|---|
| OCRResult | OCRResult containing deduplicated sphere OCR results. |

Source code in src/panoocr/api/client.py
recognize_multi
recognize_multi(image: Union[str, Image], presets: Sequence[PerspectivePreset], panorama_id: Optional[str] = None, show_progress: bool = True) -> OCRResult
Run OCR on a panorama using multiple perspective presets.
Useful for multi-scale detection to catch both small and large text.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| image | Union[str, Image] | Path to panorama image or PIL Image. | required |
| presets | Sequence[PerspectivePreset] | Sequence of perspective presets to use. | required |
| panorama_id | Optional[str] | Optional identifier for the panorama. | None |
| show_progress | bool | Whether to show a progress bar. | True |

Returns:

| Type | Description |
|---|---|
| OCRResult | OCRResult containing deduplicated sphere OCR results. |

Source code in src/panoocr/api/client.py
OCRResult
dataclass
OCRResult(results: Sequence[SphereOCRResult], image_path: Optional[str] = None, perspective_preset: Optional[str] = None, perspective_presets: Optional[Sequence[str]] = None)
OCR output plus metadata, with preview-tool-friendly JSON export.
Attributes:
| Name | Type | Description |
|---|---|---|
| results | Sequence[SphereOCRResult] | List of deduplicated sphere OCR results. |
| image_path | Optional[str] | Optional path to the source image. |
| perspective_preset | Optional[str] | Name of the perspective preset used. |
| perspective_presets | Optional[Sequence[str]] | List of perspective preset names if multiple were used. |

to_dict
Convert to a dictionary for JSON serialization.
Source code in src/panoocr/api/models.py
save_json
Save OCR results in a JSON file suitable for the preview tool.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Output file path. | required |

from_dict
classmethod
Create an OCRResult from a dictionary.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data | dict | Dictionary with OCR result data. | required |

Returns:

| Type | Description |
|---|---|
| OCRResult | OCRResult instance. |

Source code in src/panoocr/api/models.py
load_json
classmethod
Load OCR results from a JSON file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | str | Input file path. | required |

Returns:

| Type | Description |
|---|---|
| OCRResult | OCRResult instance. |

Source code in src/panoocr/api/models.py
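The four methods above form a symmetric pair: `to_dict`/`from_dict` convert between the object and a plain dictionary, and `save_json`/`load_json` add file I/O on top. The sketch below shows that round-trip pattern on a stand-in dataclass, `MiniResult`, which is hypothetical. It reuses the documented attribute names, but the entries in `results` are plain dicts with made-up keys (`text`, `yaw`, `pitch`), and the actual JSON layout written by the library is not shown on this page.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class MiniResult:
    """Stand-in with the documented OCRResult attribute names (hypothetical)."""
    results: List[dict] = field(default_factory=list)
    image_path: Optional[str] = None
    perspective_preset: Optional[str] = None
    perspective_presets: Optional[List[str]] = None

    def to_dict(self) -> dict:
        # asdict() recursively copies the fields into plain containers.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "MiniResult":
        return cls(
            results=list(data.get("results", [])),
            image_path=data.get("image_path"),
            perspective_preset=data.get("perspective_preset"),
            perspective_presets=data.get("perspective_presets"),
        )

    def save_json(self, path: str) -> None:
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.to_dict(), f, ensure_ascii=False, indent=2)

    @classmethod
    def load_json(cls, path: str) -> "MiniResult":
        with open(path, "r", encoding="utf-8") as f:
            return cls.from_dict(json.load(f))


original = MiniResult(
    results=[{"text": "EXIT", "yaw": 12.5, "pitch": -3.0}],  # made-up entry keys
    image_path="panorama.jpg",
    perspective_preset="default",
)

# Write to a temporary directory and read the file back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.json")
    original.save_json(path)
    restored = MiniResult.load_json(path)
```

Keeping the dictionary conversion separate from file I/O means the same `to_dict` output can also be sent over a network or embedded in a larger JSON document.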
PerspectivePreset
Bases: str, Enum
Pre-defined perspective configurations for common text scales.
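Because the class derives from both `str` and `Enum`, its members behave like plain strings in comparisons and JSON output. The sketch below demonstrates this on a hypothetical `ExamplePreset`. Only the member name DEFAULT appears elsewhere in this reference; its string value and the `ZOOMED_IN` member are assumptions made for illustration.

```python
import json
from enum import Enum


class ExamplePreset(str, Enum):
    DEFAULT = "default"      # name taken from the docs; the value is assumed
    ZOOMED_IN = "zoomed_in"  # hypothetical member


# Look up a member from its string value.
from_string = ExamplePreset("default")

# Members compare equal to plain strings thanks to the str mixin.
equals_plain = ExamplePreset.DEFAULT == "default"

# The json module serializes str-based members as their string value.
as_json = json.dumps({"preset": ExamplePreset.ZOOMED_IN})
```

This is likely why OCRResult can store preset names as plain `Optional[str]` values: a str-based member round-trips through JSON without a custom encoder.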
OCROptions
dataclass
Options passed to the underlying OCR engine.
Attributes:
| Name | Type | Description |
|---|---|---|
| config | dict \| None | Engine-specific configuration dictionary. |

DedupOptions
dataclass
DedupOptions(min_text_similarity: float = 0.5, min_intersection_ratio_for_similar_text: float = 0.5, min_text_overlap: float = 0.5, min_intersection_ratio_for_overlapping_text: float = 0.15, min_intersection_ratio: float = 0.1)
Deduplication options applied after multi-view OCR.
Attributes:
| Name | Type | Description |
|---|---|---|
| min_text_similarity | float | Minimum Levenshtein similarity for text comparison. |
| min_intersection_ratio_for_similar_text | float | Minimum region overlap for similar texts. |
| min_text_overlap | float | Minimum overlap similarity for text comparison. |
| min_intersection_ratio_for_overlapping_text | float | Minimum region overlap for overlapping texts. |
| min_intersection_ratio | float | Minimum region intersection ratio threshold. |

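To make the first two thresholds concrete, here is a minimal sketch of how a text-similarity check and a region-overlap check could be combined. It is one plausible reading, not the library's confirmed algorithm. It assumes similarity is `1 - distance / max(len)`, that regions are axis-aligned boxes in a hypothetical `(left, top, right, bottom)` layout, and that the intersection ratio is measured against the smaller box.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed layout


def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def text_similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def intersection_ratio(a: Box, b: Box) -> float:
    """Intersection area divided by the smaller box's area (an assumption)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0
    smaller = min((a[2] - a[0]) * (a[3] - a[1]), (b[2] - b[0]) * (b[3] - b[1]))
    return (width * height) / smaller if smaller > 0 else 0.0


def looks_like_duplicate(
    text_a: str,
    box_a: Box,
    text_b: str,
    box_b: Box,
    min_text_similarity: float = 0.5,
    min_intersection_ratio_for_similar_text: float = 0.5,
) -> bool:
    # Both conditions must hold: the texts are similar AND the regions overlap.
    return (
        text_similarity(text_a, text_b) >= min_text_similarity
        and intersection_ratio(box_a, box_b) >= min_intersection_ratio_for_similar_text
    )


overlapping = looks_like_duplicate("EXIT", (0, 0, 10, 10), "EX1T", (5, 0, 15, 10))
far_apart = looks_like_duplicate("EXIT", (0, 0, 10, 10), "EXIT", (20, 0, 30, 10))
```

In this sketch, raising either threshold makes deduplication more conservative, so more near-identical detections from neighboring perspectives would survive.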
Module Structure
panoocr/
├── api/ # Client API
│ ├── client.py # PanoOCR
│ └── models.py # OCRResult, options, OCREngine protocol
├── engines/ # OCR engines (lazily imported)
│ ├── macocr.py # MacOCREngine (requires [macocr])
│ ├── easyocr.py # EasyOCREngine (requires [easyocr])
│ ├── paddleocr.py # PaddleOCREngine (requires [paddleocr])
│ ├── florence2.py # Florence2OCREngine (requires [florence2])
│ └── trocr.py # TrOCREngine (requires [trocr])
├── ocr/ # OCR result models
│ ├── models.py # FlatOCRResult, SphereOCRResult
│ └── utils.py # Visualization (requires [viz])
├── dedup/ # Deduplication
│ └── detection.py # SphereOCRDuplicationDetectionEngine
├── image/ # Panorama handling
│ ├── models.py # PanoramaImage, PerspectiveMetadata
│ └── perspectives.py # Presets, generate_perspectives()
└── geometry.py # Coordinate conversion utilities
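The tree notes that engines are lazily imported and that each one needs an optional extra. A common way to get that behavior is to defer the import until first use and turn a missing dependency into a helpful error. The sketch below shows the pattern with `importlib`. The helper `import_optional` is hypothetical and is not taken from the panoocr source.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import module_name on demand, naming the extra if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is not available; install the [{extra}] extra to use it"
        ) from exc


# "json" stands in for an optional dependency that happens to be installed.
json_module = import_optional("json", extra="example")

# A missing dependency produces an error that names the extra to install.
try:
    import_optional("surely_missing_dependency_xyz", extra="macocr")
except ImportError as exc:
    hint = str(exc)
```

Deferring the import this way keeps the top-level package importable even when none of the heavy engine dependencies are installed.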
Submodules
- Engines - OCREngine protocol and built-in engines
- Image - Panorama and perspective classes
- OCR Models - OCR result types
- Deduplication - Text deduplication
- Geometry - Coordinate conversion
- Visualization - OCR visualization (requires [viz])