OCR Models#
Data classes for OCR results in different coordinate systems.
BoundingBox#
BoundingBox `dataclass`
Normalized bounding box with coordinates in 0-1 range.
Coordinates are relative to image dimensions:
- (0, 0) is top-left
- (1, 1) is bottom-right
Attributes:
| Name | Type | Description |
|---|---|---|
| `left` | `float` | Distance from left edge (0-1). |
| `top` | `float` | Distance from top edge (0-1). |
| `right` | `float` | Distance from left edge to right side (0-1). |
| `bottom` | `float` | Distance from top edge to bottom side (0-1). |
| `width` | `float` | Box width (0-1). |
| `height` | `float` | Box height (0-1). |
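Because the coordinates are normalized, mapping a box back onto an image is a matter of scaling by the image size. The following is a minimal sketch of that scaling, not part of the library: `Box` is a hypothetical stand-in that mirrors the attributes listed above, `to_pixels` is an illustrative helper, and the example values assume that `width` equals `right - left` and `height` equals `bottom - top`.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in mirroring the documented BoundingBox attributes."""
    left: float
    top: float
    right: float
    bottom: float
    width: float
    height: float


def to_pixels(box: Box, image_width: int, image_height: int) -> tuple[int, int, int, int]:
    """Scale normalized edges to pixel coordinates (left, top, right, bottom)."""
    return (
        round(box.left * image_width),
        round(box.top * image_height),
        round(box.right * image_width),
        round(box.bottom * image_height),
    )


# Assumed example values: a box covering the middle half of the image width.
box = Box(left=0.25, top=0.5, right=0.75, bottom=0.75, width=0.5, height=0.25)
pixels = to_pixels(box, image_width=1920, image_height=1080)
print(pixels)  # (480, 540, 1440, 810)
```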
FlatOCRResult#
OCR result from a flat (perspective) image with normalized bounding box coordinates.
FlatOCRResult `dataclass`
`FlatOCRResult(text: str, confidence: float, bounding_box: BoundingBox, engine: Optional[str] = None)`
OCR result from a flat (perspective) image.
Attributes:
| Name | Type | Description |
|---|---|---|
| `text` | `str` | Recognized text content. |
| `confidence` | `float` | Recognition confidence (0-1). |
| `bounding_box` | `BoundingBox` | Normalized bounding box in image coordinates. |
| `engine` | `Optional[str]` | Name of the OCR engine used. |
to_dict
from_dict `classmethod`
Create from dictionary.
Source code in src/panoocr/ocr/models.py
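A `to_dict` / `from_dict` pair is typically used for a serialization round trip. The sketch below illustrates that pattern with hypothetical stand-in classes (`Box`, `FlatResult`). The dictionary layout produced by the real `to_dict` is not shown on this page, so the nested layout used here is an assumption.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Box:
    """Hypothetical stand-in for BoundingBox."""
    left: float
    top: float
    right: float
    bottom: float
    width: float
    height: float


@dataclass
class FlatResult:
    """Hypothetical stand-in for FlatOCRResult."""
    text: str
    confidence: float
    bounding_box: Box
    engine: Optional[str] = None

    def to_dict(self) -> dict:
        # asdict converts nested dataclasses into nested dictionaries.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "FlatResult":
        # Assumed layout: the bounding box is stored as a nested dictionary.
        return cls(
            text=data["text"],
            confidence=data["confidence"],
            bounding_box=Box(**data["bounding_box"]),
            engine=data.get("engine"),
        )


original = FlatResult(
    text="EXIT",
    confidence=0.5,
    bounding_box=Box(0.25, 0.5, 0.75, 0.75, 0.5, 0.25),
    engine="example-engine",  # placeholder name, not a real engine identifier
)
restored = FlatResult.from_dict(original.to_dict())
print(restored == original)  # True
```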
to_sphere
`to_sphere(horizontal_fov: float, vertical_fov: float, yaw_offset: float, pitch_offset: float) -> 'SphereOCRResult'`
Convert to spherical OCR result using camera parameters.
All parameters are in degrees.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `horizontal_fov` | `float` | Horizontal field of view of the camera. | required |
| `vertical_fov` | `float` | Vertical field of view of the camera. | required |
| `yaw_offset` | `float` | Horizontal offset of the camera. | required |
| `pitch_offset` | `float` | Vertical offset of the camera. | required |
Returns:
| Type | Description |
|---|---|
| `'SphereOCRResult'` | SphereOCRResult with spherical coordinates. |
Source code in src/panoocr/ocr/models.py
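To give a feel for how the four parameters relate a normalized box to spherical angles, here is a rough sketch using a simple linear (small-angle) mapping. This is not necessarily how `to_sphere` is implemented; the actual method may use an exact perspective projection. The names `SphereBox` and `flat_to_sphere_linear` are hypothetical, and the sketch assumes that positive pitch points upward and that the offsets describe the direction of the image center.

```python
from dataclasses import dataclass


@dataclass
class SphereBox:
    """Hypothetical container for the angular result, all values in degrees."""
    yaw: float
    pitch: float
    width: float
    height: float


def flat_to_sphere_linear(
    left: float, top: float, right: float, bottom: float,
    horizontal_fov: float, vertical_fov: float,
    yaw_offset: float, pitch_offset: float,
) -> SphereBox:
    """Linear approximation: treat normalized offsets as fractions of the FOV."""
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2
    # Offset from the image center, scaled by the field of view.
    yaw = yaw_offset + (center_x - 0.5) * horizontal_fov
    # Image y grows downward; pitch is assumed to grow upward.
    pitch = pitch_offset - (center_y - 0.5) * vertical_fov
    # Wrap yaw into the [-180, 180) range.
    yaw = (yaw + 180.0) % 360.0 - 180.0
    return SphereBox(
        yaw=yaw,
        pitch=pitch,
        width=(right - left) * horizontal_fov,
        height=(bottom - top) * vertical_fov,
    )


# A box in the right half of a 90x60 degree view whose center faces yaw 170.
result = flat_to_sphere_linear(
    left=0.5, top=0.25, right=1.0, bottom=0.75,
    horizontal_fov=90.0, vertical_fov=60.0,
    yaw_offset=170.0, pitch_offset=0.0,
)
print(result)
```

In this example the box center lies 22.5 degrees to the right of the view center, which pushes the yaw past 180 and wraps it around to -167.5. The linear mapping becomes less accurate toward the image edges and for wide fields of view.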
SphereOCRResult#
OCR result in spherical (panorama) coordinates.
SphereOCRResult `dataclass`
`SphereOCRResult(text: str, confidence: float, yaw: float, pitch: float, width: float, height: float, engine: Optional[str] = None)`
Attributes:
| Name | Type | Description |
|---|---|---|
| `text` | `str` | Recognized text content. |
| `confidence` | `float` | Recognition confidence (0-1). |
| `yaw` | `float` | Horizontal angle in degrees (-180 to 180). |
| `pitch` | `float` | Vertical angle in degrees (-90 to 90). |
| `width` | `float` | Angular width in degrees. |
| `height` | `float` | Angular height in degrees. |
| `engine` | `Optional[str]` | Name of the OCR engine used. |
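Spherical results are often drawn back onto an equirectangular panorama. The sketch below shows one common mapping from `yaw` and `pitch` to pixel coordinates. The helper `sphere_to_equirect_pixel` is hypothetical and not part of the documented API, and it assumes that yaw 0 sits at the horizontal center of the panorama and that positive pitch points upward.

```python
def sphere_to_equirect_pixel(
    yaw: float, pitch: float, pano_width: int, pano_height: int
) -> tuple[float, float]:
    """Map angles in degrees to (x, y) pixels on an equirectangular image."""
    # Yaw -180..180 spans the full image width, left to right.
    x = (yaw + 180.0) / 360.0 * pano_width
    # Pitch 90..-90 spans the full image height, top to bottom.
    y = (90.0 - pitch) / 180.0 * pano_height
    return x, y


center = sphere_to_equirect_pixel(yaw=90.0, pitch=45.0, pano_width=4096, pano_height=2048)
print(center)  # (3072.0, 512.0)
```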