Engines#
PanoSAM uses dependency injection for segmentation engines. Provide any object with a matching segment() method.
SegmentationEngine Protocol#
SegmentationEngine#
Bases: Protocol

Protocol for segmentation engines (structural typing). Any class with a matching segment() method can be used; no inheritance is required.
segment#

```python
segment(image: Image, text_prompt: str, threshold: float = 0.5, mask_threshold: float = 0.5, simplify_tolerance: float = 0.005) -> list[FlatMaskResult]
```
Segment objects in an image using a text prompt.
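Because the protocol uses structural typing, any object with a matching method satisfies it without subclassing. A stripped-down sketch of this behaviour (the simplified signature here is illustrative, not the full one above):

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SegmentationEngine(Protocol):
    # Simplified stand-in for the protocol described above
    def segment(self, image, text_prompt: str) -> list: ...


class Dummy:
    # Note: no inheritance from SegmentationEngine
    def segment(self, image, text_prompt: str) -> list:
        return []


# runtime_checkable isinstance() checks only that the method exists,
# not that its signature matches
print(isinstance(Dummy(), SegmentationEngine))  # True
```

Static type checkers such as mypy verify the full signature; the runtime check above is intentionally looser.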
SAM3Engine#
The built-in engine using Meta's SAM3 model. Requires the [sam3] extra.
SAM3Engine#
SAM3 segmentation engine using HuggingFace Transformers.
This engine uses the facebook/sam3 model for Promptable Concept Segmentation (PCS) on images. It supports text prompts to segment all instances of a concept.
Attributes:

| Name | Type | Description |
|---|---|---|
| model | | The SAM3 model. |
| processor | | The SAM3 processor for pre/post-processing. |
| device | | The device to run inference on (cuda, mps, or cpu). |
Note
Requires the SAM3 dependencies; install with pip install "panosam[sam3]". Also requires a HuggingFace login: huggingface-cli login.
Initialize the SAM3 engine.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| model_id | str | HuggingFace model ID for SAM3. | 'facebook/sam3' |
| device | Optional[str] | Device to use. If None, auto-detects (cuda > mps > cpu). | None |
| dtype | dtype | Data type for model weights. Defaults to torch.float32. | None |
Raises:

| Type | Description |
|---|---|
| ImportError | If SAM3 dependencies are not installed. |
Source code in src/panosam/engines/sam3.py
segment#

```python
segment(image: Image, text_prompt: str, threshold: float = 0.5, mask_threshold: float = 0.5, simplify_tolerance: float = 0.005, return_raw_masks: bool = False) -> List[FlatMaskResult] | Tuple[List[FlatMaskResult], List[np.ndarray]]
```
Segment objects in an image using a text prompt.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as PIL Image. | required |
| text_prompt | str | Text describing the objects to segment (e.g., "car", "person"). | required |
| threshold | float | Confidence threshold for detections (0-1). | 0.5 |
| mask_threshold | float | Threshold for binary mask generation (0-1). | 0.5 |
| simplify_tolerance | float | Tolerance for polygon simplification (0-1). | 0.005 |
| return_raw_masks | bool | If True, also return raw binary masks for visualization. | False |
Returns:

| Type | Description |
|---|---|
| List[FlatMaskResult] or Tuple[List[FlatMaskResult], List[ndarray]] | List of FlatMaskResult objects containing segmentation masks. If return_raw_masks=True, returns a tuple of (flat_results, raw_masks). |
Source code in src/panosam/engines/sam3.py
segment_with_boxes#

```python
segment_with_boxes(image: Image, boxes: List[Tuple[int, int, int, int]], box_labels: Optional[List[int]] = None, threshold: float = 0.5, mask_threshold: float = 0.5, simplify_tolerance: float = 0.005) -> List[FlatMaskResult]
```
Segment objects in an image using bounding box prompts.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as PIL Image. | required |
| boxes | List[Tuple[int, int, int, int]] | List of bounding boxes in (x1, y1, x2, y2) pixel format. | required |
| box_labels | Optional[List[int]] | List of labels (1 for positive, 0 for negative). Defaults to all positive. | None |
| threshold | float | Confidence threshold for detections (0-1). | 0.5 |
| mask_threshold | float | Threshold for binary mask generation (0-1). | 0.5 |
| simplify_tolerance | float | Tolerance for polygon simplification (0-1). | 0.005 |
Returns:

| Type | Description |
|---|---|
| List[FlatMaskResult] | List of FlatMaskResult objects containing segmentation masks. |
Source code in src/panosam/engines/sam3.py
get_raw_masks#

```python
get_raw_masks(image: Image, text_prompt: str, threshold: float = 0.5, mask_threshold: float = 0.5) -> Tuple[List[np.ndarray], List[float]]
```
Get raw binary masks without polygon conversion.
Useful when you need the full mask data rather than simplified polygons.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| image | Image | Input image as PIL Image. | required |
| text_prompt | str | Text describing the objects to segment. | required |
| threshold | float | Confidence threshold for detections (0-1). | 0.5 |
| mask_threshold | float | Threshold for binary mask generation (0-1). | 0.5 |
Returns:

| Type | Description |
|---|---|
| Tuple[List[ndarray], List[float]] | Tuple of (masks, scores) where masks are numpy arrays. |
Source code in src/panosam/engines/sam3.py
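For intuition on what mask_threshold does across all of the methods above, binarizing a per-pixel probability mask can be sketched in pure Python (an illustrative sketch, not the library's implementation, which operates on model output tensors):

```python
def binarize_mask(prob_mask, mask_threshold=0.5):
    """Turn per-pixel probabilities into a 0/1 mask (illustrative sketch).

    Whether the real comparison is strict (>) or inclusive (>=) is an
    implementation detail of the engine.
    """
    return [[1 if p >= mask_threshold else 0 for p in row] for row in prob_mask]


probs = [
    [0.1, 0.6],
    [0.8, 0.4],
]
print(binarize_mask(probs))  # [[0, 1], [1, 0]]
```

Raising mask_threshold shrinks masks toward their most confident core; lowering it grows them outward.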
Custom Engines#
Any class with a compatible segment() method works:
```python
import panosam as ps
from PIL import Image


class MyEngine:
    def segment(
        self,
        image: Image.Image,
        text_prompt: str,
        threshold: float = 0.5,
        mask_threshold: float = 0.5,
        simplify_tolerance: float = 0.005,
    ) -> list[ps.FlatMaskResult]:
        # Return a list of FlatMaskResult
        ...


client = ps.PanoSAM(engine=MyEngine())
```
Mask Results#
FlatMaskResult dataclass#

```python
FlatMaskResult(polygons: List[List[Tuple[float, float]]], score: float, label: Optional[str] = None, mask_id: Optional[str] = None)
```
A segmentation mask result in flat/perspective image coordinates.
Attributes:

| Name | Type | Description |
|---|---|---|
| polygons | List[List[Tuple[float, float]]] | List of polygons; each polygon is a list of (x, y) tuples in normalized coordinates (0-1 range, where (0, 0) is the top-left). |
| score | float | Confidence score for this mask (0-1). |
| label | Optional[str] | Optional text label for the segmented object. |
| mask_id | Optional[str] | Optional unique identifier for this mask. |
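Because polygon coordinates are normalized, they must be scaled by the image size before drawing or cropping. A minimal sketch (the helper name is ours, not part of the API):

```python
def polygon_to_pixels(polygon, width, height):
    """Scale normalized (x, y) points (0-1, origin top-left) to pixel coordinates."""
    return [(x * width, y * height) for x, y in polygon]


# A triangle in normalized coordinates, rendered onto a 100x200 image
poly = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75)]
print(polygon_to_pixels(poly, 100, 200))  # [(25.0, 50.0), (75.0, 50.0), (75.0, 150.0)]
```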
to_sphere#

```python
to_sphere(horizontal_fov: float, vertical_fov: float, yaw_offset: float, pitch_offset: float) -> SphereMaskResult
```
Convert flat mask result to spherical coordinates.
Uses proper 3D rotation to accurately map perspective image coordinates to equirectangular spherical coordinates.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| horizontal_fov | float | Horizontal field of view in degrees. | required |
| vertical_fov | float | Vertical field of view in degrees. | required |
| yaw_offset | float | Horizontal offset of the perspective in degrees. | required |
| pitch_offset | float | Vertical offset of the perspective in degrees. | required |
Returns:

| Type | Description |
|---|---|
| SphereMaskResult | SphereMaskResult with polygons in spherical coordinates. |
Source code in src/panosam/sam/models.py
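The 3D-rotation mapping can be sketched for a single point, assuming a pinhole camera model. This is an illustrative re-derivation, not the library's source; sign conventions may differ from the actual implementation:

```python
import math


def flat_point_to_sphere(x, y, horizontal_fov, vertical_fov, yaw_offset, pitch_offset):
    """Map a normalized image point (0-1, origin top-left) to (yaw, pitch) in degrees."""
    # Ray through the pixel in camera space: z forward, x right, y up
    vx = (x - 0.5) * 2.0 * math.tan(math.radians(horizontal_fov) / 2.0)
    vy = (0.5 - y) * 2.0 * math.tan(math.radians(vertical_fov) / 2.0)  # image y points down
    vz = 1.0

    # Tilt the ray up by pitch_offset (rotation about the x-axis)
    p = math.radians(pitch_offset)
    vy, vz = vy * math.cos(p) + vz * math.sin(p), -vy * math.sin(p) + vz * math.cos(p)

    # Then turn it right by yaw_offset (rotation about the y-axis)
    t = math.radians(yaw_offset)
    vx, vz = vx * math.cos(t) + vz * math.sin(t), -vx * math.sin(t) + vz * math.cos(t)

    yaw = math.degrees(math.atan2(vx, vz))
    pitch = math.degrees(math.asin(vy / math.sqrt(vx * vx + vy * vy + vz * vz)))
    return yaw, pitch


# The image centre maps straight to the view's (yaw_offset, pitch_offset)
print(flat_point_to_sphere(0.5, 0.5, 90, 60, 30, 10))  # ≈ (30.0, 10.0)
```

Applying the full rotation per vertex, rather than simply adding the offsets, is what keeps masks accurate away from the equator, where a perspective view's straight edges become curves in equirectangular space.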
to_dict#
Convert to dictionary representation.
from_binary_mask classmethod#

```python
from_binary_mask(mask: ndarray, score: float, label: Optional[str] = None, mask_id: Optional[str] = None, simplify_tolerance: float = 0.001, min_contour_area_ratio: float = 0.01) -> FlatMaskResult
```
Create a FlatMaskResult from a binary mask.
Extracts ALL significant contours from the mask, not just the largest.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| mask | ndarray | Binary mask as numpy array (H, W) with values 0 or 1/255. | required |
| score | float | Confidence score for this mask. | required |
| label | Optional[str] | Optional text label. | None |
| mask_id | Optional[str] | Optional unique identifier. | None |
| simplify_tolerance | float | Tolerance for polygon simplification (0-1). | 0.001 |
| min_contour_area_ratio | float | Minimum contour area as a ratio of the largest contour; smaller contours are discarded. | 0.01 |
Returns:

| Type | Description |
|---|---|
| FlatMaskResult | FlatMaskResult with normalized polygon coordinates. |
Source code in src/panosam/sam/models.py
SphereMaskResult dataclass#

```python
SphereMaskResult(polygons: List[List[Tuple[float, float]]], score: float, label: Optional[str] = None, mask_id: Optional[str] = None, center_yaw: float = 0.0, center_pitch: float = 0.0)
```
A segmentation mask result in spherical/panoramic coordinates.
Attributes:

| Name | Type | Description |
|---|---|---|
| polygons | List[List[Tuple[float, float]]] | List of polygons; each polygon is a list of (yaw, pitch) tuples in degrees. |
| score | float | Confidence score for this mask (0-1). |
| label | Optional[str] | Optional text label for the segmented object. |
| mask_id | Optional[str] | Optional unique identifier for this mask. |
| center_yaw | float | Yaw of the polygon centroid in degrees. |
| center_pitch | float | Pitch of the polygon centroid in degrees. |
to_dict#
Convert to dictionary representation.
Source code in src/panosam/sam/models.py
from_dict classmethod#
Create from dictionary representation.
Source code in src/panosam/sam/models.py
get_bounding_box#
Get the bounding box of all polygons.
Returns:

| Type | Description |
|---|---|
| Tuple[float, float, float, float] | Tuple of (min_yaw, min_pitch, max_yaw, max_pitch) in degrees. |
Source code in src/panosam/sam/models.py
get_area_estimate#
Estimate the total area of all polygons using the shoelace formula.
Returns:

| Type | Description |
|---|---|
| float | Estimated area in square degrees. |
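The shoelace formula behind get_area_estimate can be sketched in a few lines, treating (yaw, pitch) as planar coordinates (an illustrative re-implementation, not the library's source):

```python
def shoelace_area(polygon):
    """Absolute polygon area via the shoelace formula.

    polygon: list of (x, y) vertices, here (yaw, pitch) in degrees,
    treated as planar coordinates.
    """
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A 10 x 10 degree square covers 100 square degrees
print(shoelace_area([(0, 0), (10, 0), (10, 10), (0, 10)]))  # 100.0
```

Because degrees of yaw shrink in physical extent toward the poles, square degrees are a rough estimate, which is why the method is named an estimate.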