sahi: a vision library for performing sliced inference to detect small objects in large images

What it solves

SAHI (Slicing Aided Hyper Inference) addresses the difficulty of detecting small objects in very large images. Standard object detection models typically resize a high-resolution image down to a fixed input size, so an object that covers only a few dozen pixels in the original can shrink to a handful of pixels and lose the detail the model needs. SAHI works around this with "sliced inference."
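To make the loss of detail concrete, the short sketch below computes how large an object appears after the whole image is resized to a model's input size, compared with feeding the model a slice at native resolution. The function name and the example numbers (a 4000-pixel-wide image, a 640-pixel model input, a 20-pixel object) are illustrative assumptions, not values taken from SAHI.

```python
def apparent_size(object_px: float, image_px: int, model_input_px: int) -> float:
    """Object size in pixels after an image side of image_px is resized to model_input_px."""
    scale = model_input_px / image_px
    return object_px * scale


# Whole image resized to the model input: the object shrinks with it.
full = apparent_size(20, 4000, 640)

# A 640-pixel slice fed at native resolution: the object keeps its size.
sliced = apparent_size(20, 640, 640)

print(f"full image: {full:.1f} px, sliced: {sliced:.1f} px")
```

Under these assumed numbers the object drops to roughly 3 pixels when the full image is resized, while it stays at 20 pixels inside a slice.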

How it works

Instead of processing a large image as a single unit, SAHI slices it into smaller, overlapping patches and passes each patch through the detection model independently. It then maps the detections from every patch back into the coordinates of the original image and reconciles duplicates that arise where patches overlap. Because each patch is seen at or near its native resolution, small objects keep the detail that a single full-image resize would discard.
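The slice, detect, and merge steps can be sketched in plain Python. This is a minimal illustration under stated assumptions, not SAHI's implementation: every function name here is hypothetical, the detector is replaced by hard-coded per-slice results, and duplicates are resolved with a simple greedy non-maximum suppression, which stands in for whichever merging strategy the library is configured to use.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, score


def slice_starts(length: int, size: int, overlap_ratio: float) -> List[int]:
    """Start offsets of windows of `size` that cover `length` with the given overlap."""
    if size >= length:
        return [0]
    step = max(1, int(size * (1 - overlap_ratio)))
    starts = list(range(0, length - size, step))
    starts.append(length - size)  # final window sits flush with the far edge
    return starts


def make_slices(width: int, height: int, size: int, overlap_ratio: float):
    """Slice rectangles (x1, y1, x2, y2) in image coordinates, row by row."""
    return [
        (x, y, min(x + size, width), min(y + size, height))
        for y in slice_starts(height, size, overlap_ratio)
        for x in slice_starts(width, size, overlap_ratio)
    ]


def to_image_coords(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a box from slice-local coordinates to full-image coordinates."""
    ox, oy = origin
    x1, y1, x2, y2, score = box
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge(boxes: List[Box], iou_threshold: float = 0.5) -> List[Box]:
    """Greedy NMS: keep the highest-scoring box among heavily overlapping ones."""
    kept: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_threshold for k in kept):
            kept.append(box)
    return kept


# A 1000x700 image cut into 400-pixel slices with 25% overlap.
slices = make_slices(1000, 700, 400, 0.25)

# Stand-in for the detector: the same object is seen in two overlapping slices,
# keyed by each slice's top-left corner.
per_slice: Dict[Tuple[int, int], List[Box]] = {
    (0, 0): [(310.0, 310.0, 330.0, 330.0, 0.9)],
    (300, 300): [(10.0, 10.0, 30.0, 30.0, 0.8)],
}

shifted = [
    to_image_coords(box, origin)
    for origin, boxes in per_slice.items()
    for box in boxes
]
merged = merge(shifted)
print(len(slices), merged)
```

The overlap is what makes the merge step necessary: an object near a slice boundary is fully visible in at least one slice, but it may also be reported by a neighbouring slice, so the shifted boxes have to be deduplicated before they are returned.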

Who it’s for

SAHI is designed for developers and researchers who work with high-resolution imagery, such as satellite imagery, medical imaging, or large-scale industrial inspection, and who need to detect small objects using existing object detection frameworks.

Highlights

  • Framework Agnostic: Supports a wide range of popular models from Ultralytics (YOLO), MMDetection, HuggingFace, TorchVision, and Roboflow.
  • Sliced Inference: Enables high-precision detection of small objects in large-scale images.
  • COCO Utilities: Includes tools for slicing COCO annotations, converting datasets to YOLO format, and performing error analysis.
  • Integration: Works with FiftyOne for interactive visualization and inspection of prediction results.
