Perception and Grounding

FastSAM

FastSAM is a fast, Segment Anything-style image segmentation tool that supports four prompt modes: everything, point, box, and text.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Upload an image and optionally provide point, box, or text prompts; FastSAM returns segmentation masks. The paper reports a runtime about 50x faster than SAM-H with 32x32 point prompts.

Input: Image + optional prompt

Output: Segmentation masks

Trigger Timing: On demand, from the source demo or the local example command.

Runtime: Python / PyTorch / Gradio / Replicate

Before: Image + optional prompt

Prepare the input image and, if needed, a point, box, or text prompt.

After: Segmentation masks

Read the produced mask files and the mask overlay rendered on the input image.

Preset Example

A quick-run style example for the documentation page.

Input: tools/fastsam/images/dogs.jpg

Prompt: the yellow dog

Expected: A mask overlay and mask files for the selected object or all objects in the image.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

img_path (file)

Image to segment.

text_prompt (text; example: the yellow dog)

Text-guided prompt used to select one target region from candidate masks.

box_prompt (text)

Bounding box prompt in pixel coordinates, used when the target region is already localized.

point_prompt (text)

Foreground/background point coordinates for interactive segmentation.
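The box and point prompts are passed as text, so a caller has to turn them into numbers before using them. The sketch below shows one way to do that with the standard library. The helper names are hypothetical, the point coordinates are made up for illustration, and the `[x, y, width, height]` reading of the box is an assumption: this page only says the box is in pixel coordinates, so check the official repository for the expected order.

```python
import ast


def parse_prompt(text):
    """Parse a prompt string such as "[[570,200,230,400]]" into lists of ints."""
    value = ast.literal_eval(text)  # safe literal parsing, no code execution
    if not isinstance(value, list) or not all(
        isinstance(row, (list, tuple)) for row in value
    ):
        raise ValueError(f"expected a list of coordinate lists, got: {text!r}")
    return [[int(v) for v in row] for row in value]


def xywh_to_xyxy(box):
    """Convert [x, y, width, height] to [x1, y1, x2, y2].

    Assumption: the box is given as top-left corner plus size.
    """
    x, y, w, h = box
    return [x, y, x + w, y + h]


boxes = parse_prompt("[[570,200,230,400]]")   # box string from this page
corners = [xywh_to_xyxy(b) for b in boxes]    # only valid under the xywh assumption
points = parse_prompt("[[620,360]]")          # hypothetical point, for illustration
```

Parsing with `ast.literal_eval` rather than `eval` keeps a malformed or hostile prompt string from running arbitrary code.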

Output Explanation

mask

Binary segmentation mask for the selected object or image regions.

score

Mask confidence or proposal ranking score from the segmentation model.

overlay

Rendered mask visualization over the input image.
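To make the relationship between the mask and the overlay concrete, here is a minimal pure-Python sketch that counts mask pixels and alpha-blends a color over the masked region. It is not FastSAM's own rendering code; the function names, the tiny 2x2 image, the color, and the 0.5 blend factor are all illustrative assumptions.

```python
def mask_area(mask):
    """Count the foreground pixels in a binary mask given as rows of 0/1."""
    return sum(1 for row in mask for v in row if v)


def blend_overlay(image, mask, color=(200, 0, 0), alpha=0.5):
    """Blend `color` over the masked pixels of an RGB image.

    image: rows of (r, g, b) tuples; mask: rows of 0/1 with the same shape.
    """
    out = []
    for img_row, mask_row in zip(image, mask):
        row = []
        for pixel, m in zip(img_row, mask_row):
            if m:
                row.append(tuple(
                    round((1 - alpha) * p + alpha * c)
                    for p, c in zip(pixel, color)
                ))
            else:
                row.append(tuple(pixel))  # untouched outside the mask
        out.append(row)
    return out


# Toy 2x2 gray image with a single masked pixel in the top-left corner.
image = [[(100, 100, 100), (100, 100, 100)],
         [(100, 100, 100), (100, 100, 100)]]
mask = [[1, 0],
        [0, 0]]
overlay = blend_overlay(image, mask)
```

In practice the same blend would be done on whole arrays with an image library; the nested loops here only keep the example dependency-free.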

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Resources

GitHub: https://github.com/CASIA-IVA-Lab/FastSAM
Hugging Face Demo: https://huggingface.co/spaces/An-619/FastSAM
Code Download: https://github.com/CASIA-IVA-Lab/FastSAM/archive/refs/heads/main.zip
Paper: https://arxiv.org/abs/2306.12156
FastSAM Checkpoints: https://github.com/CASIA-IVA-Lab/FastSAM#model-checkpoints
Replicate Demo: https://replicate.com/casia-iva-lab/fastsam

Deployment Notes

Clone the official FastSAM repository and install PyTorch plus Ultralytics-style dependencies.
Download the official FastSAM checkpoint into tools/fastsam/weights/.
Run Inference.py for image-level tests or app_gradio.py for an interactive local demo.
Store masks and overlays under tools/fastsam/output/ or tools/fastsam/runs/.

Relative Path Example

# Relative-path local entry for the FastSAM tool folder (no prompt)
python tools/fastsam/Inference.py \
  --model_path tools/fastsam/weights/FastSAM.pt \
  --img_path tools/fastsam/images/dogs.jpg

# Prompt modes:
# Text prompt
python tools/fastsam/Inference.py \
  --model_path tools/fastsam/weights/FastSAM.pt \
  --img_path tools/fastsam/images/dogs.jpg \
  --text_prompt "the yellow dog"

# Box prompt
python tools/fastsam/Inference.py \
  --model_path tools/fastsam/weights/FastSAM.pt \
  --img_path tools/fastsam/images/dogs.jpg \
  --box_prompt "[[570,200,230,400]]"

# Interactive local demo
python tools/fastsam/app_gradio.py

# Suggested repository layout:
# tools/fastsam/README.md
# tools/fastsam/Inference.py
# tools/fastsam/app_gradio.py
# tools/fastsam/images/
# tools/fastsam/output/

# This page documents the path. The static page does not execute FastSAM.
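When the commands above need to be launched from a script, it can help to assemble the argument list in one place. The sketch below does only that: it builds and prints the command without running FastSAM. `build_inference_command` is a hypothetical helper, and it uses only the flags and paths that appear in this page's example commands.

```python
import shlex
from pathlib import PurePosixPath

TOOL_DIR = PurePosixPath("tools/fastsam")  # layout taken from this page


def build_inference_command(img_path, text_prompt=None, box_prompt=None):
    """Return the argument list for one Inference.py run (hypothetical helper)."""
    cmd = [
        "python", str(TOOL_DIR / "Inference.py"),
        "--model_path", str(TOOL_DIR / "weights" / "FastSAM.pt"),
        "--img_path", str(img_path),
    ]
    if text_prompt is not None:
        cmd += ["--text_prompt", text_prompt]
    if box_prompt is not None:
        cmd += ["--box_prompt", box_prompt]
    return cmd


cmd = build_inference_command(
    TOOL_DIR / "images" / "dogs.jpg",
    text_prompt="the yellow dog",
)
print(shlex.join(cmd))  # shell-quoted form, handy for logs
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments as a list avoids quoting problems with prompts that contain spaces.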

Expected Result Shape

{
  "tool": "fastsam",
  "status": "ok",
  "masks": [
    {
      "label": "Promptable segmentation",
      "score": 0.87,
      "output": "Segmentation masks"
    }
  ],
  "timing": {
    "runtime": "FastSAM reports 40 ms on a single NVIDIA RTX 3090 and 68M parameters; the paper states 50x faster than SAM-H with 32x32 point prompts and 170x faster than SAM-H with 64x64.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/fastsam/runs/visualization.png",
    "raw_predictions": "tools/fastsam/runs/predictions.json"
  }
}
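A consumer of a result in this shape would typically check the status, pick the best-scoring mask, and locate the artifacts. The following sketch assumes the field names shown above and nothing more; `summarize_result` is a hypothetical helper, and the sample is a trimmed copy of the structure on this page rather than real FastSAM output.

```python
import json

SAMPLE = """
{
  "tool": "fastsam",
  "status": "ok",
  "masks": [
    {"label": "Promptable segmentation", "score": 0.87, "output": "Segmentation masks"}
  ],
  "artifacts": {
    "visualization": "tools/fastsam/runs/visualization.png",
    "raw_predictions": "tools/fastsam/runs/predictions.json"
  }
}
"""


def summarize_result(result):
    """Reduce a result dict to the fields a caller usually needs."""
    if result.get("status") != "ok":
        raise RuntimeError(f"tool reported status {result.get('status')!r}")
    masks = result.get("masks", [])
    # Highest-scoring mask, or None when the list is empty.
    best = max(masks, key=lambda m: m["score"]) if masks else None
    return {
        "tool": result["tool"],
        "num_masks": len(masks),
        "best_score": best["score"] if best else None,
        "visualization": result.get("artifacts", {}).get("visualization"),
    }


summary = summarize_result(json.loads(SAMPLE))
```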

Paper figure

Academic Info

Paper identity and contribution summary.

Title: Fast Segment Anything

Authors: Xu Zhao, Wenchao Ding, Yongqi An, Yinglong Du, Tao Yu, Min Li, Ming Tang, Jinqiao Wang

Venue: arXiv:2306.12156, 2023

Contribution: Uses a CNN-based segment-anything model trained on a small fraction of SA-1B to provide SAM-like promptable segmentation at much higher runtime speed.

Citation

@misc{fastsam2023,
  title={Fast Segment Anything},
  author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
  year={2023},
  note={arXiv:2306.12156, 2023},
  url={https://arxiv.org/abs/2306.12156}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset	Metric	Value	Runtime / Model size	Source
COCO object proposal	Box AR@1000	63.7, reported 1.2 points above SAM-H E32	40 ms on one RTX 3090	FastSAM paper
LVIS v1	BBox AR@1000 / AR_s / AR_m / AR_l	57.1 / 44.3 / 77.1 / 85.3	68M parameters	FastSAM paper
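Two figures follow from the COCO row by simple arithmetic. The sketch below derives them; both are implied values rather than numbers reported on this page, and the throughput figure assumes images are processed one at a time with no other overhead.

```python
# Values copied from the COCO row of the table above.
fastsam_box_ar1000 = 63.7   # Box AR@1000
margin_over_sam_h_e32 = 1.2  # reported margin over SAM-H E32
latency_ms = 40              # per image, on one RTX 3090

# Implied SAM-H E32 score: FastSAM's score minus the reported margin.
implied_sam_h_e32 = round(fastsam_box_ar1000 - margin_over_sam_h_e32, 1)

# Implied throughput: milliseconds per second divided by per-image latency.
images_per_second = 1000 / latency_ms

print(implied_sam_h_e32, images_per_second)
```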

Artifacts

FastSAM paper, speed table, COCO/LVIS proposal tables, prompt modes, Gradio demo, Replicate demo, model checkpoints, and output masks.

Demo Images

Visual references from the original tool.