Short Explanation
Provide a video and an initial object mask, then Cutie propagates the object mask through later frames for video object segmentation.
Cutie is a video object segmentation framework that improves consistency, robustness, and speed while supporting scripting and interactive GUI workflows.
Core parameters, trigger timing, and visual before/after demo references.
Prepare the input video (or ordered frame folder) and the first-frame mask that defines the object to track.
Read the produced per-frame masks, object identity labels, and overlay previews.
A quick-run style example for the documentation page.
Readable controls and the meaning of each returned artifact.
| Parameter | Type | Default | Description |
|---|---|---|---|
| video | file | | Input video or ordered frame folder. |
| initial_mask | file | | First-frame mask that defines the object identity to propagate. |
| num_objects | number | 1 | Number of object identities tracked in the interactive demo. |
| output_dir | path | | Destination for masks, overlays, and logs. |

| Output | Description |
|---|---|
| mask | Per-frame segmentation mask for each tracked object. |
| object_id | Stable identity label assigned to the object across the sequence. |
| overlay | Preview image showing the mask on top of the video frame. |
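The relationship between `mask`, `object_id`, and `overlay` can be illustrated with a small standard-library sketch. It assumes an indexed mask in which `0` is background and each positive integer is one object identity; that encoding, the function names, the colors, and the `alpha` blend factor are all illustrative assumptions, not Cutie's actual output format or API.

```python
def object_ids(indexed_mask):
    """Return the sorted non-background labels present in an indexed mask."""
    return sorted({label for row in indexed_mask for label in row if label != 0})


def binary_mask(indexed_mask, object_id):
    """Extract a 0/1 mask for a single object identity."""
    return [[1 if label == object_id else 0 for label in row] for row in indexed_mask]


def overlay(frame, indexed_mask, colors, alpha=0.5):
    """Alpha-blend a per-object color over the masked pixels of an RGB frame."""
    blended = []
    for frame_row, mask_row in zip(frame, indexed_mask):
        out_row = []
        for pixel, label in zip(frame_row, mask_row):
            if label == 0:
                # Background pixels are copied unchanged.
                out_row.append(tuple(pixel))
            else:
                color = colors[label]
                out_row.append(
                    tuple(round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, color))
                )
        blended.append(out_row)
    return blended


# Toy 2x2 gray frame with two hypothetical objects (labels 1 and 2).
frame = [[(100, 100, 100), (100, 100, 100)],
         [(100, 100, 100), (100, 100, 100)]]
mask = [[0, 1],
        [2, 1]]
colors = {1: (200, 0, 0), 2: (0, 200, 0)}

print(object_ids(mask))                    # [1, 2]
print(binary_mask(mask, 1))                # [[0, 1], [0, 1]]
print(overlay(frame, mask, colors)[0][1])  # (150, 50, 50)
```

Because the labels stay fixed across frames, the same `colors` mapping can be reused for every frame so that each object keeps its color in the overlay sequence.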
Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.
# Relative-path local entry for the Cutie tool folder
python tools/cutie/scripting_demo.py

# Add/delete object workflow:
python tools/cutie/scripting_demo_add_del_objects.py

# Interactive GUI:
python tools/cutie/interactive_demo.py --video tools/cutie/examples/example.mp4 --num_objects 1

# Suggested repository layout:
#   tools/cutie/README.md
#   tools/cutie/scripting_demo.py
#   tools/cutie/interactive_demo.py
#   tools/cutie/examples/images/
#   tools/cutie/examples/masks/

# This page documents the path. The static page does not execute Cutie.
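When the interactive GUI is launched from another script, the command line above can be assembled programmatically. The following is a minimal sketch that only uses the script path and the `--video` and `--num_objects` flags shown above; the helper name `build_gui_command` is hypothetical, and the sketch prints the command instead of running it.

```python
import shlex


def build_gui_command(video, num_objects=1,
                      script="tools/cutie/interactive_demo.py"):
    """Build the argument list for the interactive demo (hypothetical helper)."""
    if num_objects < 1:
        raise ValueError("num_objects must be at least 1")
    return ["python", script, "--video", str(video),
            "--num_objects", str(num_objects)]


cmd = build_gui_command("tools/cutie/examples/example.mp4", num_objects=1)

# Print a shell-safe rendering of the command for inspection.
print(shlex.join(cmd))

# To actually launch the GUI, one could pass the list to subprocess:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the video path contains spaces.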
{
  "tool": "cutie",
  "status": "ok",
  "masks": [
    {
      "label": "Video object segmentation",
      "score": 0.87,
      "output": "Tracked object masks"
    }
  ],
  "timing": {
    "runtime": "Cutie-base reports 36.4 FPS on V100; Cutie-small with MOSE training reports 45.5 FPS. The paper states +8.7 J&F over XMem and +4.2 J&F over DeAOT on MOSE while being 3x faster than DeAOT.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/cutie/runs/visualization.png",
    "raw_predictions": "tools/cutie/runs/predictions.json"
  }
}

Paper identity and contribution summary.
@misc{cutie2024,
  title={Putting the Object Back into Video Object Segmentation},
  author={Ho Kei Cheng and Seoung Wug Oh and Brian Price and Joon-Young Lee and Alexander Schwing},
  year={2024},
  note={CVPR 2024 Highlight / arXiv:2310.12982},
  url={https://arxiv.org/abs/2310.12982}
}

Only compact, source-reported numbers are shown here.
| Dataset | Metric | Value | Runtime | Source |
|---|---|---|---|---|
| MOSE validation | J&F | 68.3 for Cutie-base with MOSE training | 36.4 FPS on V100 | CVPR 2024 paper |
| DAVIS-2017 / YouTubeVOS-2019 | J&F / G | DAVIS val 88.8, DAVIS test 85.3, YouTubeVOS G 86.5 | Cutie-small 45.5 FPS | Cutie paper |
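For readers unfamiliar with the metric column: J is region similarity (the Jaccard index, or intersection over union, between a predicted and a ground-truth mask), F is contour accuracy (a boundary F-measure), and J&F is the mean of the two. The sketch below computes J on toy masks and converts the reported FPS figures into per-frame latency. It does not implement F, which needs boundary matching; the F value used is a made-up input. Returning 1.0 when both masks are empty is a convention assumed here, and none of this is the official evaluation code.

```python
def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def j_and_f(j, f):
    """J&F is the mean of region similarity J and contour accuracy F."""
    return (j + f) / 2


def ms_per_frame(fps):
    """Convert frames per second into milliseconds per frame."""
    return 1000.0 / fps


# Toy 3x3 masks: 3 pixels overlap, 5 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
gt = [[0, 1, 1],
      [0, 1, 0],
      [0, 1, 0]]

j = jaccard(pred, gt)
print(f"J = {j:.2f}")                                     # J = 0.60
print(f"J&F with a made-up F of 0.80: {j_and_f(j, 0.80):.2f}")
print(f"36.4 FPS is about {ms_per_frame(36.4):.1f} ms per frame")
print(f"45.5 FPS is about {ms_per_frame(45.5):.1f} ms per frame")
```

Benchmark scores such as those in the table are averaged over all objects and frames of a dataset, so a single-pair J like this one is only a building block.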
Cutie paper, MOSE/DAVIS/YouTubeVOS tables, scripting demo, interactive GUI, pretrained model download script, example frames, and masks.
Visual references from the original tool.