Execution and Control

TAPIR

Tracks arbitrary target points through a live video stream so closed-loop control can correct motion under disturbance.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Use TAPIR-style tracking during the robot approach phase to keep a target pixel locked despite motion or camera shake.
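
To make the closed-loop idea concrete, the following is a minimal sketch of a proportional image-space correction. It is not part of TAPIR or of this tool: the function name `pixel_correction`, the gain, and the step clamp are all assumptions chosen for illustration, and a real controller would map the pixel error into robot coordinates.

```python
from typing import Tuple


def pixel_correction(
    tracked: Tuple[float, float],
    desired: Tuple[float, float],
    gain: float = 0.5,        # hypothetical proportional gain
    max_step: float = 20.0,   # hypothetical per-cycle clamp, in pixels
) -> Tuple[float, float]:
    """Return a clamped proportional step that moves `tracked` toward `desired`."""
    error_x = desired[0] - tracked[0]
    error_y = desired[1] - tracked[1]

    def clamp(value: float) -> float:
        # Limit the step so one noisy frame cannot command a large jump.
        return max(-max_step, min(max_step, value))

    return clamp(gain * error_x), clamp(gain * error_y)
```

Calling `pixel_correction((120, 150), (130, 150))` yields a step of 5 pixels along x and none along y; the clamp only takes effect when the error is large.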

Input: Query point + live video stream
Output: Tracked point, occlusion flag, confidence
Trigger Timing: Triggered on demand after the required input files and configuration are prepared.
Runtime: Local GPU
Before: Query point + live video stream

Prepare the live camera stream or buffered video sequence and choose the initial target pixel on the object.

After: Tracked point, occlusion flag, confidence

Read the per-frame tracked point, occlusion flag, and confidence, along with any saved visualization.

Preset Example

A quick-run style example for the documentation page.

Input: tools/tapir/examples/stream.mp4
Prompt: initial_target_pixel: [120, 150]
Expected: Per-frame target coordinates, occlusion status, confidence, and total displacement.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

initial_target_pixel (text): [120, 150]

Initial 2D point selected on the target object.

camera_stream_frame (file)

Live frame or buffered video sequence to track through.

inference_mode (select): mock_no_jax

Selects the installed JAX/TAPIR backend or a mock tracking fallback.
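
The backend choice could be resolved along the following lines. This is a sketch under stated assumptions rather than the tool's actual logic: only the value `mock_no_jax` comes from this page, while the mode names `auto` and `jax` and the helpers `resolve_inference_mode` and `mock_track` are hypothetical.

```python
import importlib.util
from typing import Dict, List, Sequence


def resolve_inference_mode(requested: str = "auto") -> str:
    """Return "jax" when JAX is importable, otherwise fall back to "mock_no_jax"."""
    if requested == "mock_no_jax":
        return "mock_no_jax"
    # find_spec checks for the package without importing it.
    jax_available = importlib.util.find_spec("jax") is not None
    return "jax" if jax_available else "mock_no_jax"


def mock_track(initial_pixel: Sequence[int], num_frames: int) -> List[Dict]:
    """Mock fallback: holds the point still and reports it as visible."""
    x, y = initial_pixel
    return [
        {"current_tracked_pixel": [x, y], "is_occluded": False, "confidence": 1.0}
        for _ in range(num_frames)
    ]
```

A mock path like this lets the controller side be exercised on a machine without JAX, at the cost of producing tracks that carry no real motion information.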

Output Explanation

current_tracked_pixel

Current 2D target coordinate for closed-loop correction.

is_occluded

Whether the point is estimated to be hidden.

confidence

Tracking confidence for the current frame.

total_displacement

Measured shift from the initial point over the tracked clip.
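
The four outputs above can be pictured as a small per-frame record plus one clip-level value. The sketch below is illustrative only: the `TrackSample` class, the 0.5 confidence threshold, and the reading of total_displacement as the straight-line distance from the initial point to the last visible point are assumptions, not documented behavior.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TrackSample:
    current_tracked_pixel: Tuple[float, float]
    is_occluded: bool
    confidence: float


def usable_for_control(sample: TrackSample, min_confidence: float = 0.5) -> bool:
    """Gate a sample: skip occluded or low-confidence frames (threshold is assumed)."""
    return (not sample.is_occluded) and sample.confidence >= min_confidence


def total_displacement(initial: Tuple[float, float], samples: List[TrackSample]) -> float:
    """Straight-line pixel distance from the initial point to the last visible sample."""
    visible = [s for s in samples if not s.is_occluded]
    if not visible:
        return 0.0  # nothing visible, so no measurable shift
    x, y = visible[-1].current_tracked_pixel
    return math.hypot(x - initial[0], y - initial[1])
```

Gating on both is_occluded and confidence keeps the controller from chasing a point the tracker itself considers unreliable.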

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Deployment Notes

  1. Clone the official TAPNet repository and install the required JAX dependencies.
  2. Prepare a live camera feed or video sequence and choose initial target pixels.
  3. Run the tracker from a repository-relative path during the robot approach phase.
  4. Save tracks and occlusion flags under tools/tapir/runs/ for controller inspection.
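
Step 4 could be handled with a helper like the one below. This is a sketch, not the project's code: the function name `save_tracks` and the `frames` key are hypothetical, and only the tools/tapir/runs/ directory is taken from the steps above.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_tracks(
    samples: List[Dict],
    run_dir: str = "tools/tapir/runs",
    filename: str = "tracks.json",
) -> Path:
    """Write per-frame tracks and occlusion flags to a JSON file and return its path."""
    out_dir = Path(run_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the run folder if missing
    out_path = out_dir / filename
    payload = {"tool": "tapir", "frames": samples}
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out_path
```

Keeping the path repository-relative, as in the steps above, means the same call works from the repository root on any machine.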

Relative Path Example

python tools/tapir/run.py --video tools/tapir/examples/stream.mp4 --initial-point 120,150 --output tools/tapir/runs/tracks.json
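
The internals of run.py are not shown on this page. As a hedged illustration of how the three flags in the command above might be parsed, here is a sketch using argparse; the helper names `parse_point` and `build_parser` are hypothetical, and the X,Y integer format is inferred from the example value 120,150.

```python
import argparse
from typing import Tuple


def parse_point(text: str) -> Tuple[int, int]:
    """Parse an "X,Y" string such as "120,150" into a pair of integers."""
    parts = text.split(",")
    if len(parts) != 2:
        raise argparse.ArgumentTypeError("expected a point in X,Y form")
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        raise argparse.ArgumentTypeError("point coordinates must be integers")


def build_parser() -> argparse.ArgumentParser:
    """Mirror the flags used in the example command (assumed, not confirmed)."""
    parser = argparse.ArgumentParser(description="Point tracking sketch")
    parser.add_argument("--video", required=True, help="input video path")
    parser.add_argument("--initial-point", type=parse_point, required=True,
                        help="initial target pixel as X,Y")
    parser.add_argument("--output", required=True, help="where to write tracks")
    return parser
```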

Expected Result Shape

{
  "tool": "tapir",
  "status": "ok",
  "results": [
    {
      "label": "Point tracking for visual servoing",
      "score": 0.87,
      "output": "Tracked point, occlusion flag, confidence"
    }
  ],
  "timing": {
    "runtime": "The official project page reports about 40 FPS when tracking 256 points on a 256x256 video in online mode; the submitted spreadsheet also describes about 20 ms and 30-60 Hz output frequency.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/tapir/runs/visualization.png",
    "raw_predictions": "tools/tapir/runs/predictions.json"
  }
}
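
A consumer of this result could run a light structural check before using it. The sketch below assumes the five top-level keys shown above are all required and that score lies in the range 0 to 1; neither assumption is stated by the source, and `check_result` is a hypothetical helper.

```python
from typing import Dict, List

REQUIRED_TOP_LEVEL = ("tool", "status", "results", "timing", "artifacts")


def check_result(payload: Dict) -> List[str]:
    """Return a list of problems found in a result payload; empty means it looks fine."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in payload:
            problems.append(f"missing key: {key}")
    if payload.get("status") != "ok":
        problems.append(f"unexpected status: {payload.get('status')!r}")
    for index, item in enumerate(payload.get("results", [])):
        score = item.get("score")
        # Assumed convention: score is a number between 0 and 1.
        if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
            problems.append(f"results[{index}] has an invalid score: {score!r}")
    return problems
```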

Academic Info

Paper identity and contribution summary.

Title: TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement
Authors: Carl Doersch, Yi Yang, Mel Vecerik, Dilara Gokay, Ankush Gupta, Yusuf Aytar, Joao Carreira, Andrew Zisserman
Venue: DeepMind 2023
Contribution: Tracks arbitrary points with per-frame initialization and temporal refinement, supporting visual feedback when objects move during execution.

Citation

@misc{tapir2023,
  title={TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement},
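  author={Doersch, Carl and Yang, Yi and Vecerik, Mel and Gokay, Dilara and Gupta, Ankush and Aytar, Yusuf and Carreira, Joao and Zisserman, Andrew},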
  year={2023},
  note={DeepMind 2023},
  url={https://arxiv.org/abs/2306.08637}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset | Metric | Value | Runtime | Source
TAP-Vid benchmark | Average Jaccard (AJ) | 60.2 / 62.9 / 88.3 / 73.3 on Kinetics / DAVIS / Kubric / RGB-Stacking | TAPIR | Official TAPIR project page
TAP-Vid baseline comparison | Average Jaccard (AJ) | 46.6 / 38.4 / 65.4 / 59.9 for TAP-Net; 35.3 / 42.0 / 59.1 / 37.3 for PIPs | Official comparison table | Official TAPIR project page
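
For readers unfamiliar with the metric, Average Jaccard combines position accuracy and occlusion accuracy: at each pixel threshold, Jaccard is TP / (TP + FP + FN), and the values are averaged over the thresholds 1, 2, 4, 8, and 16 pixels. The sketch below is a simplified pure-Python illustration of that definition, not the official evaluation code; the function names are hypothetical, it ignores details such as resizing to a fixed resolution and excluding query frames, and it returns a fraction where the table reports percentages.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def jaccard_at_threshold(pred_xy: Sequence[Point], pred_visible: Sequence[bool],
                         gt_xy: Sequence[Point], gt_visible: Sequence[bool],
                         threshold: float) -> float:
    """Jaccard = TP / (TP + FP + FN) for one pixel-distance threshold."""
    tp = fp = fn = 0
    for (px, py), pv, (gx, gy), gv in zip(pred_xy, pred_visible, gt_xy, gt_visible):
        within = math.hypot(px - gx, py - gy) < threshold
        if gv and pv and within:
            tp += 1
        else:
            if pv:
                fp += 1  # predicted visible, but truth is occluded or too far
            if gv:
                fn += 1  # truth visible, but predicted occluded or too far
    denom = tp + fp + fn
    return tp / denom if denom else 0.0  # undefined case; this sketch returns 0.0


def average_jaccard(pred_xy, pred_visible, gt_xy, gt_visible,
                    thresholds: Sequence[float] = (1, 2, 4, 8, 16)) -> float:
    """Mean Jaccard over the pixel thresholds."""
    scores: List[float] = [
        jaccard_at_threshold(pred_xy, pred_visible, gt_xy, gt_visible, t)
        for t in thresholds
    ]
    return sum(scores) / len(scores)
```

Note that a point predicted visible but placed too far from a visible ground-truth point counts as both a false positive and a false negative, which is why localization errors are penalized heavily.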

Artifacts

Official TAPIR project benchmark table, mock point-tracking input, trajectory output, occlusion flag, and confidence output.

Demo Images

Visual references from the original tool.