Execution and Control

R3M

Compares post-action visual observations with a goal instruction to decide whether a manipulation step physically succeeded.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Use R3M after an action finishes to block false claims of completion and stop errors from accumulating.

Input: Post-action frame + goal instruction
Output: Verification score and completion flag
Trigger Timing: On demand, after the required input files and configuration are prepared.
Runtime: Local GPU
Before: Post-action frame + goal instruction

Prepare the post-action camera frame and the goal instruction, plus an optional pre-action frame if the state change should be compared.

After: Verification score and completion flag

Read the task completion score and the boolean success flag to decide whether the step can be treated as done.

Preset Example

A quick-run style example for the documentation page.

Input: tools/r3m/examples/end.png
Prompt: pick up the red cup
Expected: A task completion score and boolean success flag.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

current_camera_view (file): tools/r3m/examples/end.png

Post-action camera image.

semantic_task_text (text): pick up the red cup

Natural-language goal or step that should be verified.

start_image (file)

Optional pre-action image for comparing state change.
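
The three parameters above can be gathered into one small request object and checked before the verifier is called. The sketch below is illustrative only: the `VerificationRequest` class and its `validate` method are hypothetical names, not part of R3M or the wrapper; only the field names come from the parameter list.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class VerificationRequest:
    """Hypothetical container mirroring the documented parameters."""
    current_camera_view: Path           # post-action camera image (required)
    semantic_task_text: str             # goal or step to verify (required)
    start_image: Optional[Path] = None  # optional pre-action image

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the request looks usable."""
        problems = []
        if not self.semantic_task_text.strip():
            problems.append("semantic_task_text is empty")
        if not self.current_camera_view.is_file():
            problems.append(f"missing post-action image: {self.current_camera_view}")
        if self.start_image is not None and not self.start_image.is_file():
            problems.append(f"missing pre-action image: {self.start_image}")
        return problems


request = VerificationRequest(
    current_camera_view=Path("tools/r3m/examples/end.png"),
    semantic_task_text="pick up the red cup",
)
for problem in request.validate():
    print(problem)
```

Leaving `start_image` unset matches its optional status; the check only looks at it when a path is supplied.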

Output Explanation

task_completion_score

Score estimating how well the final visual state matches the instruction.

is_successful

Boolean completion decision from the wrapper.

feature_extractor_mode

Backend representation mode used for the score.
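
How the wrapper turns `task_completion_score` into `is_successful` is not documented on this page. A minimal sketch, assuming a score in the range 0 to 1 and a fixed cutoff: the `build_output` helper, the 0.5 threshold, and the mode string are all hypothetical.

```python
def build_output(score: float, mode: str, threshold: float = 0.5) -> dict:
    """Package a score into the three documented output fields.

    The [0, 1] range and the 0.5 default threshold are assumptions made for
    illustration; the real wrapper may use a different decision rule.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return {
        "task_completion_score": score,
        "is_successful": score >= threshold,
        "feature_extractor_mode": mode,
    }


result = build_output(0.87, mode="example-mode")
print(result["is_successful"])
```

Keeping the raw score next to the flag lets a caller apply a stricter cutoff later without rerunning the verifier.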

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Deployment Notes

  1. Clone or download the official R3M repository.
  2. Install the PyTorch dependencies and prepare the selected visual representation checkpoint.
  3. Prepare start and end images plus a semantic task instruction under tools/r3m/examples/.
  4. Run the verifier and save scores under tools/r3m/runs/.

Relative Path Example

python tools/r3m/run.py --start-image tools/r3m/examples/start.png --end-image tools/r3m/examples/end.png --instruction "pick up the red cup" --output tools/r3m/runs/verification.json
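
The same invocation can be assembled from Python, which makes it easier to check the inputs first. This sketch reuses only the flags shown in the command above; `build_command` is a hypothetical helper, and the script itself is not executed here.

```python
import shlex
import sys
from pathlib import Path
from typing import List


def build_command(start: str, end: str, instruction: str, output: str) -> List[str]:
    """Assemble the argument list for tools/r3m/run.py (flags taken from the example)."""
    return [
        sys.executable, "tools/r3m/run.py",
        "--start-image", start,
        "--end-image", end,
        "--instruction", instruction,
        "--output", output,
    ]


cmd = build_command(
    "tools/r3m/examples/start.png",
    "tools/r3m/examples/end.png",
    "pick up the red cup",
    "tools/r3m/runs/verification.json",
)

# Report missing inputs instead of launching a run that would fail later.
missing = [p for p in (cmd[1], cmd[3], cmd[5]) if not Path(p).is_file()]
if missing:
    print("missing files:", ", ".join(missing))
else:
    print("ready to run:", shlex.join(cmd))
    # subprocess.run(cmd, check=True) would launch the verifier from here.
```

Passing an argument list rather than a single shell string keeps the multi-word instruction intact without extra quoting.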

Expected Result Shape

{
  "tool": "r3m",
  "status": "ok",
  "results": [
    {
      "label": "Post-action success verification",
      "score": 0.87,
      "output": "Verification score and completion flag"
    }
  ],
  "timing": {
    "runtime": "The submitted wrapper describes about 50 ms local verification; the official R3M paper focuses on policy success rather than verifier latency.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/r3m/runs/visualization.png",
    "raw_predictions": "tools/r3m/runs/predictions.json"
  }
}
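
A result in this shape can be checked with the standard `json` module. The sketch below parses an abbreviated copy of the example above; `summarize` and the 0.5 cutoff are hypothetical, and a real run would read the file written under tools/r3m/runs/ instead of an inline string.

```python
import json

# Abbreviated copy of the documented result shape, inlined for illustration.
SAMPLE = """
{
  "tool": "r3m",
  "status": "ok",
  "results": [
    {"label": "Post-action success verification", "score": 0.87}
  ],
  "artifacts": {
    "visualization": "tools/r3m/runs/visualization.png",
    "raw_predictions": "tools/r3m/runs/predictions.json"
  }
}
"""


def summarize(payload: dict, cutoff: float = 0.5) -> dict:
    """Extract the first result's score and compare it against an assumed cutoff."""
    if payload.get("status") != "ok":
        raise RuntimeError(f"verifier reported status {payload.get('status')!r}")
    first = payload["results"][0]
    return {
        "label": first["label"],
        "score": first["score"],
        "passed": first["score"] >= cutoff,  # cutoff is an assumption
    }


summary = summarize(json.loads(SAMPLE))
print(summary)
```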

Academic Info

Paper identity and contribution summary.

Title: R3M: A Universal Visual Representation for Robot Manipulation
Authors: Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta
Venue: CoRL 2022
Contribution: Provides a robot manipulation visual representation that can score whether an executed step matches the intended semantic goal.

Citation

@misc{r3m2022,
  title={R3M: A Universal Visual Representation for Robot Manipulation},
  author={Nair, Suraj and Rajeswaran, Aravind and Kumar, Vikash and Finn, Chelsea and Gupta, Abhinav},
  year={2022},
  note={CoRL 2022},
  url={https://arxiv.org/abs/2203.12601}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset | Metric | Value | Runtime | Source
12 simulated manipulation tasks | Average imitation-learning success rate | 62% success; over 20% above training from scratch and over 10% above CLIP/MoCo baselines | Frozen R3M representation for downstream policy learning | Official CoRL 2022 paper, Fig. 4
Franka Kitchen / MetaWorld / Adroit | R3M ablation success rate | 53.1±2.7% / 69.2±2.0% / 65.0±1.7%; all domains 62.4±1.3% | Downstream behavior cloning evaluation | Official CoRL 2022 paper, Table 1
Real-world Franka Emika Panda manipulation | Demonstration count | 20 demonstrations for real cluttered-apartment manipulation tasks | Real-robot learning setup | Official CoRL 2022 paper

Artifacts

Official CoRL 2022 paper, project page, repository link, mock start/end image shape, task instruction, and verification output from the submitted spreadsheet.

Demo Images

Visual references from the original tool.