Short Explanation
Use R3M after an action finishes to block false claims of completion and stop errors from accumulating.
The tool compares post-action visual observations with a goal instruction to decide whether a manipulation step physically succeeded.
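The gating idea described above can be sketched as a small control loop. This is a minimal illustration, not the wrapper's actual code: `execute` and `verify` are hypothetical callables standing in for the robot action and the R3M verification call, and `max_retries` is an assumed parameter.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signatures (assumptions, not part of the original tool):
#   execute(instruction) -> path of the post-action image
#   verify(instruction, end_image) -> (score, is_successful)
Executor = Callable[[str], str]
Verifier = Callable[[str, str], Tuple[float, bool]]


def run_with_verification(
    steps: List[str],
    execute: Executor,
    verify: Verifier,
    max_retries: int = 1,
) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and verify each one after it finishes.

    Returns (completed_steps, failed_step). failed_step is None when every
    step was verified. The plan stops at the first step that cannot be
    verified, so later steps never build on an unconfirmed state.
    """
    completed: List[str] = []
    for instruction in steps:
        for _attempt in range(max_retries + 1):
            end_image = execute(instruction)
            _score, ok = verify(instruction, end_image)
            if ok:
                completed.append(instruction)
                break
        else:
            # No attempt was verified: report the failure instead of
            # letting the caller assume the step completed.
            return completed, instruction
    return completed, None
```

Stopping at the first unverified step is the design choice that keeps errors from accumulating: a failed grasp is reported immediately rather than discovered several steps later.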
Core parameters, trigger timing, and visual before/after demo references.
Prepare the scene, image, video, sensor stream, prompt, or configuration expected by the original project.
Read the produced visualization, prediction, map, trajectory, mask, grasp pose, or other documented artifact.
A quick-run style example for the documentation page.
Readable controls and the meaning of each returned artifact.
| Input | Type | Example value | Description |
|---|---|---|---|
| current_camera_view | file | tools/r3m/examples/end.png | Post-action camera image. |
| semantic_task_text | text | pick up the red cup | Natural-language goal or step that should be verified. |
| start_image | file | | Optional pre-action image for comparing state change. |

| Output | Description |
|---|---|
| task_completion_score | Score estimating how well the final visual state matches the instruction. |
| is_successful | Boolean completion decision from the wrapper. |
| feature_extractor_mode | Backend representation mode used for the score. |
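The inputs and outputs listed above can be modeled as two small typed containers. This is a sketch under assumptions: the field names come from this page, but the class names, the validation rules, and the `describe` helper are illustrative and not part of the original tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VerificationRequest:
    """Inputs documented on this page (class name is hypothetical)."""

    current_camera_view: str           # path to the post-action image
    semantic_task_text: str            # natural-language goal or step
    start_image: Optional[str] = None  # optional pre-action image

    def __post_init__(self) -> None:
        # Assumed validation: both required inputs must be non-empty.
        if not self.current_camera_view:
            raise ValueError("current_camera_view is required")
        if not self.semantic_task_text.strip():
            raise ValueError("semantic_task_text must not be empty")


@dataclass(frozen=True)
class VerificationResult:
    """Outputs documented on this page (class name is hypothetical)."""

    task_completion_score: float
    is_successful: bool
    feature_extractor_mode: str


def describe(result: VerificationResult) -> str:
    """Format a result as a one-line log message."""
    verdict = "verified" if result.is_successful else "not verified"
    return (
        f"{verdict} (score={result.task_completion_score:.2f}, "
        f"mode={result.feature_extractor_mode})"
    )
```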
Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.
python tools/r3m/run.py --start-image tools/r3m/examples/start.png --end-image tools/r3m/examples/end.png --instruction "pick up the red cup" --output tools/r3m/runs/verification.json
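The same invocation can be assembled from Python and its report read back. The sketch below uses only the flags shown in the command above and the `status` / `results` / `score` keys that appear in the sample output on this page. The helper names are hypothetical, and treating `--start-image` as omittable on the command line is an assumption based on the start image being documented as optional.

```python
import json
from typing import List, Optional, Tuple


def build_command(
    end_image: str,
    instruction: str,
    output: str,
    start_image: Optional[str] = None,
) -> List[str]:
    """Build the argument list for tools/r3m/run.py."""
    cmd = ["python", "tools/r3m/run.py"]
    if start_image is not None:
        # Assumption: the flag may be left out when no start image exists.
        cmd += ["--start-image", start_image]
    cmd += [
        "--end-image", end_image,
        "--instruction", instruction,
        "--output", output,
    ]
    return cmd


def summarize_report(report_text: str) -> List[Tuple[str, float]]:
    """Return (label, score) pairs from a report in the sample layout."""
    report = json.loads(report_text)
    if report.get("status") != "ok":
        raise RuntimeError(f"verification did not succeed: {report.get('status')!r}")
    return [(item["label"], item["score"]) for item in report["results"]]
```

The list returned by `build_command` can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with instructions that contain spaces.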
{
  "tool": "r3m",
  "status": "ok",
  "results": [
    {
      "label": "Post-action success verification",
      "score": 0.87,
      "output": "Verification score and completion flag"
    }
  ],
  "timing": {
    "runtime": "The submitted wrapper describes about 50 ms local verification; the official R3M paper focuses on policy success rather than verifier latency.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/r3m/runs/visualization.png",
    "raw_predictions": "tools/r3m/runs/predictions.json"
  }
}
Paper identity and contribution summary.
@misc{r3m2022,
title={R3M: A Universal Visual Representation for Robot Manipulation},
author={Nair, Suraj and Rajeswaran, Aravind and Kumar, Vikash and Finn, Chelsea and Gupta, Abhinav},
year={2022},
note={CoRL 2022},
url={https://arxiv.org/abs/2203.12601}
}
Only compact, source-reported numbers are shown here.
| Dataset | Metric | Value | Setting | Source |
|---|---|---|---|---|
| 12 simulated manipulation tasks | Average imitation-learning success rate | 62% success; over 20% above training from scratch and over 10% above CLIP/MoCo baselines | Frozen R3M representation for downstream policy learning | Official CoRL 2022 paper, Fig. 4 |
| Franka Kitchen / MetaWorld / Adroit | R3M ablation success rate | 53.1 ± 2.7% / 69.2 ± 2.0% / 65.0 ± 1.7%; all domains 62.4 ± 1.3% | Downstream behavior cloning evaluation | Official CoRL 2022 paper, Table 1 |
| Real-world Franka Emika Panda manipulation | Demonstration count | 20 demonstrations for real cluttered-apartment manipulation tasks | Real-robot learning setup | Official CoRL 2022 paper |
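As a quick consistency check on the table, the all-domain figure matches an unweighted mean of the three per-domain success rates. This is an observation about the reported numbers, not a statement of how the paper computed its aggregate; the function name is illustrative.

```python
from typing import Dict


def unweighted_mean(rates: Dict[str, float]) -> float:
    """Average the per-domain rates, giving each domain equal weight."""
    return sum(rates.values()) / len(rates)


# Per-domain success rates (percent) as listed in the table above.
domain_rates = {"Franka Kitchen": 53.1, "MetaWorld": 69.2, "Adroit": 65.0}

print(round(unweighted_mean(domain_rates), 1))  # prints 62.4
```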
Official CoRL 2022 paper, project page, repository link, mock start/end image shape, task instruction, and verification output from the submitted spreadsheet.
Visual references from the original tool.