Perception and Grounding

Dense Object Nets

Dense Object Nets learn pixel-level object descriptors that let a robot identify corresponding points across views and use them as manipulation targets.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Use Dense Object Nets when a manipulation system needs to track or re-identify a specific point on an object across camera views.

Input: RGB images + query pixels or object views
Output: Dense descriptors and point correspondences
Trigger Timing: Triggered on demand after the required input files and configuration are prepared.
Runtime: Python / PyTorch research code
Before: RGB images + query pixels or object views

Prepare the RGB images, query pixels or object views, and any configuration expected by the original project.

After: Dense descriptors and point correspondences

Read the produced descriptor visualization and the predicted point correspondences.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

source_image (file)

Image containing the point or object part to match.

query_pixel (text)

Pixel coordinate whose descriptor should be matched in another view.

target_image (file)

Image where the corresponding object point should be found.
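
Because query_pixel is passed as text, it helps to parse and bounds-check it before looking up a descriptor. The sketch below assumes a "u,v" text format (column, then row) and a hypothetical helper name, parse_query_pixel; neither comes from the original project, so adjust both to whatever format your setup actually uses.

```python
def parse_query_pixel(text, width, height):
    """Parse 'u,v' text into integer pixel coordinates and bounds-check them.

    Assumption: u is the column (x) and v is the row (y). This format is
    illustrative and is not taken from the Dense Object Nets code.
    """
    parts = text.split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'u,v', got {text!r}")
    u, v = (int(p.strip()) for p in parts)
    if not (0 <= u < width and 0 <= v < height):
        raise ValueError(f"pixel ({u}, {v}) is outside a {width}x{height} image")
    return u, v


# Example: a query near the center of a 640x480 source image.
query_pixel = parse_query_pixel("320, 240", width=640, height=480)
```

Rejecting out-of-range coordinates early avoids silently indexing the wrong descriptor later in the pipeline.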

Output Explanation

descriptor_map

Dense per-pixel visual descriptor representation.

matched_pixel

Predicted corresponding point in the target image.

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Deployment Notes

  1. Prepare multi-view object images or robot-collected observations.
  2. Train or load the descriptor model following the official project instructions.
  3. Use nearest-neighbor descriptor matches to recover manipulation-relevant object points.
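
Step 3 can be illustrated with a minimal nearest-neighbor lookup. This is a sketch under stated assumptions, not the project's implementation: descriptor maps are represented as nested lists indexed as desc[v][u], distance is Euclidean, and match_pixel is a hypothetical name. Plain Python is used for readability; a real pipeline would run the same search with PyTorch or NumPy tensor operations.

```python
import math


def match_pixel(source_desc, target_desc, query_pixel):
    """Find the target pixel whose descriptor is closest to the query's.

    source_desc, target_desc: nested lists, desc[v][u] -> list of D floats.
    query_pixel: (u, v) coordinate in the source image.
    Returns ((u, v), distance) for the best match in the target image.
    """
    u_q, v_q = query_pixel
    query = source_desc[v_q][u_q]
    best_pixel, best_dist = None, math.inf
    # Exhaustive search over every pixel of the target descriptor map.
    for v, row in enumerate(target_desc):
        for u, desc in enumerate(row):
            dist = math.dist(query, desc)  # Euclidean distance
            if dist < best_dist:
                best_pixel, best_dist = (u, v), dist
    return best_pixel, best_dist


# Toy 2x2 descriptor maps with 3-dimensional descriptors.
source = [[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
          [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
target = [[[0.0, 0.1, 0.9], [0.0, 0.0, 0.0]],
          [[0.9, 0.0, 0.1], [0.0, 1.0, 0.0]]]

matched_pixel, distance = match_pixel(source, target, query_pixel=(1, 0))
```

In practice it is common to reject matches whose best distance exceeds a threshold, since the queried point may be occluded or absent in the target view.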

Relative Path Example

# Follow the official Dense Object Nets repository and paper setup.
# Typical use: train descriptors on object views, then query descriptor matches for manipulation points.

Expected Result Shape

{
  "tool": "dense-object-nets",
  "status": "ok",
  "results": [
    {
      "label": "Dense visual correspondence for manipulation",
      "score": 0.87,
      "output": "Dense descriptors and point correspondences"
    }
  ],
  "timing": {
    "runtime": "The paper reports approximate training time rather than a single deployment latency number.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/dense-object-nets/runs/visualization.png",
    "raw_predictions": "tools/dense-object-nets/runs/predictions.json"
  }
}
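
A consumer of this result shape might pick the top-scoring entry as follows. This is a sketch that assumes only the field names shown above; best_result is a hypothetical helper, and the payload is inlined rather than read from a file so that no path is assumed.

```python
import json


def best_result(payload):
    """Return the highest-scoring entry in a result payload, or None.

    Returns None when the status is not "ok" or there are no results.
    """
    if payload.get("status") != "ok":
        return None
    results = payload.get("results", [])
    return max(results, key=lambda r: r["score"], default=None)


# Inline payload mirroring the shape documented above.
example = json.loads("""
{
  "tool": "dense-object-nets",
  "status": "ok",
  "results": [
    {
      "label": "Dense visual correspondence for manipulation",
      "score": 0.87,
      "output": "Dense descriptors and point correspondences"
    }
  ]
}
""")

top = best_result(example)
```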

Academic Info

Paper identity and contribution summary.

Title: Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation
Authors: Peter R. Florence, Lucas Manuelli, Russ Tedrake
Venue: CoRL 2018
Contribution: Shows that self-supervised dense descriptors can provide object-centric correspondence signals for robotic manipulation policies.

Citation

@misc{denseobjectnets2018,
  title={Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation},
  author={Peter R. Florence and Lucas Manuelli and Russ Tedrake},
  year={2018},
  note={CoRL 2018},
  url={https://arxiv.org/abs/1806.08756}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset | Metric | Value | Runtime | Source
Robot-collected object correspondence, standard-SO | Image-pair match precision | 93% of image pairs have normalized pixel error under 13% of the image diagonal | Descriptor correspondence evaluation | Official CoRL 2018 paper, Fig. 3
Dense Object Nets object set | Training/evaluation coverage | 47 objects, including 3 object classes | Self-supervised robot data collection setup | Official CoRL 2018 paper
standard-SO descriptor model | Training time | about 20 minutes | Model training | Official CoRL 2018 paper
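
The match-precision figure is stated relative to the image diagonal. One plausible reading is sketched below: the pixel distance between the predicted and ground-truth points is divided by the diagonal length, and the reported figure is the fraction of pairs below a threshold. The function names and sample values are illustrative assumptions; consult the paper for the exact evaluation protocol.

```python
import math


def normalized_pixel_error(predicted, ground_truth, width, height):
    """Pixel distance between two (u, v) points, divided by the image diagonal."""
    return math.dist(predicted, ground_truth) / math.hypot(width, height)


def fraction_below(errors, threshold):
    """Fraction of errors strictly below the threshold."""
    return sum(e < threshold for e in errors) / len(errors)


# A prediction 10 px off in each axis on a 640x480 image (diagonal 800 px).
err = normalized_pixel_error((330, 250), (320, 240), width=640, height=480)

# Illustrative error values only, not data from the paper.
share = fraction_below([0.01, 0.05, 0.20, 0.30], threshold=0.13)
```

Normalizing by the diagonal makes the error comparable across image resolutions.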

Artifacts

Official CoRL 2018 paper, descriptor visualizations, correspondence examples, and manipulation demonstrations.

Demo Images

Visual references from the original tool.