Perception and Grounding

Dense Object Nets

Dense Object Nets learn pixel-level object descriptors that let a robot identify corresponding points across views and use them as manipulation targets.

Tool Introduction

Core parameters, trigger timing, and visual before/after demo references.

Short Explanation

Use Dense Object Nets when a manipulation system needs to track or re-identify a specific point on an object across camera views.

Input: RGB images + query pixels or object views

Output: Dense descriptors and point correspondences

Trigger Timing: Triggered on demand after the required input files and configuration are prepared.

Runtime: Python / PyTorch research code

Before: RGB images + query pixels or object views

Prepare the RGB images of the object and the query pixel coordinates, along with any configuration expected by the original project.

After: Dense descriptors and point correspondences

Read the produced descriptor visualization and the predicted point correspondences.

Parameters And Output

Readable controls and the meaning of each returned artifact.

Parameter Explanation

source_image (file)

Image containing the point or object part to match.

query_pixel (text)

Pixel coordinate whose descriptor should be matched in another view.

target_image (file)

Image where the corresponding object point should be found.
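Because query_pixel is passed as text, it has to be parsed and bounds-checked before it can index into an image. The sketch below assumes a "u,v" string (column, then row); that format and the helper name parse_query_pixel are illustrative assumptions, not something the original project documents.

```python
def parse_query_pixel(text, image_width, image_height):
    """Parse a 'u,v' string into integer pixel coordinates.

    The 'u,v' text format (column first, row second) is an assumption
    made for this sketch; check the original project for the real format.
    """
    parts = text.replace(" ", "").split(",")
    if len(parts) != 2:
        raise ValueError(f"expected 'u,v', got {text!r}")
    u, v = int(parts[0]), int(parts[1])  # int() raises ValueError on bad input
    # Reject coordinates that fall outside the source image.
    if not (0 <= u < image_width and 0 <= v < image_height):
        raise ValueError(f"pixel ({u}, {v}) is outside a {image_width}x{image_height} image")
    return u, v
```

Validating early keeps an out-of-range coordinate from silently wrapping around or raising an index error deep inside the matching step.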

Output Explanation

descriptor_map

Dense per-pixel visual descriptor representation.

matched_pixel

Predicted corresponding point in the target image.

How To Use

Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.

Resources

arXiv: https://arxiv.org/abs/1806.08756
PMLR: https://proceedings.mlr.press/v87/florence18a.html

Deployment Notes

Prepare multi-view object images or robot-collected observations.
Train or load the descriptor model following the official project instructions.
Use nearest-neighbor descriptor matches to recover manipulation-relevant object points.
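The nearest-neighbor step in the notes above can be sketched with NumPy. This is a minimal illustration, not the official repository code: it assumes the descriptor maps are already available as arrays of shape (H, W, D), that query_pixel is given as (u, v) with u the column and v the row, and that Euclidean distance is used for matching. The function name match_pixel is hypothetical.

```python
import numpy as np

def match_pixel(source_descriptors, target_descriptors, query_pixel):
    """Find the target pixel whose descriptor is closest to the query's.

    source_descriptors, target_descriptors: arrays of shape (H, W, D).
    query_pixel: (u, v) = (column, row) in the source image (assumed convention).
    Returns ((u, v) in the target image, Euclidean descriptor distance).
    """
    u, v = query_pixel
    query = source_descriptors[v, u]            # descriptor at the query, shape (D,)
    diff = target_descriptors - query           # broadcast over every target pixel
    squared_dist = np.sum(diff * diff, axis=-1) # shape (H, W)
    row, col = np.unravel_index(np.argmin(squared_dist), squared_dist.shape)
    return (int(col), int(row)), float(np.sqrt(squared_dist[row, col]))
```

In practice a distance threshold on the returned value is a common way to reject matches when the queried point is occluded or absent in the target view; the threshold itself would need to be tuned per model.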

Relative Path Example

# Follow the official Dense Object Nets repository and paper setup.
# Typical use: train descriptors on object views, then query descriptor matches for manipulation points.

Expected Result Shape

{
  "tool": "dense-object-nets",
  "status": "ok",
  "results": [
    {
      "label": "Dense visual correspondence for manipulation",
      "score": 0.87,
      "output": "Dense descriptors and point correspondences"
    }
  ],
  "timing": {
    "runtime": "The paper reports approximate training time rather than a single deployment latency number.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/dense-object-nets/runs/visualization.png",
    "raw_predictions": "tools/dense-object-nets/runs/predictions.json"
  }
}
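A consumer of this result could check the status and pull out the top-scoring entry as sketched below. The field names come from the shape shown above; the helper name summarize_result and the choice to raise on a non-"ok" status are assumptions made for illustration.

```python
import json

def summarize_result(text):
    """Parse a result payload and return its best entry and artifact paths.

    Field names follow the expected result shape; the error handling
    policy here is an assumption, not documented behavior.
    """
    payload = json.loads(text)
    if payload.get("status") != "ok":
        raise RuntimeError(f"tool reported status {payload.get('status')!r}")
    results = payload.get("results", [])
    # Pick the entry with the highest score, if any entries exist.
    best = max(results, key=lambda r: r.get("score", 0.0)) if results else None
    return {
        "tool": payload.get("tool"),
        "best": best,
        "artifacts": payload.get("artifacts", {}),
    }
```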

Paper figure

Academic Info

Paper identity and contribution summary.

Title: Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation

Authors: Peter R. Florence, Lucas Manuelli, Russ Tedrake

Venue: CoRL 2018

Contribution: Shows that self-supervised dense descriptors can provide object-centric correspondence signals for robotic manipulation policies.

Citation

@misc{denseobjectnets2018,
  title={Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation},
  author={Peter R. Florence and Lucas Manuelli and Russ Tedrake},
  year={2018},
  note={CoRL 2018},
  url={https://arxiv.org/abs/1806.08756}
}

Benchmark

Only compact, source-reported numbers are shown here.

Dataset	Metric	Value	Runtime	Source
Robot-collected object correspondence, standard-SO	Image-pair match precision	93% of image pairs have normalized pixel error under 13% of the image diagonal	Descriptor correspondence evaluation	Official CoRL 2018 paper, Fig. 3
Dense Object Nets object set	Training/evaluation coverage	47 objects, including 3 object classes	Self-supervised robot data collection setup	Official CoRL 2018 paper
standard-SO descriptor model	Training time	about 20 minutes	Model training	Official CoRL 2018 paper
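The match-precision row depends on a pixel error normalized by the image diagonal. The sketch below shows one way to compute that quantity and the fraction of errors under a threshold. It assumes one error value per image pair and pixel coordinates given as (u, v); the paper's exact aggregation protocol is not reproduced here, and both function names are hypothetical.

```python
import math

def normalized_pixel_error(predicted, ground_truth, image_width, image_height):
    """Euclidean pixel distance divided by the image diagonal."""
    du = predicted[0] - ground_truth[0]
    dv = predicted[1] - ground_truth[1]
    return math.hypot(du, dv) / math.hypot(image_width, image_height)

def fraction_below(errors, threshold):
    """Fraction of normalized errors strictly under the threshold."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(e < threshold for e in errors) / len(errors)
```

Normalizing by the diagonal makes the error comparable across image resolutions, which is why a threshold can be stated as a percentage rather than a pixel count.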

Artifacts

Official CoRL 2018 paper, descriptor visualizations, correspondence examples, and manipulation demonstrations.

Demo Images

Visual references from the original tool.