Short Explanation
MiDaS predicts relative depth maps for single RGB images across diverse scenes.
It is a cross-dataset depth estimation model: the output is relative (not metric) depth predicted from a single image.
Core parameters, trigger timing, and visual before/after demo references.
Prepare the input: a single RGB image file or a folder of images.
Read the outputs: the predicted depth map and, if produced, the visualization image.
A quick-run style example for the documentation page.
Readable controls and the meaning of each returned artifact.
| Parameter | Type | Default | Description |
|---|---|---|---|
| input_path | path | | Input image file or folder. |
| output_path | path | | Destination for predicted depth outputs. |
| model_type | select | dpt_beit_large_512 | Checkpoint variant controlling quality and speed. |

| Output | Description |
|---|---|
| depth_map | Single-channel relative depth prediction. |
| sidecar_visualization | Optional visualization image for quick inspection. |
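Because the depth map holds relative values with no fixed scale, a quick inspection image is typically made by rescaling it. The sketch below shows one minimal way to do that in plain Python. `to_uint8` is a hypothetical helper, not part of MiDaS, and min-max scaling is an assumption about how such a visualization could be produced, not a description of what the tool does internally.

```python
def to_uint8(depth):
    """Min-max scale a 2D list of relative depth values to integers in 0..255."""
    flat = [v for row in depth for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map has no contrast; return zeros instead of dividing by zero.
        return [[0 for _ in row] for row in depth]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in depth]


# Toy 2x2 "depth map" (hypothetical values, only their ordering matters).
relative_depth = [[0.0, 1.0], [3.0, 5.0]]
print(to_uint8(relative_depth))
```

Only the ordering and ratios of the scaled values carry meaning; they should not be read as distances in meters.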
Official resources, deployment steps, academic context, citation, and source-reported benchmark numbers.
python run.py --input_path tools/midas/examples --output_path tools/midas/runs --model_type dpt_beit_large_512
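When the same command has to be issued from a script, it can help to assemble the argument list programmatically. The following is a minimal sketch that only uses the three flags shown above; `build_midas_command` is a hypothetical helper name, and it assumes `run.py` is in the current working directory.

```python
import shlex
import sys


def build_midas_command(input_path, output_path, model_type="dpt_beit_large_512"):
    """Return the run.py argument list for the three documented flags."""
    return [
        sys.executable, "run.py",
        "--input_path", input_path,
        "--output_path", output_path,
        "--model_type", model_type,
    ]


cmd = build_midas_command("tools/midas/examples", "tools/midas/runs")
# Print the command without the interpreter path, for logging or copy-paste.
print(shlex.join(cmd[1:]))
# To actually run it (assumption: run.py is in the working directory):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.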
{
  "tool": "midas",
  "status": "ok",
  "depth_map": [
    {
      "label": "Monocular depth estimation",
      "score": 0.87,
      "output": "Depth map"
    }
  ],
  "timing": {
    "runtime": "The official README reports 5.7 FPS for BEiT-L-512 on RTX 3090 with 345M parameters, and 30 FPS for Next-ViT-L-384 with 72M parameters.",
    "device": "documented in source benchmark when available"
  },
  "artifacts": {
    "visualization": "tools/midas/runs/visualization.png",
    "raw_predictions": "tools/midas/runs/predictions.json"
  }
}
Paper identity and contribution summary.
@misc{midas2022,
  title={Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer},
  author={René Ranftl and Katrin Lasinger and David Hafner and Konrad Schindler and Vladlen Koltun},
  year={2022},
  note={TPAMI 2022 / arXiv:1907.01341},
  url={https://arxiv.org/abs/1907.01341}
}
Only compact, source-reported numbers are shown here.
6-dataset zero-shot benchmark (zero-shot error per dataset):

| Model | DIW WHDR | ETH3D AbsRel | Sintel AbsRel | TUM | KITTI | NYUv2 | Params | FPS (RTX 3090) | Source |
|---|---|---|---|---|---|---|---|---|---|
| MiDaS v3.1 BEiT-L-512 | 0.1137 | 0.0659 | 0.2366 | 6.13 | 11.56 | 1.86 | 345M | 5.7 | Official README |
| MiDaS v3.1 Next-ViT-L-384 | 0.1031 | 0.0954 | 0.2295 | 9.21 | 6.89 | 3.47 | 72M | 30 | Official README |
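To make the speed side of the trade-off easier to compare, the reported frame rates can be converted to approximate per-frame latency. This is simple arithmetic on the two FPS figures quoted above; `frame_latency_ms` is a hypothetical helper, and the results apply only to the RTX 3090 setup the source reports, not to other hardware.

```python
def frame_latency_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# FPS values as reported by the source for an RTX 3090.
reported_fps = {"BEiT-L-512": 5.7, "Next-ViT-L-384": 30.0}

for name, fps in reported_fps.items():
    print(f"{name}: {frame_latency_ms(fps):.1f} ms per frame")
```

By this estimate the BEiT-L-512 variant takes roughly five times longer per frame than Next-ViT-L-384, which is the cost of its lower error on several of the listed datasets.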
Official model table, checkpoints, accuracy-speed figure, and run.py inference entry.
Visual references from the original tool.