Dynamics models, whether simulators or learned world models, have long been central to robotic manipulation, but most of these models focus on minimizing prediction error rather than confronting a more fundamental challenge: real-world manipulation is inherently uncertain. We argue that robust manipulation under uncertainty is fundamentally an integration problem: uncertainties must be represented, propagated, and constrained within the planning loop, not merely suppressed during training.
We present and open-source ManiDreams, a modular framework for uncertainty-aware manipulation planning over intuitive physics models that realizes this integration through composable abstractions for distributional state representation, backend-agnostic dynamics prediction, and declarative constraint specification for action optimization. The framework explicitly addresses three sources of uncertainty: perceptual, parametric, and structural. It wraps any base policy with a sample-predict-constrain loop that evaluates candidate actions against distributional outcomes, adding robustness without retraining. Experiments on default ManiSkill tasks show that ManiDreams maintains robust performance under various perturbations, where the RL baseline degrades significantly. Runnable examples on pushing, picking, catching, and real-world deployment demonstrate flexibility for applications across different policies, optimizers, physics backends, and executors.
Uses only a stereo video stream as input and generalizes to unknown objects without any pre-modeling. Runs in real time (15 FPS) on an RTX 3070.
Physics backend: Newton v1.0.0.
Bounding box estimation: Fast-FoundationStereo + SAM2.
The cage-constrained planning pipeline: generate candidate actions → parallel forward prediction via TSIP → cage evaluation & validation → execute the best valid action.
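The four pipeline stages can be illustrated with a minimal, self-contained sketch. It is not the ManiDreams API: every name here (`predict`, `plan_step`, `gains`) is hypothetical, the forward model is a toy 1D push standing in for TSIP, and the cage is assumed to be a simple interval that every predicted state instance must stay inside.

```python
import random
from statistics import mean


def predict(states, gains, action):
    """Toy forward model (stand-in for TSIP): each state instance moves by
    the action scaled by its own uncertain gain."""
    return [x + g * action for x, g in zip(states, gains)]


def plan_step(states, gains, base_action, goal, cage, n_candidates, rng):
    """One sample-predict-constrain step. Returns the best valid action,
    or None if no candidate keeps every instance inside the cage."""
    lo, hi = cage
    # 1. Generate candidate actions around the base policy's proposal.
    candidates = [base_action] + [
        base_action + rng.gauss(0.0, 0.5) for _ in range(n_candidates - 1)
    ]
    best_action, best_cost = None, float("inf")
    for action in candidates:
        # 2. Forward-predict the whole set of state instances.
        predicted = predict(states, gains, action)
        # 3. Cage validation: reject if any instance leaves the cage.
        if not all(lo <= x <= hi for x in predicted):
            continue
        # Cage evaluation: here, distance of the mean outcome to the goal.
        cost = abs(mean(predicted) - goal)
        if cost < best_cost:
            best_action, best_cost = action, cost
    # 4. The caller executes the best valid action (if any).
    return best_action


if __name__ == "__main__":
    rng = random.Random(0)
    states = [0.0, 0.1, -0.1]   # sampled object positions
    gains = [0.8, 1.0, 1.2]     # sampled physical parameters
    print(plan_step(states, gains, 1.0, 1.0, (-0.5, 1.5), 8, rng))
```

In this sketch the loop over candidates is sequential for clarity; the pipeline above describes the prediction step as parallel. Returning `None` when no candidate is valid leaves the fallback behavior to the caller, which is a design choice of the sketch rather than something stated by the source.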
Three-layer modular architecture: abstract interfaces → concrete implementations → task-specific integrations.
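As a rough illustration of how such a three-layer split might look in code, the sketch below separates an abstract interface, a concrete implementation, and a task-specific integration. The class and function names (`DynamicsBackend`, `ScaledPushBackend`, `make_push_task`) are invented for this example and are not taken from the ManiDreams codebase.

```python
from abc import ABC, abstractmethod


class DynamicsBackend(ABC):
    """Layer 1, abstract interface (hypothetical): anything that can
    advance a set of state instances by one action."""

    @abstractmethod
    def step(self, states, action):
        ...


class ScaledPushBackend(DynamicsBackend):
    """Layer 2, concrete implementation: a toy 1D push model."""

    def __init__(self, gain):
        self.gain = gain

    def step(self, states, action):
        return [x + self.gain * action for x in states]


def make_push_task(gain=1.0, cage=(-1.0, 1.0)):
    """Layer 3, task-specific integration: wires a backend to a cage
    (assumed here to be an interval) and returns a validity check."""
    backend = ScaledPushBackend(gain)

    def is_valid(states, action):
        lo, hi = cage
        return all(lo <= x <= hi for x in backend.step(states, action))

    return is_valid
```

Because the task layer depends only on the `step` method, a different backend could be swapped in without touching the validity check, which is the kind of backend-agnostic composition the architecture describes.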
Cage-constrained manipulation across four tasks with different solver and TSIP configurations.
Evaluation on standard ManiSkill benchmarks (top: simulation-based TSIP, bottom: executor).
Comparison of cage-constrained methods against baselines across manipulation tasks.
| DRIS instances m | 1 | 4 | 8 | 16 | 32 |
|---|---|---|---|---|---|
| Success rate (%) | 58 | 72 | 82 | 85 | 86 |

| Solver samples N | 1 | 4 | 8 | 16 | 32 |
|---|---|---|---|---|---|
| Success rate (%) | 52 | 71 | 82 | 87 | 88 |

| Distribution width | Narrow | Medium | Wide |
|---|---|---|---|
| Success rate (%) | 74 | 82 | 76 |
Wall-clock time and GPU memory overhead of cage-constrained planning across configurations.