ManiDreams: An Open-Source Library for Robust Object Manipulation via Uncertainty-aware Task-specific Intuitive Physics

¹Rice University  ²Robotics and AI Institute
Figure: ManiDreams maintains a time-varying constraint (cage) around target objects, sampling and evaluating candidate actions via parallel forward simulation for robust execution.

Abstract

Dynamics models, whether simulators or learned world models, have long been central to robotic manipulation, but most of these models focus on minimizing prediction error rather than confronting a more fundamental challenge: real-world manipulation is inherently uncertain. We argue that robust manipulation under uncertainty is fundamentally an integration problem: uncertainties must be represented, propagated, and constrained within the planning loop, not merely suppressed during training.

We present and open-source ManiDreams, a modular framework for uncertainty-aware manipulation planning over intuitive physics models. It realizes this integration through composable abstractions for distributional state representation, backend-agnostic dynamics prediction, and declarative constraint specification for action optimization. The framework explicitly addresses three sources of uncertainty: perceptual, parametric, and structural. It wraps any base policy with a sample-predict-constrain loop that evaluates candidate actions against distributional outcomes, adding robustness without retraining. Experiments on default ManiSkill tasks show that ManiDreams maintains robust performance under perturbations that significantly degrade the RL baseline. Runnable examples on pushing, picking, catching, and real-world deployment demonstrate its flexibility across different policies, optimizers, physics backends, and executors.

Real2Sim with DRIS

Requires only a stereo video stream and generalizes to unknown objects without any pre-modeling. Runs in real time (15 FPS) on an RTX 3070.
Physics backend: Newton v1.0.0.
Bounding box estimation: Fast-FoundationStereo + SAM2.
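
The page does not say how the bounding box is derived from the stereo and segmentation outputs. As a rough sketch of one common approach, assume a depth map (such as a stereo model might produce) and a binary object mask (such as a segmenter might produce) are already available. The masked pixels are back-projected with a pinhole camera model and an axis-aligned box is fitted. The function names, the intrinsics, and the axis-aligned choice are all assumptions for illustration. No Fast-FoundationStereo, SAM2, or Newton calls are shown.

```python
import numpy as np

def backproject(depth, mask, fx, fy, cx, cy):
    """Lift masked depth pixels to 3D camera-frame points (pinhole model)."""
    v, u = np.nonzero(mask & (depth > 0))  # pixel rows and columns on the object
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def aabb(points):
    """Axis-aligned bounding box of a point set, returned as (center, size)."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, hi - lo

# Synthetic stand-ins for the perception outputs (assumed, not real data):
# a flat 20x20-pixel patch located 0.5 m in front of the camera.
depth = np.zeros((48, 64))
depth[10:30, 20:40] = 0.5
mask = np.zeros((48, 64), dtype=bool)
mask[10:30, 20:40] = True

points = backproject(depth, mask, fx=100.0, fy=100.0, cx=32.0, cy=24.0)
center, size = aabb(points)
```

A box of this kind could then be used to instantiate a proxy body in a physics backend, which is one way to avoid per-object modeling.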

Pipeline

Figure: ManiDreams pipeline. Cage-constrained planning: generate candidate actions → parallel forward prediction via TSIP → cage evaluation & validation → execute the best valid action.
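
The four pipeline stages can be sketched with a toy planar example. This is a minimal illustration under stated assumptions, not the ManiDreams API: every function name is hypothetical, the forward model is a one-line stand-in for a TSIP (each object hypothesis moves by a randomized gain times the action), and the cage is a fixed circle.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_actions(base_action, n, sigma=0.05):
    """Stage 1: the base policy's proposal plus Gaussian perturbations."""
    noise = rng.normal(0.0, sigma, size=(n, base_action.shape[0]))
    noise[0] = 0.0  # always keep the unperturbed base action as a candidate
    return base_action + noise

def predict(instances, actions, gains):
    """Stage 2: toy forward model evaluated for all candidates at once.
    instances (m, 2), actions (n, 2), gains (m,) -> predictions (n, m, 2)."""
    return instances[None, :, :] + gains[None, :, None] * actions[:, None, :]

def cage_eval(pred, center, radius):
    """Stage 3: cost is the mean distance to the cage center;
    a candidate is valid only if every instance ends inside the cage."""
    dist = np.linalg.norm(pred - center, axis=-1)  # shape (n, m)
    return dist.mean(axis=1), (dist <= radius).all(axis=1)

def plan_step(instances, gains, base_action, center, radius, n=64):
    """Stage 4: return the lowest-cost valid action, or None if none is valid."""
    actions = sample_actions(base_action, n)
    cost, valid = cage_eval(predict(instances, actions, gains), center, radius)
    if not valid.any():
        return None  # the caller decides on a fallback
    return actions[np.argmin(np.where(valid, cost, np.inf))]

m = 8  # number of object-state hypotheses
instances = np.array([0.20, 0.0]) + rng.normal(0.0, 0.01, size=(m, 2))
gains = rng.uniform(0.8, 1.2, size=m)  # randomized dynamics parameter
center = np.array([0.15, 0.0])
action = plan_step(instances, gains, np.array([-0.03, 0.0]), center, radius=0.1)
```

Because the base action is always among the candidates, the selected action can never score worse than the base policy's proposal whenever that proposal is itself valid.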

Architecture & Core Concepts

Figure: ManiDreams architecture. Three-layer modular architecture: abstract interfaces → concrete implementations → task-specific integrations.

DRIS (Domain-Randomized Instance Set): Universal state representation carrying observation data plus domain-randomization context. Supports state vectors, images, point clouds, or any combination.

TSIP (Task-Specific Intuitive Physics): Forward model predicting the next state given the current state and action. Supports simulation-based (ManiSkill, Newton, etc.) and learning-based (diffusion model) backends.

Cage (Spatial Constraint Evaluator): Virtual boundary providing continuous cost evaluation and validation (for constraint satisfaction). Supports time-varying parameters, including deformation and custom trajectories.

Solver (Action Selection via Sampling & Optimization): Generates and evaluates candidate actions. Combines samplers (PolicySampler, GaussianSampler) with optimizers (MPPI, Geometric, MPC) to propose and refine actions under cage constraints.
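
To make three of these concepts concrete, the sketch below shows a distributional state container, a time-varying cage with separate cost and validation methods, and an MPPI-style update. It is an illustration only: the class names, fields, and method signatures are hypothetical and are not taken from the ManiDreams code base. The cage is assumed to be a circle whose center and radius are linearly interpolated over a fixed horizon.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class DRIS:
    """Hypothetical container: m object-state hypotheses plus randomization context."""
    states: np.ndarray  # (m, 2) planar positions
    context: dict       # e.g. the sampled physical parameters

@dataclass
class CircularCage:
    """Hypothetical time-varying cage: the center moves from start to goal
    while the radius changes from r0 to r1, both linearly over `horizon` steps."""
    start: np.ndarray
    goal: np.ndarray
    r0: float
    r1: float
    horizon: int

    def params(self, t):
        a = min(t / self.horizon, 1.0)
        return (1 - a) * self.start + a * self.goal, (1 - a) * self.r0 + a * self.r1

    def cost(self, dris, t):
        """Continuous cost: mean distance of all instances to the cage center."""
        center, _ = self.params(t)
        return float(np.linalg.norm(dris.states - center, axis=1).mean())

    def validate(self, dris, t):
        """Constraint check: every instance must lie inside the cage."""
        center, radius = self.params(t)
        return bool((np.linalg.norm(dris.states - center, axis=1) <= radius).all())

def mppi_update(actions, costs, temperature=0.01):
    """MPPI-style update: softmin-weighted average of the sampled actions."""
    w = np.exp(-(costs - costs.min()) / temperature)
    return (w[:, None] * actions).sum(axis=0) / w.sum()

cage = CircularCage(np.array([0.0, 0.0]), np.array([1.0, 0.0]), r0=0.2, r1=0.1, horizon=10)
dris = DRIS(states=np.array([[0.5, 0.05], [0.55, -0.05]]), context={"friction": [0.4, 0.6]})
inside_mid = cage.validate(dris, t=5)    # cage is centered at (0.5, 0) with radius 0.15
inside_start = cage.validate(dris, t=0)  # cage is centered at the origin with radius 0.2
blended = mppi_update(np.array([[0.0, 0.0], [1.0, 0.0]]), np.array([0.0, 0.0]))
```

Keeping `cost` and `validate` separate mirrors the description above: the cost ranks candidates continuously, while validation acts as a hard filter.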

Runnable Examples

Cage-constrained manipulation across four tasks with different solver and TSIP configurations.

Object Pushing: simulation-based TSIP
Object Pushing: learning-based TSIP
Ball Catching: RL policy sampler
Card Picking: MPPI optimizer

Real Robot Experiments with Diffusion-based TSIP

Random Object Picking from Clutter: push-then-pick strategy with real-time DRIS via SAM2
Flat Object Scooping from Clutter: push-to-corner-then-scoop strategy with real-time DRIS via SAM2

Default ManiSkill Tasks

Evaluation on standard ManiSkill benchmarks (top: simulation-based TSIP, bottom: executor).

PushCube: push a cube to a target location
PickCube: pick up a cube and move it to a goal position
PushT: push a T-shaped block to a target pose

Benchmark Results

Figure: Comparison of cage-constrained methods against baselines across manipulation tasks.

Ablation Study

DRIS instances m | 1 | 4 | 8 | 16 | 32
Success rate (%) | 58 | 72 | 82 | 85 | 86

Solver samples N | 1 | 4 | 8 | 16 | 32
Success rate (%) | 52 | 71 | 82 | 87 | 88

Distribution width | Narrow | Medium | Wide
Success rate (%) | 74 | 82 | 76

Computation Overhead

Figure: Wall-clock time and GPU memory overhead of cage-constrained planning across configurations.