Transfer Oracle

Sim-to-Real Benchmark

Your sim covers 100% of the state space.
Your real robot reaches 43%.

We analyzed 200 simulation and 200 real-robot demonstrations of the same task from the RoboMimic dataset (MIT licensed, Franka Panda). The real robot's joint ranges, velocity profiles, and structural coverage are far narrower than the simulation's. Standard deployment checklists don't catch this.

43% Distribution coverage
40/100 Velocity structure score
31% Worst joint range overlap

Sim vs real: see the difference

Real robot camera alongside simulated view. Scrub through a demonstration and watch per-joint structural analysis update live.

Structural analysis results

Our proprietary structural analysis compares distribution geometry, not individual frames. The scores measure how well the structure of the simulation's data distribution is preserved on the real robot.

Per-joint range overlap

What percentage of each sim joint's operating range does the real robot actually use? Shoulder and wrist are the worst — the real robot moves in a much narrower range than the simulation predicts.
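One simple way to compute a range overlap of this kind is sketched below. It is an illustration, not the page's own implementation: the function name `range_overlap` and the use of raw min/max (rather than, say, percentiles) are assumptions.

```python
import numpy as np

def range_overlap(sim, real):
    """Fraction of each sim joint's [min, max] range that the real data also spans.

    sim, real: arrays of shape (n_frames, n_joints); frame counts may differ.
    Returns one value in [0, 1] per joint.
    """
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    sim_lo, sim_hi = sim.min(axis=0), sim.max(axis=0)
    real_lo, real_hi = real.min(axis=0), real.max(axis=0)
    # Length of the intersection of the two intervals, clipped at zero.
    inter = np.clip(np.minimum(sim_hi, real_hi) - np.maximum(sim_lo, real_lo), 0.0, None)
    sim_span = sim_hi - sim_lo
    # A joint that never moves in sim gets overlap 0 instead of a division error.
    return np.divide(inter, sim_span, out=np.zeros_like(inter), where=sim_span > 0)

# Toy data: two joints, sim spans [0, 1] and [-1, 1].
sim = np.array([[0.0, -1.0], [1.0, 1.0]])
real = np.array([[0.25, 0.0], [0.75, 2.0], [0.5, 1.0]])
overlap = range_overlap(sim, real)  # half of each sim range is covered
```

Raw min/max is sensitive to single outlier frames; swapping in the 1st and 99th percentiles would be a reasonable variation.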

Distribution comparison per joint

Side-by-side histograms for each joint: position and velocity distributions from 9,666 sim frames vs 11,524 real frames. Toggle between position and velocity to see where the real robot's behavior diverges.

Position structure is preserved. Dynamics are not.

Joint position analysis

67.5/100 structural score

Subspace alignment is 1.0: the sim and real robot explore similar principal directions in joint space. Position structure transfers reasonably well.
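A common way to measure whether two datasets explore the same principal directions is to compare their top PCA subspaces via principal angles. The sketch below is an assumption about what such an alignment score could look like, not the proprietary score reported here. The names `principal_directions` and `subspace_alignment` and the choice of `k` are hypothetical.

```python
import numpy as np

def principal_directions(x, k):
    """Top-k PCA directions of x (n_frames, n_joints) as orthonormal columns."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T

def subspace_alignment(sim, real, k=3):
    """Mean squared cosine of the principal angles between the two k-dim subspaces.

    1.0 means the subspaces coincide, 0.0 means they are orthogonal.
    Works on unpaired data because only the subspaces are compared.
    """
    a = principal_directions(sim, k)
    b = principal_directions(real, k)
    # Singular values of A^T B are the cosines of the principal angles.
    cosines = np.linalg.svd(a.T @ b, compute_uv=False)
    return float(np.mean(np.clip(cosines, 0.0, 1.0) ** 2))
```

Note that this score ignores scale: a real robot that moves along the same directions over a much smaller range would still score close to 1.0, which is consistent with an aligned subspace coexisting with a narrower range.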

Velocity dynamics analysis

40.0/100 structural score

The real robot moves roughly 3 to 5.5 times slower (real-to-sim velocity ratios of 0.18-0.36, and 1/0.36 ≈ 2.8 while 1/0.18 ≈ 5.6). Dynamics distributions barely overlap. Transfer risk: 0.72 (HIGH). This is where sim-trained policies fail under perturbation.
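A per-joint velocity ratio can be estimated from joint positions alone by finite differences. The sketch below is one plausible definition (ratio of mean absolute velocities) and is not confirmed as the one used here. The sampling intervals `sim_dt` and `real_dt` are hypothetical parameters, since the control rates are not stated.

```python
import numpy as np

def mean_abs_velocity(demos, dt):
    """Mean absolute joint velocity per joint.

    demos: list of (n_frames, n_joints) position arrays, one per demonstration.
    dt: sampling interval in seconds (assumed constant within the dataset).
    """
    # Differentiate within each demo so no velocity spans a demo boundary.
    vels = [np.abs(np.diff(np.asarray(d, dtype=float), axis=0)) / dt for d in demos]
    return np.concatenate(vels, axis=0).mean(axis=0)

def velocity_ratio(sim_demos, real_demos, sim_dt, real_dt):
    """Real-to-sim ratio per joint; values below 1 mean the real robot is slower."""
    return mean_abs_velocity(real_demos, real_dt) / mean_abs_velocity(sim_demos, sim_dt)
```

Under this definition a ratio of 0.25 means the real robot moves four times slower on that joint. If the two datasets were logged at different rates, getting `sim_dt` and `real_dt` right matters as much as the data itself.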

Honest assessment

What to do about it

1. Detect

Run structural analysis on your sim vs real distributions. Know exactly which joints and which dynamics diverge — before deployment, not after failure.

2. Optimize

Use structural distance as a dense RL reward signal. Train across sim variations to close the gap — no real robot needed for the optimization loop.

3. Verify

Re-run structural analysis. Confirm the gap decreased. Deploy with measured confidence, not blind hope.
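As a rough illustration of steps 2 and 3, the sketch below uses a stand-in "structural distance" (an approximate per-joint 1-D Wasserstein distance computed from quantiles) and turns it into a reward by negating it. The actual structural distance is proprietary and not described here, so every name in this block (`wasserstein_1d`, `structural_distance`, `reward`) is hypothetical. How a per-step dense signal is derived from such a distance is also not specified.

```python
import numpy as np

def wasserstein_1d(a, b, n_quantiles=101):
    """Approximate 1-D Wasserstein-1 distance from matched quantiles.

    Works on unpaired samples of different sizes.
    """
    q = np.linspace(0.0, 1.0, n_quantiles)
    return float(np.mean(np.abs(np.quantile(a, q) - np.quantile(b, q))))

def structural_distance(sim, real):
    """Mean per-joint distance between two (n_frames, n_joints) datasets."""
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    per_joint = [wasserstein_1d(sim[:, j], real[:, j]) for j in range(sim.shape[1])]
    return float(np.mean(per_joint))

def reward(sim_rollouts, real_reference):
    """Higher reward when the sim distribution is closer to the real reference."""
    return -structural_distance(sim_rollouts, real_reference)

# Verification step: the distance should shrink after optimization.
real_ref = np.column_stack([np.linspace(0, 1, 50), np.linspace(-1, 1, 50)])
before = structural_distance(real_ref + 1.0, real_ref)
after = structural_distance(real_ref + 0.1, real_ref)
gap_closed = after < before
```

The real reference data is fixed and collected once, so the optimization loop itself only needs new simulated rollouts.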

Methodology

Dataset: RoboMimic v0.1 — Lift task, Proficient Human demonstrations

License: MIT (free for commercial and research use)

Robot: Franka Panda 7-DOF arm

Simulation: robosuite / MuJoCo — 200 demos, 9,666 frames

Real robot: Physical Franka Panda — 200 demos, 11,524 frames

Data type: Unpaired demonstrations (different operators, same task). All comparisons are distribution-level.

Metrics: Distribution coverage, per-joint range overlap, variance ratios, and additional proprietary quality scores — all valid on unpaired data.

NOT used: Per-frame cosine similarity, Spearman rank correlation, kNN accuracy. These require paired samples and would be misleading on this dataset.

Reference: Mandlekar et al., “What Matters in Learning from Offline Human Demonstrations for Robot Manipulation”, CoRL 2021.
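Two of the metrics listed above, distribution coverage and variance ratios, can be computed on unpaired data with a few lines of NumPy. The definitions below are assumptions chosen for illustration (per-joint histogram bin coverage averaged over joints, and a plain real-to-sim variance ratio). The page does not state its exact formulas, and the function names and bin count are hypothetical.

```python
import numpy as np

def coverage(sim, real, bins=20):
    """Share of sim-occupied histogram bins that the real data also occupies.

    Computed per joint on bins spanning the sim range, then averaged over joints.
    sim, real: (n_frames, n_joints) arrays; frame counts may differ.
    """
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    scores = []
    for j in range(sim.shape[1]):
        edges = np.linspace(sim[:, j].min(), sim[:, j].max(), bins + 1)
        sim_hist, _ = np.histogram(sim[:, j], bins=edges)
        # Real values outside the sim range are ignored by np.histogram.
        real_hist, _ = np.histogram(real[:, j], bins=edges)
        occupied = sim_hist > 0
        scores.append(np.mean(real_hist[occupied] > 0))
    return float(np.mean(scores))

def variance_ratio(sim, real):
    """Per-joint real-to-sim variance ratio; values below 1 mean less spread on the real robot."""
    sim = np.asarray(sim, dtype=float)
    real = np.asarray(real, dtype=float)
    return real.var(axis=0) / sim.var(axis=0)
```

Neither function matches frames between the two datasets, which is why they remain valid when the demonstrations come from different operators.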

Audit your robot's transfer pipeline

43% coverage. 31% range overlap on shoulder. 40/100 velocity score. Know the gaps before you deploy.