Diffusion-forcing world model rollouts. Each sample is ~2.7 s (16 raw frames @ 6 fps).
Side-by-side videos show ground truth on the left, model prediction on the right.
First half of each clip = history (GT), second half = autoregressively sampled future.
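The clip layout described above can be sketched in a few lines. This is only an illustration: the frame count and frame rate come from the text, while the time-first array layout, the 4-dimensional placeholder feature, and an exactly even 8/8 split are assumptions, and `split_history_future` is a hypothetical helper name.

```python
import numpy as np

FPS = 6          # frame rate stated in the report
NUM_FRAMES = 16  # raw frames per sample stated in the report


def split_history_future(clip):
    """Split a clip along axis 0 (assumed to be time) into two halves."""
    half = clip.shape[0] // 2
    return clip[:half], clip[half:]


# Placeholder clip: 16 frames with a hypothetical 4-dim feature per frame.
clip = np.zeros((NUM_FRAMES, 4))
history, future = split_history_future(clip)

# Clip duration in seconds, matching the "~2.7 s" figure.
duration_s = NUM_FRAMES / FPS
```

Under these assumptions, `history` would hold the ground-truth context frames and `future` the positions filled by autoregressive sampling.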
Summary
Metrics are computed on saved future latent frames. Contact/no-contact split uses tactile latent-to-reference energy.
run                        view MSE   TL MSE     TR MSE     tactile contact MSE   tactile no-contact MSE
E1_best_epoch36_val00200   0.049644   0.009406   0.004801   0.007103              n/a
E1_last_epoch99_val03365   0.049644   0.009406   0.004801   0.007103              n/a
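A minimal sketch of how metrics of this kind might be computed, assuming the saved future latents are NumPy arrays with time on axis 0. The report does not define "latent-to-reference energy"; here it is assumed to be the per-frame mean squared distance between the ground-truth tactile latent and a reference latent. The `reference`, `threshold`, and function names are all hypothetical.

```python
import numpy as np


def future_mse(pred, target):
    """Mean squared error over all future latent frames."""
    return float(np.mean((pred - target) ** 2))


def contact_split_mse(pred, target, reference, threshold):
    """Per-group MSE for contact and no-contact frames.

    Assumption: a frame counts as "contact" when the energy of its
    ground-truth latent relative to `reference` exceeds `threshold`.
    """
    axes = tuple(range(1, target.ndim))  # reduce over everything but time
    energy = np.mean((target - reference) ** 2, axis=axes)
    contact = energy > threshold
    per_frame = np.mean((pred - target) ** 2, axis=axes)

    def masked_mean(mask):
        # An empty group has no defined mean, so return None.
        return float(per_frame[mask].mean()) if mask.any() else None

    return masked_mean(contact), masked_mean(~contact)


# Toy data: 4 frames of 3-dim latents, the last two frames far from reference.
target = np.zeros((4, 3))
target[2:] = 1.0
pred = target + 0.1
reference = np.zeros(3)

contact_mse, no_contact_mse = contact_split_mse(pred, target, reference, 0.5)
```

In this sketch a group with no frames yields `None`. Whether that is what produces the n/a entries in the table is not stated in the report.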
E1_best_epoch36_val00200
E1_best_epoch36_val00200/sample_000
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_001
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_002
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_003
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_004
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_005
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_006
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_007
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_008
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_best_epoch36_val00200/sample_009
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365
E1_last_epoch99_val03365/sample_000
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_001
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_002
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_003
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_004
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_005
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_006
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_007
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_008
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout
E1_last_epoch99_val03365/sample_009
Panel (rows: tl, tr, view; cols: time, history|future split)
view_left: GT (L) vs rollout (R)
tactile_left: GT vs rollout
tactile_right: GT vs rollout