Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks

Bhishma Dedhia1,2, David Bourgin2, Krishna Kumar Singh2, Yuheng Li2, Yan Kang2, Zhan Xu2, Niraj K. Jha1, Yuchen Liu2

1Princeton University, 2Adobe Research

Main Paper

Figure 1 (b)

VINs

256 frames@16fps

Fluffy Persian cat in pearl necklace, studio, burgundy background, portrait

VINs

256 frames@16fps

Back view of a young woman dressed in a yellow jacket, walking in the forest

VINs

256 frames@16fps

Shark swimming in the clear Caribbean ocean

Figure 1 (c)

A fat rabbit wearing a purple robe walking through a fantasy landscape

Full Attention

256 frames@16fps

Autoregressive

256 frames@16fps

VINs

256 frames@16fps

Figure 6

Two Pandas Discussing an Academic Paper

Autoregressive

256 frames@16fps

Spectral Blending

256 frames@16fps

VINs

256 frames@16fps

Happy dog wearing a yellow turtleneck, studio, portrait

Autoregressive

256 frames@16fps

Spectral Blending

256 frames@16fps

VINs

256 frames@16fps

Figure 7

A drone flying over a snowy forest

Autoregressive

256 frames@16fps

Spectral Blending

256 frames@16fps

VINs

256 frames@16fps

Cherry blossoms swing in front of ocean view

Autoregressive

256 frames@12fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 9

VINs

128 frames@24fps

Bird's eye view of dense fir forests and a transparent lake in the middle in Dolomites

VINs

192 frames@24fps

A glass bead falling into water with a huge splash. Sunset in the background

VINs

256 frames@16fps

Happy Corgi playing in the park, golden hour, 4K

Supplementary/Appendix

Figure 14

Fluffy Persian cat in pearl necklace, studio, burgundy background, portrait

OpenSora v1.2

256 frames

Mochi-1

256 frames

HunyuanVideo

256 frames

VINs

256 frames

Happy Corgi playing in the park, golden hour, 4K

OpenSora v1.2

256 frames

Mochi-1

256 frames

HunyuanVideo

256 frames

VINs

256 frames

Figure 15

Shark swimming in the clear Caribbean ocean

OpenSora v1.2

Mochi-1

HunyuanVideo

VINs

256 frames

Back view of a young woman dressed in a yellow jacket, walking in the forest

OpenSora v1.2

Mochi-1

HunyuanVideo

VINs

Figure 16

Confused grizzly bear trying to learn calculus

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Back view of a young woman dressed in a yellow jacket, walking in the forest

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 17

A raccoon dressed in a suit playing the trumpet

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

A dog wearing a superhero outfit with red cape flying through the sky

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 18

A shark swimming in the clear Caribbean ocean

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

A fat rabbit wearing a purple robe walking through a fantasy landscape

Full

256 frames@16fps

Autoreg.

256 frames@16fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 19

A drone flying over a snowy forest

Full

256 frames@16fps

Autoreg.

256 frames@16fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@16fps

VINs

256 frames@16fps

Yellow flowers swinging in the wild

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 20

Campfire at night in a snowy forest with starry sky

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Cherry blossoms swing in front of ocean view

Full

256 frames@16fps

Autoreg.

256 frames@12fps

ST2V

256 frames@8fps

FreeNoise

256 frames@16fps

Spectral Blending

256 frames@12fps

VINs

256 frames@16fps

Figure 21

Fluffy Persian cat in pearl necklace, studio, burgundy background, portrait

VINs with global tokens

256 frames@16fps

VINs without global tokens

256 frames@16fps

Smiling elderly gentleman with rimmed glasses, in a tweed jacket, studio, portrait

VINs with global tokens

256 frames@16fps

VINs without global tokens

256 frames@16fps

Two pandas reading an academic paper

VINs with global tokens

256 frames@16fps

VINs without global tokens

256 frames@16fps

Distinguished poodle wearing tweed vest, studio, portrait

VINs with global tokens

256 frames@16fps

VINs without global tokens

256 frames@16fps