Transforming abrupt semantic jumps into smooth, evenly paced transitions
Cat to Lion
Left: Source video with non-linear semantic progression | Right: ReTime output with uniform semantic flow
Watch how the SPF curve reveals uneven pacing and how ReTime corrects it
Transformations produced by image and video generation models often evolve in a highly non-linear manner: long stretches where the content barely changes are followed by sudden, abrupt semantic jumps. To analyze and correct this behavior, we introduce a Semantic Progress Function, a one-dimensional representation that captures how the meaning of a given sequence evolves over time. For each frame, we compute distances between semantic embeddings and fit a smooth curve that reflects the cumulative semantic shift across the sequence. Departures of this curve from a straight line reveal uneven semantic pacing. Building on this insight, we propose a semantic linearization procedure that reparameterizes (or re-times) the sequence so that semantic change unfolds at a constant rate, yielding smoother and more coherent transitions. Beyond linearization, our framework provides a model-agnostic foundation for identifying temporal irregularities, comparing semantic pacing across different generators, and steering both generated and real-world video sequences toward arbitrary target pacing.
Video generation models produce transformations that evolve unevenly. Long stretches where content barely changes are followed by sudden, jarring semantic jumps.
Balloon to Lantern
Basket to Nest
Bead to Bee
Berry to Bird
Birdhouse to Camera
Broom to Cat
City to Circuit Board
Cloud to Sheep
Corn to Chick
Dumpling to Bunny
Lollipop to Sheep
Macaron to Bunny
Panda to Red Panda
Straw to Flamingo
Teddy to Panda
Traffic Cone to Fox
Wombat to Badger
Cat to Lion
Zoom In
Desert to Ice
Face Makeup
Fox to Corgi
Pumpkin to Mosaic
Tomato to Bell Pepper
The Semantic Progress Function (SPF) reveals these irregularities: the source curve deviates from the ideal diagonal, showing regions of stagnation followed by rapid change. Our ReTime method corrects this pacing.
The Semantic Progress Function (SPF) captures cumulative semantic change over time
Each video frame is embedded using a semantic encoder (SigLIP). Pairwise distances capture how much meaning changes between frames.
A smooth curve is fitted to represent cumulative semantic progress. Its slope reflects the instantaneous rate of semantic change.
We warp the temporal rotary positional embeddings (RoPE) to redistribute time according to semantic progress, achieving constant semantic velocity.
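The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It measures cosine distance between consecutive frame embeddings only, rather than the full set of pairwise distances. It uses a moving average as a stand-in for the smooth curve fit. It returns fractional source-frame positions at which semantic progress is evenly spaced, instead of warping RoPE inside a video model. The names semantic_progress and uniform_progress_positions and the window parameter are hypothetical, and the embeddings are assumed to come from any semantic encoder such as SigLIP.

```python
import math
from bisect import bisect_right


def semantic_progress(embeddings, window=5):
    """Normalized cumulative semantic progress in [0, 1], one value per frame.

    Assumes at least two frames, each given as a non-zero vector of floats.
    """
    def cosine_distance(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return max(0.0, 1.0 - dot / (norm_a * norm_b))

    # Semantic change between consecutive frames (a simplification).
    steps = [cosine_distance(a, b) for a, b in zip(embeddings, embeddings[1:])]

    # Moving average as a stand-in for fitting a smooth curve.
    # The tiny offset keeps the cumulative curve strictly increasing.
    half = window // 2
    smoothed = []
    for i in range(len(steps)):
        lo, hi = max(0, i - half), min(len(steps), i + half + 1)
        smoothed.append(sum(steps[lo:hi]) / (hi - lo) + 1e-12)

    # Accumulate and normalize so the curve runs from 0 to 1.
    progress = [0.0]
    for s in smoothed:
        progress.append(progress[-1] + s)
    total = progress[-1]
    return [p / total for p in progress]


def uniform_progress_positions(progress, num_out):
    """Fractional source-frame positions where progress is evenly spaced.

    Inverts the piecewise-linear progress curve; requires num_out >= 2.
    """
    positions = []
    for k in range(num_out):
        target = k / (num_out - 1)
        # Locate the segment [i, i + 1] whose progress range contains target.
        i = min(bisect_right(progress, target) - 1, len(progress) - 2)
        span = progress[i + 1] - progress[i]
        positions.append(i + (target - progress[i]) / span)
    return positions
```

Where the progress curve is flat, the returned positions are far apart, so little output time is spent there. Where the curve is steep, the positions cluster and the change is spread over more output time. How such a warp is injected into a specific model's positional embeddings is model-dependent and is not shown here.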
Watch how the SPF curve reveals non-linear semantic evolution and how ReTime corrects it
Traffic Cone to Fox
Berry to Bird
Birdhouse to Camera
Panda to Red Panda
Lollipop to Sheep
Cloud to Sheep
Corn to Chick
Tomato to Bell Pepper
Balloon to Lantern
City to Circuit Board
Dumpling to Bunny
Macaron to Bunny
Straw to Flamingo
Teacup to Bowl
Zoom In
Desert to Ice
Face Makeup
Pumpkin to Mosaic
Teddy to Panda
Wombat to Badger
Bead to Bee
Broom to Cat
Fox to Corgi
Left: Source video with non-linear semantic progression | Right: ReTime output with uniform semantic flow
SPF computed using SigLIP embeddings with k=30
For videos we cannot regenerate from scratch, we segment the SPF curve and synthesize a new clip for each segment, conditioned on the segment's boundary frames. Each clip's length is proportional to the semantic span of its segment.
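The proportional length allocation can be illustrated with a short sketch. It assumes the segmentation has already produced the SPF value at each segment boundary, and it splits a fixed frame budget with largest-remainder rounding so the clip lengths sum exactly to that budget. The function name allocate_clip_lengths and the rounding scheme are illustrative assumptions, not details taken from the method.

```python
import math


def allocate_clip_lengths(boundary_progress, total_frames):
    """Split total_frames across segments in proportion to semantic span.

    boundary_progress: increasing SPF values at the segment boundaries,
    for example [0.0, 0.1, 0.7, 1.0] for three segments.
    """
    spans = [b - a for a, b in zip(boundary_progress, boundary_progress[1:])]
    total_span = sum(spans)

    # Ideal (fractional) frame count for each segment.
    ideal = [total_frames * s / total_span for s in spans]
    lengths = [math.floor(x) for x in ideal]

    # Hand the frames lost to rounding to the largest fractional remainders.
    leftover = max(total_frames - sum(lengths), 0)
    by_remainder = sorted(
        range(len(spans)), key=lambda i: ideal[i] - lengths[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        lengths[i] += 1
    return lengths
```

A segment that covers most of the semantic change receives most of the frames, while a segment in which little changes is kept short.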
Input film clips (left) vs Wan2.2-generated videos (right)
Stranger Things - Vecna Transformation
Thor: Ragnarok - Loki's Disguise Reveal
Lucifer - Wings Reveal (Penthouse)
Lucifer - Wings Reveal (Office)
LTX-2 generates synchronized audio-video from keyframes. With ReTime, the audio aligns with smooth visual transitions rather than abrupt semantic jumps.
Teddy to Panda
Dragon to Phoenix
Towel to Elephant
Low Angle View
Overhead View
Behind View
Macro Sea Glass
Cheetah Acceleration
Panda to Red Panda
Fox to Raccoon
Chow Chow to Shiba Inu
Penguin to Puffin
Cupcake to Cat
Bread to Hamster
Cinnamon to Hedgehog
Glove to Octopus
Dumpling to Bunny
Muffin to Hedgehog
Donut to Hamster
Cactus to Hedgehog
Sunflower to Rose
Autumn to Winter
Acorn to Squirrel
Lemon to Bird
Corn to Chick
Carrot to Rocket
Traffic Cone to Fox
Trophy to Penguin
Candle to Firefly
Button to Ladybug
Soap to Seal
Doily to Spider
Battery Drain
Plate Level Up
If you find our work useful in your research, please consider citing:
@article{spf2026,
title = {Video Analysis and Generation via a Semantic Progress Function},
author = {Metzer, Gal and Polaczek, Sagi and Mahdavi-Amiri, Ali
and Giryes, Raja and Cohen-Or, Daniel},
journal = {TBD},
year = {2026}
}