[SANA-Video] Adding 5s pre-trained 480p SANA-Video inference #12584

lawrence-cj · 2025-11-04T03:09:51Z

What does this PR do?

This PR adds SANA-Video, a new text/image-to-video model from NVIDIA.
Paper
Project
HF weight

import torch
from diffusers import SanaVideoPipeline, UniPCMultistepScheduler, DPMSolverMultistepScheduler
from diffusers import AutoencoderKLWan
from diffusers.utils import export_to_video


model_id = "Efficient-Large-Model/SANA-Video_2B_480p_diffusers"
pipe = SanaVideoPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
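# Optional: swap in a different scheduler.
# pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=8.0)
# pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config, flow_shift=8.0)
# Run the VAE in float32 and the text encoder in bfloat16.
pipe.vae.to(torch.float32)
pipe.text_encoder.to(torch.bfloat16)
pipe.to("cuda")
# Motion score appended to the prompt below.
model_score = 30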

prompt = "Evening, backlight, side lighting, soft light, high contrast, mid-shot, centered composition, clean solo shot, warm color. A young Caucasian man stands in a forest, golden light glimmers on his hair as sunlight filters through the leaves. He wears a light shirt, wind gently blowing his hair and collar, light dances across his face with his movements. The background is blurred, with dappled light and soft tree shadows in the distance. The camera focuses on his lifted gaze, clear and emotional."
negative_prompt = "A chaotic sequence with misshapen, deformed limbs in heavy motion blur, sudden disappearance, jump cuts, jerky movements, rapid shot changes, frames out of sync, inconsistent character shapes, temporal artifacts, jitter, and ghosting effects, creating a disorienting visual experience."
motion_prompt = f" motion score: {model_score}."
prompt = prompt + motion_prompt

video = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=480,
    width=832,
    frames=81,
    guidance_scale=6,
    num_inference_steps=50,
    generator=torch.Generator(device="cuda").manual_seed(42),
).frames[0]

export_to_video(video, "sana_video.mp4", fps=16)
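
As a quick sanity check on the "5s" in the PR title, `frames=81` exported at `fps=16` works out to roughly five seconds of video. A minimal sketch of that arithmetic follows; the helper name `clip_duration_seconds` is made up for illustration and is not part of diffusers.

```python
def clip_duration_seconds(num_frames: int, fps: int) -> float:
    """Return the playback length of a clip with `num_frames` frames at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


# Values taken from the example above: frames=81, exported at fps=16.
duration = clip_duration_seconds(81, 16)
print(f"{duration:.2f} s")  # 81 / 16 = 5.0625, printed as "5.06 s"
```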

Results:

sana_v2.mp4

dg845 · 2025-11-04T23:16:11Z

src/diffusers/video_processor.py

+        return int(default_hw[0]), int(default_hw[1])
+
+    @staticmethod
+    def resize_and_crop_tensor(samples: torch.Tensor, new_width: int, new_height: int) -> torch.Tensor:


I think exposing an interface like VaeImageProcessor.resize:

diffusers/src/diffusers/image_processor.py

Lines 468 to 474 in dcfb18a

def resize(
    self,
    image: Union[PIL.Image.Image, np.ndarray, torch.Tensor],
    height: int,
    width: int,
    resize_mode: str = "default",  # "default", "fill", "crop"
) -> Union[PIL.Image.Image, np.ndarray, torch.Tensor]:

would be more robust, since different video preprocessing pipelines will probably make different choices here. Not blocking; on the diffusers side we can follow up to support more video pipelines here.

OK, I'll let you guys help finish this part. Thanks!!
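
To illustrate the `resize_mode` idea from the review comment above, here is a rough sketch of how the three modes could translate into per-frame geometry. The function `plan_resize` is hypothetical and not part of diffusers. It assumes "default" stretches to the target size, "fill" keeps the aspect ratio and pads, and "crop" keeps the aspect ratio and center-crops, which is my reading of the modes documented for `VaeImageProcessor.resize`.

```python
from typing import Tuple


def plan_resize(src_h: int, src_w: int, height: int, width: int,
                resize_mode: str = "default") -> Tuple[int, int, int, int]:
    """Return (scaled_h, scaled_w, offset_y, offset_x) for one frame.

    Positive offsets mean "crop this many pixels from the top/left";
    negative offsets mean "pad this many pixels on the top/left".
    """
    if resize_mode == "default":
        # Stretch straight to the target; the aspect ratio may change.
        return height, width, 0, 0
    if resize_mode not in ("fill", "crop"):
        raise ValueError(f"unsupported resize_mode: {resize_mode!r}")

    scale_h, scale_w = height / src_h, width / src_w
    # "crop" scales until the frame covers the target, "fill" until it fits inside.
    scale = max(scale_h, scale_w) if resize_mode == "crop" else min(scale_h, scale_w)
    scaled_h, scaled_w = round(src_h * scale), round(src_w * scale)
    # Centre the scaled frame on the target canvas.
    return scaled_h, scaled_w, (scaled_h - height) // 2, (scaled_w - width) // 2


# A 720x1280 frame mapped to the 480x832 resolution used in the PR example.
print(plan_resize(720, 1280, 480, 832, "crop"))  # (480, 853, 0, 10)
print(plan_resize(720, 1280, 480, 832, "fill"))  # (468, 832, -6, 0)
```

Because the plan only depends on frame sizes, the same result could be applied to every frame of a video tensor, whichever library does the actual interpolation.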

dg845

Thanks for the PR! Would you be able to add tests and docs? We can help with both, especially the tests, but for the docs it may be harder for us as we are not as familiar with the intricacies of the model.

Documentation example (Wan): https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/pipelines/wan.md
Model tests example (WanTransformer3DModel): https://github.com/huggingface/diffusers/blob/main/tests/models/transformers/test_models_transformer_wan.py
Pipeline tests example (WanPipeline): https://github.com/huggingface/diffusers/blob/main/tests/pipelines/wan/test_wan.py

lawrence-cj · 2025-11-05T05:40:10Z

I have added the markdown docs we need.

THANKS so much for your support!
@yiyixuxu @dg845

lawrence-cj · 2025-11-05T05:49:36Z

I also added two test cases for your reference. Please feel free to modify them.

dg845

Thanks for the follow up changes! I have made some suggestions that should help the Sana Video pipeline tests pass.

Sorry for all the small change requests, but could you also do the following?

Can you run the following to make sure that the CI code quality check is green?

make style
make quality
make fix-copies

Can you add the new Sana Video markdown docs to docs/source/en/_toctree.yml? For reference, here is how the Sana pipeline docs were added:

diffusers/docs/source/en/_toctree.yml

Lines 562 to 563 in dcfb18a

- local: api/pipelines/sana
  title: Sana

This change will help the docs build correctly.

lawrence-cj · 2025-11-05T09:14:38Z

> Thanks for the follow up changes! I have made some suggestions that should help the Sana Video pipeline tests pass.
>
> Sorry for all the small change requests, but could you also do the following?
>
> Can you run the following to make sure that the CI code quality check is green?
> make style
> make quality
> make fix-copies
> Can you add the new Sana Video markdown docs to docs/source/en/_toctree.yml? For reference, here is how the Sana pipeline docs were added:
>
> diffusers/docs/source/en/_toctree.yml
>
> Lines 562 to 563 in dcfb18a
>
> - local: api/pipelines/sana
>   title: Sana
>
> This change will help the docs build correctly.

Done! Let's test it.

dg845 · 2025-11-05T20:19:00Z

docs/source/en/_toctree.yml

+      - local: api/models/sana_video_transformer3d
+        title: SanaVideoTransformer3DModel


This will cause an error when building the docs since the api/models/sana_video_transformer3d file doesn't currently exist. Could you add a markdown doc for the transformer as well? For reference, here is the documentation for SanaTransformer2DModel: https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/models/sana_transformer2d.md
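
One way to catch this kind of missing-file problem before the docs build is to scan the toctree for `local:` entries and check that each has a markdown file. The sketch below is a standalone helper, not a diffusers or doc-builder tool. It assumes each `local:` value maps to `<docs_root>/<value>.md`, as in the `sana_transformer2d.md` example linked above, and the name `missing_toctree_docs` is made up for illustration.

```python
import re
from pathlib import Path
from typing import List

# Matches lines such as "- local: api/pipelines/sana" or "  local: api/models/foo".
LOCAL_ENTRY = re.compile(r"^\s*(?:-\s*)?local:\s*(\S+)\s*$")


def missing_toctree_docs(toctree_path: Path, docs_root: Path) -> List[str]:
    """List `local:` entries in a toctree file that have no matching .md file."""
    missing = []
    for line in toctree_path.read_text(encoding="utf-8").splitlines():
        match = LOCAL_ENTRY.match(line)
        if match and not (docs_root / f"{match.group(1)}.md").is_file():
            missing.append(match.group(1))
    return missing
```

From the repository root it could be called as `missing_toctree_docs(Path("docs/source/en/_toctree.yml"), Path("docs/source/en"))`; an empty list would mean every entry has a file under that assumption.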

lawrence-cj added 6 commits November 3, 2025 17:53

1. add SanaVideoTransformer3DModel in transformer_sana_video.py

13e516c

2. add `SanaVideoPipeline` in pipeline_sana_video.py 3. add all code we need for import `SanaVideoPipeline`

add a sample about how to use sana-video;

5eb5354

code update;

c6d7876

update hf model path;

d67ab2a

update code;

a5f19e0

sana-video can run now;

c15ae23

sayakpaul requested a review from dg845 November 4, 2025 03:16

lawrence-cj added 4 commits November 3, 2025 21:12

1. add aspect ratio in sana-video-pipeline;

ee79af3

2. add reshape function in sana-video-processor; 3. fix convert pth to safetensor bugs;

Merge branch 'main' into feat/sana-video

f06a93d

default to use use_resolution_binning;

49557c1

make style;

857ca30

lawrence-cj mentioned this pull request Nov 4, 2025

SANA-Video PR is under construction NVlabs/Sana#321

Merged

remove unused code;

3ed7000

dg845 reviewed Nov 4, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

yiyixuxu reviewed Nov 5, 2025

src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/models/transformers/transformer_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

lawrence-cj and others added 4 commits November 5, 2025 11:59

Update src/diffusers/models/transformers/transformer_sana_video.py

439bf58

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update src/diffusers/models/transformers/transformer_sana_video.py

de4cf31

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update src/diffusers/models/transformers/transformer_sana_video.py

fe73287

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update src/diffusers/pipelines/sana/pipeline_sana_video.py

118677a

Co-authored-by: YiYi Xu <yixu310@gmail.com>

1. add sana-video markdown;

f2a9d0b

2. fix typos;

add two test case for sana-video (need check)

d98f93c

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (outdated, resolved)

fix text-encoder in test-sana-video;

4569d0b

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (resolved)

lawrence-cj commented Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (resolved)

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (outdated, resolved)

lawrence-cj and others added 2 commits November 5, 2025 16:04

Update tests/pipelines/sana/test_sana_video.py

1379391

Update tests/pipelines/sana/test_sana_video.py

b359240

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (outdated, resolved)

Update tests/pipelines/sana/test_sana_video.py

7256023

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

tests/pipelines/sana/test_sana_video.py (resolved)

dg845 approved these changes Nov 5, 2025

dg845 reviewed Nov 5, 2025

src/diffusers/pipelines/sana/pipeline_sana_video.py (outdated, resolved)

dg845 reviewed Nov 5, 2025

src/diffusers/video_processor.py (outdated, resolved)

lawrence-cj and others added 7 commits November 5, 2025 17:07

Update tests/pipelines/sana/test_sana_video.py

25d1a4c

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update tests/pipelines/sana/test_sana_video.py

a9c16eb

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update tests/pipelines/sana/test_sana_video.py

8a27d58

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update src/diffusers/pipelines/sana/pipeline_sana_video.py

4c25427

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

Update src/diffusers/video_processor.py

31c9fa5

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

make style

0ed7eee

make quality make fix-copies

toctree yaml update;

e31f91b

dg845 reviewed Nov 5, 2025

lawrence-cj added 2 commits November 5, 2025 18:39

add sana-video-transformer3d markdown;

cb31fc2

Merge branch 'main' into feat/sana-video

2b8c3e3

4 participants
