@stduhpf
Contributor

@stduhpf stduhpf commented Nov 3, 2025

https://github.com/madebyollin/taehv

Model weights: https://github.com/madebyollin/taehv/blob/main/taew2_1.pth

Only tested "successfully" for decoding Qwen-Image outputs; it still needs some work to support video models or encoding.

.\bin\Release\sd.exe --diffusion-model ..\..\ComfyUI\models\diffusion_models\qwen-image-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\qwen_image_vae.safetensors --qwen2vl ..\..\ComfyUI\models\text_encoders\Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 “一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。”' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 --tae ..\ComfyUI\models\vae_approx\taew2_1.pth

[image: output]

Result is a bit broken for now, maybe I missed some post-processing step.
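One post-processing step that may be worth checking is the output value range. This is an assumption, not something verified against this PR: as far as I can tell, the taehv reference decoder returns RGB values in roughly [0, 1], while full VAEs usually decode to [-1, 1]. The sketch below is plain Python with a hypothetical helper name (`to_uint8`) and made-up sample values; it only shows how the two conventions differ when converting to 8-bit pixels.

```python
def to_uint8(values, value_range=(0.0, 1.0)):
    """Map decoder outputs from value_range to 8-bit pixels, clamping outliers."""
    lo, hi = value_range
    pixels = []
    for v in values:
        t = (v - lo) / (hi - lo)   # normalize to [0, 1]
        t = min(1.0, max(0.0, t))  # clamp: approximate decoders can overshoot slightly
        pixels.append(int(round(t * 255.0)))
    return pixels


# Hypothetical decoder outputs that slightly overshoot [0, 1].
tae_out = [-0.02, 0.0, 0.5, 1.0, 1.03]

# Assumed-correct convention for a decoder that emits ~[0, 1].
print(to_uint8(tae_out, (0.0, 1.0)))

# Treating the same values as [-1, 1] lifts the blacks and flattens contrast.
print(to_uint8(tae_out, (-1.0, 1.0)))
```

If the decoded image looks washed out or over-contrasted rather than structurally wrong, a range mismatch like this is a plausible cause; otherwise the problem is likely elsewhere.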

Speedup and memory saving aren't that impressive yet, maybe it can be improved further?

@stduhpf
Contributor Author

stduhpf commented Nov 3, 2025

Sorry for the unrelated whitespace changes and the debug spam, will fix later

@stduhpf
Contributor Author

stduhpf commented Nov 3, 2025

Oh, a new version of the taew2.1 weights just came out, coincidentally.

| Old Weights | New Weights |
| --- | --- |
| [image: output - Copy (112)] | [image: output] |

@stduhpf
Contributor Author

stduhpf commented Nov 3, 2025

Now tae decoding for the outputs of Wan2.1 models (and Wan2.2 A14B) works in txt2img mode.

Video decoding runs as well, but the results are obviously incorrect (flashing lights warning).

If someone can see what I'm doing wrong when decoding videos, let me know.
