
Deploying ExVideo, the Latest Post-Tuning Technique for Video Synthesis Models

ExVideo is a new post-tuning technique for video synthesis models, developed jointly by researchers at East China Normal University and Alibaba.

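ExVideo proposes a post-tuning strategy that avoids large-scale retraining of the whole model. By fine-tuning only the temporal components of the model, it substantially extends the length of the clips the model can generate while keeping compute requirements low: about 1.5k GPU hours of training are enough to raise the number of generated frames to 5 times that of the original model.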
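As a rough illustration of "fine-tune only the temporal components", the sketch below marks parameters as trainable based on their names. It is not the ExVideo training code: the keyword `"temporal"` and the layer names are hypothetical, and the helper works on any `(name, param)` pairs whose params expose a writable `requires_grad` attribute, which is how `model.named_parameters()` behaves in PyTorch.

```python
from types import SimpleNamespace


def freeze_except(named_params, keywords=("temporal",)):
    """Leave a parameter trainable only if its name contains one of `keywords`.

    `named_params` is any iterable of (name, param) pairs, for example
    `model.named_parameters()` in PyTorch. Returns the names left trainable.
    """
    trainable = []
    for name, param in named_params:
        keep = any(key in name for key in keywords)
        param.requires_grad = keep
        if keep:
            trainable.append(name)
    return trainable


# Stand-in parameters so the sketch runs without a real model.
# The layer names below are hypothetical, not taken from ExVideo.
params = {
    "unet.spatial_attn.weight": SimpleNamespace(requires_grad=True),
    "unet.temporal_attn.weight": SimpleNamespace(requires_grad=True),
    "unet.temporal_conv.bias": SimpleNamespace(requires_grad=True),
    "vae.decoder.weight": SimpleNamespace(requires_grad=True),
}

trainable_names = freeze_except(params.items())
print(trainable_names)
```

With a real PyTorch model, the same function could be called as `freeze_except(model.named_parameters())`, after checking which substrings actually identify the temporal layers in that model.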

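While extending video length, ExVideo does not sacrifice the model's generalization ability: the generated videos remain diverse in style and resolution.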

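The technique also relies on several engineering optimizations, including parameter freezing, mixed-precision training, gradient checkpointing, and Flash Attention, and it uses the DeepSpeed library to shard optimizer states and gradients, so that training stays efficient on limited compute.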
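In DeepSpeed, sharding optimizer states and gradients corresponds to ZeRO stage 2. The following is a minimal sketch of what such a configuration might look like, not the configuration the authors used: the batch size, accumulation steps, and the choice of fp16 rather than bf16 are assumptions.

```python
import json

# Hypothetical DeepSpeed configuration sketch; the values are assumptions.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,  # assumed value
    "gradient_accumulation_steps": 1,     # assumed value
    "fp16": {"enabled": True},            # mixed-precision training (assumed fp16)
    "zero_optimization": {
        "stage": 2,  # ZeRO stage 2 partitions optimizer states and gradients
    },
}

# DeepSpeed accepts this structure as a JSON file or as a Python dict.
ds_config_json = json.dumps(ds_config, indent=2)
print(ds_config_json)
```

Parameter freezing, gradient checkpointing, and Flash Attention are configured in the model code rather than in this file, so they do not appear here.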

GitHub project: https://github.com/modelscope/DiffSynth-Studio.git

I. Environment Setup

1. Python environment

Python 3.10 or later is recommended.
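A quick way to confirm the interpreter meets this recommendation before installing anything is a small version check. The function name `python_is_supported` is introduced here for illustration only.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version recommended in this article


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if not python_is_supported():
    found = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Warning: Python {found} found; 3.10 or later is recommended")
```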

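2. Installing pip packages

The second command below assumes it is run from a checkout of the DiffSynth-Studio repository, which provides requirements.txt.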

pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
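After installation, it can help to confirm which versions actually ended up in the environment, since a later `pip install -r` may replace a pinned package. The sketch below uses only the standard library; the helper name `installed_versions` is an illustration, and missing packages are reported rather than raising an error.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages pinned by the install command in this section.
report = installed_versions(["torch", "torchvision", "torchaudio"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```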

3. Downloading the ExVideo-SVD-128f model

git lfs install

git clone https://www.modelscope.cn/ECNU-CILab/ExVideo-SVD-128f-v1.git

4. Downloading the HunyuanDiT model

git lfs install

git clone https://www.modelscope.cn/api/v1/models/modelscope/HunyuanDiT.git

5. Downloading the stable-video-diffusion model

git lfs install

git clone https://www.modelscope.cn/api/v1/models/AI-ModelScope/stable-video-diffusion-img2vid-xt.git
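The test script later in this article loads weights from fixed paths under `models/`. The source does not say whether the manually cloned repositories need to be moved to those paths, so the following sketch only checks which of the expected files are present relative to the working directory. The helper name `missing_files` is introduced here for illustration.

```python
from pathlib import Path

# Weight files referenced by the test script in this article.
EXPECTED_FILES = [
    "models/HunyuanDiT/t2i/clip_text_encoder/pytorch_model.bin",
    "models/HunyuanDiT/t2i/mt5/pytorch_model.bin",
    "models/HunyuanDiT/t2i/model/pytorch_model_ema.pt",
    "models/HunyuanDiT/t2i/sdxl-vae-fp16-fix/diffusion_pytorch_model.bin",
    "models/stable_video_diffusion/svd_xt.safetensors",
    "models/stable_video_diffusion/model.fp16.safetensors",
]


def missing_files(paths, root="."):
    """Return the entries of `paths` that are not regular files under `root`."""
    root = Path(root)
    return [p for p in paths if not (root / p).is_file()]


missing = missing_files(EXPECTED_FILES)
if missing:
    print("Missing model files:")
    for path in missing:
        print(f"  {path}")
else:
    print("All expected model files are present.")
```

Any file reported as missing would have to be downloaded or moved into place before running the script, unless the script's own `download_models` call fetches it.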

II. Functional Testing

1. Running the test

(1) Calling from Python code

import os
import torch
from diffsynth import save_video, ModelManager, SVDVideoPipeline, HunyuanDiTImagePipeline, download_models


def generate_image():
    # Set environment variables for better performance
    os.environ["TOKENIZERS_PARALLELISM"] = "True"

    # Download necessary models
    download_models(["HunyuanDiT"])

    # Initialize ModelManager with required models
    model_manager = ModelManager(
        torch_dtype=torch.float16,
        device="cuda",
        file_path_list=[
            "models/HunyuanDiT/t2i/clip_text_encoder/pytorch_model.bin",
            "models/HunyuanDiT/t2i/mt5/pytorch_model.bin",
            "models/HunyuanDiT/t2i/model/pytorch_model_ema.pt",
            "models/HunyuanDiT/t2i/sdxl-vae-fp16-fix/diffusion_pytorch_model.bin",
        ],
    )

    # Create image generation pipeline
    pipe = HunyuanDiTImagePipeline.from_model_manager(model_manager)

    # Generate and return the image
    torch.manual_seed(0)
    image = pipe(
        prompt="sunset time lapse at the beach with moving clouds and colors in the sky",
        # Chinese negative prompt, kept verbatim: wrong eyes, bad faces, disfigurement,
        # bad art, deformation, extra limbs, blurry colors, blur, duplicates, morbid, mutilated
        negative_prompt="错误的眼睛,糟糕的人脸,毁容,糟糕的艺术,变形,多余的肢体,模糊的颜色,模糊,重复,病态,残缺,",
        num_inference_steps=50, height=1024, width=1024,
    )

    # Move model to CPU to free up GPU memory
    model_manager.to("cpu")
    return image


def generate_video(image):
    # Download necessary models
    download_models(["stable-video-diffusion-img2vid-xt", "ExVideo-SVD-128f-v1"])

    # Initialize ModelManager with required models
    model_manager = ModelManager(
        torch_dtype=torch.float16,
        device="cuda",
        file_path_list=[
            "models/stable_video_diffusion/svd_xt.safetensors",
            "models/stable_video_diffusion/model.fp16.safetensors",
        ],
    )

    # Create video generation pipeline
    pipe = SVDVideoPipeline.from_model_manager(model_manager)

    # Generate and return the video
    torch.manual_seed(1)
    video = pipe(
        input_image=image.resize((512, 512)),
        num_frames=128, fps=30, height=512, width=512,
        motion_bucket_id=127,
        num_inference_steps=50,
        min_cfg_scale=2, max_cfg_scale=2, contrast_enhance_scale=1.2,
    )

    # Move model to CPU to free up GPU memory
    model_manager.to("cpu")
    return video


def upscale_video(image, video):
    # Download necessary models
    download_models(["stable-video-diffusion-img2vid-xt", "ExVideo-SVD-128f-v1"])

    # Initialize ModelManager with required models
    model_manager = ModelManager(
        torch_dtype=torch.float16,
        device="cuda",
        file_path_list=[
            "models/stable_video_diffusion/svd_xt.safetensors",
            "models/stable_video_diffusion/model.fp16.safetensors",
        ],
    )

    # Create video upscaling pipeline
    pipe = SVDVideoPipeline.from_model_manager(model_manager)

    # Generate and return the upscaled video
    torch.manual_seed(2)
    video = pipe(
        input_image=image.resize((1024, 1024)),
        input_video=[frame.resize((1024, 1024)) for frame in video],
        denoising_strength=0.5,
        num_frames=128, fps=30, height=1024, width=1024,
        motion_bucket_id=127,
        num_inference_steps=25,
        min_cfg_scale=2, max_cfg_scale=2, contrast_enhance_scale=1.2,
    )

    # Move model to CPU to free up GPU memory
    model_manager.to("cpu")
    return video


# Main workflow
if __name__ == '__main__':
    # Generate the initial image
    image = generate_image()
    image.save("image.png")

    # Generate a video based on the initial image
    video = generate_video(image)
    save_video(video, "video_512.mp4", fps=30)

    # Optionally upscale the video to higher resolution
    upscaled_video = upscale_video(image, video)
    save_video(upscaled_video, "video_1024.mp4", fps=30)
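To put the script's settings in perspective, the short calculation below derives the clip length from `num_frames=128` and `fps=30`, and the per-frame pixel increase when going from 512x512 to 1024x1024 in the upscaling step. The helper names are introduced here for illustration, and the pixel ratio is only a rough indicator of the extra work per frame, not a measured cost.

```python
def clip_seconds(num_frames, fps):
    """Duration in seconds of a clip with `num_frames` frames played at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def pixel_ratio(src_size, dst_size):
    """Ratio of pixels per frame between two (width, height) sizes."""
    return (dst_size[0] * dst_size[1]) / (src_size[0] * src_size[1])


duration = clip_seconds(128, 30)
ratio = pixel_ratio((512, 512), (1024, 1024))

print(f"Clip duration: {duration:.2f} s")
print(f"Pixels per frame after upscaling: {ratio:.0f}x")
```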

 

