Packing Input Frame Context in Next-Frame Prediction Models for Video Generation (github.com/lllyasviel)
1 point by vikrantrathore 15 hours ago | 2 comments

FramePack

    Diffuse thousands of frames at a full 30 fps with 13B models using 6 GB of laptop GPU memory.
    Fine-tune a 13B video model at batch size 64 on a single 8xA100/H100 node for personal or lab experiments.
    A personal RTX 4090 generates at 2.5 seconds/frame (unoptimized) or 1.5 seconds/frame (with TeaCache).
    No timestep distillation.
    Video diffusion, but feels like image diffusion.
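
To put the per-frame speeds quoted above in perspective, here is a rough back-of-the-envelope sketch. It assumes a constant cost per frame and a hypothetical 60-second clip at 30 fps; the function name `generation_time_seconds` is made up for illustration and is not part of FramePack.

```python
def generation_time_seconds(clip_seconds: float, fps: int, seconds_per_frame: float) -> float:
    """Estimate wall-clock seconds to generate a clip, assuming constant per-frame cost."""
    frames = clip_seconds * fps          # total number of frames in the clip
    return frames * seconds_per_frame    # total generation time in seconds


if __name__ == "__main__":
    # Per-frame speeds are the RTX 4090 figures quoted in the list above.
    for label, spf in [("unoptimized", 2.5), ("teacache", 1.5)]:
        total = generation_time_seconds(60, 30, spf)
        print(f"{label}: {total / 60:.0f} minutes for a 60-second clip")
```

Under those assumptions a one-minute clip is 1800 frames, so roughly 75 minutes unoptimized or 45 minutes with TeaCache. Real runs will likely differ with resolution, settings, and hardware.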
Paper link: https://lllyasviel.github.io/frame_pack_gitpage/pack.pdf

Website link: https://lllyasviel.github.io/frame_pack_gitpage/
