PUSA V1.0
Revolutionary Video Generation with Vectorized Timestep Adaptation
BREAKTHROUGH PERFORMANCE: Surpassing Wan-I2V on Vbench-I2V with only $500 training cost!
4 Powerful Modes: I2V • Multi-Frame • V2V • T2V
Image-to-Video Generation (I2V)
Generate videos from a single starting image. Perfect for bringing static images to life with natural motion and animation.
Input Image
Generation Parameters
Text Prompts
Output
Demo Examples
Prompt: "A wide-angle shot shows a serene monk meditating with gentle swaying and peaceful movement..."
- Noise Multiplier: 0.2
- LoRA Alpha: 1.4
Prompt: "A female climber rock climbing on an asteroid in deep space with dynamic movement..."
- Noise Multiplier: 0.3
- LoRA Alpha: 1.2
Demo Gallery - See Pusa V1.0 in Action!
Explore real examples showcasing the power and versatility of Pusa V1.0 across different generation modes.
Note: Place demo files in the ./demos/ and ./assets/ directories so that they display properly.
Image-to-Video Generation Example
Input Image
Settings Used:
- Prompt: "A wide-angle shot shows a serene monk meditating perched a top of the letter E of a pile of weathered rocks that vertically spell out 'ZEN'. The rock formation is perched atop a misty mountain peak at sunrise..."
- Conditioning Position: 0 (first frame)
- Noise Multiplier: 0.2
- LoRA Alpha: 1.4
- Inference Steps: 30
- File Path: ./demos/input_image.jpg
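For reproducibility, the settings listed above could be recorded in a small structure and saved next to the generated video. This is only a sketch: the class name `I2VDemoSettings` and its field names are hypothetical and are not part of any Pusa API, and the prompt is kept truncated exactly as it appears on this page.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class I2VDemoSettings:
    """Hypothetical container for the demo settings listed on this page."""

    prompt: str
    image_path: str
    cond_position: int        # 0 means the first frame
    noise_multiplier: float
    lora_alpha: float
    num_inference_steps: int


settings = I2VDemoSettings(
    # The page elides the rest of the prompt, so the "..." is kept as-is.
    prompt=(
        "A wide-angle shot shows a serene monk meditating perched a top of "
        "the letter E of a pile of weathered rocks that vertically spell out "
        "'ZEN'. The rock formation is perched atop a misty mountain peak at "
        "sunrise..."
    ),
    image_path="./demos/input_image.jpg",
    cond_position=0,
    noise_multiplier=0.2,
    lora_alpha=1.4,
    num_inference_steps=30,
)

# Serialize to JSON so the exact settings can be stored with the output.
settings_json = json.dumps(asdict(settings), indent=2)
print(settings_json)
```

Keeping the structure frozen avoids accidentally changing a value between the run and the moment the settings are written out.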
Generated Video
About Pusa V1.0
Pusa V1.0 uses vectorized timestep adaptation (VTA) to provide fine-grained temporal control within a unified video diffusion framework. The model is highly training-efficient: it surpasses Wan-I2V on Vbench-I2V with only $500 in training cost and 4K training samples.
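As a rough illustration of what a vectorized timestep means, the sketch below gives every latent frame its own timestep instead of sharing one scalar across the whole clip. It assumes 21 latent frames (the figure quoted in the tips on this page) and assumes that a conditioning frame's timestep is the base timestep scaled by its noise multiplier. The function name `build_timestep_vector` and that scaling rule are illustrative assumptions and are not taken from the Pusa code base.

```python
from typing import Dict, List, Optional


def build_timestep_vector(
    base_timestep: float,
    num_latent_frames: int = 21,
    cond_noise: Optional[Dict[int, float]] = None,
) -> List[float]:
    """Return one timestep per latent frame.

    cond_noise maps a conditioning frame position to its noise multiplier.
    Assumption: a conditioned frame gets base_timestep * multiplier, so a
    multiplier of 0.0 keeps the frame clean and 1.0 treats it like any
    other frame.
    """
    cond_noise = cond_noise or {}
    timesteps = [base_timestep] * num_latent_frames
    for position, multiplier in cond_noise.items():
        if not 0 <= position < num_latent_frames:
            raise ValueError(f"position {position} is outside the latent clip")
        if not 0.0 <= multiplier <= 1.0:
            raise ValueError(f"multiplier {multiplier} is outside [0, 1]")
        timesteps[position] = base_timestep * multiplier
    return timesteps


# Image-to-video style setup: condition on the first frame only.
i2v_timesteps = build_timestep_vector(1000.0, cond_noise={0: 0.2})

# Start/end-frame style setup: condition on the first and last frames.
start_end_timesteps = build_timestep_vector(1000.0, cond_noise={0: 0.2, 20: 0.3})
```

Under these assumptions, the different generation modes differ only in which positions appear in `cond_noise`; an empty mapping corresponds to unconditioned text-to-video, where every frame shares the base timestep.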
Pro Tips for Best Results
LoRA Alpha: Use values between 1 and 2 for a good balance between quality and consistency
Noise Multipliers: Use lower values (0.0-0.3) for faithful conditioning and higher values (0.4-1.0) for more variation
Conditioning Positions: Frame 0 is the first frame and frame 20 is the last frame in the 21-frame latent space
Prompts: Be descriptive and specific for better results
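The ranges in the tips above can be turned into a few quick checks to run before starting a generation. This is a minimal sketch: the helper names are hypothetical, the thresholds are copied from the tips rather than from the model code, and treating values outside 0.0-1.0 as invalid is an assumption.

```python
def lora_alpha_in_recommended_range(alpha: float) -> bool:
    """True when alpha lies in the 1-2 range suggested by the tips."""
    return 1.0 <= alpha <= 2.0


def describe_noise_multiplier(value: float) -> str:
    """Label a noise multiplier using the ranges quoted in the tips."""
    if not 0.0 <= value <= 1.0:
        # Assumption: the tips only describe values between 0.0 and 1.0.
        raise ValueError(f"noise multiplier {value} is outside [0, 1]")
    if value <= 0.3:
        return "faithful"
    if value >= 0.4:
        return "varied"
    # The tips do not classify values strictly between 0.3 and 0.4.
    return "in-between"


def validate_cond_position(position: int, num_latent_frames: int = 21) -> int:
    """Check that a conditioning position falls inside the latent clip."""
    if not 0 <= position < num_latent_frames:
        raise ValueError(
            f"position {position} is outside 0..{num_latent_frames - 1}"
        )
    return position


# Example using the demo settings shown on this page.
print(lora_alpha_in_recommended_range(1.4))
print(describe_noise_multiplier(0.2))
print(validate_cond_position(0))
```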
Important Links
Project Page - Official project website
Technical Report - Detailed research paper
Model on HuggingFace - Download models
Training Dataset - Training data
Made with ❤️ for the AI Community
Experience the future of video generation with Pusa V1.0