Kling 3.0 adds multi-shot generation, native audio, and 15-second videos
Kling 3.0 is a major release. Its Omni Model architecture processes text, image, and audio together, so a single generation can produce a coherent multi-shot video with synchronized dialogue.
What changed for prompting
Multi-shot prompting: describe up to 6 labeled shots in one prompt. No more stitching clips together manually.
Native audio: add an AUDIO section to every prompt. Characters can speak with specific voice tones using [Character A: description, voice] labels.
Motion intensity: numerical values from 0.1 to 1.0 now produce predictable results. Use 0.4 for subtle motion and 0.9 for high-energy motion.
Temporal flow is required: describe the beginning, middle, and end of the action. The model follows a narrative arc.
Named light sources produce better results. "Neon signs", "candlelight", and "golden hour" outperform "dramatic lighting".
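As a rough illustration, the conventions above can be combined into one prompt with a small helper. This is a minimal sketch, not Kling's or HonePrompt's actual format: the `Shot`, `Speaker`, and `build_prompt` names, the "Shot N:" labels, and the "Motion intensity:" wording are all hypothetical. Only the 6-shot limit, the 0.1 to 1.0 range, the AUDIO section, and the [Character A: description, voice] label shape come from the notes above.

```python
from dataclasses import dataclass

MAX_SHOTS = 6               # from the notes above: up to 6 labeled shots per prompt
MOTION_MIN, MOTION_MAX = 0.1, 1.0   # from the notes above: motion intensity range


@dataclass
class Shot:
    # Hypothetical container: what happens in the shot, ideally with a named
    # light source ("candlelight") rather than a generic one ("dramatic lighting").
    description: str
    motion: float = 0.4     # 0.4 is the "subtle" value mentioned above


@dataclass
class Speaker:
    # Hypothetical container mirroring the [Character A: description, voice] label.
    label: str
    description: str
    voice: str
    line: str


def build_prompt(shots, speakers):
    """Assemble a multi-shot prompt string. The text layout is an assumption."""
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")

    lines = []
    for number, shot in enumerate(shots, start=1):
        if not MOTION_MIN <= shot.motion <= MOTION_MAX:
            raise ValueError(f"shot {number}: motion {shot.motion} is outside 0.1 to 1.0")
        lines.append(f"Shot {number}: {shot.description} Motion intensity: {shot.motion}")

    # The notes above say every prompt should carry an AUDIO section.
    lines.append("AUDIO:")
    for speaker in speakers:
        lines.append(
            f'[{speaker.label}: {speaker.description}, {speaker.voice}] "{speaker.line}"'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Three shots give the prompt a beginning, middle, and end.
    shots = [
        Shot("A detective enters a bar lit by neon signs.", 0.4),
        Shot("She spots the informant and crosses the room by candlelight.", 0.6),
        Shot("The informant bolts through the back door into golden hour light.", 0.9),
    ]
    speakers = [Speaker("Character A", "tired detective", "low and dry", "You're late.")]
    print(build_prompt(shots, speakers))
```

Validating the shot count and motion range before sending the prompt catches mistakes locally instead of spending a generation on them.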
HonePrompt updated: Kling system prompt fully rebuilt for Kling 3.0.
Hone a Kling 3.0 prompt →