
Kling 3.0 Prompting Guide: Multi-Shot, Native Audio, and Cinematic Control

Kling 3.0 (V3) is a fundamentally different model from its predecessors. It thinks in shots, not clips. It generates native audio alongside video. It can produce up to 6 labeled shots in a single pass. Most guides are still written for Kling 1.6. This isn't one of them.

What makes Kling 3.0 different

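Three capabilities separate Kling 3.0 from every prior version: it thinks in shots rather than clips, it generates native audio alongside the video, and it can produce up to 6 labeled shots in a single pass.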

This changes how you should write prompts entirely. You're no longer describing a visual moment. You're directing a scene.

The five-layer prompt structure

Kling 3.0 responds to structure. Use this order every time:

1. Scene

Establish the environment first. Be specific. Not "a room" — "a rain-slicked Tokyo alleyway at 2am, neon signs reflecting in the wet pavement, steam rising from a grate." The scene is the container everything else lives in.

2. Characters

Define who is in the scene with full visual descriptions. Use consistent labels — these labels are how you reference characters in dialogue later:

[Character A: exhausted detective in a rumpled navy suit, mid-50s, dark circles under his eyes]

3. Action

Describe what physically happens, sequentially. Beginning to end. Kling 3.0 understands temporal flow — "then," "as," "until" are meaningful. Include motion endpoints to prevent infinite loops:

Good: "walks to the window, pauses, then turns slowly to face the camera"
Bad: "walks around the room" (no endpoint — will loop or distort)

4. Camera

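Always motivate camera movement. Ask why the camera is moving before you describe how. Kling 3.0 understands standard camera language, such as the locked-off static shot, slow dolly push in, and rack focus used in the multi-shot example later in this guide.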

5. Audio

When native audio is enabled, structure dialogue with character labels:

[Character A, voice tone]: "Exact words spoken."

Then describe the sound environment: music genre and tempo, ambient sounds, room acoustics.
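The five layers can be assembled programmatically so the order never drifts. The following is a minimal sketch, not an official Kling tool: the class names (`ScenePrompt`, `Character`, `DialogueLine`) and the one-layer-per-line layout are assumptions made for illustration. It only builds the prompt text and checks that every dialogue label was defined in the Characters layer.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    label: str        # e.g. "Character A"; reused verbatim in dialogue
    description: str  # full visual description


@dataclass
class DialogueLine:
    label: str  # must match a Character.label
    tone: str   # voice tone, e.g. "low quiet voice"
    words: str  # exact words spoken


@dataclass
class ScenePrompt:
    """Hypothetical container for the five layers, in the guide's order."""
    scene: str
    characters: list[Character]
    action: str
    camera: str
    sound: str = ""
    dialogue: list[DialogueLine] = field(default_factory=list)

    def render(self) -> str:
        # Reject dialogue that references a label never defined as a character.
        known = {c.label for c in self.characters}
        for line in self.dialogue:
            if line.label not in known:
                raise ValueError(f"undefined character label: {line.label}")

        parts = [self.scene]                                    # 1. Scene
        parts += [f"[{c.label}: {c.description}]"
                  for c in self.characters]                     # 2. Characters
        parts += [self.action, self.camera]                     # 3. Action, 4. Camera
        parts += [f'[{d.label}, {d.tone}]: "{d.words}"'
                  for d in self.dialogue]                       # 5. Audio: dialogue
        if self.sound:
            parts.append(self.sound)                            # 5. Audio: environment
        return "\n".join(parts)
```

Calling `render()` on a filled-in `ScenePrompt` returns plain text you can paste into the prompt field. A mismatched label fails early instead of silently breaking character sync.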

Multi-shot format

For narrative sequences, label each shot explicitly:

Shot 1 (0-5s): Wide establishing shot — rain-slicked alleyway, neon reflections. Camera locked off static.
Shot 2 (5-10s): Medium close-up — [Character A] steps into frame, collar up, scanning the space. Slow dolly push in.
Shot 3 (10-15s): Tight close-up — his eyes land on something off-camera. Rack focus to background figure in the mist.
AUDIO: Rain hitting pavement, distant traffic hum, no music.
[Character A, low quiet voice]: "I thought you were dead."

Key rule: Define characters in the Shot descriptions first, then reference them by the same label in the AUDIO section. Inconsistent names break the character sync.
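The key rule above lends itself to a quick pre-flight check. The following is a rough sketch under stated assumptions: the function name `render_shots` is hypothetical, labels are assumed to follow the `[Character X]` pattern from this guide's examples, and the requirement that shot times be contiguous is an assumption of this sketch rather than a documented Kling constraint. The 6-shot cap comes from the limit stated at the top of this guide.

```python
import re

MAX_SHOTS = 6  # the guide states up to 6 labeled shots per pass

# Matches "[Character A]" and "[Character A, low quiet voice]" alike.
LABEL = re.compile(r"\[(Character [A-Z])")


def render_shots(shots, ambient, dialogue):
    """Build a multi-shot prompt.

    shots:    list of (start_s, end_s, description) tuples
    ambient:  description of the sound environment
    dialogue: list of lines such as '[Character A, tone]: "Words."'
    """
    if not 1 <= len(shots) <= MAX_SHOTS:
        raise ValueError(f"expected 1 to {MAX_SHOTS} shots, got {len(shots)}")

    defined = set()
    lines = []
    prev_end = None
    for i, (start, end, text) in enumerate(shots, start=1):
        if end <= start:
            raise ValueError(f"shot {i} has a non-positive duration")
        # Assumption of this sketch: each shot starts where the last one ended.
        if prev_end is not None and start != prev_end:
            raise ValueError(f"shot {i} does not start where shot {i - 1} ends")
        prev_end = end
        defined.update(LABEL.findall(text))
        lines.append(f"Shot {i} ({start}-{end}s): {text}")

    # Every label spoken in AUDIO must first appear in a shot description.
    missing = set(LABEL.findall(" ".join(dialogue))) - defined
    if missing:
        raise ValueError(f"labels used in AUDIO but never defined: {sorted(missing)}")

    lines.append(f"AUDIO: {ambient}")
    lines.extend(dialogue)
    return "\n".join(lines)
```

Running the check before submitting catches the most common sync failure, a renamed or misspelled character label, without spending a generation on it.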

Motion intensity guide

| Value | What it looks like | Use for |
| --- | --- | --- |
| 0.1–0.3 | Barely perceptible motion | Breathing, micro-expressions, idle |
| 0.4–0.6 | Natural, conversational pace | Dialogue scenes, casual movement |
| 0.7–0.9 | Active, engaged motion | Walking, gesturing, dynamic environment |
| 1.0 | Maximum energy | Running, action, dramatic physical scenes |
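If you set motion intensity from a script, a small lookup keeps the values honest. The following is a sketch only: `describe_intensity` is a hypothetical helper, the valid range of 0.1 to 1.0 is inferred from the table, and values that fall between two listed bands (for example 0.35) are assigned to the next band up, which is an assumption of this sketch rather than documented behavior.

```python
# (upper bound, description) pairs taken from the motion intensity table.
BANDS = [
    (0.3, "barely perceptible motion"),
    (0.6, "natural, conversational pace"),
    (0.9, "active, engaged motion"),
    (1.0, "maximum energy"),
]


def describe_intensity(value: float) -> str:
    """Return the table's description for a motion intensity value."""
    if not 0.1 <= value <= 1.0:
        raise ValueError("motion intensity is assumed to lie between 0.1 and 1.0")
    for upper, label in BANDS:
        # Assumption: in-between values round up to the next band.
        if value <= upper:
            return label
    raise AssertionError("unreachable: 1.0 is the final upper bound")
```

For example, `describe_intensity(0.5)` returns "natural, conversational pace", the band the table recommends for dialogue scenes.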

What Kling 3.0 excels at

Common failures and how to avoid them
