Veo 3 UGC in Claude Code — Complete Guide

00

Overview: What This Replaces

From 3 tools to 1 terminal

Previously, creating a Veo 3 UGC video ad required bouncing between three platforms:

Before (Old Stack)	After (Claude Code)
Gemini Gem — Upload video, get UGC deconstruction + Veo 3 prompt	/veo3-prompt — Same analysis + prompt generation, with superior structured output
Custom GPT — Script writing, hook generation, ad copy	/hook-doctor + /copy-alchemist — Specialized skills for each task
VeoStack App — localhost:3333 web UI for prompt gen + kie.ai pipeline	/kieai + /ai-ugc-creator — Direct API calls from terminal, no server needed

Key Advantage

On the kie.ai path, everything runs in a single Claude Code session: no browser tabs, no localhost servers, no copy-pasting between tools. You upload an image, Claude does the analysis, generates the prompt, creates the images via the kie.ai API, and animates them via Kling, all in one conversation. The Veo 3 path adds one manual step: pasting the generated prompt into Google Veo 3.

01

Architecture: How It All Connects

Skills as pipeline stages

The system is a 6-stage pipeline where each stage maps to a dedicated Claude skill. You call them sequentially (or skip stages you don't need):

Deconstruct (/veo3-prompt) → Prompt Engineer (/veo3-prompt) → Script & Hooks (/hook-doctor) → Generate Images (/kieai) → Animate Video (/kieai) → Full UGC Ad (/ai-ugc-creator)
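
If you want to script which skills to call, the stage order above can be written down as data. This is a minimal sketch: the `Stage` type and the `plan` helper are illustrative names, not part of Claude Code or of any skill. Only the stage names and slash commands come from this guide.

```python
from typing import Iterable, NamedTuple


class Stage(NamedTuple):
    name: str
    skill: str


# Stage order and skill names as listed in this guide.
PIPELINE = [
    Stage("Deconstruct", "/veo3-prompt"),
    Stage("Prompt Engineer", "/veo3-prompt"),
    Stage("Script & Hooks", "/hook-doctor"),
    Stage("Generate Images", "/kieai"),
    Stage("Animate Video", "/kieai"),
    Stage("Full UGC Ad", "/ai-ugc-creator"),
]


def plan(skip: Iterable[str] = ()) -> list[Stage]:
    """Return the stages to run, in pipeline order, minus any skipped ones."""
    skipped = set(skip)
    unknown = skipped - {stage.name for stage in PIPELINE}
    if unknown:
        raise ValueError(f"Unknown stage(s): {sorted(unknown)}")
    return [stage for stage in PIPELINE if stage.name not in skipped]
```

For example, `plan(skip=["Script & Hooks"])` returns the remaining five stages in their original order.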

How Data Flows

1. You upload an image or screenshot of a UGC video you want to recreate
2. /veo3-prompt deconstructs the image into UGC components (camera, lighting, subject, motion, audio) and synthesizes a 2048-character Veo 3 prompt
3. You copy the Veo 3 prompt into Google Veo 3 to generate the video, OR continue in Claude Code:
4. /hook-doctor writes scroll-stopping hooks; /copy-alchemist writes the full ad script
5. /kieai generates storyboard images via Nano Banana 2, then animates them to 5s video clips via Kling
6. /ai-ugc-creator orchestrates the full pipeline: script → storyboard → images → video → export specs

Two Paths

Path A (Veo 3): Use /veo3-prompt to generate prompts → paste into Google Veo 3 for cinematic output.
Path B (kie.ai + Kling): Use /kieai to generate images + animate entirely within Claude Code. Lower cost, more control, no waitlist.

02

Skills Map: What Each Skill Does

7 skills, each with a specific role

🎬

/veo3-prompt

Replaces: Gemini Gem. Deconstructs UGC images/videos into authenticity components, then generates a single 2048-char Veo 3 prompt using the 6-section template. Also generates reusable character templates for multi-shot consistency.

🎣

/hook-doctor

Replaces: Custom GPT for hooks. Expert short-form video hook writer using Kallaway's Hook Framework. Generates scroll-stopping openers for the first 1.5 seconds of any UGC ad.

✍

/copy-alchemist

Replaces: Custom GPT for ad scripts. Direct response copywriting engine. Writes full video ad scripts, buyer personas, viral hooks, micro-lead copy. Powers the dialogue section of Veo 3 prompts.

🖼

/kieai

Replaces: VeoStack App image/video endpoints. Direct API calls to kie.ai — Nano Banana 2 for image generation, Kling v2.1 for image-to-video animation. Creates tasks, polls for completion, returns download URLs.

📹

/ai-ugc-creator

Replaces: VeoStack App UGC pipeline. Full end-to-end orchestration: concept → script → storyboard → image generation → video animation, followed by manual voiceover, lip sync, and final assembly. Also includes AI influencer character creation with identity locking.

📊

/veostack-method

Strategy reference. The full VeoStack business method — Golden Stack Formula, multi-platform repurposing (1 clip → 7-10 assets), monetization models, weekly batching workflow. Call this for strategy decisions, not prompt generation.

📢

/meta-ads

Distribution. Once your UGC ad is ready, use this to plan and launch Meta ad campaigns. Covers audience targeting, budget allocation, creative testing, and optimization.

03

Step 1: Deconstruct a UGC Video

Skill: /veo3-prompt

What this does: You upload a screenshot or frame from a UGC video you like, and Claude reverse-engineers every element that makes it feel authentic — the camera type, framing imperfections, lighting, audio quality, subject performance, and social context.

How to Use It

1. Take a screenshot of any UGC video (TikTok, Reels, etc.)
2. In Claude Code, type:
   /veo3-prompt
3. Upload the screenshot when prompted
4. Say: "Deconstruct this and give me a Veo 3 prompt"

What Claude Analyzes

Implied Device & Capture

Claude infers the camera model based on visual evidence:

Aspect ratio: 9:16 = phone vertical, 16:9 = landscape/webcam
Lens distortion: Wide-angle barrel distortion = front camera selfie
Dynamic range: Crushed shadows = older phone; HDR bloom = iPhone 14+
Noise pattern: Fine luminance noise = good sensor; chroma blotching = cheap sensor

Visual Authenticity Cues

Framing: Off-center, too much headroom, rule of thirds broken
Camera motion: Handheld wobble, selfie grip jitter, abrupt pans
Lighting: Harsh overhead, uneven window light, ring light catch in eyes
Editing: Single take, rough jump cuts, no color grading
Visual noise: Grain in shadows, minor lens flare, JPEG compression

Audio & Subject Performance

Background: Room echo, AC hum, muffled traffic, kitchen sounds
Mic quality: Phone-mic echo, proximity bass boost, wind noise
Delivery: Filler words ("um", "like"), natural gestures, eye contact with camera
Body language: Relaxed posture, casual hand movements, authentic expressions

Pro Tip

The better the screenshot, the better the analysis. Pause the video on a representative frame — ideally one that shows the person, the lighting, the setting, and any product. Avoid blurred motion frames.

04

Step 2: Generate the Veo 3 Prompt

Skill: /veo3-prompt (Step 2 output)

What this does: Claude takes everything from the deconstruction and compresses it into a single, copy-paste-ready Veo 3 prompt — exactly 2048 characters, following the official 6-section template.

The 6 Sections (Always in This Order)

#	Section	What It Contains
1	Cinematography & Shot Type	Shot size, camera model, framing, movement, focus, resolution, color grade, filename
2	Subject Description	Name, age, ethnicity, hair, face, eyes, skin, build, clothing, accessories
3	Action & Physics	Position, posture, specific movements in beats (3 minimum)
4	Environment & Lighting	Atmosphere, mood, light source and quality, shadow details
5	Audio & Dialogue	Mic type, audio quality, background sounds, voice characteristics, exact dialogue with filler words
6	Style Guidelines & Negatives	Visual style keywords, editing style, universal quality control negatives list
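
Because the section order is fixed, it can help to assemble the prompt from named parts instead of editing one long string. The sketch below assumes you keep each section's text in a dictionary keyed by the section names from the table. `assemble_prompt` is a hypothetical helper, not something /veo3-prompt exposes.

```python
# Section names, in the order given in the table above.
SECTION_ORDER = [
    "Cinematography & Shot Type",
    "Subject Description",
    "Action & Physics",
    "Environment & Lighting",
    "Audio & Dialogue",
    "Style Guidelines & Negatives",
]


def assemble_prompt(sections: dict[str, str]) -> str:
    """Join the six sections in the fixed order, failing if any is empty or missing."""
    missing = [name for name in SECTION_ORDER if not sections.get(name, "").strip()]
    if missing:
        raise ValueError(f"Missing section(s): {missing}")
    return "\n\n".join(sections[name].strip() for name in SECTION_ORDER)
```

The dictionary can be filled in any order; the output always follows the table.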

Critical Rule

The prompt must not exceed 2048 characters, and it should use as much of that budget as it can. This guide treats 2048 as Veo 3's sweet spot: longer prompts get truncated and lose coherence, while shorter prompts leave too much to Veo's imagination.
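
A quick length check before pasting can catch an over-long prompt. This is a minimal sketch: `check_prompt_length` is an illustrative helper, and it assumes Veo 3 counts characters the way Python's `len` does (one per Unicode code point), which this guide does not confirm.

```python
VEO3_PROMPT_LIMIT = 2048  # character cap stated in this guide


def check_prompt_length(prompt: str, limit: int = VEO3_PROMPT_LIMIT) -> int:
    """Return how many characters of budget remain; raise if the prompt is too long."""
    remaining = limit - len(prompt)
    if remaining < 0:
        raise ValueError(f"Prompt is {-remaining} characters over the {limit} limit")
    return remaining
```

A return value of 0 means the prompt uses the full budget; a large positive value suggests there is room to add detail.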

Using the Output

Copy the prompt from the code block Claude outputs
Paste directly into Google Veo 3 (AI Test Kitchen or Flow)
Generate — the video should match the UGC feel of your reference
Iterate: If close but not right, tell Claude to change one variable (e.g., "make the lighting warmer" or "change to golden hour")

05

Step 3: Write the Script & Hooks

Skills: /hook-doctor + /copy-alchemist

Before generating visuals, nail the script. Two skills handle this:

/hook-doctor — The First 1.5 Seconds

/hook-doctor

"Write 10 scroll-stopping hooks for a UGC ad about
Daily Dosey dog supplement pouches. Target: female
dog owners 25-45 on Instagram Reels."

Returns 10 hooks ranked by pattern type (curiosity, controversy, transformation, social proof, etc.). Pick the strongest one for your Veo 3 prompt dialogue.

/copy-alchemist — The Full Script

/copy-alchemist

"Write a 15-second UGC video ad script for Daily Dosey.
Hook: [paste winning hook from hook-doctor]
Structure: Hook (0-3s) > One Benefit (3-12s) > CTA (12-15s)
Tone: casual, authentic, like texting a friend
Include filler words for realism."
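
The Hook > One Benefit > CTA structure in the prompt above is a simple timeline, and a few lines of code can confirm that the segments line up. This is a sketch only; `check_timeline` and the tuple layout are assumptions made for illustration.

```python
# (name, start second, end second), taken from the example structure above.
SCRIPT_BEATS = [("Hook", 0, 3), ("One Benefit", 3, 12), ("CTA", 12, 15)]


def check_timeline(beats: list[tuple[str, int, int]], total_s: int = 15) -> dict[str, int]:
    """Return each segment's length, raising if segments overlap, leave gaps, or miss the total."""
    cursor = 0
    for name, start, end in beats:
        if start != cursor or end <= start:
            raise ValueError(f"Segment {name!r} does not follow on from {cursor}s")
        cursor = end
    if cursor != total_s:
        raise ValueError(f"Timeline ends at {cursor}s, expected {total_s}s")
    return {name: end - start for name, start, end in beats}
```

With the values above, the single benefit gets 9 of the 15 seconds, which matches the one-benefit emphasis of this step.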

The One-Benefit Rule

Every UGC ad promotes ONE benefit per ad. Not three. Not five. One. Multiple benefits dilute the message and feel scripted — the opposite of authentic UGC.

The script output feeds directly into Section 5 (Audio & Dialogue) of your Veo 3 prompt.

06

Step 4: Generate Storyboard Images

Skill: /kieai

If you're using Path B (kie.ai + Kling) instead of pasting into Veo 3, this is where you generate your storyboard images.

How to Call It

/kieai

"Generate an image: A 28-year-old Indian woman with
shoulder-length black hair, sitting in a modern kitchen,
holding a Daily Dosey stand-up pouch, looking at camera
with a surprised expression, natural window lighting,
slightly off-center framing, photorealistic, iPhone quality,
natural lighting, no text, no watermarks"

What Happens Under the Hood

Claude sends a POST request to kie.ai's createTask endpoint with the Nano Banana 2 model
Gets back a taskId
Polls every 10-15 seconds until state = "success"
Returns the image URL you can view and download
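
The create-then-poll pattern described above can be sketched as follows. To avoid guessing at kie.ai's actual URLs or response fields, the status lookup is passed in as a function. `wait_for_task`, the `"fail"` state name, and the 10-minute timeout are assumptions for illustration; only the `state = "success"` condition and the 10-15 second interval come from this guide.

```python
import time
from typing import Callable, Mapping


def wait_for_task(
    get_status: Callable[[str], Mapping[str, str]],
    task_id: str,
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, str]:
    """Poll get_status(task_id) until its state is "success", then return that status."""
    waited = 0.0
    while True:
        status = get_status(task_id)
        state = status.get("state")
        if state == "success":
            return status
        if state == "fail":  # assumed failure state name, not confirmed by this guide
            raise RuntimeError(f"Task {task_id} failed")
        if waited >= timeout_s:
            raise TimeoutError(f"Task {task_id} still {state!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.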

Cost

Nano Banana 2 image generation is very cheap — fractions of a cent per image. Generate multiple versions and pick the best one.

UGC Image Authenticity Tricks

Add these to your image prompts to make AI images look like real phone photos:

End every prompt with: "photorealistic, iPhone quality, natural lighting, no text, no watermarks"
Describe imperfect framing: "slightly off-center, too much headroom"
Include environmental mess: "messy desk in background", "laundry basket visible"
Specify the social scenario: "selfie taken in a bathroom mirror" not just "woman smiling"
For Indian market: always specify "Indian woman/man" explicitly
For Treat for Tails (TFT) products: Daily Dosey is a stand-up pouch (NEVER a jar or bottle)
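
The tricks above are easy to forget when writing many prompts by hand, so a small builder can apply the standard ending every time. This is a sketch under the assumption that prompts are plain comma-separated phrases; `ugc_image_prompt` is a hypothetical helper, not part of /kieai.

```python
# Standard ending recommended in the list above.
UGC_SUFFIX = "photorealistic, iPhone quality, natural lighting, no text, no watermarks"


def ugc_image_prompt(description: str, *details: str) -> str:
    """Join the description and detail phrases, then append the UGC ending once."""
    parts = [description.strip().rstrip(",.")]
    parts += [d.strip().rstrip(",.") for d in details if d.strip()]
    body = ", ".join(parts)
    if body.lower().endswith(UGC_SUFFIX.lower()):
        return body  # ending already present, do not duplicate it
    return f"{body}, {UGC_SUFFIX}"
```

For example, `ugc_image_prompt("Indian woman taking a selfie in a bathroom mirror", "slightly off-center, too much headroom", "laundry basket visible")` yields one prompt covering the scenario, framing, clutter, and ending.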

07

Step 5: Animate Images to Video

Skill: /kieai (Kling model)

Once you have storyboard images, animate them into 5-second video clips using Kling AI via the same /kieai skill.

How to Call It

/kieai

"Animate this image to video:
Image URL: [paste the URL from Step 4]
Motion: The woman looks at camera with a surprised expression,
then holds up the pouch and smiles. Natural handheld selfie
motion with subtle shake. Slight zoom-in on product.
Duration: 5 seconds
Aspect: 9:16
Model: kling-v2.1-pro-i2v"

Video Model Tiers

Model	Resolution	Cost/5s	Best For
kling-v2.1-standard-i2v	720p	$0.125	Quick tests, drafts
kling-v2.1-pro-i2v	1080p	$0.25	Production ads (recommended)
kling-v2.1-master-i2v	1080p+	$0.80	Premium quality, hero shots
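
To budget a multi-shot ad, multiply the per-clip price by the number of shots and takes. The sketch below uses the prices from the table; `animation_cost` is an illustrative helper, and it covers animation only, not image generation or retries billed differently.

```python
from decimal import Decimal

# Cost per 5-second clip in USD, from the table above.
COST_PER_5S_CLIP = {
    "kling-v2.1-standard-i2v": Decimal("0.125"),
    "kling-v2.1-pro-i2v": Decimal("0.25"),
    "kling-v2.1-master-i2v": Decimal("0.80"),
}


def animation_cost(model: str, shots: int, takes_per_shot: int = 1) -> Decimal:
    """Return the total animation cost for the given number of 5-second clips."""
    if shots < 0 or takes_per_shot < 1:
        raise ValueError("shots must be >= 0 and takes_per_shot >= 1")
    return COST_PER_5S_CLIP[model] * shots * takes_per_shot


print(animation_cost("kling-v2.1-pro-i2v", shots=6))  # prints 1.50
```

At these prices, a 6-shot ad on the Pro tier costs $1.50 per full pass, and drafting three takes per shot on the Standard tier costs $2.25.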

Motion Prompt Best Practices

Describe in beats: "She looks up, pauses, then holds up the product" (not "she moves naturally")
Camera motion: Always include "Natural handheld selfie motion with subtle shake" for UGC feel
Keep it simple: 2-3 actions max for a 5-second clip
Pure static = looks AI. Pure chaos = also looks AI. The middle ground is real.
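
The practices above can be enforced with a small builder that takes the beats as a list and appends the handheld line. This is a sketch; `motion_prompt` is a hypothetical helper, and the three-beat cap is the "2-3 actions max" rule from the list.

```python
# Camera line recommended in the list above.
HANDHELD_LINE = "Natural handheld selfie motion with subtle shake."


def motion_prompt(beats: list[str], max_beats: int = 3) -> str:
    """Chain the beats with "then" and append the handheld camera line."""
    cleaned = [b.strip().rstrip(".") for b in beats if b.strip()]
    if not cleaned:
        raise ValueError("Give at least one concrete beat")
    if len(cleaned) > max_beats:
        raise ValueError(f"{len(cleaned)} beats is too many for a 5-second clip")
    return f"{', then '.join(cleaned)}. {HANDHELD_LINE}"
```

Calling `motion_prompt(["She looks up", "pauses", "holds up the product"])` produces a single prompt with three ordered beats followed by the camera line.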

08

Step 6: Build an AI Influencer Character

Skill: /ai-ugc-creator (AI Influencer Module)

For brands that need a consistent AI creator across multiple ads, the /ai-ugc-creator skill includes an AI Influencer module with identity locking.

How to Create a Character

/ai-ugc-creator

"Build an AI influencer character for Treat for Tails:
- Female, 28-30, Indian
- Warm, approachable, dog-mom energy
- Casual style (oversized tees, messy bun)
- Generate a character template + model sheet"

Identity Lock System

Character Template: Claude generates a reusable text block with exact physical features
Model Sheet: Multi-angle reference grid (front, 3/4, side) generated via /kieai
Consistency Test: Generate 5+ images in different settings — the character should be recognizable
Repeat the FULL character description in every single prompt — never abbreviate

Common Mistake

Don't shorthand the character description after the first prompt. AI models have no memory between generations. Paste the complete character template every time or the face/body will drift.
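
One way to avoid shorthand by accident is to generate every prompt from the stored template. This is a minimal sketch; `locked_prompts` is an illustrative name, and the sample template in the usage note is made up for the example.

```python
def locked_prompts(character_template: str, scenes: list[str]) -> list[str]:
    """Prefix every scene with the complete, unabbreviated character template."""
    template = character_template.strip()
    if not template:
        raise ValueError("Character template is empty")
    return [f"{template} {scene.strip()}" for scene in scenes]
```

For example, with a template such as "Maya, a 28-year-old Indian woman with a messy bun and an oversized grey tee." and three scene descriptions, each returned prompt starts with the full template, so no generation depends on an earlier one.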

09

Step 7: Produce Multi-Shot UGC Ads

Skill: /ai-ugc-creator (Full Pipeline)

For a complete production-ready UGC ad (not just a single Veo 3 clip), use the full orchestration pipeline:

/ai-ugc-creator

"Create a 6-shot UGC ad for Daily Dosey dog supplement.
Platform: Instagram Reels 9:16
Character: 28-year-old Indian woman, dog mom
Setting: Modern apartment, living room + kitchen
Structure: Hook > Problem > Discovery > Demo > Result > CTA"

What Claude Produces

Ad script with voiceover text per shot
6 image prompts (Nano Banana 2) — character locked across all
6 motion prompts (Kling) — one per shot with camera directions
Platform export specs (resolution, duration, aspect ratio)

Then call /kieai for each shot to generate images and animate them. The skill handles this sequentially — generate image, wait for completion, animate, wait, move to next shot.
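
The sequential loop described above looks roughly like this. The two callables stand in for whatever actually generates and animates (for example, calls made through /kieai) and are assumed to block until their task completes. `Shot` and `produce_clips` are names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Shot:
    label: str
    image_prompt: str
    motion_prompt: str


def produce_clips(
    shots: Iterable[Shot],
    generate_image: Callable[[str], str],
    animate: Callable[[str, str], str],
) -> dict[str, str]:
    """Process shots one at a time: image first, then animation, then the next shot."""
    clips: dict[str, str] = {}
    for shot in shots:
        image_url = generate_image(shot.image_prompt)  # returns once the image is ready
        clips[shot.label] = animate(image_url, shot.motion_prompt)
    return clips
```

Running strictly in order keeps failures easy to locate: if shot 3 fails, shots 1 and 2 are already finished and do not need to be regenerated.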

Post-Production (Manual Steps)

Voiceover: Run the script through ElevenLabs (or use Veo 3's native audio for speech)
Lip sync: Use ElevenLabs Flows or CapCut for lip-syncing
Assembly: Stitch clips in CapCut, add text overlays, export

10

Reference: The 6-Section Prompt Template

Copy and fill in the blanks

[1. CINEMATOGRAPHY]
Shot Size: Selfie Shot (Vertical 9:16).
Camera: IPHONE 15 PRO Front Camera (~24mm equivalent).
Framing: [FRAMING], filmed [LOCATION].
Movement: [MOVEMENT].
Focus: [DEPTH OF FIELD].
Resolution: 720x1280 (Vertical).
Grade: iPhone HDR auto-tone; [COLOR PALETTE]; [FILTER].
Filename: "IMG_[XXXX].MOV".

[2. SUBJECT]
Subject: [NAME], a [AGE] [ETHNICITY] [GENDER] with [HAIR].
Face: [FACIAL FEATURES].
Eyes: [COLOR] [SHAPE] eyes [DETAILS].
Skin: [TONE with undertones, natural realistic pores].
Build: [BUILD]. Attire: [CLOTHING] and [ACCESSORIES].

[3. ACTION & PHYSICS]
Position: [He/She] [sits/stands] [WHERE].
Physics: Holds phone at arm's length. [POSTURE].
Movements:
- [Beat 1]
- [Beat 2]
- [Beat 3]

[4. ENVIRONMENT & LIGHTING]
Atmosphere: [MOOD] -- like [he/she]'s [EMOTIONAL CONTEXT].
Lighting: [SOURCE & QUALITY], illuminating face [HOW].
Shadows: [SHADOW DETAILS].

[5. AUDIO & DIALOGUE]
Audio: [PHONE] internal mic. [QUALITY]. [BG SOUNDS].
Voice: [CHARACTERISTICS]. Tone: [TONE].
Dialogue:
[NAME] says: "[SCRIPT WITH FILLER WORDS. 3-8 SENTENCES.]"

[6. STYLE & NEGATIVES]
Style: Smartphone selfie, handheld realism, direct-to-camera,
raw unfiltered [PLATFORM] aesthetic, [EDITING STYLE].
Negatives: Subtitles, captions, watermark, text overlays,
logo, branding, blurry, artifacts, cartoon effects,
distorted hands, artificial lighting, oversaturation.
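
After filling in the template, it is worth checking that no bracketed blank was left behind. The sketch below treats any `[...]` that is not a numbered section header as an unfilled placeholder; that convention is an assumption based on the template's layout, and `unfilled_placeholders` is a hypothetical helper.

```python
import re

# Matches [PLACEHOLDER] but skips numbered section headers such as [1. CINEMATOGRAPHY].
PLACEHOLDER = re.compile(r"\[(?!\d+\.\s)[^\[\]\n]+\]")


def unfilled_placeholders(prompt: str) -> list[str]:
    """Return every bracketed blank still present in the prompt, in order of appearance."""
    return PLACEHOLDER.findall(prompt)
```

An empty list means every blank was replaced. Anything else lists exactly what still needs a value, for example `[FRAMING]` or `[XXXX]`.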

11

Reference: UGC Authenticity Tricks

Make AI output look like real phone footage

For Image Prompts (kie.ai / Midjourney)

Trick	What to Add	Why It Works
Kill beautification	`--stylize 0 --style raw` (MJ only)	Removes AI "perfection" that screams fake
Specify device	"taken on iPhone 11"	Triggers device-specific rendering characteristics
Add filename	"IMG_4673.HEIC"	HEIC = higher dynamic range; JPG = grainier
Social platform	"Posted on Instagram"	Applies platform-specific compression artifacts
Timeframe	"Posted in 2016"	Matches era-specific phone camera quality
Controlled randomness	`--weird 4` (MJ only)	Introduces natural imperfection
Social scenario	"photo taken at a work party"	Contextualizes the pose and setting

For Video Prompts (Veo 3 / Kling)

Camera motion: "Subtle handheld sway and jitter consistent with a selfie grip" — not "smooth" or "static"
Imperfect framing: "Slightly off-center, too much headroom on the left"
Lighting flaws: "Uneven natural light, slight overexposure on the right cheek"
Audio imperfections: "Faint AC hum", "slight room echo"
Filler words in dialogue: "uh", "like", "you know", "honestly" — real people don't speak in clean sentences
Environmental clutter: "Messy desk visible behind", "laundry basket in corner"

12

Reference: Veo 3 Prompting Rules

From Google's official prompting guide

Prompt Anatomy (order matters)

[Cinematography/Lens] + [Subject] + [Action/Physics] + [Environment] + [Lighting] + [Audio/Dialogue]

This order gives Veo the visual hierarchy it needs. Camera first = it "sets up the shot" before populating it.

Motion: Beats, Not Vagueness

Weak: "Actor walks across the room"

Strong: "Actor takes four steps to the window, pauses, and pulls the curtain in the final second"

Describe actions in beats or counts — small steps, gestures, pauses. This gives Veo timing anchors.

Lighting & Color Consistency

For multi-shot ads, name 3-5 specific colors to keep palette stable.

Weak: "bright room"

Strong: "Soft window light with a warm lamp fill and a cool edge from the hallway"

Describe both the quality of light AND the color anchors.

Dialogue & Audio Rules

Format: Character Name: "Line of dialogue."
Timing: A 4-second shot fits ONE short exchange
Long speeches break lip-sync — keep it concise
Always specify ambient audio even for "silent" shots

Image Input for Character Consistency

Upload a reference image (from kie.ai or Midjourney)
Veo uses it as an anchor for the first frame
Your text prompt defines what happens next
This is the best way to maintain character across shots

Iterate with Remix

Change one variable at a time when a result is close:

"Same shot, but change the lighting to Golden Hour"
"Same action, but add the sound of a police siren"

If misfiring: freeze the camera, simplify the action, clear the background. Layer complexity back step by step.

13

Production Checklist

Run through this before every ad

Pre-Production

Reference video/image selected and screenshot taken
Product identified (name, type, key benefit)
Target platform chosen (TikTok/Reels/Shorts/Feed)
Character template created (if multi-shot)
One-benefit rule: single benefit identified for this ad

Script

Hook written via /hook-doctor (1.5s scroll-stopper)
Full script via /copy-alchemist (Hook > Benefit > CTA)
Filler words included ("uh", "like", "honestly")
Script fits 15-second format (3-8 sentences max)

Generation

Veo 3 prompt generated via /veo3-prompt (2048 chars, 6 sections)
OR storyboard images generated via /kieai
Images pass UGC authenticity check (imperfect framing, natural lighting)
Video clips animated via /kieai Kling (Pro model, 9:16)
Motion prompts use beats/counts, not vague descriptions

Post-Production

Voiceover added (ElevenLabs or Veo native audio)
Lip sync verified (if talking head)
Clips stitched in order (CapCut or similar)
No visible AI artifacts (distorted hands, wrong proportions)
No text overlays in the generated video (add in post only)
Exported at correct specs for target platform

Platform Export Specs

Platform	Aspect	Resolution	Duration
TikTok / Reels / Shorts	9:16	1080x1920	6-30s
Instagram Feed	1:1	1080x1080	15-60s
Facebook Feed	4:5	1080x1350	15-30s
YouTube Pre-roll	16:9	1920x1080	15-30s
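
The export table can double as an automated pre-flight check. This sketch copies the table's values into a dictionary and compares a rendered file's dimensions and duration against them. `check_export` is an illustrative helper; how you read a file's real width, height, and duration is left out.

```python
from fractions import Fraction

# Values copied from the Platform Export Specs table above.
EXPORT_SPECS = {
    "TikTok / Reels / Shorts": {"aspect": "9:16", "size": (1080, 1920), "duration_s": (6, 30)},
    "Instagram Feed": {"aspect": "1:1", "size": (1080, 1080), "duration_s": (15, 60)},
    "Facebook Feed": {"aspect": "4:5", "size": (1080, 1350), "duration_s": (15, 30)},
    "YouTube Pre-roll": {"aspect": "16:9", "size": (1920, 1080), "duration_s": (15, 30)},
}


def check_export(platform: str, width: int, height: int, duration_s: float) -> list[str]:
    """Return a list of mismatches against the platform's spec (empty if all good)."""
    spec = EXPORT_SPECS[platform]
    problems = []
    ratio_w, ratio_h = (int(part) for part in spec["aspect"].split(":"))
    if Fraction(width, height) != Fraction(ratio_w, ratio_h):
        problems.append(f"aspect is not {spec['aspect']}")
    if (width, height) != spec["size"]:
        problems.append(f"resolution is not {spec['size'][0]}x{spec['size'][1]}")
    shortest, longest = spec["duration_s"]
    if not shortest <= duration_s <= longest:
        problems.append(f"duration is outside {shortest}-{longest}s")
    return problems
```

Note that a 720x1280 clip straight from the prompt template passes the 9:16 aspect check but not the 1080x1920 resolution check, so it would need upscaling before export.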
