BakeBetter Apps·Explainer Hub
Replaces Gemini Gem + Custom GPTs

Veo 3 UGC Video System in Claude Code

The complete pipeline for deconstructing UGC videos, engineering Veo 3 prompts, generating images, animating to video, and producing full AI UGC ads — all without leaving your terminal.

7 Skills · 6 Pipeline Stages · 0 External Apps
00
Overview: What This Replaces
From 3 tools to 1 terminal

Previously, creating a Veo 3 UGC video ad required bouncing between three platforms:

Before (Old Stack) | After (Claude Code)
Gemini Gem: upload video, get UGC deconstruction + Veo 3 prompt | /veo3-prompt: same analysis + prompt generation, with superior structured output
Custom GPT: script writing, hook generation, ad copy | /hook-doctor + /copy-alchemist: specialized skills for each task
VeoStack App: localhost:3333 web UI for prompt gen + kie.ai pipeline | /kieai + /ai-ugc-creator: direct API calls from terminal, no server needed
Key Advantage
Everything runs in a single Claude Code session. No browser tabs, no localhost servers, no copy-pasting between tools. You upload an image; Claude analyzes it, generates the prompt, creates the images through the kie.ai API, and animates them with Kling, all in one conversation.
01
Architecture: How It All Connects
Skills as pipeline stages

The system is a 6-stage pipeline where each stage maps to a dedicated Claude skill. You call them sequentially (or skip stages you don't need):

  1. Deconstruct: /veo3-prompt
  2. Prompt Engineer: /veo3-prompt
  3. Script & Hooks: /hook-doctor
  4. Generate Images: /kieai
  5. Animate Video: /kieai
  6. Full UGC Ad: /ai-ugc-creator

How Data Flows

  1. You upload an image or screenshot of a UGC video you want to recreate
  2. /veo3-prompt deconstructs the image into UGC components (camera, lighting, subject, motion, audio) and synthesizes a 2048-character Veo 3 prompt
  3. You copy the Veo 3 prompt into Google Veo 3 to generate the video — OR continue in Claude Code:
  4. /hook-doctor writes scroll-stopping hooks; /copy-alchemist writes the full ad script
  5. /kieai generates storyboard images via Nano Banana 2, then animates them to 5s video clips via Kling
  6. /ai-ugc-creator orchestrates the full pipeline: script → storyboard → images → video → export specs
Two Paths
Path A (Veo 3): Use /veo3-prompt to generate prompts → paste into Google Veo 3 for cinematic output.
Path B (kie.ai + Kling): Use /kieai to generate images + animate entirely within Claude Code. Lower cost, more control, no waitlist.
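
One way to picture the stage-to-skill mapping and the two paths is as plain data. The sketch below is illustrative only: `Stage`, `PIPELINE`, and `stages_for_path` are hypothetical names, not part of any skill, and the path split assumes Path A means the /veo3-prompt stages and Path B means the /kieai stages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    skill: str


# Stage order and skill names are copied from the pipeline listed above.
PIPELINE = [
    Stage("Deconstruct", "/veo3-prompt"),
    Stage("Prompt Engineer", "/veo3-prompt"),
    Stage("Script & Hooks", "/hook-doctor"),
    Stage("Generate Images", "/kieai"),
    Stage("Animate Video", "/kieai"),
    Stage("Full UGC Ad", "/ai-ugc-creator"),
]

# Assumption: each path is identified by the skill that drives it.
PATH_SKILLS = {"A": "/veo3-prompt", "B": "/kieai"}


def stages_for_path(path: str) -> list[Stage]:
    """Return the pipeline stages driven by the given path's skill."""
    if path not in PATH_SKILLS:
        raise ValueError(f"unknown path: {path!r}")
    skill = PATH_SKILLS[path]
    return [stage for stage in PIPELINE if stage.skill == skill]
```

The script stage (/hook-doctor) feeds either path, so it is left out of both selections here.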
02
Skills Map: What Each Skill Does
7 skills, each with a specific role
🎬
/veo3-prompt
Replaces: Gemini Gem. Deconstructs UGC images/videos into authenticity components, then generates a single 2048-char Veo 3 prompt using the 6-section template. Also generates reusable character templates for multi-shot consistency.
🎣
/hook-doctor
Replaces: Custom GPT for hooks. Expert short-form video hook writer using Kallaway's Hook Framework. Generates scroll-stopping openers for the first 1.5 seconds of any UGC ad.
/copy-alchemist
Replaces: Custom GPT for ad scripts. Direct response copywriting engine. Writes full video ad scripts, buyer personas, viral hooks, micro-lead copy. Powers the dialogue section of Veo 3 prompts.
🖼
/kieai
Replaces: VeoStack App image/video endpoints. Direct API calls to kie.ai — Nano Banana 2 for image generation, Kling v2.1 for image-to-video animation. Creates tasks, polls for completion, returns download URLs.
📹
/ai-ugc-creator
Replaces: VeoStack App UGC pipeline. Full end-to-end orchestration: concept → script → storyboard → image generation → video animation → voiceover → lip sync → final ad. Also includes AI influencer character creation with identity locking.
📊
/veostack-method
Strategy reference. The full VeoStack business method — Golden Stack Formula, multi-platform repurposing (1 clip → 7-10 assets), monetization models, weekly batching workflow. Call this for strategy decisions, not prompt generation.
📢
/meta-ads
Distribution. Once your UGC ad is ready, use this to plan and launch Meta ad campaigns. Covers audience targeting, budget allocation, creative testing, and optimization.
03
Step 1: Deconstruct a UGC Video
Skill: /veo3-prompt

What this does: You upload a screenshot or frame from a UGC video you like, and Claude reverse-engineers every element that makes it feel authentic — the camera type, framing imperfections, lighting, audio quality, subject performance, and social context.

How to Use It

1. Take a screenshot of any UGC video (TikTok, Reels, etc.)
2. In Claude Code, type:
   /veo3-prompt
3. Upload the screenshot when prompted
4. Say: "Deconstruct this and give me a Veo 3 prompt"

What Claude Analyzes

Implied Device & Capture

Claude infers the camera model based on visual evidence:

  • Aspect ratio: 9:16 = phone vertical, 16:9 = landscape/webcam
  • Lens distortion: Wide-angle barrel distortion = front camera selfie
  • Dynamic range: Crushed shadows = older phone; HDR bloom = iPhone 14+
  • Noise pattern: Fine luminance noise = good sensor; chroma blotching = cheap sensor
Visual Authenticity Cues
  • Framing: Off-center, too much headroom, rule of thirds broken
  • Camera motion: Handheld wobble, selfie grip jitter, abrupt pans
  • Lighting: Harsh overhead, uneven window light, ring light catch in eyes
  • Editing: Single take, rough jump cuts, no color grading
  • Visual noise: Grain in shadows, minor lens flare, JPEG compression
Audio & Subject Performance
  • Background: Room echo, AC hum, muffled traffic, kitchen sounds
  • Mic quality: Phone-mic echo, proximity bass boost, wind noise
  • Delivery: Filler words ("um", "like"), natural gestures, eye contact with camera
  • Body language: Relaxed posture, casual hand movements, authentic expressions
Pro Tip
The better the screenshot, the better the analysis. Pause the video on a representative frame — ideally one that shows the person, the lighting, the setting, and any product. Avoid blurred motion frames.
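
Claude makes these inferences visually, but the aspect-ratio cue is simple enough to show as code. A minimal sketch, assuming you already know the frame's pixel dimensions; `classify_orientation` and its tolerance value are hypothetical and cover only that one cue.

```python
def classify_orientation(width: int, height: int, tolerance: float = 0.02) -> str:
    """Map frame dimensions to the aspect-ratio cue from the list above."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    if abs(ratio - 9 / 16) <= tolerance:
        return "9:16 phone vertical"
    if abs(ratio - 16 / 9) <= tolerance:
        return "16:9 landscape/webcam"
    # Anything else (for example a cropped or square frame) needs a human look.
    return "other"
```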
04
Step 2: Generate the Veo 3 Prompt
Skill: /veo3-prompt (Step 2 output)

What this does: Claude takes everything from the deconstruction and compresses it into a single, copy-paste-ready Veo 3 prompt — exactly 2048 characters, following the official 6-section template.

The 6 Sections (Always in This Order)

# | Section | What It Contains
1 | Cinematography & Shot Type | Shot size, camera model, framing, movement, focus, resolution, color grade, filename
2 | Subject Description | Name, age, ethnicity, hair, face, eyes, skin, build, clothing, accessories
3 | Action & Physics | Position, posture, specific movements in beats (3 minimum)
4 | Environment & Lighting | Atmosphere, mood, light source and quality, shadow details
5 | Audio & Dialogue | Mic type, audio quality, background sounds, voice characteristics, exact dialogue with filler words
6 | Style Guidelines & Negatives | Visual style keywords, editing style, universal quality control negatives list
Critical Rule
The prompt is capped at 2048 characters, which this workflow treats as Veo 3's sweet spot. Longer prompts get truncated and lose coherence; much shorter prompts leave too much to Veo's imagination.
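
If you assemble or edit a prompt by hand, the section order and the character cap are easy to check before pasting. The following is a rough sketch, not the skill's implementation: `assemble_prompt` is a hypothetical helper, and it assumes sections are joined with a blank line and that length is counted with Python's `len`.

```python
MAX_PROMPT_CHARS = 2048

# Section names and order follow the 6-section table above.
SECTION_ORDER = [
    "Cinematography & Shot Type",
    "Subject Description",
    "Action & Physics",
    "Environment & Lighting",
    "Audio & Dialogue",
    "Style Guidelines & Negatives",
]


def assemble_prompt(sections: dict[str, str]) -> str:
    """Join the six sections in order and enforce the character cap."""
    missing = [name for name in SECTION_ORDER if not sections.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing sections: {missing}")
    prompt = "\n\n".join(sections[name].strip() for name in SECTION_ORDER)
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(
            f"prompt is {len(prompt)} chars, over the {MAX_PROMPT_CHARS} cap"
        )
    return prompt
```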

Using the Output

  1. Copy the prompt from the code block Claude outputs
  2. Paste directly into Google Veo 3 (AI Test Kitchen or Flow)
  3. Generate — the video should match the UGC feel of your reference
  4. Iterate: If close but not right, tell Claude to change one variable (e.g., "make the lighting warmer" or "change to golden hour")
05
Step 3: Write the Script & Hooks
Skills: /hook-doctor + /copy-alchemist

Before generating visuals, nail the script. Two skills handle this:

/hook-doctor — The First 1.5 Seconds

/hook-doctor

"Write 10 scroll-stopping hooks for a UGC ad about
Daily Dosey dog supplement pouches. Target: female
dog owners 25-45 on Instagram Reels."

Returns 10 hooks ranked by pattern type (curiosity, controversy, transformation, social proof, etc.). Pick the strongest one for your Veo 3 prompt dialogue.

/copy-alchemist — The Full Script

/copy-alchemist

"Write a 15-second UGC video ad script for Daily Dosey.
Hook: [paste winning hook from hook-doctor]
Structure: Hook (0-3s) > One Benefit (3-12s) > CTA (12-15s)
Tone: casual, authentic, like texting a friend
Include filler words for realism."
The One-Benefit Rule
Every UGC ad promotes ONE benefit per ad. Not three. Not five. One. Multiple benefits dilute the message and feel scripted — the opposite of authentic UGC.

The script output feeds directly into Section 5 (Audio & Dialogue) of your Veo 3 prompt.
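
The Hook > One Benefit > CTA structure from the example request is easy to sanity-check as a timeline. A small sketch, with hypothetical names (`Segment`, `segment_durations`), that confirms the segments are contiguous and fill the full 15 seconds:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    label: str
    start_s: int
    end_s: int


# Timings copied from the /copy-alchemist example above.
SCRIPT_15S = [
    Segment("Hook", 0, 3),
    Segment("One Benefit", 3, 12),
    Segment("CTA", 12, 15),
]


def segment_durations(segments: list[Segment], total_s: int) -> dict[str, int]:
    """Return seconds per segment, rejecting gaps, overlaps, or a wrong total."""
    cursor = 0
    durations: dict[str, int] = {}
    for seg in segments:
        if seg.start_s != cursor or seg.end_s <= seg.start_s:
            raise ValueError(f"{seg.label} leaves a gap, overlaps, or has no length")
        durations[seg.label] = seg.end_s - seg.start_s
        cursor = seg.end_s
    if cursor != total_s:
        raise ValueError(f"script ends at {cursor}s, expected {total_s}s")
    return durations
```

With these timings, 9 of the 15 seconds go to the single benefit, which is the point of the one-benefit rule.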

06
Step 4: Generate Storyboard Images
Skill: /kieai

If you're using Path B (kie.ai + Kling) instead of pasting into Veo 3, this is where you generate your storyboard images.

How to Call It

/kieai

"Generate an image: A 28-year-old Indian woman with
shoulder-length black hair, sitting in a modern kitchen,
holding a Daily Dosey stand-up pouch, looking at camera
with a surprised expression, natural window lighting,
slightly off-center framing, photorealistic, iPhone quality,
natural lighting, no text, no watermarks"

What Happens Under the Hood

  1. Claude sends a POST request to kie.ai's createTask endpoint with the Nano Banana 2 model
  2. Gets back a taskId
  3. Polls every 10-15 seconds until state = "success"
  4. Returns the image URL you can view and download
Cost
Nano Banana 2 image generation is very cheap — fractions of a cent per image. Generate multiple versions and pick the best one.
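
The create-then-poll flow described under "What Happens Under the Hood" can be sketched generically. This is not kie.ai's client code: the HTTP call is left to a `fetch_state` function you would supply, and the failure state name "fail", the attempt limit, and the 12-second default interval (picked from the 10-15 second range above) are assumptions. Only the "success" state comes from the steps above.

```python
import time
from typing import Callable


def poll_until_success(
    fetch_state: Callable[[str], dict],
    task_id: str,
    interval_s: float = 12.0,
    max_attempts: int = 40,
    failure_states: tuple[str, ...] = ("fail",),  # assumed name, check the real API
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_state(task_id) until it reports state == "success"."""
    for _ in range(max_attempts):
        status = fetch_state(task_id)
        state = status.get("state")
        if state == "success":
            return status
        if state in failure_states:
            raise RuntimeError(f"task {task_id} ended in state {state!r}")
        sleep(interval_s)  # wait before asking again
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting.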

UGC Image Authenticity Tricks

Add these to your image prompts to make AI images look like real phone photos:

  • End every prompt with: "photorealistic, iPhone quality, natural lighting, no text, no watermarks"
  • Describe imperfect framing: "slightly off-center, too much headroom"
  • Include environmental mess: "messy desk in background", "laundry basket visible"
  • Specify the social scenario: "selfie taken in a bathroom mirror" not just "woman smiling"
  • For Indian market: always specify "Indian woman/man" explicitly
  • For Treat for Tails (TFT) products: Daily Dosey is a stand-up pouch (NEVER a jar or bottle)
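
The tricks above amount to a fixed suffix plus a few optional details, which a small helper can apply consistently. A minimal sketch; `build_image_prompt` is a hypothetical function, and the default framing phrase is simply the one quoted in the list.

```python
UGC_SUFFIX = "photorealistic, iPhone quality, natural lighting, no text, no watermarks"


def build_image_prompt(
    scene: str,
    framing: str = "slightly off-center, too much headroom",
    clutter: str = "",
) -> str:
    """Combine scene, imperfect framing, optional clutter, and the fixed suffix."""
    parts = [scene.strip().rstrip(",."), framing.strip()]
    if clutter.strip():
        parts.append(clutter.strip())
    parts.append(UGC_SUFFIX)  # the suffix always comes last
    return ", ".join(part for part in parts if part)
```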
07
Step 5: Animate Images to Video
Skill: /kieai (Kling model)

Once you have storyboard images, animate them into 5-second video clips using Kling AI via the same /kieai skill.

How to Call It

/kieai

"Animate this image to video:
Image URL: [paste the URL from Step 4]
Motion: The woman looks at camera with a surprised expression,
then holds up the pouch and smiles. Natural handheld selfie
motion with subtle shake. Slight zoom-in on product.
Duration: 5 seconds
Aspect: 9:16
Model: kling-v2.1-pro-i2v"

Video Model Tiers

Model | Resolution | Cost/5s | Best For
kling-v2.1-standard-i2v | 720p | $0.125 | Quick tests, drafts
kling-v2.1-pro-i2v | 1080p | $0.25 | Production ads (recommended)
kling-v2.1-master-i2v | 1080p+ | $0.80 | Premium quality, hero shots
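
To budget a multi-shot ad, multiply the per-clip price by the number of clips and retries. A quick sketch using the prices from the tier table; `estimate_animation_cost` and the `takes_per_clip` parameter are hypothetical, and actual billing may differ.

```python
from decimal import Decimal

# USD per 5-second clip, copied from the tier table above.
KLING_COST_PER_5S = {
    "kling-v2.1-standard-i2v": Decimal("0.125"),
    "kling-v2.1-pro-i2v": Decimal("0.25"),
    "kling-v2.1-master-i2v": Decimal("0.80"),
}


def estimate_animation_cost(model: str, clips: int, takes_per_clip: int = 1) -> Decimal:
    """Estimate total animation cost for a number of 5-second clips."""
    if clips < 0 or takes_per_clip < 1:
        raise ValueError("clips must be >= 0 and takes_per_clip >= 1")
    return KLING_COST_PER_5S[model] * clips * takes_per_clip
```

For example, a 6-shot ad on the Pro tier with one take per shot comes to 6 × $0.25 = $1.50.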

Motion Prompt Best Practices

  • Describe in beats: "She looks up, pauses, then holds up the product" (not "she moves naturally")
  • Camera motion: Always include "Natural handheld selfie motion with subtle shake" for UGC feel
  • Keep it simple: 2-3 actions max for a 5-second clip
  • Pure static = looks AI. Pure chaos = also looks AI. The middle ground is real.
08
Step 6: Build an AI Influencer Character
Skill: /ai-ugc-creator (AI Influencer Module)

For brands that need a consistent AI creator across multiple ads, the /ai-ugc-creator skill includes an AI Influencer module with identity locking.

How to Create a Character

/ai-ugc-creator

"Build an AI influencer character for Treat for Tails:
- Female, 28-30, Indian
- Warm, approachable, dog-mom energy
- Casual style (oversized tees, messy bun)
- Generate a character template + model sheet"

Identity Lock System

  1. Character Template: Claude generates a reusable text block with exact physical features
  2. Model Sheet: Multi-angle reference grid (front, 3/4, side) generated via /kieai
  3. Consistency Test: Generate 5+ images in different settings — the character should be recognizable
  4. Repeat the FULL character description in every single prompt — never abbreviate
Common Mistake
Don't shorthand the character description after the first prompt. AI models have no memory between generations. Paste the complete character template every time or the face/body will drift.
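
Because the full template has to appear in every prompt, it helps to attach it mechanically rather than by hand. A minimal sketch of that idea; `lock_character` is a hypothetical helper, and placing the template at the start of each prompt is an assumption.

```python
def lock_character(character_template: str, shot_prompts: list[str]) -> list[str]:
    """Prefix every shot prompt with the complete, unabbreviated template."""
    template = character_template.strip()
    if not template:
        raise ValueError("character template is empty")
    # The same full text is repeated each time because generations share no memory.
    return [f"{template} {shot.strip()}" for shot in shot_prompts]
```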
09
Step 7: Produce Multi-Shot UGC Ads
Skill: /ai-ugc-creator (Full Pipeline)

For a complete production-ready UGC ad (not just a single Veo 3 clip), use the full orchestration pipeline:

/ai-ugc-creator

"Create a 6-shot UGC ad for Daily Dosey dog supplement.
Platform: Instagram Reels 9:16
Character: 28-year-old Indian woman, dog mom
Setting: Modern apartment, living room + kitchen
Structure: Hook > Problem > Discovery > Demo > Result > CTA"

What Claude Produces

  1. Ad script with voiceover text per shot
  2. 6 image prompts (Nano Banana 2) — character locked across all
  3. 6 motion prompts (Kling) — one per shot with camera directions
  4. Platform export specs (resolution, duration, aspect ratio)

Then call /kieai for each shot to generate images and animate them. The skill handles this sequentially — generate image, wait for completion, animate, wait, move to next shot.
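
The sequential loop described above (image, wait, animate, wait, next shot) looks roughly like this. It is a sketch of the control flow only: `Shot`, `produce_shots`, and the two callables are hypothetical stand-ins, and each callable is assumed to block until its task has finished and to return a URL.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Shot:
    label: str
    image_prompt: str
    motion_prompt: str


def produce_shots(
    shots: list[Shot],
    generate_image: Callable[[str], str],
    animate: Callable[[str, str], str],
) -> list[str]:
    """Process shots one at a time and return clip URLs in shot order."""
    clip_urls = []
    for shot in shots:
        image_url = generate_image(shot.image_prompt)  # waits for the image
        clip_url = animate(image_url, shot.motion_prompt)  # waits for the clip
        clip_urls.append(clip_url)
    return clip_urls
```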

Post-Production (Manual Steps)

  • Voiceover: Run the script through ElevenLabs (or use Veo 3's native audio for speech)
  • Lip sync: Use ElevenLabs Flows or CapCut for lip-syncing
  • Assembly: Stitch clips in CapCut, add text overlays, export
10
Reference: The 6-Section Prompt Template
Copy and fill in the blanks
[1. CINEMATOGRAPHY]
Shot Size: Selfie Shot (Vertical 9:16).
Camera: IPHONE 15 PRO Front Camera (~24mm equivalent).
Framing: [FRAMING], filmed [LOCATION].
Movement: [MOVEMENT].
Focus: [DEPTH OF FIELD].
Resolution: 720x1280 (Vertical).
Grade: iPhone HDR auto-tone; [COLOR PALETTE]; [FILTER].
Filename: "IMG_[XXXX].MOV".

[2. SUBJECT]
Subject: [NAME], a [AGE] [ETHNICITY] [GENDER] with [HAIR].
Face: [FACIAL FEATURES].
Eyes: [COLOR] [SHAPE] eyes [DETAILS].
Skin: [TONE with undertones, natural realistic pores].
Build: [BUILD]. Attire: [CLOTHING] and [ACCESSORIES].

[3. ACTION & PHYSICS]
Position: [He/She] [sits/stands] [WHERE].
Physics: Holds phone at arm's length. [POSTURE].
Movements:
- [Beat 1]
- [Beat 2]
- [Beat 3]

[4. ENVIRONMENT & LIGHTING]
Atmosphere: [MOOD] -- like [he/she]'s [EMOTIONAL CONTEXT].
Lighting: [SOURCE & QUALITY], illuminating face [HOW].
Shadows: [SHADOW DETAILS].

[5. AUDIO & DIALOGUE]
Audio: [PHONE] internal mic. [QUALITY]. [BG SOUNDS].
Voice: [CHARACTERISTICS]. Tone: [TONE].
Dialogue:
[NAME] says: "[SCRIPT WITH FILLER WORDS. 3-8 SENTENCES.]"

[6. STYLE & NEGATIVES]
Style: Smartphone selfie, handheld realism, direct-to-camera,
raw unfiltered [PLATFORM] aesthetic, [EDITING STYLE].
Negatives: Subtitles, captions, watermark, text overlays,
logo, branding, blurry, artifacts, cartoon effects,
distorted hands, artificial lighting, oversaturation.
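
After filling in the template, a quick scan for leftover bracketed blanks catches anything you missed. A small sketch; `unfilled_placeholders` is a hypothetical helper, and it assumes section headers are the only bracketed items that start with a digit.

```python
import re

# Matches [FRAMING], [Beat 1], [He/She], and so on, but skips numbered
# section headers such as [1. CINEMATOGRAPHY].
PLACEHOLDER = re.compile(r"\[(?!\d)[^\]\n]+\]")


def unfilled_placeholders(prompt: str) -> list[str]:
    """Return every bracketed blank still present in the prompt text."""
    return PLACEHOLDER.findall(prompt)
```

An empty list means every blank has been replaced.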
11
Reference: UGC Authenticity Tricks
Make AI output look like real phone footage

For Image Prompts (kie.ai / Midjourney)

Trick | What to Add | Why It Works
Kill beautification | --stylize 0 --style raw (MJ only) | Removes AI "perfection" that screams fake
Specify device | "taken on iPhone 11" | Triggers device-specific rendering characteristics
Add filename | "IMG_4673.HEIC" | HEIC = higher dynamic range; JPG = grainier
Social platform | "Posted on Instagram" | Applies platform-specific compression artifacts
Timeframe | "Posted in 2016" | Matches era-specific phone camera quality
Controlled randomness | --weird 4 (MJ only) | Introduces natural imperfection
Social scenario | "photo taken at a work party" | Contextualizes the pose and setting

For Video Prompts (Veo 3 / Kling)

  • Camera motion: "Subtle handheld sway and jitter consistent with a selfie grip" — not "smooth" or "static"
  • Imperfect framing: "Slightly off-center, too much headroom on the left"
  • Lighting flaws: "Uneven natural light, slight overexposure on the right cheek"
  • Audio imperfections: "Faint AC hum", "slight room echo"
  • Filler words in dialogue: "uh", "like", "you know", "honestly" — real people don't speak in clean sentences
  • Environmental clutter: "Messy desk visible behind", "laundry basket in corner"
12
Reference: Veo 3 Prompting Rules
From Google's official prompting guide
Prompt Anatomy (order matters)

[Cinematography/Lens] + [Subject] + [Action/Physics] + [Environment] + [Lighting] + [Audio/Dialogue]

This order gives Veo the visual hierarchy it needs. Camera first = it "sets up the shot" before populating it.

Motion: Beats, Not Vagueness

Weak: "Actor walks across the room"

Strong: "Actor takes four steps to the window, pauses, and pulls the curtain in the final second"

Describe actions in beats or counts — small steps, gestures, pauses. This gives Veo timing anchors.

Lighting & Color Consistency

For multi-shot ads, name 3-5 specific colors to keep palette stable.

Weak: "bright room"

Strong: "Soft window light with a warm lamp fill and a cool edge from the hallway"

Describe both the quality of light AND the color anchors.

Dialogue & Audio Rules
  • Format: Character Name: "Line of dialogue."
  • Timing: A 4-second shot fits ONE short exchange
  • Long speeches break lip-sync — keep it concise
  • Always specify ambient audio even for "silent" shots
Image Input for Character Consistency
  1. Upload a reference image (from kie.ai or Midjourney)
  2. Veo uses it as an anchor for the first frame
  3. Your text prompt defines what happens next
  4. This is the best way to maintain character across shots
Iterate with Remix

Change one variable at a time when a result is close:

  • "Same shot, but change the lighting to Golden Hour"
  • "Same action, but add the sound of a police siren"

If misfiring: freeze the camera, simplify the action, clear the background. Layer complexity back step by step.
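
If you keep each prompt's variables as named fields, you can verify that a remix really changes only one of them. A minimal sketch with hypothetical helpers (`changed_fields`, `is_single_variable_remix`) and made-up field names:

```python
def changed_fields(previous: dict[str, str], revised: dict[str, str]) -> list[str]:
    """List the fields whose values differ between two prompt versions."""
    keys = sorted(set(previous) | set(revised))
    return [key for key in keys if previous.get(key) != revised.get(key)]


def is_single_variable_remix(previous: dict[str, str], revised: dict[str, str]) -> bool:
    """True when exactly one field changed, per the remix rule above."""
    return len(changed_fields(previous, revised)) == 1


base = {"lighting": "soft window light", "action": "holds up the pouch", "audio": "faint AC hum"}
remix = dict(base, lighting="Golden Hour")
print(changed_fields(base, remix))  # ['lighting']
```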

13
Production Checklist
Run through this before every ad

Pre-Production

  • Reference video/image selected and screenshot taken
  • Product identified (name, type, key benefit)
  • Target platform chosen (TikTok/Reels/Shorts/Feed)
  • Character template created (if multi-shot)
  • One-benefit rule: single benefit identified for this ad

Script

  • Hook written via /hook-doctor (1.5s scroll-stopper)
  • Full script via /copy-alchemist (Hook > Benefit > CTA)
  • Filler words included ("uh", "like", "honestly")
  • Script fits 15-second format (3-8 sentences max)

Generation

  • Veo 3 prompt generated via /veo3-prompt (2048 chars, 6 sections)
  • OR storyboard images generated via /kieai
  • Images pass UGC authenticity check (imperfect framing, natural lighting)
  • Video clips animated via /kieai Kling (Pro model, 9:16)
  • Motion prompts use beats/counts, not vague descriptions

Post-Production

  • Voiceover added (ElevenLabs or Veo native audio)
  • Lip sync verified (if talking head)
  • Clips stitched in order (CapCut or similar)
  • No visible AI artifacts (distorted hands, wrong proportions)
  • No text overlays in the generated video (add in post only)
  • Exported at correct specs for target platform

Platform Export Specs

Platform | Aspect | Resolution | Duration
TikTok / Reels / Shorts | 9:16 | 1080x1920 | 6-30s
Instagram Feed | 1:1 | 1080x1080 | 15-60s
Facebook Feed | 4:5 | 1080x1350 | 15-30s
YouTube Pre-roll | 16:9 | 1920x1080 | 15-30s
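
The export specs can double as a pre-upload check. A short sketch that encodes the table as data; `EXPORT_SPECS` and `export_problems` are hypothetical names, and the values are copied from the table rather than from any platform's documentation.

```python
# (width, height) in pixels and (min, max) duration in seconds, per the table above.
EXPORT_SPECS = {
    "TikTok / Reels / Shorts": {"resolution": (1080, 1920), "duration_s": (6, 30)},
    "Instagram Feed": {"resolution": (1080, 1080), "duration_s": (15, 60)},
    "Facebook Feed": {"resolution": (1080, 1350), "duration_s": (15, 30)},
    "YouTube Pre-roll": {"resolution": (1920, 1080), "duration_s": (15, 30)},
}


def export_problems(platform: str, width: int, height: int, duration_s: float) -> list[str]:
    """Return a list of mismatches against the platform's spec (empty if none)."""
    spec = EXPORT_SPECS[platform]
    problems = []
    if (width, height) != spec["resolution"]:
        want_w, want_h = spec["resolution"]
        problems.append(f"resolution should be {want_w}x{want_h}, got {width}x{height}")
    low, high = spec["duration_s"]
    if not low <= duration_s <= high:
        problems.append(f"duration should be {low}-{high}s, got {duration_s}s")
    return problems
```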
Built with Claude Code — Veo 3 UGC System • April 2026