Airium
Course · viral formats · ~6 minutes

First 2 seconds: the science of the hook

Algorithms don't care what you shot with. They care whether people watch to the end. This course is about the mechanics of attention: hooks, structures, and 5 formats that AI generates better than any camera.

🪝 7 hook types · 📐 retention formula · 🎬 5 viral formats · ✍️ ready-made prompts
Start learning →
Course frames generated in Airium Studio
Lesson 1 · 2 minutes

Hook: 7 ways to stop the scroll

Viewers decide in 1.7 seconds. A hook isn't the beginning of a clip — it's the reason to watch it:

7 hook types · with prompt templates
1 · The Impossible
the brain is compelled to watch something that shouldn't be possible
"an elephant on a tightrope above a megacity, the camera drops down"
2 · Unfinished Action
a hand reaches for the button — what happens next?
"a finger slowly approaches the red button, everything trembles"
3 · Transformation
promise of metamorphosis in the first frame
"a drawing on paper comes to life and becomes real"
4 · Pattern Break
familiar scene + one wrong detail
"an ordinary office, but coffee pours upward"
5 · Face + Emotion
close-up of a reaction before the cause is revealed
"close-up: eyes widen in horror, reflection in the pupil"
6 · Sound Strike
Grok/Veo only: dialogue or sound from the very first frame
"a character looks into the camera and says: “Don’t scroll. Seriously.”"
7 · Scale
camera pulls back — and the entire understanding of the scene changes
"macro dew drop, sharp pull-back — it's a whale's eye"
🪝 Rule: the hook goes in the FIRST SENTENCE of the prompt, not the second. The AI renders what's written first.
🔊 Hooks 5 and 6 are Grok Imagine 1.5's superpower: a lip-synced line delivered straight to camera from the first frame, which a regular shoot can't do without an actor.
Lesson 2 · 1 minute

Structure: the retention formula

The hook buys 2 seconds. After that, structure takes over:

6-second clip formula
0–2 sec
HOOK
stop the scroll
2–4 sec
DEVELOPMENT
"what happens next"
4–6 sec
PAYOFF
resolution or loop
the full formula in one prompt (Grok, 6 sec)
"Coffee pours OUT of the cup UPWARD to the ceiling (hook), the barista notices and slowly looks up (development), the entire café is floating in the air, customers calmly drinking upside-down coffee (payoff), one static frame, smooth motion"
🔁 Loop: if the last frame ≈ the first, the clip loops and plays 2–3 times — algorithms count this as a rewatch. In Kling 3.0/2.1, set identical start and end frames.
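
The three-beat formula above can be sketched as a small prompt builder. This is only an illustration: `Beat` and `build_prompt` are hypothetical names, not part of Airium Studio or any model's API, and the timings are the 0–2 / 2–4 / 4–6 split from this lesson.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    label: str   # "hook", "development" or "payoff"
    start: int   # second at which the beat begins
    end: int     # second at which the beat ends
    text: str    # what the viewer sees during this beat


def build_prompt(beats, style="one static frame, smooth motion"):
    """Join beats in timeline order so the hook is written first."""
    ordered = sorted(beats, key=lambda b: b.start)
    if not ordered or ordered[0].label != "hook" or ordered[0].start != 0:
        raise ValueError("the hook must open the clip at 0 sec")
    # Each beat has to start exactly where the previous one ends.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.start != prev.end:
            raise ValueError(f"gap or overlap before '{cur.label}'")
    parts = [f"{b.text} ({b.label})" for b in ordered]
    return ", ".join(parts + [style])


beats = [
    Beat("hook", 0, 2, "Coffee pours OUT of the cup UPWARD to the ceiling"),
    Beat("development", 2, 4, "the barista notices and slowly looks up"),
    Beat("payoff", 4, 6, "the entire café is floating in the air, "
                         "customers calmly drinking upside-down coffee"),
]
prompt = build_prompt(beats)
print(prompt)
```

Run as is, it reproduces the café prompt from this lesson. Swapping the three `text` values gives a new clip with the same structure.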
Lesson 3 · 2 minutes

5 viral format machines

Formats where AI beats the camera — and which model to use for each:

format → model → prompt framework

🔄 Infinite zoom · Kling 3.0 multi-shot
"camera flies into a pupil → inside a city → flies into a window → inside a room → on the table, an eye" — each layer = a new multi-shot scene

🥛 Impossible physics ASMR · Hailuo 02
"a knife slices a glass orange, the segments chime and scatter as crystals, macro, slow motion" — Hailuo physics + calm rhythm = hypnotic scroll-stop

🗣 The impossible speaks · Grok 1.5 (sound!)
"a street pigeon looks into the camera and says tiredly: “You again with your bread”" — lip-sync on animals/objects is pure viral gold

⏪ Before/After · Kling 2.1 (start + end frame)
old courtyard (photo) → the same courtyard as a dream garden. Transitioning between YOUR OWN frames = personal story, comments guaranteed

📦 Hero product · Seedream → Seedance
product frame in Seedream → "a jar opens like petals, a glowing garden inside" in Seedance — ads people actually watch to the end

🎯 One clip = one idea. Viral videos are deceptively simple: an impossible thing + an everyday context.
Lesson 4 · 1 minute

The Series Approach: virality as a system

One viral clip is luck. A series is strategy:

series formula
series template
"{Object N} looks into the camera and says: “{tired phrase about its job}”" — a pigeon, an ATM, a traffic light, an elevator, a Wi-Fi router…
clip 1
🐦 pigeon
clip 2
🏧 ATM
clip 3
🚦 traffic light
+your variation
📈 Change ONE variable in the template, keep everything else the same — the audience recognizes the format and expects more.
💸 Series economics: make drafts on budget models (Seedance Fast, 480p) and render only the winning clip's final version in 720p/1080p.
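
The "change ONE variable" idea can be sketched as plain string templating. The names `TEMPLATE`, `VARIATIONS` and `build_series` are hypothetical, and only the pigeon line comes from this course; the ATM and traffic-light lines are made-up placeholders to show the pattern.

```python
# The fixed part of the series: everything except the object and its line.
TEMPLATE = '{subject} looks into the camera and says: "{line}"'

# One variable changes per clip. Only the pigeon line is from the course;
# the other two are invented placeholders for illustration.
VARIATIONS = [
    ("a street pigeon", "You again with your bread"),
    ("an ATM", "Checking the balance again won't change it"),
    ("a traffic light", "Red means red. Every single day"),
]


def build_series(template, variations):
    """Fill the same template once per (subject, line) pair."""
    return [template.format(subject=s, line=l) for s, l in variations]


prompts = build_series(TEMPLATE, VARIATIONS)
for number, text in enumerate(prompts, start=1):
    print(f"clip {number}: {text}")
```

Adding a clip to the series means adding one tuple; the template itself stays untouched, which is what keeps the format recognizable.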
Lesson 5 · 30 seconds

Pre-publish checklist

Run your clip through the checklist — each item increases your chances:

virality checklist
✅ Hook in the first 2 seconds (not titles, not a logo!)
✅ 9:16 vertical for Reels/Shorts/TikTok
✅ Has audio (native in Grok/Veo; for other models, add a track from Voiceover)
✅ Understandable without sound too (50% watch on mute)
✅ Loop or question at the end — reason to rewatch/comment
✅ 6–10 seconds: watch-through rate matters more than beauty
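
For anyone batch-producing a series, the checklist can be expressed as a simple pre-publish check. This is a minimal sketch: the `Clip` fields and `failed_checks` are hypothetical, and the values have to be filled in by hand, since nothing here inspects a real video file.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    hook_starts_at_sec: float          # when the hook becomes visible
    width: int                         # pixels
    height: int                        # pixels
    has_audio: bool
    clear_on_mute: bool
    ends_with_loop_or_question: bool
    duration_sec: float


def failed_checks(clip):
    """Return the checklist items the clip does not satisfy."""
    checks = {
        "hook in the first 2 seconds": clip.hook_starts_at_sec < 2,
        "9:16 vertical": clip.width * 16 == clip.height * 9,
        "has audio": clip.has_audio,
        "understandable without sound": clip.clear_on_mute,
        "loop or question at the end": clip.ends_with_loop_or_question,
        "6-10 seconds": 6 <= clip.duration_sec <= 10,
    }
    return [name for name, ok in checks.items() if not ok]


draft = Clip(hook_starts_at_sec=0.0, width=1920, height=1080,
             has_audio=True, clear_on_mute=True,
             ends_with_loop_or_question=False, duration_sec=14)
problems = failed_checks(draft)
print(problems)
# → ['9:16 vertical', 'loop or question at the end', '6-10 seconds']
```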

FAQ

Which AI model is best for viral videos?
It depends on the format: Grok Imagine 1.5 for talking characters and sound, Hailuo 02 for impossible physics and ASMR, Kling 3.0 for multi-shot and infinite zoom, Kling 2.1 for before/after transitions. All are available in Airium Studio.

What is a hook and why does it matter more than video quality?
A hook is the reason not to scroll past in the first 2 seconds. Algorithms rank by watch-through rate: an average clip with a strong hook outperforms a beautiful clip with a dull opening.

How much does it cost to make a viral AI video?
Drafts start at roughly 5–10 tokens (Seedance Fast, 480p); a final 6–10 second clip costs 18–50 tokens depending on the model. New users receive free tokens from Airium.

Can I publish AI-generated videos on TikTok and Reels?
Yes, generated content belongs to you. We recommend 9:16 vertical format and 6–10 second duration.
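
The cost answer above can be turned into a rough range. This sketch uses only the token figures quoted in this FAQ (5–10 per draft, 18–50 per final clip); the function name and the choice of ten drafts with one final are assumptions for illustration, and actual prices depend on the model.

```python
# Token ranges quoted in the FAQ: (low, high).
DRAFT_TOKENS = (5, 10)    # Seedance Fast, 480p draft
FINAL_TOKENS = (18, 50)   # final 6-10 second clip, model-dependent


def series_cost(n_drafts, n_finals=1):
    """Return the (low, high) token estimate for drafts plus finals."""
    low = n_drafts * DRAFT_TOKENS[0] + n_finals * FINAL_TOKENS[0]
    high = n_drafts * DRAFT_TOKENS[1] + n_finals * FINAL_TOKENS[1]
    return low, high


# Ten cheap drafts, then only the winner rendered at final quality.
print(series_cost(10))        # → (68, 150)
# For comparison: all ten variations rendered at final quality.
print(series_cost(0, 10))     # → (180, 500)
```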

Finale

Cheat sheet

What | How | Tip
Hook | first 2 seconds, in the first sentence of the prompt | 7 types from lesson 1
Structure | hook → development → payoff | loop = rewatches
9:16 format | Reels/Shorts/TikTok | set in the studio
Audio | Grok/Veo natively | or a track from Voiceover
Series | template × 10 variations | change one variable
Budget | draft 480p → final 720p+ | savings ×3–4

🪝 Hook bank

Impossible · unfinished · transformation · pattern break · face · sound · scale.

🔄 Loop

End = beginning → the clip plays on loop, the algorithm counts it as a rewatch.

🗣 Talking objects

Grok lip-syncs anything: a format no camera can match.

📈 System

Not one clip, but a series: the audience recognizes the format and comes back for more.
