How does Prizmad pricing work for Happy Horse?

Happy Horse charges per second of output, with the rate set by resolution: 720p costs 1 token per second and 1080p costs 2 tokens per second. A 5-second 1080p clip therefore costs 10 tokens, the same as a 10-second 720p clip. Tokens come from your Prizmad plan first; if they run out, you can buy a one-off top-up without changing your plan. There are no separate accounts to manage, no API key setup, and no extra billing.
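The per-second pricing can be checked with a few lines of Python. This is only a sketch of the arithmetic: the function name `clip_cost` and the rate table are illustrative and not part of any Prizmad API. The rates themselves come from the answer above.

```python
# Token rates per second of output, as stated in the pricing answer.
TOKENS_PER_SECOND = {"720p": 1, "1080p": 2}


def clip_cost(duration_seconds: int, resolution: str) -> int:
    """Return the token cost of one clip (hypothetical helper, not a Prizmad API)."""
    if resolution not in TOKENS_PER_SECOND:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return duration_seconds * TOKENS_PER_SECOND[resolution]


print(clip_cost(5, "1080p"))  # 10
print(clip_cost(10, "720p"))  # 10
```

Doubling the resolution tier doubles the cost at the same length, so a 1080p clip costs as much as a 720p clip twice as long.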

What's the difference between Image-to-Video and Reference-to-Video?

Image-to-Video takes a single starting frame and animates it — the model decides the motion and continuation from that frame. Reference-to-Video takes 1–9 character images as identity references; the prompt drives the scene and Happy Horse keeps each referenced character consistent throughout. Use Image-to-Video when you have an exact frame you want to animate; use Reference-to-Video when you have product or person photos that need to appear inside a new scene.

Does Happy Horse really generate audio with the video?

Yes. Happy Horse generates audio and video in a single forward pass: dialogue, ambient sound, Foley effects, and lip-sync are produced together, not added in post. Lip-sync supports English, Mandarin, Cantonese, Japanese, Korean, German, and French. The output MP4 file already contains the audio track.

How long does a generation take?

On Prizmad, the Happy Horse workspace shows an estimated time of about three minutes per clip. Actual time scales with duration and resolution: short 720p clips return faster than 15-second 1080p clips. Generations run asynchronously, so you can queue another while one is rendering.

Do I own the rights to videos I generate?

Yes. You retain full commercial rights to every video generated with Happy Horse on Prizmad — use them in paid ads, on landing pages, in social campaigns, and on owned channels with no extra licensing.

Happy Horse

Generate Happy Horse videos from prompts, a starting image, or up to 9 character references

Mode

Prompt

Describe the scene, character action, motion and camera. For references, mention character1, character2, etc. in image order.
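The character1, character2 naming can be sketched as a small helper. This is an illustration only: `label_references`, `unknown_labels`, and the file names are hypothetical, and the limit of nine images is taken from the reference count stated on this page.

```python
import re

MAX_REFERENCES = 9  # "up to 9 character references", per this page


def label_references(image_names: list[str]) -> dict[str, str]:
    """Map character1, character2, ... to images in upload order (hypothetical helper)."""
    if not 1 <= len(image_names) <= MAX_REFERENCES:
        raise ValueError(f"expected 1 to {MAX_REFERENCES} reference images")
    return {f"character{i}": name for i, name in enumerate(image_names, start=1)}


def unknown_labels(prompt: str, labels: dict[str, str]) -> list[str]:
    """Return characterN labels used in the prompt that have no matching image."""
    used = {f"character{n}" for n in re.findall(r"character(\d+)", prompt)}
    return sorted(used - set(labels))


labels = label_references(["barista.png", "customer.png"])
prompt = "character1 hands a coffee to character2 across the counter, slow push-in"
print(unknown_labels(prompt, labels))  # []
```

Checking the prompt against the uploaded images before submitting catches a label such as character3 that points at no reference.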

Resolution

1080p costs more tokens because Happy Horse pricing is per second by resolution.

Duration

Aspect Ratio

Aspect ratio is sent only for Text-to-Video and Reference-to-Video modes.

Seed

Overview

Happy Horse 1.0 is Alibaba's video model that took #1 on the Artificial Analysis Video Arena in April 2026, beating Veo 3 and Kling 3.0 in blind preference voting. Prizmad ships it as a built-in tool — text-to-video, image-to-video, and reference-to-video with up to nine character images, audio included.

Background

What is Happy Horse 1.0?

Happy Horse 1.0 is the video AI model Alibaba revealed on April 9, 2026, after it appeared as a stealth entry on the Artificial Analysis Video Arena and immediately ranked #1 in both Text-to-Video and Image-to-Video categories. Prizmad shipped it as a built-in tool on April 27, 2026 — the day it became publicly available.

Architecturally, it's a 15-billion-parameter transformer with 40 layers and a unified self-attention sequence: the middle 32 layers are shared across text, image, video, and audio, with no cross-attention modules. It generates audio and video in a single forward pass, so dialogue, ambient sound, Foley, and lip-sync are produced together rather than stitched in post.

On Prizmad, Happy Horse runs alongside ChatGPT Image 2, AI avatars, voiceover, and music inside one subscription. Pick a mode, write your prompt, attach images if you have them, and the model returns a 3- to 15-second clip at up to 1080p in 16:9, 9:16, 1:1, 4:3, or 3:4.
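The limits described on this page (3 to 15 seconds, 720p or 1080p, five aspect ratios, 1 to 9 reference images, aspect ratio sent only for Text-to-Video and Reference-to-Video) can be collected into one check. The sketch below is an assumption-laden illustration: `build_settings`, the mode strings, and the dictionary keys are hypothetical, and it assumes 720p and 1080p are the only resolutions offered. It does not reflect an actual Prizmad request format.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}  # assumed to be the only two tiers
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
MODES_WITH_ASPECT_RATIO = {"text-to-video", "reference-to-video"}
ALL_MODES = MODES_WITH_ASPECT_RATIO | {"image-to-video"}


def build_settings(mode, duration_seconds, resolution,
                   aspect_ratio="16:9", reference_count=0):
    """Validate clip settings and return them as a dict (hypothetical helper)."""
    if mode not in ALL_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if mode == "reference-to-video" and not 1 <= reference_count <= 9:
        raise ValueError("reference-to-video needs 1 to 9 character images")

    settings = {
        "mode": mode,
        "duration_seconds": duration_seconds,
        "resolution": resolution,
    }
    # Aspect ratio is included only for the modes that send it;
    # image-to-video omits it.
    if mode in MODES_WITH_ASPECT_RATIO:
        if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
        settings["aspect_ratio"] = aspect_ratio
    return settings


print(build_settings("text-to-video", 8, "1080p", aspect_ratio="9:16"))
print(build_settings("image-to-video", 5, "720p"))
```

The second call returns no `aspect_ratio` key, mirroring the note that aspect ratio is not sent in Image-to-Video mode.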

Use cases

Feature	Happy Horse 1.0	Veo 3	Kling 3.0
Artificial Analysis Elo (visual)	1381 (#1)	1217	1218–1242
Native audio in output	Yes (single pass)	Yes	Limited
Multilingual lip-sync	7 languages	Limited	Limited
Max resolution	1080p	1080p	1080p
Duration range	3–15 s	Up to 8 s	5–10 s
Reference characters	Up to 9	1	1
Available on Prizmad	Yes	No	Motion Control variant only
Best for	UGC ads, product reveals, multi-character	General text-to-video	Stylized motion, longer pans

What you can create with Happy Horse

UGC-style product ads

Product showcase reels

Multi-character scenes

Localized ad variants

Real clips generated with Happy Horse

Key capabilities

#1 on Artificial Analysis arena

Native audio with the video

Multilingual lip-sync

Up to nine reference characters

Cinematic motion and camera

1080p, 3–15 seconds

How to use Happy Horse on Prizmad

How Happy Horse fits Prizmad pricing

Pay per second of video

Included in every plan

Run out? Top up, don't upgrade

Happy Horse 1.0 vs Veo 3 vs Kling 3.0

Frequently asked questions