



// Node.js example: request a video generation (insert your API key after "Bearer ").
const main = async () => {
  const response = await fetch('https://api.ai.cc/v2/video/generations', {
    method: 'POST',
    headers: {
      Authorization: 'Bearer ',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'alibaba/wan2.2-vace-fun-a14b-pose',
      prompt: 'Mona Lisa puts on glasses with her hands.',
      video_url: 'https://storage.googleapis.com/falserverless/example_inputs/wan_animate_input_video.mp4',
      image_url: 'https://s2-111386.kwimgs.com/bs2/mmu-aiplatform-temp/kling/20240620/1.jpeg',
      resolution: '720p',
    }),
  }).then((res) => res.json());
  console.log('Generation:', response);
};

main();
# Python example: the same request using requests (insert your API key after "Bearer ").
import requests

def main():
    url = "https://api.ai.cc/v2/video/generations"
    payload = {
        "model": "alibaba/wan2.2-vace-fun-a14b-pose",
        "prompt": "Mona Lisa puts on glasses with her hands.",
        "video_url": "https://storage.googleapis.com/falserverless/example_inputs/wan_animate_input_video.mp4",
        "image_url": "https://s2-111386.kwimgs.com/bs2/mmu-aiplatform-temp/kling/20240620/1.jpeg",
        "resolution": "720p",
    }
    headers = {"Authorization": "Bearer ", "Content-Type": "application/json"}
    response = requests.post(url, json=payload, headers=headers)
    print("Generation:", response.json())

if __name__ == "__main__":
    main()
AI Playground

Test all API models in the sandbox environment before you integrate.
We provide more than 300 models to integrate into your app.


Product Detail
Model Overview
Wan 2.2 VACE Pose is an advanced video generation model from Alibaba's Wan team, built on VACE (all-in-one Video Creation and Editing) technology. It specializes in controllable video synthesis with fine-grained conditioning inputs such as Canny edge maps, depth maps, pose estimation, MLSD (line segment detection), and more. The model gives creators enhanced control over video content generation, producing high-quality, consistent, and fluid video sequences suited to professional and creative applications.
Technical Specifications
Performance Benchmarks
- ✔ Produces high-quality cinematic video sequences with smooth motion transitions and stable, flicker-free outputs.
- ✔ Significantly improved temporal consistency and semantic accuracy over Wan 2.1 models.
- ✔ Efficient memory use with optimized VAE compression enabling faster generation.
- ✔ Real-time video editing and generation workflows supported on modern GPUs (e.g., RTX 4090).
- ✔ Outperforms earlier versions in handling complex dynamic scenes involving multiple characters and camera movement.
- ✔ Superior fine-grained control for precise creative direction (lighting, composition, motion).
Key Features
- ★ Advanced Video Control: Supports Canny, Depth, Pose, MLSD maps, and trajectory parameters for fine-grained video content manipulation.
- ★ Multi-Resolution Support: Generates videos efficiently at 512, 768, and 1024 pixel resolutions.
- ★ Temporal Frame Handling: Trained on sequences of 81 frames at 16 FPS for smooth, coherent video generation.
- ★ Multilingual Input Support: Enables video generation with prompts in multiple languages.
- ★ Generative Flexibility: Capable of subject-specific video generation using tailored control inputs.
- ★ Open and Extensible: Fully open-source model weights and workflows available for customization and integration.
- ★ Stable Camera Control: Supports video editing with stable camera lens control and complex motion paths.
API Pricing
💲 360p: $0.0525
💲 540p: $0.07875
💲 720p: $0.105
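The page does not state the billing unit; assuming the figures above are flat per-generation prices in USD, batch costs can be estimated with a small helper:

```python
# Assumption: the listed prices are flat per-generation rates, not per-second.
PRICES = {"360p": 0.0525, "540p": 0.07875, "720p": 0.105}

def estimate_cost(resolution: str, num_videos: int) -> float:
    """Estimated total cost in USD for num_videos generations at a resolution."""
    return round(PRICES[resolution] * num_videos, 4)

print(estimate_cost("720p", 10))  # ten 720p generations -> 1.05
```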
Use Cases
- ➤ Professional video production and animation with precise pose and motion control.
- ➤ Cinematic storytelling and concept video creation.
- ➤ Digital art animation with enhanced stability and visual coherence.
- ➤ Real-time compositional editing for film, advertising, and multimedia projects.
- ➤ Interactive video content generation with multi-modal input triggers.
Code Sample
<snippet data-name="alibaba.create-video-to-video-generation" data-model="alibaba/wan2.2-vace-fun-a14b-pose"></snippet>
Comparison with Other Models
▶ vs Veo 3.0: Veo 3.0 focuses on fast text-to-video conversion with efficient rendering speeds but features limited direct control over spatial video elements. Wan 2.2 VACE Fun A14B Pose, while slower and more VRAM-intensive, allows detailed video editing and precise motion handling, making it better suited for high-quality production workflows.
▶ vs KLING 2.0: Both models provide advanced video generation capabilities with open-source codebases. Wan 2.2 VACE excels in adaptive compositional editing with multi-condition inputs (pose, depth, canny edges), while KLING 2.0 matches in raw video quality but lacks the fine-grained editable controls present in VACE 2.0.
▶ vs Wan 2.1 VACE: Wan 2.2 VACE Fun A14B Pose shows improved temporal consistency and visual fidelity over Wan 2.1 VACE, particularly in complex scenes involving multiple moving subjects. It reduces identity loss in face and pose representation more effectively and supports higher native resolutions (up to 1080p).
API Integration
Accessible via AI/ML API. Documentation: 🔗 Wan 2.2 VACE API Reference
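Video generation APIs often complete asynchronously, so a client may need to poll for the finished result rather than rely on the initial response. The per-generation status endpoint, the "status" field, and the terminal states below are assumptions for illustration; consult the Wan 2.2 VACE API Reference for the actual response contract:

```python
# Polling sketch for an asynchronous generation job. The endpoint path,
# field names, and terminal states are assumptions, not documented behavior.
import time
import requests

API_BASE = "https://api.ai.cc/v2/video/generations"

def is_terminal(job: dict) -> bool:
    """Assumed terminal states; adjust to the real API's status values."""
    return job.get("status") in ("completed", "failed")

def wait_for_video(generation_id: str, api_key: str, timeout: float = 600.0) -> dict:
    """Poll the (assumed) per-generation endpoint until the job finishes."""
    headers = {"Authorization": f"Bearer {api_key}"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = requests.get(f"{API_BASE}/{generation_id}", headers=headers)
        resp.raise_for_status()
        job = resp.json()
        if is_terminal(job):
            return job
        time.sleep(5)  # back off between status checks
    raise TimeoutError(f"generation {generation_id} did not finish within {timeout}s")
```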
Frequently Asked Questions (FAQ)
❓ What is Wan 2.2 Vace Fun A14B Pose and what makes it unique for character posing?
Wan 2.2 Vace Fun A14B Pose is a specialized AI model focused on generating and manipulating character poses with creative flair and anatomical accuracy. Its uniqueness lies in understanding human and creature anatomy, generating dynamic and expressive poses, and maintaining character consistency while exploring various stances, gestures, and action positions. The 'Vace Fun' aspect ensures poses are engaging, storytelling-focused, and visually appealing.
❓ What types of poses and gestures can this model generate most effectively?
The model excels at generating: dynamic action poses (fighting, sports, dance), expressive emotional gestures and body language, natural resting and casual positions, stylized cartoon and anime poses, creature and animal stances, interactive multi-character compositions, and pose sequences showing movement progression. It understands weight distribution, balance, and anatomical constraints to create believable and visually interesting poses.
❓ How does the 'Fun' aspect influence pose generation?
The 'Fun' aspect transforms pose generation from technical positioning to expressive storytelling by: creating poses with personality and character, suggesting dynamic and engaging compositions, adding whimsical or exaggerated elements when appropriate, maintaining a sense of movement and energy, and ensuring poses serve narrative purposes rather than just anatomical correctness. This makes generated poses more suitable for animation, illustration, and character design.
❓ What are the practical applications for AI-powered pose generation?
Practical applications include: character design and concept art, animation keyframe creation, storyboard and comic panel development, game character posing, reference image generation for artists, fashion and product posing, educational anatomy demonstrations, and social media content creation. It's particularly valuable for artists and creators who need quick pose references or inspiration for character expressions.
❓ What techniques yield the best pose generation results?
Best results come from: describing the character's emotion or action clearly, specifying the pose style (realistic, cartoon, anime), indicating camera angle and perspective, describing weight distribution and balance points, mentioning any props or environmental interactions, and providing context about the character's personality. Example: 'A superhero landing pose from low angle, dynamic impact with cape billowing, confident expression, realistic human proportions with slight stylistic exaggeration.'
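The prompting guidelines above can be sketched as a small prompt builder. The structure and field names are purely illustrative; the API accepts a free-form prompt string:

```python
# Illustrative helper assembling a pose prompt from the elements the FAQ
# recommends (action, camera angle, details, style). Not an API requirement.
def build_pose_prompt(action, style=None, angle=None, details=None):
    parts = [action]
    if angle:
        parts.append(f"from {angle}")
    if details:
        parts.append(details)
    if style:
        parts.append(f"{style} style")
    return ", ".join(parts)

prompt = build_pose_prompt(
    "A superhero landing pose",
    style="realistic with slight exaggeration",
    angle="a low angle",
    details="cape billowing, confident expression",
)
print(prompt)
```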
Learn how you can transform your company with AICC APIs


