



const main = async () => {
  const response = await fetch('https://api.ai.cc/v2/generate/video/alibaba/generation', {
    method: 'POST',
    headers: {
      // Insert your API key after "Bearer "
      Authorization: 'Bearer ',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'wan/v2.1/1.3b/text-to-video',
      prompt:
        'A DJ on the stand is playing, around a World War II battlefield, lots of explosions, thousands of dancing soldiers, between tanks shooting, barbed wire fences, lots of smoke and fire, black and white old video: hyper realistic, photorealistic, photography, super detailed, very sharp, on a very white background',
      aspect_ratio: '16:9',
    }),
  }).then((res) => res.json());
  console.log('Generation:', response);
};

main();
import requests


def main():
    url = "https://api.ai.cc/v2/generate/video/alibaba/generation"
    payload = {
        "model": "wan/v2.1/1.3b/text-to-video",
        "prompt": "A DJ on the stand is playing, around a World War II battlefield, lots of explosions, thousands of dancing soldiers, between tanks shooting, barbed wire fences, lots of smoke and fire, black and white old video: hyper realistic, photorealistic, photography, super detailed, very sharp, on a very white background",
        "aspect_ratio": "16:9",
    }
    # Insert your API key after "Bearer "
    headers = {"Authorization": "Bearer ", "Content-Type": "application/json"}
    response = requests.post(url, json=payload, headers=headers)
    print("Generation:", response.json())


if __name__ == "__main__":
    main()
AI Playground

Test all API models in the sandbox environment before you integrate.
Choose from more than 300 models to integrate into your app.


Product Detail
💡Overview:
Wan 2.1, developed by Alibaba's Wan AI team, is a state-of-the-art video foundation model designed for advanced generative video tasks. Supporting Text-to-Video (T2V), it incorporates groundbreaking innovations to deliver high-quality outputs with exceptional computational efficiency.
✨Key Features:
- Visual Text Generation: Generates text in both Chinese and English within videos.
- 3D Variational Autoencoder (Wan-VAE): Encodes and decodes unlimited-length 1080P videos with temporal precision.
- High-Quality Outputs: Produces visually dynamic and temporally consistent videos at resolutions of up to 720P.
🎯Intended Use:
Wan 2.1 is designed for applications in:
- Creative Industries: Video production.
- Content Generation: For social media and marketing campaigns.
- Automated Workflows: Involving multimedia processing.
🌍Language Support:
The model supports multilingual text generation, including Chinese and English.
⚙️Technical Details:
🏗️Architecture:
Wan 2.1 is built on the diffusion transformer paradigm with several innovative features:
- 3D Variational Autoencoder (Wan-VAE): Enhances spatio-temporal compression and ensures temporal causality during video generation.
- Video Diffusion DiT Framework: Uses Flow Matching with a T5 Encoder for text encoding and cross-attention layers embedded in transformer blocks.
🚀Performance Metrics:
Wan 2.1 achieves an impressive 84.7% VBench score, excelling in dynamic scenes, spatial consistency, and aesthetics. It generates 1080p video at 30 FPS with realistic motion, thanks to its advanced space-time attention mechanism. As a leading open-source video generation model, it rivals proprietary alternatives such as Sora, though proprietary models may still lead on certain metrics.
💻Usage:
Code Samples:
The model is available on the AI/ML API platform as "Wan 2.1".
Parameters:
- negative_prompt [str]: The negative prompt to use. Use it to address details that you don't want in the video (e.g., blurry, low resolution).
- seed [int]: Random seed for reproducibility. If None, a random seed is chosen.
- aspect_ratio [9:16, 16:9]: Aspect ratio of the generated video.
- inference_steps [int]: Number of inference steps for sampling. Higher values give better quality but take longer.
- guidance_scale [number]: Classifier-free guidance scale. Controls prompt adherence / creativity.
- shift [number]: Noise schedule shift parameter. Affects temporal dynamics.
- sampler ['unipc', 'dpm+']: The sampler to use for generation.
- enable_safety_checker [boolean]: Enables the content safety checker on generated output.
- enable_prompt_expansion [boolean]: Whether to enable prompt expansion.
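As a sketch of how these optional parameters might be combined in a request body (the values below are illustrative, and the exact accepted fields should be confirmed against the API documentation):

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder: substitute your AI/ML API key


def build_payload() -> dict:
    """Assemble an example request body using the optional parameters above."""
    return {
        "model": "wan/v2.1/1.3b/text-to-video",
        "prompt": "A lighthouse on a cliff at sunset, cinematic, photorealistic",
        "negative_prompt": "blurry, low resolution",  # details to keep out of the video
        "aspect_ratio": "16:9",
        "seed": 42,                 # fixed seed for reproducible output
        "inference_steps": 30,      # more steps: higher quality, slower generation
        "guidance_scale": 5.0,      # higher: closer prompt adherence, less variety
        "enable_safety_checker": True,
    }


def generate() -> None:
    response = requests.post(
        "https://api.ai.cc/v2/generate/video/alibaba/generation",
        json=build_payload(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )
    response.raise_for_status()  # surface HTTP errors early
    print("Generation:", response.json())


# generate()  # uncomment once API_KEY is set
```

Fixing `seed` while varying one parameter at a time (e.g., `guidance_scale`) is a convenient way to compare the effect of each setting on otherwise identical generations.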
To get the generated video:
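The retrieval endpoint and response schema are specified in the API documentation; the `generation_id` query parameter, the `status` field, and the GET-on-the-same-path pattern in this sketch are assumptions for illustration, not confirmed API details:

```python
import time

import requests

API_KEY = "YOUR_API_KEY"  # placeholder: substitute your AI/ML API key
BASE = "https://api.ai.cc/v2/generate/video/alibaba/generation"


def poll_for_video(generation_id: str, timeout_s: float = 300.0) -> dict:
    """Poll until generation completes or the timeout expires.

    Assumes (unverified) that a GET on the generation endpoint with the id
    returned by the creation call reports a `status` field and, on
    completion, a URL for the finished video.
    """
    headers = {"Authorization": f"Bearer {API_KEY}"}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        resp = requests.get(
            BASE, params={"generation_id": generation_id}, headers=headers
        )
        resp.raise_for_status()
        data = resp.json()
        if data.get("status") in ("completed", "failed"):
            return data
        time.sleep(5)  # back off between polls to avoid hammering the API
    raise TimeoutError("video generation did not finish in time")
```

Video generation is asynchronous by nature, so some form of polling (or a webhook, if the platform offers one) is needed between submitting the prompt and downloading the result.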
API Documentation:
Detailed API Documentation is available here.
✅Ethical Guidelines:
Alibaba emphasizes responsible usage of Wan 2.1 for ethical applications in content creation while discouraging misuse such as deepfake generation or inappropriate content creation.
📜Licensing:
Wan 2.1 is licensed under Apache 2.0, allowing both commercial and research use with transparent terms.
Get Wan 2.1 API here!
❓Frequently Asked Questions (FAQ):
- Q1: What is Wan 2.1?
- Wan 2.1 is an advanced video foundation model developed by Alibaba's Wan AI team, specializing in generative video tasks like Text-to-Video (T2V) with high-quality outputs and computational efficiency.
- Q2: What resolutions does Wan 2.1 support for video generation?
- The model produces visually dynamic and temporally consistent videos at resolutions up to 720P, while its Wan-VAE component can encode and decode 1080P video.
- Q3: Can Wan 2.1 generate text within videos, and in what languages?
- Yes, Wan 2.1 features visual text generation, supporting text embedding in both Chinese and English within the generated videos.
- Q4: What is the licensing model for Wan 2.1?
- Wan 2.1 is licensed under Apache 2.0, which permits both commercial and research use under transparent terms.
- Q5: How does Wan 2.1 compare to other video generation models?
- Wan 2.1 achieves an impressive 84.7% VBench score and is considered a leading open-source model. It rivals proprietary alternatives like Sora, though specific performance can vary across different metrics.
Learn how you can transform your company with AICC APIs