



const main = async () => {
  const response = await fetch('https://api.ai.cc/v2/generate/video/alibaba/generation', {
    method: 'POST',
    headers: {
      // Append your API key after "Bearer ".
      Authorization: 'Bearer ',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'alibaba/wan2.1-t2v-plus',
      prompt: 'A DJ on the stand is playing, around a World War II battlefield, lots of explosions, thousands of dancing soldiers, between tanks shooting, barbed wire fences, lots of smoke and fire, black and white old video: hyper realistic, photorealistic, photography, super detailed, very sharp, on a very white background',
      aspect_ratio: '16:9',
    }),
  }).then((res) => res.json());
  console.log('Generation:', response);
};
main();

import requests


def main():
    url = "https://api.ai.cc/v2/generate/video/alibaba/generation"
    payload = {
        "model": "alibaba/wan2.1-t2v-plus",
        "prompt": "A DJ on the stand is playing, around a World War II battlefield, lots of explosions, thousands of dancing soldiers, between tanks shooting, barbed wire fences, lots of smoke and fire, black and white old video: hyper realistic, photorealistic, photography, super detailed, very sharp, on a very white background",
        "aspect_ratio": "16:9",
    }
    # Append your API key after "Bearer ".
    headers = {"Authorization": "Bearer ", "Content-Type": "application/json"}
    response = requests.post(url, json=payload, headers=headers)
    print("Generation:", response.json())


if __name__ == "__main__":
    main()
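
The samples above build and send the request in one step. As a rough sketch, the request can instead be built separately so it can be inspected before anything is sent. The helper names `build_generation_request` and `send` are hypothetical and not part of the API; only the URL, the model name, and the body fields are taken from the samples above. The response shape is not documented on this page, so the sketch simply returns the decoded JSON.

```python
import json
import urllib.request

API_URL = "https://api.ai.cc/v2/generate/video/alibaba/generation"


def build_generation_request(prompt, api_key, aspect_ratio="16:9"):
    """Build (but do not send) the POST request used in the samples above."""
    if not api_key:
        raise ValueError("API key is empty; the Authorization header needs one")
    payload = {
        "model": "alibaba/wan2.1-t2v-plus",
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and return the decoded JSON body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `send(build_generation_request(prompt, api_key))` performs the same call as the samples; keeping the two steps apart means the payload can be checked without spending a generation.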

Product Detail
Alibaba's Wan2.1 Plus is a text-to-video model built to produce high-quality, cinematic video from text prompts. It uses multi-modal understanding to translate detailed prompts into visually coherent, dynamic clips. It is suited to large-scale video synthesis and gives prompt-level control over motion dynamics and scene composition, which makes it a practical tool for creative and professional work.
✨ Key Features & Technical Specifications
- ✔️ Video Generation Quality: Delivers high fidelity in dynamic motions, nuanced facial expressions, and intricate object interactions, ensuring professional-grade output.
- 🧠 Multi-step Reasoning: Shows strong contextual understanding of complex prompts, so the synthesized video stays close to user intent.
- 🎯 Instruction Following: Adheres more closely to user prompts and aims to preserve physical realism in generated video.
- 🎬 Text-to-Video Synthesis: Generates smooth, contextually accurate videos directly from natural language descriptions.
- 🖼️ Multi-modal Scene Understanding: Integrates scene layout, colors, lighting, and movement for cinematic, immersive visual effects.
- ⚙️ Fine Control: Supports detailed prompt-based tuning for aesthetic parameters, including precise adjustments to lighting, camera angles, and color tones.
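
Because the fine control described above is prompt-based, one way to keep aesthetic choices consistent across requests is to assemble the prompt from named parts. The following is a minimal sketch only: `compose_prompt` is a hypothetical helper, and the `lighting:`, `camera:`, and `color tone:` labels are an assumed wording convention, not parameters the API defines. How strongly the model responds to any particular phrasing is not documented here.

```python
def compose_prompt(subject, lighting=None, camera=None, color_tone=None):
    """Append optional aesthetic descriptors to a base prompt string."""
    parts = [subject.strip()]
    if lighting:
        parts.append(f"lighting: {lighting}")
    if camera:
        parts.append(f"camera: {camera}")
    if color_tone:
        parts.append(f"color tone: {color_tone}")
    # The result is plain text intended for the request's "prompt" field.
    return ", ".join(parts)
```

The returned string would be passed as the `prompt` value in the request body shown in the code samples.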
💰 API Pricing
Only $0.525 per video
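
For budgeting, the flat per-video price turns into a simple multiplication. The sketch below assumes the $0.525 figure applies uniformly to every generated video, with no volume discounts or extra fees; `estimate_cost` is a hypothetical helper. It uses `Decimal` so that totals are exact rather than subject to floating-point rounding.

```python
from decimal import Decimal

# Price quoted on this page, in US dollars per generated video.
PRICE_PER_VIDEO_USD = Decimal("0.525")


def estimate_cost(video_count):
    """Return the estimated total in USD for a number of generated videos."""
    if video_count < 0:
        raise ValueError("video_count must not be negative")
    return PRICE_PER_VIDEO_USD * video_count
```

For example, a batch of 40 videos would come to 40 × 0.525 = 21 USD under these assumptions.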
💡 Optimal Use Cases
- 🎥 Creative Content Production: Ideal for filmmaking, advertising, and storyboarding workflows that demand high-definition video output generated from text.
- 📚 Visual Storytelling: Transforms textual narratives into dynamic, richly detailed visuals, bringing stories to life with unprecedented ease.
- 🎮 Interactive Media & Entertainment: Facilitates the rapid development of visual assets from script or dialogue inputs for games and interactive experiences.
- 📈 Business Presentations & Marketing: Enables the generation of tailored video content, significantly enhancing communication impact in business contexts.
⚖️ Comparison with Other Models
- Vs. Wan2.2-T2V: Wan2.1-T2V-Plus provides solid performance focusing on cost-effective 1080P video generation, whereas Wan2.2 offers advancements with larger parameter models and a multi-expert architecture for superior aesthetics and efficiency.
- Vs. Gemini 2.5 Flash: Wan2.1 delivers competitive text-to-video capabilities, proving particularly valuable for 1080P generation tasks where cost-efficiency is a primary concern.
- Vs. OpenAI GPT-4 Vision: Wan2.1 is dedicated to synthesizing video from text, with pricing aimed at higher-resolution output, whereas GPT-4 Vision's strengths lie in broader multimodal conversation.
⚠️ Limitations
- Minor Artifacts: Some generated videos may exhibit minor artifacts or inconsistencies, especially with highly complex prompts. While advanced tuning can mitigate these, complete elimination is not always guaranteed.
- Video Length: Currently optimized primarily for 5-second video clips. Generating longer videos may require additional processing steps or resources.
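
One possible way to work within the 5-second optimization is to plan a longer piece as a sequence of short clips, each generated by its own request and joined afterwards. This page does not document how longer videos should be produced, so the sketch below only covers the planning arithmetic; `plan_clips` is a hypothetical helper, and stitching the clips together is left to an external editing step.

```python
import math

# Clip length the model is described as being optimized for, in seconds.
CLIP_SECONDS = 5


def plan_clips(total_seconds):
    """Split a target duration into (start, end) ranges of at most 5 seconds."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    count = math.ceil(total_seconds / CLIP_SECONDS)
    return [
        (i * CLIP_SECONDS, min((i + 1) * CLIP_SECONDS, total_seconds))
        for i in range(count)
    ]
```

A 12-second target, for instance, splits into three ranges, the last one shorter than the rest. If each clip is billed as one video, the number of ranges is also the number of billed generations.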
❓ Frequently Asked Questions (FAQ)
Q: What is Alibaba Wan2.1 Plus primarily designed for?
A: Alibaba Wan2.1 Plus is an advanced AI model specifically designed for high-quality, cinematic text-to-video generation, excelling in translating textual prompts into visually coherent video outputs.
Q: What kind of control does Wan2.1 Plus offer over video generation?
A: It provides fine control over aesthetic parameters, allowing detailed prompt-based tuning for lighting, camera angles, and color tones to achieve desired cinematic effects.
Q: How does its pricing compare to other models?
A: Wan2.1 Plus offers competitive pricing at $0.525 per video, making it particularly valuable for cost-sensitive 1080P video generation tasks compared to some broader multimodal AI models.
Q: What are the main limitations of Wan2.1 Plus?
A: Primary limitations include potential minor artifacts with complex prompts and current optimization mainly for 5-second video clips, requiring additional processing for longer durations.
Q: In what industries can Wan2.1 Plus be optimally utilized?
A: It is optimally utilized in creative content production (filmmaking, advertising), visual storytelling, interactive media and entertainment, and for enhancing business presentations and marketing.