Aura 2
With high concurrency support and cost-efficient pricing, Aura 2 enables seamless, clear, and responsive voice AI interactions for industries like finance, healthcare, and customer support.
Text to Speech
const fs = require('fs');
const path = require('path');

const axios = require('axios').default;
// Insert your API key after 'Bearer '.
const api = axios.create({
  baseURL: 'https://api.ai.cc/v1',
  headers: { Authorization: 'Bearer ' },
});

const main = async () => {
  const response = await api.post(
    '/tts',
    {
      model: '#g1_aura-2-amalthea-en',
      text: 'Hi! What are you doing today?',
    },
    { responseType: 'stream' },
  );

  const dist = path.resolve(__dirname, './audio.wav');
  const writeStream = fs.createWriteStream(dist);

  response.data.pipe(writeStream);

  writeStream.on('close', () => console.log('Audio saved to:', dist));
};

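// Report request failures instead of leaving an unhandled promise rejection.
main().catch((err) => console.error('TTS request failed:', err.message));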

                                
import os
import requests


def main():
    url = "https://api.ai.cc/v1/tts"
    headers = {
        # Insert your API key after "Bearer ".
        "Authorization": "Bearer ",
    }
    payload = {
        "model": "#g1_aura-2-amalthea-en",
        "text": "Hi! What are you doing today?",
    }

    response = requests.post(url, headers=headers, json=payload, stream=True)
    # Stop on an HTTP error instead of saving the error body as audio.
    response.raise_for_status()
    dist = os.path.join(os.path.dirname(__file__), "audio.wav")

    with open(dist, "wb") as write_stream:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                write_stream.write(chunk)

    print("Audio saved to:", dist)


main()

Product Detail

🌟 Aura-2 by Deepgram: Enterprise Text-to-Speech Excellence

Deepgram's Aura-2 is a state-of-the-art text-to-speech (TTS) solution engineered specifically for enterprise applications. It delivers live, natural voice synthesis with unparalleled clarity and accurate domain-specific pronunciations.

Designed for flexibility, Aura-2 offers versatile deployment options, including cloud and on-premise environments, ensuring instant, context-sensitive speech creation for critical applications like voice agents, interactive voice response (IVR) systems, and advanced AI conversations.

⚙️ Technical Specifications

  • ⚡ Latency: Consistent sub-200ms time to first byte (TTFB).
  • 💻 Inference Tech: GPU-accelerated streaming-first architecture with quantization and pruning for efficiency.
  • 📈 Scalability: Stateless distributed runtime allows for rapid, bottleneck-free scaling.
  • 🔒 Security: Built with enterprise-grade deployment and data locality compliance in mind.

📊 Performance Benchmarks

  • ✓ Achieves sub-200ms TTFB latency for ultra-responsive conversational flow.
  • ✓ Real-Time Factor (RTF) of 0.111x, generating 1 second of audio in about 111 milliseconds.
  • ✓ Supports thousands of concurrent sessions with consistent low latency and high-quality output.
  • ✓ Maintains minimal variance and low maximum latency even under high concurrency, critical for real-time virtual agents.
  • ✓ Outperforms many competitors by consistently staying below the 200ms conversational threshold.
  • ✓ Designed with GPU-accelerated and optimized streaming-first Enterprise Runtime for fast inference.
  • ✓ Flexible deployment on cloud, VPC, or on-premises to reduce roundtrip delays and meet compliance needs.
  • ✓ Stateless distributed runtime architecture enables rapid scaling and efficient load balancing.
Figure: Deepgram Aura-2 performance comparison.
Aura-2 consistently outperforms competitors like ElevenLabs and OpenAI’s TTS solutions in latency-sensitive enterprise contexts.
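
The Real-Time Factor quoted above can be turned into a quick estimate of synthesis time. The sketch below is plain arithmetic on the 0.111x figure from this page; the function name is illustrative, and it assumes the RTF stays constant regardless of clip length, which the page does not state.

```python
def synthesis_time_ms(audio_seconds: float, rtf: float = 0.111) -> float:
    """Estimate the compute time, in milliseconds, needed to synthesize a clip.

    RTF is the ratio of processing time to audio duration, so the estimate is
    simply audio_seconds * rtf. The default of 0.111 is the figure quoted on
    this page; a constant RTF across clip lengths is an assumption.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    return audio_seconds * rtf * 1000.0


if __name__ == "__main__":
    for seconds in (1.0, 10.0, 60.0):
        print(f"{seconds:>5.1f} s of audio -> ~{synthesis_time_ms(seconds):.0f} ms to synthesize")
```

Because the RTF is well below 1, audio is produced faster than it plays back, so a streaming client should not run out of buffered audio once playback has started.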

💲 API Pricing

💰 $0.0315/1k characters
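
For budgeting, the per-character rate can be applied to an expected character volume. This is a minimal sketch that assumes billing is strictly pro rata at the listed rate, with no minimum charge or rounding; those billing details are not stated on this page, and the function name is illustrative.

```python
from decimal import Decimal

# USD per 1,000 characters, taken from the pricing line above.
PRICE_PER_1K_CHARS = Decimal("0.0315")


def estimate_cost_usd(char_count: int) -> Decimal:
    """Estimate the cost of synthesizing `char_count` characters.

    Assumes pro rata billing with no minimum charge or rounding (an assumption,
    not something this page confirms).
    """
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    return Decimal(char_count) / Decimal(1000) * PRICE_PER_1K_CHARS


if __name__ == "__main__":
    sample = "Hi! What are you doing today?"
    print("Sample request:", estimate_cost_usd(len(sample)), "USD")
    print("One million characters:", estimate_cost_usd(1_000_000), "USD")
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding drift.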

✨ Key Features of Aura-2

  • Real-Time Performance: Sub-200ms TTFB latency ensures natural, fluid conversations.
  • Fast Audio Generation: RTF of 0.111x, synthesizing 1 second of audio in just over 100 ms.
  • 🔍 Domain-Specific Accuracy: Superior pronunciation for currency, dates, technical terms, and more.
  • 💻 Enterprise Scalability: Supports thousands of concurrent sessions without latency degradation.
  • 📧 Deployment Flexibility: Available via REST and WebSocket APIs; deployable on private clouds, VPCs, or on-premises.
  • 🎤 Broad Voice Catalog: 40+ professional voices tailored for diverse contexts and tones.
  • 🌐 Multilingual Future-Proofing: Primarily English, with Spanish voices available and support for more languages planned.

🗣️ Model Variants Overview: English Voices

Deepgram Aura-2 offers a rich catalog of voices, each optimized for specific enterprise usage and voice characteristics:

  • aura-2-amalthea-en: Warm, approachable female voice for customer support.
  • aura-2-andromeda-en: Clear, authoritative male voice suited for financial domains.
  • aura-2-apollo-en: Energetic, youthful male voice for marketing and retail.
  • aura-2-arcas-en: Calm, neutral male voice ideal for healthcare communications.
  • aura-2-aries-en: Strong, confident male voice for technical support.
  • aura-2-asteria-en: Soft, caring female voice targeting education and training.
  • aura-2-athena-en: Professional, articulate female voice for legal and corporate sectors.
  • aura-2-atlas-en: Deep, steady male voice designed for logistics and transportation.
  • aura-2-aurora-en: Bright, clear female voice for media and broadcasting.
  • aura-2-callista-en: Friendly, engaging female voice for customer engagement.
  • aura-2-cora-en: Warm and friendly female voice, perfect for customer engagement and educational content.
  • aura-2-cordelia-en: Clear and professional female voice ideal for corporate training and support calls.
  • aura-2-delia-en: Calm, empathetic female voice designed for healthcare and wellness applications.
  • aura-2-draco-en: Assertive male voice well suited for technical support and financial services.
  • aura-2-electra-en: Energetic and dynamic female voice for marketing and retail promotions.
  • aura-2-harmonia-en: Balanced female voice offering clarity and a soothing tone for voice assistants.
  • aura-2-helena-en: Articulate female voice with a corporate tone, suitable for legal and business sectors.
  • aura-2-hera-en: Confident female voice ideal for education and training modules.
  • aura-2-hermes-en: Clear and authoritative male voice, fit for executive communications and announcements.
  • aura-2-hyperion-en: Deep, steady male voice crafted for logistics, transportation, and industrial use cases.
  • aura-2-iris-en: Bright and engaging female voice for media and broadcasting contexts.
  • aura-2-janus-en: Versatile male voice suitable for multi-purpose enterprise applications.
  • aura-2-juno-en: Friendly, approachable female voice for customer service and support channels.
  • aura-2-jupiter-en: Powerful, confident male voice tailored for financial and advisory services.
  • aura-2-luna-en: Soft and gentle female voice preferred in healthcare and personal coaching.
  • aura-2-mars-en: Strong and clear male voice designed for technical and operational environments.
  • aura-2-minerva-en: Intelligent, polished female voice, effective for training and educational use.
  • aura-2-neptune-en: Calm male voice well suited for meditation and wellness apps.
  • aura-2-odysseus-en: Narrative-style male voice designed for storytelling and guided tours.
  • aura-2-ophelia-en: Warm female voice with empathetic intonation for service industries.
  • aura-2-orion-en: Bold male voice for authoritative announcements and industrial contexts.
  • aura-2-orpheus-en: Smooth male voice with artistic tone, suited for media and creative applications.
  • aura-2-pandora-en: Engaging female voice crafted for marketing and promotions.
  • aura-2-phoebe-en: Clear, professional female voice ideal for e-learning and corporate communications.
  • aura-2-pluto-en: Deep male voice with a calm demeanor perfect for narration and voice-overs.
  • aura-2-saturn-en: Strong male voice tailored for customer support and financial sectors.
  • aura-2-selene-en: Soft female voice ideal for wellness, mindfulness, and personal care apps.
  • aura-2-thalia-en: Bright and dynamic female voice, great for retail and promotional content.
  • aura-2-theia-en: Professional female voice suitable for healthcare and legal domains.
  • aura-2-vesta-en: Clear female voice with steady pace designed for technical and customer service roles.
  • aura-2-zeus-en: Commanding, powerful male voice perfect for executive announcements and presentations.

Each voice is crafted with distinct tonal qualities and enterprise context appropriateness, ensuring businesses can select the perfect voice for their brand identity and use case.
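
To switch voices in the sample request at the top of this page, only the model field changes. The helper below is a hypothetical sketch: it assumes every voice uses the same "#g1_" prefix that appears in the sample's model value, which this page shows only for aura-2-amalthea-en.

```python
# Prefix seen in the sample request on this page. Applying it to every voice
# is an assumption, not something the page confirms.
MODEL_PREFIX = "#g1_"


def build_tts_payload(voice: str, text: str) -> dict:
    """Build a JSON body for the /tts endpoint used in the sample request."""
    if not voice.startswith("aura-2-"):
        raise ValueError(f"expected an Aura-2 voice name, got {voice!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return {"model": f"{MODEL_PREFIX}{voice}", "text": text}


if __name__ == "__main__":
    print(build_tts_payload("aura-2-arcas-en", "Your appointment is confirmed."))
```

The returned dictionary can be passed as the json argument of requests.post in the Python sample.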

🌍 Spanish Voice Variants

  • aura-2-celeste-es: Clear and friendly female Spanish voice for broad customer engagement.
  • aura-2-estrella-es: Warm and articulate female Spanish voice tailored for educational and media use.
  • aura-2-nestor-es: Assertive male Spanish voice designed for professional and corporate settings.

🎯 Common Use Cases

  • 👤 Real-time conversational voice AI agents
  • 📞 Interactive Voice Response (IVR) systems
  • 💬 Customer support automation
  • 📢 Transactional notifications (reminders, alerts)
  • 🔍 Domain-specific voice assistants requiring accurate pronunciation
  • 🏠 On-premises deployments for sensitive data environments

🆚 Comparison with Other Models

Deepgram Aura-2 vs. ElevenLabs Flash

Aura-2 excels in real-time enterprise use with its consistent sub-200ms latency and flexible deployment (including on-premises and VPC). While ElevenLabs Flash offers very fast generation (~75ms start time), it has plan restrictions and is cloud-only. Aura-2 is also approximately 40% more cost-effective for large-scale business operations.

Deepgram Aura-2 vs. OpenAI TTS

Aura-2 surpasses OpenAI’s TTS in latency performance, maintaining consistent sub-200ms response even under high concurrency, which is crucial for live agents and IVRs. OpenAI’s TTS prioritizes voice expressiveness for offline or media applications, trading some real-time speed. Aura-2's architecture is optimized for throughput and scalability in demanding enterprise environments.

Deepgram Aura-2 vs. Cartesia Sonic

Aura-2 offers a more affordable per-character cost and lower latency than Cartesia Sonic, alongside supporting distributed and on-premises deployments. Cartesia Sonic is primarily cloud-based with higher latency (~300ms), making Aura-2 better suited for use cases requiring rapid, natural conversations. Aura-2’s specialized runtime provides lower infrastructure overhead at scale.

❓ Frequently Asked Questions (FAQ)

Q: What makes Aura-2 unique in the AI model landscape?

A: Aura-2 is a cutting-edge text-to-speech solution crafted for enterprise applications requiring live, natural voice synthesis. Its uniqueness lies in its exceptional clarity, accurate domain pronunciations, flexible deployment options (cloud or on-prem), and consistent sub-200ms latency even under high concurrency.

Q: What specific capabilities does Aura-2 offer for real-time voice synthesis?

A: Aura-2 delivers sub-200ms Time-To-First-Byte (TTFB) latency and achieves a Real-Time Factor (RTF) of 0.111x, meaning it generates 1 second of audio in just over 100 milliseconds. This ensures ultra-responsive, natural conversational flow crucial for live voice agents and IVR systems.
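
Time to first byte can be checked from the client side by timing a streamed response. The following is a minimal sketch, not an official benchmark method: time_stream is an illustrative helper that accepts any function returning an iterator of byte chunks, and the fake stream stands in for a real request so the example runs offline.

```python
import time
from typing import Callable, Iterable, Tuple


def time_stream(start_stream: Callable[[], Iterable[bytes]]) -> Tuple[float, float, int]:
    """Return (seconds to first non-empty chunk, total seconds, total bytes)."""
    started = time.perf_counter()
    first_chunk_at = None
    total_bytes = 0
    for chunk in start_stream():
        if not chunk:
            continue  # ignore keep-alive or empty chunks
        if first_chunk_at is None:
            first_chunk_at = time.perf_counter() - started
        total_bytes += len(chunk)
    total = time.perf_counter() - started
    if first_chunk_at is None:
        raise RuntimeError("stream produced no data")
    return first_chunk_at, total, total_bytes


def fake_stream() -> Iterable[bytes]:
    """Stand-in for a streamed HTTP response; delays and sizes are made up."""
    time.sleep(0.05)
    yield b"\x00" * 1024
    time.sleep(0.05)
    yield b"\x00" * 1024


if __name__ == "__main__":
    ttfb, total, size = time_stream(fake_stream)
    print(f"TTFB {ttfb * 1000:.0f} ms, total {total * 1000:.0f} ms, {size} bytes")
```

With the requests library, the same helper could wrap a real call, for example time_stream(lambda: requests.post(url, headers=headers, json=payload, stream=True).iter_content(chunk_size=8192)). A client-side figure includes network round-trip time, so it is an upper bound on the service's own TTFB.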

Q: How does Aura-2 handle domain-specific pronunciations?

A: Aura-2 is designed with superior pronunciation accuracy for complex terms including currency, dates, technical jargon, URLs, and addresses, making it ideal for specialized enterprise applications where precision is paramount.

Q: What are the deployment options for Deepgram Aura-2?

A: Aura-2 offers extensive deployment flexibility. It can be accessed via REST and WebSocket APIs and can be deployed on public clouds, Virtual Private Clouds (VPCs), or fully on-premises to meet specific security, compliance, and latency requirements.

Q: How does Aura-2 compare in terms of cost-effectiveness for large-scale use?

A: For large-scale business applications, Aura-2 is notably cost-effective. For instance, it is approximately 40% more affordable per character compared to some competitors like ElevenLabs Flash, while also providing superior latency and deployment flexibility crucial for enterprise needs.
