gpt-image-1

OpenAI’s GPT-Image-1 is a GPT-4–class multimodal transformer that converts natural-language prompts (and reference images) into high-fidelity, typography-accurate pictures and in-place edits with enterprise-grade safety via a production API.

Free $1 Tokens for New Members

Javascript

const main = async () => {
  const response = await fetch('https://api.ai.cc/v1/images/generations', {
    method: 'POST',
    headers: {
      // Replace <YOUR_API_KEY> with your own API key.
      Authorization: 'Bearer <YOUR_API_KEY>',
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      prompt: 'A jellyfish in the ocean',
      model: 'openai/gpt-image-1',
    }),
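  });

  // Fail loudly on HTTP errors instead of logging an error body as a result.
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();

  console.log('Generation:', data);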
};

main();

Python

import requests


def main():
    response = requests.post(
        "https://api.ai.cc/v1/images/generations",
        headers={
            # Replace <YOUR_API_KEY> with your own API key.
            "Authorization": "Bearer <YOUR_API_KEY>",
            "Content-Type": "application/json",
        },
        json={
            "prompt": "A jellyfish in the ocean",
            "model": "openai/gpt-image-1",
        },
    )

    response.raise_for_status()
    data = response.json()

    print("Generation:", data)


if __name__ == "__main__":
    main()
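
The samples above only print the raw JSON. As a minimal sketch of what to do next, the helper below writes returned images to disk. It assumes an OpenAI-style response shape, `{"data": [{"b64_json": "..."}]}` or `{"data": [{"url": "..."}]}`, which this page does not confirm; the name `save_images` is hypothetical.

```python
import base64
from pathlib import Path


def save_images(response_json, out_dir="out", ext="png"):
    """Write every base64-encoded image in the response to out_dir.

    Assumption: the response looks like {"data": [{"b64_json": "..."}, ...]}.
    Items that only carry a "url" are not downloaded; their URLs are
    returned so the caller can fetch them separately.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    saved, urls = [], []
    for index, item in enumerate(response_json.get("data", [])):
        if item.get("b64_json"):
            target = out_path / f"image_{index}.{ext}"
            target.write_bytes(base64.b64decode(item["b64_json"]))
            saved.append(target)
        elif item.get("url"):
            urls.append(item["url"])
    return saved, urls
```

In the Python sample, this could be called as `save_images(data)` right after `data = response.json()`.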

One API 300+ AI Models

Save 20% on Costs & $1 Free Tokens

✨ GPT-Image-1: High-Fidelity AI Image Generation & Editing

OpenAI's GPT-Image-1 is a groundbreaking natively multimodal generative transformer designed for high-fidelity text-to-image creation and editing. This advanced model extends a GPT-4-class decoder with specialized visual token embeddings and cross-modal attention. This unique architecture empowers it to accurately follow intricate design instructions, leverage extensive world knowledge, and precisely render on-image text, making it a powerful tool for a wide range of visual content needs.

🚀 Technical Specifications

Performance Benchmarks

GPT-Image-1 is optimized for image generation and visual content creation:

• Architecture: GPT-4-derived decoder integrated with vision adapters and an additional masked-editing head for advanced in-painting capabilities.
• Native Output Sizes: Supports 1024x1024 px square, with portrait (1024x1536 px) and landscape (1536x1024 px) variants. On-demand 4K upscaling is also available.
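
Because only three native sizes are listed, a caller with an arbitrary target canvas has to choose one of them. The sketch below picks the size whose aspect ratio is closest to the requested one. It reads each size as width x height, so 1024x1536 is the tall option; the helper name `pick_native_size` is hypothetical and not part of any API.

```python
import math

# Native sizes from the list above, read as (width, height) in pixels.
NATIVE_SIZES = [(1024, 1024), (1024, 1536), (1536, 1024)]


def pick_native_size(width, height):
    """Return the native size string closest in aspect ratio to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Compare ratios on a log scale so 2:1 and 1:2 are treated symmetrically.
    target = math.log(width / height)
    best = min(
        NATIVE_SIZES,
        key=lambda wh: abs(math.log(wh[0] / wh[1]) - target),
    )
    return f"{best[0]}x{best[1]}"
```

The returned string can be passed directly as the `size` parameter of a generation request.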

API Pricing Overview

• Text Tokens Input: $5.25 per 1M tokens
• Image Tokens Input: $10.50 per 1M tokens
• Low Quality Price per Image Generation:
- 1024x1024: $0.0116
- 1024x1536: $0.017
- 1536x1024: $0.017
• Medium Quality Price per Image Generation:
- 1024x1024: $0.044
- 1024x1536: $0.066
- 1536x1024: $0.066
• High Quality Price per Image Generation:
- 1024x1024: $0.175
- 1024x1536: $0.263
- 1536x1024: $0.263
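
To make the per-image prices concrete, here is a small sketch that estimates the generation cost of a batch. The prices are copied from the list above; the function name `estimate_generation_cost` is hypothetical, and the estimate covers per-image charges only, since the page does not state how input-token charges combine with them.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the pricing list above.
PRICE_PER_IMAGE = {
    "low": {
        "1024x1024": Decimal("0.0116"),
        "1024x1536": Decimal("0.017"),
        "1536x1024": Decimal("0.017"),
    },
    "medium": {
        "1024x1024": Decimal("0.044"),
        "1024x1536": Decimal("0.066"),
        "1536x1024": Decimal("0.066"),
    },
    "high": {
        "1024x1024": Decimal("0.175"),
        "1024x1536": Decimal("0.263"),
        "1536x1024": Decimal("0.263"),
    },
}


def estimate_generation_cost(quality, size, n=1):
    """Return the per-image generation cost in USD for n images.

    Input-token charges are excluded from this estimate.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    try:
        unit_price = PRICE_PER_IMAGE[quality][size]
    except KeyError:
        raise ValueError(f"unsupported quality/size: {quality}/{size}") from None
    return unit_price * n
```

`Decimal` is used instead of floats so that sums over many images do not pick up binary rounding noise.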

Performance Metrics & Achievements

⭐ GIE-Bench (2025): GPT-Image-1 achieved the highest functional-correctness scores among all tested models in a 1,000-task grounded image-editing benchmark, while also maintaining strong content preservation. For details, refer to the original research: GIE-Bench (2025).
✍️ STRICT Text-Rendering Stress-Test: GPT-Image-1, marketed inside ChatGPT as "GPT-4o images," is one of only two proprietary models to maintain low error rates on multi-line text up to ≈800 characters, significantly outperforming open-source diffusion models. See the full report: STRICT text-rendering stress-test.
📈 Enterprise Roll-outs: Early adopters including Adobe Firefly, Figma Design, Canva, and Wix have reported "double-digit prompt-to-asset speed-ups" after integrating GPT-Image-1. Read more on its impact: OpenAI ChatGPT Image Generation Model: Adobe, Figma.

💡 Key Capabilities of GPT-Image-1

GPT-Image-1 consistently delivers precise visual outputs, making it well suited to complex creative workflows:

🎨 Multi-Style Generation: Generate photorealism, illustration, anime, vector art, 3D renders, and data visualizations all from a single endpoint.
✍️ Accurate Typography: Create posters, UI mocks, and multi-line labels with clean, legible text, even when using small fonts.
🌍 World-Knowledge Synthesis: Leverages the GPT-4o family’s language grounding to accurately place branded items, real people, or factual diagrams within images.
🔒 Enterprise-Grade Safety: Features provenance watermarking, tunable moderation, and a commitment to no training on customer data, ensuring alignment with legal and brand-safety requirements.

Example of a generated image with high quality parameters, created with the prompt: “Generate an anime image of a hedgehog holding a paper that says Try GPT-Image-1 today with AI/ML API.”

GPT-Image-1 Example Generation

🎯 Optimal Use Cases

• Creative & Marketing: Social media ads, hero shots, product lifestyle renders.
• Design Prototyping: Rapid concept art, theme exploration, on-canvas edits within tools like Figma or Adobe.
• E-commerce: Background removal, colorway variations, staged scenes for product catalogs.
• Education & Publishing: Diagrams, flashcards, worksheet graphics with embedded text.
• Game / Film Pre-production: Storyboards, environment studies, quick asset variations.
• Enterprise Reporting: Auto-generated infographics and data-visuals directly from analytical text.

🛠️ Code Samples & Parameters

Text-to-Image Code Sample

The JavaScript and Python snippets near the top of this page show a minimal text-to-image request against the /v1/images/generations endpoint.

Text-to-Image Parameters

• prompt [str]: The text prompt detailing the image's content, style, or composition.
• n [1-10]: Number of images to generate.
• output_compression [int]: Compression level (0-100%) for generated images.
• size [1024x1024, 1024x1536, 1536x1024]: Desired size of the generated image.
• background [transparent, opaque, auto]: Sets background transparency. 'Auto' lets the model decide. 'Transparent' requires 'png' or 'webp' output format.
• moderation [low, auto]: Controls the content moderation level.
• output_format [png, jpeg, webp]: Format of the generated image.
• quality [low, medium, high]: Quality setting for the generated image.
• response_format [url, b64_json]: Format for returning generated images.
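
The allowed values above can be checked client-side before a request is sent. The following is a minimal sketch of such a check, not the provider's own validation: the helper name `build_generation_payload` is hypothetical, and it enforces only the constraints listed above, including the rule that a transparent background needs `png` or `webp` output.

```python
ALLOWED_VALUES = {
    "size": {"1024x1024", "1024x1536", "1536x1024"},
    "background": {"transparent", "opaque", "auto"},
    "moderation": {"low", "auto"},
    "output_format": {"png", "jpeg", "webp"},
    "quality": {"low", "medium", "high"},
    "response_format": {"url", "b64_json"},
}


def build_generation_payload(prompt, model="openai/gpt-image-1", **options):
    """Validate options against the parameter list and return a JSON-ready dict."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")

    payload = {"model": model, "prompt": prompt}
    for name, value in options.items():
        if name == "n":
            if not isinstance(value, int) or not 1 <= value <= 10:
                raise ValueError("n must be an integer from 1 to 10")
        elif name == "output_compression":
            if not isinstance(value, int) or not 0 <= value <= 100:
                raise ValueError("output_compression must be an integer from 0 to 100")
        elif name in ALLOWED_VALUES:
            if value not in ALLOWED_VALUES[name]:
                raise ValueError(f"{name} must be one of {sorted(ALLOWED_VALUES[name])}")
        else:
            raise ValueError(f"unknown parameter: {name}")
        payload[name] = value

    # The default output format is not stated on this page, so this sketch
    # requires output_format to be set explicitly for transparent backgrounds.
    if payload.get("background") == "transparent" and payload.get(
        "output_format"
    ) not in ("png", "webp"):
        raise ValueError("transparent background requires output_format png or webp")
    return payload
```

The returned dictionary can be passed as the `json=` argument of `requests.post` in place of the hard-coded body in the Python sample.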

Image Editing Parameters

• prompt [str]: Text prompt describing desired content, style, or composition for the edited image.
• image [file | list of files]: The image(s) to edit. Supports png, webp, jpg files under 50MB (up to 16 images).
• mask [file]: An additional PNG file (under 4MB, same dimensions as the image) where transparent areas indicate edit regions. Applies to the first image if multiple are provided.
• n [1-10]: Number of images to generate.
• output_compression [int]: Compression level (0-100%) for generated images.
• size [1024x1024, 1024x1536, 1536x1024]: Desired size of the generated image.
• background [transparent, opaque, auto]: Sets background transparency. 'Auto' lets the model decide. 'Transparent' requires 'png' or 'webp' output format.
• moderation [low, auto]: Controls the content moderation level.
• output_format [png, jpeg, webp]: Format of the generated image.
• quality [low, medium, high]: Quality setting for the image.
• response_format [url, b64_json]: Format for returning generated images.
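
The file constraints on `image` and `mask` can be checked locally before uploading. The sketch below covers only the limits stated above (file count, type, and size); it does not send a request, because the edit endpoint and its upload format are not given on this page. The helper name `check_edit_inputs` is hypothetical, and reading "MB" as binary megabytes is an assumption.

```python
from pathlib import Path

MAX_IMAGES = 16
MAX_IMAGE_BYTES = 50 * 1024 * 1024  # "under 50MB"; binary megabytes assumed
MAX_MASK_BYTES = 4 * 1024 * 1024    # "under 4MB"; binary megabytes assumed
# .jpeg is accepted here as an alias of .jpg (assumption).
IMAGE_SUFFIXES = {".png", ".webp", ".jpg", ".jpeg"}


def check_edit_inputs(image_paths, mask_path=None):
    """Validate edit inputs and return (list of image Paths, mask Path or None)."""
    if isinstance(image_paths, (str, Path)):
        image_paths = [image_paths]  # a single file is allowed
    paths = [Path(p) for p in image_paths]

    if not 1 <= len(paths) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(paths)}")
    for path in paths:
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            raise ValueError(f"unsupported image type: {path.name}")
        if path.stat().st_size >= MAX_IMAGE_BYTES:
            raise ValueError(f"image too large: {path.name}")

    mask = None
    if mask_path is not None:
        mask = Path(mask_path)
        if mask.suffix.lower() != ".png":
            raise ValueError("mask must be a PNG file")
        if mask.stat().st_size >= MAX_MASK_BYTES:
            raise ValueError("mask too large")
        # Not checked here: the mask must also match the dimensions of the
        # first image, which would require decoding both files.
    return paths, mask
```

Checking locally avoids uploading tens of megabytes only to have the request rejected.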

📊 Comparison with Other Leading Models

• Versus DALL·E 3: GPT-Image-1 offers sharper typography and higher prompt adherence. DALL·E 3, however, remains slightly faster for single-shot drafts.
• Versus Stable Diffusion XL 1.0: GPT-Image-1 shows major gains in instruction following and text rendering. SDXL retains its advantage as a fully open-source option for local or offline deployment.
• Versus Midjourney v7: With deterministic seeds and built-in guardrails, GPT-Image-1 gains an edge for production pipelines. Midjourney still offers a broader community-driven style palette.

🔗 API Integration

GPT-Image-1 is accessible via the AI/ML API. Comprehensive integration documentation is available on the AI/ML API documentation portal.

❓ Frequently Asked Questions (FAQ)

Q: What makes GPT-Image-1 unique for image generation?
A: GPT-Image-1 is a natively multimodal generative transformer built on a GPT-4-class decoder. Its strength lies in following intricate design instructions, synthesizing world knowledge, and accurately rendering on-image text for high-fidelity text-to-image creation and editing.

Q: What output sizes does GPT-Image-1 support?
A: It natively supports 1024x1024 px square images, along with portrait (1024x1536 px) and landscape (1536x1024 px) variants. Users can also request 4K upscaling on demand.

Q: How does GPT-Image-1 handle text rendering compared to other models?
A: GPT-Image-1 (marketed as "GPT-4o images" within ChatGPT) excels at accurate typography. It is one of the few proprietary models that maintain low error rates on multi-line text up to approximately 800 characters, significantly outperforming many open-source alternatives.

Q: What are the key safety features of GPT-Image-1 for enterprise use?
A: GPT-Image-1 includes provenance watermarking, tunable content moderation, and a policy of no training on customer data, which supports brand-safety and legal requirements.

Q: Where can I find the API documentation for GPT-Image-1?
A: Integration documentation for GPT-Image-1 is available on the AI/ML API documentation portal.

AI Playground

Test all API models in the sandbox environment before you integrate. We provide more than 300 models to integrate into your app.
