# Built-in AI Image Generation Tool

### 📋 Feature Overview

MaiAgent integrates four AI image generation engines that cover needs ranging from everyday creation to professional design. Select the image generation tool in the AI assistant's settings to start generating images.

{% hint style="info" %}
Please refer to: [Configure Tools for AI Assistant](/maiagent-user-guide/maiagent-user-guide-en/tools/configure_tools.md)
{% endhint %}

***

### 🎨 Complete Tool Comparison Table

| Tool           | Language Support                   | Core Features                                    | Use Cases                                   |
| -------------- | ---------------------------------- | ------------------------------------------------ | ------------------------------------------- |
| **Gemini 2.0** | 🇹🇼 Traditional Chinese + English | Contextual understanding, conversational editing | Everyday creation, Chinese language needs   |
| **GPT Image**  | 🇺🇸 English                       | Professional quality, multi-round optimization   | Brand design, professional use              |
| **DALL-E 3**   | 🇺🇸 English                       | Rapid generation, concept visualization          | Quick prototyping, supporting illustrations |
| **Imagen 4.0** | 🇺🇸 English                       | Photorealistic quality, product rendering        | Commercial photography, product showcases   |

### ⭐️ Actual Generation Examples

#### Gemini Native - Everyday Creation Example

**Prompt:** "A cute orange cat sitting on a windowsill with sunlight streaming through the window onto it"

**Features:** Understands Chinese descriptions well, produces warm and natural colors, and suits everyday creative needs

***

#### GPT Image - Professional Design Example

**Prompt:** "Professional logo design: minimalist coffee cup with transparent background"

**Features:** Transparent background, refined lines, suitable for brand applications

***

#### DALL-E 3 - Quick Concept Example

**Prompt:** "Quick concept sketch: futuristic city skyline with flying cars"

**Features:** Rapid generation, clear concepts, suitable for creative brainstorming

***

#### Google Imagen - Product Photography Example

**Prompt:** "Professional product photography: sleek smartphone with studio lighting"

**Features:** Photorealistic quality, professional lighting, commercial-grade output

***

### 📝 Usage Methods and Best Practices

#### Basic Usage Syntax

| Use Case                | Example Command                                               | Recommended Engine |
| ----------------------- | ------------------------------------------------------------- | ------------------ |
| **Everyday Creation**   | `Draw a cute puppy`                                           | Gemini Native      |
| **Professional Design** | `Design a modern minimalist logo with transparent background` | GPT Image          |
| **Quick Prototyping**   | `Quickly generate a concept image for a website homepage`     | DALL-E 3           |
| **Product Showcase**    | `Create a professional product photography image`             | Google Imagen      |
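The recommendations in the table above can be sketched as a simple lookup. This is a minimal illustration only: `RECOMMENDED_ENGINE` and `recommend_engine` are hypothetical names invented for this example and are not part of MaiAgent, which selects engines through the assistant's tool settings.

```python
# Hypothetical lookup mirroring the "Basic Usage Syntax" table.
# The dictionary and helper names are illustrative, not a MaiAgent API.
RECOMMENDED_ENGINE = {
    "everyday creation": "Gemini Native",
    "professional design": "GPT Image",
    "quick prototyping": "DALL-E 3",
    "product showcase": "Google Imagen",
}


def recommend_engine(use_case: str) -> str:
    """Return the engine this guide recommends for a use case."""
    key = use_case.strip().lower()
    if key not in RECOMMENDED_ENGINE:
        raise ValueError(f"No recommendation for use case: {use_case!r}")
    return RECOMMENDED_ENGINE[key]


print(recommend_engine("Product Showcase"))  # prints "Google Imagen"
```

Treat the mapping as a starting point; any engine can be tried for any task.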

#### Multi-round Iterative Optimization (GPT Image)

```
Round 1: "Design a coffee shop logo"
Round 2: "Change the color to dark brown"
Round 3: "Add some steam effects"
Round 4: "Make it more minimalist"
```

#### Image Reference Editing (Gemini Native)

```
"Based on this image, change the background to a beach scene"
"Keep the person unchanged, only modify the clothing color"
"Add some flowers to this scene"
```

***

### 🎯 Practical Guide by Application Scenario

#### **Scenario 1: Social Media Content Creation**

**Need:** Create images for Instagram posts

**Recommendation:** Gemini Native

**Example Command:** `Create a warm coffee shop scene suitable for IG posts`

#### **Scenario 2: Corporate Brand Design**

**Need:** Design company logo and brand materials

**Recommendation:** GPT Image

**Example Command:** `Design a tech company logo, minimalist modern style, transparent background`

#### **Scenario 3: Product Display Images**

**Need:** Product main images for e-commerce platforms

**Recommendation:** Google Imagen

**Example Command:** `Professional product shot of wireless headphones on white background`

#### **Scenario 4: Creative Ideation and Prototyping**

**Need:** Quickly visualize creative concepts

**Recommendation:** DALL-E 3

**Example Command:** `Concept art for a mobile app interface design`

***

### ❓ Common Questions and Solutions

#### Quality-Related Issues

**Q: How do I get the highest-quality images?**

**A:** Use GPT Image or Google Imagen, and provide detailed descriptions:

* ✅ Specific style requirements (e.g., "professional photography style")
* ✅ Detailed scene descriptions (lighting, angles, atmosphere)
* ✅ Clear quality requirements (e.g., "high resolution" or "commercial quality")

**Q: Why doesn't the generated image match my expectations?**

**A:** Refine the prompt along these lines:

* 🎯 Use specific rather than abstract descriptions
* 🎨 Specify clear artistic styles
* 📐 Explain composition and perspective requirements
* 🌈 Describe color and lighting effects
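The suggestions above amount to assembling a prompt from explicit parts. The sketch below shows one way to do that. `build_prompt` and its parameter names are hypothetical and exist only for this illustration; MaiAgent accepts the resulting text as an ordinary chat message.

```python
def build_prompt(subject: str, style: str = "", composition: str = "",
                 lighting: str = "") -> str:
    """Join the non-empty parts into one comma-separated prompt."""
    parts = [subject, style, composition, lighting]
    # Skip parts that are empty or whitespace-only.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="wireless headphones on a white background",
    style="professional photography style",
    composition="centered composition",
    lighting="soft studio lighting",
)
print(prompt)
```

Filling in each slot deliberately makes it easier to see which detail is missing when a result is off.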

#### Functional Usage Issues

**Q: How do I generate images with transparent backgrounds?**

**A:** Explicitly mention "transparent background" in the prompt. The examples in this guide pair transparent backgrounds with GPT Image:

```
Design a logo with transparent background
Create an icon with transparent background
```
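If prompts are prepared programmatically before being sent to the assistant, a small helper can make sure the phrase is always present. This is a hedged sketch: `with_transparent_background` is a hypothetical helper written for this example, not a MaiAgent function.

```python
def with_transparent_background(prompt: str) -> str:
    """Append "transparent background" unless the prompt already mentions it."""
    phrase = "transparent background"
    if phrase in prompt.lower():
        return prompt
    # Drop trailing spaces and periods before appending the phrase.
    return f"{prompt.rstrip(' .')} with {phrase}"


print(with_transparent_background("Design a logo"))
```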

**Q: Can I modify images that have already been generated?**

**A:** Yes! Use Gemini Native's image reference feature:

```
Based on the image above, change the sky to sunset colors
Keep the composition unchanged, only modify the character's clothing
```


