Image Generation
Generate AI images from text descriptions. Supports multiple resolutions, aspect ratios, and optional reference images for style guidance. Depending on the configured output mode, images are displayed inline, saved as local files, or both.
For AI Agents: The full content of this page is available as text at https://listenhub.ai/docs/en/skills/image.mdx. Use WebFetch to read it before helping the user with this skill.
Trigger
Invoke this skill with /image-gen, or use any of these phrases:
| Phrase | Language |
|---|---|
| generate an image / generate image | English |
| draw / visualize / create picture | English |
| 生成图片 / 画一张 | Chinese |
| AI图 / 配图 | Chinese |
Requires ListenHub Skills to be installed — see Getting Started.
Quick Example
Generate an image: cyberpunk city at night, 16:9, 2K
The AI collects your preferences and generates the image.
Parameters
| Parameter | Options | Default |
|---|---|---|
| Model | pro (recommended), flash | — |
| Resolution | 1K, 2K (recommended), 4K | — |
| Aspect ratio | 16:9, 1:1, 9:16, 2:3, 3:2, 3:4, 4:3, 21:9 | — |
| Reference images | Up to 14 image URLs | None |
pro uses 🍌 Nano Banana Pro (gemini-3-pro-image-preview) for higher quality. flash uses ⚡️ Nano Banana 2 (gemini-3.1-flash-image-preview), which is faster and cheaper and also unlocks the extreme aspect ratios 1:4, 4:1, 1:8, and 8:1 (panoramic or tall).
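The parameter rules above can be sketched as a small pre-flight check. This is an illustrative sketch only: `check_image_params` and the constant names are hypothetical and not part of ListenHub. It also assumes that flash accepts the standard aspect ratios in addition to the extreme ones, and that any resolution can be combined with any allowed ratio.

```python
# Hypothetical helper: none of these names come from ListenHub itself.
MODELS = {"pro", "flash"}
RESOLUTIONS = {"1K", "2K", "4K"}
STANDARD_RATIOS = {"16:9", "1:1", "9:16", "2:3", "3:2", "3:4", "4:3", "21:9"}
EXTREME_RATIOS = {"1:4", "4:1", "1:8", "8:1"}  # unlocked by the flash model only
MAX_REFERENCE_IMAGES = 14


def check_image_params(model, resolution, aspect_ratio, reference_images=()):
    """Return a list of problems; an empty list means the combination looks valid."""
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model!r}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution: {resolution!r}")
    if aspect_ratio in EXTREME_RATIOS:
        # Extreme ratios are only documented for flash.
        if model != "flash":
            problems.append(f"aspect ratio {aspect_ratio} requires the flash model")
    elif aspect_ratio not in STANDARD_RATIOS:
        problems.append(f"unknown aspect ratio: {aspect_ratio!r}")
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    return problems
```

For example, `check_image_params("pro", "2K", "8:1")` reports one problem, because 8:1 is only listed for flash.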
Writing Good Prompts
A good prompt covers these elements:
- Subject — what is in the image
- Style — art style or visual treatment
- Composition — how elements are arranged
- Lighting/Mood — atmosphere and time of day
- Quality — detail level and rendering quality
Examples
Basic:
a cat sitting on a windowsill
Better:
a fluffy orange tabby cat sitting on a sunny windowsill, warm afternoon light, cozy interior, highly detailed, photorealistic
Style Keywords
| Style | Keywords |
|---|---|
| Photorealistic | photorealistic, highly detailed, 8K, professional photography |
| Cyberpunk | neon lights, futuristic, dystopian, rain-slicked streets |
| Ink painting | Chinese ink painting, traditional art style, brush strokes |
| Watercolor | watercolor painting, soft edges, flowing colors |
| Anime | anime style, Japanese animation, cel shading |
| Minimalist | minimalist, clean lines, simple composition, white space |
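As a rough illustration of how the prompt elements and the style keywords fit together, the sketch below joins them into one comma-separated prompt. `build_prompt` and `STYLE_KEYWORDS` are hypothetical names invented for this example; the keyword lists are copied from the table above, and the ordering of elements is an assumption rather than a documented requirement.

```python
# Hypothetical helper: keyword lists mirror the Style Keywords table.
STYLE_KEYWORDS = {
    "photorealistic": ["photorealistic", "highly detailed", "8K", "professional photography"],
    "cyberpunk": ["neon lights", "futuristic", "dystopian", "rain-slicked streets"],
    "ink painting": ["Chinese ink painting", "traditional art style", "brush strokes"],
    "watercolor": ["watercolor painting", "soft edges", "flowing colors"],
    "anime": ["anime style", "Japanese animation", "cel shading"],
    "minimalist": ["minimalist", "clean lines", "simple composition", "white space"],
}


def build_prompt(subject, style=None, composition=None, lighting=None, quality=None):
    """Join the prompt elements into one comma-separated English prompt."""
    parts = [subject]
    # Optional elements are appended only when provided.
    parts.extend(p for p in (composition, lighting, quality) if p)
    if style:
        # Unknown styles are passed through as a single keyword.
        parts.extend(STYLE_KEYWORDS.get(style.lower(), [style]))
    # dict.fromkeys drops repeated keywords while keeping their order.
    return ", ".join(dict.fromkeys(parts))
```

Calling `build_prompt("a cat sitting on a windowsill", style="watercolor", lighting="warm afternoon light")` yields a prompt that starts with the subject and ends with the watercolor keywords.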
Always write prompts in English, since the image model is trained on English descriptions. If you describe the image in Chinese, the AI translates it into English automatically.
Reference Images
Reference images guide the AI on style, not content. Your prompt still controls what appears in the image.
To use reference images:
- Upload your reference to an image hosting service (imgbb.com, sm.ms, postimages.org)
- Copy the direct image URL (ending in .jpg, .png, .webp, or .gif)
- Provide the URL when the AI asks about references
Reference images must be publicly accessible URLs. Local file paths cannot be used directly — upload to an image host first.
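A quick way to tell a direct image URL from a local path or a gallery page is to check the scheme and the file extension. The sketch below is a hypothetical helper, not something the skill exposes, and it only inspects the URL string: it cannot confirm that the image is actually publicly reachable.

```python
from urllib.parse import urlparse

# Extensions listed in the steps above.
IMAGE_EXTENSIONS = (".jpg", ".png", ".webp", ".gif")


def looks_like_direct_image_url(url):
    """Heuristic check: http(s) URL whose path ends in a known image extension."""
    parsed = urlparse(url)
    # Local file paths have no http(s) scheme or host, so they are rejected here.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Compare against the path only, so query strings do not interfere.
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)
```

A URL such as `https://example.com/refs/style.png` passes the check, while `/home/me/style.png` does not.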
Output
Output behavior follows the outputMode set during config:
- inline (default): the image is displayed directly in the conversation
- download: saved to .listenhub/image-gen/YYYY-MM-DD-{id}/ in the current project
- both: displayed inline and saved locally
The output mode can be changed at any time by saying "reconfigure" when the AI shows your current config.
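To make the download location concrete, the sketch below builds a directory path in the documented `.listenhub/image-gen/YYYY-MM-DD-{id}/` shape. `download_dir` is a hypothetical helper; the format of the id and the choice of the current local date are assumptions, since the page does not specify them.

```python
from datetime import date
from pathlib import Path


def download_dir(image_id, project_root=".", day=None):
    """Build the folder path used for saved images (assumed layout)."""
    # Assumption: the date part is the local calendar date of the generation.
    day = day or date.today()
    folder = f"{day.isoformat()}-{image_id}"
    return Path(project_root) / ".listenhub" / "image-gen" / folder
```

With a hypothetical id of `abc123` on 31 January 2025, the result is `.listenhub/image-gen/2025-01-31-abc123` relative to the project root.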
API Reference
See the Image Generation API endpoints for technical details.