Image Generation

Generate AI images from text descriptions. The skill supports multiple resolutions and aspect ratios, plus optional reference images for style guidance. Depending on the configured output mode, images are displayed inline, saved as local files, or both.

For AI Agents: The full content of this page is available as text at https://listenhub.ai/docs/en/skills/image.mdx. Use WebFetch to read it before helping the user with this skill.

Trigger

Invoke this skill with /image-gen, or use any of these phrases:

| Phrase | Language |
| --- | --- |
| generate an image / generate image | English |
| draw / visualize / create picture | English |
| 生成图片 / 画一张 | Chinese |
| AI图 / 配图 | Chinese |

Requires ListenHub Skills to be installed — see Getting Started.

Quick Example

Generate an image: cyberpunk city at night, 16:9, 2K

The AI collects your preferences and generates the image.

Parameters

| Parameter | Options | Default |
| --- | --- | --- |
| Model | pro (recommended), flash | |
| Resolution | 1K, 2K (recommended), 4K | |
| Aspect ratio | 16:9, 1:1, 9:16, 2:3, 3:2, 3:4, 4:3, 21:9 | |
| Reference images | Up to 14 image URLs | None |

pro uses 🍌 Nano Banana Pro (gemini-3-pro-image-preview) for higher quality. flash uses ⚡️ Nano Banana 2 (gemini-3.1-flash-image-preview), which is faster and cheaper. flash also unlocks four extreme aspect ratios for panoramic or tall images: 1:4, 4:1, 1:8, and 8:1.
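
The parameter rules above can be sketched as a small validation helper. This is an illustrative sketch, not the skill's actual implementation: the function name `check_options` and the error wording are hypothetical, and it assumes that flash accepts the standard aspect ratios in addition to the extreme ones and that every resolution is valid for every ratio.

```python
# Values taken from the parameter table; all names here are hypothetical.
MODELS = {"pro", "flash"}
RESOLUTIONS = {"1K", "2K", "4K"}
STANDARD_RATIOS = {"16:9", "1:1", "9:16", "2:3", "3:2", "3:4", "4:3", "21:9"}
FLASH_ONLY_RATIOS = {"1:4", "4:1", "1:8", "8:1"}
MAX_REFERENCE_IMAGES = 14


def check_options(model, resolution, aspect_ratio, reference_images=()):
    """Return a list of problems; an empty list means the combination is allowed."""
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution: {resolution}")
    if aspect_ratio in FLASH_ONLY_RATIOS:
        # Extreme ratios are only described for the flash model.
        if model != "flash":
            problems.append(f"aspect ratio {aspect_ratio} requires the flash model")
    elif aspect_ratio not in STANDARD_RATIOS:
        problems.append(f"unknown aspect ratio: {aspect_ratio}")
    if len(reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"too many reference images: {len(reference_images)}")
    return problems
```

For example, `check_options("pro", "2K", "8:1")` reports that the 8:1 ratio requires the flash model, while the same call with `"flash"` returns an empty list.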

Writing Good Prompts

A good prompt covers these elements:

  1. Subject — what is in the image
  2. Style — art style or visual treatment
  3. Composition — how elements are arranged
  4. Lighting/Mood — atmosphere and time of day
  5. Quality — detail level and rendering quality

Examples

Basic:

a cat sitting on a windowsill

Better:

a fluffy orange tabby cat sitting on a sunny windowsill, warm afternoon light, cozy interior, highly detailed, photorealistic
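
One way to keep the five elements in view is to assemble the prompt from named parts. The helper below is a minimal sketch: `build_prompt` and its parameter names are hypothetical and are not part of ListenHub Skills. It joins whichever elements are supplied into a single comma-separated prompt.

```python
def build_prompt(subject, style="", composition="", mood="", quality=""):
    """Join the non-empty prompt elements into one comma-separated prompt."""
    parts = [subject, style, composition, mood, quality]
    # Skip elements that were left empty or contain only whitespace.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a fluffy orange tabby cat sitting on a sunny windowsill",
    mood="warm afternoon light, cozy interior",
    quality="highly detailed, photorealistic",
)
print(prompt)
```

With these inputs the result is the same text as the "Better" prompt above. Leaving an element empty simply omits it.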

Style Keywords

| Style | Keywords |
| --- | --- |
| Photorealistic | photorealistic, highly detailed, 8K, professional photography |
| Cyberpunk | neon lights, futuristic, dystopian, rain-slicked streets |
| Ink painting | Chinese ink painting, traditional art style, brush strokes |
| Watercolor | watercolor painting, soft edges, flowing colors |
| Anime | anime style, Japanese animation, cel shading |
| Minimalist | minimalist, clean lines, simple composition, white space |
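
The style table can also be treated as a lookup that appends keywords to a subject. This is only a sketch of the idea: the dictionary mirrors the table above, while `apply_style` is a hypothetical helper, not something the skill exposes.

```python
# Keyword lists copied from the style table; keys are lowercased style names.
STYLE_KEYWORDS = {
    "photorealistic": ["photorealistic", "highly detailed", "8K", "professional photography"],
    "cyberpunk": ["neon lights", "futuristic", "dystopian", "rain-slicked streets"],
    "ink painting": ["Chinese ink painting", "traditional art style", "brush strokes"],
    "watercolor": ["watercolor painting", "soft edges", "flowing colors"],
    "anime": ["anime style", "Japanese animation", "cel shading"],
    "minimalist": ["minimalist", "clean lines", "simple composition", "white space"],
}


def apply_style(subject, style):
    """Append the keywords for a named style to the subject description."""
    keywords = STYLE_KEYWORDS[style.lower()]  # raises KeyError for unknown styles
    return ", ".join([subject, *keywords])
```

For instance, `apply_style("a city at night", "Cyberpunk")` yields `a city at night, neon lights, futuristic, dystopian, rain-slicked streets`.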

Write prompts in English where possible, because the image model is trained on English descriptions. If you describe the image in Chinese, the AI translates the description automatically.

Reference Images

Reference images guide the AI on style, not content. Your prompt still controls what appears in the image.

To use reference images:

  1. Upload your reference to an image hosting service (imgbb.com, sm.ms, postimages.org)
  2. Copy the direct image URL (ending in .jpg, .png, .webp, or .gif)
  3. Provide the URL when the AI asks about references

Reference images must be publicly accessible URLs. Local file paths cannot be used directly — upload to an image host first.
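
Before handing references to the AI, it can help to check that each one looks like a direct image URL. The sketch below is an assumption-laden illustration: `is_direct_image_url` and `rejected_references` are hypothetical names, and the check is purely syntactic. It cannot confirm that a URL is publicly reachable, since that would require a network request.

```python
from urllib.parse import urlparse

# Extensions listed in the steps above; the limit comes from the parameter table.
ALLOWED_EXTENSIONS = (".jpg", ".png", ".webp", ".gif")
MAX_REFERENCE_IMAGES = 14


def is_direct_image_url(url):
    """True if url is an http(s) URL whose path ends in an allowed extension."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False  # local paths and other schemes are rejected
    return parsed.path.lower().endswith(ALLOWED_EXTENSIONS)


def rejected_references(urls):
    """Return the entries that fail the check, plus any beyond the 14-image limit."""
    bad = [u for u in urls if not is_direct_image_url(u)]
    overflow = [u for u in urls[MAX_REFERENCE_IMAGES:] if u not in bad]
    return bad + overflow
```

A local path such as `./ref.png` or a gallery page URL without an image extension is rejected, which matches the requirement to upload to an image host and copy the direct link.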

Output

Output behavior follows the outputMode set during config:

  • inline (default) — the image is displayed directly in the conversation
  • download — saved to .listenhub/image-gen/YYYY-MM-DD-{id}/ in the current project
  • both — displayed inline and saved locally

The output mode can be changed at any time by saying "reconfigure" when the AI shows your current config.
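
The three output modes and the download path pattern can be sketched as follows. This is a hedged illustration rather than the skill's real logic: `plan_output` and `output_dir` are hypothetical names, and the format of `{id}` is not specified here, so any string is accepted.

```python
from datetime import date
from pathlib import Path

OUTPUT_MODES = {"inline", "download", "both"}


def output_dir(image_id, project_root=".", day=None):
    """Build .listenhub/image-gen/YYYY-MM-DD-{id} under the project root."""
    day = day or date.today()
    # date.isoformat() produces the YYYY-MM-DD form used in the path pattern.
    return Path(project_root) / ".listenhub" / "image-gen" / f"{day.isoformat()}-{image_id}"


def plan_output(mode, image_id, day=None):
    """Describe what happens to a generated image under the given outputMode."""
    if mode not in OUTPUT_MODES:
        raise ValueError(f"unknown outputMode: {mode}")
    return {
        "show_inline": mode in ("inline", "both"),
        "save_to": output_dir(image_id, day=day) if mode in ("download", "both") else None,
    }
```

With `mode="both"`, the plan both shows the image inline and returns a save directory. With the default `inline` mode, `save_to` is `None`.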

API Reference

See the Image Generation API endpoints for technical details.
