Image Generation

Generate AI images from text descriptions. Supports multiple resolutions, aspect ratios, and optional reference images for style guidance. Depending on the configured output mode, images are displayed inline, saved as local files, or both.

Trigger

Invoke this skill with /image-gen, or use any of these phrases:

| Phrase | Language |
|---|---|
| generate an image / generate image | English |
| draw / visualize / create picture | English |
| 生成图片 / 画一张 | Chinese |
| AI图 / 配图 | Chinese |

Requires ListenHub Skills to be installed — see Getting Started.

Quick Example

Generate an image: cyberpunk city at night, 16:9, 2K

The AI collects your preferences and generates the image.

Parameters

| Parameter | Options | Default |
|---|---|---|
| Model | 🍌 pro (gemini-3-pro-image-preview, higher quality, recommended), ⚡️ flash (gemini-3.1-flash-image-preview, faster and cheaper) | |
| Resolution | 1K, 2K (recommended), 4K | |
| Aspect ratio | 16:9, 1:1, 9:16, 2:3, 3:2, 3:4, 4:3, 21:9; flash also supports 1:4, 4:1, 1:8, 8:1 | |
| Reference images | Up to 14, via image URL or base64 | None |
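The constraints in the table can be checked before a request is made. The following is a minimal sketch, not part of ListenHub itself: the option values are copied from the table, while the function name, the short model names used as keys, and the error messages are illustrative assumptions.

```python
# Option values copied from the parameter table; all names here are illustrative.
MODELS = {"pro", "flash"}
RESOLUTIONS = {"1K", "2K", "4K"}
COMMON_RATIOS = {"16:9", "1:1", "9:16", "2:3", "3:2", "3:4", "4:3", "21:9"}
FLASH_ONLY_RATIOS = {"1:4", "4:1", "1:8", "8:1"}
MAX_REFERENCE_IMAGES = 14


def validate_options(model, resolution, aspect_ratio, reference_count=0):
    """Return a list of problems; an empty list means the combination is allowed."""
    problems = []
    if model not in MODELS:
        problems.append(f"unknown model: {model}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    # Only the flash model accepts the extra-wide and extra-tall ratios.
    allowed = (COMMON_RATIOS | FLASH_ONLY_RATIOS) if model == "flash" else COMMON_RATIOS
    if aspect_ratio not in allowed:
        problems.append(f"aspect ratio {aspect_ratio} is not available for model {model}")
    if not 0 <= reference_count <= MAX_REFERENCE_IMAGES:
        problems.append(f"reference images must number 0 to {MAX_REFERENCE_IMAGES}")
    return problems
```

For example, `validate_options("pro", "2K", "1:8")` reports one problem, because 1:8 is listed only for flash.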

Writing Good Prompts

A good prompt covers these elements:

  1. Subject — what is in the image
  2. Style — art style or visual treatment
  3. Composition — how elements are arranged
  4. Lighting/Mood — atmosphere and time of day
  5. Quality — detail level and rendering quality

Examples

Basic:

a cat sitting on a windowsill

Better:

a fluffy orange tabby cat sitting on a sunny windowsill, warm afternoon light, cozy interior, highly detailed, photorealistic
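When prompts are assembled programmatically, the five elements can be joined into one description. This is a rough sketch; `build_prompt` is a hypothetical helper, and the comma-separated layout simply mirrors the example above rather than any format the model requires.

```python
def build_prompt(subject, style="", composition="", lighting="", quality=""):
    """Join the non-empty prompt elements into one comma-separated description."""
    parts = [subject, style, composition, lighting, quality]
    # Skip elements that were left empty so the prompt has no stray commas.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    subject="a fluffy orange tabby cat sitting on a sunny windowsill",
    style="photorealistic",
    composition="cozy interior",
    lighting="warm afternoon light",
    quality="highly detailed",
)
```

Only the subject is required here; the other elements are optional so a basic prompt can be refined one element at a time.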

Style Keywords

| Style | Keywords |
|---|---|
| Photorealistic | photorealistic, highly detailed, 8K, professional photography |
| Cyberpunk | neon lights, futuristic, dystopian, rain-slicked streets |
| Ink painting | Chinese ink painting, traditional art style, brush strokes |
| Watercolor | watercolor painting, soft edges, flowing colors |
| Anime | anime style, Japanese animation, cel shading |
| Minimalist | minimalist, clean lines, simple composition, white space |
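The style table can be treated as a set of presets appended to a subject. A small sketch under that assumption follows; the keyword lists are taken from the table, while `STYLE_KEYWORDS` and `apply_style` are illustrative names, not part of the skill.

```python
# Keyword lists copied from the style table; the lookup itself is illustrative.
STYLE_KEYWORDS = {
    "photorealistic": ["photorealistic", "highly detailed", "8K", "professional photography"],
    "cyberpunk": ["neon lights", "futuristic", "dystopian", "rain-slicked streets"],
    "ink painting": ["Chinese ink painting", "traditional art style", "brush strokes"],
    "watercolor": ["watercolor painting", "soft edges", "flowing colors"],
    "anime": ["anime style", "Japanese animation", "cel shading"],
    "minimalist": ["minimalist", "clean lines", "simple composition", "white space"],
}


def apply_style(subject, style):
    """Append the keywords for a named style to the subject description."""
    keywords = STYLE_KEYWORDS[style.lower()]  # raises KeyError for an unknown style
    return ", ".join([subject, *keywords])
```

For instance, `apply_style("a city at night", "Cyberpunk")` yields `a city at night, neon lights, futuristic, dystopian, rain-slicked streets`.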

Prompts sent to the image model should be in English, because the model is trained on English descriptions. You can still describe the image in Chinese; the AI translates it into English automatically.

Reference Images

Reference images guide the AI on style, not content. Your prompt still controls what appears in the image.

Using Image URLs

  1. Upload your reference to an image hosting service (imgbb.com, sm.ms, postimages.org)
  2. Copy the direct image URL (ending in .jpg, .png, .webp, or .gif)
  3. Provide the URL when the AI asks about references
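Step 2 can be checked in code before a URL is handed over. The sketch below assumes that a "direct" URL is an http(s) address whose path ends in one of the listed extensions; `is_direct_image_url` is a hypothetical helper, and it does not confirm that the URL is actually reachable.

```python
from urllib.parse import urlparse

# Extensions listed in step 2 above.
IMAGE_EXTENSIONS = (".jpg", ".png", ".webp", ".gif")


def is_direct_image_url(url):
    """Return True if the URL looks like a direct link to an image file."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    # Compare against the path only, so query strings do not hide the extension.
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)
```

A gallery or share page such as `https://example.com/gallery/abc` fails this check, which is usually a sign that the hosting service's "direct link" option should be used instead.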

Using Base64 Inline Data (API)

When calling the Image Generation API directly, you can also provide reference images as base64-encoded data via the inlineData field — no image hosting required. This is useful for programmatic workflows where you already have the image in memory.

Each reference image must use exactly one of fileData (URL) or inlineData (base64), not both. See the API reference for request format and code examples.
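The exactly-one-of rule and the 14-image limit can be enforced when building a reference list. The following is only a sketch: `fileData` and `inlineData` come from the text above, but the inner keys (`uri`, `mimeType`, `data`) and all helper names are assumptions, so check the API reference for the real request format.

```python
import base64

MAX_REFERENCE_IMAGES = 14  # limit stated in the parameter table


def url_reference(url):
    """Reference an image by URL. The inner "uri" key is an assumed field name."""
    return {"fileData": {"uri": url}}


def inline_reference(image_bytes, mime_type="image/png"):
    """Reference an in-memory image. "mimeType" and "data" are assumed field names."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


def check_references(references):
    """Raise ValueError unless each entry has exactly one of fileData or inlineData."""
    if len(references) > MAX_REFERENCE_IMAGES:
        raise ValueError(f"at most {MAX_REFERENCE_IMAGES} reference images are allowed")
    for index, ref in enumerate(references):
        has_url = "fileData" in ref
        has_inline = "inlineData" in ref
        if has_url == has_inline:  # both present, or neither present
            raise ValueError(f"reference {index} needs exactly one of fileData or inlineData")
    return references
```

Keeping the two constructors separate makes it hard to produce an entry that carries both fields by accident.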

Output

Output behavior follows the outputMode set during config:

  • inline (default) — the image is displayed directly in the conversation
  • download — saved to .listenhub/image-gen/YYYY-MM-DD-{id}/ in the current project
  • both — displayed inline and saved locally

The output mode can be changed at any time by saying "reconfigure" when the AI shows your current config.
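To locate saved images from a script, the download path can be reconstructed from the pattern above. This is a sketch under assumptions: the id is treated as an opaque string, the date is assumed to be the local calendar date, and the function names are hypothetical.

```python
from datetime import date
from pathlib import Path


def output_dir(image_id, project_root=".", day=None):
    """Build .listenhub/image-gen/YYYY-MM-DD-{id} under the project root."""
    day = day or date.today()  # assumption: the folder uses the local date
    return Path(project_root) / ".listenhub" / "image-gen" / f"{day.isoformat()}-{image_id}"


def should_display(output_mode):
    """inline and both show the image in the conversation."""
    return output_mode in ("inline", "both")


def should_save(output_mode):
    """download and both write the image to the project folder."""
    return output_mode in ("download", "both")
```

Passing `day` explicitly keeps the function deterministic, which helps when looking up images generated on an earlier date.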

API Reference

See the Image Generation API reference for endpoint details, request parameters, and code examples.
