
Multimodal Features

Chat with video, images, audio, and PDFs, and generate images with Gemini.

Silkwave Chat supports "multimodal" interactions, meaning you can send more than just text to the AI.

Analyzing Files

You can attach files using the paperclip icon in the chat input or by dragging and dropping them directly into the chat window.

Supported File Types:

  • Video Files: Upload .mp4 or .mov files for analysis. The AI can summarize content, answer questions about specific scenes, or analyze the spoken audio.
  • Images: Upload .png or .jpg files for visual analysis.
  • Audio Files: Upload .mp3 or .wav files for analysis.
  • PDF Documents: Upload .pdf files. The AI can read the document to extract text, summarize long reports, or answer specific questions based on the content.
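
If you prepare attachments with a script before uploading them, the list above can be sketched as a simple lookup. This is only an illustration: the names `SUPPORTED_EXTENSIONS` and `classify_attachment` are hypothetical and not part of Silkwave Chat, and the extensions are just the ones listed on this page.

```python
from pathlib import Path
from typing import Optional

# Extensions listed on this page, grouped by modality.
# Assumption: this mapping is illustrative and not an official Silkwave list.
SUPPORTED_EXTENSIONS = {
    "video": {".mp4", ".mov"},
    "image": {".png", ".jpg"},
    "audio": {".mp3", ".wav"},
    "document": {".pdf"},
}


def classify_attachment(filename: str) -> Optional[str]:
    """Return the modality for a filename, or None if its extension is not listed."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for modality, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return modality
    return None
```

For example, `classify_attachment("report.PDF")` returns `"document"`, while a file such as `notes.txt` returns `None`. A listed extension does not guarantee that the selected model can analyze that file type.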

Note: Image, video, audio, and document analysis capabilities depend on the specific model selected. Check the Modalities icons in Settings → Models to confirm what each model supports.

Generating Images

You can generate images directly within the chat using Gemini's image generation models (e.g., gemini-3-pro-image-preview).

  1. Select a Gemini image model from the model dropdown.
  2. Type your prompt (e.g., "Please generate a flat vector illustration of a peaceful mountain range in muted, deep colors.").
  3. The image will appear in the chat.
  4. Click on the image to preview it, or save it to your desktop.

Rich Text & Math

Silkwave Chat supports advanced rendering for technical users:

  • Markdown: Headers, lists, bold text, code blocks, tables, and more.
  • LaTeX: Mathematical formulas.
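
As an illustration of the kind of formula that can be rendered, here is the quadratic formula written in LaTeX. The `$$` delimiters are an assumption; this page does not state which delimiters Silkwave Chat expects.

```latex
$$
x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
$$
```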
