Clawdbot 4842dc64a5 feat: scaffold Fish Audio speech provider plugin

- index.ts: plugin entry with definePluginEntry + registerSpeechProvider
- speech-provider.ts: full SpeechProviderPlugin implementation
  - resolveConfig from messages.tts.providers.fish-audio
  - parseDirectiveToken for voice, model, speed, latency, temperature, top_p
  - listVoices merging official + user's own voices
  - synthesize with format-aware output (opus for voice-note, mp3 otherwise)
  - stub Talk Mode (resolveTalkConfig/resolveTalkOverrides)
- tts.ts: raw fishAudioTTS() fetch + listFishAudioVoices()
  - streaming chunked → buffer, error body included in exceptions
  - parallel voice listing with graceful partial failure
- speech-provider.test.ts: voice ID validation tests
- openclaw.plugin.json: speechProviders contract
- package.json: peer dep on openclaw >=2026.3.0

2026-03-29 18:14:29 +11:00

Fish Audio Speech Plugin for OpenClaw

A speech provider plugin that integrates Fish Audio TTS with OpenClaw.

Features

Fish Audio S2-Pro / S1 / S2 model support
Dynamic voice listing — your own cloned voices + official Fish Audio voices
Format-aware output — opus for voice notes (Telegram, WhatsApp), mp3 otherwise
Inline directives — switch voice, speed, model, and latency mid-message
No core changes required — standard SpeechProviderPlugin extension

Installation

openclaw plugins install @openclaw/fish-audio-speech

Configuration

In your openclaw.json:

{
  "messages": {
    "tts": {
      "provider": "fish-audio",
      "providers": {
        "fish-audio": {
          "apiKey": "your-fish-audio-api-key",
          "voiceId": "8a2d42279389471993460b85340235c5",
          "model": "s2-pro",
          "latency": "normal",
          "speed": 1.0
        }
      }
    }
  }
}

Config Options

| Field | Type | Default | Description |
|---|---|---|---|
| `apiKey` | string | — | Required. Fish Audio API key |
| `voiceId` | string | `8a2d42...` | Reference ID of the voice to use |
| `model` | string | `s2-pro` | TTS model (`s2-pro`, `s1`, `s2`) |
| `latency` | string | `normal` | Latency mode (`normal`, `balanced`, `low`) |
| `speed` | number | — | Prosody speed (0.5–2.0) |
| `temperature` | number | — | Sampling temperature (0–1) |
| `topP` | number | — | Top-p sampling (0–1) |
| `baseUrl` | string | `https://api.fish.audio` | API base URL |
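
The options above can be checked before any request is made. The following is a minimal sketch of such a check, not the plugin's actual code: the `FishAudioConfig` interface and the `validateFishAudioConfig`, `oneOf`, and `inRange` helpers are hypothetical names, and the defaults and ranges are copied from the table. It assumes the API key has already been resolved, and it leaves `voiceId` unset when absent because the table shows the default voice ID only in truncated form.

```typescript
// Hypothetical types and helpers mirroring the Config Options table.
// These are illustrative assumptions, not the plugin's real exports.
interface FishAudioConfig {
  apiKey: string;
  voiceId?: string;
  model: "s2-pro" | "s1" | "s2";
  latency: "normal" | "balanced" | "low";
  speed?: number;
  temperature?: number;
  topP?: number;
  baseUrl: string;
}

const MODELS = ["s2-pro", "s1", "s2"] as const;
const LATENCIES = ["normal", "balanced", "low"] as const;

// Returns the fallback when the value is absent; throws on anything not in the allowed list.
function oneOf<T extends string>(name: string, value: unknown, allowed: readonly T[], fallback: T): T {
  if (value === undefined) return fallback;
  if (typeof value !== "string" || !allowed.includes(value as T)) {
    throw new Error(`${name} must be one of: ${allowed.join(", ")}`);
  }
  return value as T;
}

// Optional numeric fields stay undefined when absent; throws when outside [min, max].
function inRange(name: string, value: unknown, min: number, max: number): number | undefined {
  if (value === undefined) return undefined;
  if (typeof value !== "number" || Number.isNaN(value) || value < min || value > max) {
    throw new Error(`${name} must be a number between ${min} and ${max}`);
  }
  return value;
}

function validateFishAudioConfig(raw: Record<string, unknown>): FishAudioConfig {
  if (typeof raw.apiKey !== "string" || raw.apiKey.length === 0) {
    throw new Error("apiKey is required");
  }
  return {
    apiKey: raw.apiKey,
    voiceId: typeof raw.voiceId === "string" ? raw.voiceId : undefined,
    model: oneOf("model", raw.model, MODELS, "s2-pro"),
    latency: oneOf("latency", raw.latency, LATENCIES, "normal"),
    speed: inRange("speed", raw.speed, 0.5, 2.0),
    temperature: inRange("temperature", raw.temperature, 0, 1),
    topP: inRange("topP", raw.topP, 0, 1),
    baseUrl: typeof raw.baseUrl === "string" ? raw.baseUrl : "https://api.fish.audio",
  };
}
```

Rejecting out-of-range values at load time is one possible design; whether the plugin rejects, clamps, or ignores them is not stated here.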

Environment Variable

You can also set the API key via environment variable:

FISH_AUDIO_API_KEY=your-key
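
With two possible sources for the key, a lookup order is needed. The sketch below shows one plausible order and is an assumption, not confirmed behavior: it prefers the `apiKey` from openclaw.json and falls back to `FISH_AUDIO_API_KEY`. The `resolveApiKey` function is a hypothetical name, and the environment is passed in as a plain object so the example stays self-contained.

```typescript
// Assumption: a key set in openclaw.json wins over the environment variable.
// The plugin's real precedence may differ.
function resolveApiKey(
  configKey: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // Blank or whitespace-only values are treated as unset.
  const key = configKey?.trim() || env["FISH_AUDIO_API_KEY"]?.trim();
  if (!key) {
    throw new Error("Fish Audio API key missing: set apiKey or FISH_AUDIO_API_KEY");
  }
  return key;
}
```

In a Node.js process this would typically be called as `resolveApiKey(config.apiKey, process.env)`.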

Directives

Use inline directives to override TTS settings for a single message:

[[tts:voice=<ref_id>]]     Switch voice
[[tts:speed=1.2]]          Prosody speed (0.5–2.0)
[[tts:model=s1]]           Model override
[[tts:latency=low]]        Latency mode
[[tts:temperature=0.7]]    Sampling temperature
[[tts:top_p=0.8]]          Top-p sampling
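
To illustrate how directives of this shape could be handled, here is a rough sketch that pulls them out of a message and returns the remaining text together with the collected overrides. It is not the plugin's `parseDirectiveToken`; the `extractTtsDirectives` function, the `TtsOverrides` type, and the choice to silently drop out-of-range numbers are all assumptions made for the example.

```typescript
// Hypothetical override shape; field names follow the Config Options table.
interface TtsOverrides {
  voice?: string;
  model?: string;
  latency?: string;
  speed?: number;
  temperature?: number;
  topP?: number;
}

// Matches tokens such as [[tts:speed=1.2]]; the value may not contain "]" or whitespace.
const DIRECTIVE = /\[\[tts:([a-z_]+)=([^\]\s]+)\]\]/g;

// Parses a number and keeps it only if it lies within [min, max].
function numberInRange(value: string, min: number, max: number): number | undefined {
  const n = Number(value);
  return n >= min && n <= max ? n : undefined;
}

function extractTtsDirectives(message: string): { text: string; overrides: TtsOverrides } {
  const overrides: TtsOverrides = {};
  const stripped = message.replace(DIRECTIVE, (match: string, key: string, value: string) => {
    switch (key) {
      case "voice": overrides.voice = value; break;
      case "model": overrides.model = value; break;
      case "latency": overrides.latency = value; break;
      case "speed": overrides.speed = numberInRange(value, 0.5, 2.0); break;
      case "temperature": overrides.temperature = numberInRange(value, 0, 1); break;
      case "top_p": overrides.topP = numberInRange(value, 0, 1); break;
      default: return match; // unknown key: leave the token in the text
    }
    return ""; // recognized directive: remove it from the spoken text
  });
  // Collapse the whitespace left behind by removed tokens.
  return { text: stripped.replace(/\s{2,}/g, " ").trim(), overrides };
}
```

For example, `extractTtsDirectives("Hello [[tts:speed=1.2]] world")` yields the text `Hello world` with `overrides.speed` set to `1.2`.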

Voice Listing

The plugin dynamically lists available voices via /tts voices:

Official Fish Audio voices (~38 voices)
Your own cloned/trained voices (marked with "(mine)")
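
A minimal sketch of how the two lists might be combined is shown below. It assumes the two sources are fetched in parallel and that a failure in one should not hide the other. The `Voice` type, the `listAllVoices` function, and the two fetcher parameters are hypothetical stand-ins rather than the plugin's API; only the "(mine)" suffix comes from the description above.

```typescript
// Hypothetical voice record; the real API response carries more fields.
interface Voice {
  id: string;
  name: string;
}

async function listAllVoices(
  fetchOfficial: () => Promise<Voice[]>,
  fetchMine: () => Promise<Voice[]>,
): Promise<Voice[]> {
  // allSettled lets one source fail without discarding the other.
  const [official, mine] = await Promise.allSettled([fetchOfficial(), fetchMine()]);

  // Keyed by id so a voice present in both lists appears once.
  const merged = new Map<string, Voice>();
  if (official.status === "fulfilled") {
    for (const voice of official.value) merged.set(voice.id, voice);
  }
  if (mine.status === "fulfilled") {
    for (const voice of mine.value) {
      merged.set(voice.id, { ...voice, name: `${voice.name} (mine)` });
    }
  }
  return Array.from(merged.values());
}
```

If both requests fail, this sketch returns an empty list; surfacing an error in that case would be an equally reasonable choice.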

Output Format

The plugin automatically selects the best format based on the channel:

Voice note channels (Telegram, WhatsApp, Matrix, Feishu) → Opus
All other channels → MP3

Both formats set voiceCompatible: true, so Fish Audio output can be sent as a native voice note.
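
The selection rule can be pictured as a simple lookup. This is a sketch only: the lowercase channel identifiers and the `selectOutputFormat` function are assumptions, since the exact channel names the plugin receives are not given here.

```typescript
type AudioFormat = "opus" | "mp3";

// Assumed channel identifiers, taken from the channel names listed above.
const VOICE_NOTE_CHANNELS = new Set(["telegram", "whatsapp", "matrix", "feishu"]);

function selectOutputFormat(channel: string): { format: AudioFormat; voiceCompatible: boolean } {
  // Voice note channels get Opus; every other channel gets MP3.
  const format: AudioFormat = VOICE_NOTE_CHANNELS.has(channel.toLowerCase()) ? "opus" : "mp3";
  // Both formats are flagged as voice compatible, as described above.
  return { format, voiceCompatible: true };
}
```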

Requirements

OpenClaw ≥ 2026.3.0
Fish Audio API key (created in your Fish Audio account)

License

MIT
