- index.ts: plugin entry with definePluginEntry + registerSpeechProvider
- speech-provider.ts: full SpeechProviderPlugin implementation
  - resolveConfig from messages.tts.providers.fish-audio
  - parseDirectiveToken for voice, model, speed, latency, temperature, top_p
  - listVoices merging official + user's own voices
  - synthesize with format-aware output (opus for voice-note, mp3 otherwise)
  - stub Talk Mode (resolveTalkConfig/resolveTalkOverrides)
- tts.ts: raw fishAudioTTS() fetch + listFishAudioVoices()
  - streaming chunked → buffer, error body included in exceptions
  - parallel voice listing with graceful partial failure
- speech-provider.test.ts: voice ID validation tests
- openclaw.plugin.json: speechProviders contract
- package.json: peer dep on openclaw >=2026.3.0
Fish Audio Speech Plugin for OpenClaw
A speech provider plugin that integrates Fish Audio TTS with OpenClaw.
Features
- Fish Audio S2-Pro / S1 / S2 model support
- Dynamic voice listing — your own cloned voices + official Fish Audio voices
- Format-aware output — opus for voice notes (Telegram, WhatsApp), mp3 otherwise
- Inline directives — switch voice, speed, model, and latency mid-message
- No core changes required — standard SpeechProviderPlugin extension
Installation
openclaw plugins install @openclaw/fish-audio-speech
Configuration
In your openclaw.json:
{
"messages": {
"tts": {
"provider": "fish-audio",
"providers": {
"fish-audio": {
"apiKey": "your-fish-audio-api-key",
"voiceId": "8a2d42279389471993460b85340235c5",
"model": "s2-pro",
"latency": "normal",
"speed": 1.0
}
}
}
}
}
Config Options
| Field | Type | Default | Description |
|---|---|---|---|
| apiKey | string | — | Required. Fish Audio API key |
| voiceId | string | 8a2d42... | Reference ID of the voice to use |
| model | string | s2-pro | TTS model (s2-pro, s1, s2) |
| latency | string | normal | Latency mode (normal, balanced, low) |
| speed | number | — | Prosody speed (0.5–2.0) |
| temperature | number | — | Sampling temperature (0–1) |
| topP | number | — | Top-p sampling (0–1) |
| baseUrl | string | https://api.fish.audio | API base URL |
Environment Variable
You can also set the API key via environment variable:
FISH_AUDIO_API_KEY=your-key
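As a rough illustration of how the config block and the environment variable could combine, here is a minimal sketch. It is not the plugin's actual resolveConfig: the `FishAudioConfig` interface and `resolveFishAudioConfig` function are hypothetical names, and the precedence (config file first, then environment) is an assumption. The defaults are taken from the Config Options table; `voiceId` is left unset because its full default is not shown here.

```typescript
// Hypothetical config shape mirroring the Config Options table.
interface FishAudioConfig {
  apiKey: string;
  voiceId?: string;
  model: string;
  latency: string;
  speed?: number;
  temperature?: number;
  topP?: number;
  baseUrl: string;
}

// Illustrative resolver, not the plugin's real implementation.
// Assumption: a key in openclaw.json wins over FISH_AUDIO_API_KEY.
function resolveFishAudioConfig(
  raw: Partial<FishAudioConfig>,
  env: Record<string, string | undefined>,
): FishAudioConfig {
  const apiKey = raw.apiKey || env["FISH_AUDIO_API_KEY"];
  if (!apiKey) {
    throw new Error("Fish Audio API key missing: set apiKey or FISH_AUDIO_API_KEY");
  }
  return {
    ...raw,
    apiKey,
    // Defaults as documented in the table above.
    model: raw.model ?? "s2-pro",
    latency: raw.latency ?? "normal",
    baseUrl: raw.baseUrl ?? "https://api.fish.audio",
  };
}
```

In a Node.js process the second argument would typically be `process.env`; passing it in explicitly keeps the sketch easy to test.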
Directives
Use inline directives in your messages to control TTS per-message:
[[tts:voice=<ref_id>]] Switch voice
[[tts:speed=1.2]] Prosody speed (0.5–2.0)
[[tts:model=s1]] Model override
[[tts:latency=low]] Latency mode
[[tts:temperature=0.7]] Sampling temperature
[[tts:top_p=0.8]] Top-p sampling
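To make the directive syntax concrete, the following sketch extracts `[[tts:key=value]]` tokens from a message and returns the remaining text plus the collected overrides. It is an illustration only, not the plugin's parseDirectiveToken: `TtsOverrides` and `parseTtsDirectives` are hypothetical names, and the choices to silently ignore out-of-range values and to leave unknown keys in the text are assumptions.

```typescript
// Hypothetical override shape; field names are illustrative.
interface TtsOverrides {
  voiceId?: string;
  model?: string;
  latency?: string;
  speed?: number;
  temperature?: number;
  topP?: number;
}

// Allowed values as listed in the Config Options table.
const MODELS = ["s2-pro", "s1", "s2"];
const LATENCIES = ["normal", "balanced", "low"];

// Returns the parsed number if it lies within [min, max], otherwise undefined.
function inRange(raw: string, min: number, max: number): number | undefined {
  const n = Number(raw);
  return raw !== "" && Number.isFinite(n) && n >= min && n <= max ? n : undefined;
}

function parseTtsDirectives(message: string): { text: string; overrides: TtsOverrides } {
  const overrides: TtsOverrides = {};
  const stripped = message.replace(
    /\[\[tts:([a-z_]+)=([^\]]*)\]\]/g,
    (match: string, key: string, rawValue: string): string => {
      const value = rawValue.trim();
      let n: number | undefined;
      switch (key) {
        case "voice":
          if (value !== "") overrides.voiceId = value;
          return "";
        case "model":
          if (MODELS.includes(value)) overrides.model = value;
          return "";
        case "latency":
          if (LATENCIES.includes(value)) overrides.latency = value;
          return "";
        case "speed":
          n = inRange(value, 0.5, 2.0);
          if (n !== undefined) overrides.speed = n;
          return "";
        case "temperature":
          n = inRange(value, 0, 1);
          if (n !== undefined) overrides.temperature = n;
          return "";
        case "top_p":
          n = inRange(value, 0, 1);
          if (n !== undefined) overrides.topP = n;
          return "";
        default:
          // Assumption: unknown directive keys are left in the text untouched.
          return match;
      }
    },
  );
  // Collapse the whitespace (including newlines) left behind by removed tokens.
  return { text: stripped.replace(/\s+/g, " ").trim(), overrides };
}
```

For example, `parseTtsDirectives("Hi [[tts:speed=1.2]] there")` yields the text `"Hi there"` with `speed` set to 1.2.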
Voice Listing
The plugin dynamically lists available voices via /tts voices:
- Official Fish Audio voices (~38 voices)
- Your own cloned/trained voices (marked with "(mine)")
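One way to merge the two voice sources so that a failure in one does not hide the other is `Promise.allSettled`. The sketch below is an assumption about the approach, not the plugin's listVoices code: the `Voice` shape, the `VoiceFetcher` type, and `listMergedVoices` are hypothetical, and the real API calls are replaced by injected functions.

```typescript
// Hypothetical voice shape; the real API response carries more fields.
interface Voice {
  id: string;
  name: string;
}

// Stand-in for the real API calls, injected so the sketch stays self-contained.
type VoiceFetcher = () => Promise<Voice[]>;

async function listMergedVoices(
  fetchOfficial: VoiceFetcher,
  fetchMine: VoiceFetcher,
): Promise<Voice[]> {
  // Run both requests in parallel; a rejection in one does not affect the other.
  const [official, mine] = await Promise.allSettled([fetchOfficial(), fetchMine()]);

  const merged = new Map<string, Voice>();
  if (official.status === "fulfilled") {
    for (const v of official.value) merged.set(v.id, v);
  }
  if (mine.status === "fulfilled") {
    // Label the user's own voices; on an ID clash the user's entry replaces the official one.
    for (const v of mine.value) merged.set(v.id, { ...v, name: `${v.name} (mine)` });
  }
  return Array.from(merged.values());
}
```

If both requests fail, this sketch returns an empty list rather than throwing; whether the plugin does the same is not stated here.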
Output Format
The plugin automatically selects the best format based on the channel:
- Voice note channels (Telegram, WhatsApp, Matrix, Feishu) → Opus
- All other channels → MP3
Both formats set voiceCompatible: true — Fish Audio output works cleanly as native voice notes.
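The selection rule can be pictured as a simple lookup. This is a sketch under assumptions: the lowercase channel identifiers and the `pickOutputFormat` helper are hypothetical, and how the plugin actually detects a voice-note target is not shown in this README.

```typescript
type AudioFormat = "opus" | "mp3";

// Assumption: channels are identified by these lowercase names.
const VOICE_NOTE_CHANNELS = new Set(["telegram", "whatsapp", "matrix", "feishu"]);

function pickOutputFormat(channel: string): { format: AudioFormat; voiceCompatible: boolean } {
  const format: AudioFormat = VOICE_NOTE_CHANNELS.has(channel.toLowerCase()) ? "opus" : "mp3";
  // Both formats are reported as voice-compatible, as described above.
  return { format, voiceCompatible: true };
}
```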
Requirements
- OpenClaw ≥ 2026.3.0
- Fish Audio API key (get one here)
License
MIT