fish-audio-plugin/README.md
Clawdbot 4842dc64a5 feat: scaffold Fish Audio speech provider plugin
- index.ts: plugin entry with definePluginEntry + registerSpeechProvider
- speech-provider.ts: full SpeechProviderPlugin implementation
  - resolveConfig from messages.tts.providers.fish-audio
  - parseDirectiveToken for voice, model, speed, latency, temperature, top_p
  - listVoices merging official + user's own voices
  - synthesize with format-aware output (opus for voice-note, mp3 otherwise)
  - stub Talk Mode (resolveTalkConfig/resolveTalkOverrides)
- tts.ts: raw fishAudioTTS() fetch + listFishAudioVoices()
  - streaming chunked → buffer, error body included in exceptions
  - parallel voice listing with graceful partial failure
- speech-provider.test.ts: voice ID validation tests
- openclaw.plugin.json: speechProviders contract
- package.json: peer dep on openclaw >=2026.3.0
2026-03-29 18:14:29 +11:00

# Fish Audio Speech Plugin for OpenClaw
A speech provider plugin that integrates [Fish Audio](https://fish.audio) TTS with OpenClaw.
## Features
- **Fish Audio S2-Pro / S1 / S2** model support
- **Dynamic voice listing** — your own cloned voices + official Fish Audio voices
- **Format-aware output** — opus for voice-note channels (e.g. Telegram, WhatsApp), mp3 otherwise
- **Inline directives** — switch voice, speed, model, latency, temperature, and top-p mid-message
- **No core changes required** — standard `SpeechProviderPlugin` extension
## Installation
```bash
openclaw plugins install @openclaw/fish-audio-speech
```
## Configuration
In your `openclaw.json`:
```json
{
  "messages": {
    "tts": {
      "provider": "fish-audio",
      "providers": {
        "fish-audio": {
          "apiKey": "your-fish-audio-api-key",
          "voiceId": "8a2d42279389471993460b85340235c5",
          "model": "s2-pro",
          "latency": "normal",
          "speed": 1.0
        }
      }
    }
  }
}
```
### Config Options
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `apiKey` | string | — | **Required.** Fish Audio API key |
| `voiceId` | string | `8a2d42...` | Reference ID of the voice to use |
| `model` | string | `s2-pro` | TTS model (`s2-pro`, `s1`, `s2`) |
| `latency` | string | `normal` | Latency mode (`normal`, `balanced`, `low`) |
| `speed` | number | — | Prosody speed (0.5-2.0) |
| `temperature` | number | — | Sampling temperature (0-1) |
| `topP` | number | — | Top-p sampling (0-1) |
| `baseUrl` | string | `https://api.fish.audio` | API base URL |
### Environment Variable
You can also set the API key via environment variable:
```bash
FISH_AUDIO_API_KEY=your-key
```
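As a rough illustration of how the options above and the environment variable could be combined, here is a minimal sketch. The `FishAudioConfig` shape, `resolveFishAudioConfig`, and `checkRange` are hypothetical names, not the plugin's actual API. The precedence (config file first, then `FISH_AUDIO_API_KEY`) is an assumption. The `voiceId` default is left unset because its full value is not given in the table.
```typescript
// Hypothetical config shape mirroring the Config Options table;
// the plugin's real types may differ.
interface FishAudioConfig {
  apiKey: string;
  voiceId?: string;
  model: string;
  latency: string;
  speed?: number;
  temperature?: number;
  topP?: number;
  baseUrl: string;
}

type Env = Record<string, string | undefined>;

// Throws if a provided numeric option falls outside its documented range.
function checkRange(name: string, value: number | undefined, min: number, max: number): void {
  if (value !== undefined && (!Number.isFinite(value) || value < min || value > max)) {
    throw new Error(`${name} must be between ${min} and ${max}, got ${value}`);
  }
}

function resolveFishAudioConfig(raw: Partial<FishAudioConfig>, env: Env): FishAudioConfig {
  // Assumption: a key in openclaw.json wins over the environment variable.
  const apiKey = raw.apiKey ?? env["FISH_AUDIO_API_KEY"];
  if (!apiKey) {
    throw new Error("Fish Audio API key missing: set apiKey or FISH_AUDIO_API_KEY");
  }
  checkRange("speed", raw.speed, 0.5, 2.0);
  checkRange("temperature", raw.temperature, 0, 1);
  checkRange("topP", raw.topP, 0, 1);
  return {
    ...raw,
    apiKey,
    // Defaults taken from the Config Options table.
    model: raw.model ?? "s2-pro",
    latency: raw.latency ?? "normal",
    baseUrl: raw.baseUrl ?? "https://api.fish.audio",
  };
}
```
In a Node.js process this could be called as `resolveFishAudioConfig(providerConfig, process.env)`, where `providerConfig` is the `fish-audio` object from `messages.tts.providers`.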
## Directives
Use inline directives in your messages to control TTS per-message:
```
[[tts:voice=<ref_id>]]    Switch voice
[[tts:speed=1.2]]         Prosody speed (0.5-2.0)
[[tts:model=s1]]          Model override
[[tts:latency=low]]       Latency mode
[[tts:temperature=0.7]]   Sampling temperature
[[tts:top_p=0.8]]         Top-p sampling
```
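The directive syntax above could be parsed along the following lines. This is only a sketch: `extractDirectives`, `TtsOverrides`, and `numberInRange` are hypothetical names and not the plugin's `parseDirectiveToken`. It applies overrides to the whole message rather than switching mid-message. It also assumes that out-of-range numeric values are silently dropped and that unknown keys are left in the text.
```typescript
interface TtsOverrides {
  voice?: string;
  model?: string;
  latency?: string;
  speed?: number;
  temperature?: number;
  topP?: number;
}

// Matches tokens such as [[tts:speed=1.2]]; captures the key and the value.
const DIRECTIVE = /\[\[tts:([a-z_]+)=([^\]\s]+)\]\]/g;

// Parses a number and returns it only if it lies within [min, max].
function numberInRange(value: string, min: number, max: number): number | undefined {
  const n = Number(value);
  return Number.isFinite(n) && n >= min && n <= max ? n : undefined;
}

function extractDirectives(message: string): { text: string; overrides: TtsOverrides } {
  const overrides: TtsOverrides = {};
  const stripped = message.replace(DIRECTIVE, (match: string, key: string, value: string) => {
    let num: number | undefined;
    switch (key) {
      case "voice": overrides.voice = value; return "";
      case "model": overrides.model = value; return "";
      case "latency": overrides.latency = value; return "";
      case "speed":
        num = numberInRange(value, 0.5, 2.0);
        if (num !== undefined) overrides.speed = num;
        return "";
      case "temperature":
        num = numberInRange(value, 0, 1);
        if (num !== undefined) overrides.temperature = num;
        return "";
      case "top_p":
        num = numberInRange(value, 0, 1);
        if (num !== undefined) overrides.topP = num;
        return "";
      default:
        return match; // Assumption: unknown keys stay in the spoken text.
    }
  });
  // Collapse the gaps left behind by removed tokens.
  return { text: stripped.replace(/\s{2,}/g, " ").trim(), overrides };
}
```
For example, `extractDirectives("[[tts:speed=1.2]] Hello there")` yields the text `Hello there` with `overrides.speed` set to `1.2`.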
## Voice Listing
The plugin dynamically lists available voices via `/tts voices`:
- **Official Fish Audio voices** (~38 voices)
- **Your own cloned/trained voices** (marked with "(mine)")
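One way to merge the two voice sources so that a failure in one does not hide the other is sketched below. The `Voice` type, `listVoices`, and the two fetcher parameters are hypothetical stand-ins for whatever the plugin actually uses to call Fish Audio. Letting a user's own voice replace an official voice with the same ID is an assumption.
```typescript
interface Voice {
  id: string;
  name: string;
}

// Fetches both lists in parallel and keeps whichever ones succeed.
async function listVoices(
  fetchOfficial: () => Promise<Voice[]>,
  fetchMine: () => Promise<Voice[]>,
): Promise<Voice[]> {
  const [official, mine] = await Promise.allSettled([fetchOfficial(), fetchMine()]);
  const merged = new Map<string, Voice>();
  if (official.status === "fulfilled") {
    for (const v of official.value) merged.set(v.id, v);
  }
  if (mine.status === "fulfilled") {
    // Assumption: on an ID collision the user's own voice wins.
    for (const v of mine.value) merged.set(v.id, { id: v.id, name: `${v.name} (mine)` });
  }
  return Array.from(merged.values());
}
```
`Promise.allSettled` never rejects, so a timeout on one list still returns the other; if both fail the result is an empty array.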
## Output Format
The plugin automatically selects the best format based on the channel:
- **Voice note channels** (Telegram, WhatsApp, Matrix, Feishu) → Opus
- **All other channels** → MP3
Both formats set `voiceCompatible: true` — Fish Audio output works cleanly as native voice notes.
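The selection rule above amounts to a small lookup, sketched here. The lowercase channel identifiers and the `pickFormat` name are assumptions for illustration; the plugin may key the decision on a different signal than the channel name.
```typescript
type AudioFormat = "opus" | "mp3";

// Assumed channel identifiers for the voice-note channels listed above.
const VOICE_NOTE_CHANNELS = new Set(["telegram", "whatsapp", "matrix", "feishu"]);

function pickFormat(channel: string): { format: AudioFormat; voiceCompatible: true } {
  const format = VOICE_NOTE_CHANNELS.has(channel.toLowerCase()) ? "opus" : "mp3";
  // Both formats are flagged as voice-compatible, as described above.
  return { format, voiceCompatible: true };
}
```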
## Requirements
- OpenClaw ≥ 2026.3.0
- Fish Audio API key ([get one here](https://fish.audio))
## License
MIT