
# Fish Audio Speech Plugin for OpenClaw

A speech provider plugin that integrates [Fish Audio](https://fish.audio) TTS with OpenClaw.

## Features

- **Fish Audio S2-Pro / S1 / S2** model support
- **Dynamic voice listing**: your own cloned voices plus official Fish Audio voices
- **Format-aware output**: Opus for voice-note channels (Telegram, WhatsApp, Matrix, Feishu), MP3 otherwise
- **Inline directives**: switch voice, speed, model, latency, and sampling parameters mid-message
- **No core changes required**: standard `SpeechProviderPlugin` extension

## Installation

```bash
openclaw plugins install @openclaw/fish-audio-speech
```

## Configuration

In your `openclaw.json`:

```json
{
  "messages": {
    "tts": {
      "provider": "fish-audio",
      "providers": {
        "fish-audio": {
          "apiKey": "your-fish-audio-api-key",
          "voiceId": "8a2d42279389471993460b85340235c5",
          "model": "s2-pro",
          "latency": "normal",
          "speed": 1.0
        }
      }
    }
  }
}
```

### Config Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `apiKey` | string | — | **Required.** Fish Audio API key |
| `voiceId` | string | — | **Required.** Reference ID of the voice to use |
| `model` | string | `s2-pro` | TTS model (`s2-pro`, `s1`, `s2`) |
| `latency` | string | `normal` | Latency mode (`normal`, `balanced`, `low`) |
| `speed` | number | — | Prosody speed (0.5–2.0) |
| `temperature` | number | — | Sampling temperature (0–1) |
| `topP` | number | — | Top-p sampling (0–1) |
| `baseUrl` | string | `https://api.fish.audio` | API base URL |
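
The ranges and allowed values in the table can be checked before the config is handed to the plugin. The following is a minimal sketch, not the plugin's own validation: the `FishAudioConfig` interface and the `validateFishAudioConfig` helper are hypothetical names that simply mirror the documented fields.

```typescript
// Hypothetical shape mirroring the documented fields; not the plugin's real type.
interface FishAudioConfig {
  apiKey?: string;
  voiceId?: string;
  model?: string;
  latency?: string;
  speed?: number;
  temperature?: number;
  topP?: number;
  baseUrl?: string;
}

// Allowed values taken from the Config Options table.
const FISH_MODELS = ["s2-pro", "s1", "s2"];
const FISH_LATENCIES = ["normal", "balanced", "low"];

// True when the value is unset or lies within [lo, hi]; NaN is rejected.
function inRange(value: number | undefined, lo: number, hi: number): boolean {
  return value === undefined || (value >= lo && value <= hi);
}

// Returns a list of problems; an empty list means the config looks valid.
// apiKey is not checked here because it may come from the environment instead.
function validateFishAudioConfig(cfg: FishAudioConfig): string[] {
  const problems: string[] = [];
  if (!cfg.voiceId) problems.push("voiceId is required");
  if (cfg.model !== undefined && FISH_MODELS.indexOf(cfg.model) === -1) {
    problems.push(`model must be one of ${FISH_MODELS.join(", ")}`);
  }
  if (cfg.latency !== undefined && FISH_LATENCIES.indexOf(cfg.latency) === -1) {
    problems.push(`latency must be one of ${FISH_LATENCIES.join(", ")}`);
  }
  if (!inRange(cfg.speed, 0.5, 2.0)) problems.push("speed must be between 0.5 and 2.0");
  if (!inRange(cfg.temperature, 0, 1)) problems.push("temperature must be between 0 and 1");
  if (!inRange(cfg.topP, 0, 1)) problems.push("topP must be between 0 and 1");
  return problems;
}
```

Running such a check at startup surfaces a mistyped model name or an out-of-range speed before the first TTS request is made.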

### Environment Variable

The API key can also be supplied through the `FISH_AUDIO_API_KEY` environment variable:

```bash
export FISH_AUDIO_API_KEY=your-key
```
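
How the two sources of the key interact is not specified here. The sketch below assumes the config value takes precedence and the environment variable acts as a fallback; `resolveFishAudioApiKey` is a hypothetical helper, and the environment is passed in as a plain object so the function stays easy to test.

```typescript
// A plain map of environment variables, e.g. process.env in Node.js.
type EnvMap = Record<string, string | undefined>;

// Assumption: the config value wins and FISH_AUDIO_API_KEY is the fallback.
// The plugin's actual precedence may differ.
function resolveFishAudioApiKey(configKey: string | undefined, env: EnvMap): string {
  const fromConfig = (configKey || "").trim();
  const fromEnv = (env["FISH_AUDIO_API_KEY"] || "").trim();
  const key = fromConfig || fromEnv;
  if (!key) {
    throw new Error("No Fish Audio API key: set apiKey or FISH_AUDIO_API_KEY");
  }
  return key;
}
```

In a Node.js process this could be called as `resolveFishAudioApiKey(cfg.apiKey, process.env)`.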

## Directives

Use inline directives to override TTS settings from within a message:

```
[[tts:voice=<ref_id>]]     Switch voice
[[tts:speed=1.2]]          Prosody speed (0.5–2.0)
[[tts:model=s1]]           Model override
[[tts:latency=low]]        Latency mode
[[tts:temperature=0.7]]    Sampling temperature
[[tts:top_p=0.8]]          Top-p sampling
```
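
To make the `[[tts:key=value]]` syntax concrete, here is a rough sketch of how such directives could be extracted from a message. It is an illustration only: `parseTtsDirectives` is a hypothetical function, and stripping the directives from the spoken text is an assumption about the behavior, not a description of the plugin's parser.

```typescript
interface ParsedTtsMessage {
  text: string;                      // message with directives removed
  overrides: Record<string, string>; // raw key/value pairs, e.g. { speed: "1.2" }
}

// Matches [[tts:key=value]] where key is lowercase letters/underscores
// and value contains no whitespace or closing bracket.
const TTS_DIRECTIVE = /\[\[tts:([a-z_]+)=([^\]\s]+)\]\]/g;

function parseTtsDirectives(message: string): ParsedTtsMessage {
  const overrides: Record<string, string> = {};
  const stripped = message.replace(
    TTS_DIRECTIVE,
    (_match: string, key: string, value: string): string => {
      overrides[key] = value; // a later directive with the same key overwrites an earlier one
      return "";
    }
  );
  // Collapse the gaps left behind by removed directives.
  const text = stripped.replace(/\s{2,}/g, " ").trim();
  return { text, overrides };
}
```

For example, `parseTtsDirectives("Hello [[tts:speed=1.2]] world")` yields the text `"Hello world"` and the override `speed = "1.2"`. Values stay as strings here; converting `speed`, `temperature`, and `top_p` to numbers and range-checking them would be a separate step.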

## Voice Listing

The plugin dynamically lists available voices via `/tts voices`:
- **Official Fish Audio voices** (~38 voices)
- **Your own cloned/trained voices** (marked with "(mine)")
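
The combined listing can be pictured as a merge of two lists in which personal voices carry the "(mine)" marker. The sketch below is purely illustrative: the `VoiceEntry` shape, the `buildVoiceList` helper, and the rule that a personal voice shadows an official one with the same ID are all assumptions.

```typescript
// Hypothetical minimal voice record; the real API response has more fields.
interface VoiceEntry {
  id: string;
  name: string;
}

interface LabeledVoice {
  id: string;
  label: string;
}

// Lists personal voices first, tagged "(mine)", then official voices.
// Assumption: an ID present in both lists is shown once, as a personal voice.
function buildVoiceList(mine: VoiceEntry[], official: VoiceEntry[]): LabeledVoice[] {
  const seen: Record<string, boolean> = {};
  const result: LabeledVoice[] = [];
  for (const voice of mine) {
    if (seen[voice.id]) continue;
    seen[voice.id] = true;
    result.push({ id: voice.id, label: `${voice.name} (mine)` });
  }
  for (const voice of official) {
    if (seen[voice.id]) continue;
    seen[voice.id] = true;
    result.push({ id: voice.id, label: voice.name });
  }
  return result;
}
```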

## Output Format

The plugin automatically selects the best format based on the channel:
- **Voice note channels** (Telegram, WhatsApp, Matrix, Feishu) → Opus
- **All other channels** → MP3

Both formats set `voiceCompatible: true`, so Fish Audio output can be delivered as a native voice note in either format.
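
The selection rule above amounts to a lookup on the channel name. A minimal sketch follows, assuming lowercase channel identifiers such as `"telegram"`; the identifiers and the `pickOutputFormat` helper are hypothetical, not the plugin's actual code.

```typescript
type AudioFormat = "opus" | "mp3";

interface OutputChoice {
  format: AudioFormat;
  voiceCompatible: boolean;
}

// Channel identifiers are assumed; the list mirrors the channels named above.
const VOICE_NOTE_CHANNELS = ["telegram", "whatsapp", "matrix", "feishu"];

function pickOutputFormat(channel: string): OutputChoice {
  const isVoiceNoteChannel = VOICE_NOTE_CHANNELS.indexOf(channel.toLowerCase()) !== -1;
  return {
    format: isVoiceNoteChannel ? "opus" : "mp3",
    voiceCompatible: true, // set for both formats, as described above
  };
}
```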

## Requirements

- OpenClaw ≥ 2026.3.0
- Fish Audio API key ([get one here](https://fish.audio))

## License

MIT