voice api

The voice API

Gabriel needed a voice, so we built one. Now the same stack is opening up to the public. Endpoints are OpenAI-compatible, so if your code talks to the OpenAI API, it already talks to ours.

read the docs · developer portal

compat: OpenAI SDKs
streaming: HTTP + WebSocket
formats: mp3, wav, opus, pcm
price: free in early access

Text to Speech

live

▸ OpenAI-compatible /v1/audio/speech endpoint
▸ Voice cloning from your own audio samples
▸ Real-time streaming over HTTP and WebSocket
▸ PCM, WAV, MP3, and Opus output
▸ Per-key rate limits and usage tracking
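
A minimal sketch of the streaming and format bullets above, using only the Python standard library: it POSTs to the speech endpoint and writes audio to disk as chunks arrive. The `response_format` field is an assumption borrowed from the OpenAI speech API (this page lists the formats but not the parameter name), and `build_speech_request` and `stream_speech_to_file` are illustrative helper names, not part of any SDK.

```python
import json
import urllib.request

BASE_URL = "https://api.hoppou.ai/tts/v1"  # base URL from the quick start


def build_speech_request(api_key, text, voice, fmt="mp3", model="pocket-tts"):
    """Build (but do not send) a POST request for /audio/speech."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        # Assumption: same field name as the OpenAI speech API.
        "response_format": fmt,
    }
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def stream_speech_to_file(request, path, chunk_size=4096):
    """Send the request and write the audio to disk chunk by chunk."""
    written = 0
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            written += len(chunk)
    return written  # total bytes written


# Usage (needs a real key and a voice ID from the portal):
# req = build_speech_request("your-hoppou-key", "hello world",
#                            "your-cloned-voice", fmt="opus")
# stream_speech_to_file(req, "speech.opus")
```

Keeping the request builder separate from the network call makes it easy to check the payload before anything is sent.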

Speech to Text

in testing

Built on parakeet.cpp. Fast, accurate transcription without a datacenter bill. It's in internal testing right now, with public access coming soon.

▸ Same base URL, same API keys as TTS
▸ OpenAI-compatible /v1/audio/transcriptions
▸ Built on parakeet.cpp for speed
▸ Streaming transcription planned

want early access? say hi: hello@hoppou.ai

quick start

# python, with the openai sdk

from openai import OpenAI

client = OpenAI(
    base_url="https://api.hoppou.ai/tts/v1",
    api_key="your-hoppou-key",
)

audio = client.audio.speech.create(
    model="pocket-tts",
    voice="your-cloned-voice",
    input="hello world",
)
audio.write_to_file("speech.mp3")

// plain fetch, no sdk needed

const res = await fetch("https://api.hoppou.ai/tts/v1/audio/speech", {
  method: "POST",
  headers: {
    Authorization: "Bearer your-hoppou-key",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "pocket-tts",
    voice: "your-cloned-voice",
    input: "hello world",
  }),
});
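// fail loudly instead of treating an error body as audio
if (!res.ok) throw new Error(`speech request failed: ${res.status}`);
const audio = await res.arrayBuffer();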

# straight from the terminal

curl https://api.hoppou.ai/tts/v1/audio/speech \
  -H "Authorization: Bearer your-hoppou-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"pocket-tts","voice":"your-cloned-voice","input":"hello world"}' \
  --output speech.mp3

# speech to text, same client (in testing)

# open the recording in a with-block so the file handle gets closed
with open("recording.wav", "rb") as recording:
    text = client.audio.transcriptions.create(
        model="parakeet",
        file=recording,
    )
print(text.text)

people point this at

discord bots, vrchat ais, game npcs, accessibility tools, home assistants, content pipelines, twitch alerts

Get a key

Clone a voice

Upload a clean audio sample on the portal and you get a voice ID you can use in any speech request.
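
Before uploading, it can help to sanity-check the sample locally. The sketch below reads a WAV header with the standard-library `wave` module and flags obviously weak samples. The thresholds (`min_seconds`, `min_rate`) are placeholder assumptions, not documented portal requirements, and both function names are illustrative.

```python
import wave


def inspect_wav(source):
    """Return basic facts about a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "seconds": frames / rate if rate else 0.0,
        }


def sample_warnings(info, min_seconds=5.0, min_rate=16000):
    """List reasons a sample might clone poorly. Thresholds are assumptions."""
    warnings = []
    if info["seconds"] < min_seconds:
        warnings.append(f"short sample: {info['seconds']:.1f}s")
    if info["sample_rate"] < min_rate:
        warnings.append(f"low sample rate: {info['sample_rate']} Hz")
    return warnings


# Usage:
# info = inspect_wav("my-voice.wav")
# for warning in sample_warnings(info):
#     print(warning)
```

This only looks at the header. It says nothing about background noise or clipping, which still need a listen.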

Ship it

Point your existing OpenAI SDK at our base URL. Streaming, WebSocket, and batch all work the same way.
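
As a rough sketch of the batch case, assuming "batch" here just means looping over inputs with the regular speech call rather than a dedicated batch endpoint: the helper below takes an already-configured OpenAI client and writes one file per line. `synthesize_batch` and the `line_NNN.mp3` naming are illustrative, not part of the API.

```python
from pathlib import Path


def synthesize_batch(client, lines, out_dir,
                     model="pocket-tts", voice="your-cloned-voice"):
    """Synthesize each non-empty line to its own MP3 and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Skip blank lines so numbering only counts real inputs.
    texts = [line.strip() for line in lines if line.strip()]

    paths = []
    for index, text in enumerate(texts, start=1):
        audio = client.audio.speech.create(model=model, voice=voice, input=text)
        path = out_dir / f"line_{index:03d}.mp3"
        audio.write_to_file(path)
        paths.append(path)
    return paths


# Usage (same client setup as the quick start):
# from openai import OpenAI
# client = OpenAI(base_url="https://api.hoppou.ai/tts/v1",
#                 api_key="your-hoppou-key")
# synthesize_batch(client, ["hello world", "second line"], "out")
```

Passing the client in rather than creating it inside the function means existing OpenAI-based code can reuse whatever client it already has.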

questions people ask

Is it actually OpenAI-compatible?

Yes. Same request and response shapes as /v1/audio/speech and /v1/audio/transcriptions. Point your existing SDK at our base URL, swap the model name, done. If something behaves differently, that's a bug and we want to hear about it.

What does it cost?

Nothing right now. Keys are free during early access while we figure out what fair limits look like. There's no credit card field anywhere on this site.

Can I clone any voice?

You can clone voices you have the rights to: your own, ones you made, or ones you have permission for. Don't clone real people without their OK. We pull keys for that.

How is this different from the big providers?

It's small and fast, and it's the same stack our VRChat AIs talk through in production every day. No sales call, no dashboard maze, no surprise invoice.

When does speech-to-text open up?

It's in internal testing on parakeet.cpp right now. Email hello@hoppou.ai if you want to kick the tires early.

get started

Ready when you are

Grab a key, point your SDK north, and you're making noise in a couple of minutes. Questions first? We actually answer email.

get a key · hello@hoppou.ai