REST API

All endpoints are under the /api prefix. Voice processing (splitting and embedding) runs asynchronously: upload a recording, then poll the voice status until processing finishes.

Authentication

Supply your API key in the X-Api-Key header or as an api_key query parameter.

curl -X GET https://voice.hidoba.com/api/voices/my-partner \
-H "X-Api-Key: YOUR_API_KEY"
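For clients written in Python, the two authentication options could be sketched roughly as follows. This is a minimal illustration using only the standard library; the helper name build_request and the BASE_URL constant are assumptions introduced here (the host is taken from the curl example), and the function only builds the request without sending it.

```python
import urllib.parse
import urllib.request

# Host taken from the curl example; adjust if your deployment differs.
BASE_URL = "https://voice.hidoba.com/api"


def build_request(path, api_key, method="GET", use_query_param=False):
    """Build (but do not send) an authenticated request for an /api path.

    build_request is a hypothetical helper, not part of any official client.
    """
    url = BASE_URL + path
    headers = {}
    if use_query_param:
        # Alternative form: pass the key as an api_key query parameter.
        url += "?" + urllib.parse.urlencode({"api_key": api_key})
    else:
        # Default form: pass the key in the X-Api-Key header.
        headers["X-Api-Key"] = api_key
    return urllib.request.Request(url, headers=headers, method=method)
```

The returned object can be passed to urllib.request.urlopen to perform the call. The header form is usually preferable because query strings tend to end up in logs.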

Voices

List Voices

GET /api/voices/{partner}

Response

[
{
"name": "speaker-1",
"has_original": true,
"sample_count": 12,
"splitting_status": "completed",
"embedding_status": "completed",
"creator_name": "John",
"creator_email": "john@example.com"
}
]
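As an illustration of consuming this response, the sketch below filters the list down to voices whose splitting and embedding have both completed. The function name ready_voices is hypothetical; the field names come from the response shown above.

```python
import json


def ready_voices(voices):
    """Return the names of voices whose splitting and embedding both completed."""
    return [
        v["name"]
        for v in voices
        if v.get("splitting_status") == "completed"
        and v.get("embedding_status") == "completed"
    ]


# Example payload shaped like the List Voices response.
payload = json.loads("""
[
  {"name": "speaker-1", "splitting_status": "completed", "embedding_status": "completed"},
  {"name": "speaker-2", "splitting_status": "completed", "embedding_status": "in_progress"}
]
""")
print(ready_voices(payload))
```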

Create Voice (with Audio)

POST /api/voices/{partner}
Content-Type: multipart/form-data

Uploading an audio file automatically triggers splitting and embedding.

curl -X POST https://voice.hidoba.com/api/voices/my-partner \
-H "X-Api-Key: YOUR_API_KEY" \
-F "name=speaker-1" \
-F "file=@recording.wav" \
-F "creator_name=John" \
-F "creator_email=john@example.com"

Form Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | String | | Voice name (1–255 chars). Required. |
| file | File | | Audio file to upload. Optional. |
| creator_name | String | | Name of the person who created the voice. |
| creator_email | String | | Email of the creator. |
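The form parameters above can also be sent without curl. The following is a rough sketch of encoding them as multipart/form-data with the Python standard library; check_voice_name and encode_multipart are hypothetical helpers, and the application/octet-stream content type for the file part is an assumption (the API may accept or expect a more specific audio type).

```python
import uuid


def check_voice_name(name):
    """Enforce the documented 1-255 character limit on the name field."""
    if not 1 <= len(name) <= 255:
        raise ValueError("voice name must be 1-255 characters long")
    return name


def encode_multipart(fields, file_field=None):
    """Encode text fields and an optional file as multipart/form-data.

    fields: dict mapping field name to string value.
    file_field: optional (field_name, filename, content_bytes) tuple.
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    if file_field is not None:
        field_name, filename, content = file_field
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field_name}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"  # assumed type
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # Closing boundary terminates the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and content type would be sent as the payload and Content-Type header of a POST to /api/voices/{partner}, alongside the X-Api-Key header.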

Create Empty Voice

POST /api/voices/{partner}/empty

Creates a voice without an audio file. Use this when uploading samples manually.

curl -X POST https://voice.hidoba.com/api/voices/my-partner/empty \
-H "X-Api-Key: YOUR_API_KEY" \
-F "name=speaker-2"

Get Voice Details

GET /api/voices/{partner}/{voice}

Response

{
"name": "speaker-1",
"original_filename": "recording.opus",
"original_uploaded_name": "recording.wav",
"original_duration": 185.4,
"status": {
"splitting": "completed",
"embedding": "completed"
},
"samples": [
{
"index": 1,
"source_timestamp": "00:00",
"duration": 15.0
},
{
"index": 2,
"source_timestamp": "00:15",
"duration": 15.0
}
],
"embedding_info": {
"tts_type": "kyutai",
"last_update_status": "success",
"last_update_time": "2025-01-15T10:30:00Z",
"test_samples": [
{
"sample_index": 1,
"sentence_index": 1,
"filename": "test_1_1.opus"
}
],
"sample_scores": [
{ "sample_index": 3, "score": 0.92 },
{ "sample_index": 1, "score": 0.88 }
]
},
"embedding_type": "kyutai",
"embedding_zip_url": "https://...",
"preferred_sample_index": 3,
"creator_name": "John",
"creator_email": "john@example.com"
}
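One way a client might use this response is to pick a sample from sample_scores. The sketch below assumes that a higher score means a better sample; the documentation does not state this explicitly, although it is consistent with the example above, where preferred_sample_index is 3 and sample 3 has the highest score. The function name best_scored_sample is hypothetical.

```python
def best_scored_sample(details):
    """Return the sample_index with the highest score, or None if unscored.

    Assumes higher scores are better, which the API docs do not confirm.
    """
    scores = details.get("embedding_info", {}).get("sample_scores", [])
    if not scores:
        return None
    return max(scores, key=lambda s: s["score"])["sample_index"]


# Trimmed-down payload shaped like the Get Voice Details response.
details = {
    "name": "speaker-1",
    "embedding_info": {
        "sample_scores": [
            {"sample_index": 3, "score": 0.92},
            {"sample_index": 1, "score": 0.88},
        ]
    },
    "preferred_sample_index": 3,
}
print(best_scored_sample(details))
```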

Delete Voice

DELETE /api/voices/{partner}/{voice}

Note: Voices cannot be deleted while splitting or embedding is in progress. Such requests return a 409 Conflict response.

Rename Voice

PUT /api/voices/{partner}/{voice}/name
Content-Type: multipart/form-data

| Parameter | Type | Description |
| --- | --- | --- |
| name | String | New voice name. Required. |

Samples

Upload Sample

POST /api/voices/{partner}/{voice}/samples
Content-Type: multipart/form-data

Uploads a single audio sample to a voice that was created empty. The file is automatically converted to Opus (96 kbps, mono).

curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-2/samples \
-H "X-Api-Key: YOUR_API_KEY" \
-F "file=@sample.wav"

Download Sample

GET /api/voices/{partner}/{voice}/samples/{index}

Returns the opus audio file for the sample at the given index.

Delete Sample

DELETE /api/voices/{partner}/{voice}/samples/{index}

Download Sample Bundle

GET /api/voices/{partner}/{voice}/samples-bundle

Returns a ZIP archive containing all samples as {index}.opus files and a voice_info.json metadata file.
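Once downloaded, the bundle can be unpacked in memory. The following sketch assumes the {index}.opus files and voice_info.json sit at the root of the archive, which the description suggests but does not guarantee; read_bundle is a hypothetical helper name.

```python
import io
import json
import zipfile


def read_bundle(zip_bytes):
    """Unpack a samples bundle.

    Returns (voice_info, samples) where voice_info is the parsed
    voice_info.json (or None if absent) and samples maps each integer
    index to the raw bytes of its {index}.opus file.
    """
    info = None
    samples = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name == "voice_info.json":
                info = json.loads(zf.read(name))
            elif name.endswith(".opus"):
                stem = name[: -len(".opus")]
                if stem.isdigit():  # skip anything not named {index}.opus
                    samples[int(stem)] = zf.read(name)
    return info, samples
```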

Splitting

Queue Splitting Job

POST /api/voices/{partner}/{voice}/split
Content-Type: application/x-www-form-urlencoded

Re-splits the original recording with custom parameters. Clears existing samples.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| max_samples | Integer | 25 | Maximum number of samples to extract (1–100). |
| sample_duration | Integer | 15 | Duration of each sample in seconds (5–60). |

curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-1/split \
-H "X-Api-Key: YOUR_API_KEY" \
-d "max_samples=30&sample_duration=10"
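A client may want to validate these ranges before queuing a job. The sketch below checks the documented limits and produces the same URL-encoded body as the curl example; split_form_body is a hypothetical helper, and whether the server rejects out-of-range values or clamps them is not stated here.

```python
from urllib.parse import urlencode


def split_form_body(max_samples=25, sample_duration=15):
    """Validate splitting parameters and return the URL-encoded form body."""
    if not 1 <= max_samples <= 100:
        raise ValueError("max_samples must be between 1 and 100")
    if not 5 <= sample_duration <= 60:
        raise ValueError("sample_duration must be between 5 and 60 seconds")
    return urlencode(
        {"max_samples": max_samples, "sample_duration": sample_duration}
    )


print(split_form_body(30, 10))  # max_samples=30&sample_duration=10
```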

Embedding

Queue Embedding Job

POST /api/voices/{partner}/{voice}/embed

Submits the voice samples for embedding generation via RunPod.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tts_type | String | | TTS type override. If omitted, uses the voice's embedding_type (default kyutai). |

Get Test Samples

GET /api/voices/{partner}/{voice}/test-samples

Returns a list of TTS-generated test audio files created during the embedding process.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tts_type | String (query) | kyutai | Filter test samples by TTS type. |

Download Test Sample

GET /api/voices/{partner}/{voice}/test-samples/{filename}

Returns the test sample audio file.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| tts_type | String (query) | kyutai | TTS type folder to look in. |

Delete Test Files

DELETE /api/voices/{partner}/{voice}/test-files/{tts_type}

Set Preferred Sample

PUT /api/voices/{partner}/{voice}/preferred-sample

Sets the preferred sample index based on quality scores or manual selection.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sample_index | Integer (form) | | Sample index to set as preferred. Pass empty to clear. |

Voice Status

A voice tracks two independent processing states:

| Status | Values | Description |
| --- | --- | --- |
| splitting | idle, in_progress, completed, failed | Audio splitting into samples. |
| embedding | idle, in_progress, completed, failed | Voice embedding generation. |

When either status is in_progress, the voice is locked: rename and delete operations return 409 Conflict.
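The locking rule lends itself to a simple polling loop. The sketch below is one possible shape, not a prescribed client: fetch_status is a caller-supplied function (hypothetical) that returns the status object from Get Voice Details, and the interval and timeout defaults are arbitrary choices.

```python
import time


def is_locked(status):
    """True if either splitting or embedding is currently in progress."""
    return "in_progress" in (status.get("splitting"), status.get("embedding"))


def wait_until_unlocked(fetch_status, interval=5.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the voice is no longer locked.

    Returns the last status dict. Raises TimeoutError if the voice is
    still locked once the timeout has elapsed.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if not is_locked(status):
            return status
        if clock() >= deadline:
            raise TimeoutError("voice is still being processed")
        sleep(interval)
```

The loop returns as soon as neither job is in_progress, so the caller should still inspect the result for completed versus failed (or idle, if a job has not been picked up yet).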

Error Responses

Unauthorized (401)

{
"detail": "Invalid API key"
}

Forbidden (403)

{
"detail": "Access denied"
}

Not Found (404)

{
"detail": "Voice not found"
}

Conflict (409)

{
"detail": "Voice is currently being processed"
}

Rate Limited (429)

{
"detail": "Rate limit exceeded"
}
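Since every error body shares the same detail field, a client can funnel all of them through one parser. The following is a minimal sketch; ApiError and parse_error are hypothetical names, and treating 409 and 429 as retryable is an assumption about sensible client behavior rather than something the API specifies.

```python
import json


class ApiError(Exception):
    """Error carrying the HTTP status and the 'detail' message."""

    def __init__(self, status, detail):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail

    @property
    def retryable(self):
        # Assumption: a locked voice (409) or a rate limit (429) may
        # succeed later, while 401/403/404 will not change on retry.
        return self.status in (409, 429)


def parse_error(status, body):
    """Build an ApiError from a status code and a raw response body."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        detail = ""
    return ApiError(status, detail)
```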