REST API
All endpoints are under the /api prefix. Voice processing (splitting and embedding) runs asynchronously: upload a recording, then poll the voice's status until processing finishes.
Authentication
Supply your API key in the X-Api-Key header or as an api_key query parameter.
curl -X GET https://voice.hidoba.com/api/voices/my-partner \
  -H "X-Api-Key: YOUR_API_KEY"
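The same header-based authentication can be sketched in Python with only the standard library. The helper name `build_request` and the `BASE_URL` constant are illustrative assumptions, not part of the API; the host is the one used in the curl example.

```python
import urllib.request

BASE_URL = "https://voice.hidoba.com"  # host taken from the curl example


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Return an unsent request that carries the API key in the X-Api-Key header.

    Hypothetical helper for illustration only.
    """
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"X-Api-Key": api_key},
    )


req = build_request("/api/voices/my-partner", "YOUR_API_KEY")
# urllib stores header names in capitalized form ("X-api-key");
# HTTP header names are case-insensitive, so the server sees the same header.
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Keeping the key in a header rather than the api_key query parameter avoids it showing up in URLs that get logged or shared.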
Voices
List Voices
GET /api/voices/{partner}
Response
[
  {
    "name": "speaker-1",
    "has_original": true,
    "sample_count": 12,
    "splitting_status": "completed",
    "embedding_status": "completed",
    "creator_name": "John",
    "creator_email": "john@example.com"
  }
]
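As a small illustration of consuming this response, the sketch below picks out the voices whose splitting and embedding have both completed. The function name `ready_voices` is hypothetical, and the sample list mirrors the field names of the response above with made-up entries.

```python
def ready_voices(voices: list[dict]) -> list[str]:
    """Return the names of voices whose two processing states are both 'completed'."""
    return [
        voice["name"]
        for voice in voices
        if voice.get("splitting_status") == "completed"
        and voice.get("embedding_status") == "completed"
    ]


# Illustrative data shaped like the List Voices response.
listing = [
    {"name": "speaker-1", "splitting_status": "completed", "embedding_status": "completed"},
    {"name": "speaker-2", "splitting_status": "completed", "embedding_status": "in_progress"},
]
ready = ready_voices(listing)
```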
Create Voice (with Audio)
POST /api/voices/{partner}
Content-Type: multipart/form-data
Uploading an audio file automatically triggers splitting and embedding.
curl -X POST https://voice.hidoba.com/api/voices/my-partner \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "name=speaker-1" \
  -F "file=@recording.wav" \
  -F "creator_name=John" \
  -F "creator_email=john@example.com"
Form Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | String | — | Voice name (1–255 chars). Required. |
| file | File | — | Audio file to upload. Optional. |
| creator_name | String | — | Name of the person who created the voice. |
| creator_email | String | — | Email of the creator. |
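For clients without a multipart helper, the form body can be assembled by hand. This is a minimal sketch assuming plain text fields plus one file part; `encode_multipart` is a hypothetical helper, and the `application/octet-stream` part type is an assumption, since the required file content type is not documented here.

```python
import uuid


def encode_multipart(
    fields: dict[str, str], file_field: str, filename: str, file_bytes: bytes
) -> tuple[bytes, str]:
    """Build a multipart/form-data body; return (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    for key, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{key}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",  # assumed generic type
        b"",
        file_bytes,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart(
    {"name": "speaker-1", "creator_name": "John"},
    "file",
    "recording.wav",
    b"RIFF....",  # placeholder bytes, not real audio
)
```

The returned `content_type` goes into the request's Content-Type header alongside X-Api-Key, and `body` is sent as the POST payload.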
Create Empty Voice
POST /api/voices/{partner}/empty
Creates a voice without an audio file. Use this when uploading samples manually.
curl -X POST https://voice.hidoba.com/api/voices/my-partner/empty \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "name=speaker-2"
Get Voice Details
GET /api/voices/{partner}/{voice}
Response
{
  "name": "speaker-1",
  "original_filename": "recording.opus",
  "original_uploaded_name": "recording.wav",
  "original_duration": 185.4,
  "status": {
    "splitting": "completed",
    "embedding": "completed"
  },
  "samples": [
    {
      "index": 1,
      "source_timestamp": "00:00",
      "duration": 15.0
    },
    {
      "index": 2,
      "source_timestamp": "00:15",
      "duration": 15.0
    }
  ],
  "embedding_info": {
    "tts_type": "kyutai",
    "last_update_status": "success",
    "last_update_time": "2025-01-15T10:30:00Z",
    "test_samples": [
      {
        "sample_index": 1,
        "sentence_index": 1,
        "filename": "test_1_1.opus"
      }
    ],
    "sample_scores": [
      { "sample_index": 3, "score": 0.92 },
      { "sample_index": 1, "score": 0.88 }
    ]
  },
  "embedding_type": "kyutai",
  "embedding_zip_url": "https://...",
  "preferred_sample_index": 3,
  "creator_name": "John",
  "creator_email": "john@example.com"
}
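One way to use the sample_scores array is to find the highest-scoring sample. The sketch below assumes that a higher score means a better sample, which the response format suggests but does not state; `best_scored_sample` is a hypothetical name.

```python
from typing import Optional


def best_scored_sample(details: dict) -> Optional[int]:
    """Return the sample_index with the highest score, or None if there are no scores.

    Assumes higher scores are better (not confirmed by the API description).
    """
    scores = details.get("embedding_info", {}).get("sample_scores", [])
    if not scores:
        return None
    return max(scores, key=lambda entry: entry["score"])["sample_index"]


# Trimmed copy of the example response.
details = {
    "embedding_info": {
        "sample_scores": [
            {"sample_index": 3, "score": 0.92},
            {"sample_index": 1, "score": 0.88},
        ]
    },
    "preferred_sample_index": 3,
}
best = best_scored_sample(details)
```

In the example response the top-scoring index and preferred_sample_index are both 3.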
Delete Voice
DELETE /api/voices/{partner}/{voice}
A voice cannot be deleted while splitting or embedding is in progress; such a request returns 409 Conflict.
Rename Voice
PUT /api/voices/{partner}/{voice}/name
Content-Type: multipart/form-data
| Parameter | Type | Description |
|---|---|---|
| name | String | New voice name. Required. |
Samples
Upload Sample
POST /api/voices/{partner}/{voice}/samples
Content-Type: multipart/form-data
Upload a single audio sample to a voice that was created empty. The file is automatically converted to Opus (96 kbps, mono).
curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-2/samples \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "file=@sample.wav"
Download Sample
GET /api/voices/{partner}/{voice}/samples/{index}
Returns the opus audio file for the sample at the given index.
Delete Sample
DELETE /api/voices/{partner}/{voice}/samples/{index}
Download Sample Bundle
GET /api/voices/{partner}/{voice}/samples-bundle
Returns a ZIP archive containing all samples as {index}.opus files and a voice_info.json metadata file.
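Once downloaded, the bundle can be unpacked in memory with Python's zipfile module. This sketch assumes the {index}.opus files and voice_info.json sit at the archive root, which is not stated above; `read_bundle` is a hypothetical helper, and the demo archive is fabricated locally only to exercise it.

```python
import io
import json
import zipfile


def read_bundle(zip_bytes: bytes) -> tuple[dict, dict[int, bytes]]:
    """Return (voice_info, {sample index: opus bytes}) from a samples-bundle ZIP."""
    samples: dict[int, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        info = json.loads(archive.read("voice_info.json"))
        for name in archive.namelist():
            stem, _, extension = name.rpartition(".")
            if extension == "opus" and stem.isdigit():
                samples[int(stem)] = archive.read(name)
    return info, samples


# Build a tiny stand-in archive; the audio bytes are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("voice_info.json", json.dumps({"name": "speaker-1"}))
    demo.writestr("1.opus", b"fake-audio-1")
    demo.writestr("2.opus", b"fake-audio-2")

info, samples = read_bundle(buffer.getvalue())
```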
Splitting
Queue Splitting Job
POST /api/voices/{partner}/{voice}/split
Content-Type: application/x-www-form-urlencoded
Re-splits the original recording with custom parameters and clears the existing samples.
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_samples | Integer | 25 | Maximum number of samples to extract (1–100). |
| sample_duration | Integer | 15 | Duration of each sample in seconds (5–60). |
curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-1/split \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d "max_samples=30&sample_duration=10"
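A client can check the documented ranges before sending the request. The sketch below builds the URL-encoded body for the split endpoint; `split_form` is a hypothetical helper, and the client-side validation is an illustration rather than a description of what the server enforces.

```python
from urllib.parse import urlencode


def split_form(max_samples: int = 25, sample_duration: int = 15) -> bytes:
    """Return an application/x-www-form-urlencoded body for the split endpoint."""
    if not 1 <= max_samples <= 100:
        raise ValueError("max_samples must be between 1 and 100")
    if not 5 <= sample_duration <= 60:
        raise ValueError("sample_duration must be between 5 and 60 seconds")
    return urlencode(
        {"max_samples": max_samples, "sample_duration": sample_duration}
    ).encode("ascii")


payload = split_form(max_samples=30, sample_duration=10)
```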
Embedding
Queue Embedding Job
POST /api/voices/{partner}/{voice}/embed
Submits the voice samples for embedding generation via RunPod.
| Parameter | Type | Default | Description |
|---|---|---|---|
| tts_type | String | — | TTS type override. If omitted, uses the voice's embedding_type (default kyutai). |
Get Test Samples
GET /api/voices/{partner}/{voice}/test-samples
Returns a list of TTS-generated test audio files created during the embedding process.
| Parameter | Type | Default | Description |
|---|---|---|---|
| tts_type | String (query) | kyutai | Filter test samples by TTS type. |
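Because tts_type is a query parameter here, it belongs in the URL rather than the body. A minimal sketch of building that URL follows; `list_test_samples_url` and `BASE_URL` are illustrative names, and percent-encoding the path segments is a precaution rather than a documented requirement.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://voice.hidoba.com"  # host taken from the curl examples


def list_test_samples_url(partner: str, voice: str, tts_type: str = "kyutai") -> str:
    """Return the test-samples URL with tts_type passed as a query parameter."""
    path = f"/api/voices/{quote(partner, safe='')}/{quote(voice, safe='')}/test-samples"
    return f"{BASE_URL}{path}?{urlencode({'tts_type': tts_type})}"


url = list_test_samples_url("my-partner", "speaker-1")
```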
Download Test Sample
GET /api/voices/{partner}/{voice}/test-samples/{filename}
Returns the test sample audio file.
| Parameter | Type | Default | Description |
|---|---|---|---|
| tts_type | String (query) | kyutai | TTS type folder to look in. |
Delete Test Files
DELETE /api/voices/{partner}/{voice}/test-files/{tts_type}
Set Preferred Sample
PUT /api/voices/{partner}/{voice}/preferred-sample
Sets the voice's preferred sample index, chosen either from the quality scores or by manual selection.
| Parameter | Type | Default | Description |
|---|---|---|---|
| sample_index | Integer (form) | — | Sample index to set as preferred. Pass an empty value to clear the preferred sample. |
Voice Status
A voice tracks two independent processing states:
| Status | Values | Description |
|---|---|---|
| splitting | idle, in_progress, completed, failed | Audio splitting into samples. |
| embedding | idle, in_progress, completed, failed | Voice embedding generation. |
When either status is in_progress, the voice is locked: rename and delete requests return 409 Conflict.
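A polling loop over these two states might look like the sketch below. It takes a caller-supplied `fetch_details` callable (for example, one that requests Get Voice Details and decodes the JSON), so no particular HTTP client is assumed. The function name, the interval, and the timeout are all illustrative choices.

```python
import time
from typing import Callable


def wait_until_unlocked(
    fetch_details: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until neither splitting nor embedding is in_progress; return the details."""
    deadline = time.monotonic() + timeout
    while True:
        details = fetch_details()
        status = details.get("status", {})
        if "in_progress" not in (status.get("splitting"), status.get("embedding")):
            return details
        if time.monotonic() >= deadline:
            raise TimeoutError("voice is still being processed")
        sleep(interval)


# Canned responses stand in for real API calls.
responses = iter([
    {"status": {"splitting": "in_progress", "embedding": "idle"}},
    {"status": {"splitting": "completed", "embedding": "in_progress"}},
    {"status": {"splitting": "completed", "embedding": "completed"}},
])
final = wait_until_unlocked(lambda: next(responses), sleep=lambda seconds: None)
```

The loop treats idle and failed as terminal, so callers should still inspect the returned statuses. A stricter check for completed may be preferable just after an upload, when a job might not have started yet.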
Error Responses
Unauthorized (401)
{
  "detail": "Invalid API key"
}
Forbidden (403)
{
  "detail": "Access denied"
}
Not Found (404)
{
  "detail": "Voice not found"
}
Conflict (409)
{
  "detail": "Voice is currently being processed"
}
Rate Limited (429)
{
  "detail": "Rate limit exceeded"
}
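Since every error body shares the same "detail" field, a client can turn them into one exception type. The sketch below is one possible shape; `ApiError` and `parse_error` are hypothetical names, and treating 409 and 429 as worth retrying is an assumption about sensible client behavior, not something the API specifies.

```python
import json

RETRYABLE_STATUSES = {409, 429}  # assumption: processing lock and rate limit may clear


class ApiError(Exception):
    """Error built from an HTTP status code and the response's 'detail' message."""

    def __init__(self, status: int, detail: str) -> None:
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail
        self.retryable = status in RETRYABLE_STATUSES


def parse_error(status: int, body: bytes) -> ApiError:
    """Extract 'detail' from a JSON error body, falling back to the raw text."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object.
        detail = body.decode("utf-8", "replace")
    return ApiError(status, detail)


conflict = parse_error(409, b'{"detail": "Voice is currently being processed"}')
```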