REST API

All endpoints are under the /api prefix. Voice processing (splitting and embedding) runs asynchronously: upload a recording, then poll the voice status until processing finishes.

Authentication

Supply your API key in the X-Api-Key header or as an api_key query parameter.

curl -X GET https://voice.hidoba.com/api/voices/my-partner \
  -H "X-Api-Key: YOUR_API_KEY"
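The two ways of passing the key can be sketched in Python with only the standard library. The helper name `build_request` and its `key_in_query` flag are illustrative and not part of the API; the host is taken from the curl example above, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://voice.hidoba.com/api"  # host taken from the curl example


def build_request(path: str, api_key: str, key_in_query: bool = False) -> Request:
    """Build (but do not send) a GET request authenticated with the API key."""
    url = f"{BASE_URL}{path}"
    if key_in_query:
        # Alternative: pass the key as the api_key query parameter.
        return Request(f"{url}?{urlencode({'api_key': api_key})}", method="GET")
    # Default: pass the key in the X-Api-Key header.
    return Request(url, headers={"X-Api-Key": api_key}, method="GET")


header_req = build_request("/voices/my-partner", "YOUR_API_KEY")
query_req = build_request("/voices/my-partner", "YOUR_API_KEY", key_in_query=True)
```

To actually send either request, pass it to `urllib.request.urlopen`. The header form is usually preferable because query strings tend to end up in access logs.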

Voices

List Voices

GET /api/voices/{partner}

Response

[
  {
    "name": "speaker-1",
    "has_original": true,
    "sample_count": 12,
    "splitting_status": "completed",
    "embedding_status": "completed",
    "creator_name": "John",
    "creator_email": "john@example.com"
  }
]
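As a rough illustration of consuming this list, the sketch below keeps only the voices whose splitting and embedding have both completed. The function name `ready_voices` and the second voice in the sample payload are made up for the example; the field names and status values come from this page.

```python
import json

# First entry mirrors the response above; the second is a hypothetical
# voice that is still being processed.
payload = json.loads("""
[
  {"name": "speaker-1", "has_original": true, "sample_count": 12,
   "splitting_status": "completed", "embedding_status": "completed"},
  {"name": "speaker-2", "has_original": true, "sample_count": 0,
   "splitting_status": "in_progress", "embedding_status": "idle"}
]
""")


def ready_voices(voices: list) -> list:
    """Return the names of voices whose splitting and embedding both completed."""
    return [
        v["name"]
        for v in voices
        if v.get("splitting_status") == "completed"
        and v.get("embedding_status") == "completed"
    ]


print(ready_voices(payload))
```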

Create Voice (with Audio)

POST /api/voices/{partner}
Content-Type: multipart/form-data

Uploading an audio file automatically triggers splitting and embedding.

curl -X POST https://voice.hidoba.com/api/voices/my-partner \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "name=speaker-1" \
  -F "file=@recording.wav" \
  -F "creator_name=John" \
  -F "creator_email=john@example.com"

Form Parameters

Parameter	Type	Default	Description
`name`	String	—	Voice name (1–255 chars). Required.
`file`	File	—	Audio file to upload. Optional.
`creator_name`	String	—	Name of the person who created the voice.
`creator_email`	String	—	Email of the creator.
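A client may want to check the form fields before uploading. The sketch below enforces the documented 1-255 character limit on `name`; the helper `voice_form_fields` is hypothetical, and treating the creator fields as optional is an assumption based on the table not marking them as required.

```python
from typing import Optional


def voice_form_fields(
    name: str,
    creator_name: Optional[str] = None,
    creator_email: Optional[str] = None,
) -> dict:
    """Validate and collect the text fields of the create-voice form."""
    if not 1 <= len(name) <= 255:
        raise ValueError("name must be 1-255 characters long")
    fields = {"name": name}
    # Creator fields are treated as optional here (an assumption).
    if creator_name:
        fields["creator_name"] = creator_name
    if creator_email:
        fields["creator_email"] = creator_email
    return fields


fields = voice_form_fields("speaker-1", "John", "john@example.com")
```

The returned dictionary holds only the text fields; the audio file itself would still be attached as the `file` part of the multipart request.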

Create Empty Voice

POST /api/voices/{partner}/empty

Creates a voice without an audio file. Use this when uploading samples manually.

curl -X POST https://voice.hidoba.com/api/voices/my-partner/empty \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "name=speaker-2"

Get Voice Details

GET /api/voices/{partner}/{voice}

Response

{
  "name": "speaker-1",
  "original_filename": "recording.opus",
  "original_uploaded_name": "recording.wav",
  "original_duration": 185.4,
  "status": {
    "splitting": "completed",
    "embedding": "completed"
  },
  "samples": [
    {
      "index": 1,
      "source_timestamp": "00:00",
      "duration": 15.0
    },
    {
      "index": 2,
      "source_timestamp": "00:15",
      "duration": 15.0
    }
  ],
  "embedding_info": {
    "tts_type": "kyutai",
    "last_update_status": "success",
    "last_update_time": "2025-01-15T10:30:00Z",
    "test_samples": [
      {
        "sample_index": 1,
        "sentence_index": 1,
        "filename": "test_1_1.opus"
      }
    ],
    "sample_scores": [
      { "sample_index": 3, "score": 0.92 },
      { "sample_index": 1, "score": 0.88 }
    ]
  },
  "embedding_type": "kyutai",
  "embedding_zip_url": "https://...",
  "preferred_sample_index": 3,
  "creator_name": "John",
  "creator_email": "john@example.com"
}
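One possible use of this response is to find the best-scoring sample and compare it with `preferred_sample_index`. The sketch below assumes that a higher `score` means a better sample, which this page does not state explicitly; `best_scored_sample` is an illustrative name.

```python
from typing import Optional

# Trimmed copy of the response above; only the fields used here are kept.
details = {
    "name": "speaker-1",
    "embedding_info": {
        "sample_scores": [
            {"sample_index": 3, "score": 0.92},
            {"sample_index": 1, "score": 0.88},
        ]
    },
    "preferred_sample_index": 3,
}


def best_scored_sample(voice: dict) -> Optional[int]:
    """Return the sample_index with the highest score, or None if unscored."""
    scores = (voice.get("embedding_info") or {}).get("sample_scores") or []
    if not scores:
        return None
    # Assumption: a higher score means a better sample.
    return max(scores, key=lambda s: s["score"])["sample_index"]


best = best_scored_sample(details)
matches_preferred = best == details["preferred_sample_index"]
```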

Delete Voice

DELETE /api/voices/{partner}/{voice}

Note: Voices cannot be deleted while splitting or embedding is in progress. You will receive a 409 Conflict response.

Rename Voice

PUT /api/voices/{partner}/{voice}/name
Content-Type: multipart/form-data

Parameter	Type	Description
`name`	String	New voice name. Required.

Samples

Upload Sample

POST /api/voices/{partner}/{voice}/samples
Content-Type: multipart/form-data

Upload a single audio sample to a voice that was created empty. The file is automatically converted to Opus (96 kbps, mono).

curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-2/samples \
  -H "X-Api-Key: YOUR_API_KEY" \
  -F "file=@sample.wav"

Download Sample

GET /api/voices/{partner}/{voice}/samples/{index}

Returns the opus audio file for the sample at the given index.

Delete Sample

DELETE /api/voices/{partner}/{voice}/samples/{index}

Download Sample Bundle

GET /api/voices/{partner}/{voice}/samples-bundle

Returns a ZIP archive containing all samples as {index}.opus files and a voice_info.json metadata file.
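Reading the bundle could look like the sketch below. It assumes the files sit at the root of the archive and that `voice_info.json` is a JSON object; its exact fields are not documented here, so the demo archive uses a placeholder. `read_bundle` is an illustrative name.

```python
import io
import json
import zipfile


def read_bundle(data: bytes) -> tuple:
    """Return (voice_info, sorted sample indices) from a samples-bundle ZIP."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        info = json.loads(zf.read("voice_info.json"))
        indices = sorted(
            int(name[: -len(".opus")])
            for name in zf.namelist()
            # Keep only files named {index}.opus.
            if name.endswith(".opus") and name[: -len(".opus")].isdigit()
        )
    return info, indices


# Build a tiny stand-in archive in memory so the example runs offline.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("1.opus", b"")
    zf.writestr("2.opus", b"")
    zf.writestr("voice_info.json", json.dumps({"name": "speaker-1"}))

info, indices = read_bundle(buf.getvalue())
```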

Splitting

Queue Splitting Job

POST /api/voices/{partner}/{voice}/split
Content-Type: application/x-www-form-urlencoded

Re-splits the original recording with custom parameters and clears any existing samples.

Parameter	Type	Default	Description
`max_samples`	Integer	`25`	Maximum number of samples to extract (1–100).
`sample_duration`	Integer	`15`	Duration of each sample in seconds (5–60).

curl -X POST https://voice.hidoba.com/api/voices/my-partner/speaker-1/split \
  -H "X-Api-Key: YOUR_API_KEY" \
  -d "max_samples=30&sample_duration=10"
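The documented ranges can be checked on the client before the job is queued. The following sketch validates both parameters and produces the URL-encoded body used in the curl example; the function name `split_form_body` is hypothetical.

```python
from urllib.parse import urlencode


def split_form_body(max_samples: int = 25, sample_duration: int = 15) -> str:
    """Validate the split parameters and return the URL-encoded form body."""
    if not 1 <= max_samples <= 100:
        raise ValueError("max_samples must be between 1 and 100")
    if not 5 <= sample_duration <= 60:
        raise ValueError("sample_duration must be between 5 and 60 seconds")
    return urlencode({"max_samples": max_samples, "sample_duration": sample_duration})


body = split_form_body(30, 10)
print(body)  # max_samples=30&sample_duration=10
```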

Embedding

Queue Embedding Job

POST /api/voices/{partner}/{voice}/embed

Submits the voice samples for embedding generation via RunPod.

Parameter	Type	Default	Description
`tts_type`	String	—	TTS type override. If omitted, uses the voice's `embedding_type` (default `kyutai`).
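The fallback order described in the table can be written out as a small function. This only mirrors the documented behavior for clients that want to know which TTS type a job will use; how the server implements it is not shown here, and `resolve_tts_type` is an illustrative name.

```python
from typing import Optional


def resolve_tts_type(override: Optional[str], voice_details: dict) -> str:
    """Pick the TTS type: explicit override, then the voice's embedding_type, then kyutai."""
    return override or voice_details.get("embedding_type") or "kyutai"


chosen = resolve_tts_type(None, {"embedding_type": "kyutai"})
```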

Get Test Samples

GET /api/voices/{partner}/{voice}/test-samples

Returns a list of TTS-generated test audio files created during the embedding process.

Parameter	Type	Default	Description
`tts_type`	String (query)	`kyutai`	Filter test samples by TTS type.

Download Test Sample

GET /api/voices/{partner}/{voice}/test-samples/{filename}

Returns the test sample audio file.

Parameter	Type	Default	Description
`tts_type`	String (query)	`kyutai`	TTS type folder to look in.
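A download URL for a test sample, including the `tts_type` query parameter, might be assembled as follows. The host is taken from the curl examples on this page, and `build_test_sample_url` is a hypothetical helper; path segments are percent-encoded in case names contain spaces or slashes.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://voice.hidoba.com/api"  # host taken from the curl examples


def build_test_sample_url(
    partner: str, voice: str, filename: str, tts_type: str = "kyutai"
) -> str:
    """Return the download URL for one test sample file."""
    # Percent-encode each path segment separately so "/" inside a name is escaped.
    path = "/".join(quote(part, safe="") for part in (partner, voice))
    query = urlencode({"tts_type": tts_type})
    return f"{BASE_URL}/voices/{path}/test-samples/{quote(filename, safe='')}?{query}"


url = build_test_sample_url("my-partner", "speaker-1", "test_1_1.opus")
```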

Delete Test Files

DELETE /api/voices/{partner}/{voice}/test-files/{tts_type}

Set Preferred Sample

PUT /api/voices/{partner}/{voice}/preferred-sample

Sets the preferred sample index based on quality scores or manual selection.

Parameter	Type	Default	Description
`sample_index`	Integer (form)	—	Sample index to set as preferred. Pass empty to clear.
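The "pass empty to clear" behavior can be illustrated with a helper that builds the form body. This sketch assumes a URL-encoded form, which this page does not specify (the endpoint may equally accept multipart form data); `preferred_sample_body` is an illustrative name.

```python
from typing import Optional
from urllib.parse import urlencode


def preferred_sample_body(sample_index: Optional[int]) -> str:
    """Return a form body that sets the preferred sample, or clears it for None."""
    # An empty value clears the preference, as described in the table.
    value = "" if sample_index is None else str(sample_index)
    return urlencode({"sample_index": value})


set_body = preferred_sample_body(3)
clear_body = preferred_sample_body(None)
```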

Voice Status

A voice tracks two independent processing states:

Status	Values	Description
`splitting`	`idle`, `in_progress`, `completed`, `failed`	Audio splitting into samples.
`embedding`	`idle`, `in_progress`, `completed`, `failed`	Voice embedding generation.

When either status is in_progress, the voice is locked: rename and delete operations return 409 Conflict.
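A polling loop built on these two states could look like the sketch below. `fetch_status` is a caller-supplied function (hypothetical) that returns the `status` object from Get Voice Details; the interval and timeout values are arbitrary choices, not API recommendations.

```python
import time
from typing import Callable


def is_locked(status: dict) -> bool:
    """A voice is locked while either stage is in_progress."""
    return "in_progress" in (status.get("splitting"), status.get("embedding"))


def wait_until_unlocked(
    fetch_status: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status until neither stage is in_progress, then return the status."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if not is_locked(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("voice is still being processed")
        sleep(interval)


# Offline demo: canned statuses stand in for real API responses.
canned = iter([
    {"splitting": "in_progress", "embedding": "idle"},
    {"splitting": "completed", "embedding": "in_progress"},
    {"splitting": "completed", "embedding": "completed"},
])
naps = []
final = wait_until_unlocked(lambda: next(canned), interval=1.0, sleep=naps.append)
```

The loop returns as soon as the voice is unlocked, which includes `idle` and `failed` as well as `completed`, so callers should inspect the returned values before using the voice.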

Error Responses

Unauthorized (401)

{
  "detail": "Invalid API key"
}

Forbidden (403)

{
  "detail": "Access denied"
}

Not Found (404)

{
  "detail": "Voice not found"
}

Conflict (409)

{
  "detail": "Voice is currently being processed"
}

Rate Limited (429)

{
  "detail": "Rate limit exceeded"
}
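Client code can turn these bodies into exceptions along the following lines. The `ApiError` class and helper names are illustrative, and treating 409 and 429 as worth retrying is an assumption about typical client behavior rather than something this page prescribes.

```python
import json


class ApiError(Exception):
    """Carries the HTTP status code and the `detail` message of an error response."""

    def __init__(self, status: int, detail: str):
        super().__init__(f"{status}: {detail}")
        self.status = status
        self.detail = detail


# Assumption: a busy voice (409) or a rate limit (429) may succeed later.
RETRYABLE = {409, 429}


def parse_error(status: int, body: bytes) -> ApiError:
    """Build an ApiError from a response body, falling back to the raw text."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object.
        detail = body.decode("utf-8", errors="replace")
    return ApiError(status, detail)


def should_retry(err: ApiError) -> bool:
    return err.status in RETRYABLE


conflict = parse_error(409, b'{"detail": "Voice is currently being processed"}')
```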
