Data Format

Input Format

Input has the following parts:

Message history - includes past (context) and new (to be processed) user messages
Audio files - contents of audio files referenced in message history
Character - AI character to use for responses (string identifier OR custom character object)
Character params - optional; applies only when the character personality is a Jinja template. Note: the parameter name context is reserved for RAG
Generate audio - optional flag indicating whether audio should be generated for the response (default: true)
Num options - optional number of response variants to generate (default: 1, max: 12)

{
    "input": {
        "messages": [...],
        "audio_files": { ... },
        "character": "...",
        "character_params": { ... },
        "generate_audio": true|false,
        "num_options": 1
    }
}
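The structure above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_request is hypothetical, and it assumes that optional fields (audio_files, character_params) may simply be left out when unused.

```python
from typing import Any, Dict, List, Optional, Union


def build_request(
    messages: List[Dict[str, Any]],
    character: Union[str, Dict[str, str]],
    audio_files: Optional[Dict[str, str]] = None,
    character_params: Optional[Dict[str, Any]] = None,
    generate_audio: bool = True,
    num_options: int = 1,
) -> Dict[str, Any]:
    """Wrap the input parts in the top-level "input" object."""
    body: Dict[str, Any] = {
        "messages": messages,
        "character": character,
        "generate_audio": generate_audio,
        "num_options": num_options,
    }
    # Assumption: optional parts are omitted rather than sent empty.
    if audio_files:
        body["audio_files"] = audio_files
    if character_params:
        body["character_params"] = character_params
    return {"input": body}
```

Defaults mirror the documented ones (generate_audio: true, num_options: 1), so a call with only messages and character produces the smallest valid-looking body.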

Multiple Response Options

When num_options is greater than 1, you must set generate_audio: false. The response will include an options array on the generated message containing all response variants.
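This constraint can be checked on the client before sending a request. The sketch below is illustrative only; the function name check_options is hypothetical, and the limit of 12 is taken from the input description above.

```python
MAX_OPTIONS = 12  # documented maximum for num_options


def check_options(generate_audio: bool = True, num_options: int = 1) -> None:
    """Raise ValueError if the option settings contradict the documented rules."""
    if not 1 <= num_options <= MAX_OPTIONS:
        raise ValueError(f"num_options must be between 1 and {MAX_OPTIONS}")
    if num_options > 1 and generate_audio:
        raise ValueError("generate_audio must be false when num_options > 1")
```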

Character Parameter

You can specify a character in two ways:

Option 1: Character Identifier (String)

Use a character identifier string to reference a predefined character:

{
    "input": {
        "character": "github:partner/character",
        "messages": [...]
    }
}

Option 2: Custom Character (Object)

Define a custom character configuration inline:

{
    "input": {
        "character": {
            "description": "{\"name\": \"Assistant\", \"voice\": \"...\"}",
            "personality": "You are a helpful assistant..."
        },
        "messages": [...]
    }
}

When using a custom character:

description must be a JSON string containing the character configuration
personality must be a string containing the system prompt

See the character editor app for valid configuration examples.
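Because description is a JSON string nested inside a JSON body, it is easy to get the escaping wrong by hand. A minimal sketch that lets json.dumps do the escaping is shown here; the helper name custom_character is hypothetical, and the configuration keys (name, voice) are only the ones that appear in the example above, not a complete schema.

```python
import json
from typing import Any, Dict


def custom_character(config: Dict[str, Any], personality: str) -> Dict[str, str]:
    """Build a custom character object with description serialized as a JSON string."""
    return {
        "description": json.dumps(config),  # JSON string, not a nested object
        "personality": personality,         # plain system prompt string
    }


character = custom_character(
    {"name": "Assistant", "voice": "..."},  # voice value left elided, as in the example
    "You are a helpful assistant...",
)
```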

Legacy Support: The character_name field is still supported for backwards compatibility but is deprecated in favor of character.

Message History

Messages must be ordered chronologically: past (context) messages first, followed by one or more new (to be processed) user messages. The is_processed flag distinguishes the two: true for past messages, false for new ones.

Past messages contain only text. New messages can be either text or audio, and each must have a unique id.

Past Message Format

{
    "id": 1,
    "originator": "bot|user",
    "text": "",
    "is_processed": true
}

New Message Format

{
    "id": 5,
    "type": "text|audio",
    "text": "",
    "audio_file_id": "",
    "is_processed": false
}
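The two message formats and the ordering rules can be sketched as follows. All function names are hypothetical, and the sketch assumes that the unused field of a new message (text for audio, audio_file_id for text) is sent as an empty string, as in the template above.

```python
from typing import Any, Dict, List


def past_message(msg_id: int, originator: str, text: str) -> Dict[str, Any]:
    """Context message: text only, already processed."""
    if originator not in ("bot", "user"):
        raise ValueError("originator must be 'bot' or 'user'")
    return {"id": msg_id, "originator": originator, "text": text, "is_processed": True}


def new_text_message(msg_id: int, text: str) -> Dict[str, Any]:
    return {"id": msg_id, "type": "text", "text": text,
            "audio_file_id": "", "is_processed": False}


def new_audio_message(msg_id: int, audio_file_id: str) -> Dict[str, Any]:
    return {"id": msg_id, "type": "audio", "text": "",
            "audio_file_id": audio_file_id, "is_processed": False}


def check_history(messages: List[Dict[str, Any]]) -> None:
    """Check: past messages first, at least one new message, unique new ids."""
    flags = [m["is_processed"] for m in messages]
    first_new = flags.index(False) if False in flags else len(flags)
    if first_new == len(flags):
        raise ValueError("at least one new message is required")
    if any(flags[first_new:]):
        raise ValueError("past messages must come before new messages")
    new_ids = [m["id"] for m in messages[first_new:]]
    if len(set(new_ids)) != len(new_ids):
        raise ValueError("new messages must have unique ids")


history = [
    past_message(1, "user", "Hello"),
    past_message(2, "bot", "Hi, how can I help?"),
    new_audio_message(5, "msg1"),
]
check_history(history)
```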

Audio Files

Audio files are included in both input and output as base64-encoded strings, keyed by the audio_file_id referenced in the messages. The only supported format is OGG, and the response audio is also OGG (Telegram compatible). The maximum size of the whole payload is 5MB.

{
    "msg1": "",
    "msg2": ""
}
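Encoding audio and checking the payload size before sending might look like the sketch below. The helper names are hypothetical, and two points are assumptions rather than documented behavior: that 5MB means 5 * 1024 * 1024 bytes, and that the limit applies to the serialized JSON body.

```python
import base64
import json
from typing import Any, Dict

MAX_PAYLOAD_BYTES = 5 * 1024 * 1024  # assumption: "5MB" read as 5 MiB


def encode_audio(ogg_bytes: bytes) -> str:
    """Base64-encode raw OGG bytes for the audio_files map."""
    return base64.b64encode(ogg_bytes).decode("ascii")


def payload_size(payload: Dict[str, Any]) -> int:
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def check_payload(payload: Dict[str, Any]) -> None:
    size = payload_size(payload)
    if size > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {size} bytes, limit is {MAX_PAYLOAD_BYTES}")
```

Base64 inflates data by roughly a third, so the raw audio budget is noticeably smaller than the payload limit.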

Output Format

Output includes all relevant messages:

New user messages with transcripts
Generated message with its transcript (audio when audio is generated, text otherwise)

{
    "messages": [...],
    "audio_files": {
        "response": ""
    }
}

New User Message Format

{
    "id": "",
    "type": "",
    "text": ""
}

Generated Message Format

{
    "type": "audio",
    "text": "",
    "audio_file_id": "response"
}
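To recover the generated audio, the audio_file_id on the generated message is looked up in audio_files and base64-decoded. A minimal sketch, with a hypothetical helper name and the assumption that only the generated message carries an audio_file_id in the output:

```python
import base64
from typing import Any, Dict, Optional


def response_audio(output: Dict[str, Any]) -> Optional[bytes]:
    """Return the decoded OGG bytes of the generated message, or None if absent."""
    files = output.get("audio_files", {})
    for message in output.get("messages", []):
        file_id = message.get("audio_file_id")
        if message.get("type") == "audio" and file_id in files:
            return base64.b64decode(files[file_id])
    return None
```

Returning None rather than raising keeps the helper usable for requests sent with generate_audio set to false.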

Generated Message with Options

The generated message always includes an options array containing response variants:

{
    "type": "text",
    "text": "First response option",
    "options": [
        {"type": "text", "text": "First response option"},
        {"type": "text", "text": "Second response option"},
        {"type": "text", "text": "Third response option"}
    ]
}

The options array is always present on the generated message. When num_options is 1 (or not specified), the array contains a single element.
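Since the options array is present only on the generated message, it can serve to locate that message in the output. The sketch below relies on that reading of the text above; the function names are hypothetical.

```python
from typing import Any, Dict, List


def generated_message(output: Dict[str, Any]) -> Dict[str, Any]:
    """Find the generated message by the presence of its options array."""
    for message in output.get("messages", []):
        if "options" in message:
            return message
    raise ValueError("no generated message with an options array found")


def option_texts(output: Dict[str, Any]) -> List[str]:
    """Collect the text of every response variant, in order."""
    return [option["text"] for option in generated_message(output)["options"]]
```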
