Documentation

1. Generate Speech

POST /generate

The primary endpoint for converting text to speech.

Request Body

json

{
  "mode": "sft",
  "tts_text": "The text to be converted to speech",
  "sft_spk": "Chinese Female",
  "prompt_text": "Prompt Text",
  "prompt_wav_url": "Prompt Audio URL",
  "seed": 42,
  "speed": 1.0
}

Form Parameters

mode (Optional): Inference mode. One of "sft", "zero_shot", or "cross_lingual"; defaults to "sft".
tts_text (Required): The text to synthesize.
sft_spk (Optional): Pre-trained voice. One of "Chinese Female", "Chinese Male", "English Female", "English Male", "Japanese Male", "Cantonese Female", or "Korean Female".
prompt_text (Optional): Prompt text.
prompt_wav_url (Optional): URL of the prompt audio.
seed (Optional): Random seed.
speed (Optional): Speech speed adjustment (supported only for non-streaming inference). Range: 0.5-2.0; default: 1.0.
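
As a rough sketch of how a client might assemble and send this request, the Python below checks the documented values and ranges before posting. The base URL http://localhost:8000 and the helper names build_generate_payload and submit_generate are assumptions made for illustration and are not part of the documented API. The sketch also assumes the server accepts the JSON body shown under Request Body.

```python
import json
import urllib.request

# Allowed values copied from the parameter list above.
MODES = {"sft", "zero_shot", "cross_lingual"}
SFT_SPEAKERS = {
    "Chinese Female", "Chinese Male", "English Female", "English Male",
    "Japanese Male", "Cantonese Female", "Korean Female",
}


def build_generate_payload(tts_text, mode="sft", sft_spk=None,
                           prompt_text=None, prompt_wav_url=None,
                           seed=None, speed=1.0):
    """Build the /generate request body, validating the documented constraints."""
    if not tts_text:
        raise ValueError("tts_text is required")
    if mode not in MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if sft_spk is not None and sft_spk not in SFT_SPEAKERS:
        raise ValueError(f"unsupported sft_spk: {sft_spk!r}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within 0.5-2.0")

    payload = {"mode": mode, "tts_text": tts_text, "speed": speed}
    optional = {
        "sft_spk": sft_spk,
        "prompt_text": prompt_text,
        "prompt_wav_url": prompt_wav_url,
        "seed": seed,
    }
    # Leave out optional fields the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def submit_generate(payload, base_url="http://localhost:8000"):
    """POST the payload to /generate and return the decoded JSON response.

    base_url is a placeholder; replace it with the real server address.
    """
    request = urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling submit_generate(build_generate_payload("Hello", sft_spk="English Female")) would then return the response object described below, from which the task ID can be read.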

Response (200)

json

{
  "id": "Task ID",
  "status": "Task Status",
  "message": "Status Information"
}

2. Get Task Status

GET /status/{id}

Query the status of a specific task.

Path Parameter

id (Required): Task ID

Response (200)

json

{
  "id": "Task ID",
  "status": "Task Status",
  "message": "Status Information"
}
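
Since generation appears to run as a task that is queried separately, a client will probably poll this endpoint until the task finishes. The sketch below shows one way to do that in Python. The documentation does not list the possible status values, so the terminal statuses "completed" and "failed" are assumptions, as are the base URL and the helper names fetch_status and wait_for_task.

```python
import json
import time
import urllib.parse
import urllib.request


def fetch_status(task_id, base_url="http://localhost:8000"):
    """GET /status/{id} and return the decoded JSON response.

    base_url is a placeholder; replace it with the real server address.
    """
    quoted = urllib.parse.quote(str(task_id), safe="")
    url = f"{base_url.rstrip('/')}/status/{quoted}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_status,
                  terminal=("completed", "failed"),
                  interval=1.0, max_attempts=60):
    """Poll until the task reports a terminal status, then return its info.

    The values in `terminal` are assumed, not documented; adjust them to
    whatever the server actually returns in the "status" field.
    """
    for attempt in range(max_attempts):
        info = fetch(task_id)
        if info.get("status") in terminal:
            return info
        if attempt < max_attempts - 1:
            time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")
```

Passing the fetch function as a parameter keeps the polling loop independent of the HTTP call, which makes it easy to exercise without a running server.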

3. Download Audio

GET /download/{id}

Download the generated audio.

Path Parameter

id (Required): Task ID

Response (200)

json

{
  // Audio data
}
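
A minimal Python sketch of saving the downloaded audio to disk follows. It assumes the response body is the raw audio bytes; the audio format is not stated in this documentation, so the file extension is left to the caller. The base URL and the helper names download_url and download_audio are hypothetical.

```python
import shutil
import urllib.parse
import urllib.request


def download_url(task_id, base_url="http://localhost:8000"):
    """Build the /download/{id} URL, escaping the task ID for use in a path.

    base_url is a placeholder; replace it with the real server address.
    """
    quoted = urllib.parse.quote(str(task_id), safe="")
    return f"{base_url.rstrip('/')}/download/{quoted}"


def download_audio(task_id, dest_path, base_url="http://localhost:8000"):
    """Stream the response body to dest_path and return the path.

    Assumes the endpoint returns raw audio bytes rather than a JSON wrapper.
    """
    url = download_url(task_id, base_url)
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path
```

Streaming with shutil.copyfileobj avoids holding the whole audio file in memory, which matters for longer synthesized passages.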
