1. Generate Speech

POST /generate

The primary endpoint for converting text to speech.

Request Body

json
{
  "mode": "sft",
  "tts_text": "The text to be converted to speech",
  "sft_spk": "Chinese Female",
  "prompt_text": "Prompt Text",
  "prompt_wav_url": "Prompt Audio URL",
  "seed": 42,
  "speed": 1.0
}

Form Parameters

  • mode (Optional): Inference mode. One of "sft", "zero_shot", or "cross_lingual". Default: "sft".

  • tts_text (Required): The text to be synthesized.

  • sft_spk (Optional): Pre-trained voice. One of "Chinese Female", "Chinese Male", "English Female", "English Male", "Japanese Male", "Cantonese Female", or "Korean Female".

  • prompt_text (Optional): Prompt text.

  • prompt_wav_url (Optional): URL of the prompt audio.

  • seed (Optional): Random seed.

  • speed (Optional): Speech speed adjustment (supported only for non-streaming inference). Range: 0.5 to 2.0. Default: 1.0.
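The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service itself: the helper name build_generate_payload is hypothetical, and it assumes the server accepts a body in which unset optional fields are simply omitted.

```python
from typing import Any, Dict, Optional

# Allowed values, copied from the parameter list above.
MODES = {"sft", "zero_shot", "cross_lingual"}
SFT_SPEAKERS = {
    "Chinese Female", "Chinese Male", "English Female", "English Male",
    "Japanese Male", "Cantonese Female", "Korean Female",
}


def build_generate_payload(
    tts_text: str,
    mode: str = "sft",
    sft_spk: Optional[str] = None,
    prompt_text: Optional[str] = None,
    prompt_wav_url: Optional[str] = None,
    seed: Optional[int] = None,
    speed: float = 1.0,
) -> Dict[str, Any]:
    """Validate the documented fields and return a request body as a dict."""
    if not tts_text:
        raise ValueError("tts_text is required")
    if mode not in MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    if sft_spk is not None and sft_spk not in SFT_SPEAKERS:
        raise ValueError(f"unknown sft_spk: {sft_spk!r}")
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")

    payload: Dict[str, Any] = {"mode": mode, "tts_text": tts_text, "speed": speed}
    optional = {
        "sft_spk": sft_spk,
        "prompt_text": prompt_text,
        "prompt_wav_url": prompt_wav_url,
        "seed": seed,
    }
    # Assumption: optional fields may be left out rather than sent as null.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, build_generate_payload("Hello", sft_spk="English Female", seed=42) returns a dict with mode "sft", speed 1.0, and only the two optional fields that were set.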

Response (200)

json
{
  "id": "Task ID",
  "status": "Task Status",
  "message": "Status Information"
}
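Submitting a task and reading the returned id might look like the sketch below. Several details are assumptions rather than documented behavior: BASE_URL is a placeholder (no host is given here), the body is sent as JSON with a Content-Type of application/json, and the helper names post_json and submit_task are hypothetical.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

BASE_URL = "http://localhost:8000"  # placeholder; replace with the real host


def post_json(url: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """POST a JSON body and decode the JSON reply (assumed wire format)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def submit_task(
    body: Dict[str, Any],
    post: Callable[[str, Dict[str, Any]], Dict[str, Any]] = post_json,
    base_url: str = BASE_URL,
) -> str:
    """Send the body to POST /generate and return the task ID from the reply."""
    reply = post(f"{base_url}/generate", body)
    if "id" not in reply:
        raise RuntimeError(f"unexpected reply: {reply!r}")
    return reply["id"]
```

The transport is passed in as a parameter so that the function can be exercised without a running server, or swapped for a form-encoded request if the service expects one.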

2. Get Task Status

GET /status/{id}

Query the status of a specific task.

Path Parameter

  • id (Required): Task ID returned by POST /generate

Response (200)

json
{
  "id": "Task ID",
  "status": "Task Status",
  "message": "Status Information"
}
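Because generation is reported through a task status, a client will typically poll this endpoint until the task finishes. The sketch below shows one way to do that. The concrete status strings are not listed here, so the caller supplies the sets that count as finished and failed; BASE_URL and the helper names are likewise hypothetical.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict, Set
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # placeholder; replace with the real host


def get_status(task_id: str, base_url: str = BASE_URL) -> Dict[str, Any]:
    """Call GET /status/{id} and decode the JSON reply."""
    url = f"{base_url}/status/{quote(task_id, safe='')}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_task(
    task_id: str,
    done: Set[str],
    failed: Set[str],
    fetch: Callable[[str], Dict[str, Any]] = get_status,
    interval: float = 1.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the status is in `done`; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        info = fetch(task_id)
        status = info.get("status")
        if status in done:
            return info
        if status in failed:
            raise RuntimeError(f"task {task_id} failed: {info.get('message')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

A fixed polling interval keeps the sketch short; a backoff schedule may be preferable for long-running tasks.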

3. Download Audio

GET /download/{id}

Download the generated audio.

Path Parameter

  • id (Required): Task ID returned by POST /generate

Response (200)

json
{
  // Audio data
}
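Saving the result to disk could be sketched as follows. This assumes the response body is the raw audio bytes, which the response shown above suggests but does not state outright. The audio container format is not specified either, so the output file name is left to the caller. BASE_URL and the helper names are hypothetical.

```python
import urllib.request
from pathlib import Path
from typing import Callable, Union
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # placeholder; replace with the real host


def fetch_bytes(url: str) -> bytes:
    """GET a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.read()


def download_audio(
    task_id: str,
    out_path: Union[str, Path],
    fetch: Callable[[str], bytes] = fetch_bytes,
    base_url: str = BASE_URL,
) -> Path:
    """Call GET /download/{id} and write the body to out_path."""
    data = fetch(f"{base_url}/download/{quote(task_id, safe='')}")
    if not data:
        raise RuntimeError(f"empty response for task {task_id}")
    path = Path(out_path)
    path.write_bytes(data)  # assumption: the body is the audio file itself
    return path
```

The download should only be attempted once the status endpoint reports that the task has finished.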