[fast_inference] Roll back strategies, reduce the impact of padding, expose options, sync code (#986)
* Update README
* Optimize-English-G2P
* docs: change akward expression
* docs: update Changelog_KO.md
* Fix CN punc in EN,add 's match
* Adjust normalize and g2p logic
* Update zh_CN.json
* Update README (#827) Update README.md Update some outdated file paths and commands
* Fix English homographs, adjust dictionary hot reload, add name matching (#869)
* Fix homograph dict
* Add JSON in dict
* Adjust hot dict to hot reload
* Add English name dict
* Adjust get name dict logic
* Make API Great Again (#894)
* Add zh/jp/en mix
* Optimize code readability and formatted output.
* Try OGG streaming
* Add stream mode arg
* Add media type arg
* Add cut punc arg
* Eliminate punc risk
* Update README (#895)
* Update README
* Update README
* update README
* update README
* fix typo s/Licence /License (#904)
* fix reformat cmd (#917) Co-authored-by: starylan <starylan@outlook.com>
* Update README.md
* Normalize chinese arithmetic operations (#947)
* Change the mask strategy used during training and inference to fix the repeated output that occurs when batch_size>1
* Sync code from the main branch and add a "keep random" option
* Fix the missing uvr5 model when running colab_webui.ipynb in Colab (#968) Downloading the uvr5 model with git in Colab fails with: fatal: destination path 'uvr5_weights' already exists and is not an empty directory. Deleting the uvr5_weights folder that was originally downloaded from this repository before the download resolves the problem.
* [ASR] Fix FasterWhisper failing to traverse the input path (#956)
* remove glob
* rename
* reset mirror pos
* Roll back the mask strategy; roll back the pad strategy; add padding_mask to T2SBlock to reduce the impact of padding; expose the repetition_penalty parameter so users can tune the strength of the repetition penalty themselves; add the parallel_infer parameter to enable or disable parallel inference (when disabled, behavior matches the 0307 version); add a "keep random" option to the webui; sync code from the main branch.
* Remove useless comments
---------
Co-authored-by: Lion <drain.daters.0p@icloud.com>
Co-authored-by: RVC-Boss <129054828+RVC-Boss@users.noreply.github.com>
Co-authored-by: KamioRinn <snowsdream@live.com>
Co-authored-by: Pengoose <pengoose_dev@naver.com>
Co-authored-by: Yuan-Man <68322456+Yuan-ManX@users.noreply.github.com>
Co-authored-by: XXXXRT666 <157766680+XXXXRT666@users.noreply.github.com>
Co-authored-by: KamioRinn <63162909+KamioRinn@users.noreply.github.com>
Co-authored-by: Lion-Wu <130235128+Lion-Wu@users.noreply.github.com>
Co-authored-by: digger yu <digger-yu@outlook.com>
Co-authored-by: SapphireLab <36986837+SapphireLab@users.noreply.github.com>
Co-authored-by: starylan <starylan@outlook.com>
Co-authored-by: shadow01a <141255649+shadow01a@users.noreply.github.com>
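The commit message mentions exposing `repetition_penalty` so that users can tune how strongly repeated tokens are discouraged. The T2S model code is not part of this diff, so the following is only a minimal sketch of the commonly used formulation of a repetition penalty on logits. The function name `apply_repetition_penalty` and the plain-list representation are illustrative assumptions, not code from the repository.

```python
def apply_repetition_penalty(logits, previous_tokens, penalty=1.35):
    """Return a copy of `logits` in which already-generated tokens score lower.

    Hypothetical helper for illustration; 1.35 mirrors the default value
    that this commit exposes through the API.
    """
    penalized = list(logits)
    for token in set(previous_tokens):
        score = penalized[token]
        # With penalty > 1, multiplying a negative score or dividing a
        # positive one both push the score down; penalty == 1.0 is a no-op.
        penalized[token] = score * penalty if score < 0 else score / penalty
    return penalized


# Tokens 0 and 1 were already generated; token 2 was not.
scores = apply_repetition_penalty([2.7, -1.0, 0.5], previous_tokens=[0, 1])
print(scores)
```

Under this formulation a value of 1.0 disables the penalty, and larger values suppress repetition more aggressively at the risk of distorting the output.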
14
api_v2.py
@@ -22,7 +22,7 @@ POST:
 ```json
 {
 "text": "", # str.(required) text to be synthesized
-"text_lang": "", # str.(required) language of the text to be synthesized
+"text_lang": "", # str.(required) language of the text to be synthesized
 "ref_audio_path": "", # str.(required) reference audio path.
 "prompt_text": "", # str.(optional) prompt text for the reference audio
 "prompt_lang": "", # str.(required) language of the prompt text for the reference audio
@@ -32,12 +32,14 @@ POST:
 "text_split_method": "cut5", # str.(optional) text split method, see text_segmentation_method.py for details.
 "batch_size": 1, # int.(optional) batch size for inference
 "batch_threshold": 0.75, # float.(optional) threshold for batch splitting.
-"split_bucket": true, # bool.(optional) whether to split the batch into multiple buckets.
+"split_bucket": true, # bool.(optional) whether to split the batch into multiple buckets.
 "speed_factor":1.0, # float.(optional) control the speed of the synthesized audio.
 "fragment_interval":0.3, # float.(optional) to control the interval of the audio fragment.
 "seed": -1, # int.(optional) random seed for reproducibility.
 "media_type": "wav", # str.(optional) media type of the output audio, support "wav", "raw", "ogg", "aac".
 "streaming_mode": false, # bool.(optional) whether to return a streaming response.
+"parallel_infer": True, # bool.(optional) whether to use parallel inference.
+"repetition_penalty": 1.35 # float.(optional) repetition penalty for T2S model.
 }
 ```

@@ -159,6 +161,8 @@ class TTS_Request(BaseModel):
 seed:int = -1
 media_type:str = "wav"
 streaming_mode:bool = False
+parallel_infer:bool = True
+repetition_penalty:float = 1.35

 ### modify from https://github.com/RVC-Boss/GPT-SoVITS/pull/894/files
 def pack_ogg(io_buffer:BytesIO, data:np.ndarray, rate:int):
@@ -287,6 +291,8 @@ async def tts_handle(req:dict):
 "seed": -1, # int. random seed for reproducibility.
 "media_type": "wav", # str. media type of the output audio, support "wav", "raw", "ogg", "aac".
 "streaming_mode": False, # bool. whether to return a streaming response.
+"parallel_infer": True, # bool.(optional) whether to use parallel inference.
+"repetition_penalty": 1.35 # float.(optional) repetition penalty for T2S model.
 }
 returns:
 StreamingResponse: audio stream response.
@@ -354,6 +360,8 @@ async def tts_get_endpoint(
 seed:int = -1,
 media_type:str = "wav",
 streaming_mode:bool = False,
+parallel_infer:bool = True,
+repetition_penalty:float = 1.35
 ):
 req = {
 "text": text,
@@ -373,6 +381,8 @@ async def tts_get_endpoint(
 "seed":seed,
 "media_type":media_type,
 "streaming_mode":streaming_mode,
+"parallel_infer":parallel_infer,
+"repetition_penalty":float(repetition_penalty)
 }
 return await tts_handle(req)
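
For callers, the practical effect of this diff is two new optional request fields. The sketch below shows one way a client might set them, using only the field names and defaults visible in the diff. The server address and `/tts` path in `API_URL`, and the helpers `build_tts_payload` and `synthesize`, are assumptions for illustration and are not taken from this commit.

```python
import json
from urllib import request

# Assumed address of a locally running api_v2.py server; adjust to your setup.
API_URL = "http://127.0.0.1:9880/tts"


def build_tts_payload(text, text_lang, ref_audio_path, prompt_lang,
                      parallel_infer=True, repetition_penalty=1.35):
    """Build a request body that sets the two fields added by this commit."""
    return {
        "text": text,
        "text_lang": text_lang,
        "ref_audio_path": ref_audio_path,
        "prompt_lang": prompt_lang,
        "parallel_infer": bool(parallel_infer),
        "repetition_penalty": float(repetition_penalty),
    }


def synthesize(payload, url=API_URL):
    """POST the payload as JSON and return the raw response bytes (not called here)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.read()


# Disable parallel inference and soften the repetition penalty.
payload = build_tts_payload("Hello there.", "en", "ref.wav", "en",
                            parallel_infer=False, repetition_penalty=1.2)
print(json.dumps(payload, indent=2))
```

Fields left out of the payload fall back to the defaults declared in `TTS_Request`, so existing clients that send neither field keep `parallel_infer` enabled and a penalty of 1.35.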