I fine-tuned a BERT VITS 2 text-to-speech model and used its export-to-ONNX code to convert it to ONNX. I want to run the model with Barracuda in Unity, which only accepts a single ONNX file. However, the export code in BERT VITS 2 produces six separate ONNX models: Speaker_dec.onnx, Speaker_dp.onnx, Speaker_emb.onnx, Speaker_enc_p.onnx, Speaker_flow.onnx, Speaker_sdp.onnx.
I tried merging these models with the compose module from the ONNX Python API, using the code below.
import onnx
from onnx import compose

model_files = [
    ""
]

# Load the first model
merged_model = onnx.load(model_files[0])

# Start merging from the second model
for i in range(1, len(model_files)):
    current_model = onnx.load(model_files[i])
    # Add a prefix to the current model to avoid name clashes
    current_model_with_prefix = compose.add_prefix(current_model, prefix=f"m{i}_")
    # Now, create an io_map that connects the last output of the merged_model to the first input of current_model_with_prefix
    io_map = [(output.name, input.name) for output, input in zip(merged_model.graph.output, current_model_with_prefix.graph.input)]
    # Merge the models
    merged_model = compose.merge_models(merged_model, current_model_with_prefix, io_map=io_map)

# Save the merged model to a new ONNX file
onnx.save(merged_model, "combined_model.onnx")
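One way to see what the positional zip in the merge loop actually connects is to compare the resulting io_map against the real inputs of the model being appended. The sketch below is only an illustration under stated assumptions: the helper names (graph_input_names, graph_output_names, unconnected_inputs) are hypothetical, and the stand-in graphs with names such as y, m1_x and m1_cond are made up so the snippet runs without the exported files. It also assumes, as far as I understand onnx.compose.merge_models, that any input of the second model that is absent from io_map stays an input of the merged graph.

```python
from types import SimpleNamespace


def graph_input_names(model):
    """Names of the real graph inputs, skipping initializers (weights)."""
    initializers = {init.name for init in model.graph.initializer}
    return [inp.name for inp in model.graph.input if inp.name not in initializers]


def graph_output_names(model):
    """Names of the graph outputs."""
    return [out.name for out in model.graph.output]


def unconnected_inputs(consumer, io_map):
    """Inputs of `consumer` that no (output, input) pair in io_map feeds."""
    fed = {dst for _, dst in io_map}
    return [name for name in graph_input_names(consumer) if name not in fed]


def fake_model(inputs, outputs, initializers=()):
    """Stand-in with the same attribute layout as an onnx ModelProto (hypothetical)."""
    named = lambda names: [SimpleNamespace(name=n) for n in names]
    return SimpleNamespace(graph=SimpleNamespace(
        input=named(inputs), output=named(outputs), initializer=named(initializers)))


# Made-up example: one producer output, but the consumer expects two inputs.
producer = fake_model(["x"], ["y"])
consumer = fake_model(["m1_x", "m1_cond", "m1_weight"], ["m1_out"],
                      initializers=["m1_weight"])

# Same positional pairing as in the merge loop: zip stops at the shorter list.
io_map = list(zip(graph_output_names(producer), graph_input_names(consumer)))
leftover = unconnected_inputs(consumer, io_map)
print(io_map)    # pairs that would be wired together
print(leftover)  # inputs that would still have to be fed from outside
```

The helpers only read .graph.input, .graph.output and .graph.initializer, so they should work unchanged on models returned by onnx.load; printing the leftover list for each of the six files before merging shows which inputs need an explicit io_map entry.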
When I try to run inference with the merged model, however, I get the following error:
Input Name: z_in, Output Name: o
Traceback (most recent call last):
  File "c:\EasyBertVits2\Playing around\testing.py", line 19, in <module>
    outputs = session.run([output_name], {input_name: input_data})
  File "c:\Users\User\AppData\Local\Programs\Python\Python310\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 196, in run
    raise ValueError("Model requires {} inputs. Input Feed contains {}".format(num_required_inputs, num_inputs))
ValueError: Model requires 2 inputs. Input Feed contains 1
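The error says the merged graph still declares two inputs while the feed dictionary supplies one. A minimal sketch for listing which declared inputs are missing from a feed follows. It is a sketch under assumptions: missing_feed_names is a hypothetical helper, the stand-in session replaces a real onnxruntime.InferenceSession so the snippet runs without the model file, and other_input is a placeholder name, since the traceback only reveals z_in.

```python
from types import SimpleNamespace


def missing_feed_names(session, feed):
    """Input names the session declares that `feed` does not supply."""
    return [arg.name for arg in session.get_inputs() if arg.name not in feed]


# Stand-in for onnxruntime.InferenceSession("combined_model.onnx").
# "z_in" comes from the traceback; "other_input" is a made-up placeholder.
fake_session = SimpleNamespace(
    get_inputs=lambda: [SimpleNamespace(name="z_in"),
                        SimpleNamespace(name="other_input")])

feed = {"z_in": None}  # only one of the two declared inputs is provided
missing = missing_feed_names(fake_session, feed)
print(missing)
```

With a real session, each entry returned by get_inputs() also carries shape and type attributes, which helps identify which of the six exported models the leftover input originally belonged to.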
How can I merge the models correctly so that I can use it as one ONNX model?