Feed additional inputs to the model

How can I change the code so I can feed additional inputs to the model? For example, how can I feed the gender of the speaker (both for training and inference) to the model?

That’s a hard question to answer because the answer is not very helpful: you’ll have to change pretty much the entire code base. That includes the model definition (to use the new information), the feeding code (to provide that information to the training loop), the loss function/optimization code if needed, the native clients (to accept and feed that information), etc.
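To give a rough idea of what the feeding-side change could look like, here is a minimal sketch that appends a one-hot speaker-gender vector to every audio feature frame, so the same code path can be used for training and inference. Everything in it is an assumption made for illustration: the names `GENDERS`, `one_hot_gender` and `append_speaker_feature` are hypothetical and do not exist in the project, and the real feeding code works on tensors rather than Python lists. Note that the model's input layer would also have to grow by the size of the extra vector.

```python
from typing import List, Sequence

# Hypothetical label set, chosen for illustration only.
GENDERS = ("female", "male", "unknown")


def one_hot_gender(gender: str) -> List[float]:
    """Encode a gender label as a one-hot vector over GENDERS.

    Labels outside GENDERS fall back to "unknown".
    """
    label = gender if gender in GENDERS else "unknown"
    return [1.0 if g == label else 0.0 for g in GENDERS]


def append_speaker_feature(
    frames: Sequence[Sequence[float]], gender: str
) -> List[List[float]]:
    """Return a copy of `frames` with the one-hot gender appended to each frame."""
    extra = one_hot_gender(gender)
    return [list(frame) + extra for frame in frames]


if __name__ == "__main__":
    # Two dummy frames with 4 features each (stand-ins for real audio features).
    frames = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
    augmented = append_speaker_feature(frames, "female")
    print(len(augmented[0]))  # 7: the 4 original features plus 3 gender slots
```

Concatenating the extra vector to each frame is only one way to condition the model. Feeding it as a separate input to a later layer is another option, and which one works better would have to be tested.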

I imagined I would have to change the model architecture and the feeding code (the loss would remain the same as it is now). I think I would be able to make those modifications.
But now that you’ve mentioned the native clients, I am afraid that this is going to be much more difficult >.<

If it’s not asking too much, could you please point me to the main functions I should be looking to modify?

You only have to modify the native client code if you want to use the clients, obviously. You could always implement inference in Python directly (see the do_single_file_inference function and the --one_shot_infer command line flag).