Standard for a new dataset

Hi, I'm wondering what the standard is for a new dataset: how long should it be, and how many sentences or words does it need?

I want to make a new voice, a kid's voice, still in English. Is it possible to do that by fine-tuning the best model with a small dataset?

Thank you and thanks for the hard work.

This might help a bit.