I am collecting data on Indian English. Since I do not have 1,000 hours of Indian English data, I cannot train DeepSpeech from scratch. I understand that I need to do transfer learning with my 20 hours of Indian English financial-domain data.
When I ran transfer learning earlier, I saw that I need to use the version 6.0 alphabet.txt, which does not contain any numbers, uppercase English letters, or special symbols. But my training data contains numbers and many special symbols. Is there any workaround that lets me keep all the numbers and special symbols during transfer learning?
Or do I need to spell out numbers in my training Excel files (for example, 9 as nine), convert all uppercase letters to lowercase, and remove all special symbols?
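
To make the second option concrete, this is roughly the cleanup I have in mind. It is only a minimal sketch, assuming the alphabet contains just lowercase a-z, the space, and the apostrophe (I have not confirmed the exact contents of alphabet.txt). `number_to_words` and `normalize_transcript` are hypothetical helper names of my own, not part of DeepSpeech, and reading long digit runs digit by digit is an arbitrary choice.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..999999 (hypothetical helper)."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + (" " + ONES[rest] if rest else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = " " + number_to_words(rest) if rest else ""
        return ONES[hundreds] + " hundred" + tail
    thousands, rest = divmod(n, 1000)
    tail = " " + number_to_words(rest) if rest else ""
    return number_to_words(thousands) + " thousand" + tail


def _spell_digits(match):
    digits = match.group(0)
    # Assumption: runs longer than 6 digits (account numbers, phone numbers)
    # are read digit by digit; shorter runs are read as a quantity.
    if len(digits) > 6:
        words = " ".join(ONES[int(d)] for d in digits)
    else:
        words = number_to_words(int(digits))
    return " " + words + " "


def normalize_transcript(text):
    """Map a raw transcript onto the assumed alphabet: a-z, space, apostrophe."""
    text = re.sub(r"\d+", _spell_digits, text)   # 9 -> nine
    text = text.lower()                          # uppercase -> lowercase
    text = re.sub(r"[^a-z' ]", " ", text)        # drop special symbols
    return re.sub(r"\s+", " ", text).strip()     # collapse extra spaces
```

For example, `normalize_transcript("Paid 9 EMIs of Rs. 1500!")` gives `paid nine emis of rs one thousand five hundred`. Symbols such as `%` or `₹` are simply dropped here, although they would probably need to be replaced by the words actually spoken in the audio.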