Accent doesn’t vary only by region; it also depends on sex, age, social class, ethnicity, native tongue, and even sexuality: attributes that are quite personal to the speaker.
Rather than cataloguing every accent and either putting people into buckets or forcing them to self-identify, I think it’d make more sense to detect accents automatically and not rely on human classification at all.
Maybe use a single sentence in each language that every contributor reads from time to time, and compute accent markers from those recordings. The discovered clusters, and each speaker’s distance from them, become the accent detection / calibration data, and end-users read the same phrase to calibrate for their own accent.
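To make the idea concrete, here’s a minimal sketch of the clustering-and-distance step, assuming each recording of the shared sentence has already been reduced to a fixed-length acoustic embedding (e.g. by a speaker-embedding model); the function names and the toy data are hypothetical, not an existing API:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: discover accent clusters without human labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every embedding to its nearest cluster centre
        labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def accent_signature(embedding, centers):
    """Distance to each discovered cluster: a speaker's accent 'coordinates'."""
    return np.linalg.norm(centers - embedding, axis=1)

# Toy demo: two synthetic accent groups in a 4-dimensional embedding space.
rng = np.random.default_rng(1)
group_a = rng.normal(0.0, 0.1, size=(20, 4))
group_b = rng.normal(3.0, 0.1, size=(20, 4))
centers = kmeans(np.vstack([group_a, group_b]), k=2)
sig = accent_signature(group_a[0], centers)
```

A new contributor reading the calibration phrase would get their own `accent_signature` vector, and no one ever has to name the clusters; labels, if collected at all, would only serve as a sanity check on the data.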
Is this how it actually works, with labels used just to detect skew in the data sets?