No, the sentence lists are good. It looks like the language gets mixed up somehow at the recording step (stale session data, as @dabinat suggested?).
As a validator I don’t have Frisian in my profile, only Dutch.
But the person (female voice) who recorded both sentences from the starting post perhaps does have both Dutch and Frisian.
These two sentences from the starting post should be reported by me (also
SanderE as profile name as well there (although at the time it could be “sander” or even done anonymously, I’m not entirely sure).
I just took a look at the voice web code, and especially the MySQL DB schema, which I tried to piece together from all the schema changes.
It seems that the sentences table and the clips table both have a foreign key to the locales table.
But there seems to be no requirement for the sentence locale id to be equal to the clip locale id.
So what does a SQL query turn up when you compare the locale id of each clip with the locale id of its original sentence and list the mismatches (for Dutch and other languages)?
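To make the question concrete, here is a rough sketch of the kind of mismatch query I mean, run against a tiny in-memory SQLite database instead of the real MySQL one. The table layout is heavily simplified and the column names (`original_sentence_id`, `locale_id`, `name`) are my assumptions from reading the schema changes, so they may need adjusting for the real database.

```python
import sqlite3

# Hypothetical, simplified schema: the real tables have more columns, and
# the column names used here are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE locales (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE sentences (
        id TEXT PRIMARY KEY,
        locale_id INTEGER NOT NULL REFERENCES locales(id)
    );
    CREATE TABLE clips (
        id INTEGER PRIMARY KEY,
        original_sentence_id TEXT NOT NULL REFERENCES sentences(id),
        locale_id INTEGER NOT NULL REFERENCES locales(id)
    );
    INSERT INTO locales VALUES (1, 'nl'), (2, 'fy-NL');
    INSERT INTO sentences VALUES ('s1', 1), ('s2', 1), ('s3', 2);
    -- clip 11 belongs to a Dutch sentence but is stored under Frisian
    INSERT INTO clips VALUES (10, 's1', 1), (11, 's2', 2), (12, 's3', 2);
""")

# Join each clip to its original sentence and keep only the rows where
# the two locale ids disagree.
MISMATCH_QUERY = """
    SELECT c.id, cl.name AS clip_locale, sl.name AS sentence_locale
    FROM clips c
    JOIN sentences s ON s.id = c.original_sentence_id
    JOIN locales cl ON cl.id = c.locale_id
    JOIN locales sl ON sl.id = s.locale_id
    WHERE c.locale_id <> s.locale_id
    ORDER BY c.id
"""

mismatches = conn.execute(MISMATCH_QUERY).fetchall()
for clip_id, clip_locale, sentence_locale in mismatches:
    print(f"clip {clip_id}: stored as {clip_locale}, sentence is {sentence_locale}")
```

The SELECT itself is plain SQL, so it should carry over to MySQL more or less unchanged once the real column names are filled in.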
Hmm, I seem to have missed a migration from 2019-12, which appears to have forced the clip and the sentence to have the same locale wherever they differed:
So it seems this has been a problem before, and there has at least been a fix-up for the data. That fix-up could be incomplete, though: if the locale was wrong, people could have downvoted the clip instead of reporting it, so perhaps the votes will have to be reset to zero as well.
But is there any reason to have a separate locale id for clips? I can’t think of one, except as a query optimization so you don’t have to do a join when getting the clips.
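As a sketch of what a fix-up that also resets the votes could look like (this is not the code of the 2019-12 migration, just my own illustration): the `votes` table with `clip_id` and `is_valid`, and the other column names, are assumptions, and it again uses an in-memory SQLite database rather than MySQL.

```python
import sqlite3

# Hypothetical, simplified schema; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sentences (id TEXT PRIMARY KEY, locale_id INTEGER NOT NULL);
    CREATE TABLE clips (
        id INTEGER PRIMARY KEY,
        original_sentence_id TEXT NOT NULL REFERENCES sentences(id),
        locale_id INTEGER NOT NULL
    );
    CREATE TABLE votes (
        id INTEGER PRIMARY KEY,
        clip_id INTEGER NOT NULL REFERENCES clips(id),
        is_valid INTEGER NOT NULL
    );
    INSERT INTO sentences VALUES ('s1', 1), ('s2', 1);
    -- clip 11 has the wrong locale and collected two down votes
    INSERT INTO clips VALUES (10, 's1', 1), (11, 's2', 2);
    INSERT INTO votes (clip_id, is_valid) VALUES (10, 1), (11, 0), (11, 0);
""")

# Ids of clips whose locale differs from that of their original sentence.
MISMATCHED = """
    SELECT c.id FROM clips c
    JOIN sentences s ON s.id = c.original_sentence_id
    WHERE c.locale_id <> s.locale_id
"""

with conn:  # one transaction: both statements commit together or not at all
    # Remove the votes first, while the mismatch is still detectable.
    deleted = conn.execute(
        f"DELETE FROM votes WHERE clip_id IN ({MISMATCHED})"
    ).rowcount
    # Then copy the sentence locale onto the mismatched clips.
    fixed = conn.execute(
        f"""
        UPDATE clips
        SET locale_id = (SELECT s.locale_id FROM sentences s
                         WHERE s.id = clips.original_sentence_id)
        WHERE id IN ({MISMATCHED})
        """
    ).rowcount

print(f"removed {deleted} votes, fixed {fixed} clips")
```

The order matters: once the clip locale has been corrected, there is no longer any way to tell which votes were cast while the clip was shown under the wrong language. Note that MySQL does not allow an UPDATE to select from its own target table in a subquery like this, so there it would have to be rewritten as a multi-table UPDATE with a join.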