Suggestion for two sub-Discourses

I'm thinking of these:

  1. A showroom

Presentation of projects in the area. There is one named “Using the Common Voice Dataset” but it is not active. Actually I’m thinking of a more general one, independent of the dataset. These might give ideas for all of us and also increase participation in open-source ones.

  2. Resources

There are lots of open-source projects related to voice-AI and NLP-related tasks in general. A more-or-less curated collection of these might be helpful for all of us.

1 Like

I like the suggestion of a wider voice-ai and NLP space.

A few suggestions:

  1. How will we ensure it’s multilingual? From my understanding, you can only have e.g. a Common Voice > language sub-discourse.
  2. How will we avoid repeating other resources?
  3. Could it be better on MDN?
2 Likes

My views on your points:

Well, there are sub-Discourses which are not languages. The default language of the main forum is English, and this can be extended there. Many of the resources people will be sharing would present themselves in English, but most of them will be multi-lingual already.

It might just be me, but I don’t actually know of any good resource for things like that, and the existing information is scattered (you might say Hugging Face, but it is quite different). I usually find myself fishing for libraries/tools on the web or GitHub, or just asking in some chat, to avoid reinventing the wheel, for stuff like multi-lingual stemmers, web scraping for language models etc. Some of them are also behind paywalls :frowning:
If there already is such a place, please let me know :slight_smile:

Might be, but it is mostly for programmers, nothing on the science or AI…

The problem is: people are working on the datasets and want to do something with them, see something working, and even show their community that their efforts are worthwhile (this was the driving force behind the voice-chess). The “showroom & resources” idea was meant to fulfill this need: people see something done in language X and use the resources to do something similar in their own language. As this is a forum, people can talk about the projects/resources, shortcomings, good parts, how to make them better, alternatives etc.

Of course, everybody else might be much more knowledgeable and might not need these :slight_smile:

1 Like

Hey @bozden,

Thanks for your response.

Cool - I will ask the Discourse admin to set one up for the resources.

At this time I don’t think we should do a showroom one outside of Common Voice/independent datasets. Rather, I think they should be in resources - e.g. “How I trained a model using X dataset and how you can do the same for Y dataset”.

Plus there is also a new wiki for open voice tech: https://openvoice-tech.net/index.php?title=Main_Page which could be a better and more independent space to surface other datasets.

What do you think?

1 Like

@heyhillary, thank you - and it is a fair decision on “showroom”…

About the Resources category: a category has been created for resources. Thanks so much for the suggestion. If you have any other ideas about this, please let me know.

The first thing I need to do is set up the description. Are you happy with me using your initial wording?

1 Like

Thank you @heyhillary :slight_smile:
I’ll write you the answers to those questions in my own wording tomorrow; it needs a bit of thinking - and your approval. The initial passage does not answer all of those…