Create a CC0 chat channel - an interesting approach to collecting random daily dialog

Hi,

I joined the “g0v hackathon” [1], a one-day event, last weekend and took Common Voice as my topic. During the hackathon, we came up with a great idea for collecting more CC0 daily chat text:

Create a #rand0m channel in our community Slack, to use instead of the default #random channel.

Every Slack instance has a default #random channel where people chat, share random stuff, and make jokes. If all of that text were released under CC0, that would be great.

So we created #rand0m and set the channel topic as shown below. Then I went to #random and asked people to switch, and I also promoted the initiative at the final presentation.

Everything here will be released under CC0 to the public domain, chat freely!

That’s all. We collected 600 sentences in just one day. People are fascinated by the idea. Chat like crazy and help the public, how nice is that!

1) Event page (brief English intro at the bottom)

2) Screenshot of #rand0m in the g0v Slack; we have 42 people in the channel so far

I like the idea. How are you ensuring that the sentences do not include abbreviations or common mistakes people make during chat conversations?

I copy the chat log, then review and fix the sentences before we submit them. It’s quite fast in fact: it took about 10 minutes to review 300 sentences.

Is there a way to automate that? I’m thinking about scaling and how you would solve this at 100x the volume.

I will leave that question until we have 100x as many people participating in the channel. By then we can easily find more people to help moderate. I don’t want to over-design the system.

I understand, but I think it’s really important that we keep this scaling problem in mind in any effort we make, since sooner or later we will need to address it.

Maybe it could automatically import all sentences that contain only dictionary words and flag any that don’t?
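
A rough sketch of what that could look like in Python. This is only an illustration, not something anyone in this thread has built: the function name `triage_sentences`, the word list, and the sample messages are all made up, and it assumes a language where words can be split on non-letter characters (so it would need a real tokenizer for Chinese, for example).

```python
import re

# A "word" is a run of letters, optionally with inner apostrophes (don't, it's).
WORD_RE = re.compile(r"[^\W\d_]+(?:['’][^\W\d_]+)*")


def triage_sentences(sentences, dictionary):
    """Split sentences into (accepted, flagged).

    accepted: sentences whose words are all in the dictionary.
    flagged:  (sentence, unknown_words) pairs left for a human reviewer.
    """
    known = {word.lower() for word in dictionary}
    accepted, flagged = [], []
    for sentence in sentences:
        words = [w.lower() for w in WORD_RE.findall(sentence)]
        unknown = [w for w in words if w not in known]
        if words and not unknown:
            accepted.append(sentence)
        else:
            # Sentences with no words at all (emoji only, links) land here too.
            flagged.append((sentence, unknown))
    return accepted, flagged


# Hypothetical word list and chat messages, for illustration only.
sample_dictionary = ["see", "you", "tomorrow", "i", "don't", "know"]
sample_chat = ["See you tomorrow", "brb lol", "I don't know"]

accepted, flagged = triage_sentences(sample_chat, sample_dictionary)
print("import automatically:", accepted)
print("needs review:", flagged)
```

The flagged list keeps the unknown words next to each sentence, so a reviewer can quickly see whether it is an abbreviation, a typo, or just a word missing from the dictionary.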

The channel is really pretty random, and so far I can only manually select good sentences to submit. It takes about 5 minutes per day for 200 sentences, so it’s not a big hassle so far.