Sourcing French Speech Training Data for Many Dialects


When a Fortune 500 tech company set out to create speech-enabled technologies for the French market, it knew the task would be a challenge. The region has as many as 28 different dialects and accents, so for a voice assistant to be truly inclusive, it needed to recognize every one of them. To help the company meet its business goals, we sourced comprehensive training data: 600 hours of speech from 1,000 unique speakers who represented a cross-section of genders, ages, and regional dialects.