Data Transparency and Selectability: A New Era in the Defined.ai Marketplace
Now that a new quarter has started and spring is in the air, we at Defined.ai would like to take you on a tour of everything that happened in our marketplace during the first quarter. After all, even though turmoil plagued the markets, we had plenty of new additions, changes, and feature updates that you might have missed!
Unlocking the Potential of Data Transparency and Selectability
Let’s start with something that’s not (yet) as visible as other updates but is certainly important for anybody in the market for data of any type: data transparency and selectability. Have you ever acquired data only to realize after the fact that it wasn’t quite what you expected? Were some parts of the data less useful to your models, or even detrimental to their performance? At Defined.ai, we’ve learned over the years that what makes data right for you is unique to your use case. To mention just one example, some of our clients need noisy telephony data, while others need the cleanest possible recordings.
So, to make sure you get the data that is right for you (without the fluff!), we improved our back-end so that we can provide the full metadata of our datasets, unit by unit, for you to select from. For audio, for instance, that means speaker metadata, turn information, audio format, signal-to-noise ratio, voice activity detection output, and more! Once you make your selection, we can package your custom off-the-shelf data for you in a matter of days!
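To make the idea concrete, here is a minimal Python sketch of what selecting units by their metadata could look like on your side. It is an illustration only, not our actual export format or API: the `AudioUnit` fields, the `select_units` helper, the thresholds, and the sample values are all hypothetical and were invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioUnit:
    """One dataset unit with illustrative (hypothetical) metadata fields."""
    unit_id: str
    locale: str
    sample_rate_hz: int
    snr_db: float        # signal-to-noise ratio in decibels
    speech_ratio: float  # fraction of the clip flagged as speech by VAD


def select_units(units, *, min_snr_db=None, max_snr_db=None, min_speech_ratio=0.0):
    """Return the units whose metadata falls inside the requested bounds."""
    selected = []
    for unit in units:
        if min_snr_db is not None and unit.snr_db < min_snr_db:
            continue
        if max_snr_db is not None and unit.snr_db > max_snr_db:
            continue
        if unit.speech_ratio < min_speech_ratio:
            continue
        selected.append(unit)
    return selected


# Made-up sample metadata, one record per unit.
catalog = [
    AudioUnit("u1", "en-US", 8000, 9.5, 0.62),
    AudioUnit("u2", "en-US", 48000, 38.0, 0.81),
    AudioUnit("u3", "sv-SE", 8000, 12.0, 0.35),
    AudioUnit("u4", "sv-SE", 16000, 31.5, 0.77),
]

# One client wants noisy telephony-style audio, another the cleanest recordings.
noisy_telephony = select_units(catalog, max_snr_db=15.0, min_speech_ratio=0.5)
clean_recordings = select_units(catalog, min_snr_db=30.0)

print([u.unit_id for u in noisy_telephony])   # ['u1']
print([u.unit_id for u in clean_recordings])  # ['u2', 'u4']
```

The same catalog yields two very different selections depending on the bounds, which is the point of exposing metadata per unit rather than per dataset.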
Expanding our Catalog: New Locales, Verticals, and Data Types
Of course, data transparency and selectability are only helpful if there is something to select from. We are constantly updating our catalog with new locales, verticals and domains, and data types to give you a wider range of options. Here’s a selection of what we added in recent months:
- Spontaneous dialogue speech data in a variety of locales, including Mandarin Chinese, Korean, Norwegian Bokmål, Gaelic, Swedish, and Russian
- Scripted speech data in Norwegian Bokmål, Czech, Canadian French, Flemish, and Black American English
- Phonetic lexica in a variety of locales
- Medical app usage data: doctors’ job search data, physician demographics, search data, and more
- Brazilian retail images
- Anonymized medical history data
Stay Tuned: A Glimpse at Our Upcoming Data Offerings
We also want to look ahead at some exciting things we have coming up. Keep your eyes peeled for the following:
- Full-consent, ethically sourced live data!
- Real online meetings
- Real far-field user queries to a device-based assistant
Browse the datasets above and many more in our marketplace here, where our commitment to data transparency and selectability is on display.