General Knowledge Prompt and Response Data for LLMs

Live Data
LLM
Large Language Models
NLP

About this Dataset

Defined.ai offers this dataset for natural language understanding and LLM training. It contains spontaneous, user-initiated prompts from one million unique users interacting with a generic digital assistant. All PII (Personally Identifiable Information) has been removed, and each prompt carries intent and entity annotations. The queries span hundreds of intents and subintents, including weather requests, business searches, music playback, and knowledge questions.

License Information

This dataset is covered by our standard Data License Agreement. The license is perpetual and permits commercialization of all models built on the data.

Defined.ai hosts the leading online marketplace for buying and selling AI data, tools and models, and offers professional services to help deliver success in complex machine learning projects. Defined.ai is a community of AI professionals building fair, accessible and ethical AI of the future.

© 2023 DefinedCrowd. All rights reserved.