Improve your CV models with this diverse Human Dataset

Defined.ai is proud to introduce its newest computer vision (CV) dataset, a valuable addition to our online marketplace for ethically sourced AI training data. This dataset comprises over 497,000 original images of people, including both headshots and full-body images. Each subject in the human dataset has signed a biometric model release, ensuring that their data is used ethically and responsibly.

Why are we excited about this human dataset?

At Defined.ai, we believe ethical sourcing of training data is crucial in the AI industry. While other companies are experiencing the drawbacks of dubious data collection practices, we stand firm on the same values that we’ve had for years. We’re excited to offer this new human dataset, as it meets these standards while still delivering valuable insights to businesses working in computer vision and other AI applications.


This dataset is a valuable resource for any ML team working in human-computer interaction. It contains copious and diverse examples of people, covering a range of facial expressions, head angles, ages, and other demographic attributes. The data is indexable and can be partitioned based on your niche demographic or behavioral requirements. Moreover, by ensuring that the data is ethically sourced and that the individuals involved have given their informed consent, we’re helping to build trust in the AI industry and making it more accessible to a wider audience.
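To make the idea of partitioning concrete, here is a minimal sketch in Python. It assumes each image comes with a metadata record carrying tags such as age and emotion; the field names, values, and the helper functions `filter_records` and `partition_by` are hypothetical and do not reflect the dataset's actual delivery format.

```python
from collections import defaultdict

# Hypothetical metadata records; field names and values are illustrative only.
records = [
    {"image_id": "img_0001", "age": 27, "emotion": "happy"},
    {"image_id": "img_0002", "age": 54, "emotion": "neutral"},
    {"image_id": "img_0003", "age": 31, "emotion": "happy"},
    {"image_id": "img_0004", "age": 68, "emotion": "surprised"},
]


def filter_records(records, min_age=None, max_age=None, emotions=None):
    """Return the records that satisfy every criterion that was supplied."""
    selected = []
    for rec in records:
        if min_age is not None and rec["age"] < min_age:
            continue
        if max_age is not None and rec["age"] > max_age:
            continue
        if emotions is not None and rec["emotion"] not in emotions:
            continue
        selected.append(rec)
    return selected


def partition_by(records, key):
    """Group records into buckets keyed by one metadata field."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[key]].append(rec)
    return dict(buckets)


# Select a narrow slice, then bucket the full set by a single tag.
young_happy = filter_records(records, max_age=35, emotions={"happy"})
by_emotion = partition_by(records, "emotion")
```

The same pattern extends to any other tag that ships with the data: add a criterion to the filter, or pass a different key to the partition step.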

What is in this Human Dataset?

This dataset is currently divided into three subsets, each with its own applications:

1. The multi-image dataset

The multi-image dataset contains multiple images of each individual captured at different face angles and with different emotions. This subset could be useful for training facial recognition systems or emotion detection models, among other applications. For example, a retail store can use facial recognition technology to identify repeat customers and offer them personalized shopping experiences based on their preferences and buying history. An entertainment company can use emotion detection models to analyze how people react to different types of content and tailor their marketing strategies accordingly.
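Because this subset holds several images of the same person, teams training recognition models often split the data by individual rather than by image, so that no one appears in both the training and the evaluation set. The sketch below shows one way to do that in Python. The `subject_id` and `angle` fields and the `split_by_subject` helper are assumptions made for illustration, not part of the dataset's documented schema.

```python
import random


def split_by_subject(image_records, test_fraction=0.2, seed=0):
    """Split records so that every subject lands in exactly one of the two sets."""
    # Collect the distinct subjects and shuffle them reproducibly.
    subjects = sorted({rec["subject_id"] for rec in image_records})
    rng = random.Random(seed)
    rng.shuffle(subjects)

    # Hold out a fraction of subjects (at least one) for evaluation.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [r for r in image_records if r["subject_id"] not in test_subjects]
    test = [r for r in image_records if r["subject_id"] in test_subjects]
    return train, test


# Hypothetical records: 10 subjects, each photographed at three face angles.
image_records = [
    {"subject_id": f"s{i:02d}", "angle": angle}
    for i in range(10)
    for angle in ("front", "left", "right")
]

train, test = split_by_subject(image_records)
```

Splitting on the subject identifier rather than on individual images keeps evaluation scores from being inflated by faces the model has already seen.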

2. The single-image dataset

The single-image dataset contains a single front-facing image of each individual, accompanied by tags such as age, ethnicity, and emotion. This subset could be useful for building diverse sets of avatars or for demographic analysis. For example, a virtual reality game company can use the dataset to create avatars more representative of their user base, improving the overall user experience. A research organization can use the dataset for demographic analysis and research to better understand societal trends and patterns.

3. The video dataset

The video dataset contains a collection of videos, each about three minutes long, of people performing a set of movements and facial expressions. This subset could be useful for training models to recognize and interpret human gestures and movements in human-computer interaction or virtual reality applications. For example, a healthcare provider can use the dataset to train models that help physical therapists recognize and respond to patients’ movements and expressions more accurately, improving the quality of care.

Follow the competition: How is the market leveraging similar datasets?

Various companies and organizations already use human CV datasets like this to develop cutting-edge AI applications. Here are some real-world examples:

  • Kairos is a facial recognition company that provides an API for developers to integrate facial recognition technology into their applications. The company’s technology has been used in security, retail, and healthcare applications.
  • Clarifai is an image and video recognition company that provides an API for developers to integrate image and video recognition into their applications. The company’s technology has been used in applications such as advertising, media, and e-commerce.

Why purchase from Defined.ai?

At Defined.ai, we are committed to providing our clients with ethically sourced AI training data. This latest dataset is a testament to that commitment. This offering’s vast amount of human imagery and video content is an invaluable resource for AI applications in countless fields. We can’t wait to see how our clients will utilize this unique dataset to drive innovation across all industries.


Director of Strategic AI Partnerships