Detecting Hallucinations and Truthfulness in Generative AI
Evaluation Of Large Language Models — Hallucinations and Truthfulness
Perhaps the main reason Large Language Models started blowing everybody's minds in late 2022 was the unprecedented range of questions the models could answer, and the accuracy of the answers they gave. After all, nobody would be particularly impressed if all the models could do was give grammatically well-formed but incorrect or nonsensical answers. For the first time, it looked like you could have a conversation with an entity that had most, if not all, of human knowledge at its disposal.
That being said, LLMs are not infallible. Wrong or conflicting training data can lead to flawed output, generally categorized as hallucinations (incorrect output not directly grounded in any input source) or truthfulness issues (incorrect output caused by incorrect information in the training data).
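To make the distinction concrete, here is a minimal, hypothetical sketch of how a question-answer pair might be labeled; the schema, label names, and example are purely illustrative assumptions, not Defined.ai's actual annotation format:

```python
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    """Categories an annotator can assign to a model's answer (illustrative only)."""
    CORRECT = "correct"                  # answer is factually accurate
    HALLUCINATION = "hallucination"      # incorrect output not grounded in any source
    TRUTHFULNESS_ISSUE = "truthfulness"  # incorrect output traceable to bad training data


@dataclass
class QAAnnotation:
    """A single annotated question-answer pair."""
    question: str
    answer: str
    label: ErrorType
    rationale: str  # short note from the annotator explaining the judgment


# Example: the model attributes a real novel to the wrong author.
example = QAAnnotation(
    question="Who wrote the 1921 novel 'We'?",
    answer="The novel 'We' was written by George Orwell in 1921.",
    label=ErrorType.HALLUCINATION,
    rationale="'We' was written by Yevgeny Zamyatin; Orwell's '1984' was influenced by it.",
)
print(example.label.value)  # -> "hallucination"
```

The exact fields and label set can of course be tailored to the use case, for example by adding severity ratings or source references for the correct answer.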
Defined.ai can help weed out these issues. Our specialized contributors can annotate any question-answer pair for the correctness of its answer. We can also adapt the exact nature of the annotation to your needs. See some of our examples below!