Comments

Now that you've created and run your Thread, it's time to start evaluating and analyzing it. This is where Comments come in.

Commenting is where the real work is done. Comments allow you and the community to dive into the LLM output on your Thread, discovering shortcomings, successes, and areas for improvement. The system documents all of this automatically, allowing you to develop a comprehensive understanding of LLM successes and failures.

What you will learn

  • Why Comments drive understanding and improvement of LLMs
  • How you, domain experts, and the larger community use Comments to drive LLM improvements

Comments Drive LLM Success

For many of us who grew up in the age of what's now "first-gen" AI (the classics: object recognition, classification, all those fun topics everyone is suddenly "too cool" for...), evaluation was often an afterthought. On many projects you'd be looking at a single accuracy number. To add a little spice and excitement to life, you'd throw in a confusion matrix or maybe even an ROC curve if you were feeling crazy.

LLMs challenge this notion. LLMs are really easy to apply everywhere, but that doesn't mean they work everywhere right away. Domains as disparate as medical care, business intelligence, and education are all trying to adopt LLMs. But how do domain experts and practitioners in each of these fields know if their LLMs are working well? For example, if you're a global health expert creating an LLM to help provide information in underserved rural countries, is knowing that an off-the-shelf LLM has an MMLU score of 65.78 all you need to know to put it into production?

Of course not! You need to see exactly how the LLM will respond to patient inputs, and then analyze these responses in detail with domain experts. Computer scientists may develop the base LLMs, but in this case they can't tell you if the LLM works for this domain. You need the global health experts to do this. And this is where Comments come in.

Using Comments to Source Expert Evaluation of LLMs

Comments allow you and other domain experts to deeply analyze and understand LLM output. Once you've run your Thread on an LLM, simply click on any LLM output to pull up the Comment box. The Comment box displays all previous Comments and lets you create new ones.

This is how you get to truly deep, rigorous, domain-specific LLM evaluations. As discussed above, evaluating LLMs is not as easy as distilling everything down to a single number. Thorough engagement and analysis via Comments can help you truly understand where an LLM is succeeding or failing in a given domain.

You can use Comments in any manner to dissect and analyze LLM output, but here are some common uses:

  • Identifying hallucinations
  • Understanding issues with logic
  • Surfacing failure modes for the LLM
  • Flagging problems that need to be addressed later via fine-tuning, prompt engineering, or other LLM improvement methods
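
If you export or copy Comments out for your own analysis, the common uses above can double as categories to tally. The following is a minimal sketch of that idea, not PeerAI's API: the `Comment` record, the category names, and `tally_by_category` are all hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical category names mirroring the common uses listed above.
# These are assumptions for illustration, not PeerAI identifiers.
CATEGORIES = {"hallucination", "logic", "failure_mode", "needs_followup"}


@dataclass(frozen=True)
class Comment:
    """A hypothetical record for one Comment on an LLM output."""
    author: str
    category: str
    text: str


def tally_by_category(comments):
    """Count Comments per category, rejecting unknown categories."""
    counts = Counter()
    for comment in comments:
        if comment.category not in CATEGORIES:
            raise ValueError(f"unknown category: {comment.category}")
        counts[comment.category] += 1
    return counts


# Made-up example Comments, for illustration only.
comments = [
    Comment("reviewer_a", "hallucination", "Cites a clinic that does not exist."),
    Comment("reviewer_b", "logic", "The dosage reasoning skips a step."),
    Comment("reviewer_a", "hallucination", "Invents a guideline number."),
]

summary = tally_by_category(comments)
print(summary.most_common())  # [('hallucination', 2), ('logic', 1)]
```

A tally like this can hint at which kind of problem dominates a Thread, but it complements the detailed Comment text written by domain experts rather than replacing it.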

Who Can Comment

Getting the right people to give Comments is important. Given the peer-to-peer style of PeerAI (it's in our name!), there are many different ways to get the detailed and insightful Comments that will drive your LLM improvement:

Flying Solo

You may know your particular problem better than anyone. In this case, fly solo and dig deep into responses generated by multiple LLMs. Keep track of your reviews, perform your analysis, and capture it all in Comments.

Creating Communities

You may work in a community of other experts who share a deep knowledge of the problems you work on. Easily invite and collaborate with other domain experts within your community. The more viewpoints and opinions you get, the better your analysis will be. Click here for a guide to inviting others to edit and comment on your Threads.

Crowd Sourcing

Active communities on Reddit, Hugging Face, and other platforms are great places to solicit Comments and input. Post a link to your Thread on one of these platforms and get the entire web involved in helping you analyze LLM responses.

Creating Comments

Creating Comments on your Thread is easy. Click here for detailed instructions.
