LLM-based Style Consistency
Definition
LLM-based Style Consistency outputs a score between 0.0 and 1.0 assessing how closely the style of the generated answer matches that of the reference answer(s). It assesses style aspects such as tone, verbosity, formality, complexity, and use of terminology.
Scoring rubric in LLM Prompt:
- 0.0 means that the answer is in a completely different style from the reference answer(s).
- 0.33 means that the answer is barely in the same style as the reference answer(s), with noticeable differences.
- 0.66 means that the answer is largely in the same style as the reference answer(s) but there’s a slight difference in some aspects.
- 1.0 means that there’s no discernible style difference between the generated answer and reference answer(s).
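The rubric above can be sketched in code as a prompt builder plus a parser that snaps the model's reply to the nearest rubric level. This is a minimal illustration, not the metric's actual implementation: the names `build_style_prompt`, `parse_style_score`, and `RUBRIC_LEVELS`, and the prompt wording, are assumptions made for the example, and the LLM call itself is left out.

```python
import re

# Rubric levels taken from the scoring rubric above.
RUBRIC_LEVELS = (0.0, 0.33, 0.66, 1.0)


def build_style_prompt(answer, ground_truths):
    """Assemble a hypothetical prompt asking an LLM to grade style similarity."""
    refs = "\n".join(f"- {ref}" for ref in ground_truths)
    return (
        "Compare the style (tone, verbosity, formality, complexity, "
        "terminology) of the generated answer with the reference answer(s).\n"
        "Reply with one score: 0.0, 0.33, 0.66 or 1.0.\n\n"
        f"Reference answer(s):\n{refs}\n\n"
        f"Generated answer:\n{answer}\n"
    )


def parse_style_score(reply):
    """Take the first number in the reply and snap it to the nearest rubric level."""
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"no numeric score found in reply: {reply!r}")
    value = float(match.group())
    return min(RUBRIC_LEVELS, key=lambda level: abs(level - value))
```

Snapping to the nearest level is one possible design choice; it keeps scores comparable across runs even when the model replies with an off-rubric value such as 0.7.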
Example Usage
Required data items: answer, ground_truths
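As a rough illustration of the required data items, the sketch below builds one input record and checks that both items are present. The sample texts and the helper `missing_items` are hypothetical and not part of any library API; `ground_truths` is assumed to be a list of reference answers, as the plural "reference answer(s)" suggests.

```python
# The two data items the metric requires.
REQUIRED_ITEMS = ("answer", "ground_truths")


def missing_items(datum):
    """Return the required items that are absent from the datum (hypothetical helper)."""
    return [key for key in REQUIRED_ITEMS if key not in datum]


# Hypothetical record: one generated answer and its reference answers.
datum = {
    "answer": "Yep, just restart the service and you should be good to go.",
    "ground_truths": [
        "Please restart the service to apply the configuration change.",
        "Restarting the service will apply the new configuration.",
    ],
}

if missing_items(datum):
    raise KeyError(f"missing required items: {missing_items(datum)}")
```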