
RLHFlow/pair-preference-model-LLaMA3-8B

Size: 8B
Context: 8,192 tokens
Tool Use: Yes
Reward & Judge Models
RLHF reward scoring, Model evaluation, Preference ranking, Quality grading, A/B testing

This model excels at general conversation and instruction following.
