RLHFlow/pair-preference-model-LLaMA3-8B - Playground

Size

Context: 8,192

Tool Use: Yes

Reward & Judge Models

RLHF reward scoring, model evaluation, preference ranking, quality grading, A/B testing

This model excels at general conversation and instruction following.

Temperature: 0.7
