NutriSteppe-AI — LLM Orchestration Verification

1 · Orchestration prompt under test

This is the system prompt sent to the model. Replace it with your production prompt to verify the real system.

Model

Judge model (scores subjective checks)

2 · Results

Checks passed: no result yet

Critical violations (release blockers): no result yet

High-severity fails (needs review): no result yet

Release gate: Not run

Gate logic: any failed critical case blocks release; otherwise, any failed high-severity case flags the run for review; if neither occurs, the gate passes. Subjective checks are scored by a judge model, and a human reviewer can override that score inside each case.
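The gate rule above can be sketched in a few lines of Python. This is a minimal illustration, not the harness's actual code: the `CaseResult` type, its field names, the severity strings, and the returned status labels are all assumptions made for the example, as is the choice to report "not run" when there are no results.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class CaseResult:
    """Hypothetical record for one test case (field names are assumptions)."""
    case_id: str
    severity: str                          # "critical", "high", or anything lower
    passed: bool                           # automatic check or judge-model verdict
    human_override: Optional[bool] = None  # reviewer's verdict, if one was given

    @property
    def final_passed(self) -> bool:
        # A human reviewer's decision takes precedence over the judge model.
        return self.passed if self.human_override is None else self.human_override


def release_gate(results: Iterable[CaseResult]) -> str:
    """Return the gate status for a set of case results."""
    results = list(results)
    if not results:
        return "not run"  # assumption: an empty run has no gate decision
    failed = [r for r in results if not r.final_passed]
    if any(r.severity == "critical" for r in failed):
        return "blocked"       # any failed critical case blocks release
    if any(r.severity == "high" for r in failed):
        return "needs review"  # any failed high-severity case flags review
    return "pass"


if __name__ == "__main__":
    demo = [
        CaseResult("c1", "critical", passed=True),
        CaseResult("c2", "high", passed=False),
        CaseResult("c3", "high", passed=False, human_override=True),
    ]
    print(release_gate(demo))
```

In the demo, case `c3` fails the judge but is overridden to a pass by the reviewer, so only `c2` counts as a high-severity failure. Checking critical failures first makes the precedence explicit: a run with both critical and high-severity failures is blocked rather than merely flagged.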

3 · Test cases

NutriSteppe-AI verification harness v1.0 · runs inside the Claude artifact viewer · the same case data ships as nutristeppe_verification_dataset_v1.json