5 changes: 4 additions & 1 deletion vero/src/vero/harbor/build/templates/instruction.md.j2
@@ -18,7 +18,10 @@ progress on the splits you *are* allowed to evaluate, within a fixed budget.
4. Check budget / which splits are evaluable anytime: `vero harbor status`.
{% if submit_enabled %}5. When done, nominate your best commit: `vero harbor submit`.{% else %}
The best commit you evaluate on `{{ selection_split }}` is selected automatically and
-scored on the hidden test split at the end.{% endif %}
+scored on the hidden test split at the end. Only commits *other than the seeded
+baseline* are selectable: evaluating the unmodified baseline spends budget without
+creating a candidate, so make sure at least one eval is of a commit that contains
+your changes.{% endif %}

## Rules

9 changes: 9 additions & 0 deletions vero/tests/test_harbor_build.py
@@ -167,3 +167,12 @@ def test_seed_documents_advisory_read_only(built):
     seed = (built / "environment/main/seed.sh").read_text()
     assert "ADVISORY ONLY" in seed
     assert "sidecar-side" in seed
+
+
+def test_instruction_warns_baseline_not_selectable(built):
+    # auto_best: the agent must be told baseline evals do not create candidates
+    # (found live: an optimizer that spent its whole budget measuring the
+    # baseline died with "no candidate experiments" at finalize).
+    text = (built / "instruction.md").read_text()
+    assert "other than the seeded" in text
+    assert "spends budget without" in text
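
For context, the selection rule that the new instruction text describes can be sketched roughly as follows. This is a minimal illustration, not Vero's actual finalize code: the `Eval` record, the `select_best` function, and the assumption that a higher score is better are all hypothetical. Only the "no candidate experiments" message is taken from the test comment above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Eval:
    """Hypothetical record of one evaluation on the selection split."""
    commit: str   # identifier of the evaluated commit
    score: float  # assumed: higher is better


def select_best(evals: list[Eval], baseline: str) -> Eval:
    """Return the best-scoring eval whose commit is not the seeded baseline."""
    # Baseline evals cost budget but never become candidates.
    candidates = [e for e in evals if e.commit != baseline]
    if not candidates:
        # The failure mode the instruction text warns about.
        raise RuntimeError("no candidate experiments")
    return max(candidates, key=lambda e: e.score)
```

Under this sketch, an agent that only ever evaluates the baseline reaches finalize with an empty candidate list, even if the baseline scored well, which is why the instruction asks for at least one eval of a modified commit.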