Requirements
To claim this bounty, you are expected to:
- reimplement METR’s methodology from scratch, as an independent check that the original contains no implementation errors;
- use at least three of the same benchmarks as METR (your choices must be listed in the preregistration); and
- include at least one model not included in the original result.
If you cannot meet all of these requirements, please still consider submitting a proposal, though we may need to lower the bounty.
If you’d like a higher bounty, you could try:
- using a different statistical model to estimate time horizons;
- including more benchmarks (especially proprietary or newly created benchmarks); or
- collecting more or higher-quality human baselines.
Contact us if you have any questions.
Submit a Proposal
Use our submission form to send us your proposal. See the instructions for more details on the proposal and preregistration process.