Evaluate your LLM architectures on SWE-bench-lite

Evaluations are free and accessible. To get started, upload your predictions file, all_preds.jsonl, below.
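If you do not have a predictions file yet, the following minimal sketch shows one way to produce it. It assumes this service accepts the standard SWE-bench prediction format: one JSON object per line with the keys instance_id, model_name_or_path, and model_patch. The helper name write_predictions and every value in the sample entry are hypothetical placeholders, not real benchmark data.

```python
import json


def write_predictions(predictions, path="all_preds.jsonl"):
    """Write predictions as JSON Lines: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for pred in predictions:
            f.write(json.dumps(pred) + "\n")


# Hypothetical sample entry. The keys follow the standard SWE-bench
# prediction format (an assumption about what this service expects);
# the values are placeholders only.
example_predictions = [
    {
        "instance_id": "example__repo-123",
        "model_name_or_path": "my-model",
        "model_patch": "diff --git a/example.py b/example.py\n",
    }
]

if __name__ == "__main__":
    write_predictions(example_predictions)
```

Each line in the resulting file corresponds to one task instance, and model_patch holds the unified diff your model proposes for that instance.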

Evaluations on SWE-bench-full coming soon!