SlopCodeBench

A comprehensive benchmark for evaluating code generation models on real-world programming tasks

Diverse Problems

A collection of real-world programming challenges across multiple domains and difficulty levels

Rigorous Evaluation

Comprehensive test suites and automated scoring for consistent, reproducible model assessment

Open Access

All problems, evaluations, and results are publicly available for research and development

Overview

SlopCodeBench is a benchmark for evaluating code generation models on practical programming tasks. It includes a diverse set of problems that test various aspects of code generation, including the following (a sketch of a possible problem format appears after the list):

  • Algorithm implementation
  • Data structure manipulation
  • API integration
  • Bug fixing and code modification
  • Test-driven development

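To make the problem format concrete, here is a minimal sketch of how a single benchmark problem might be represented in code. This is an illustrative assumption, not SlopCodeBench's actual schema; the `Problem` dataclass and its field names (`problem_id`, `category`, `tests`, and so on) are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Problem:
    # Hypothetical problem record; SlopCodeBench's real schema is not
    # specified in this overview.
    problem_id: str    # unique identifier, e.g. "algorithms/two-sum"
    category: str      # e.g. "algorithm", "api-integration", "bug-fixing"
    difficulty: str    # e.g. "easy", "medium", "hard"
    description: str   # natural-language task statement shown to the model
    starter_code: str  # scaffold the model completes or modifies
    tests: list[str] = field(default_factory=list)  # self-contained assert snippets


example = Problem(
    problem_id="algorithms/two-sum",
    category="algorithm",
    difficulty="easy",
    description="Return indices of two numbers in nums that sum to target.",
    starter_code="def two_sum(nums: list[int], target: int) -> list[int]: ...",
    tests=["assert two_sum([2, 7, 11, 15], 9) == [0, 1]"],
)
```

Storing tests as self-contained snippets keeps each check runnable in isolation, which the evaluation sketch below takes advantage of.
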
Each problem comes with a detailed description, test cases, and evaluation criteria to ensure consistent and fair assessment across all models.
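
As an illustration of what the automated evaluation step can look like (this overview does not specify the actual harness), the sketch below scores a candidate solution by running each test snippet in its own subprocess. The `evaluate` function, its signature, and the per-test subprocess design are assumptions for illustration:

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def evaluate(solution_code: str, tests: list[str], timeout: float = 10.0) -> float:
    """Run each test snippet against a candidate solution; return the pass rate.

    Deliberately minimal: each test runs in a separate subprocess so a crash
    or infinite loop in model-generated code cannot abort the whole run.
    Real harnesses typically add sandboxing and resource limits on top.
    """
    if not tests:
        return 0.0
    passed = 0
    for test in tests:
        # Write the solution plus a single test into a throwaway script.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(solution_code + "\n\n" + test + "\n")
            script = Path(f.name)
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                timeout=timeout,
                capture_output=True,
            )
            if result.returncode == 0:
                passed += 1
        except subprocess.TimeoutExpired:
            pass  # a hung test counts as a failure
        finally:
            script.unlink(missing_ok=True)
    return passed / len(tests)
```

Isolating each test in its own process is a common design in code benchmarks: model-generated code is untrusted, and one pathological solution should not take down the evaluation of the rest.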