GitScribe Benchmarks
This directory contains tools and test cases for benchmarking how different Large Language Models (LLMs) generate git commit messages.
Structure
- cases/: Contains subdirectories for specific scenarios (e.g., bug fixes, refactors, new features). Each case should include:
  - staged_files.txt: A list of files staged for commit.
  - diff.patch: The git diff for the changes.
  - history.txt: (Optional) Recent commit history for style reference.
  - guidance.md: (Optional) Project-specific guidance.
- runner.go: (Planned) A script to iterate through available models and record their output for each case.
- results.md: (Planned) A matrix comparing model outputs.
Usage
- Add a new test case to cases/.
- Run the benchmark tool (to be implemented).
- Review results in results.md.