GitScribe Benchmarks

This directory contains tools and test cases for benchmarking how different Large Language Models (LLMs) generate git commit messages.

Structure

- cases/: Contains subdirectories for specific scenarios (e.g., bug fixes, refactors, new features). Each case should include:
  - staged_files.txt: A list of files staged for commit.
  - diff.patch: The git diff for the changes.
  - history.txt: (Optional) Recent commit history for style reference.
  - guidance.md: (Optional) Project-specific guidance.
- runner.go: (Planned) A script to iterate through available models and record their output for each case.
- results.md: (Planned) A matrix comparing model outputs.