My repo is getting a bit too big. Maybe I should stop saving tested model results in JSON files...
The advantages of JSON files:
- they are automatically versioned by git
- easily searchable both in the editor and by LLMs
- can immediately see what changed in git diff
- human readable, easy to understand and portable (can be transferred and used in other apps with a simple copy-paste)
Disadvantages:
- inefficient to query (e.g. finding the highest value for key XYZ across files), although in my case I pre-process all the values I need into a summary JSON file
- slows down git and git diff a lot once you have large files or many files
- LLMs sometimes over-read files and fill their context with too much unrelated data
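The summary pre-processing mentioned above could look roughly like this. It is a minimal sketch, not my actual script: the function names, the flat one-object-per-file layout, and the key name XYZ are assumptions for illustration.

```python
import json
from pathlib import Path


def build_summary(results_dir, keys):
    """Collect the selected top-level keys from every *.json file in results_dir.

    Assumes each file holds a single JSON object (hypothetical layout).
    """
    summary = {}
    for path in sorted(Path(results_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Keep only the keys of interest so the summary stays small.
        summary[path.name] = {k: data[k] for k in keys if k in data}
    return summary


def highest_in_summary(summary, key):
    """Return (file_name, value) for the largest value of key, or None if no file has it."""
    candidates = [(name, vals[key]) for name, vals in summary.items() if key in vals]
    return max(candidates, key=lambda item: item[1], default=None)
```

The resulting dict can be written out with `json.dump` as the summary file. If it is stored in the same folder as the results, it should be excluded from the glob, otherwise it would be read back in on the next run.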
A solution could be moving to a database. SQLite would be the closest match to the current setup, but with a few trade-offs:
- versioning would be lost (unless old versions of the database file are saved intentionally)
- LLMs would need a database query tool to find data
- LLMs would need a way to find what changed since last snapshot
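To get a feel for what the SQLite route would involve, here is a minimal sketch using Python's built-in `sqlite3` module. The table layout, the snapshot labels, and all function names are assumptions for illustration. Keeping a `snapshot` column is one possible way to get back some of the versioning and the "what changed" view.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the results database with one row per snapshot/model/metric."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS results (
               snapshot TEXT NOT NULL,
               model    TEXT NOT NULL,
               metric   TEXT NOT NULL,
               value    REAL,
               PRIMARY KEY (snapshot, model, metric)
           )"""
    )
    return conn


def save_snapshot(conn, snapshot, results):
    """Store results shaped like {model: {metric: numeric value}} under a snapshot label."""
    rows = [
        (snapshot, model, metric, value)
        for model, metrics in results.items()
        for metric, value in metrics.items()
    ]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR REPLACE INTO results VALUES (?, ?, ?, ?)", rows)


def highest_in_db(conn, snapshot, metric):
    """Return (model, value) with the largest value for metric in a snapshot, or None."""
    return conn.execute(
        "SELECT model, value FROM results "
        "WHERE snapshot = ? AND metric = ? ORDER BY value DESC LIMIT 1",
        (snapshot, metric),
    ).fetchone()


def changed_since(conn, old, new):
    """List (model, metric, old_value, new_value) for rows that were added or changed.

    Rows that exist only in the old snapshot (removals) are not reported.
    """
    return conn.execute(
        """SELECT n.model, n.metric, o.value, n.value
           FROM results AS n
           LEFT JOIN results AS o
             ON o.snapshot = ? AND o.model = n.model AND o.metric = n.metric
           WHERE n.snapshot = ? AND (o.value IS NULL OR o.value != n.value)
           ORDER BY n.model, n.metric""",
        (old, new),
    ).fetchall()
```

An LLM query tool could then be a thin wrapper around functions like `highest_in_db` and `changed_since`, so the model only sees the rows it asked for instead of whole files. The catch is that every run has to be saved under its own snapshot label, which is the bookkeeping git currently does for free.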
For now, I will stick with JSON files (I changed the structure to make sure they don't get too big), mostly because of the transparency they provide in seeing exactly what changed. While the app is still in development, it's nice to see whether any code execution has unintended effects.