Accurate READMEs thanks to LLMs
Last week I was browsing a directory in a large codebase and was happy to find a README.md file. It was quite clean and gave useful explanations for what I was planning to do. Then I opened a Python file, and things were clearly out of sync with what the README described. After inspection, it turned out the code had been modified many times over the last few years, while the README had been added by the original author and never touched since. If you are a software engineer, I'm sure you have already lived through this situation multiple times.
Having documentation not in sync with code is a problem as old as software programming.
But guess which technology everybody is talking about these days could solve the problem?
Is it a human reviewer in charge of reviewing each MR, ensuring READMEs are updated, and blaming engineers on Slack?
Of course not, we are talking about LLMs.
They can do an awesome job responding to a basic prompt like:
Write a bullet list of discrepancies between README.md files and software code.
They will warn you about things like:
- Utility commands that are mentioned but do not exist
- Incomplete lists of components (you started documenting some, but forgot to add the new ones)
- Flawed Mermaid diagrams
And many other inconsistencies.
As a first step, if you are a Claude Code user like me, you can add a check-doc command to your Makefile that runs:
claude --model sonnet -p "DOC CHECKING PROMPT"
Then, why not upgrade this to a check in your CI?
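Here is a minimal sketch of what such a CI check could look like in Python. It reuses the same claude command as the Makefile target and the prompt shown earlier. The NO_DISCREPANCIES sentinel and the function names are my own assumptions for illustration, not a Claude Code feature: the prompt simply asks the model to answer with that exact word when everything is in sync, so the script can turn the answer into an exit code.

```python
"""CI doc check: a sketch under stated assumptions, not a definitive implementation."""
import subprocess
import sys

# Assumption: this sentinel is a convention of this script, not a Claude Code feature.
SENTINEL = "NO_DISCREPANCIES"

PROMPT = (
    "Write a bullet list of discrepancies between README.md files and "
    "software code. "
    f"If you find none, answer with exactly {SENTINEL}."
)


def build_command(prompt: str, model: str = "sonnet") -> list[str]:
    """Return the same command line as the Makefile target, as an argv list."""
    return ["claude", "--model", model, "-p", prompt]


def has_discrepancies(report: str) -> bool:
    """True unless the model answered with the agreed sentinel."""
    return report.strip() != SENTINEL


def main() -> int:
    """Run the check and return an exit code suitable for a CI job."""
    result = subprocess.run(
        build_command(PROMPT), capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        # The CLI itself failed (not installed, not authenticated, ...).
        print(result.stderr, file=sys.stderr)
        return result.returncode
    print(result.stdout)
    return 1 if has_discrepancies(result.stdout) else 0


# In the CI job, finish the script with: sys.exit(main())
```

With this shape, an empty or unexpected answer counts as a failure, which is the safer default for a check. Since LLM output is not deterministic, you may prefer to start with a non-blocking job that only prints the report.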
Finally, the README file can become the main source of truth for what the code is doing, and the trusted source of knowledge for humans and LLMs working on the codebase.
By Thomas Martin