My LLM usage
Since September, I have been using Claude Code regularly.
It is still too early for a comprehensive analysis of an optimal workflow but, given the maturity of the ecosystem, it seems like a good time to take a snapshot of my current workflow.
I have two types of usage:
- Professional:
  - As a code reviewer: before pushing my code, I ask for an in-depth review of my patch, looking for potential improvements. Most of the time it makes many interesting comments which improve my deliveries (my coworkers noticed it).
  - As a boilerplate generator (e.g. CLI scaffolding, or SQL): it gives a first draft, but 90% of the time I have to heavily rework it.
  - As a debugger: it does not always work (maybe 30-40% of the time), but when it does, it saves me a lot of time.
- Personal:
  - As a project bootstrap tool: when the project is mainstream (e.g. a CRUD API, or nothing too technical), the result is good enough for an MVP.
  - Having it self-review its output, which greatly increases the quality.
  - Large refactoring sessions, once I got lazy: it mostly works for simple operations (e.g. enforcing a new style, renaming), but it still tends to make autonomous decisions, adding or removing more code than I expect.
  - I only use it for utilities that have been in my backlog for a long time, and I do not publish them.
Regarding prompts, I rely on the plan mode (a two-step process: first, generate a detailed plan for the task, then let it run, integrating small steps regularly).
There are many styles of prompting (e.g. C.R.A.F.T., this one, this one).
According to some studies (which, of course, I cannot find again), prompts should be short to be efficient. Mine are 2-3 lines long, containing the context (relevant files), the target, the feedback criteria (how to compile and test), and the methodology (test and commit regularly, etc.).
It usually looks like this:
I want to add a "read later" mechanism on the bottom right of each article, you might be interested by FILE0, FILE1, FILE2, and FILE3. Compile regularly with `cabal build -j`, and, once you are done with a step, test with `cabal test` and commit. Ultrathink, and give me a detailed, comprehensive, step-by-step plan.
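As a minimal sketch, the prompt anatomy described above (context, target, feedback criteria, methodology) could be templated like this; the file names and wording are illustrative, not my actual setup:

```python
def build_prompt(context_files, target, feedback, methodology):
    """Assemble a short prompt from the four parts: context, target,
    feedback criteria, and methodology (illustrative sketch only)."""
    files = ", ".join(context_files)
    return (
        f"{target} You might be interested by {files}. "
        f"{feedback} {methodology} "
        "Ultrathink, and give me a detailed, comprehensive, step-by-step plan."
    )

# Hypothetical file names and wording, for illustration:
prompt = build_prompt(
    context_files=["src/Article.hs", "src/Render.hs"],
    target='I want to add a "read later" mechanism on each article.',
    feedback="Compile regularly with `cabal build -j`, and test with `cabal test`.",
    methodology="Commit after each step.",
)
print(prompt)
```

The point is not the helper itself, but that every prompt carries the same four ingredients, which keeps them short and repeatable.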
I iterate on the plan, then I let it roll.
Note: even though I let it work autonomously, I monitor it so it does not loop when it encounters a difficulty.
Then, once it says it has no work left to do:
As an experienced code reviewer and seasoned engineer, review the work, looking for potential issues, defects, and improvements.
I tend to micro-manage it, while there are plenty of alternatives which give more autonomy to LLMs (e.g. spec-kit, or Agents).
It gives good enough results, while avoiding hitting the usage limits too often.
I have tested other LLMs and modes:
- Claude Code Cloud: it produces a lot of code, and it seems correct, but, without any feedback loop, it hallucinates quickly
- Gemini CLI: the code is okay-ish but, at some point, it just ignores my instructions and stops getting feedback, producing broken code at the speed of light
- OpenAI Codex/ChatGPT: each time I ask for something, the code is malformed (non-existing options, libraries, functions, etc.) or includes a lot of dead code
I do not think I could make productive use of LLMs without a software engineering background: most of the time, they seem unable to make the appropriate design decisions, and in some situations they deviate completely.
To give a concrete example: I was trying to build a Rust SOCKS WireGuard proxy. At the beginning, I had picked two incompatible libraries; since the LLM was unable to proceed, it just changed the scope.
Later on, it found out that one of the libraries had a bug which made two features mutually exclusive; instead of stopping, it just kept adding and removing the same lines of code, "hoping" it would work.
Most of the time, I'm skeptical about the generated code, so I do a very in-depth code review, but, from time to time, it is surprisingly clever (at some point, I asked it to rename my project, and it was able to infer the origin of the new name, come up with a catchphrase, and propose a plan for the next changes).
It changed my practice in two ways:
- I take more care during code reviews
- I focus more on product ownership