June 16, 2026 · 9 min read · WinClaw

A Simple Metric for Whether You Are Using AI Well

How much of the context window a single request consumes reveals whether you still use AI as a Q&A tool or have started delegating real, complex work to it.

Tags: AI, Codex, Context Window, Coding Agent, AI Workflow

[Image: AI coding agent navigating a large context map]

Recently, I have been thinking about a simple but useful signal for whether someone is actually using AI well:

How much context window does one request consume?

Before explaining that, we need to look at the real world AI is now entering.

Facing the Real World

A real project is never just a few snippets of code. It has directory structure, historical baggage, hidden constraints, testing habits, deployment conventions, naming traditions, and boundaries that nobody wants to touch. If AI is to truly work inside such a project, it cannot rely on the twenty lines of error message you pasted into the chat. It has to build a map of the project first. It needs to know where the entry points are, where the core logic lives, where the tests are, where the configuration is, and which pieces of legacy debt should not be touched casually.

So for a complex project, AI will often consume a large amount of context before it edits a single file. In many of my own projects, it is normal to spend 100K tokens of context before the first actual code change appears.

Understanding a project has always been expensive. In the past, humans paid that cost. When a new engineer took over an old project, they might spend days or weeks reading code, asking questions, setting up the environment, and digging through old documentation. We would not call that "wasting time before writing code," because we know complex systems should not be changed blindly.

AI is the same. The difference is that AI compresses this process into a conversation window. For the first time, you can see very directly that understanding a project was expensive all along.

And then there is verification.

Writing code can be fast. But verification is also expensive. AI may need to write tests, run tests, start the service, and validate behavior in the actual environment. If it is a web application, AI may even open the website and test the feature it just implemented.

The size of the context window used by a request reflects two things: the complexity of the task itself, and whether the request you gave AI is close enough to real work.

If AI Can Do More Complex Work, You Should Give It Larger-Grained Requests

As context windows grow, AI can access more context and execute more complex actions. That means people using AI should change how they make requests. Instead of feeding it fragmented questions, they should give it larger-grained goals.

In the past, you could only ask AI to look at one piece of code or answer one local question, because it could not hold much context and could not reliably complete long chains of work. That is changing. With larger context windows, you can give it project background, business goals, key constraints, validation methods, historical code, and testing conventions together, then ask it to work toward a more complete outcome.

So if someone has stayed at a tiny request size for a long time, the limitation may no longer be the tool.

Many people still use AI at a very small grain. For example:

  • write a function for me;
  • explain this error;
  • convert this code to TypeScript;
  • name this variable;
  • generate a regular expression.

These are useful. They save time and reduce repetitive work. But this usage still treats AI as a smarter search engine, code completion tool, or syntax assistant.

The real change is somewhere else.

The real question is whether you can delegate a complete goal.

Not "write a piece of code," but "make this feature verifiable."
Not "look at this error," but "find the cause, fix it, add tests, and run the checks."
Not "polish this paragraph," but "turn this argument into a publishable article and place it in the project structure."
Not "tell me how to use this library," but "choose the least invasive approach for the current project and implement it."

Once the request grain becomes larger, the context window naturally becomes larger too.

Because AI is no longer only answering. It is working. Work requires background, constraints, process, validation, and deliverables. It needs to read the project, compare options, keep intermediate judgments, retry after failure, and explain the final result to you.

So when one request uses a large context window, that is often a good sign. It means you are starting to move AI from a question-answering tool to a task agent.

If You Always Use Tiny Windows, It May Be Time to Reflect

This may sound a little sharp, but I think it is worth saying:

If someone uses AI for a long time and every request always consumes a tiny context window, that may not mean they are efficient. It may mean they have not learned how to use AI for larger work yet.

Tiny-window usage feels comfortable.

You ask one question, it gives one answer. You feel in control. The risk is low. The result is fast. But there is a problem: the human remains the bottleneck of the entire process.

You split the work. You find the files. You assemble the context. You decide the next step. You merge the result. AI is only a small tool called repeatedly along the way. It helps locally, but it never really enters the full workflow.

It is like having a team that can work continuously, but only asking them to pass a sheet of paper, fix a typo, or look up a word. Each action is fast, but the leverage never opens up.

The most important AI-era ability is not getting AI to give shorter answers. It is getting AI to run reliably inside a larger goal.

That requires people to change how they express requests.

You cannot only say "help me change this." You need to explain:

  • what the background is;
  • what success means;
  • what must not be changed;
  • what checks can be run;
  • whether AI should keep trying or stop when something fails;
  • what final deliverable should be produced;
  • what process should be recorded so the next run can continue.
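One way to make this checklist concrete is to treat it as a small structure that you fill in before delegating. The following is only a minimal Python sketch under my own assumptions: the `TaskBrief` class, its field names, the default failure policy, and the layout produced by `render` are all hypothetical and not tied to any particular tool.

```python
from dataclasses import dataclass, field


def _bullets(items: list[str]) -> str:
    """Render a list as plain-text bullets, or a marker when it is empty."""
    return "\n".join(f"- {item}" for item in items) or "- (not specified)"


@dataclass
class TaskBrief:
    """Hypothetical container with one field per item in the checklist above."""
    background: str
    success_criteria: list[str]
    do_not_change: list[str] = field(default_factory=list)
    checks: list[str] = field(default_factory=list)
    on_failure: str = "stop and report"  # assumed default; could be "keep trying"
    deliverable: str = ""
    progress_log: str = ""  # where notes for the next run are kept

    def missing(self) -> list[str]:
        """Names of fields left empty, so gaps show up before delegating."""
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        """Format the brief as plain text that can be pasted into a request."""
        sections = [
            ("Background", self.background or "(not specified)"),
            ("Success means", _bullets(self.success_criteria)),
            ("Must not change", _bullets(self.do_not_change)),
            ("Checks to run", _bullets(self.checks)),
            ("If something fails", self.on_failure or "(not specified)"),
            ("Final deliverable", self.deliverable or "(not specified)"),
            ("Progress notes", self.progress_log or "(not specified)"),
        ]
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


# Illustrative values only; the project described here is made up.
brief = TaskBrief(
    background="Legacy billing service; totals are computed in one module.",
    success_criteria=["Invoice totals match the existing test fixtures."],
    checks=["Run the unit test suite."],
)
print(brief.missing())  # names of the fields that are still empty
print(brief.render())
```

The `missing` helper is the useful part: it lists the items you have not yet thought about, which is usually where a delegated task goes wrong.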

This is not just prompt technique. It is a mix of management ability, product thinking, and engineering judgment.

Whether someone can write larger AI tasks, keep AI on track across longer context, and validate results across files, steps, and tools is becoming a real dividing line.

AI Usage Is Moving from Prompting to Delegation

People used to talk about AI usage mostly as prompt writing.

Prompts matter. But prompts are only the entrance. The deeper capability is delegation.

Delegation means you are not asking for an answer. You are handing off a task. You need to define the goal, provide boundaries, allow exploration, set validation criteria, and make decisions at key checkpoints.

This is similar to management, but not exactly the same.

When you delegate to a human teammate, that person has organizational experience, business intuition, and a sense of responsibility. When you delegate to AI, it has execution power, patience, and speed, but it has no real sense of ownership and no built-in notion of where its responsibility ends. So you must make the boundary clearer, the validation more concrete, and the context easier to work with.

That is why a new ability is emerging: AI task design.

This ability includes:

  • turning vague ideas into executable goals;
  • splitting large requests into stages that can be resumed;
  • deciding which context must be provided and which context AI can inspect by itself;
  • allowing AI to spend enough window on exploration;
  • asking AI to form a plan after exploration;
  • using tests, previews, logs, and documents as validation tools;
  • keeping continuity when the context window rolls over.
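The last item, keeping continuity when the context window rolls over, is the easiest one to sketch. One possible approach, assumed here rather than taken from any specific agent, is to have the run maintain a small handoff note that a fresh session can read back. The `HandoffNote` class, its fields, and the stage names below are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class HandoffNote:
    """Hypothetical record of where a staged task stands."""
    goal: str
    completed_stages: list[str] = field(default_factory=list)
    remaining_stages: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)

    def finish_stage(self, stage: str, decision: str = "") -> None:
        """Move a stage from remaining to completed, optionally noting a decision.

        Raises ValueError if the stage is not in remaining_stages.
        """
        self.remaining_stages.remove(stage)
        self.completed_stages.append(stage)
        if decision:
            self.decisions.append(decision)

    def to_json(self) -> str:
        """Serialize the note so it can be saved to a file or pasted into a new session."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "HandoffNote":
        """Rebuild the note from text produced by to_json."""
        return cls(**json.loads(text))


# Illustrative values only.
note = HandoffNote(
    goal="Make the export feature verifiable",
    remaining_stages=["explore", "plan", "implement", "verify"],
)
note.finish_stage("explore", decision="Reuse the existing serializer module.")

saved = note.to_json()                    # what the next run would be given
restored = HandoffNote.from_json(saved)   # what the next run reconstructs
print(restored.remaining_stages)
```

Recording decisions alongside stages matters because a new session can rediscover the code, but not the reasons an earlier session ruled an option out.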

If someone has this ability, their single-request size will naturally become larger.

Because they are no longer satisfied with having AI fill in a small piece. They let AI take responsibility for a complete unit of work.

That is the difference between "using AI well" and "asking AI often."

Context Window Usage Can Become a Self-Observation Metric

I would not turn single-request context usage into an absolute KPI.

Different projects, tasks, tools, and models are not directly comparable. A simple task completed in a small context window is perfectly reasonable. Making a simple thing complicated just to fill the window is also inefficient.

Context usage is not the cause. It is the result.

But because it is the result, it can be a useful self-observation metric.
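If you want to look at this signal with numbers rather than impressions, a rough sketch like the following may help. It assumes you can obtain, from whatever tool you use, the tokens consumed by each request and the size of the window; the `RequestRecord` fields, the `summarize` function, and the sample figures are all hypothetical. It deliberately pairs window usage with whether the request produced something usable, since usage alone says nothing.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class RequestRecord:
    """Hypothetical log entry for one AI request."""
    tokens_used: int            # context consumed by the request
    window_size: int            # maximum context available to the model
    produced_deliverable: bool  # did it end in merged code, a document, verified results?


def summarize(records: list[RequestRecord]) -> dict[str, float]:
    """Return the median share of the window used and the fraction of requests that delivered."""
    if not records:
        raise ValueError("no requests recorded")
    shares = [r.tokens_used / r.window_size for r in records]
    delivered = sum(r.produced_deliverable for r in records)
    return {
        "median_window_share": median(shares),
        "delivered_fraction": delivered / len(records),
    }


# Made-up sample: one small question and two larger delegated tasks.
records = [
    RequestRecord(tokens_used=2_000, window_size=200_000, produced_deliverable=False),
    RequestRecord(tokens_used=100_000, window_size=200_000, produced_deliverable=True),
    RequestRecord(tokens_used=150_000, window_size=200_000, produced_deliverable=True),
]
summary = summarize(records)
print(summary)
```

A high window share with a low delivered fraction would suggest context being burned without results, which is as much a warning sign as a consistently tiny share.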

You can ask yourself:

Are my recent AI requests still local questions?
Have I ever handed AI a complete workflow and let it finish?
Have I let AI read the project, plan, implement, verify, and summarize?
Have I written a goal that can support a long-running task?
Did the context I consumed turn into mergeable code, a publishable article, reusable documentation, or verified results?
Am I still manually breaking everything into tiny pieces so AI can only do one small part?

If your answers keep pointing to small, fragmented requests for a long time, then it is worth reflecting.

Not on whether you know how to write prompts, but on whether you have upgraded your working style.

AI is no longer only a tool for answering questions. It is becoming an execution system capable of taking on complex tasks. If you keep feeding it tiny questions, it will still answer them. But your output ceiling remains locked by your own task granularity.

Closing

Single-request context usage is not a simple efficiency metric.

It is a shadow metric. Behind it are project complexity, request granularity, context organization ability, delegation ability, and validation ability.

In complex projects, it is normal for AI to consume a large amount of context during exploration. What matters is whether it can turn that context into effective action. The larger the request grain, the more context AI needs. The stronger the delegation ability, the more likely one request can cross from "answering" into "working."

That is why I increasingly see context window usage as one useful signal for whether someone is using AI well.

If someone can consistently let AI move forward inside large context windows, they have started using AI to handle real complexity.

If someone always uses tiny windows and only asks for fragmented answers, they may still be stuck in the habits of the previous generation of tools.

The AI-era gap is not just a model gap.

The larger gap is whether a person can define a large enough request, organize enough clear context, and confidently delegate a real task.

A Simple Metric for Whether You Are Using AI Well | Hailin Zhu