Claude Code is one of the most powerful coding tools available, but its ease of use also makes it easy to burn through tokens quickly. Most users hit their usage limits because of inefficient habits, not because the tool itself is expensive. By making three small changes, users can get 3-5x more out of their existing plan.
Optimizing Model Usage
Claude Code offers multiple models, including Opus 4.6 and Sonnet 4.6. Opus 4.6 is the more capable model, while Sonnet 4.6 is faster and more affordable. By default, Claude Code uses Opus for all tasks, which spends the most expensive tokens on routine work. To optimize model usage, users can run the command /model opusplan in their Claude Code session. In this mode, Opus handles only the planning, while Sonnet handles the execution, making the heavy lifting roughly 5x cheaper in tokens.
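The arithmetic behind that saving can be sketched as follows. This is a rough illustration, not a billing formula: the function name `relative_cost` and the planning shares are hypothetical, and the 5x price ratio is simply the article's rough figure rather than a published price.

```python
def relative_cost(planning_share: float, price_ratio: float = 5.0) -> float:
    """Cost of a plan-on-Opus, execute-on-Sonnet session relative to all-Opus.

    planning_share: fraction of tokens handled by Opus (hypothetical figure).
    price_ratio: how many times cheaper Sonnet is per token (assumed ~5x).
    """
    if not 0.0 <= planning_share <= 1.0:
        raise ValueError("planning_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    execution_share = 1.0 - planning_share
    # Planning tokens keep the full Opus price; execution tokens are discounted.
    return planning_share + execution_share / price_ratio


for share in (0.1, 0.2, 0.5):
    print(f"planning share {share:.0%}: {relative_cost(share):.2f}x the all-Opus cost")
```

Under these assumptions, a session where a fifth of the tokens go to planning costs about 0.36x what the same session would cost on Opus alone. The overall saving therefore depends on how much of the work is execution.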
Using Subagents
Every message sent to Claude Code makes the tool re-read the entire chat history, so anything that enters the conversation is paid for again on every later turn. Subagents mitigate this. A subagent runs in its own context window, so users can hand it heavy reading tasks, such as exploring a codebase or researching a library, without bloating the main chat. To use one, users can simply ask Claude Code to use a subagent for the task, and the tool will spin one up automatically. The subagent's own reading still consumes tokens once, but the main conversation carries only the summary it returns rather than everything it read.
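A small sketch shows why keeping heavy reading out of the main chat matters. All numbers here are hypothetical (ten turns of 1,000 tokens, a 50,000-token research task, a 500-token summary), and the model deliberately ignores prompt caching, which in practice lowers the cost of re-read tokens.

```python
def cumulative_input_tokens(turn_sizes: list[int]) -> int:
    """Total input tokens processed when every turn resends the full history."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # the new turn joins the history
        total += history  # the whole history is read again on this turn
    return total


TURNS = 10
TURN_TOKENS = 1_000       # hypothetical size of an ordinary turn
RESEARCH_TOKENS = 50_000  # hypothetical files read during research
SUMMARY_TOKENS = 500      # hypothetical summary returned by a subagent

# Research done in the main chat: the files stay in the history for every later turn.
in_main = cumulative_input_tokens(
    [TURN_TOKENS + RESEARCH_TOKENS] + [TURN_TOKENS] * (TURNS - 1)
)

# Research done by a subagent: read once in its own context, only the summary comes back.
with_subagent = RESEARCH_TOKENS + cumulative_input_tokens(
    [TURN_TOKENS + SUMMARY_TOKENS] + [TURN_TOKENS] * (TURNS - 1)
)

print(f"main chat: {in_main:,} tokens, with subagent: {with_subagent:,} tokens")
```

With these made-up figures the main-chat route processes 555,000 input tokens against 110,000 with a subagent, roughly a 5x difference. The gap widens the longer the conversation continues after the research step.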
Installing the Caveman Plugin
The Caveman plugin is a Claude Code plugin that makes the tool respond in terse, caveman-like language, cutting the tokens spent on responses by up to 65%. The plugin can be installed by telling Claude Code to /install caveman. Once installed, users can activate it by typing /caveman and selecting their desired level of conciseness: lite, full, or ultra.
Combining the Tricks
The three tricks work on different parts of the token bill, so they stack. Splitting work between Opus and Sonnet lowers the cost per token, subagents keep research and exploration out of the main context, and the Caveman plugin shortens every response. Used together, they let users get significantly more work out of the same Claude Code plan without sacrificing the quality of the results.
Conclusion
Claude Code is a powerful tool, but its token usage can quickly add up. The three tricks outlined above can significantly reduce that usage and stretch an existing plan further. Whether you're a seasoned developer or just starting out, they can help you get more done before hitting your limits.