Claude Caveman Prompt: How Grunting at an AI Can Cut Your Costs by 75%

It started as a joke on Reddit.

A developer taught Claude to respond like a prehistoric human and reported that output tokens dropped by up to 75%.

The post hit r/ClaudeAI and racked up over 10,000 votes and 400 comments.

Then developers looked at their API bills and stopped laughing.

What Is The Claude Caveman Prompt?

Claude, like most AI models, is trained to sound helpful and warm.

Every response opens with “Sure! I’d be happy to help,” hedges with “you might want to consider,” and closes with “Let me know if you need anything else.”

Those phrases feel polite. Those phrases burn tokens.

Every token costs money on API calls and counts against your context window (the amount of text the model can “remember” during a conversation).

The caveman prompt is a system instruction that strips all of that filler out, leaving only the actual answer.

Less talk = Fewer tokens used = Savings!!!

The model still understands the problem the same way and performs the same internal reasoning. The only thing that changes is how it expresses the answer.

The Caveman Prompt

Here is the version that went viral:

			
Respond like a caveman. Drop articles (a, an, the). 
Drop filler words (just, really, basically). 
Drop pleasantries (sure, certainly, happy to help). 
No preamble. No postamble. Execute first, explain only if asked. 
Keep all technical accuracy. Code blocks unchanged.

		

The caveman skill is installable in one command via skills.sh and works globally across projects.

For Claude.ai users, you can add a version of this to your custom instructions or paste it as a system prompt at the start of any session.

So if we asked Claude to write something to pitch you our services, it would look like this:

You need words. We make words work.

Website not say what you mean? We fix. LinkedIn look empty? We fill. Blog exist but no one read? We fix that too.

Bluemint write. You grow.

Talk to us →https://wa.link/rjjwcd

The Code – 3 Ways You Can Use

Option 1: Minimal system prompt (works anywhere)

Paste this into Claude’s system prompt, custom instructions, or any tool that accepts a system message:

			
Be like a caveman.
No preamble. No sign-off. No filler phrases.
Never narrate what you're about to do.
Max 2 sentences unless asked.
Action first. Explain only if asked.
Drop articles (a, an, the).
Drop pleasantries (sure, certainly, happy to help).
Short synonyms: "big" not "extensive", "fix" not "implement a solution for".

		

Option 2: Julius Brussee’s SKILL.md plugin (Claude Code / Codex)

The Julius Brussee version comes with three compression modes — Normal, Lite, and Ultra — and keeps all code blocks unchanged, error messages quoted exactly, and technical terms intact. Caveman only strips the English wrapper around the facts.

Install with one command:

bash

npx skills add JuliusBrussee/caveman

Option 3: Shawnchee’s universal agent skill (40+ tools)

The caveman skill is installable in one command via skills.sh and works globally across projects.

bash

curl -fsSL skills.sh | sh -s JuliusBrussee/caveman

The skill distills the approach into 10 rules: no filler phrases, execute before explaining, no meta-commentary, no preamble, no postamble, no tool announcements, explain only when needed, let code speak for itself, treat errors as things to fix rather than narrate.

How The Caveman Prompt Evolved

The original Reddit post

An anonymous developer posts on r/ClaudeAI: “Taught Claude to talk like a caveman to use 75% less tokens.” Shows ~180 tokens → ~45 tokens on a web search task.

10,000+ upvotes · 400+ comments
Julius Brussee publishes the GitHub skill

Developer Julius Brussee formalises the idea as a SKILL.md file — JuliusBrussee/caveman.

Includes three compression modes: Normal, Lite, and Ultra. 10 strict rules. Compatible with Claude Code and Codex.0 → 20,000+ stars in ~12 hours on Hacker News
Shawnchee packages a universal agent skill

Developer Shawnchee releases a standalone caveman-skill compatible with Claude Code, Cursor, Windsurf, Copilot, and 40+ other AI agents. Verifies benchmarks with tiktoken: 68% savings on web search, 50% on code edits, 72% on Q&A.

Works across 40+ AI agent tools
Covered by Decrypt, Yahoo Tech, Hackaday

Mainstream tech press picks up the story. Developers begin independently benchmarking. Kuba Guzik publishes a counter-benchmark: real savings land at 14–21% when input tokens are counted, though a 6-line micro-prompt he distilled outperforms the full skill.Independent benchmarks published
Community extensions and variations

Memory compression companion tool released alongside caveman, cutting ~45% of input tokens per session.

Developers share their own stripped-down versions. Called “the most powerful prompt skill of 2026” across X and Threads.

Still active on GitHub today

Example Output Changes

Task: “Why is my app slow?”

Default Claude ~140 tokens	Caveman mode ~22 tokens
“The performance issues you’re experiencing are likely caused by the fact that your application is making multiple redundant API calls on each page load. I’d recommend first profiling your app to identify the specific bottlenecks before implementing any optimizations.”	“Redundant API calls on page load. Profile first. Then fix bottlenecks.”

Task: Auth middleware bug fix

Default Claude ~180 tokens	Caveman mode ~45 tokens
“Sure! I’d be happy to help you with that. The issue you’re experiencing is most likely caused by your authentication middleware not properly validating the token expiry. Let me take a look and suggest a fix for you.”	“Bug: auth middleware. Token expiry not checked. Fix: add expiry validation before granting access.”

Do We Really Save So Many Tokens?

The 75% headline gets complicated under scrutiny.

Benchmarks in Shawnchee’s repo, verified with tiktoken, show output token reductions of 68% on web search tasks, 50% on code edits, and 72% on Q&A — for an average of 61% across four standard tasks. Decrypt
Developer Kuba Guzik ran 72 independent test runs across Claude Sonnet and Opus. His real-world savings landed at 14–21% when the full session context is counted — because input tokens (conversation history, system prompt, files) typically dwarf output tokens. Medium
More surprising: a six-line micro prompt he distilled from the original — just 85 tokens instead of the skill’s 552 — outperformed the full skill on both models, with zero quality loss across all 72 runs. DEV Community

The honest picture:

Output-only savings are real and significant.
Session-level savings are smaller but still meaningful, especially for agentic workflows with dozens of turns.

When to use it — and when not to

Use caveman mode when:

Running automated pipelines with hundreds of API calls
Doing code reviews, debugging sessions, or quick Q&A
Hitting Claude Pro usage limits during a heavy session
Working solo or with technical teammates who don’t need hand-holding

Skip it when:

Writing for clients, users, or non-technical readers
Producing step-by-step explanations or tutorials
Doing complex reasoning tasks where full context matters
Feeding the model its own prompts in caveman style — that can spiral into a “garbage in, garbage out” situation.

To Caveman, or Not To Caveman

A handful of researchers argued that forcing an AI into a less sophisticated persona could hurt its reasoning quality. That verbal constraints might bleed into cognitive ones. The concern has not been definitively settled.

What to do right now

If you use Claude.ai: Go to Settings → Custom Instructions. Add the minimal prompt above. Test it on your next five tasks.
If you use Claude Code: Run npx skills add JuliusBrussee/caveman. Three modes available — start with Normal.
If you build on the Claude API: Add a concise system-prompt constraint before going full caveman. A simple “be concise, return JSON” already handles 60% of the savings — caveman adds another 14–21% on top.

Measure before and after. The numbers will tell you whether it’s worth keeping.

Latest posts

A Brand Is A Gut Feeling

Marty Neumeier spent decades working with Apple, Netscape, and HP before writing the most cited definition of a brand in modern history. Here is what he said — and why…

Content

·

April 30, 2026
7 Marketing Mistakes Founders Make Before Series A – But You Will Not

You raised your seed round on an idea, early signals, and someone who believed in you enough to write a cheque. Now you are building. Hiring. Firefighting. And marketing takes…

Content

·

April 15, 2026
Claude Caveman Prompt: How Grunting at an AI Can Cut Your Costs by 75%

It started as a joke on Reddit. A developer taught Claude to respond like a prehistoric human and reported that output tokens dropped by up to 75%. The post hit…

AI

·

April 13, 2026

Claude Caveman Prompt: How Grunting at an AI Can Cut Your Costs by 75%

What Is The Claude Caveman Prompt?

The Caveman Prompt

The Code – 3 Ways You Can Use

How The Caveman Prompt Evolved

Example Output Changes

Do We Really Save So Many Tokens?

When to use it — and when not to

To Caveman, or Not To Caveman

What to do right now

Share this:

Latest posts

A Brand Is A Gut Feeling

7 Marketing Mistakes Founders Make Before Series A – But You Will Not

Claude Caveman Prompt: How Grunting at an AI Can Cut Your Costs by 75%