Claude’s New Council Skill Turns AI Debates Into Decisions

Claude’s default agreeableness can lead to dangerously one-sided answers. A new skill, inspired by Andrej Karpathy’s LLM Council, spins up five distinct AI advisors to argue, anonymously peer-review, and deliver a concrete verdict. Installation takes 10 seconds, and the process works entirely within Claude Code—no external APIs or multiple models needed. For high-stakes decisions, it’s a way to surface blind spots before committing.

What the LLM Council Skill Does

The LLM Council Skill is a free plugin for Claude Code that replaces a single AI’s answer with a structured debate among five specialized advisors. Instead of asking Claude a question once and accepting its response, the skill:

  1. Spins up five distinct AI agents, each with a unique thinking style.
  2. Collects their independent responses to your question.
  3. Anonymizes and shuffles the responses so reviewers don’t know which advisor wrote which.
  4. Has each advisor peer-review all five responses, answering three specific questions:
    • Which response is the strongest?
    • Which response is the weakest?
    • What critical point is missing from all responses?
  5. Passes all responses and reviews to a final "chairman" agent, which synthesizes a verdict with a concrete next step.

The result is not a list of pros and cons but a single, actionable answer that draws on perspectives a single pass through Claude would likely have missed.
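The five steps above can be sketched as a plain orchestration loop. This is a minimal illustration, not the skill's actual code: `ask_advisor` and `peer_review` are hypothetical stand-ins for Claude Code sub-agent calls.

```python
import random

ROLES = ["Optimist", "Skeptic", "Strategist", "Tactician", "Ethicist"]

def ask_advisor(role: str, question: str) -> str:
    # Placeholder for a Claude Code sub-agent invocation.
    return f"[{role}] draft answer to: {question}"

def peer_review(role: str, labeled: dict) -> dict:
    # Each advisor judges the anonymized set, answering the three
    # review questions (stub picks for illustration).
    return {
        "strongest": min(labeled),
        "weakest": max(labeled),
        "missing": f"{role}: a point all five responses overlooked",
    }

def run_council(question: str) -> dict:
    # 1-2. Five independent responses from five distinct advisors.
    responses = {role: ask_advisor(role, question) for role in ROLES}
    # 3. Shuffle and relabel A-E so reviewers can't identify authors.
    shuffled = random.sample(list(responses.values()), k=len(responses))
    labeled = dict(zip("ABCDE", shuffled))
    # 4. Every advisor reviews all five anonymized responses.
    reviews = [peer_review(role, labeled) for role in ROLES]
    # 5. The chairman sees everything and must commit to a verdict.
    return {"responses": labeled, "reviews": reviews,
            "verdict": "one recommendation plus a concrete first step"}

result = run_council("Should we launch in Q3?")
```

The key property is that step 3 breaks the link between advisor identity and response before any reviewing happens.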

How It Works Under the Hood

The skill doesn’t rely on multiple AI models or external APIs. Instead, it uses Claude Code’s sub-agent system to simulate five distinct advisors, each with a predefined role:

  • The Optimist: Focuses on opportunities, upside, and feasibility.
  • The Skeptic: Highlights risks, edge cases, and potential failures.
  • The Strategist: Evaluates long-term alignment with goals and systems.
  • The Tactician: Prioritizes immediate execution and logistics.
  • The Ethicist: Flags ethical, legal, or reputational concerns.

These roles are hardcoded into the skill’s prompts, ensuring the advisors don’t converge on the same answer. After generating their responses, the skill anonymizes them by assigning random letters (A through E) to each. Reviewers then evaluate the responses purely on merit, without knowing which advisor wrote which.
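The anonymization step is easy to picture: shuffle the advisor order, then relabel. A minimal sketch, with illustrative names rather than the skill's real internals:

```python
import random

def anonymize(responses: dict) -> tuple:
    """Shuffle advisor responses and relabel them A through E.

    Returns the labeled responses plus a private letter->advisor key
    that only the orchestrator keeps; reviewers never see it.
    """
    advisors = random.sample(list(responses), k=len(responses))
    labeled = {letter: responses[a] for letter, a in zip("ABCDE", advisors)}
    key = dict(zip("ABCDE", advisors))
    return labeled, key

drafts = {r: f"{r}'s take" for r in
          ["Optimist", "Skeptic", "Strategist", "Tactician", "Ethicist"]}
labeled, key = anonymize(drafts)
```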

The chairman agent, which produces the final verdict, has access to all five responses and all peer-review feedback. Its output is designed to be decisive: not "it depends" or "consider both sides," but a clear recommendation with a first step to act on.
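That decisiveness can be enforced at the prompt level. The wording below is an assumption about how such a prompt might be assembled, not the skill's actual chairman prompt:

```python
def chairman_prompt(labeled: dict, reviews: list) -> str:
    # Bundle the anonymized responses and peer reviews into one prompt
    # that forbids a hedged, "it depends" answer.
    body = "\n\n".join(f"Response {k}:\n{v}" for k, v in labeled.items())
    notes = "\n".join(str(r) for r in reviews)
    return (
        "You are the council chairman. Read the five responses and the "
        "peer reviews, then give ONE clear recommendation and ONE "
        "concrete first step. Do not answer 'it depends' or "
        "'consider both sides'.\n\n"
        f"{body}\n\nPeer reviews:\n{notes}"
    )

prompt = chairman_prompt({"A": "Launch now."}, [{"strongest": "A"}])
```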

How to Install and Use It

The skill can be installed in two ways:

  1. One-line install (recommended): Open Claude Code and type:

    install this skill for me: https://github.com/tenfoldmarc/llm-council-skill
    

    Claude Code will handle the rest.

  2. Manual install: Run this command in your terminal:

    pip install git+https://github.com/tenfoldmarc/llm-council-skill.git
    

    Then open Claude Code and invoke the skill with:

    council this: [your question]
    

The skill works best for decisions where:

  • The cost of being wrong is high (e.g., business strategy, product launches, hiring).
  • You’ve gone back and forth on the answer multiple times.
  • You suspect your own framing might be biasing the outcome.

It’s not designed for low-stakes questions (e.g., "What should I eat for lunch?") or situations where you already know the answer and just want validation.

Tradeoffs and Limitations

The LLM Council Skill has clear advantages but also some constraints:

Pros:

  • Reduces single-perspective bias: By forcing five distinct personas to weigh in, it surfaces blind spots Claude might miss in a single pass.
  • No external dependencies: Runs entirely within Claude Code, so no API keys, rate limits, or additional costs.
  • Fast and lightweight: The entire process completes in one session, typically in under a minute.
  • Free and open-source: No paywall or subscription required.

Cons:

  • Not for simple questions: Overkill for straightforward queries where a single answer suffices.
  • Still AI-dependent: The quality of the verdict depends on Claude’s underlying capabilities. Garbage in, garbage out.
  • No human oversight: The peer review is entirely AI-driven. For critical decisions, human review is still recommended.
  • Limited to Claude Code: Won’t work with other AI platforms or standalone Claude interfaces.

When to Use It (and When to Skip It)

Use the LLM Council Skill when:

  • You’re making a high-stakes decision (e.g., launching a product, entering a new market, hiring a key role).
  • You’ve asked Claude the same question multiple times and gotten inconsistent answers.
  • You suspect your own framing or assumptions are shaping the response.
  • You need a concrete next step, not just a list of pros and cons.

Skip it when:

  • The question is low-stakes or has a clear, objective answer (e.g., "What’s the capital of France?").
  • You’re looking for validation of a decision you’ve already made.
  • You need a quick answer and don’t have time for the full debate process.
  • You’re working outside of Claude Code.

Bottom Line

The LLM Council Skill is a practical way to turn Claude’s agreeableness from a weakness into a strength. By simulating a structured debate among five distinct advisors, it surfaces perspectives and blind spots that a single AI might miss. Installation is trivial, and the skill requires no external tools or APIs. For high-stakes decisions, it’s a free and fast way to stress-test your thinking—provided you’re open to answers you might not want to hear.
