LLM Council

Multi-model LLM council with live dashboard for querying multiple AI models simultaneously and synthesizing consensus

LLM Council is a multi-model orchestration skill for developers and researchers who want to query several AI models simultaneously, compare their responses side by side, and synthesize a consensus answer through anonymous model-to-model voting.

What Is LLM Council

LLM Council coordinates requests across multiple large language models, including Claude, GPT, and Gemini, through a unified AI Gateway integration. Each model receives the same prompt and streams its response back to a live web dashboard using Server-Sent Events. The dashboard renders all responses in parallel, supports light and dark themes, and handles markdown formatting so code blocks, lists, and headings display correctly.

Behind the scenes, a ThreadingHTTPServer manages concurrent connections, allowing multiple model streams to arrive and render without blocking one another. A cache-busting mechanism ensures the browser always loads the latest dashboard assets rather than serving stale files from a previous session. The result is a swarm-style interface where you can watch several AI models reason through the same problem in real time.
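The streaming and cache-busting behavior described above can be sketched in a few lines. These helper names are illustrative, not the skill's actual API:

```python
def sse_event(data, event=None):
    """Serialize one Server-Sent Events message: each payload line becomes
    its own 'data:' field, and a blank line terminates the event."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.extend(f"data: {line}" for line in data.splitlines() or [""])
    lines.append("")  # the blank line that ends the event
    return "\n".join(lines) + "\n"

def cache_busted(asset_path, version):
    """Append a version query parameter so the browser fetches fresh
    dashboard assets instead of reusing a cached copy from a prior session."""
    return f"{asset_path}?v={version}"
```

A dashboard asset request for `cache_busted("/app.js", "abc123")` resolves to `/app.js?v=abc123`, so bumping the version string on each server start defeats stale browser caches.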

Why Use LLM Council

Single-model queries give you one perspective. That perspective may be confident, fluent, and completely wrong. By routing the same question to several models at once, LLM Council surfaces disagreements, highlights areas of consensus, and exposes edge cases that any single model might miss.

The anonymous voting round adds a second layer of quality control. After each model produces its initial response, the council enters a voting phase where models evaluate the other responses without knowing which model generated them. This reduces brand bias and encourages evaluation based on reasoning quality alone. The synthesized consensus that emerges from this process is often more reliable than any individual response.
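The anonymization step can be sketched as relabeling and shuffling responses before the ballots go out. The function below is a minimal illustration, not the skill's internal implementation:

```python
import random

def anonymize_responses(responses, seed=None):
    """Strip model identity and shuffle order before the voting round.

    `responses` maps model id -> response text. Returns (ballots, key):
    ballots label each response "Response A", "Response B", ...; key maps
    each label back to its model id so votes can be scored afterwards.
    """
    rng = random.Random(seed)
    items = list(responses.items())
    rng.shuffle(items)  # randomize order so position leaks nothing
    ballots, key = {}, {}
    for i, (model_id, text) in enumerate(items):
        label = f"Response {chr(ord('A') + i)}"
        ballots[label] = text
        key[label] = model_id
    return ballots, key
```

Only `ballots` is sent to the voting models; `key` stays server-side for tallying.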

For teams building AI-powered products, LLM Council also serves as a benchmarking tool. You can run the same prompt across model versions, compare latency, and identify which model handles specific task types most effectively.

How to Use LLM Council

Start by configuring your AI Gateway credentials and specifying which models to include in the council. A typical configuration looks like this:

```python
COUNCIL_MODELS = [
    {"id": "claude-3-5-sonnet", "provider": "anthropic"},
    {"id": "gpt-4o", "provider": "openai"},
    {"id": "gemini-1.5-pro", "provider": "google"},
]

GATEWAY_BASE_URL = "https://your-gateway-endpoint.example.com"
GATEWAY_API_KEY = "your_api_key_here"
```

Once configured, launch the ThreadingHTTPServer and open the dashboard in your browser:

```python
from http.server import ThreadingHTTPServer
from council_handler import CouncilRequestHandler

server = ThreadingHTTPServer(("localhost", 8080), CouncilRequestHandler)
print("Council dashboard running at http://localhost:8080")
server.serve_forever()
```

Submit a prompt through the dashboard input field. The server fans out the request to all configured models simultaneously. Each model streams tokens back via SSE, and the dashboard appends them to the corresponding response panel as they arrive. When all streams complete, the voting phase begins automatically.
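The fan-out step amounts to issuing all gateway requests concurrently and collecting results as they complete. A minimal sketch, where `query_model` stands in for a hypothetical single-model gateway call:

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(prompt, models, query_model):
    """Send the same prompt to every council model concurrently.

    `query_model(model, prompt)` is a hypothetical callable that performs
    one gateway request. One worker thread per model keeps a slow model
    from blocking the others, mirroring the ThreadingHTTPServer design.
    """
    with ThreadPoolExecutor(max_workers=len(models)) as pool:
        futures = {m["id"]: pool.submit(query_model, m, prompt) for m in models}
        return {model_id: f.result() for model_id, f in futures.items()}
```

In the real skill each worker would stream tokens over SSE rather than return a single string, but the concurrency shape is the same.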

During voting, each model receives the full set of anonymized responses and returns a ranked preference. The skill aggregates these rankings, computes a weighted consensus score, and highlights the top-ranked response. You can also read the synthesized summary, which blends key points from the highest-scoring answers.
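The document does not specify the exact weighting, but a Borda-style count is one plausible way to turn ranked ballots into a consensus score. This is an illustrative assumption, not the skill's documented formula:

```python
def consensus_scores(rankings):
    """Aggregate ranked ballots into a score per response label.

    `rankings` is a list of ballots, each an ordered list of labels
    (best first). With n candidates on a ballot, rank 1 earns n-1
    points, rank 2 earns n-2, and so on. Because points are derived
    per ballot, a shorter ballot (e.g. after a model timeout) still
    aggregates cleanly with the rest.
    """
    scores = {}
    for ballot in rankings:
        n = len(ballot)
        for rank, label in enumerate(ballot):
            scores[label] = scores.get(label, 0) + (n - 1 - rank)
    return scores
```

The top-ranked response is then simply the label with the highest aggregate score.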

When to Use LLM Council

Use LLM Council when the cost of a wrong answer is high. Legal interpretation, medical information summaries, security vulnerability analysis, and financial reasoning all benefit from multi-model validation before any result reaches a user or downstream system.

It is also valuable during prompt engineering. Running a new prompt template through the council immediately reveals how different models interpret ambiguous phrasing, helping you refine instructions before committing to a single-model deployment.

Research workflows gain from LLM Council when you need to survey how current models handle a specific domain. Instead of running sequential experiments, you get parallel results in a single session, with timestamps and response lengths logged automatically for later analysis.

Important Notes

API costs multiply with each model added to the council. A three-model council triples your token spend per query, so set appropriate usage limits in your gateway configuration before running high-volume sessions.
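A back-of-the-envelope spend check makes the multiplier concrete. This assumes uniform per-token pricing across models, which real gateways rarely have, so treat it as a rough estimate rather than billing math:

```python
def council_cost(tokens_per_query, cost_per_1k_tokens, num_models, queries):
    """Rough spend estimate: each model in the council consumes roughly
    the same token budget per query, so cost scales linearly with
    council size and query volume."""
    return tokens_per_query / 1000 * cost_per_1k_tokens * num_models * queries
```

For example, 100 queries at ~2,000 tokens each through a three-model council at $0.01 per 1K tokens comes to roughly $6, versus $2 for a single model.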

SSE connections remain open until the stream completes. Firewalls and reverse proxies that enforce short request timeouts may interrupt streams from slower models. Configure your infrastructure to allow at least 120 seconds for long-form responses.
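One way to apply those settings, assuming an nginx reverse proxy sits in front of the dashboard (the location and upstream here are examples, not part of the skill):

```nginx
location / {
    proxy_pass http://localhost:8080;
    proxy_read_timeout 120s;   # allow slow model streams to finish
    proxy_buffering off;       # deliver SSE tokens as they arrive
}
```

Disabling proxy buffering matters as much as the timeout: with buffering on, tokens can pool in the proxy instead of reaching the dashboard in real time.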

The anonymous voting phase requires all initial responses to complete before it begins. If one model times out, the voting round proceeds with the remaining responses, and the timed-out model is marked as unavailable in the dashboard summary.

FAQ

Q: What is the LLM Council skill in Happycapy?

LLM Council is a Skills module for Happycapy that enables users to query multiple large language models simultaneously and synthesize their responses into a consensus answer.

Q: How does the live dashboard work in the LLM Council skill?

The live dashboard in LLM Council provides a real-time interface to submit queries to several AI models at once and view their individual and combined responses.

Q: Can I use LLM Council to compare outputs from different AI agents?

Yes, LLM Council is designed to let you compare outputs from multiple AI agent models side by side, making it easier to evaluate and synthesize their answers.

Q: Is LLM Council compatible with other Skills in the Happycapy platform?

LLM Council integrates smoothly with other Skills on the Happycapy platform, allowing you to build more complex workflows involving multiple AI agent capabilities.

Q: Where can I find the source code or documentation for LLM Council?

You can access the source code and documentation for the LLM Council skill on the official Happycapy Skills GitHub repository.