# Guidance

Automate and integrate Guidance-based prompt control into your AI workflows

Guidance is a community skill for structured language model output generation built on the Guidance library. It covers constrained decoding, template composition, grammar enforcement, function-calling patterns, and output validation for reliable LLM applications.

## What Is This?

### Overview

Guidance provides tools for controlling language model outputs through structured templates and constrained generation. It covers:

- **Constrained decoding**, which restricts token generation to valid options at each step so outputs match specified patterns.
- **Template composition**, which builds reusable prompt templates with typed generation slots for structured outputs.
- **Grammar enforcement**, which applies context-free grammars to guarantee syntactically valid JSON, XML, or custom formats.
- **Function-calling patterns**, which interleave model generation with tool execution for agentic workflows.
- **Output validation**, which verifies generated content meets schema requirements before downstream processing.

The skill enables developers to build LLM applications that produce consistently structured and valid outputs.

### Who Should Use This

This skill serves AI application developers needing reliable structured outputs from language models, prompt engineers building complex generation pipelines, and backend teams integrating LLM outputs into typed data systems.

## Why Use It?

### Problems It Solves

Free-form language model outputs fail to parse as valid JSON or structured data in a significant fraction of requests. Retry-based approaches to getting valid outputs waste tokens and increase latency. Prompt engineering alone cannot guarantee output format compliance, especially for complex nested structures. Function calling requires careful orchestration between generation and tool execution steps.
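To make the contrast concrete, here is a toy, model-free sketch of the constrained-decoding idea (the vocabulary, option list, and greedy token choice are all illustrative, not the library's internals): at each step only tokens that keep the output a prefix of some valid option are permitted, so the result is valid by construction rather than by retry.

```python
# Illustrative option set and token vocabulary (not a real tokenizer).
OPTIONS = ["person", "organization", "location", "product"]
VOCAB = ["per", "son", "org", "anization", "loc", "ation", "pro", "duct", "cat"]

def allowed_tokens(prefix):
    # A token is permitted only if prefix + token is still a prefix
    # of at least one allowed option.
    return [t for t in VOCAB if any(o.startswith(prefix + t) for o in OPTIONS)]

# Greedy stand-in for sampling: take the first permitted token each step.
# A real constraint engine would instead mask the model's logits.
out = ""
while out not in OPTIONS:
    out += allowed_tokens(out)[0]
print(out)  # always one of OPTIONS, never a malformed string
```

Because invalid continuations are never sampled in the first place, no retry loop is needed.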

### Core Highlights

Constraint engine restricts generation to valid tokens at each decoding step. Template builder composes structured prompts with typed generation slots. Grammar enforcer guarantees outputs match specified context-free grammars. Function router interleaves generation with tool calls in a controlled execution flow.
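The function-router pattern can be sketched without the library as a plain loop that alternates generation and tool execution (the `stub_model`, `get_weather` tool, and `CALL` protocol below are all hypothetical stand-ins, not Guidance APIs):

```python
import json

def get_weather(city):
    # Hypothetical tool; a real workflow would call an external API here.
    return f"22C and sunny in {city}"

TOOLS = {"get_weather": get_weather}

def stub_model(prompt):
    # Stand-in for constrained generation: request a tool once,
    # then answer using the tool result appended to the prompt.
    if "TOOL_RESULT" not in prompt:
        return 'CALL {"tool": "get_weather", "args": {"city": "Paris"}}'
    return "It is 22C and sunny in Paris."

def run(prompt):
    # The router loop: generate, execute any requested tool,
    # feed the result back, and repeat until a final answer appears.
    while True:
        step = stub_model(prompt)
        if step.startswith("CALL "):
            call = json.loads(step[len("CALL "):])
            result = TOOLS[call["tool"]](**call["args"])
            prompt += f"\nTOOL_RESULT: {result}"
        else:
            return step

print(run("What is the weather in Paris?"))
```

In Guidance the same interleaving happens inside a template, with the tool-call format itself enforced by constraints instead of parsed optimistically.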

## How to Use It?

### Basic Usage

```python
import guidance
from guidance import gen, select

@guidance
def extract_entity(lm, text: str):
    lm += f'Extract entity """\n{text}\n"""\n'
    # gen() captures free text into a named slot; stop ends the field at a newline.
    # Fields are appended outside f-string expressions, which may not contain
    # backslashes before Python 3.12.
    lm += 'Name: ' + gen(name="name", stop="\n") + '\n'
    # select() constrains decoding to exactly one of the listed options.
    lm += 'Type: ' + select(
        ["person", "organization", "location", "product"], name="type") + '\n'
    # regex constrains the format, e.g. 0.95 (single backslash: r"...\....").
    lm += 'Confidence: ' + gen(
        name="confidence", regex=r"[0-9]\.[0-9]{2}", stop="\n") + '\n'
    return lm

model = guidance.models.LlamaCpp('model.gguf')
result = model + extract_entity('OpenAI released GPT-4 in 2023')
print(result['name'])
print(result['type'])
```

### Real-World Examples

```python
import guidance
from guidance import json as gen_json  # aliased to avoid shadowing stdlib json

# Defining the schema as a plain dict outside the template avoids
# brace-escaping problems inside f-strings.
TASK_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "priority": {"type": "integer", "minimum": 1, "maximum": 5},
    },
    "required": ["title", "tags", "priority"],
}

@guidance
def generate_json(lm, description: str):
    lm += f'Generate a JSON object for: {description}\n'
    lm += '```json\n'
    # json() constrains decoding so the captured output matches the schema.
    lm += gen_json(name="output", schema=TASK_SCHEMA)
    lm += '\n```'
    return lm

model = guidance.models.LlamaCpp('model.gguf')
result = model + generate_json('bug fix for login')
data = result['output']
```
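Constrained generation pairs naturally with a final validation pass before downstream processing. A minimal hand-rolled checker for the title/tags/priority schema above (illustrative only; a real pipeline might use a full JSON Schema validator instead) could look like:

```python
def validate_task(obj):
    """Check a decoded object against the title/tags/priority schema by hand."""
    errors = []
    if not isinstance(obj.get("title"), str):
        errors.append("title must be a string")
    tags = obj.get("tags")
    if not (isinstance(tags, list) and all(isinstance(t, str) for t in tags)):
        errors.append("tags must be a list of strings")
    p = obj.get("priority")
    # Exclude bool: isinstance(True, int) is True in Python.
    if not (isinstance(p, int) and not isinstance(p, bool) and 1 <= p <= 5):
        errors.append("priority must be an integer between 1 and 5")
    return errors

print(validate_task({"title": "bug fix for login",
                     "tags": ["auth", "backend"], "priority": 2}))  # []
print(validate_task({"title": "x", "tags": "auth", "priority": 9}))
```

With grammar enforcement upstream, this check should never fail, which makes it a cheap safety net rather than a retry trigger.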


### Advanced Tips

Chain multiple guidance functions together to build multi-step generation pipelines where earlier outputs inform later generation prompts. Use select with computed options to create dynamic constrained choices based on runtime context. Combine regex constraints with stop tokens for fields that need both format validation and length control.
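The regex-plus-stop combination amounts to enforcing both a format and a length bound on a field. A small validation sketch of that joint constraint (the `check_field` helper is hypothetical, shown only to illustrate what the two constraints enforce together):

```python
import re

def check_field(value, pattern, max_len):
    # A field passes only if it satisfies the format regex AND
    # stays within the length bound a stop token would impose.
    return len(value) <= max_len and re.fullmatch(pattern, value) is not None

print(check_field("0.95", r"[0-9]\.[0-9]{2}", 4))    # format and length both ok
print(check_field("0.9555", r"[0-9]\.[0-9]{2}", 4))  # too long and malformed
```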

## When to Use It?

### Use Cases

Generate guaranteed valid JSON from language model outputs for API response construction. Build structured extraction pipelines that produce typed entity data from unstructured text. Create interactive generation workflows where tool results feed back into subsequent generation steps.
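For the typed-extraction use case, captured template slots typically get converted into a typed record before entering a data system. A minimal sketch (the `Entity` dataclass and `to_entity` helper are illustrative, not part of the library):

```python
from dataclasses import dataclass

@dataclass
class Entity:
    name: str
    type: str
    confidence: float

def to_entity(captures):
    # captures is the dict of named slots a template like extract_entity fills;
    # regex-constrained fields convert safely because their format is guaranteed.
    return Entity(
        name=captures["name"],
        type=captures["type"],
        confidence=float(captures["confidence"]),
    )

e = to_entity({"name": "GPT-4", "type": "product", "confidence": "0.95"})
print(e)
```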

### Related Topics

Structured generation, constrained decoding, prompt engineering, JSON schema, grammar-guided generation, LLM applications, and output validation.

## Important Notes

### Requirements

Guidance library installed with appropriate model backend support. Compatible language model that supports token-level logit access for constrained decoding. Python 3.9 or newer for type annotation features.

### Usage Recommendations

**Do:** define explicit schemas for JSON generation to leverage grammar enforcement. Test templates with varied inputs to verify constraint satisfaction across edge cases. Use select for categorical fields and regex for formatted values.

**Don't:** apply overly restrictive constraints that prevent the model from generating coherent content within the allowed token space. Mix constrained and unconstrained generation without clear boundaries between structured and free text sections. Assume all model backends support every constraint type equally.

### Limitations

Constrained decoding requires token-level logit access which is not available through all API providers or cloud-hosted model endpoints. Complex grammars can slow generation speed since each token must be validated against the grammar constraint at decode time. Guidance templates are tightly coupled to specific model interfaces and may need updates when switching between different backends.