Repomix

Package entire code repositories into single AI-friendly files using Repomix.

What Is Repomix?

Repomix is a command-line tool designed to package entire code repositories into single, AI-friendly files. Its core purpose is to streamline the process of preparing large codebases for analysis by large language models (LLMs) such as Claude, ChatGPT, and Gemini. By consolidating complex directory structures and diverse file types into a single output file, Repomix helps users deliver context-rich code representations that are optimized for AI consumption. The tool offers advanced customization, including granular control over which files and directories to include or exclude, multiple output formats, and token counting to align with LLM context constraints.

Why Use Repomix?

As AI-driven code analysis, documentation, and security auditing become increasingly common, developers and teams frequently need to present entire repositories to LLMs. However, most AI models have strict context size limitations and perform best when provided with well-structured, relevant input. Manually curating and formatting codebases for this purpose is tedious and error-prone. Repomix automates this process, ensuring that codebases are packaged efficiently and effectively for AI applications. Typical use cases include:

  • Packaging codebases for LLM-based code review or comprehension
  • Generating repository snapshots as context for AI-powered tools
  • Analyzing third-party libraries for vulnerabilities or architecture
  • Preparing concise, AI-ready documentation
  • Conducting security audits with comprehensive code input
  • Investigating bugs or unfamiliar codebases via LLMs

By leveraging Repomix, teams can reduce manual effort, minimize irrelevant content, and maximize the effectiveness of AI-driven workflows.

How to Get Started

Repomix offers a straightforward installation and usage process, supporting major platforms and multiple installation methods.

Installation

You can install Repomix globally using npm or Homebrew:

# Install via npm
npm install -g repomix

# Install via Homebrew (macOS/Linux)
brew install repomix

To verify the installation:

repomix --version

Basic Usage

To package the current directory into a single output file (default: XML format):

repomix

Specify a different output format, such as Markdown or JSON:

repomix --style markdown
repomix --style json

To package a remote GitHub repository directly:

npx repomix --remote owner/repo

You can also customize inclusion and exclusion patterns, output paths, and more. For full options, consult the built-in help:

repomix --help
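
When the pattern list grows, it can help to build it in one place. The following is a minimal sketch, assuming that --ignore accepts a single comma-separated list of glob patterns (confirm with repomix --help for your version). The join_patterns helper is hypothetical and not part of Repomix; the script only prints the resulting command rather than running it.

```shell
#!/bin/sh
# join_patterns: join all arguments into one comma-separated string.
# Hypothetical helper for illustration; not part of Repomix.
join_patterns() {
    joined=""
    for pattern in "$@"; do
        if [ -z "$joined" ]; then
            joined="$pattern"
        else
            joined="$joined,$pattern"
        fi
    done
    printf '%s\n' "$joined"
}

# Build the list once, then reuse it wherever the command is needed.
IGNORE_PATTERNS=$(join_patterns "node_modules/**" "dist/**" "**/*.log")

# Print the command instead of executing it, so the sketch has no side effects.
printf 'repomix --ignore "%s"\n' "$IGNORE_PATTERNS"
```

Keeping the patterns quoted prevents the shell from expanding the globs before Repomix sees them.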

Key Features

Repomix provides a rich set of features that make it a powerful asset for codebase packaging and AI integration:

  • Customizable Include/Exclude Patterns: Select exactly which files or directories to include or exclude using glob patterns, passed as comma-separated lists to --include and --ignore. This lets you filter out unnecessary artifacts such as node_modules, build outputs, or test fixtures.

    repomix --include "src/**" --ignore "node_modules/**"
  • Multiple Output Formats: Generate output in XML, Markdown, plain text, or JSON. This flexibility ensures compatibility with various LLM input requirements.

    repomix --style markdown
    repomix --style xml
  • Preserve File Structure and Context: The output preserves the logical structure of the repository, making it easier for LLMs to understand file relationships and context.

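  • Token Counting and AI Optimization: Repomix reports token counts for the packed output, helping you judge whether it fits within the context window of your target LLM. The tokenizer encoding used for counting can be selected:

    repomix --token-count-encoding o200k_base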
  • Filter by File Types and Directories: Focus the package on specific languages or folders, reducing noise and highlighting relevant code.

  • Custom Headers and Summaries: Enhance the packaged output with custom headers, summaries, or other metadata to provide additional context for AI analysis.

  • Remote Repository Support: Package codebases directly from GitHub without having to clone them manually first.

  • Snapshot Creation: Useful for creating time-stamped snapshots for audits, documentation, or incident analysis.
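
One way to approach the snapshot use case above is to give each packed file a time-stamped name. This is a minimal sketch: the snapshot_name helper and the "repomix-snapshot-" prefix are illustrative choices, not Repomix conventions, and the script only prints the command it would run (using the --output option to set the file name).

```shell
#!/bin/sh
# snapshot_name: print a time-stamped output file name (UTC).
# The prefix and naming scheme are illustrative, not a Repomix convention.
# $1 is the file extension; defaults to "xml".
snapshot_name() {
    ext="${1:-xml}"
    printf 'repomix-snapshot-%s.%s\n' "$(date -u +%Y%m%dT%H%M%SZ)" "$ext"
}

OUTPUT_FILE=$(snapshot_name md)

# Print the command instead of executing it, so the sketch has no side effects.
printf 'repomix --style markdown --output "%s"\n' "$OUTPUT_FILE"
```

Using UTC keeps file names comparable across machines in different time zones.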

Best Practices

To maximize Repomix’s effectiveness in your workflows:

  • Define Clear Inclusion/Exclusion Rules: Use specific patterns to avoid packaging unnecessary files. For example, exclude dependencies and build artifacts:

    repomix --ignore "node_modules/**,dist/**"
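  • Limit Output Size: Consider token limits early, especially when targeting LLMs with strict context windows. Check the token count Repomix reports and, if the output is too large, tighten your include and ignore patterns or reduce content with options such as --remove-comments or --compress.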

  • Choose the Right Format: Match the output style (Markdown, XML, JSON) to your target AI tool’s expected input.

  • Annotate Output: Add a custom header or summary to the packaged output to provide the LLM with valuable context.

  • Test with Small Samples: Run Repomix on a subset of your repository to validate output before packaging the full codebase.
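
As a quick sanity check on output size, the sketch below estimates tokens from file size. It assumes roughly four bytes per token, which is only a rule of thumb and not the tokenizer Repomix or any particular LLM uses; the token count Repomix itself reports is the better number when available. Both function names are hypothetical.

```shell
#!/bin/sh
# estimate_tokens: rough token estimate for a file, assuming ~4 bytes per token.
# This is a heuristic only; real tokenizers will give different counts.
estimate_tokens() {
    # tr strips the padding some wc implementations add around the number.
    bytes=$(wc -c < "$1" | tr -d '[:space:]')
    echo $(( (bytes + 3) / 4 ))
}

# fits_budget: exit 0 if the estimate for file $1 is within token budget $2.
fits_budget() {
    [ "$(estimate_tokens "$1")" -le "$2" ]
}
```

For example, after packing, fits_budget repomix-output.xml 32000 || echo "output may be too large" would flag a file that likely exceeds a 32,000-token window (the file name and budget here are placeholders).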

Important Notes

  • Sensitive Data: Repomix packages all files matching your patterns. Double-check inclusion/exclusion rules to avoid leaking secrets or credentials.
  • Performance: For very large repositories, packaging may take significant time and resources. Use include and ignore patterns to cut down the number of files processed.
  • Output Review: Always review the generated output before sharing with AI tools, especially for proprietary or regulated codebases.
  • LLM Limitations: Be aware of your target AI model’s maximum context size when packaging repositories.
  • Ongoing Development: Repomix is actively maintained; refer to the official repository for updates, documentation, and issue tracking.
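
The output review recommended above can be partly scripted. The following is a minimal sketch of a manual check that greps a packed file for strings that look like credentials. The scan_for_secrets name and the three patterns are illustrative assumptions and far from exhaustive; treat this as a complement to, not a substitute for, a dedicated secret scanner and a human read-through.

```shell
#!/bin/sh
# scan_for_secrets: print lines (with line numbers) in file $1 that look like
# credentials. Exits 0 if something suspicious is found, 1 if nothing matches.
# The patterns below are illustrative and will miss many kinds of secrets.
scan_for_secrets() {
    grep -n -E \
        -e 'AKIA[0-9A-Z]{16}' \
        -e '-----BEGIN [A-Z ]*PRIVATE KEY-----' \
        -e '(password|secret|api_key)[[:space:]]*[:=]' \
        "$1"
}
```

For example, scan_for_secrets repomix-output.xml && echo "review before sharing" prints any matching lines and a reminder (the file name is a placeholder for your actual output path).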