Firecrawl Download

Downloads full websites as local markdown files, screenshots, or multiple formats

Category: development Source: firecrawl/cli

What Is This?

Overview

Firecrawl Download is a command-line skill that enables developers and researchers to save entire websites as organized local files. By combining site mapping and page scraping into a single workflow, it produces clean output in markdown, screenshots, or multiple formats per page. The result is a structured local directory that mirrors the content of any target website.

The skill operates through the Firecrawl CLI, automating what would otherwise require dozens of manual steps. Instead of visiting each page individually and copying content, users run a single command that handles crawling, content extraction, and file organization automatically. This makes it practical for documentation archiving, offline research, and content migration projects.
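The map-then-scrape workflow described above can be sketched conceptually. Note that `map_site` and `scrape_page` below are hypothetical stand-ins to illustrate the two phases, not the Firecrawl API:

```python
# Conceptual sketch of the two-phase workflow: map the site once to
# discover URLs, then scrape each discovered page for its content.
# map_site and scrape_page are hypothetical stand-ins, not Firecrawl calls.

def crawl(base_url, map_site, scrape_page):
    """Phase 1: discover page URLs. Phase 2: extract content per URL."""
    pages = {}
    for url in map_site(base_url):       # site mapping
        pages[url] = scrape_page(url)    # page scraping
    return pages

# Demonstration with stubbed mapper and scraper:
fake_map = lambda base: [base + "/intro", base + "/api"]
fake_scrape = lambda url: f"# Page at {url}"
result = crawl("https://docs.example.com", fake_map, fake_scrape)
```

The point of the sketch is the ordering: mapping runs to completion first, so the scraper works from a fixed list of pages rather than chasing links as it goes.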

Unlike simple web scrapers that return raw HTML, Firecrawl Download produces readable, well-formatted output. Markdown files strip away navigation clutter and advertisements, leaving only the core content. Screenshots capture the visual state of each page. Both formats can be generated simultaneously, giving users flexible options depending on their downstream use case.

Who Should Use This

  • Backend developers who need to download API documentation or technical references for offline access during travel or in restricted network environments
  • Technical writers who want to archive third-party documentation before it changes or becomes unavailable
  • Data engineers building training datasets or content pipelines that require bulk page downloads in clean text format
  • DevOps engineers who need to snapshot documentation sites before major infrastructure migrations
  • Researchers conducting content analysis across large websites who require local copies for processing
  • Product managers who want to save competitor documentation or public knowledge bases for structured review

Why Use It?

Problems It Solves

  • Manual download overhead: Saving individual pages one at a time is slow and error-prone. Firecrawl Download automates the entire process across hundreds or thousands of pages.
  • HTML noise in saved files: Browser-saved HTML files include scripts, styles, and navigation that obscure the actual content. This skill produces clean markdown output instead.
  • Broken offline references: Standard browser downloads often fail to resolve internal links or embedded assets correctly. Firecrawl organizes files into directories that preserve the site structure.
  • Inconsistent formatting: When multiple team members save pages manually, output formats vary. A single CLI command ensures consistent file naming and structure across the entire download.
  • Documentation availability risk: External documentation can be removed or changed without notice. Local copies protect against losing access to critical reference material.

Core Highlights

  • Downloads entire websites with a single command
  • Outputs clean markdown, screenshots, or both formats per page
  • Organizes files into structured local directories
  • Handles site mapping automatically before scraping
  • Supports bulk operations across large documentation sites
  • Produces human-readable content without HTML clutter
  • Works with public websites and documentation portals
  • Integrates directly into existing development workflows via CLI

How to Use It?

Basic Usage

To download a website as markdown files, run the following command:

firecrawl crawl https://docs.example.com --output ./local-docs --format markdown

To capture screenshots of each page alongside markdown output:

firecrawl crawl https://docs.example.com --output ./local-docs --format markdown,screenshot

To limit the crawl depth and control scope:

firecrawl crawl https://docs.example.com --output ./local-docs --max-depth 3 --format markdown
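For teams that download several sites with identical settings, the commands above can be batched from a small script. This sketch builds commands using only the flags shown above; whether your environment should actually execute them is left as a commented-out call:

```python
# Sketch: batch several documentation sites through the same crawl command,
# writing each site into its own subdirectory named after its host.
# Uses only the CLI flags shown above; commands are built but not executed.

import subprocess
from urllib.parse import urlparse

SITES = ["https://docs.example.com", "https://docs.somelib.io"]

def build_command(url, out_root="./local-docs", fmt="markdown"):
    host = urlparse(url).netloc
    return ["firecrawl", "crawl", url,
            "--output", f"{out_root}/{host}",
            "--format", fmt]

for url in SITES:
    cmd = build_command(url)
    # subprocess.run(cmd, check=True)  # uncomment to actually run the CLI
    print(" ".join(cmd))
```

Keeping the flag set in one function guarantees the consistent file naming and structure mentioned under "Problems It Solves".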

Specific Scenarios

Scenario 1: Archiving API documentation

A developer preparing for a flight needs offline access to a framework's API reference. Running the crawl command against the documentation root saves all pages as markdown files, organized by URL path, ready for local browsing.
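"Organized by URL path" means each page's URL determines where its markdown file lands. The exact layout Firecrawl produces may differ; this sketch just illustrates the idea:

```python
# Sketch of mapping a page URL to a local markdown path, mirroring the
# site's URL structure. Illustrative only; the real layout may differ.

from pathlib import PurePosixPath
from urllib.parse import urlparse

def local_path(url, out_dir="local-docs"):
    path = urlparse(url).path.strip("/") or "index"  # root page -> index
    return str(PurePosixPath(out_dir) / f"{path}.md")

print(local_path("https://docs.example.com/api/auth"))
# local-docs/api/auth.md
```

Mirroring the URL hierarchy keeps relative references between pages easy to resolve when browsing the download locally.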

Scenario 2: Content migration preparation

A technical writer migrating content from one platform to another downloads the existing site as markdown. The clean output can be reviewed, edited, and imported into the new system without manual reformatting.

Real-World Examples

Example 1: Saving a library's documentation

firecrawl crawl https://docs.somelib.io --output ./somelib-docs --format markdown

Example 2: Capturing visual snapshots for a design review

firecrawl crawl https://product.example.com --output ./screenshots --format screenshot

When to Use It?

Use Cases

  • Downloading framework or library documentation for offline development
  • Archiving public knowledge bases before planned migrations
  • Building local content datasets for analysis or machine learning
  • Creating reference snapshots of competitor or partner documentation
  • Saving tutorial sites for team onboarding packages
  • Preserving project documentation before a site is decommissioned
  • Generating markdown content libraries from existing web resources

Important Notes

Requirements

  • Firecrawl CLI must be installed and authenticated before use
  • A valid API key is required for crawl operations
  • Target websites must be publicly accessible or reachable from the executing environment
  • Sufficient local disk space should be available for large site downloads
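A short preflight check can verify these requirements before starting a large crawl. The `FIRECRAWL_API_KEY` environment-variable name below is an assumption; consult the CLI's own documentation for how it actually reads credentials:

```python
# Hedged preflight sketch for the requirements above: an API key is
# configured and enough disk space is free at the output location.
# FIRECRAWL_API_KEY is an assumed variable name, not a confirmed one.

import os
import shutil

def preflight(output_dir=".", min_free_gb=1.0):
    problems = []
    if not os.environ.get("FIRECRAWL_API_KEY"):
        problems.append("API key not set (FIRECRAWL_API_KEY assumed)")
    free_gb = shutil.disk_usage(output_dir).free / 1e9
    if free_gb < min_free_gb:
        problems.append(f"only {free_gb:.1f} GB free, need {min_free_gb}")
    return problems

issues = preflight()
print("ready" if not issues else issues)
```

Running a check like this first avoids discovering a missing key or a full disk halfway through a thousand-page download.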