Libfuzzer

Automate libFuzzer coverage-guided fuzzing and integrate it into your testing workflows

Libfuzzer is a community skill for implementing in-process fuzz testing with libFuzzer, covering harness writing, corpus management, sanitizer integration, dictionary usage, and coverage analysis for C and C++ code testing.

What Is This?

Overview

Libfuzzer provides tools for building in-process, coverage-guided fuzz tests with the libFuzzer engine. It covers five areas. Harness writing implements LLVMFuzzerTestOneInput entry points that expose target functions to fuzzed input. Corpus management maintains and minimizes the collection of test inputs so that maximum code coverage is kept with as few files as possible. Sanitizer integration pairs fuzzing with AddressSanitizer, MemorySanitizer, and UndefinedBehaviorSanitizer for precise bug detection. Dictionary usage supplies keyword and token lists that steer mutations toward syntactically meaningful inputs. Coverage analysis uses LLVM coverage instrumentation to measure which code paths the fuzzer has explored. Together these let developers find memory safety bugs and undefined behavior through automated testing.

Who Should Use This

This skill serves C and C++ developers testing parsers and protocol implementations, security engineers auditing native code libraries, and open source maintainers integrating continuous fuzzing into their projects.

Why Use It?

Problems It Solves

Unit tests cover expected inputs but miss unexpected byte sequences that trigger memory corruption or crashes. Manual testing cannot explore the input space deeply enough to find edge cases in parsing and deserialization code. Buffer overflows and use-after-free bugs hide in code paths that normal testing never exercises. Bugs that sanitizers can detect only surface when a specific input pattern triggers them during execution, so they stay hidden until something generates that input.

Core Highlights

Harness generator creates fuzz entry points for target library functions with proper input handling. Corpus minimizer reduces test collections while preserving coverage. Sanitizer combiner runs fuzz targets with multiple sanitizers for comprehensive bug detection. Coverage reporter tracks explored code paths across fuzzing campaigns.
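
As a rough illustration of the coverage reporting step, the sketch below assembles the usual LLVM source-based coverage commands: build with profiling instrumentation, replay the corpus once with -runs=0, merge the raw profile, and print a report. The function name coverage_commands and the output binary name are hypothetical. The sketch also assumes LLVM_PROFILE_FILE is unset, so the instrumented binary writes default.profraw to the working directory.

```python
from typing import List


def coverage_commands(
    sources: List[str],
    corpus_dir: str,
    binary: str = "./fuzz_target_cov",  # hypothetical output name
) -> List[List[str]]:
    """Return argv lists for a source-based coverage run, in execution order."""
    build = [
        "clang", "-g", "-O1", "-fsanitize=fuzzer",
        "-fprofile-instr-generate", "-fcoverage-mapping",
        "-o", binary, *sources,
    ]
    # -runs=0 executes every corpus input once and exits without fuzzing
    replay = [binary, corpus_dir, "-runs=0"]
    # Assumes LLVM_PROFILE_FILE is unset, so the profile is default.profraw
    merge = ["llvm-profdata", "merge", "-sparse", "default.profraw",
             "-o", "default.profdata"]
    report = ["llvm-cov", "report", binary,
              "-instr-profile=default.profdata"]
    return [build, replay, merge, report]


if __name__ == "__main__":
    for argv in coverage_commands(["fuzz_harness.c", "target_lib.c"], "corpus_dir/"):
        print(" ".join(argv))
```

Each list can be passed to subprocess.run in order. Keeping the coverage build separate from the sanitizer build avoids mixing profiling overhead into the binary used for long fuzzing runs.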

How to Use It?

Basic Usage

// libFuzzer harness (fuzz_harness.c)
#include <stdint.h>
#include <stddef.h>

// Target function under test, defined in target_lib.c
int parse_message(const uint8_t *data, size_t size);

// libFuzzer calls this entry point once per generated input
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  // Skip inputs shorter than 4 bytes
  if (size < 4)
    return 0;
  parse_message(data, size);
  return 0;  // Values other than 0 and -1 are reserved
}

/*
 * Build:
 *   clang -g -O1 -fsanitize=fuzzer,address -o fuzz_target fuzz_harness.c target_lib.c
 *
 * Run:
 *   ./fuzz_target corpus_dir/ -max_len=1024 -timeout=10
 */

Real-World Examples

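import subprocess
from pathlib import Path


class FuzzRunner:
    """Thin wrapper around a compiled libFuzzer binary."""

    def __init__(self, binary: str, corpus_dir: str, max_len: int = 4096):
        self.binary = binary
        self.corpus = Path(corpus_dir)
        self.max_len = max_len
        # Create the corpus directory (and any missing parents) up front
        self.corpus.mkdir(parents=True, exist_ok=True)

    def _run(self, cmd: list) -> str:
        # libFuzzer writes its status lines and crash reports to stderr
        result = subprocess.run(cmd, capture_output=True, text=True)
        return result.stderr

    def run(self, max_time: int = 300, jobs: int = 0) -> str:
        """Fuzz for up to max_time seconds and return the fuzzer log."""
        cmd = [
            self.binary,
            str(self.corpus),
            f'-max_len={self.max_len}',
            f'-max_total_time={max_time}',
        ]
        # With -jobs=N (N >= 1) each worker logs to fuzz-<JOB>.log instead
        # of stderr, so pass the flag only when parallel jobs are requested
        if jobs > 0:
            cmd.append(f'-jobs={jobs}')
        return self._run(cmd)

    def minimize(self) -> str:
        """Merge the corpus into <corpus>_minimized, keeping coverage."""
        mini = Path(f'{self.corpus}_minimized')
        mini.mkdir(parents=True, exist_ok=True)
        # -merge=1 copies inputs that add coverage into the first directory
        cmd = [self.binary, '-merge=1', str(mini), str(self.corpus)]
        return self._run(cmd)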
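
The log text returned by a fuzzing run can be scanned for progress. libFuzzer prints status lines that begin with # and an execution count, followed by an event name and fields such as cov:. The sketch below pulls out the most recent cov value. The helper name last_coverage is hypothetical, and the log layout is an observed convention rather than a stable interface, so treat the regular expression as an assumption that may need adjusting for a given LLVM version.

```python
import re
from typing import Optional

# Assumed line shape: "#<execs> <EVENT> cov: <n> ..." (not a stable format)
_STATUS_LINE = re.compile(r"^#(\d+)\s+(\w+)\s+cov: (\d+)")


def last_coverage(log_text: str) -> Optional[int]:
    """Return the cov value from the last status line, or None if absent."""
    cov = None
    for line in log_text.splitlines():
        match = _STATUS_LINE.match(line)
        if match:
            cov = int(match.group(3))
    return cov
```

Comparing this number across runs gives a quick signal of whether a campaign is still discovering new code or has plateaued.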

Advanced Tips

Create dictionaries with protocol keywords and magic bytes to help the fuzzer generate inputs that pass initial parsing checks and reach deeper code paths. Use corpus merging to combine coverage from multiple fuzzing sessions into a single minimized set. Enable value profiling with -use_value_profile=1 to help the fuzzer solve multi-byte comparison checks.
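
A dictionary file holds one quoted token per line, with non-printable bytes written as \xNN escapes and # starting a comment line. The sketch below is one way to generate such a file from raw byte tokens. The helper names and the example tokens are hypothetical and should be replaced with keywords from the format under test.

```python
from typing import Iterable


def dict_entry(token: bytes) -> str:
    """Quote one token, hex-escaping quotes, backslashes, and non-printables."""
    parts = []
    for byte in token:
        char = chr(byte)
        if char in ('"', "\\") or not (0x20 <= byte < 0x7F):
            parts.append(f"\\x{byte:02x}")
        else:
            parts.append(char)
    return '"' + "".join(parts) + '"'


def write_dictionary(path: str, tokens: Iterable[bytes]) -> None:
    """Write tokens to path, one dictionary entry per line."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write("# one quoted token per line\n")
        for token in tokens:
            handle.write(dict_entry(token) + "\n")


if __name__ == "__main__":
    # Hypothetical tokens for an HTTP-like protocol
    write_dictionary("protocol.dict", [b"GET ", b"HTTP/1.1", b"\r\n"])
```

The resulting file is passed to the fuzz target with -dict=protocol.dict, and it can be combined with -use_value_profile=1 on the same command line.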

When to Use It?

Use Cases

Fuzz test a JSON parser to find memory safety issues with malformed input documents. Integrate continuous fuzzing into CI to catch regressions in a protocol handling library. Test a cryptographic library implementation against unexpected input lengths and formats.

Important Notes

Requirements

Clang compiler with libFuzzer support for building instrumented targets. LLVM sanitizer runtime libraries for bug detection. Seed corpus with sample valid inputs for the target function.

Usage Recommendations

Do: combine fuzzing with AddressSanitizer to detect memory access violations immediately. Start with a corpus of valid inputs from existing test suites to bootstrap coverage. Set appropriate max_len values based on the expected input sizes for the target function.
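
When more than one sanitizer is wanted, a common arrangement is one binary per compatible group: AddressSanitizer and UndefinedBehaviorSanitizer can share a build, while MemorySanitizer cannot be combined with AddressSanitizer and needs its own. The sketch below generates the corresponding clang command lines. The configuration names and output file names are hypothetical.

```python
from typing import Dict, List

# One entry per binary; the keys are hypothetical labels
SANITIZER_BUILDS: Dict[str, str] = {
    "asan_ubsan": "fuzzer,address,undefined",
    "msan": "fuzzer,memory",  # MSan cannot share a binary with ASan
}


def build_command(config: str, sources: List[str]) -> List[str]:
    """Return the clang argv for one sanitizer configuration."""
    return [
        "clang", "-g", "-O1",
        f"-fsanitize={SANITIZER_BUILDS[config]}",
        "-o", f"fuzz_target_{config}",
        *sources,
    ]


if __name__ == "__main__":
    for name in SANITIZER_BUILDS:
        print(" ".join(build_command(name, ["fuzz_harness.c", "target_lib.c"])))
```

All binaries can fuzz against the same corpus directory, so coverage found under one sanitizer benefits the others.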

Don't: ignore timeout configuration, since slow inputs consume fuzzing resources without finding new coverage. Don't run fuzzing with optimizations disabled, as this significantly reduces execution throughput. Don't skip corpus minimization on long-running campaigns, since bloated corpora slow down every restart.

Limitations

In-process fuzzing shares an address space with the target, so a crash in the target also terminates the fuzzer process. libFuzzer ships with Clang and relies on LLVM instrumentation, so targets compiled with GCC are not supported. Structured input generation requires custom mutators, which add development effort beyond basic byte-level fuzzing.
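
Because a crash ends the run, the failing input is what carries the finding forward. libFuzzer saves it as an artifact file whose name starts with a prefix such as crash-, timeout-, oom-, leak-, or slow-unit-, written to the working directory unless -artifact_prefix points elsewhere. The sketch below collects those files after a run. The helper names are hypothetical.

```python
from pathlib import Path
from typing import List, Optional

ARTIFACT_PREFIXES = ("crash-", "timeout-", "oom-", "leak-", "slow-unit-")


def artifact_kind(filename: str) -> Optional[str]:
    """Return the artifact type for a file name, or None for other files."""
    for prefix in ARTIFACT_PREFIXES:
        if filename.startswith(prefix):
            return prefix.rstrip("-")
    return None


def find_artifacts(directory: str) -> List[Path]:
    """List artifact files in directory, sorted by path."""
    return sorted(
        path for path in Path(directory).iterdir()
        if path.is_file() and artifact_kind(path.name) is not None
    )
```

Passing an artifact path as the only argument to the fuzz binary replays that single input, which makes it a convenient starting point for debugging and for a regression test.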
