DWARF Expert

Automate and integrate DWARF Expert for specialized knowledge and task automation

DWARF Expert is a community skill for analyzing DWARF debugging information in compiled binaries, covering debug info parsing, type information extraction, source mapping analysis, variable location tracking, and call frame information decoding for binary analysis and debugger development.

What Is This?

Overview

DWARF Expert provides patterns for reading and interpreting DWARF debug information embedded in compiled executables and shared libraries. It covers five areas: debug info parsing, which reads DWARF sections such as .debug_info, .debug_abbrev, and .debug_line from ELF binaries; type information extraction, which decodes struct, enum, and typedef definitions from the debug data; source mapping analysis, which links machine code addresses to source file lines and columns; variable location tracking, which determines whether a variable lives in a register or in memory at each program point; and call frame information decoding, which reads the canonical frame address (CFA) rules used for stack unwinding. The skill enables developers to build debugging tools and perform deep binary analysis.

Who Should Use This

This skill serves debugger and toolchain developers implementing DWARF consumers, reverse engineers analyzing compiled binaries with debug information, and compiler engineers verifying debug information quality in their output.

Why Use It?

Problems It Solves

The DWARF format is complex: it uses multiple encoding schemes and differs between versions, which makes it difficult to parse correctly. Extracting type information requires following chains of references across debug sections. Mapping addresses to source lines requires decoding the line number program, a state machine that emits the line table row by row. Variable locations change throughout execution, so finding a variable requires interpreting DWARF location expressions.
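One of the encoding schemes mentioned above is LEB128, the variable-length integer format DWARF uses for abbreviation codes, attribute values, and expression operands. The following is a minimal sketch of unsigned and signed decoders; the function names are illustrative and do not come from any library.

```python
def decode_uleb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode an unsigned LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        # The low 7 bits carry payload, least significant group first.
        result |= (byte & 0x7F) << shift
        shift += 7
        # A clear high bit marks the final byte.
        if not byte & 0x80:
            return result, pos


def decode_sleb128(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a signed LEB128 value; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            # Bit 6 of the final byte is the sign bit: sign-extend if set.
            if byte & 0x40:
                result -= 1 << shift
            return result, pos
```

Both functions return the position of the next unread byte, so a caller can walk through a section by feeding that position back in.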

Core Highlights

Section parser reads .debug_info entries with abbreviation decoding. Type decoder extracts struct layouts, enum values, and type hierarchies. Line mapper converts instruction addresses to source file positions. Location evaluator interprets DWARF expressions for variable tracking.
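As an illustration of the type decoder idea, the sketch below lists the members of a struct from its DIE. It assumes pyelftools-style DIE objects (tag, attributes, iter_children(), get_DIE_from_attribute()); the function name struct_layout is hypothetical. It follows only one DW_AT_type reference, so pointer, typedef, and qualifier chains without a name are reported as None, and member offsets encoded as location expressions rather than integers are skipped.

```python
def struct_layout(struct_die) -> list[dict]:
    """List name, byte offset, and type name for each member of a struct DIE."""
    members = []
    for child in struct_die.iter_children():
        if child.tag != 'DW_TAG_member':
            continue
        name_attr = child.attributes.get('DW_AT_name')
        loc_attr = child.attributes.get('DW_AT_data_member_location')

        # Only plain integer offsets are handled in this sketch.
        offset = None
        if loc_attr is not None and isinstance(loc_attr.value, int):
            offset = loc_attr.value

        # Follow a single type reference and read its name if it has one.
        type_name = None
        if 'DW_AT_type' in child.attributes:
            type_die = child.get_DIE_from_attribute('DW_AT_type')
            type_attr = type_die.attributes.get('DW_AT_name')
            if type_attr is not None:
                type_name = type_attr.value.decode()

        members.append({
            'name': name_attr.value.decode() if name_attr else None,
            'offset': offset,
            'type': type_name,
        })
    return members
```

A fuller decoder would keep following DW_AT_type until it reaches a named or base type, recording pointer and const qualifiers along the way.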

How to Use It?

Basic Usage

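from elftools.elf.elffile import ELFFile


class DwarfReader:
  """Opens an ELF binary and exposes its DWARF debug information."""

  def __init__(self, binary_path: str):
    self.file = open(binary_path, 'rb')
    self.elf = ELFFile(self.file)
    # Stripped binaries carry no DWARF sections; fail early with a clear error.
    if not self.elf.has_dwarf_info():
      self.file.close()
      raise ValueError(f'{binary_path} has no DWARF info')
    self.dwarf = self.elf.get_dwarf_info()

  def close(self):
    # Release the file handle once all queries are done.
    self.file.close()

  def get_functions(self) -> list[dict]:
    """Return the name and DIE offset of every named DW_TAG_subprogram."""
    funcs = []
    for cu in self.dwarf.iter_CUs():
      for die in cu.iter_DIEs():
        if die.tag != 'DW_TAG_subprogram':
          continue
        # DW_AT_name is optional, so skip DIEs that lack it.
        name = die.attributes.get('DW_AT_name')
        if name:
          funcs.append({
            'name': name.value.decode(),
            'offset': die.offset})
    return funcs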
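A natural next step after listing functions is to find each function's address range. In DWARF 4 and later, DW_AT_high_pc is often stored as an offset from DW_AT_low_pc rather than as an absolute address, and the attribute's form class distinguishes the two cases. The sketch below assumes the class string has already been obtained, for example from describe_form_class(attr.form) in elftools.dwarf.descriptions, which returns 'address' or 'constant' for this attribute. The names function_range and contains are illustrative.

```python
def function_range(low_pc: int, high_pc_value: int,
                   high_pc_class: str) -> tuple[int, int]:
    """Return the half-open address range [low, high) of a function."""
    if high_pc_class == 'address':
        # DW_AT_high_pc holds an absolute end address.
        high_pc = high_pc_value
    elif high_pc_class == 'constant':
        # DW_AT_high_pc holds the function size, relative to DW_AT_low_pc.
        high_pc = low_pc + high_pc_value
    else:
        raise ValueError(f'unexpected DW_AT_high_pc class: {high_pc_class}')
    return low_pc, high_pc


def contains(rng: tuple[int, int], address: int) -> bool:
    """Check whether an address falls inside a [low, high) range."""
    return rng[0] <= address < rng[1]
```

Functions described only by DW_AT_ranges (non-contiguous code) have no DW_AT_low_pc/DW_AT_high_pc pair and are outside the scope of this sketch.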

Real-World Examples

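class LineMapper:
  """Maps instruction addresses to source file names and line numbers."""

  def __init__(self, dwarf_info):
    self.dwarf = dwarf_info
    self.mapping = {}

  def build_map(self):
    for cu in self.dwarf.iter_CUs():
      line_prog = self.dwarf.line_program_for_CU(cu)
      if not line_prog:
        continue
      # File indices are 1-based before DWARF 5 and 0-based from DWARF 5 on.
      delta = 1 if line_prog.header.version < 5 else 0
      files = line_prog['file_entry']
      for entry in line_prog.get_entries():
        state = entry.state
        # Entries that do not emit a line-table row have no state.
        if state is None or state.end_sequence:
          continue
        self.mapping[state.address] = {
          'file': files[state.file - delta].name.decode(),
          'line': state.line}
    return self.mapping

  def lookup(self, address: int) -> dict:
    # Exact match only: finds addresses that start a line-table row.
    return self.mapping.get(address, {})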
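The lookup above only matches addresses that begin a line-table row, while a crash address usually falls somewhere inside a row's range. One possible refinement, sketched here with a hypothetical SortedLineIndex class, is to sort the row addresses and use a binary search for the nearest row at or below the queried address. This sketch ignores end_sequence boundaries, so an address past the end of a function still resolves to the last preceding row; treat the result as a best guess.

```python
import bisect


class SortedLineIndex:
    """Nearest-lower-address lookup over an address -> row mapping."""

    def __init__(self, mapping: dict[int, dict]):
        # Keep addresses and rows in parallel sorted lists for bisect.
        self.addresses = sorted(mapping)
        self.rows = [mapping[a] for a in self.addresses]

    def lookup(self, address: int) -> dict:
        # Index of the last row whose start address is <= the query.
        i = bisect.bisect_right(self.addresses, address) - 1
        return self.rows[i] if i >= 0 else {}
```

The index can be built from the dictionary returned by build_map(), for example SortedLineIndex(mapper.build_map()).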

Advanced Tips

Use a pyelftools release with DWARF version 5 support for modern binaries that use the latest debug format features, including split debug information. Cache parsed compilation unit data to avoid re-reading DWARF sections when performing multiple queries on the same binary. Handle both little-endian and big-endian encodings, since DWARF data inherits the byte order of the target architecture.
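To make the byte-order point concrete, the sketch below reads the first fields of a .debug_info unit header from raw bytes: the initial length (4 bytes, or the escape value 0xffffffff followed by an 8-byte length in 64-bit DWARF) and the 2-byte version. The function name is hypothetical, and the little_endian flag is assumed to come from the ELF header (pyelftools exposes it as the little_endian attribute of ELFFile). In normal use pyelftools performs this parsing itself; the sketch only shows what the byte order affects.

```python
def parse_unit_header_prefix(data: bytes, little_endian: bool = True) -> dict:
    """Read unit_length and version from the start of a DWARF unit header."""
    order = 'little' if little_endian else 'big'
    length = int.from_bytes(data[0:4], order)
    pos = 4
    # 0xffffffff is the escape code announcing the 64-bit DWARF format.
    is_64bit = length == 0xFFFFFFFF
    if is_64bit:
        length = int.from_bytes(data[4:12], order)
        pos = 12
    version = int.from_bytes(data[pos:pos + 2], order)
    return {'unit_length': length, 'dwarf64': is_64bit, 'version': version}
```

The fields that follow the version differ between DWARF 4 and DWARF 5, which is one reason to read the version before decoding the rest of the header.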

When to Use It?

Use Cases

Extract function names and source locations from a compiled binary for a custom debugging tool. Map crash addresses to source file lines using DWARF line number information. Analyze struct layouts and type definitions from debug information for binary compatibility checking.

Related Topics

DWARF debugging, ELF binaries, debug information, binary analysis, pyelftools, and compiler toolchains.

Important Notes

Requirements

The pyelftools library for parsing ELF and DWARF data structures. A binary compiled with debug information included, for example with the -g flag in GCC or Clang. Understanding of DWARF format concepts including DIEs, attributes, and form encodings.

Usage Recommendations

Do: check the DWARF version in the compilation unit header before parsing since different versions use different encodings. Handle missing attributes gracefully as compilers may omit optional debug information. Close file handles properly when finished reading to avoid resource leaks.

Don't: assume all functions have DWARF entries, since inlined or optimized-out functions may be absent from the debug data. Don't parse large binaries entirely into memory when you only need specific compilation units or address ranges. Don't ignore differences in DWARF expression opcodes across versions, which can produce incorrect location results.
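Handling missing attributes gracefully can be as simple as routing every attribute read through one helper. The sketch below assumes pyelftools-style DIE objects whose attributes mapping holds entries with a value field; the helper name attr_value is hypothetical.

```python
def attr_value(die, name: str, default=None):
    """Return an attribute's value, or a default when the attribute is absent."""
    attr = die.attributes.get(name)
    if attr is None:
        return default
    value = attr.value
    # String attributes arrive as bytes; decode them without raising on bad input.
    if isinstance(value, bytes):
        return value.decode('utf-8', errors='replace')
    return value
```

For example, attr_value(die, 'DW_AT_name', '<anonymous>') yields a printable name even for DIEs the compiler left unnamed.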

Limitations

Optimized binaries may have incomplete or inaccurate debug information where variables are optimized out or source lines are rearranged. DWARF format complexity means edge cases in encoding may not be handled by all parsing libraries. Stripped binaries contain no DWARF data and require separate debug symbol files for analysis.