#symbol-name #debug-info #symbols

wholesym

A complete solution for fetching symbol files and resolving code addresses to symbols and debuginfo

8 releases (5 breaking)

0.7.0 Jun 28, 2024
0.6.0 Jun 20, 2024
0.5.0 Apr 15, 2024
0.4.1 Feb 12, 2024
0.1.1 Dec 8, 2022

#185 in Profiling

927 downloads per month
Used in 3 crates

MIT/Apache

485KB
9K SLoC

wholesym

wholesym is a fully-featured library for fetching symbol files and for resolving code addresses to symbols and debug info. It supports Windows, macOS and Linux. It is lightning-fast and optimized for minimal time-to-first-symbol.

Use it as follows:

  1. Create a SymbolManager using SymbolManager::with_config.
  2. Load a SymbolMap with SymbolManager::load_symbol_map_for_binary_at_path.
  3. Look up an address with SymbolMap::lookup_relative_address.
  4. Inspect the returned AddressInfo, which gives you the symbol name, and potentially file and line information, along with inlined function info.

Behind the scenes, wholesym loads symbol files much like a debugger would. It supports symbol servers, collecting information from multiple files, and all kinds of different ways to embed symbol information in various file formats.

Example

use wholesym::{SymbolManager, SymbolManagerConfig, FramesLookupResult};
use std::path::Path;

async fn run() -> Result<(), wholesym::Error> {
    // Create a symbol manager with the default configuration.
    let symbol_manager = SymbolManager::with_config(SymbolManagerConfig::default());
    // Load the symbol map for the binary; this locates the symbol files.
    let symbol_map = symbol_manager
        .load_symbol_map_for_binary_at_path(Path::new("/usr/bin/ls"), None)
        .await?;
    println!("Looking up 0xd6f4 in /usr/bin/ls. Results:");
    if let Some(address_info) = symbol_map.lookup_relative_address(0xd6f4) {
        println!(
            "Symbol: {:#x} {}",
            address_info.symbol.address, address_info.symbol.name
        );
        // Debug info frames may be available directly, may live in an
        // external file that has to be loaded first, or may be missing.
        let frames = match address_info.frames {
            FramesLookupResult::Available(frames) => Some(frames),
            FramesLookupResult::External(ext_ref) => {
                symbol_manager
                    .lookup_external(&symbol_map.symbol_file_origin(), &ext_ref)
                    .await
            }
            FramesLookupResult::Unavailable => None,
        };
        if let Some(frames) = frames {
            for (i, frame) in frames.into_iter().enumerate() {
                // These unwraps panic if a frame lacks a function name,
                // file path, or line number.
                let function = frame.function.unwrap();
                let file = frame.file_path.unwrap().display_path();
                let line = frame.line_number.unwrap();
                println!("  #{i:02} {function} at {file}:{line}");
            }
        }
    } else {
        println!("No symbol for 0xd6f4 was found.");
    }
    Ok(())
}

This example prints the following output on my machine:

Looking up 0xd6f4 in /usr/bin/ls. Results:
Symbol: 0xd5d4 gobble_file.constprop.0
  #00 do_lstat at ./src/ls.c:1184
  #01 gobble_file at ./src/ls.c:3403

The example demonstrates support for debuglink and debugaltlink. It gets the symbol information from local debug files at /usr/lib/debug/.build-id/63/260a3e6e46db57abf718f6a3562c6eedccf269.debug and at /usr/lib/debug/.dwz/aarch64-linux-gnu/coreutils.debug, which were installed by the coreutils-dbgsym package. If these files are not present, it will fall back to whichever information is available.
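
The `.build-id` path above follows the usual convention for separate debug files: the first byte of the build ID becomes a directory name and the remaining bytes become the file name. The following is a minimal sketch of that mapping only, not wholesym's own lookup code; the function name `build_id_debug_path` is made up for illustration.

```rust
use std::path::PathBuf;

/// Builds the conventional separate-debug-file path for a build ID:
/// `<debug_dir>/.build-id/<first byte as hex>/<remaining bytes as hex>.debug`.
/// Returns `None` if the build ID has fewer than two bytes.
/// (Hypothetical helper, not part of the wholesym API.)
fn build_id_debug_path(debug_dir: &str, build_id: &[u8]) -> Option<PathBuf> {
    let (first, rest) = build_id.split_first()?;
    if rest.is_empty() {
        return None;
    }
    // Hex-encode the remaining bytes in lowercase, two digits per byte.
    let rest_hex: String = rest.iter().map(|b| format!("{b:02x}")).collect();
    let mut path = PathBuf::from(debug_dir);
    path.push(".build-id");
    path.push(format!("{first:02x}"));
    path.push(format!("{rest_hex}.debug"));
    Some(path)
}
```

Calling it with `/usr/lib/debug` and a build ID that starts with the bytes `63 26 0a 3e ...` yields a path of the same shape as the one quoted above.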

Features

Windows

Supported symbol file sources:

  • Local PDB files at the absolute PDB path that's recorded in the .exe / .dll
  • PDB files on Windows symbol servers and the _NT_SYMBOL_PATH environment variable
  • Breakpad symbol files, local or on a server
  • DWARF-in-PE debug info
  • Fallback symbols from exported functions and function start addresses

Unsupported for now (patches accepted):

  • Support for /DEBUG:FASTLINK PDB files (issue #53)
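
For the symbol server source listed above, it can help to see how a PDB is commonly addressed on such a server: the PDB name, then the GUID and age concatenated as hex, then the PDB name again. This is a rough sketch of that layout under stated assumptions, not wholesym's implementation; the function name `pdb_symbol_server_url`, the server URL, and the GUID used in the usage note are all made up.

```rust
/// Builds a symbol server URL for a PDB, assuming the common layout
/// `<server>/<pdb name>/<GUID as 32 hex digits><age as hex>/<pdb name>`.
/// The uppercase hex digits are an assumption of this sketch.
/// (Hypothetical helper, not part of the wholesym API.)
fn pdb_symbol_server_url(server: &str, pdb_name: &str, guid: &str, age: u32) -> String {
    // Drop dashes and braces from the GUID and uppercase the hex digits.
    let guid_hex: String = guid
        .chars()
        .filter(|c| c.is_ascii_hexdigit())
        .map(|c| c.to_ascii_uppercase())
        .collect();
    let server = server.trim_end_matches('/');
    format!("{server}/{pdb_name}/{guid_hex}{age:X}/{pdb_name}")
}
```

With the placeholder server `https://symbols.example.com`, the name `example.pdb`, the GUID `01234567-89ab-cdef-0123-456789abcdef` and age 1, this produces `https://symbols.example.com/example.pdb/0123456789ABCDEF0123456789ABCDEF1/example.pdb`.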

macOS

Supported symbol file sources:

  • Local dSYM bundles with symbol tables + DWARF, found in the vicinity of the binary
  • dSYM bundles found via Spotlight
  • DWARF found in object files which are referred to from a linked binary (via OSO stabs symbols)
  • Breakpad symbol files, local or on a server
  • Symbols from the regular symbol table
  • Fallback symbols from exported functions and function start addresses

Unsupported for now (patches accepted):

Linux

Supported symbol file sources:

  • DWARF and symbol tables in binaries
  • DWARF and symbol tables in separate debug files found via build ID or debug link
  • Symbol tables in MiniDebugInfo
  • Combining multiple files with DWARF if debug info has been partially moved with dwz (using debugaltlink)
  • debuginfod servers and the DEBUGINFOD_URLS environment variable
  • Breakpad symbol files, local or on a server
  • Symbols from the regular symbol table
  • Fallback symbols from exported functions and function start addresses
  • Split DWARF with .dwo files
  • Split DWARF with .dwp files
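
To illustrate the debuginfod entry in the list above: DEBUGINFOD_URLS holds a whitespace-separated list of server URLs, and a debuginfod server serves debug info for a build ID at `/buildid/<hex build ID>/debuginfo`. The sketch below only builds the candidate URLs and says nothing about how wholesym issues or caches the requests; the function name `debuginfod_debuginfo_urls` and the example servers are made up.

```rust
/// Splits a `DEBUGINFOD_URLS`-style string into server URLs and builds
/// the debuginfo request URL for `build_id` on each server.
/// (Hypothetical helper, not part of the wholesym API.)
fn debuginfod_debuginfo_urls(debuginfod_urls: &str, build_id: &[u8]) -> Vec<String> {
    // The build ID is written as lowercase hex, two digits per byte.
    let hex: String = build_id.iter().map(|b| format!("{b:02x}")).collect();
    debuginfod_urls
        .split_whitespace()
        .map(|server| {
            let server = server.trim_end_matches('/');
            format!("{server}/buildid/{hex}/debuginfo")
        })
        .collect()
}
```

The servers are returned in the order they appear in the variable, so a caller can try them one after another until a request succeeds.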

Performance

The most computationally intensive part of symbol resolution is the parsing of debug info. Debug info can be very large, for example 700MB to 1500MB for Firefox's libxul. wholesym uses the addr2line and pdb-addr2line crates for parsing DWARF and PDB, respectively. It also has its own code for parsing the Breakpad sym format. All of these parsers have been optimized extensively to minimize the time it takes to get the first symbol result, and to cache things so that repeated lookups in the same functions are fast. This means:

  • No expensive preprocessing happens when the symbol file is first loaded.
  • Parsing is as lazy as possible: where it can, wholesym only parses the bytes that are needed for the function which contains the looked-up address.
  • The first parse is as shallow and fast as possible, and just builds an index.
  • Strings (e.g. function names and file paths) are only looked up when needed.
  • Symbol lists, line records, and inlines are cached in sorted structures, and queried via binary search.
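
The last point can be sketched as follows: given symbols sorted by start address, a binary search finds the last symbol that starts at or before the looked-up address. This is a simplified illustration, not wholesym's actual data structure; the `Symbol` struct and `lookup` function are made up, and symbol sizes are ignored.

```rust
/// A symbol with its start address. The slice passed to `lookup` is
/// assumed to be sorted by `address`. (Hypothetical type for illustration.)
struct Symbol {
    address: u32,
    name: &'static str,
}

/// Returns the last symbol whose start address is <= `address`, if any.
/// Because sizes are not tracked here, an address past the end of the
/// last function still resolves to that last symbol.
fn lookup(symbols: &[Symbol], address: u32) -> Option<&Symbol> {
    // `partition_point` binary-searches for the index of the first symbol
    // that starts after `address`; the match is the entry just before it.
    let idx = symbols.partition_point(|s| s.address <= address);
    idx.checked_sub(1).map(|i| &symbols[i])
}
```

With a symbol starting at 0xd5d4 and the next one starting at a higher address than 0xd6f4, looking up 0xd6f4 returns the symbol at 0xd5d4, which matches the shape of the example output earlier on this page.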
