1 unstable release

Uses new Rust 2024

new 0.1.0 Apr 6, 2025

CC0 license

37KB
839 lines

cymbal - list symbols in a codebase

Overview

cymbal lists all symbols in a code base. For example,

$ cymbal --delimiter ' ' --separator \n
(impl)   ./cymbal/src/ext.rs 10 20  ResultExt
(impl)   ./cymbal/src/ext.rs 31 20  OptionExt
(enum)   ./cymbal/src/config.rs 19 10  Language
(func)   ./cymbal/src/args.rs 82 6  raw_config
(func)   ./cymbal/src/args.rs 140 10  num_threads
(struct) ./cymbal/src/symbol.rs 6 12  Symbol
(func)   ./cymbal/src/text.rs 10 10  new
(struct) ./cymbal/src/cache.rs 16 12  Cache
(struct) ./cymbal/src/cache.rs 22 12  FileInfo
(struct) ./cymbal/src/text.rs 16 12  Span
(impl)   ./cymbal/src/text.rs 9 6  Loc
...

Select a symbol with fzf

cymbal | fzf \
      --delimiter \u200B \
      --read0 \
      --ansi \
      --preview='bat {1} --color always --style=numbers,snip,header --highlight-line {2} --line-range {2}:+100' \
      --reverse \
      --with-nth='{7} {4,5,6}' \
      --nth=2 \
      --with-shell='bash -c' \
      --bind=tab:down \
      --bind=shift-tab:up \
      --bind='ctrl-r:transform-prompt(
          if [ "$FZF_PROMPT" = "full> " ]; then
            echo "> "
          else
            echo "full> "
          fi
        )+transform-nth(
          if [ "$FZF_PROMPT" = "full> " ]; then
            echo "2"
          else
            echo "1.."
          fi
        )' | sed 's/\xE2\x80\x8B/ /g'

This can be used to jump to a symbol in a text editor.
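As a rough sketch of that idea, the selected record can be split on the delimiter and turned into an editor invocation. The field order (path first, line number second) is inferred from the fzf preview above, which passes `{1}` to `bat` as the file and `{2}` as the line to highlight. The `editor_args` helper and the `+LINE` argument convention of vi-style editors are assumptions for illustration, not part of cymbal, and the sketch assumes the trailing `sed` step is dropped so the U+200B delimiters are still present.

```rust
/// Default delimiter between the properties of one symbol (zero-width space).
const DELIMITER: char = '\u{200B}';

/// Hypothetical helper: build `+LINE PATH` arguments for a vi-style editor
/// from one selected record. Assumes field 1 is the path and field 2 is the
/// line number, as in the fzf preview command.
fn editor_args(record: &str) -> Option<Vec<String>> {
    let mut fields = record.split(DELIMITER);
    let path = fields.next().filter(|p| !p.is_empty())?;
    let line: u32 = fields.next()?.trim().parse().ok()?;
    Some(vec![format!("+{line}"), path.to_string()])
}

fn main() {
    // A made-up record in the assumed field order.
    let record = "./src/text.rs\u{200B}16\u{200B}12\u{200B}\u{200B}Span\u{200B}\u{200B}(struct)";
    match editor_args(record) {
        Some(args) => println!("vim {}", args.join(" ")),
        None => eprintln!("could not parse record"),
    }
}
```

Splitting on U+200B rather than on spaces keeps paths that contain spaces intact, which is presumably why the delimiter defaults to a character that rarely appears in file names.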

Usage

Usage: cymbal [OPTIONS] [PATH]

Arguments:
  [PATH]
          [default: .]

Options:
  -c, --config <CONFIG>
          Language configurations.

          This can either be a path to a .toml file or a TOML string.

          The default configuration will be applied if this argument is not provided or if it is the empty string.

      --cache-dir <CACHE_DIR>
          Directory to cache parsed symbols.

          Files are reparsed if their cached mtime differs from their current mtime, or if the path of the file doesn't exist in the cache. This option is typically used when `cymbal` is called from the same directory multiple times, such as when searching over a code base in an editor.

          The directory is created if it does not exist.

  -d, --delimiter <DELIMITER>
          The characters between properties of a single symbol.

          This is the character between the path, location, kind, text, and leading/trailing text written to stdout.

          This defaults to U+200B (zero-width space).

          [default:]

  -s, --separator <SEPARATOR>
          The character between symbols.

          This defaults to U+0000 (null byte).

          [default: ]

      --detached
          Whether to spawn a detached process to index symbols.

          Only useful with the `cache-dir` option.

          If this option is false, the cache may not be created if the process exits prematurely. This can happen when using `cymbal` in a pipeline (such as with `fzf`) and selecting a symbol before indexing is complete.

          If this option is true, indexing is performed by a separate detached process whose output is redirected to stdout. Then, if `cymbal` exits prematurely, the indexing will still be able to complete.

  -t, --threads <THREADS>
          The number of worker threads to use when parsing files.

          This defaults to `std::thread::available_parallelism` if it is available, and otherwise defaults to 8.

  -l, --language <LANGUAGE>
          Only show symbols from files with extensions matching this language.

          This option takes precedence over `--extension`.

          [possible values: c, cpp, go, haskell, odin, python, rust, type-script]

  -e, --extension <EXTENSION>
          Only show symbols from files whose language matches the language of this extension. Note that this does not restrict output to files with this exact extension; it selects every file written in the same language as this extension.

          The `--language` option takes precedence over `--extension`.

  -h, --help
          Print help (see a summary with '-h')

  -V, --version
          Print version
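
Two of the rules in the help text above can be sketched in Rust: the `--cache-dir` reparse rule (reparse when the path is missing from the cache or its mtime differs) and the `--threads` default (available parallelism, falling back to 8). This is a minimal sketch under stated assumptions; the `MtimeCache` type and both function names are hypothetical and do not come from cymbal's source.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::thread;
use std::time::{Duration, SystemTime};

/// Hypothetical cache shape: path -> mtime recorded when the file was parsed.
type MtimeCache = HashMap<PathBuf, SystemTime>;

/// A file is reparsed when it is missing from the cache or its mtime changed.
fn needs_reparse(cache: &MtimeCache, path: &Path, current_mtime: SystemTime) -> bool {
    match cache.get(path) {
        Some(cached) => *cached != current_mtime,
        None => true,
    }
}

/// Worker count: available parallelism when known, otherwise 8.
fn default_threads() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(8)
}

fn main() {
    let parsed_at = SystemTime::UNIX_EPOCH + Duration::from_secs(1_000);
    let mut cache = MtimeCache::new();
    cache.insert(PathBuf::from("src/main.rs"), parsed_at);

    println!("unchanged: {}", needs_reparse(&cache, Path::new("src/main.rs"), parsed_at));
    println!("new file:  {}", needs_reparse(&cache, Path::new("src/lib.rs"), parsed_at));
    println!("threads:   {}", default_threads());
}
```

Comparing mtimes for inequality, rather than checking only for a newer timestamp, also catches files whose mtime moved backwards, for example after a checkout of an older revision.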

Configuration

cymbal is configured with TOML on a per-language basis. See default-config.toml for the default configuration.

Each language has a set of queries for the different kinds of symbols that can be found in that language. For example, for C++:

[cpp]
type = [
  '(type_definition declarator: (type_identifier) @symbol)',
  '(enum_specifier name: (type_identifier) @symbol)',
]
class = [
  '(struct_specifier name: (type_identifier) @symbol body:(_))',
  '(declaration type: (union_specifier name: (type_identifier) @symbol))',
  '(class_specifier name: (type_identifier) @symbol)',
]
function = '(function_declarator declarator: (identifier) @symbol)'
method = [
  { leading = '{scope}::', query = '(function_declarator declarator: (qualified_identifier scope: (_) @scope name: (identifier) @symbol))' },
  { leading = '{scope.1}::{scope.2}::', query = '(function_declarator declarator: (qualified_identifier scope: (_) @scope.1 name: (qualified_identifier scope: (_) @scope.2 name: (identifier) @symbol)))' },
]

Only a fixed set of symbol kinds is valid; see symbol.rs. For each language, each symbol kind can have multiple queries, as with method above. Where a symbol kind needs only a single query, a plain string can be used, as with function above.

Additionally, each query can be garnished with leading and/or trailing text. These are templates that are hydrated using captures from the tree-sitter query, as in the queries for method above.
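
To illustrate what hydrating a template might involve, here is a minimal sketch that substitutes `{name}` placeholders with the text captured under `@name`. The `hydrate` function is hypothetical and is not taken from cymbal's source; in particular, leaving unknown placeholders untouched is an assumption made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical helper: replace each `{name}` placeholder in `template` with
/// the text captured under `@name`. Unknown placeholders are left untouched.
fn hydrate(template: &str, captures: &HashMap<&str, &str>) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..open]);
        match rest[open..].find('}') {
            Some(len) => {
                let name = &rest[open + 1..open + len];
                match captures.get(name) {
                    Some(text) => out.push_str(text),
                    None => out.push_str(&rest[open..=open + len]),
                }
                rest = &rest[open + len + 1..];
            }
            None => {
                // Unterminated placeholder: copy the remainder verbatim.
                out.push_str(&rest[open..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    // Captures as they might come out of the second `method` query for a
    // definition such as `void outer::inner::run() {}`.
    let captures = HashMap::from([("scope.1", "outer"), ("scope.2", "inner"), ("symbol", "run")]);
    let leading = hydrate("{scope.1}::{scope.2}::", &captures);
    println!("{leading}{}", captures["symbol"]);
}
```

With these captures, the leading template `{scope.1}::{scope.2}::` becomes `outer::inner::`, so the symbol would be shown as `outer::inner::run`.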

Dependencies

~98MB
~3M SLoC