#comments #line #extractor #source #multi-language #marker #block

bin+lib rusty-todo-md

A multi-language TODO comment extractor for source code files

11 releases (1 stable)

new 1.0.0 Mar 9, 2025
0.2.0 Mar 9, 2025
0.1.8-alpha.11 Mar 7, 2025
0.1.8-alpha.7 Mar 6, 2025
0.1.8-alpha.5 Mar 4, 2025

#4 in #extractor

Download history 881/week @ 2025-03-02

881 downloads per month

MIT license

62KB
1K SLoC

TODO Extractor ๐Ÿš€

A multi-language TODO comment extractor for source code files. This tool parses Python, Rust, JavaScript, TypeScript, and Go files to extract TODO comments, including single-line and properly formatted multi-line TODOs.


โœจ Features

  • ๐Ÿ“Œ Supports multiple programming languages: Python, Rust, JavaScript, TypeScript, and Go.
  • ๐Ÿ“ Extracts TODO comments from both single-line (// TODO:) and block (/* TODO: */) comments.
  • ๐Ÿ”„ Handles multi-line TODOs by merging only indented or structured comment blocks.
  • ๐Ÿš€ Fast and efficient using the Pest parser.
  • ๐Ÿ“ Provides accurate line numbers for each extracted TODO.

๐Ÿ“ฆ Installation

Clone the repository and build the project using Cargo:

# Clone the repository
git clone https://github.com/your-repo/todo-extractor.git
cd todo-extractor

# Build the project
cargo build --release

# Run the executable
./target/release/todo-extractor path/to/your/file.rs

๐Ÿš€ Usage

Command Line

Run the extractor by providing a path to a source file:

./todo-extractor path/to/source_file.rs

Example Output

Found 2 TODOs:
3 - Refactor this function
10 - Handle edge cases properly

Example: Python File (test.py)

# TODO: Implement feature X
def function():
    """
    TODO: Improve performance
        by reducing loops
    """
    pass

Extracted TODOs:

Found 2 TODOs:
1 - Implement feature X
3 - Improve performance by reducing loops

๐Ÿ” How It Works

1. Detects File Type

The parser checks the file extension (.py, .rs, .js, .ts, .go) and selects the appropriate parser.

2. Extracts Comment Lines

Using Pest grammars, the extractor retrieves only comment lines, while ignoring code and string literals.

3. Identifies TODO Markers

The extractor scans comment lines for the TODO: marker and captures the message.

4. Handles Multi-line TODOs Properly

If a TODO is followed by indented lines, they are merged into a single entry.
However, unrelated lines are not merged.


๐Ÿ› ๏ธ Supported Languages

Language Single-line Syntax Block Syntax
Python # TODO: ... """ TODO: ... """
Rust // TODO: ... /* TODO: ... */
JavaScript // TODO: ... /* TODO: ... */
TypeScript // TODO: ... /* TODO: ... */
Go // TODO: ... /* TODO: ... */

๐Ÿ“ Handling Multi-line TODOs Correctly

A TODO: comment may span multiple lines, but only if:

  1. The next line is indented (with spaces or tabs).
  2. The next line is part of the same comment block.

โœ… Example: Correct Multi-line TODO

The second line is indented, so it is merged into the TODO.

/// TODO: Fix the parser
///     The tokenizer needs improvement
fn foo() {}

โœ”๏ธ Extracted TODO:

Found 1 TODO:
1 - Fix the parser The tokenizer needs improvement

For block comments:

/*
   TODO: Refactor this function
       Ensure performance is optimized
*/

โœ”๏ธ Extracted TODO:

Found 1 TODO:
2 - Refactor this function Ensure performance is optimized

โŒ Example: Incorrect Multi-line TODO

The second line is not indented, so it should NOT be merged.

/// TODO: Fix the parser
/// The tokenizer needs improvement
fn foo() {}

โŒ Extracted TODOs:

Found 1 TODO:
1 - Fix the parser

For block comments:

/*
   TODO: Refactor this function
   This line should NOT be merged
*/

โŒ Extracted TODOs:

Found 1 TODO:
2 - Refactor this function

๐Ÿงช Running Tests

Run tests to validate TODO extraction:

cargo test

Example Test Case for Rust Files

#[test]
fn test_rust_single_line() {
    let src = "// TODO: Refactor this code";
    let todos = extract_todos(Path::new("example.rs"), src);
    assert_eq!(todos.len(), 1);
    assert_eq!(todos[0].message, "Refactor this code");
}

Example Test Case for Multi-line TODO (Indented)

#[test]
fn test_rust_multiline_todo() {
    let src = r#"
/// TODO: Improve logging
///     Add detailed debug info
fn log() {}"#;

    let todos = extract_todos(Path::new("example.rs"), src);
    assert_eq!(todos.len(), 1);
    assert_eq!(todos[0].message, "Improve logging Add detailed debug info");
}

๐Ÿค Contributing

Want to add support for another language? Contributions are welcome! Feel free to open an issue or submit a pull request.

Dependencies

~22โ€“34MB
~643K SLoC