wikidot-normalize

Simple library to provide Wikidot-compatible string normalization

19 releases (11 breaking)

0.12.0 Sep 29, 2023
0.11.0 Apr 16, 2023
0.10.0 May 11, 2022
0.9.2 Sep 9, 2021
0.3.0 Oct 20, 2019

#484 in Text processing

5,950 downloads per month
Used in 4 crates (3 directly)

MIT license

19KB
299 lines

Simple library to provide Wikidot-compatible string normalization. It is a Rust port of the functionality in WDStringUtils::toUnixName.

Wikidot normal form is used in the site's page names. Essentially it ensures the following:

  • All ASCII is lowercase.
  • All characters other than :, a-z, 0-9, and - are replaced with dashes.
  • An underscore is permitted only as the first character.
  • Any leading or trailing dashes are removed.
  • Any run of multiple dashes is replaced with a single dash.
  • Any run of multiple colons is replaced with a single colon.

Examples:

  • "Big Cheese Horace" -> "big-cheese-horace"
  • "bottom--Text" -> "bottom-text"
  • "Tufto's Proposal" -> "tufto-s-proposal"
  • "-test-" -> "test"
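
To make the rules concrete, here is a minimal sketch that applies only the rules listed in this README. The function name `normalize_sketch` is hypothetical and is not part of this crate's API. The real implementation may handle additional cases (for example, Unicode input or dashes next to colons) differently.

```rust
/// Hypothetical helper (not the crate's API): applies the rules listed above.
fn normalize_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());

    for (i, ch) in input.chars().enumerate() {
        // Rule: all ASCII is lowercase.
        let ch = ch.to_ascii_lowercase();

        // Rule: keep ':', a-z, 0-9, '-', and an underscore in first position;
        // everything else becomes a dash.
        let mapped = match ch {
            'a'..='z' | '0'..='9' | ':' | '-' => ch,
            '_' if i == 0 => ch,
            _ => '-',
        };

        // Rule: collapse runs of dashes or colons into a single character.
        if (mapped == '-' || mapped == ':') && out.ends_with(mapped) {
            continue;
        }

        out.push(mapped);
    }

    // Rule: strip leading and trailing dashes.
    out.trim_matches('-').to_string()
}
```

Called on the examples above, `normalize_sketch("Tufto's Proposal")` yields `"tufto-s-proposal"`: the apostrophe and the space each become a dash, and neither dash is adjacent to another, so nothing is collapsed.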

This library is nearing finalization as a v1.0.0 release.

Available under the terms of the MIT License. See LICENSE.md.

Compilation

This library targets the latest stable Rust. At the time of writing, that is 1.68.2.

$ cargo build --release

Testing

$ cargo test

Add -- --nocapture to the end if you want to see test output.

Dependencies

~3–4MB
~88K SLoC