#parse-url #pest-parser #domain #subdomain #port #query #path

bin+lib url_pest_parser

A URL parser using pest for Rust

1 unstable release

new 0.1.0 Nov 6, 2024

#834 in Web programming

MIT license

8KB
134 lines

URL Parser

A Rust-based URL parser built on the pest parsing library. It extracts URL components such as the protocol, domain, subdomain, port, path segments, query parameters, and fragment.

Parsing Logic

The grammar is defined using pest and covers the following components:

  • Protocol: Either "http" or "https".
  • Subdomain: Optional subdomain that appears before the main domain.
  • Domain: The main domain name.
  • Port: Optional port number, appearing after a colon.
  • Path: Segmented by /, representing directories or resources.
  • Query Parameters: Key-value pairs in the query string, separated by &.
  • Fragment: The portion of the URL following a #.
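The path and query rules above can be illustrated with plain string splitting. This is a minimal sketch using only the Rust standard library, not the crate's pest grammar; the helper names `split_path` and `split_query` are hypothetical, and keeping a parameter without `=` as a key with an empty value is an assumption of this sketch.

```rust
/// Splits a URL path such as "/path/to/resource" into its non-empty segments.
fn split_path(path: &str) -> Vec<String> {
    path.split('/')
        .filter(|segment| !segment.is_empty())
        .map(str::to_string)
        .collect()
}

/// Splits a query string such as "a=1&b=2" into (key, value) pairs.
/// Assumption: a parameter without '=' is kept with an empty value.
fn split_query(query: &str) -> Vec<(String, String)> {
    query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| match pair.split_once('=') {
            Some((key, value)) => (key.to_string(), value.to_string()),
            None => (pair.to_string(), String::new()),
        })
        .collect()
}
```

Calling `split_path("/path/to/resource")` gives the three segments `path`, `to`, and `resource`, and `split_query("param1=value1&param2=value2")` gives two key-value pairs.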

How It Works

The parser matches the input URL against the predefined pest grammar and splits it into its components. The results are then collected into a structured ParsedURL object.

Example

protocol://subdomain.domain:port/path/to/resource?param1=value1&param2=value2#section

For instance, https://sub.example.com:8080/path/to/resource?param1=value1&param2=value2#section yields:
   ├── protocol = "https"
   ├── subdomain = "sub"
   ├── domain = "example.com"
   ├── port = 8080
   ├── path = ["path", "to", "resource"]
   ├── query = [("param1", "value1"), ("param2", "value2")]
   └── fragment = "section"
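To make the breakdown above concrete, here is a rough sketch of what a ParsedURL-like result and its construction could look like. It is hand-written with the standard library only and does not use pest, so it is not the crate's implementation. The field names and types are assumptions taken from the example, `parse_url` is a hypothetical function name, and treating the last two host labels as the domain is a simplification made for this sketch.

```rust
/// Hypothetical shape of the parsed result; field names mirror the example.
#[derive(Debug, PartialEq)]
struct ParsedURL {
    protocol: String,
    subdomain: Option<String>,
    domain: String,
    port: Option<u16>,
    path: Vec<String>,
    query: Vec<(String, String)>,
    fragment: Option<String>,
}

/// Splits a URL into components by hand; returns None if the input is not accepted.
fn parse_url(input: &str) -> Option<ParsedURL> {
    // Protocol: only "http" and "https" are accepted.
    let (protocol, rest) = input.split_once("://")?;
    if protocol != "http" && protocol != "https" {
        return None;
    }
    // Fragment: everything after the first '#'.
    let (rest, fragment) = match rest.split_once('#') {
        Some((before, frag)) => (before, Some(frag.to_string())),
        None => (rest, None),
    };
    // Query string: everything after the first '?'.
    let (rest, query_str) = match rest.split_once('?') {
        Some((before, query)) => (before, query),
        None => (rest, ""),
    };
    // Host (with optional port) ends at the first '/'; the remainder is the path.
    let (authority, path_str) = match rest.find('/') {
        Some(index) => rest.split_at(index),
        None => (rest, ""),
    };
    // Port: optional number after a colon; a non-numeric port rejects the URL.
    let (host, port) = match authority.split_once(':') {
        Some((host, port)) => (host, Some(port.parse::<u16>().ok()?)),
        None => (authority, None),
    };
    if host.is_empty() {
        return None;
    }
    // Assumption: the last two labels form the domain, anything before is the subdomain.
    let labels: Vec<&str> = host.split('.').collect();
    let (subdomain, domain) = if labels.len() > 2 {
        let split = labels.len() - 2;
        (Some(labels[..split].join(".")), labels[split..].join("."))
    } else {
        (None, host.to_string())
    };
    // Path segments are separated by '/'; empty segments are dropped.
    let path = path_str
        .split('/')
        .filter(|segment| !segment.is_empty())
        .map(str::to_string)
        .collect();
    // Query parameters are separated by '&' and split into key and value at '='.
    let query = query_str
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| match pair.split_once('=') {
            Some((key, value)) => (key.to_string(), value.to_string()),
            None => (pair.to_string(), String::new()),
        })
        .collect();
    Some(ParsedURL {
        protocol: protocol.to_string(),
        subdomain,
        domain,
        port,
        path,
        query,
        fragment,
    })
}
```

With this sketch, `parse_url("https://sub.example.com:8080/path/to/resource?param1=value1&param2=value2#section")` returns a value whose fields match the breakdown in the example, while an unsupported scheme such as `ftp://example.com` yields `None`.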
