1 unstable release: 0.1.0 (Nov 13, 2024)
#1590 in Parser implementations
11KB
115 lines
date_format_parser
date_format_parser parses date and date-time strings and converts them to ISO 8601 format. It supports three date field orders, an optional time component, and three interchangeable separators.
Features
- Parse dates and date-times in multiple formats:
- Year-Month-Day (e.g., 2000-12-20).
- Day-Month-Year (e.g., 20.12.2000).
- Month-Day-Year (e.g., 12/20/2000).
- Support for optional time components:
- Time in 24-hour format (e.g., 14:30:45.123).
- Time with optional seconds and milliseconds.
- Handles common separators:
- Dashes (-), slashes (/), and dots (.).
- Outputs parsed results in ISO 8601 format:
- Dates: YYYY-MM-DD
- Date-times: YYYY-MM-DDTHH:MM:SS[.sss]
- User-friendly CLI interface with commands for:
- Parsing input.
- Displaying help information.
- Showing credits.
Usage
Running the CLI
- Help:
cargo run -- <input>
Displays usage information.
- Credits:
cargo run -- --credits
Shows project credits.
- Parsing a Date or Date-Time:
cargo run -- "2000/12/20 14:30:45.123"
Parses the given input and outputs the result in ISO 8601 format:
"2000-12-20T14:30:45.123"
Supported Input Formats
Date Formats
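- Year-Month-Day:
- Example: 2000-12-20
- Typical separator: "-"
- Day-Month-Year:
- Example: 20.12.2000
- Typical separator: "."
- Month-Day-Year:
- Example: 12/20/2000
- Typical separator: "/"
The separators listed are conventions only: the grammar accepts any of "-", "/", or "." with any field order, so 2000/12/20 is also a valid Year-Month-Day input.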
Time Formats
- 24-hour time:
- HH:MM:SS[.sss]
- Example: 14:30:45.123
- Optional components:
- Without seconds: HH:MM
- Without milliseconds: HH:MM:SS
Combined Date-Time
A date can be followed by a time, separated by a single space:
- Example: 2000-12-20 14:30:45.123
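The time forms listed above can be illustrated with a small standard-library sketch. This is not the crate's code (the crate uses a Pest grammar); `TimeParts`, `fixed_digits`, and `parse_time` are hypothetical names introduced here, and the sketch assumes the whole string must be a time.

```rust
/// Time components; seconds and milliseconds are optional.
#[derive(Debug, PartialEq)]
struct TimeParts {
    hour: u8,
    minute: u8,
    second: Option<u8>,
    millisecond: Option<u16>,
}

/// Parses exactly `width` ASCII digits and rejects values above `max`.
fn fixed_digits(s: &str, width: usize, max: u16) -> Option<u16> {
    if s.len() != width || !s.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let value: u16 = s.parse().ok()?;
    if value <= max { Some(value) } else { None }
}

/// Hypothetical helper: accepts HH:MM, HH:MM:SS, or HH:MM:SS.sss.
fn parse_time(input: &str) -> Option<TimeParts> {
    // Split off the optional three-digit millisecond suffix first.
    let (clock, millisecond) = match input.split_once('.') {
        Some((clock, ms)) => (clock, Some(fixed_digits(ms, 3, 999)?)),
        None => (input, None),
    };
    let fields: Vec<&str> = clock.split(':').collect();
    let (hour, minute, second) = match fields.as_slice() {
        // Milliseconds are only allowed when seconds are present.
        [h, m] if millisecond.is_none() => (*h, *m, None),
        [h, m, s] => (*h, *m, Some(fixed_digits(s, 2, 59)? as u8)),
        _ => return None,
    };
    Some(TimeParts {
        hour: fixed_digits(hour, 2, 23)? as u8,
        minute: fixed_digits(minute, 2, 59)? as u8,
        second,
        millisecond,
    })
}
```

Under these assumptions, `parse_time("14:30")` yields hour 14 and minute 30 with no seconds, while `parse_time("24:00")` and `parse_time("14:30.123")` are rejected.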
Parsing Process (Technical Description)
Parsing is driven by a Pest grammar defined in grammar.pest. The tool processes input as follows:
Input Validation
The input string is validated against a custom grammar written for the Pest parser library. The grammar supports:
- Date components (year, month, day).
- Time components (hour, minute, second, millisecond).
- Flexible separators (-, /, .).
Grammar
The grammar uses rules to define valid date and time formats:
date_time = { date ~ (space ~ time)? }
date = { year_month_day | day_month_year | month_day_year }
year_month_day = { year ~ separator ~ month ~ separator ~ day }
day_month_year = { day ~ separator ~ month ~ separator ~ year }
month_day_year = { month ~ separator ~ day ~ separator ~ year }
year = { '1'..'9' ~ ASCII_DIGIT ~ ASCII_DIGIT ~ ASCII_DIGIT }
month = { ("0" ~ '1'..'9') | ("1" ~ '0'..'2') }
day = { ("0" ~ '1'..'9') | ('1'..'2' ~ ASCII_DIGIT) | ("3" ~ '0'..'1') }
time = { hour_24 ~ ":" ~ minute ~ (":" ~ second ~ ("." ~ millisecond)?)? }
hour_24 = { "0" ~ ASCII_DIGIT | "1" ~ ASCII_DIGIT | "2" ~ ('0'..'3') }
minute = { ('0'..'5' ~ ASCII_DIGIT) }
second = { ('0'..'5' ~ ASCII_DIGIT) }
millisecond = { ASCII_DIGIT ~ ASCII_DIGIT ~ ASCII_DIGIT }
separator = { "-" | "/" | "." }
space = { " " }
- Top-Level Rule (date_time):
- Parses a date.
- Optionally parses time if preceded by a single space.
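- Date Formats:
- year_month_day: Matches year, month, day order (e.g., YYYY-MM-DD).
- day_month_year: Matches day, month, year order (e.g., DD.MM.YYYY).
- month_day_year: Matches month, day, year order (e.g., MM/DD/YYYY).
- Each rule accepts any of the three separators.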
- Time Formats:
- Matches time in 24-hour format.
- Handles optional seconds and milliseconds.
- Separators and Spaces:
- Dashes (-), slashes (/), and dots (.) separate date fields.
- A single space separates the date from the time.
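To make the rule order concrete, here is a standard-library sketch that mirrors the `date` rule by hand. It is an illustration, not the crate's Pest-based implementation; `Order`, `in_range`, and `classify_date` are hypothetical names, and the sketch assumes the whole string must be a date. Alternatives are tried in the same order as in the grammar (Pest choice is ordered), so an input that fits more than one order resolves to the first that matches.

```rust
/// Field order that matched, named after the grammar rules.
#[derive(Debug, PartialEq)]
enum Order {
    YearMonthDay,
    DayMonthYear,
    MonthDayYear,
}

/// Mirrors `year`: four digits, the first one in 1..=9.
fn is_year(s: &str) -> bool {
    s.len() == 4 && s.bytes().all(|b| b.is_ascii_digit()) && !s.starts_with('0')
}

/// Two ASCII digits whose value lies in lo..=hi.
fn in_range(s: &str, lo: u8, hi: u8) -> bool {
    s.len() == 2
        && s.bytes().all(|b| b.is_ascii_digit())
        && s.parse::<u8>().map_or(false, |v| (lo..=hi).contains(&v))
}

/// Mirrors `month`: 01 through 12.
fn is_month(s: &str) -> bool {
    in_range(s, 1, 12)
}

/// Mirrors `day`: 01 through 31, with no per-month check.
fn is_day(s: &str) -> bool {
    in_range(s, 1, 31)
}

/// Returns the matched order and the (year, month, day) fields.
fn classify_date(input: &str) -> Option<(Order, &str, &str, &str)> {
    let fields: Vec<&str> = input
        .split(|c: char| matches!(c, '-' | '/' | '.'))
        .collect();
    match fields.as_slice() {
        [a, b, c] => {
            let (a, b, c) = (*a, *b, *c);
            // Same order as `year_month_day | day_month_year | month_day_year`.
            if is_year(a) && is_month(b) && is_day(c) {
                Some((Order::YearMonthDay, a, b, c))
            } else if is_day(a) && is_month(b) && is_year(c) {
                Some((Order::DayMonthYear, c, b, a))
            } else if is_month(a) && is_day(b) && is_year(c) {
                Some((Order::MonthDayYear, c, a, b))
            } else {
                None
            }
        }
        _ => None,
    }
}
```

Two consequences follow from the grammar as written. An ambiguous input such as 05/06/2000 is read as day-month-year (5 June), because that alternative is tried before month-day-year. And because `day` only constrains the digits, a date such as 31.02.2000 passes the grammar even though it is not a real calendar date.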
Parsing Logic
- The input string is parsed into components (date and time if present) using the Pest library.
- The components are validated against the grammar and extracted:
- Year, month, and day for dates.
- Hour, minute, second, and millisecond for times.
- The components are assembled into an ISO 8601 formatted string.
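The final assembly step can be sketched as plain string formatting. The `DateTimeParts` struct and `to_iso8601` function are hypothetical names for illustration, not the crate's API. The sketch also assumes that a time given without seconds is emitted with `:00`, which matches the documented output shape YYYY-MM-DDTHH:MM:SS[.sss] but is not confirmed by the source.

```rust
/// Already-extracted components (hypothetical layout, not the crate's types).
struct DateTimeParts {
    year: u16,
    month: u8,
    day: u8,
    /// (hour, minute, optional second, optional millisecond)
    time: Option<(u8, u8, Option<u8>, Option<u16>)>,
}

/// Formats the components as an ISO 8601 date or date-time string.
fn to_iso8601(parts: &DateTimeParts) -> String {
    let date = format!("{:04}-{:02}-{:02}", parts.year, parts.month, parts.day);
    match parts.time {
        None => date,
        Some((hour, minute, second, millisecond)) => {
            // Assumption: missing seconds are written as 00.
            let mut out = format!(
                "{}T{:02}:{:02}:{:02}",
                date,
                hour,
                minute,
                second.unwrap_or(0)
            );
            // The fractional part is appended only when milliseconds were given.
            if let Some(ms) = millisecond {
                out.push_str(&format!(".{:03}", ms));
            }
            out
        }
    }
}
```

With year 2000, month 12, day 20 and time (14, 30, Some(45), Some(123)), this produces 2000-12-20T14:30:45.123, the same string shown in the CLI example.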
Use Cases
- Standardizing Dates and Times:
- Convert diverse date-time formats into a unified ISO 8601 format.
- Data Cleaning:
- Pre-process date-time data from files for further analysis.
- Log Parsing:
- Normalize timestamps in application logs for consistent formatting.
License
This project is licensed under the MIT License.
Dependencies
~2.2–3MB
~59K SLoC