618 releases (5 major breaking)

5.0.0 Nov 6, 2024
4.0.0 Nov 1, 2024
3.0.1 Oct 31, 2024
2.0.0 Oct 29, 2024
0.5.4 Nov 26, 2018

#285 in Parser implementations

Download history (downloads per week):
2024-08-04: 43,267
2024-08-11: 48,481
2024-08-18: 55,652
2024-08-25: 61,985
2024-09-01: 51,886
2024-09-08: 59,484
2024-09-15: 45,839
2024-09-22: 62,073
2024-09-29: 65,077
2024-10-06: 49,682
2024-10-13: 61,210
2024-10-20: 55,740
2024-10-27: 61,345
2024-11-03: 54,789
2024-11-10: 82,898
2024-11-17: 81,217

282,313 downloads per month
Used in 313 crates (109 directly)

Apache-2.0

2.5MB
70K SLoC

ECMAScript/TypeScript parser for the Rust programming language.

Features

Heavily tested

Passes almost all tests from tc39/test262.

Error reporting

error: 'implements', 'interface', 'let', 'package', 'private', 'protected',  'public', 'static', or 'yield' cannot be used as an identifier in strict mode
 --> invalid.js:3:10
  |
3 | function yield() {
  |          ^^^^^

Error recovery

The parser can recover from some parsing errors. For example, the parser returns Ok(Module) for the code below while emitting an error to the handler.

const CONST = 9000 % 2;
const enum D {
    // A comma is required, but the parser can recover because of the newline.
    d = 10
    g = CONST
}
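The recovery behavior described above can be illustrated with a toy model. The sketch below is not swc's implementation and does not use swc's API; `RecoveredError` and `parse_enum_body` are hypothetical names introduced only for illustration. It assumes one enum member per line and shows the general pattern: the parse still produces a result, and recoverable problems are collected on the side.

```rust
/// A recoverable error recorded while parsing (hypothetical type, not part of swc).
#[derive(Debug, PartialEq)]
pub struct RecoveredError {
    pub line: usize,
    pub message: String,
}

/// Parses the text between `{` and `}` of an enum, assuming one member per line.
/// Returns the member names together with an error for every missing comma.
pub fn parse_enum_body(body: &str) -> (Vec<String>, Vec<RecoveredError>) {
    // Keep 1-based line numbers, skipping blank lines and `//` comments.
    let lines: Vec<(usize, &str)> = body
        .lines()
        .enumerate()
        .map(|(i, l)| (i + 1, l.trim()))
        .filter(|(_, l)| !l.is_empty() && !l.starts_with("//"))
        .collect();

    let mut members = Vec::new();
    let mut errors = Vec::new();
    for (idx, (line_no, text)) in lines.iter().enumerate() {
        let is_last = idx + 1 == lines.len();
        // A missing comma before another member is an error, but the newline
        // tells us where the member ends, so parsing can continue.
        if !text.ends_with(',') && !is_last {
            errors.push(RecoveredError {
                line: *line_no,
                message: "expected ','".to_string(),
            });
        }
        let name = text
            .trim_end_matches(',')
            .split('=')
            .next()
            .unwrap_or("")
            .trim();
        members.push(name.to_string());
    }
    (members, errors)
}
```

Calling `parse_enum_body` on the body of enum `D` above yields both members, `d` and `g`, plus one recorded error for the line that lacks a comma.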

Example (lexer)

See lexer.rs in the examples directory.

Example (parser)

use swc_common::sync::Lrc;
use swc_common::{
    errors::{ColorConfig, Handler},
    FileName, FilePathMapping, SourceMap,
};
use swc_ecma_parser::{lexer::Lexer, Parser, StringInput, Syntax};

fn main() {
    let cm: Lrc<SourceMap> = Default::default();
    let handler =
        Handler::with_tty_emitter(ColorConfig::Auto, true, false,
        Some(cm.clone()));

    // Real usage
    // let fm = cm
    //     .load_file(Path::new("test.js"))
    //     .expect("failed to load test.js");
    let fm = cm.new_source_file(
        FileName::Custom("test.js".into()).into(),
        "function foo() {}".into(),
    );
    let lexer = Lexer::new(
        // We want to parse ecmascript
        Syntax::Es(Default::default()),
        // EsVersion defaults to es5
        Default::default(),
        StringInput::from(&*fm),
        None,
    );

    let mut parser = Parser::new_from(lexer);

    let result = parser.parse_module();

    // Recoverable errors are collected during parsing, so emit them afterwards.
    for e in parser.take_errors() {
        e.into_diagnostic(&handler).emit();
    }

    let _module = result
        .map_err(|e| {
            // Unrecoverable fatal error occurred
            e.into_diagnostic(&handler).emit()
        })
        .expect("failed to parse module");
}

Cargo features

typescript

Enables the TypeScript parser.

verify

Verifies more errors using swc_ecma_visit.

Known issues

Null character after \

Because Rust's String may contain only valid UTF-8 while JavaScript strings may contain code points that cannot be encoded as valid UTF-8, the parser stores such code points in escaped form.

As a result, swc needs a way to distinguish these code points from input written by the user. The parser stores a null character right after \ for code points that are not valid UTF-8. Note that other parts of swc are aware of this convention.

Note that this behavior may change at any time in a breaking release.
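To make the idea concrete, here is a minimal standard-library sketch of the general technique. It is not swc's code: the function names are hypothetical, and the exact stored layout (a backslash, a null character, then `u` and four hex digits) is an assumption chosen for illustration. The input is a slice of UTF-16 code units, which is how JavaScript models strings.

```rust
/// Illustrative marker: a backslash followed by a null character.
/// The exact layout used by swc is an assumption here.
const LONE_SURROGATE_MARKER: &str = "\\\0";

/// Converts UTF-16 code units into a Rust `String`, writing every unpaired
/// surrogate (which cannot be stored as valid UTF-8) in escaped form.
pub fn escape_lone_surrogates(units: &[u16]) -> String {
    let mut out = String::new();
    for decoded in char::decode_utf16(units.iter().copied()) {
        match decoded {
            // Valid scalar values are stored unchanged.
            Ok(c) => out.push(c),
            // Unpaired surrogates become: marker + 'u' + four hex digits.
            Err(e) => {
                out.push_str(LONE_SURROGATE_MARKER);
                out.push_str(&format!("u{:04X}", e.unpaired_surrogate()));
            }
        }
    }
    out
}

/// Returns true if the stored value contains at least one marked escape.
pub fn has_lone_surrogate(stored: &str) -> bool {
    stored.contains(LONE_SURROGATE_MARKER)
}
```

Under these assumptions, the six characters `\uD800` typed literally by a user are stored unchanged and contain no null character, whereas a real unpaired surrogate is stored with the marker, so later stages can tell the two apart.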

Dependencies

~7–15MB
~174K SLoC