deprecated sys cld2-sys

Unsafe, low-level wrapper for cld2 language detection library

10 releases (3 stable)

Uses old Rust 2015

1.0.2 Nov 28, 2017
1.0.1 Mar 18, 2017
1.0.0 Nov 6, 2016
0.1.0 Nov 19, 2015
0.0.2 Nov 22, 2014

#23 in #hint

76 downloads per month
Used in 3 crates (2 directly)

Unlicense/Apache-2.0

21MB
401K SLoC

C++ 400K SLoC // 0.0% comments
Rust 1K SLoC // 0.0% comments
Shell 207 SLoC // 0.3% comments

DEPRECATED in favor of whatlang, which is native Rust and smaller. If you have a compelling use-case for this code, please open an issue. Simple PRs, especially for bug fixes, will still be read and possibly merged.

This Rust library detects the language of a string using the cld2 library from the Chromium project.

To use it, add the following lines to your Cargo.toml file and run cargo update:

[dependencies.cld2]
git = "https://github.com/emk/rust-cld2"

Then you can invoke it as follows:

// Put these two lines at the top of the file.
extern crate cld2;
use cld2::{detect_language, Format, Reliable, Lang};

fn main() {
    // A multi-line string literal: the line breaks are part of the text.
    let text = "It is an ancient Mariner,
And he stoppeth one of three.
'By thy long grey beard and glittering eye,
Now wherefore stopp'st thou me?";

    // detect_language returns the detected language (if any) and a reliability flag.
    assert_eq!((Some(Lang("en")), Reliable),
               detect_language(text, Format::Text));
}
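
The call above yields a pair: an optional language and a reliability flag. Calling code will often want to branch on both. The following is a minimal sketch of that pattern, not part of the cld2 API. `Lang` and `Reliability` here are local stand-in types that only mirror the shapes used in the example (the `Unreliable` variant is an assumption), and `language_or_default` is a hypothetical helper, so the block compiles without the crate:

```rust
// Stand-in types for illustration only; in real code these come from the cld2 crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Lang(&'static str);

// Assumed shape of the reliability flag; the variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Reliability {
    Reliable,
    Unreliable,
}

/// Hypothetical helper: return the detected language code, or `default`
/// when no language was found or the guess was flagged as unreliable.
fn language_or_default(
    result: (Option<Lang>, Reliability),
    default: &'static str,
) -> &'static str {
    match result {
        (Some(Lang(code)), Reliability::Reliable) => code,
        _ => default,
    }
}

fn main() {
    let confident = (Some(Lang("en")), Reliability::Reliable);
    let shaky = (Some(Lang("fr")), Reliability::Unreliable);
    let unknown = (None, Reliability::Unreliable);

    println!("{}", language_or_default(confident, "und"));
    println!("{}", language_or_default(shaky, "und"));
    println!("{}", language_or_default(unknown, "und"));
}
```

Treating an unreliable guess the same as no guess is only one possible policy; an application that prefers a weak guess over none could match on `(Some(Lang(code)), _)` instead.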

You can also pass in language detection hints and request more detailed output. For details, please see the API documentation.

Contributing

As always, pull requests are welcome! Please keep any patches as simple as possible and include unit tests; that makes it much easier for me to merge them.

If you want to get the C/C++ code building on another platform, please see cld2-sys/build.rs and this build script guide. You'll probably need to adjust some compiler options. Please don't hesitate to ask questions; I'd love for this library to be cross-platform.

In your first commit message, please include the following statement:

I dedicate any and all copyright interest in my contributions to this project to the public domain. I make this dedication for the benefit of the public at large and to the detriment of my heirs and successors. I intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

This allows us to keep the library legally unencumbered, and free for everyone to use.

License

The original cld2 library is distributed under the Apache License Version 2.0. This also covers much of the code in cld2-sys/src/wrapper.h. All of the new code is released into the public domain as described by the Unlicense.


lib.rs:

Unsafe, low-level wrapper around cld2, the "compact language detector" based on Chromium's code, plus a very thin C wrapper layer. Normally you won't want to use this library directly unless you're writing your own cld2 wrapper library.
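
To make the split between an unsafe binding layer and a safe wrapper concrete, here is a minimal sketch of the general pattern. It does not use the real cld2-sys API: `raw_detect` is a hypothetical stand-in written in Rust so the block runs without the C++ library, and `detect` is an illustrative safe wrapper around it.

```rust
use std::convert::TryFrom;
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Hypothetical stand-in for a raw binding. A real -sys crate would declare
/// such a function in an `extern "C"` block and link against the C/C++ code.
/// This mock ignores the text and reports "en" for any non-empty input.
unsafe extern "C" fn raw_detect(_text: *const c_char, len: c_int) -> *const c_char {
    if len > 0 {
        // A NUL-terminated static string, as a C API would typically return.
        b"en\0".as_ptr() as *const c_char
    } else {
        std::ptr::null()
    }
}

/// Illustrative safe wrapper: checks the length, calls the raw function,
/// and converts the returned C string into an owned Rust `String`.
fn detect(text: &str) -> Option<String> {
    // Refuse inputs whose length does not fit in a C int.
    let len = c_int::try_from(text.len()).ok()?;

    // SAFETY: the pointer and length describe a live `&str` for the whole call.
    let ptr = unsafe { raw_detect(text.as_ptr() as *const c_char, len) };
    if ptr.is_null() {
        return None;
    }

    // SAFETY: the mock returns either null (handled above) or a pointer to a
    // NUL-terminated string with static lifetime.
    let code = unsafe { CStr::from_ptr(ptr) };
    code.to_str().ok().map(str::to_owned)
}

fn main() {
    println!("{:?}", detect("It is an ancient Mariner"));
    println!("{:?}", detect(""));
}
```

The wrapper keeps every `unsafe` block small and documents the invariant each one relies on, so callers of `detect` never touch raw pointers.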

If you need access to APIs which are not currently wrapped, please feel free to send pull requests!

Dependencies