9 releases (4 breaking)
0.4.0 | Jan 21, 2025 |
---|---|
0.3.0 | Sep 23, 2024 |
0.2.0 | Sep 20, 2024 |
0.1.0 | Aug 26, 2024 |
0.0.4 | Aug 23, 2024 |
#242 in Text processing
140 downloads per month
Used in office-convert-server
60KB
1K
SLoC
Contains (Cab file, 9KB) tests/samples/sample-xlsx-encrypted.xlsx, (Cab file, 9KB) tests/samples/sample-docx-encrypted.docx
LibreOfficeKit
Rust library providing safe access to the LibreOfficeSDK (LOK)
This library provides functionality for:
- Converting documents between various office and non-office formats (docx, xlsx, odt, ...etc into PDF and other supported formats see File Conversion Filter Names)
- Cryptographically signing documents
- Obtaining available file filters from LibreOffice
- Obtaining LibreOffice version information
- Executing document macros
- Determine document type
This library does not link to the LibreOfficeKit C++ headers like other implementations, I have move that code into the Rust implementation so a C++ build toolchain is not required to build this library
For examples of how to use the library in a real life setting check out the Office convert server which is a real instance of this library being used in production.
LibreOffice Support
Tested against Libreoffice versions 6.4.7.2 and 24.8.4.2 should be compatible with versions supported by the standard LOK C++ library.
Certain functions are only available in certain LibreOffice versions, calling these on a unsupported install will return OfficeError::MissingFunction
as the error.
You can also use Office::get_version_info
which will provide a OfficeVersionInfo
structure which contains a product_version
field with helper functions such as is_free_error_available
which tells you whether a specific function should be available for that version
[!IMPORTANT] LibreOffice has some broken behavior in some versions where some process end cleanup logic causes a segmentation fault when the program exists. This issue has been fixed in the latest release.
However LibreOffice will cause a segmentation fault if you try and create an instance after already destroying one so its recommenced you maintain an instance for the life of the program
Versions that are not affected by this LibreOffice bug are versions 6.x with the latest being 6.4.7.2, any versions newer than this seem to be affected by this bug. This bug is present in all versions newer than 6.x including latest (25.2.0.0.alpha0+ as at 1 Sep 2024)
You can find downloads to 6.4.7.2 on the download archives https://downloadarchive.documentfoundation.org/libreoffice/old/6.4.7.2/ but its recommended you use the latest version instead in most cases.
Windows Support
This library can be run and compiled on Windows. However, the Office::find_install_path()
will only find valid paths on Linux, for Windows you will need to manually specify the path to your LibreOffice installation
As of version
0.4.0
you can specify theLOK_PROGRAM_PATH
environment variable pointing it towards the "program" folder in your LibreOffice installation path and this will be used byOffice::find_install_path()
if the path is valid
Converting a file
To convert an office file format (docx, xlsx, odt, ...etc) into a PDF you can use the following code:
let office = Office::new(Office::find_install_path().unwrap()).unwrap();
let input_url = DocUrl::from_relative_path("./tests/samples/sample-docx.docx").unwrap();
let output_url = DocUrl::from_absolute_path("/tmp/test.pdf").unwrap();
let mut document = office.document_load(&input_url).unwrap();
let success = document.save_as(&output_url, "pdf", None).unwrap();
if !success {
// ...Document conversion failed
}
// ...Do something with the file at output_url
[!NOTE]
You can find the full supported list of conversion formats on the LibreOffice website Here
Loading a password protected file
You can load password protected office documents using the code below:
let office = Office::new(Office::find_install_path().unwrap()).unwrap();
let input_url =
DocUrl::from_relative_path("./tests/samples/sample-docx-encrypted.docx").unwrap();
let needs_password = Rc::new(AtomicBool::new(false));
// Allow password requests
office
.set_optional_features(OfficeOptionalFeatures::DOCUMENT_PASSWORD)
.unwrap();
office
.register_callback({
// Copies of local variables to include in the callback
let needs_password = needs_password.clone();
let input_url = input_url.clone();
// Callback itself
move |office, ty, _| {
if let CallbackType::DocumentPassword = ty {
// Password was requested
if needs_password.swap(true, Ordering::SeqCst) {
// Password we provided was incorrect, you must clear the password to prevent infinite callback loop
// the callback will be called until the correct password (Or None) is provided
office.set_document_password(&input_url, None).unwrap();
return;
}
// Provide the password
office
.set_document_password(&input_url, Some("password"))
.unwrap();
}
}
})
.unwrap();
// Document loads
let document = office.document_load(&input_url).unwrap();
// Check Password was requested
assert!(needs_password.load(Ordering::SeqCst));
// ...Do something with document
[!IMPORTANT]
Ensure you always specify a
None
password on failure or if you don't have a password (and have specified theOfficeOptionalFeatures::DOCUMENT_PASSWORD
optional feature) LibreOffice will continue to block thedocument_load
call and repeatedly invoke the callback until either the correct password is given orNone
is provided
Freeing memory
LibreOffice will accumulate buffers over time as you convert/load documents, if you are using LOK in a long running process you will want to use the Office::trim_memory
function to free some of that memory:
let office = Office::new(Office::find_install_path().unwrap()).unwrap();
// ... Do some document loading and conversion
office.trim_memory(2000).unwrap();
[!NOTE] Negative number provided to
trim_memory
tells LibreOffice to re-fill its memory cachesLarge positive number (>=1000) encourages immediate maximum memory saving.
Credits
The original implementation of this library was based upon https://github.com/undeflife/libreoffice-rs aiming to be more complex and cover more use cases, to use as a backend for an Office file format conversion server.
Dependencies
~2.9–9MB
~81K SLoC