7 releases (breaking)
Version | Released |
---|---|
0.7.0 | Apr 21, 2022 |
0.6.0 | Mar 25, 2022 |
0.5.0 | Dec 29, 2021 |
0.4.0 | Oct 14, 2021 |
0.1.0 | Apr 5, 2020 |
#2806 in Database interfaces
33KB
645 lines
MongoDB Apache Arrow Connector
A Rust library for reading and writing Apache Arrow batches from and to MongoDB.
Licensed under the Apache 2.0 license.
Motivation
We are currently writing this library because we need to read MongoDB data into dataframes.
Features
- Read from a collection to batches
- Write from batches to a collection
- Infer collection schema
- Projection predicate push-down
- Filter predicate push-down
- Data types
  - Primitive types that MongoDB supports
  - List types
  - Nested structs (bson::Document)
  - Arbitrary binary data
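
As a rough illustration of what "Infer collection schema" involves, the sketch below scans sample documents and derives one column type per field. It uses only the standard library: `Value`, `ColumnType`, `Field`, `widen` and `infer_schema` are hypothetical stand-ins, not this crate's API, and the widening rules (integer plus double becomes double, any other mix falls back to text) are assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a few BSON value kinds.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Int(i64),
    Double(f64),
    Text(String),
}

/// Hypothetical stand-in for a column data type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    data_type: ColumnType,
    nullable: bool,
}

type Document = BTreeMap<String, Value>;

/// Assumed widening rule: equal types stay, Int64 + Float64 -> Float64,
/// every other combination falls back to Utf8.
fn widen(a: ColumnType, b: ColumnType) -> ColumnType {
    use ColumnType::*;
    match (a, b) {
        (x, y) if x == y => x,
        (Int64, Float64) | (Float64, Int64) => Float64,
        _ => Utf8,
    }
}

/// Derive one field per key seen in the sample documents.
fn infer_schema(docs: &[Document]) -> BTreeMap<String, Field> {
    // field name -> (type seen so far, number of non-null occurrences)
    let mut seen: BTreeMap<String, (Option<ColumnType>, usize)> = BTreeMap::new();
    for doc in docs {
        for (name, value) in doc {
            let entry = seen.entry(name.clone()).or_insert((None, 0));
            let observed = match value {
                Value::Null => None,
                Value::Int(_) => Some(ColumnType::Int64),
                Value::Double(_) => Some(ColumnType::Float64),
                Value::Text(_) => Some(ColumnType::Utf8),
            };
            if let Some(t) = observed {
                entry.0 = Some(match entry.0 {
                    None => t,
                    Some(prev) => widen(prev, t),
                });
                entry.1 += 1;
            }
        }
    }
    seen.into_iter()
        .map(|(name, (ty, count))| {
            let field = Field {
                // A field that was only ever null defaults to Utf8 (assumption).
                data_type: ty.unwrap_or(ColumnType::Utf8),
                // Missing or null in at least one document means nullable.
                nullable: count < docs.len(),
            };
            (name, field)
        })
        .collect()
}
```

A field that is missing or null in any sampled document is marked nullable, since MongoDB documents in one collection need not share the same keys.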
lib.rs:
MongoDB to Apache Arrow Connector
This crate allows reading and writing MongoDB data in the Apache Arrow format.
Data is read as RecordBatches from a MongoDB database using the aggregation framework.
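
To make the link between the aggregation framework and predicate push-down concrete, here is a minimal sketch that turns an optional filter and a list of wanted fields into `$match` and `$project` stages, so the server does the filtering and column selection before any data is transferred. `build_pipeline` is a hypothetical helper that emits the stages as JSON text using only the standard library. How the crate itself builds its pipeline is not shown in this document.

```rust
/// Hypothetical helper: express a filter and a projection as aggregation
/// pipeline stages, rendered as JSON text for illustration only.
fn build_pipeline(filter: Option<&str>, projection: &[&str]) -> Vec<String> {
    let mut stages = Vec::new();

    // Filter push-down: the predicate becomes a `$match` stage.
    if let Some(predicate) = filter {
        stages.push(format!(r#"{{"$match": {}}}"#, predicate));
    }

    // Projection push-down: only the requested fields are returned.
    if !projection.is_empty() {
        let fields: Vec<String> = projection
            .iter()
            .map(|name| format!(r#""{}": 1"#, name))
            .collect();
        stages.push(format!(r#"{{"$project": {{{}}}}}"#, fields.join(", ")));
    }

    stages
}
```

Placing `$match` before `$project` lets the server discard non-matching documents before it reshapes the remaining ones.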
Apache Arrow RecordBatches are written to MongoDB using an insert_many into a collection.
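
The write path has to turn columnar data back into row-shaped documents before they can be handed to an insert_many call. The sketch below shows that transposition with standard-library types only: `Batch` and `batch_to_documents` are hypothetical stand-ins for a record batch and the conversion step, and omitting null cells from the resulting documents is an assumption of this example, not documented behaviour of the crate.

```rust
/// Hypothetical stand-in for a record batch: named columns of equal length.
struct Batch {
    names: Vec<String>,
    columns: Vec<Vec<Option<i64>>>,
}

/// Build one row document per batch row as (field name, value) pairs.
/// Null cells are left out of the document (assumption).
fn batch_to_documents(batch: &Batch) -> Vec<Vec<(String, i64)>> {
    // All columns are assumed to have the same length as the first one.
    let row_count = batch.columns.first().map_or(0, |column| column.len());

    (0..row_count)
        .map(|row| {
            batch
                .names
                .iter()
                .zip(&batch.columns)
                .filter_map(|(name, column)| column[row].map(|value| (name.clone(), value)))
                .collect()
        })
        .collect()
}
```

The resulting list of row documents is the shape that a bulk insert expects: one entry per row rather than one array per column.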
Dependencies
~28–41MB
~748K SLoC