3 releases
Version | Released |
---|---|
0.1.17 | May 23, 2024 |
0.1.15 | May 22, 2024 |
0.1.10 | May 20, 2024 |
🚀 Getting Started
To add chunkr to your project and start chunking, use the Cargo CLI:
cargo add chunkr
To check out the code and build it yourself
Clone the repository and run one of the examples from the examples directory.
git clone https://github.com/d1pankarmedhi/chunkr.git
cd chunkr
🏗️ Examples
Check out these examples to quickly get started:
Chunking
These are some chunking strategy examples:
- Chunking by words - Chunk your documents/texts by number of words.
- Chunking by characters - Chunk your documents/texts by number of characters.
- Chunk PDF document - Chunk your PDF documents by words/characters.
Run them with cargo run, for example:
cargo run --example chunk_by_words 5 2 "hello there how are you I am fine thank you"
# ["hello there how are you", "are you I am fine", "am fine thank you"]
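Judging from the sample output, the two numeric arguments appear to be the chunk size (5 words) and the overlap between consecutive chunks (2 words). The sketch below illustrates that sliding-window idea for both words and characters. It is not chunkr's actual implementation or API: the function names `chunk_by_words` and `chunk_by_chars`, the helper `windows_with_overlap`, and their signatures are hypothetical and chosen only for illustration.

```rust
/// Hypothetical helper (not part of chunkr): returns the `(start, end)` index
/// ranges of windows of at most `size` items over `len` items, where
/// consecutive windows share `overlap` items.
fn windows_with_overlap(len: usize, size: usize, overlap: usize) -> Vec<(usize, usize)> {
    assert!(size > 0 && overlap < size, "require 0 <= overlap < size");
    let step = size - overlap; // how far each window advances
    let mut ranges = Vec::new();
    let mut start = 0;
    while start < len {
        let end = (start + size).min(len);
        ranges.push((start, end));
        if end == len {
            break; // the last window reached the end of the input
        }
        start += step;
    }
    ranges
}

/// Hypothetical word chunker: splits on whitespace and joins each window
/// with single spaces.
fn chunk_by_words(text: &str, size: usize, overlap: usize) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    windows_with_overlap(words.len(), size, overlap)
        .into_iter()
        .map(|(s, e)| words[s..e].join(" "))
        .collect()
}

/// Hypothetical character chunker: works on `char`s rather than bytes so
/// multi-byte UTF-8 characters are never split.
fn chunk_by_chars(text: &str, size: usize, overlap: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    windows_with_overlap(chars.len(), size, overlap)
        .into_iter()
        .map(|(s, e)| chars[s..e].iter().collect())
        .collect()
}
```

Called with a size of 5 and an overlap of 2 on the ten-word sample sentence, `chunk_by_words` yields three chunks, matching the output shown above. Overlapping windows keep some context from the previous chunk, which tends to help when chunks are later embedded or searched independently.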
💡 Contributing
As an open-source project, we welcome all kinds of contributions, whether code, documentation, issues, bug reports, or feature suggestions.
Feel free to check out the Contribution guide for more details.
📝 License
This project is licensed under the MIT License. See the LICENSE.md file for details.