52 stable releases (4 major)

| Version | Released |
|---|---|
| 4.2.0 (new) | Nov 23, 2024 |
| 4.1.0 | Sep 29, 2024 |
| 3.14.1 | Aug 5, 2024 |
| 3.12.0 | Jul 30, 2024 |
| 0.6.0 | Oct 2, 2023 |
🍕 Features
- Supports synchronous usage. No dependency on Tokio.
- Uses @pykeio/ort for performant ONNX inference.
- Uses @huggingface/tokenizers for fast encodings.
- Supports batch embeddings generation with parallelism using @rayon-rs/rayon.
The default model is Flag Embedding (BAAI/bge-small-en-v1.5), part of the BGE family that has topped the MTEB leaderboard.
🔍 Not looking for Rust?
- Python 🐍: fastembed
- Go 🐳: fastembed-go
- JavaScript 🌐: fastembed-js
🤖 Models
Text Embedding
- BAAI/bge-small-en-v1.5 - Default
- sentence-transformers/all-MiniLM-L6-v2
- mixedbread-ai/mxbai-embed-large-v1
- Qdrant/clip-ViT-B-32-text - pairs with the image model clip-ViT-B-32-vision for image-to-text search
- BAAI/bge-large-en-v1.5
- BAAI/bge-small-zh-v1.5
- BAAI/bge-base-en-v1.5
- sentence-transformers/all-MiniLM-L12-v2
- sentence-transformers/paraphrase-MiniLM-L12-v2
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2
- nomic-ai/nomic-embed-text-v1
- nomic-ai/nomic-embed-text-v1.5
- intfloat/multilingual-e5-small
- intfloat/multilingual-e5-base
- intfloat/multilingual-e5-large
- Alibaba-NLP/gte-base-en-v1.5
- Alibaba-NLP/gte-large-en-v1.5
Sparse Text Embedding
- prithivida/Splade_PP_en_v1 - Default
Image Embedding
- Qdrant/clip-ViT-B-32-vision - Default
- Qdrant/resnet50-onnx
- Qdrant/Unicom-ViT-B-16
- Qdrant/Unicom-ViT-B-32
Reranking
- BAAI/bge-reranker-base
- BAAI/bge-reranker-v2-m3
- jinaai/jina-reranker-v1-turbo-en
- jinaai/jina-reranker-v2-base-multilingual
🚀 Installation
Run the following command in your project directory:
cargo add fastembed
Or add the following line to your Cargo.toml:
[dependencies]
fastembed = "4"
📖 Usage
Text Embeddings
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};
// With default InitOptions
let model = TextEmbedding::try_new(Default::default())?;
// With custom InitOptions
let model = TextEmbedding::try_new(
    InitOptions::new(EmbeddingModel::AllMiniLML6V2).with_show_download_progress(true),
)?;
let documents = vec![
    "passage: Hello, World!",
    "query: Hello, World!",
    "passage: This is an example passage.",
    // You can leave out the prefix but it's recommended
    "fastembed-rs is licensed under Apache 2.0",
];
// Generate embeddings with the default batch size, 256
let embeddings = model.embed(documents, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 4
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 384
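Embeddings are usually compared rather than inspected directly. The following is a minimal sketch of cosine similarity over plain `f32` slices, assuming each embedding can be borrowed as `&[f32]`. The helper names `cosine_similarity` and `most_similar` are illustrative and not part of fastembed.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns None when the lengths differ or either vector has zero norm.
/// (Illustrative helper, not part of the fastembed API.)
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Index of the candidate most similar to `query`.
/// Candidates that cannot be compared (wrong length, zero norm) are skipped.
fn most_similar(query: &[f32], candidates: &[Vec<f32>]) -> Option<usize> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_similarity(query, c).map(|s| (i, s)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(i, _)| i)
}
```

With the `embeddings` from the example above, a call such as `cosine_similarity(&embeddings[0], &embeddings[1])` would compare the first two documents, assuming the embeddings deref to `f32` slices.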
Image Embeddings
use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};
// With default ImageInitOptions
let model = ImageEmbedding::try_new(Default::default())?;
// With custom ImageInitOptions
let model = ImageEmbedding::try_new(
    ImageInitOptions::new(ImageEmbeddingModel::ClipVitB32).with_show_download_progress(true),
)?;
let images = vec!["assets/image_0.png", "assets/image_1.png"];
// Generate embeddings with the default batch size, 256
let embeddings = model.embed(images, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> Embeddings length: 2
println!("Embedding dimension: {}", embeddings[0].len()); // -> Embedding dimension: 512
Candidate Reranking
use fastembed::{TextRerank, RerankInitOptions, RerankerModel};
let model = TextRerank::try_new(
    RerankInitOptions::new(RerankerModel::BGERerankerBase).with_show_download_progress(true),
)?;
let documents = vec![
    "hi",
    "The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear, is a bear species endemic to China.",
    "panda is animal",
    "i dont know",
    "kind of mammal",
];
// Rerank with the default batch size
let results = model.rerank("what is panda?", documents, true, None)?;
println!("Rerank result: {:?}", results);
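A common next step is to keep only the best few candidates. The sketch below shows one way to do that, assuming each result carries the index of the original document and a relevance score where higher means more relevant. `Scored` is a local stand-in type and `top_k` a hypothetical helper; check the crate documentation for the actual fields of the rerank result type.

```rust
/// Local stand-in for one reranked candidate (hypothetical, not the crate's type).
#[derive(Debug, Clone, PartialEq)]
struct Scored {
    index: usize, // position of the document in the original input
    score: f32,   // relevance score; assumed higher = more relevant
}

/// Returns up to `k` documents in descending score order, paired with their scores.
/// Entries whose index is out of range for `documents` are skipped.
fn top_k<'a>(documents: &[&'a str], mut scored: Vec<Scored>, k: usize) -> Vec<(&'a str, f32)> {
    scored.sort_by(|a, b| b.score.total_cmp(&a.score));
    scored
        .into_iter()
        .filter_map(|s| documents.get(s.index).map(|d| (*d, s.score)))
        .take(k)
        .collect()
}
```

Sorting inside the helper keeps it correct even if the input is not already ordered by score.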
Alternatively, local model files can be used for inference via the try_new_from_user_defined(...) methods of the respective structs.
✊ Support
To support the library, please consider donating to our primary upstream dependency, ort, the Rust wrapper for the ONNX Runtime.
⚙️ Under the hood
It's important we justify the "fast" in FastEmbed. FastEmbed is fast because of:
- Quantized model weights.
- ONNX Runtime, which allows inference on CPU, GPU, and other dedicated runtimes.
- No hidden dependencies pulled in via Hugging Face Transformers.
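To make the first point concrete, here is a generic sketch of symmetric 8-bit weight quantization. It illustrates the general idea only: the scheme used by the ONNX models that FastEmbed ships may differ, and the function names `quantize` and `dequantize` are hypothetical.

```rust
/// Quantizes f32 weights to i8 using one symmetric scale for the whole slice.
/// Generic illustration; not necessarily the scheme used by FastEmbed's models.
fn quantize(weights: &[f32]) -> (Vec<i8>, f32) {
    // The largest magnitude maps to 127; an all-zero input keeps scale = 1.0.
    let max_abs = weights.iter().fold(0.0_f32, |m, w| m.max(w.abs()));
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let quantized = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (quantized, scale)
}

/// Recovers approximate f32 weights from the quantized values and the scale.
fn dequantize(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

Each weight then occupies one byte instead of four, at the cost of a rounding error of roughly half the scale per weight.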
📄 LICENSE
Apache 2.0 © 2024