12 releases
Version | Released |
---|---|
0.2.1 | Sep 7, 2024 |
0.2.0 | Jun 13, 2024 |
0.1.9 | Apr 17, 2024 |
0.1.8 | Mar 30, 2024 |
0.1.2 | Nov 21, 2023 |
#23 in Machine learning
4,268 downloads per month
Used in 14 crates (13 directly)
115KB
2.5K SLoC
Ollama-rs
A simple, easy-to-use library for interacting with the Ollama API.
It follows the Ollama API documentation.
Installation
Add ollama-rs to your Cargo.toml:
[dependencies]
ollama-rs = "0.2.1"
Initialize Ollama
// By default it will connect to localhost:11434
let ollama = Ollama::default();
// For custom values:
let ollama = Ollama::new("http://localhost".to_string(), 11434);
Usage
Feel free to check the Chatbot example, which shows how to use the library to create a simple chatbot in fewer than 50 lines of code. You can also check some other examples.
These examples use minimal error handling for simplicity; handle errors properly in your own code.
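As a rough illustration of what handling errors properly can look like, the sketch below propagates failures with `?` and matches on them at the call site instead of unwrapping. It is standard-library only: `fake_generate`, `GenerateError`, and `answer_length` are hypothetical names invented for this example and are not part of ollama-rs.

```rust
use std::fmt;

/// Hypothetical error type standing in for whatever error a client call returns.
#[derive(Debug, PartialEq)]
enum GenerateError {
    EmptyPrompt,
    TooLong(usize),
}

impl fmt::Display for GenerateError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            GenerateError::EmptyPrompt => write!(f, "prompt must not be empty"),
            GenerateError::TooLong(len) => write!(f, "prompt is too long ({len} bytes)"),
        }
    }
}

impl std::error::Error for GenerateError {}

/// Hypothetical stand-in for a completion call; it never contacts a server.
fn fake_generate(prompt: &str) -> Result<String, GenerateError> {
    if prompt.trim().is_empty() {
        return Err(GenerateError::EmptyPrompt);
    }
    if prompt.len() > 4096 {
        return Err(GenerateError::TooLong(prompt.len()));
    }
    Ok(format!("echo: {prompt}"))
}

/// Propagates any failure to the caller with `?` instead of panicking.
fn answer_length(prompt: &str) -> Result<usize, GenerateError> {
    let response = fake_generate(prompt)?;
    Ok(response.len())
}

fn main() {
    // The caller decides what to do with each outcome.
    match answer_length("Why is the sky blue?") {
        Ok(len) => println!("response is {len} bytes long"),
        Err(err) => eprintln!("generation failed: {err}"),
    }
}
```

The same shape applies to the real client calls: return a `Result` from your own functions and decide at the top level whether to retry, log, or abort.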
Completion generation
let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();
let res = ollama.generate(GenerationRequest::new(model, prompt)).await;
if let Ok(res) = res {
    println!("{}", res.response);
}
OUTPUTS: The sky appears blue because of a phenomenon called Rayleigh scattering...
Completion generation (streaming)
Requires the stream feature, enabled through the crate's features list in Cargo.toml.
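use tokio::io::AsyncWriteExt; // provides write_all and flush
use tokio_stream::StreamExt; // provides next(); futures::StreamExt is an alternative
let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();
let mut stream = ollama.generate_stream(GenerationRequest::new(model, prompt)).await.unwrap();
let mut stdout = tokio::io::stdout();
while let Some(res) = stream.next().await {
    let responses = res.unwrap();
    for resp in responses {
        // write_all writes the whole chunk; plain write may stop after a partial write
        stdout.write_all(resp.response.as_bytes()).await.unwrap();
        stdout.flush().await.unwrap();
    }
}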
Same output as above but streamed.
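If you need the complete text rather than incremental output, the streamed pieces can be concatenated as they arrive. The sketch below shows that idea with the standard library only, assuming each stream item is a batch of chunks or an error, as in the loop above. `Chunk` and `collect_stream` are hypothetical names for this illustration; only the `response` field mirrors the examples in this README.

```rust
/// Hypothetical chunk type; only the `response` field mirrors the examples above.
struct Chunk {
    response: String,
}

/// Concatenates batches of chunks into the full text, stopping at the first error.
fn collect_stream<I>(batches: I) -> Result<String, String>
where
    I: IntoIterator<Item = Result<Vec<Chunk>, String>>,
{
    let mut full = String::new();
    for batch in batches {
        // `?` returns early if this batch is an error.
        for chunk in batch? {
            full.push_str(&chunk.response);
        }
    }
    Ok(full)
}

fn main() {
    // Made-up data standing in for what a stream would yield.
    let batches = vec![
        Ok(vec![Chunk { response: "The sky ".into() }]),
        Ok(vec![
            Chunk { response: "appears ".into() },
            Chunk { response: "blue".into() },
        ]),
    ];
    match collect_stream(batches) {
        Ok(text) => println!("{text}"),
        Err(err) => eprintln!("stream failed: {err}"),
    }
}
```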
Completion generation (passing options to the model)
let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();
let options = GenerationOptions::default()
    .temperature(0.2)
    .repeat_penalty(1.5)
    .top_k(25)
    .top_p(0.25);
let res = ollama.generate(GenerationRequest::new(model, prompt).options(options)).await;
if let Ok(res) = res {
    println!("{}", res.response);
}
OUTPUTS: 1. Sun emits white sunlight: The sun consists primarily ...
Chat mode
Description: Every message sent and received is stored in the library's history. To store history, you must provide an ID for the chat. The ID can be unique per user or the same every time, depending on your needs.
Example with history:
let model = "llama2:latest".to_string();
let prompt = "Why is the sky blue?".to_string();
let history_id = "USER_ID_OR_WHATEVER";
let res = ollama
    .send_chat_messages_with_history(
        ChatMessageRequest::new(
            model,
            vec![ChatMessage::user(prompt)], // <- Provide only one message here
        ),
        history_id // <- The history is stored under this ID
    ).await;
if let Ok(res) = res {
    println!("{}", res.response);
}
Getting history for some ID:
let history_id = "USER_ID_OR_WHATEVER";
let history = ollama.get_message_history(history_id); // <- Option<Vec<ChatMessage>>
// Act
Clear the history once it is no longer needed:
// Clear history for an ID
let history_id = "USER_ID_OR_WHATEVER";
ollama.clear_messages_for_id(history_id);
// Clear history for all chats
ollama.clear_all_messages();
See the chat-with-history examples for both the default and the streaming variants.
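To make the ID-based behaviour described in this section concrete, here is a small conceptual model of a history store keyed by chat ID. It is a standard-library sketch, not the library's implementation: `HistoryStore`, `Message`, and their methods are hypothetical names that only mimic the store, get, and clear operations shown above.

```rust
use std::collections::HashMap;

/// Hypothetical message type for this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: String,
    content: String,
}

/// Conceptual history store: one message list per chat ID.
#[derive(Default)]
struct HistoryStore {
    chats: HashMap<String, Vec<Message>>,
}

impl HistoryStore {
    /// Appends a message to the chat with the given ID, creating the chat if needed.
    fn push(&mut self, id: &str, role: &str, content: &str) {
        self.chats.entry(id.to_string()).or_default().push(Message {
            role: role.to_string(),
            content: content.to_string(),
        });
    }

    /// Returns the messages for an ID, or None if nothing was stored under it.
    fn get(&self, id: &str) -> Option<&[Message]> {
        self.chats.get(id).map(|messages| messages.as_slice())
    }

    /// Drops the history of a single chat.
    fn clear_id(&mut self, id: &str) {
        self.chats.remove(id);
    }

    /// Drops the history of every chat.
    fn clear_all(&mut self) {
        self.chats.clear();
    }
}

fn main() {
    let mut store = HistoryStore::default();
    store.push("alice", "user", "Why is the sky blue?");
    store.push("alice", "assistant", "Because of Rayleigh scattering.");
    store.push("bob", "user", "Hello!");

    // Each ID sees only its own conversation.
    for message in store.get("alice").unwrap_or(&[]) {
        println!("{}: {}", message.role, message.content);
    }

    store.clear_id("alice");
    println!("alice still stored: {}", store.get("alice").is_some());

    store.clear_all();
    println!("chats left: {}", store.chats.len());
}
```

Using one ID per user keeps conversations separate, while reusing a single ID makes every request share the same context.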
List local models
let res = ollama.list_local_models().await.unwrap();
Returns a vector of Model structs.
Show model information
let res = ollama.show_model_info("llama2:latest".to_string()).await.unwrap();
Returns a ModelInfo struct.
Create a model
let res = ollama.create_model(CreateModelRequest::path("model".into(), "/tmp/Modelfile.example".into())).await.unwrap();
Returns a CreateModelStatus struct representing the final status of the model creation.
Create a model (streaming)
Requires the stream feature.
let mut res = ollama.create_model_stream(CreateModelRequest::path("model".into(), "/tmp/Modelfile.example".into())).await.unwrap();
while let Some(res) = res.next().await {
    let res = res.unwrap();
    // Handle the status
}
Returns a CreateModelStatusStream that will stream every status update of the model creation.
Copy a model
let _ = ollama.copy_model("mario".into(), "mario_copy".into()).await.unwrap();
Delete a model
let _ = ollama.delete_model("mario_copy".into()).await.unwrap();
Generate embeddings
let request = GenerateEmbeddingsRequest::new("llama2:latest".to_string(), "Why is the sky blue?".into());
let res = ollama.generate_embeddings(request).await.unwrap();
Generate embeddings (batch)
let request = GenerateEmbeddingsRequest::new("llama2:latest".to_string(), vec!["Why is the sky blue?", "Why is the sky red?"].into());
let res = ollama.generate_embeddings(request).await.unwrap();
Returns a GenerateEmbeddingsResponse struct containing the embeddings (a vector of floats).
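A common thing to do with embeddings is to compare two of them with cosine similarity. The sketch below is standard-library only and works on plain `f32` slices; `cosine_similarity` is a hypothetical helper written for this example, and the vectors in `main` are made-up numbers, not real model output.

```rust
/// Cosine similarity of two vectors.
/// Returns None if the lengths differ, the vectors are empty, or either has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Made-up vectors standing in for two embeddings of equal dimension.
    let sky_blue = [0.12_f32, 0.80, -0.35, 0.41];
    let sky_red = [0.10_f32, 0.75, -0.30, 0.52];

    match cosine_similarity(&sky_blue, &sky_red) {
        Some(score) => println!("cosine similarity: {score:.3}"),
        None => println!("vectors are not comparable"),
    }
}
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is usually read as the texts being semantically similar.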
Make a function call
let tools = vec![Arc::new(Scraper::new()), Arc::new(DDGSearcher::new())];
let parser = Arc::new(NousFunctionCall::new());
let message = ChatMessage::user("What is the current oil price?".to_string());
let res = ollama.send_function_call(
    FunctionCallRequest::new(
        "adrienbrault/nous-hermes2pro:Q8_0".to_string(),
        tools,
        vec![message],
    ),
    parser,
).await.unwrap();
Uses the given tools (such as searching the web) to find an answer, and returns a ChatMessageResponse with the answer to the question.
Dependencies
~4–19MB
~269K SLoC