🦙 Ollama - Local AI
Run your agents entirely on your own machine - no cloud, no API keys, no data leaving your network.
Ollama lets you download and serve open-source language models (Llama 3, Phi-3, Qwen, Mistral, and more) locally via a simple REST API. Agent Framework integrates with Ollama through the OllamaSharp client library, which exposes an `OllamaApiClient` that can be wrapped into a full `AIAgent` using the same `.AsAIAgent()` extension method you already know.
This makes Ollama a perfect choice for local development, offline scenarios, privacy-sensitive applications, and rapid prototyping without worrying about API costs or rate limits.
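To see the REST surface that OllamaSharp wraps, you can call the server directly with curl. This is a sketch assuming the default endpoint and an already-pulled llama3.2 model:

```shell
# One-shot, non-streaming generation against the local Ollama REST API.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue? Answer in one sentence.",
  "stream": false
}'
```

The server replies with a single JSON object whose `response` field holds the generated text; with streaming enabled (the default), it emits one JSON object per chunk instead. OllamaSharp hides this protocol behind a typed client.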
Key Concepts
- `OllamaApiClient` - the Ollama REST client from the `OllamaSharp` package
- `AsAIAgent()` - wraps the Ollama client into a standard `AIAgent`
- `OLLAMA_ENDPOINT` - the Ollama server URL (default: `http://localhost:11434`)
- `OLLAMA_MODEL_NAME` - the model to run, e.g. `llama3.2` or `phi3`
- No API key needed - Ollama runs locally; authentication is not required
Prerequisites
Install and start Ollama, then pull a model before running any of the examples below.
# Install Ollama: https://ollama.com/download
# Pull a model (llama3.2 recommended for function calling):
ollama pull llama3.2
# Or use Docker:
docker run -d -p 11434:11434 --name ollama ollama/ollama
# Then inside the container:
ollama pull llama3.2
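Whichever install path you choose, you can confirm the server is reachable before running the samples (assuming the default port):

```shell
# Lists the locally installed models as JSON; an empty "models" array
# means the server is up but no model has been pulled yet.
curl http://localhost:11434/api/tags
```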
NuGet Packages
dotnet add package Microsoft.Agents.AI --prerelease
dotnet add package OllamaSharp
`Microsoft.Agents.AI` is the core Agent Framework package (RC-1, pre-release) that provides the `AIAgent` interface and the `.AsAIAgent()` extension method. `OllamaSharp` is the stable community client for the Ollama REST API; no `--prerelease` flag is needed for it.
Environment Variables
# PowerShell
$env:OLLAMA_ENDPOINT="http://localhost:11434"
$env:OLLAMA_MODEL_NAME="llama3.2"
# Bash / macOS / Linux
export OLLAMA_ENDPOINT="http://localhost:11434"
export OLLAMA_MODEL_NAME="llama3.2"
Code Sample - Hello Ollama Agent
using Microsoft.Agents.AI;
using OllamaSharp;
// 1. Read configuration from environment variables.
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")
?? throw new InvalidOperationException("OLLAMA_ENDPOINT is not set.");
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")
?? throw new InvalidOperationException("OLLAMA_MODEL_NAME is not set.");
// 2. Create an OllamaApiClient and wrap it as an AIAgent.
// AsAIAgent() works identically to the Azure OpenAI / OpenAI path.
AIAgent agent = new OllamaApiClient(new Uri(endpoint), modelName)
.AsAIAgent(
instructions: "You are a helpful assistant running locally via Ollama.",
name: "LocalAgent");
// 3. Run the agent - same API as any other AIAgent.
Console.WriteLine(await agent.RunAsync("What is the largest city in France?"));
Code Sample - Streaming with Ollama
using Microsoft.Agents.AI;
using OllamaSharp;
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")!;
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")!;
AIAgent agent = new OllamaApiClient(new Uri(endpoint), modelName)
.AsAIAgent(instructions: "You are a helpful assistant.", name: "StreamAgent");
// RunStreamingAsync works just like with Azure OpenAI - tokens stream in real time.
Console.Write("Agent: ");
await foreach (var update in agent.RunStreamingAsync("Explain what a neural network is in simple terms."))
{
if (update.Text is not null)
{
Console.Write(update.Text);
}
}
Console.WriteLine();
Code Sample - Ollama with Function Tools
Note: Function calling support varies by model. `llama3.2` and `qwen2.5:7b` are good choices. Verify that your chosen model supports tools at ollama.com/search?c=tools.
using System.ComponentModel;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using OllamaSharp;
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")!;
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")!;
// Define a tool function - works exactly the same as with cloud models.
string GetCurrentTime(
    [Description("The timezone identifier, e.g. 'UTC' or 'Europe/Paris'")] string timezone)
    => $"The current time in {timezone} is {TimeZoneInfo.ConvertTimeFromUtc(DateTime.UtcNow, TimeZoneInfo.FindSystemTimeZoneById(timezone)):HH:mm}.";
AIAgent agent = new OllamaApiClient(new Uri(endpoint), modelName)
.AsAIAgent(
instructions: "You are a helpful assistant. Use your tools to answer questions accurately.",
name: "ToolAgent",
tools: [AIFunctionFactory.Create(GetCurrentTime)]);
// The agent will call GetCurrentTime automatically when appropriate.
Console.WriteLine(await agent.RunAsync("What time is it in UTC right now?"));
Code Sample - Multi-Turn with Ollama
Use `AgentSession` to manage conversation history automatically across multiple turns.
using Microsoft.Agents.AI;
using OllamaSharp;
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")!;
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")!;
AIAgent agent = new OllamaApiClient(new Uri(endpoint), modelName)
.AsAIAgent(instructions: "You are a friendly local assistant.", name: "ChatAgent");
// AgentSession manages conversation history - no manual list management needed.
await using AgentSession session = agent.CreateSession();
Console.WriteLine(await session.RunAsync("Hi! I'm learning about planets. My favourite is Saturn."));
Console.WriteLine(await session.RunAsync("What makes its rings so special?"));
Console.WriteLine(await session.RunAsync("How many moons does it have?"));
Code Sample - Structured Output with Ollama
Note: Not all models strictly enforce JSON schemas. `llama3.1`, `qwen2.5:7b`, and `mistral` are known to work reliably with structured output.
using System.ComponentModel;
using Microsoft.Agents.AI;
using OllamaSharp;
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")!;
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")!;
// Define a C# record - Agent Framework generates the JSON schema automatically.
record MovieRecommendation(
[property: Description("The movie title")] string Title,
[property: Description("The year of release")] int Year,
[property: Description("A brief plot summary, max 2 sentences")] string Summary,
[property: Description("Genre tag such as 'sci-fi', 'drama', or 'comedy'")] string Genre);
AIAgent agent = new OllamaApiClient(new Uri(endpoint), modelName)
.AsAIAgent(
instructions: "You are a film critic. Respond only with valid JSON matching the requested schema.",
name: "FilmAgent");
// RunAsync<T> constrains the model to return JSON matching the record's schema.
MovieRecommendation movie = await agent.RunAsync<MovieRecommendation>(
"Recommend a classic science-fiction film from the 1980s.");
Console.WriteLine($"Title: {movie.Title} ({movie.Year})");
Console.WriteLine($"Genre: {movie.Genre}");
Console.WriteLine($"Summary: {movie.Summary}");
Code Sample - Multi-Agent Orchestration with Ollama
Two local Ollama agents act as specialists; an orchestrator decides which one to call using `.AsAIFunction()`. All agents share the same local model, so no cloud calls are made.
using Microsoft.Agents.AI;
using OllamaSharp;
var endpoint = Environment.GetEnvironmentVariable("OLLAMA_ENDPOINT")!;
var modelName = Environment.GetEnvironmentVariable("OLLAMA_MODEL_NAME")!;
// Helper: create a new Ollama client for the same local model.
OllamaApiClient NewClient() => new OllamaApiClient(new Uri(endpoint), modelName);
// 1. Create two specialist sub-agents, each with a descriptive name the orchestrator can use.
AIAgent historyAgent = NewClient().AsAIAgent(
instructions: "You are a history expert. Answer concisely and accurately.",
name: "HistoryExpert",
description: "Expert in world history, historical events and figures.");
AIAgent scienceAgent = NewClient().AsAIAgent(
instructions: "You are a science expert. Explain concepts clearly in plain language.",
name: "ScienceExpert",
description: "Expert in physics, chemistry, biology, and astronomy.");
// 2. The orchestrator uses sub-agents as tools via .AsAIFunction().
AIAgent orchestrator = NewClient().AsAIAgent(
instructions: """
You are a helpful assistant with access to specialist experts.
Always delegate to the most appropriate expert using your tools.
Combine their answers into a single, cohesive response.
""",
name: "Orchestrator",
tools: [historyAgent.AsAIFunction(), scienceAgent.AsAIFunction()]);
// 3. A single call - the orchestrator routes each part of the question to the right expert.
Console.WriteLine(await orchestrator.RunAsync(
"When did the Berlin Wall fall, and what causes a solar eclipse?"));
Step-by-Step Explanation
1. **Install Ollama and pull a model** - Download Ollama from ollama.com and run `ollama pull llama3.2` to fetch the model weights locally. The Ollama service starts automatically and listens on port `11434`.
2. **Add the `OllamaSharp` NuGet package** - This is the only additional package needed. `Microsoft.Agents.AI` is already a dependency of your project and provides the `AsAIAgent()` extension method.
3. **Create an `OllamaApiClient`** - Pass the endpoint URI and model name. The client talks to your local Ollama server over HTTP; no internet connection or API key is required.
4. **Call `.AsAIAgent()`** - Wraps the Ollama client in a full `AIAgent`. From this point on, every Agent Framework feature (streaming, function tools, multi-turn history, structured output) works identically to the cloud-based agents.
Expected Output (Hello Ollama)
Paris
Next Steps
All Examples
- 🤖 Hello Agent
- 🔧 Function Tools
- 💬 Multi-Turn Conversations
- ⚡ Streaming Responses
- 📦 Structured Output
- 🔗 Sequential Workflows
- 🕸️ Multi-Agent Orchestration
- 🦙 Ollama - Local AI
- 🖥️ LM Studio - Local AI
- 🧠 Agent Memory
- 📚 RAG
- 🔌 MCP Tools
- 📊 OpenTelemetry
- 🎧 Customer Support Triage
- 🔬 Research Pipeline
- 🤝 Tools vs Sub-Agents
Concepts Used
📖 Ollama Documentation