Agent Memory
Give your agent a long-term memory that persists across sessions, so it can recall facts from previous conversations, not just the current one.
The built-in AgentSession keeps a short-term chat history for one conversation.
ChatHistoryMemoryProvider extends this by writing messages into a vector store at the
end of each session and searching that store at the start of future sessions. This means an agent
can remember that a user said they like pirate jokes, even in a brand-new conversation started later.
The sample below uses InMemoryVectorStore from Semantic Kernel for simplicity.
In production, swap it for a durable store such as Azure AI Search, Qdrant, or PostgreSQL with
pgvector; the ChatHistoryMemoryProvider API stays exactly the same.
Key Concepts
- AgentSession: scopes a single conversation; create one per conversation
- ChatHistoryMemoryProvider: writes messages to a vector store and recalls relevant ones in future sessions
- storageScope / searchScope: control which messages are stored and which prior messages are searched (per-user, per-session, or both)
- InMemoryVectorStore: in-process vector store from Microsoft.SemanticKernel.Connectors.InMemory; replace with any durable store for production
- EmbeddingGenerator: required to embed messages before inserting them into the vector store
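The two scopes compose in a few useful ways. A short sketch, reusing the same ChatHistoryMemoryProvider.State shape that appears in the code sample below:

```csharp
// Sketch: common storageScope / searchScope combinations, assuming the
// ChatHistoryMemoryProvider.State constructor used in the sample below.
var sessionId = Guid.NewGuid().ToString();

// 1. Per-user memory (the sample's setup): store under this session,
//    but search across everything the user has ever said.
var perUser = new ChatHistoryMemoryProvider.State(
    storageScope: new() { UserId = "user-001", SessionId = sessionId },
    searchScope:  new() { UserId = "user-001" });

// 2. Session-only memory: store and search within one conversation,
//    so nothing leaks across sessions.
var perSession = new ChatHistoryMemoryProvider.State(
    storageScope: new() { UserId = "user-001", SessionId = sessionId },
    searchScope:  new() { UserId = "user-001", SessionId = sessionId });
```

Per-user search with per-session storage is the pattern most memory scenarios want; session-only scoping is useful when conversations must stay isolated (e.g., per-support-ticket).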
NuGet Packages
dotnet add package Microsoft.Agents.AI.OpenAI --prerelease
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease
Environment Variables
# PowerShell
$env:AZURE_OPENAI_ENDPOINT = "https://<your-resource>.openai.azure.com/"
$env:AZURE_OPENAI_DEPLOYMENT_NAME = "gpt-4o-mini"
$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME = "text-embedding-3-large"
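On Linux or macOS, the equivalent Bash exports are:

```shell
# Bash / zsh
export AZURE_OPENAI_ENDPOINT="https://<your-resource>.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o-mini"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME="text-embedding-3-large"
```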
Code Sample
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
var deployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME") ?? "gpt-4o-mini";
var embeddingDeployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
    ?? "text-embedding-3-large";
var azureClient = new AzureOpenAIClient(new Uri(endpoint), new DefaultAzureCredential());
// In-memory vector store backed by Azure OpenAI embeddings.
// Replace InMemoryVectorStore with any other VectorStore implementation for production persistence.
VectorStore vectorStore = new InMemoryVectorStore(new InMemoryVectorStoreOptions
{
    EmbeddingGenerator = azureClient
        .GetEmbeddingClient(embeddingDeployment)
        .AsIEmbeddingGenerator()
});
// Build the agent and attach ChatHistoryMemoryProvider.
AIAgent agent = azureClient
    .GetChatClient(deployment)
    .AsAIAgent(new ChatClientAgentOptions
    {
        ChatOptions = new() { Instructions = "You are good at telling jokes." },
        Name = "Joker",
        AIContextProviders =
        [
            new ChatHistoryMemoryProvider(
                vectorStore,
                collectionName: "chathistory",
                vectorDimensions: 3072,
                // Called whenever a new session has no prior state.
                session => new ChatHistoryMemoryProvider.State(
                    // Store messages for this user + unique session.
                    storageScope: new() { UserId = "user-001", SessionId = Guid.NewGuid().ToString() },
                    // Search all prior sessions for this user.
                    searchScope: new() { UserId = "user-001" }))
        ]
    });
// ── Session 1 ──────────────────────────────────────────────────
AgentSession session1 = await agent.CreateSessionAsync();
var reply1 = await agent.RunAsync(
    "I love jokes about pirates. Tell me one!", session1);
Console.WriteLine($"Session 1 → {reply1}");
// ── Session 2 (new conversation) ────────────────────────────────
// ChatHistoryMemoryProvider searches the vector store and recalls
// that this user likes pirate jokes, even though it's a fresh session.
AgentSession session2 = await agent.CreateSessionAsync();
var reply2 = await agent.RunAsync(
    "Tell me a joke you think I'd enjoy.", session2);
Console.WriteLine($"Session 2 → {reply2}");
How It Works
- Session 1 runs normally: the user asks for a pirate joke. At the end of the session, the ChatHistoryMemoryProvider embeds the conversation and writes it to the vector store under UserId="user-001".
- Session 2 starts fresh: a new AgentSession is created with no local history.
- Memory is injected automatically: before the model call, the provider searches the vector store using searchScope: { UserId = "user-001" } and injects relevant prior messages as context. The model sees that the user likes pirate jokes and responds accordingly.
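Conceptually, the recall step is just scope-filtered similarity search. The toy sketch below (plain C# with hypothetical names, not the SDK's implementation) shows the idea: filter records to the search scope, then rank by embedding similarity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy illustration of scope-filtered recall; not the SDK's actual implementation.
public record MemoryRecord(string UserId, string SessionId, string Text, float[] Embedding);

public static class ToyMemory
{
    static float Cosine(float[] a, float[] b)
    {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb) + 1e-8f);
    }

    // The scope filter (here: match on UserId) runs first; what remains
    // is a plain nearest-neighbour ranking over stored message embeddings.
    public static IEnumerable<string> Recall(
        List<MemoryRecord> store, string userId, float[] query, int top = 3) =>
        store.Where(r => r.UserId == userId)
             .OrderByDescending(r => Cosine(r.Embedding, query))
             .Take(top)
             .Select(r => r.Text);
}
```

The real provider does the same thing against the vector store's index, with the scope expressed as a filter on the stored UserId/SessionId fields.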
Production Tips
- Replace InMemoryVectorStore with a persistent store (Azure AI Search, Qdrant, pgvector); the API is unchanged.
- Use ManagedIdentityCredential instead of DefaultAzureCredential in production to avoid credential probing overhead.
- Tune vectorDimensions to match your embedding model's output dimensions (3072 for text-embedding-3-large, 1536 for text-embedding-3-small).
- Set both storageScope and searchScope deliberately; storing per-session but searching per-user is a common pattern.
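As a sketch of the durable-store swap, assuming the Microsoft.SemanticKernel.Connectors.Qdrant package and its QdrantVectorStore type (constructor arguments may differ across connector versions), only the store construction changes:

```csharp
// Hypothetical swap to Qdrant; everything after vectorStore is unchanged.
// dotnet add package Microsoft.SemanticKernel.Connectors.Qdrant --prerelease
using Microsoft.SemanticKernel.Connectors.Qdrant;
using Qdrant.Client;

VectorStore vectorStore = new QdrantVectorStore(
    new QdrantClient("localhost"),
    ownsClient: true,
    new QdrantVectorStoreOptions
    {
        EmbeddingGenerator = azureClient
            .GetEmbeddingClient(embeddingDeployment)
            .AsIEmbeddingGenerator()
    });
// Agent setup, ChatHistoryMemoryProvider, and sessions stay exactly as above.
```

Because ChatHistoryMemoryProvider only depends on the VectorStore abstraction, no other line of the sample needs to change.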
Next Steps
All Examples
- Hello Agent
- Function Tools
- Multi-Turn Conversations
- Streaming Responses
- Structured Output
- Sequential Workflows
- Multi-Agent Orchestration
- Ollama (Local AI)
- LM Studio (Local AI)
- Agent Memory
- RAG
- MCP Tools
- OpenTelemetry
- Customer Support Triage
- Research Pipeline
- Tools vs Sub-Agents
Concepts Used
Memory Docs