RAG – Retrieval-Augmented Generation
Ground your agent's answers in your own documents by searching a vector store before every model call.
Retrieval-Augmented Generation (RAG) connects an AI model to a knowledge base so it can answer
questions about content it was never trained on. In Agent Framework RC-1, you implement RAG with
TextSearchProvider, an IAIContextProvider that runs a semantic search
and injects the results into the model's context window automatically before each invocation.
The example below builds a support agent for a fictitious outdoor company. It loads three documents (return policy, shipping guide, and tent care instructions) into an in-memory vector store, then answers customer questions by retrieving the most relevant document first.
Key Concepts
- TextSearchProvider – runs a search before each model call and injects results into the context
- TextSearchProviderOptions.SearchTime – controls when the search fires (BeforeAIInvoke is the standard setting)
- TextSearchResult – wraps a retrieved document with source name, link, and text
- InMemoryChatHistoryProvider – strips RAG-injected messages from the stored chat history to avoid bloat
- AgentRequestMessageSourceType – lets you filter which message types are persisted in history
TextSearchStore and TextSearchDocument are convenience helper classes
from the official Agent Framework samples repository. They are not published on NuGet;
you must copy them into your own project.
After copying the files, set their namespace to match your project (or add
using Microsoft.Agents.AI.Samples; if you keep the original namespace).
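Under the hood, the retrieval step is a nearest-neighbor search over embedding vectors. The following toy sketch (plain C#, no Agent Framework types; the hardcoded vectors and the `Cosine` helper are illustrative stand-ins, not part of any library) shows the core idea: score each stored document vector against the query vector by cosine similarity and keep the top-k.

```csharp
using System;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double Cosine(double[] a, double[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Pretend embeddings; a real system gets these from an embedding model.
double[][] docVectors =
{
    new[] { 0.9, 0.1, 0.0 },  // "return policy"
    new[] { 0.1, 0.9, 0.0 },  // "shipping guide"
    new[] { 0.0, 0.1, 0.9 },  // "tent care"
};

// Pretend embedding of the query "What is your return policy?"
double[] queryVector = { 0.85, 0.2, 0.05 };

int topK = 1;
var best = docVectors
    .Select((v, i) => (Index: i, Score: Cosine(queryVector, v)))
    .OrderByDescending(x => x.Score)
    .Take(topK)
    .ToArray();

Console.WriteLine($"Best match: doc {best[0].Index} (score {best[0].Score:F3})");
// Best match: doc 0 – the "return policy" document
```

In the real sample below, the embedding and search are handled for you by the vector store and TextSearchProvider; this sketch only illustrates why the most relevant document wins.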
NuGet Packages
dotnet add package Microsoft.Agents.AI.OpenAI --prerelease
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.VectorData.Abstractions --prerelease
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease
Environment Variables
# PowerShell
$env:AZURE_OPENAI_ENDPOINT = "https://<your-resource>.openai.azure.com/"
$env:AZURE_OPENAI_DEPLOYMENT_NAME = "gpt-4o-mini"
$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME = "text-embedding-3-large"
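On macOS/Linux shells, the same variables can be exported with bash syntax:

```shell
# bash / zsh
export AZURE_OPENAI_ENDPOINT="https://<your-resource>.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o-mini"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME="text-embedding-3-large"
```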
Code Sample
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using Microsoft.Agents.AI.Samples; // TextSearchStore, TextSearchDocument (copy from GitHub β see note above)
using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
var endpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
var deployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME") ?? "gpt-4o-mini";
var embeddingDeployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
?? "text-embedding-3-large";
AzureOpenAIClient azureClient = new(new Uri(endpoint), new DefaultAzureCredential());
// Vector store backed by Azure OpenAI embeddings.
VectorStore vectorStore = new InMemoryVectorStore(new()
{
EmbeddingGenerator = azureClient.GetEmbeddingClient(embeddingDeployment).AsIEmbeddingGenerator()
});
// TextSearchStore wraps the vector store with a fixed document schema.
TextSearchStore store = new(vectorStore, "product-docs", vectorDimensions: 3072);
// Upload knowledge-base documents.
await store.UpsertDocumentsAsync(
[
new TextSearchDocument
{
SourceId = "returns-001",
SourceName = "Return Policy",
SourceLink = "https://contoso.com/policies/returns",
Text = "Items may be returned within 30 days of delivery in original packaging. "
+ "Refunds are issued within 5 business days after inspection."
},
new TextSearchDocument
{
SourceId = "shipping-001",
SourceName = "Shipping Guide",
SourceLink = "https://contoso.com/help/shipping",
Text = "Standard shipping is free on orders over $50 and arrives in 3–5 business days "
+ "within the continental US. Expedited options are available at checkout."
},
new TextSearchDocument
{
SourceId = "tent-care-001",
SourceName = "TrailRunner Tent Care",
SourceLink = "https://contoso.com/manuals/trailrunner-tent",
Text = "Clean with lukewarm water and non-detergent soap. Air dry completely before "
+ "storage. Avoid prolonged UV exposure to preserve the waterproof coating."
}
]);
// Adapter: translate TextSearchStore results to TextSearchProvider.TextSearchResult.
async Task<IEnumerable<TextSearchProvider.TextSearchResult>> SearchAsync(
string query, CancellationToken ct)
{
var hits = await store.SearchAsync(query, 1, ct);
return hits.Select(r => new TextSearchProvider.TextSearchResult
{
SourceName = r.SourceName,
SourceLink = r.SourceLink,
Text = r.Text ?? string.Empty,
RawRepresentation = r
});
}
// Build the agent with TextSearchProvider.
AIAgent agent = azureClient
.GetChatClient(deployment)
.AsAIAgent(new ChatClientAgentOptions
{
ChatOptions = new()
{
Instructions = "You are a support specialist for Contoso Outdoors. "
+ "Answer questions using the provided context and cite sources."
},
AIContextProviders = [new TextSearchProvider(SearchAsync, new TextSearchProviderOptions
{
SearchTime = TextSearchProviderOptions.TextSearchBehavior.BeforeAIInvoke
})],
// Strip RAG-injected messages from chat history to avoid bloating it.
ChatHistoryProvider = new InMemoryChatHistoryProvider(new InMemoryChatHistoryProviderOptions
{
StorageInputRequestMessageFilter = messages => messages.Where(m =>
m.GetAgentRequestMessageSourceType() != AgentRequestMessageSourceType.AIContextProvider
&& m.GetAgentRequestMessageSourceType() != AgentRequestMessageSourceType.ChatHistory)
})
});
AgentSession session = await agent.CreateSessionAsync();
Console.WriteLine(await agent.RunAsync("What is your return policy?", session));
Console.WriteLine(await agent.RunAsync("How fast does standard shipping arrive?", session));
Console.WriteLine(await agent.RunAsync("How do I clean my TrailRunner tent?", session));
How It Works
- Documents are embedded and stored – UpsertDocumentsAsync embeds each document's text using Azure OpenAI and writes it to the vector store.
- Before every model call – the TextSearchProvider embeds the user's message, searches the vector store, and injects the top result(s) as additional context.
- The model cites sources – because the instructions ask for source citations, the model can include the SourceName and SourceLink in its reply.
- History stays clean – the InMemoryChatHistoryProvider filter prevents injected RAG messages from being re-stored in the chat history on the next turn.
Production Tips
- Increase topK from 1 to 3–5 for richer context (at the cost of more tokens).
- Use Azure AI Search or Qdrant for durable vector storage and hybrid (keyword + vector) search.
- Pre-chunk long documents (e.g. 500 tokens each) before embedding for better retrieval quality.
- Add a reranker step to reorder results before injecting them as context.
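The pre-chunking tip can be sketched in plain C#. This is an illustrative helper, not part of the Agent Framework: it splits a long document into overlapping chunks of roughly `maxWords` words, using word count as a crude stand-in for token count (a production system would use the model's real tokenizer), so each chunk can be upserted as its own TextSearchDocument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Split text into overlapping word-count chunks before embedding.
// `overlap` keeps some shared context between adjacent chunks so a
// sentence cut at a boundary is still retrievable from either side.
static List<string> Chunk(string text, int maxWords = 100, int overlap = 20)
{
    var words = text.Split(' ', StringSplitOptions.RemoveEmptyEntries);
    var chunks = new List<string>();
    for (int start = 0; start < words.Length; start += maxWords - overlap)
    {
        int len = Math.Min(maxWords, words.Length - start);
        chunks.Add(string.Join(' ', words, start, len));
        if (start + len >= words.Length) break;
    }
    return chunks;
}

// 250 synthetic words -> chunks of 100 words with a stride of 80.
string longDoc = string.Join(' ', Enumerable.Range(0, 250).Select(i => $"w{i}"));
var pieces = Chunk(longDoc);
Console.WriteLine($"{pieces.Count} chunks");
// 3 chunks
```

Each chunk would then be wrapped in its own TextSearchDocument (sharing the parent's SourceName and SourceLink) before calling UpsertDocumentsAsync.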
Next Steps
All Examples
- Hello Agent
- Function Tools
- Multi-Turn Conversations
- Streaming Responses
- Structured Output
- Sequential Workflows
- Multi-Agent Orchestration
- Ollama – Local AI
- LM Studio – Local AI
- Agent Memory
- RAG
- MCP Tools
- OpenTelemetry
- Customer Support Triage
- Research Pipeline
- Tools vs Sub-Agents
Concepts Used
RAG Docs