
πŸ” RAG β€” Retrieval-Augmented Generation

Ground your agent's answers in your own documents by searching a vector store before every model call.

Retrieval-Augmented Generation (RAG) connects an AI model to a knowledge base so it can answer questions about content it was never trained on. In Agent Framework RC-1, you implement RAG with TextSearchProvider — an IAIContextProvider that runs a semantic search and automatically injects the results into the model's context window before each invocation.

The example below builds a support agent for a fictitious outdoor company. It loads three documents (return policy, shipping guide, and tent care instructions) into an in-memory vector store, then answers customer questions by retrieving the most relevant document first.

Key Concepts

  • TextSearchProvider — runs a search before each model call and injects results into the context
  • TextSearchProviderOptions.SearchTime — controls when the search fires (BeforeAIInvoke is the standard setting)
  • TextSearchResult — wraps a retrieved document with source name, link, and text
  • InMemoryChatHistoryProvider — strips RAG-injected messages from the stored chat history to avoid bloat
  • AgentRequestMessageSourceType — lets you filter which message types are persisted in history

NuGet Packages

dotnet add package Microsoft.Agents.AI.OpenAI --prerelease
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
dotnet add package Microsoft.Extensions.VectorData.Abstractions --prerelease
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease

Environment Variables

# PowerShell
$env:AZURE_OPENAI_ENDPOINT = "https://<your-resource>.openai.azure.com/"
$env:AZURE_OPENAI_DEPLOYMENT_NAME = "gpt-4o-mini"
$env:AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME = "text-embedding-3-large"
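
On macOS/Linux shells, the same variables can be set with export (this is just the bash equivalent of the PowerShell commands above):

```shell
# bash / zsh
export AZURE_OPENAI_ENDPOINT="https://<your-resource>.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT_NAME="gpt-4o-mini"
export AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME="text-embedding-3-large"
```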

Code Sample

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using Microsoft.Agents.AI.Samples; // TextSearchStore, TextSearchDocument (copy from GitHub — see note above)
using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;

var endpoint   = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
var deployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_DEPLOYMENT_NAME") ?? "gpt-4o-mini";
var embeddingDeployment = Environment.GetEnvironmentVariable("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
                          ?? "text-embedding-3-large";

AzureOpenAIClient azureClient = new(new Uri(endpoint), new DefaultAzureCredential());

// Vector store backed by Azure OpenAI embeddings.
VectorStore vectorStore = new InMemoryVectorStore(new()
{
    EmbeddingGenerator = azureClient.GetEmbeddingClient(embeddingDeployment).AsIEmbeddingGenerator()
});

// TextSearchStore wraps the vector store with a fixed document schema.
TextSearchStore store = new(vectorStore, "product-docs", vectorDimensions: 3072);

// Upload knowledge-base documents.
await store.UpsertDocumentsAsync(
[
    new TextSearchDocument
    {
        SourceId   = "returns-001",
        SourceName = "Return Policy",
        SourceLink = "https://contoso.com/policies/returns",
        Text       = "Items may be returned within 30 days of delivery in original packaging. "
                   + "Refunds are issued within 5 business days after inspection."
    },
    new TextSearchDocument
    {
        SourceId   = "shipping-001",
        SourceName = "Shipping Guide",
        SourceLink = "https://contoso.com/help/shipping",
        Text       = "Standard shipping is free on orders over $50 and arrives in 3–5 business days "
                   + "within the continental US. Expedited options are available at checkout."
    },
    new TextSearchDocument
    {
        SourceId   = "tent-care-001",
        SourceName = "TrailRunner Tent Care",
        SourceLink = "https://contoso.com/manuals/trailrunner-tent",
        Text       = "Clean with lukewarm water and non-detergent soap. Air dry completely before "
                   + "storage. Avoid prolonged UV exposure to preserve the waterproof coating."
    }
]);

// Adapter: translate TextSearchStore results to TextSearchProvider.TextSearchResult.
async Task<IEnumerable<TextSearchProvider.TextSearchResult>> SearchAsync(
    string query, CancellationToken ct)
{
    var hits = await store.SearchAsync(query, 1, ct);
    return hits.Select(r => new TextSearchProvider.TextSearchResult
    {
        SourceName       = r.SourceName,
        SourceLink       = r.SourceLink,
        Text             = r.Text ?? string.Empty,
        RawRepresentation = r
    });
}

// Build the agent with TextSearchProvider.
AIAgent agent = azureClient
    .GetChatClient(deployment)
    .AsAIAgent(new ChatClientAgentOptions
    {
        ChatOptions = new()
        {
            Instructions = "You are a support specialist for Contoso Outdoors. "
                         + "Answer questions using the provided context and cite sources."
        },
        AIContextProviders = [new TextSearchProvider(SearchAsync, new TextSearchProviderOptions
        {
            SearchTime = TextSearchProviderOptions.TextSearchBehavior.BeforeAIInvoke
        })],
        // Strip RAG-injected messages from chat history to avoid bloating it.
        ChatHistoryProvider = new InMemoryChatHistoryProvider(new InMemoryChatHistoryProviderOptions
        {
            StorageInputRequestMessageFilter = messages => messages.Where(m =>
                m.GetAgentRequestMessageSourceType() != AgentRequestMessageSourceType.AIContextProvider
             && m.GetAgentRequestMessageSourceType() != AgentRequestMessageSourceType.ChatHistory)
        })
    });

AgentSession session = await agent.CreateSessionAsync();

Console.WriteLine(await agent.RunAsync("What is your return policy?", session));
Console.WriteLine(await agent.RunAsync("How fast does standard shipping arrive?", session));
Console.WriteLine(await agent.RunAsync("How do I clean my TrailRunner tent?", session));

How It Works

  1. Documents are embedded and stored — UpsertDocumentsAsync embeds each document's text using Azure OpenAI and writes it to the vector store.
  2. Before every model call — the TextSearchProvider embeds the user's message, searches the vector store, and injects the top result(s) as additional context.
  3. The model cites sources — because the instructions ask for source citations, the model can include the SourceName and SourceLink in its reply.
  4. History stays clean — the InMemoryChatHistoryProvider filter prevents injected RAG messages from being re-stored in the chat history on the next turn.

Production Tips

  • Increase topK (the 1 passed to store.SearchAsync in the sample) to 3–5 for richer context, at the cost of more tokens.
  • Use Azure AI Search or Qdrant for durable vector storage and hybrid (keyword + vector) search.
  • Pre-chunk long documents (e.g. 500 tokens each) before embedding for better retrieval quality.
  • Add a reranker step to reorder results before injecting them as context.
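
The pre-chunking tip can be sketched with a simple word-count splitter. ChunkByWords is a hypothetical helper, not part of the framework, and it approximates token counts by word counts (the exact ratio varies by tokenizer):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class Chunker
{
    // Split a long document into word-bounded chunks before embedding.
    // maxWords is a rough proxy for a token budget (e.g. ~350 words ≈ 500 tokens).
    public static IEnumerable<string> ChunkByWords(string text, int maxWords = 350)
    {
        // Split on any whitespace and drop empty entries.
        var words = text.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i += maxWords)
            yield return string.Join(' ', words.Skip(i).Take(maxWords));
    }
}
```

Each chunk would then be upserted as its own TextSearchDocument (for example with a per-chunk SourceId suffix), so retrieval returns a focused passage rather than an entire file.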

Next Steps