Intermediate

⚡ Streaming Responses

Display agent output token-by-token in real time for a fast, responsive user experience.

Streaming is essential for any interactive application. Without it, the user stares at a blank screen until the model finishes generating the entire response — which can take several seconds for longer outputs. With streaming, text appears progressively, just like in ChatGPT or Copilot.

Agent Framework provides RunStreamingAsync(), which returns an IAsyncEnumerable<AgentResponseUpdate>. You iterate over it with await foreach and process each update as it arrives. Each update may contain a fragment of the response text, a tool call event, or a status update.

Key Concepts

  • RunStreamingAsync() — returns an async stream of response fragments
  • AgentResponseUpdate — a single streaming update; may contain text, tool calls, or metadata
  • AgentResponseUpdate.Text — extracts just the text fragment from the update
  • await foreach — C# construct for consuming IAsyncEnumerable
  • Partial output — each update is only a fragment; accumulate them for the full response
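The await foreach pattern works with any IAsyncEnumerable<T>, not just agent responses. A minimal self-contained sketch (no Azure dependencies; the SimulateTokens iterator is a stand-in for RunStreamingAsync, not part of the framework):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// A stand-in async iterator that yields text fragments one at a time,
// the same shape a real streaming response takes.
static async IAsyncEnumerable<string> SimulateTokens()
{
    foreach (var token in new[] { "The ", "Moon ", "is ", "slowly ", "drifting ", "away." })
    {
        await Task.Delay(50); // mimic network latency between tokens
        yield return token;
    }
}

await foreach (var token in SimulateTokens())
{
    Console.Write(token); // each fragment prints as soon as it arrives
}
Console.WriteLine();
```

The same loop shape carries over directly: swap SimulateTokens() for agent.RunStreamingAsync(...) and the fragment for update.Text.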

NuGet Packages

dotnet add package Microsoft.Agents.AI.OpenAI --prerelease
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity

Code Sample β€” Basic Streaming

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;

AIAgent agent = new AzureOpenAIClient(
        new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
        new AzureCliCredential())
    .GetChatClient("gpt-4o-mini")
    .AsAIAgent(instructions: "You are a helpful assistant.");

// RunStreamingAsync returns IAsyncEnumerable<AgentResponseUpdate>.
// Each update arrives as the model generates tokens.
await foreach (var update in agent.RunStreamingAsync("Tell me a fun fact about the Moon."))
{
    // update.Text is null for non-text updates (e.g. tool call events).
    if (update.Text is not null)
    {
        Console.Write(update.Text);
    }
}

// Print a newline after streaming completes.
Console.WriteLine();

Code Sample β€” Streaming with Accumulated Response

using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using System.Text;

AIAgent agent = new AzureOpenAIClient(
        new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
        new AzureCliCredential())
    .GetChatClient("gpt-4o-mini")
    .AsAIAgent(instructions: "You are a helpful assistant.");

var fullResponse = new StringBuilder();

await foreach (var update in agent.RunStreamingAsync("What are the planets in our solar system?"))
{
    if (update.Text is not null)
    {
        Console.Write(update.Text);   // stream to console in real time
        fullResponse.Append(update.Text);
    }
}

Console.WriteLine();
Console.WriteLine($"--- Full response ({fullResponse.Length} chars) ---");
Console.WriteLine(fullResponse.ToString());

Step-by-Step Explanation

  1. Call RunStreamingAsync() — This sends the request to the model and immediately returns an async enumerable without waiting for the full response.
  2. Iterate with await foreach — Each AgentResponseUpdate arrives as the model produces tokens. The loop body executes for every update.
  3. Check update.Text — Not every update contains text. Tool call invocations, function results, and metadata also arrive as updates. Guard with a null check (or is not null) before writing.
  4. Accumulate if needed — Use a StringBuilder or similar to capture the full response for storage, logging, or further processing after streaming completes.
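Because a streamed response can run for many seconds, it is also worth wiring up cancellation. A hedged sketch, assuming RunStreamingAsync accepts an optional CancellationToken parameter (verify the exact overload in your installed prerelease version):

```csharp
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

try
{
    await foreach (var update in agent.RunStreamingAsync(
        "Summarize the history of lunar exploration.",
        cancellationToken: cts.Token))
    {
        if (update.Text is not null)
        {
            Console.Write(update.Text);
        }
    }
}
catch (OperationCanceledException)
{
    // The stream stopped partway; anything already printed stays on screen.
    Console.WriteLine();
    Console.WriteLine("[stream cancelled after 30 seconds]");
}
```

Only the remaining fragments are dropped on cancellation; partial text already written to the console (or appended to a StringBuilder) is preserved.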

Next Steps