Streaming Responses
Display agent output token-by-token in real time for a fast, responsive user experience.
Streaming is essential for any interactive application. Without it, the user stares at a blank screen until the model finishes generating the entire response, which can take several seconds for longer outputs. With streaming, text appears progressively, just like in ChatGPT or Copilot.
Agent Framework provides RunStreamingAsync(), which returns an IAsyncEnumerable<AgentResponseUpdate>. You iterate over it with await foreach and process each update as it arrives. Each update may contain a fragment of the text response, tool-call events, or status updates.
Key Concepts
- RunStreamingAsync() – returns an async stream of response fragments
- AgentResponseUpdate – a single streaming update; may contain text, tool calls, or metadata
- AgentResponseUpdate.Text – extracts just the text fragment from the update
- await foreach – C# construct for consuming an IAsyncEnumerable
- Partial output – each update is only a fragment; accumulate them for the full response
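The consumption pattern these concepts describe does not depend on the agent itself; it works against any IAsyncEnumerable<string>. A minimal sketch, in which FakeTokenStream is a hypothetical stand-in for the model's token stream (not a framework API):

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Hypothetical stand-in for a model's token stream (not a framework API).
    public static async IAsyncEnumerable<string> FakeTokenStream()
    {
        foreach (var token in new[] { "The ", "Moon ", "is ", "drifting ", "away." })
        {
            await Task.Delay(10); // simulate generation latency
            yield return token;
        }
    }

    public static async Task Main()
    {
        var full = new StringBuilder();
        await foreach (var token in FakeTokenStream())
        {
            Console.Write(token); // display each fragment as it arrives
            full.Append(token);   // accumulate the partial output
        }
        Console.WriteLine();
        Console.WriteLine($"Accumulated {full.Length} chars");
        // Prints: The Moon is drifting away.
        //         Accumulated 26 chars
    }
}
```

The same loop body works unchanged when the source is a real agent stream; only the enumerable changes.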
NuGet Packages
dotnet add package Microsoft.Agents.AI.OpenAI --prerelease
dotnet add package Azure.AI.OpenAI --prerelease
dotnet add package Azure.Identity
Code Sample – Basic Streaming
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
AIAgent agent = new AzureOpenAIClient(
        new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
        new AzureCliCredential())
    .GetChatClient("gpt-4o-mini")
    .AsAIAgent(instructions: "You are a helpful assistant.");
// RunStreamingAsync returns IAsyncEnumerable<AgentResponseUpdate>.
// Each update arrives as the model generates tokens.
await foreach (var update in agent.RunStreamingAsync("Tell me a fun fact about the Moon."))
{
    // update.Text is null for non-text updates (e.g. tool call events).
    if (update.Text is not null)
    {
        Console.Write(update.Text);
    }
}
// Print a newline after streaming completes.
Console.WriteLine();
Code Sample – Streaming with Accumulated Response
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Agents.AI;
using System.Text;
AIAgent agent = new AzureOpenAIClient(
        new Uri(Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!),
        new AzureCliCredential())
    .GetChatClient("gpt-4o-mini")
    .AsAIAgent(instructions: "You are a helpful assistant.");
var fullResponse = new StringBuilder();
await foreach (var update in agent.RunStreamingAsync("What are the planets in our solar system?"))
{
    if (update.Text is not null)
    {
        Console.Write(update.Text); // stream to console in real time
        fullResponse.Append(update.Text);
    }
}
Console.WriteLine();
Console.WriteLine($"\n--- Full response ({fullResponse.Length} chars) ---");
Console.WriteLine(fullResponse.ToString());
Step-by-Step Explanation
- Call RunStreamingAsync() – sends the request to the model and immediately returns an async enumerable without waiting for the full response.
- Iterate with await foreach – each AgentResponseUpdate arrives as the model produces tokens. The loop body executes for every update.
- Check update.Text – not every update contains text. Tool call invocations, function results, and metadata also arrive as updates. Guard with a null check (update.Text is not null) before writing.
- Accumulate if needed – use a StringBuilder or similar to capture the full response for storage, logging, or further processing after streaming completes.
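Because the stream is a plain IAsyncEnumerable, standard C# cancellation applies: pass a token through WithCancellation to stop consuming mid-generation (useful for timeouts or a user's "stop" button). A self-contained sketch, where EndlessTokens is a hypothetical stand-in for a long-running agent call:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Hypothetical endless token stream standing in for a long-running agent call.
    public static async IAsyncEnumerable<string> EndlessTokens(
        [EnumeratorCancellation] CancellationToken ct = default)
    {
        while (true)
        {
            await Task.Delay(10, ct); // throws OperationCanceledException when ct fires
            yield return "token ";
        }
    }

    public static async Task Main()
    {
        using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200));
        int received = 0;
        try
        {
            // WithCancellation flows the token into the stream's enumerator.
            await foreach (var token in EndlessTokens().WithCancellation(cts.Token))
            {
                received++;
            }
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine($"Cancelled after {received} fragments.");
        }
    }
}
```

Always catch OperationCanceledException around the loop; cancellation surfaces as that exception, not as a normal end of the stream.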
Next Steps
All Examples
- Hello Agent
- Function Tools
- Multi-Turn Conversations
- Streaming Responses
- Structured Output
- Sequential Workflows
- Multi-Agent Orchestration
- Ollama – Local AI
- LM Studio – Local AI
- Agent Memory
- RAG
- MCP Tools
- OpenTelemetry
- Customer Support Triage
- Research Pipeline
- Tools vs Sub-Agents
Concepts Used
Running Agents Docs