# Chapter 6: Agent Loop
In Chapter 5: Agent Configuration, we set up the session and loaded the config — the system prompt, the allowed tools, the MCP servers. Everything is wired and ready. Time to actually talk to the LLM. Welcome to the Agent Loop.
## Motivation
Why does the agent need a loop at all? Can't we just send a prompt and get an answer?
Sometimes, yes. But most useful interactions look like this:
- You ask: "Find all the TODOs in my project"
- The LLM decides it needs to run `grep -r TODO src/`
- It sends back a tool call — not a final answer
- Something needs to execute that tool and feed the result back
- The LLM sees the grep output, writes a summary, and now it's done
That "something" is the Agent Loop. It's the ping-pong referee that keeps the ball moving between you, the model, and the tools until the model says "I'm done."
Think of the Agent Loop like a chess clock. Each side takes a turn, state is passed between them, and the clock keeps ticking until someone declares checkmate. The LLM makes a move (text or tool call), the loop processes it (runs the tool, collects the result), pushes the board back, and the LLM moves again. Without the clock, nobody knows whose turn it is.
## Use Case
Let's trace a concrete example. You type:
"Find my TODOs and summarize them"
Here's what happens inside the Agent Loop:
- You → Loop: Your prompt arrives as a `User` message
- Loop → LLM: The loop packages the conversation history + tool specs and streams a request to the model
- LLM → Loop: The model streams back tokens. Partway through, it emits a `ContentBlockStart` for a tool use — it wants to call `grep` with `{pattern: "TODO", path: "src/"}`
- Loop detects tool call: The stream ends with `StopReason::ToolUse`. The loop transitions to `PendingToolUseResults`
- Loop → Tool System: The outer Agent dispatches the tool and waits for the result
- Tool System → Loop: The grep output comes back as a `ToolResult` message
- Loop → LLM: The loop sends a new request with the tool result appended to the conversation
- LLM → Loop: The model streams a summary: "Found 12 TODOs across 4 files…" — this time with `StopReason::EndTurn`
- Loop stops: No more tool calls. The loop transitions to `UserTurnEnded` and emits `UserTurnEnd` metadata
Two round-trips to the model, one tool execution, all orchestrated by the same loop.
## Key Concepts

### 1. Turn Structure
A user turn is everything that happens between your prompt and the final assistant response. It may involve multiple cycles — each cycle is one request/response pair with the model. A turn with two tool calls has three cycles: prompt → tool call → tool result → tool call → tool result → final answer.
The loop tracks this in `UserTurnMetadata`:

```rust
// crates/agent/src/agent/agent_loop/protocol.rs (simplified)
pub struct UserTurnMetadata {
    pub total_request_count: u32,
    pub number_of_cycles: u32,
    pub input_token_count: u32,
    pub output_token_count: u32,
    pub turn_duration: Option<Duration>,
    pub end_reason: LoopEndReason,
}
```
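To make the bookkeeping concrete, here's a dependency-free sketch of how a turn's metadata might accumulate across cycles. The struct mirrors the one above, but `record_cycle` and the two-variant `LoopEndReason` are illustrative stand-ins, not the real API:

```rust
use std::time::Duration;

// Simplified stand-in for the real LoopEndReason; reduced to the two
// outcomes discussed in this chapter.
#[derive(Debug, PartialEq)]
enum LoopEndReason { EndTurn, Cancelled }

#[derive(Debug)]
struct UserTurnMetadata {
    total_request_count: u32,
    number_of_cycles: u32,
    input_token_count: u32,
    output_token_count: u32,
    turn_duration: Option<Duration>,
    end_reason: LoopEndReason,
}

impl UserTurnMetadata {
    fn new() -> Self {
        UserTurnMetadata {
            total_request_count: 0,
            number_of_cycles: 0,
            input_token_count: 0,
            output_token_count: 0,
            turn_duration: None,
            end_reason: LoopEndReason::EndTurn,
        }
    }

    // One cycle = one request/response pair with the model.
    fn record_cycle(&mut self, input_tokens: u32, output_tokens: u32) {
        self.total_request_count += 1;
        self.number_of_cycles += 1;
        self.input_token_count += input_tokens;
        self.output_token_count += output_tokens;
    }
}

fn main() {
    let mut meta = UserTurnMetadata::new();
    // The use case above: prompt -> tool call (cycle 1),
    // tool result -> final answer (cycle 2).
    meta.record_cycle(120, 40);
    meta.record_cycle(180, 60);
    meta.turn_duration = Some(Duration::from_secs(3));
    assert_eq!(meta.number_of_cycles, 2);
    println!("{meta:?}");
}
```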
### 2. Streaming Tokens
The loop never waits for the full response. It consumes a stream of events — `MessageStart`, `ContentBlockDelta` (text chunks), `ContentBlockStart` (tool use begins), `ContentBlockStop`, `MessageStop`, and `Metadata`. Each event is parsed incrementally by a `StreamParseState` machine inside the loop.
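As a rough illustration of that incremental parsing, here's a toy state machine over a reduced event set (text deltas only; tool-use and metadata events omitted). The event and state names echo the chapter, but the implementation is a sketch, not kiro-cli's:

```rust
// Hypothetical, trimmed-down stream events; the real StreamParseState
// also handles tool-use blocks, metadata, and errors.
enum StreamEvent {
    MessageStart,
    ContentBlockDelta(String), // a chunk of assistant text
    ContentBlockStop,
    MessageStop,
}

#[derive(Default)]
struct StreamParseState {
    text: String,
    done: bool,
}

impl StreamParseState {
    // Fold one event into the accumulated state.
    fn next(&mut self, event: StreamEvent) {
        match event {
            StreamEvent::ContentBlockDelta(chunk) => self.text.push_str(&chunk),
            StreamEvent::MessageStop => self.done = true,
            _ => {}
        }
    }
}

fn main() {
    let events = vec![
        StreamEvent::MessageStart,
        StreamEvent::ContentBlockDelta("Found 12 ".into()),
        StreamEvent::ContentBlockDelta("TODOs".into()),
        StreamEvent::ContentBlockStop,
        StreamEvent::MessageStop,
    ];
    let mut state = StreamParseState::default();
    for e in events {
        state.next(e);
    }
    assert_eq!(state.text, "Found 12 TODOs");
    assert!(state.done);
}
```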
### 3. Tool Call Detection
When the model wants a tool, it doesn't say so in plain text. It emits structured events: a `ContentBlockStart` with a `ToolUse` variant carrying the tool name and ID, followed by `ContentBlockDelta` events with JSON input fragments, and finally a `ContentBlockStop`. The loop reassembles the JSON and validates it:
```rust
// crates/agent/src/agent/agent_loop/mod.rs (line ~330, simplified)
StreamEvent::ContentBlockStop(_) => {
    if let Some((tool_use_id, name, json_buf)) = self.parsing_tool_use.take() {
        match serde_json::from_str::<Value>(&json_buf) {
            Ok(val) => self.tool_uses.push(ToolUseBlock { tool_use_id, name, input: val }),
            Err(_) => self.invalid_tool_uses.push(InvalidToolUse { tool_use_id, name, content: json_buf }),
        }
    }
}
```
If the JSON is malformed, the loop marks the stream as errored and the outer Agent retries with a message asking the model to simplify.
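Here's a minimal, self-contained sketch of the reassembly step. The real loop validates with `serde_json`; to keep this runnable without dependencies, a naive brace-balance check (`looks_like_json`, invented for this example) stands in for real JSON parsing:

```rust
// Sketch of reassembling a tool call from streamed JSON fragments.
struct ToolUseBuilder {
    tool_use_id: String,
    name: String,
    json_buf: String,
}

impl ToolUseBuilder {
    // Called for each ContentBlockDelta carrying an input fragment.
    fn push_delta(&mut self, fragment: &str) {
        self.json_buf.push_str(fragment);
    }

    // Called on ContentBlockStop: decide valid vs. invalid tool use.
    fn finish(self) -> Result<(String, String), String> {
        if looks_like_json(&self.json_buf) {
            Ok((self.name, self.json_buf))
        } else {
            Err(self.json_buf) // outer Agent will ask the model to retry
        }
    }
}

// Naive stand-in for serde_json validation: non-empty, starts with '{',
// and has balanced braces. Good enough for a sketch only.
fn looks_like_json(s: &str) -> bool {
    let opens = s.chars().filter(|&c| c == '{').count();
    let closes = s.chars().filter(|&c| c == '}').count();
    !s.is_empty() && opens == closes && s.trim_start().starts_with('{')
}

fn main() {
    let mut b = ToolUseBuilder {
        tool_use_id: "tu_1".into(),
        name: "grep".into(),
        json_buf: String::new(),
    };
    for frag in ["{\"pattern\": ", "\"TODO\", ", "\"path\": \"src/\"}"] {
        b.push_delta(frag);
    }
    assert!(b.finish().is_ok());
}
```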
### 4. Execution States
The loop is a state machine with six states:

| State | Meaning |
|---|---|
| `Idle` | Waiting for a prompt |
| `SendingRequest` | Packaging and sending to the model |
| `ConsumingResponse` | Streaming tokens from the model |
| `PendingToolUseResults` | Model asked for tools; waiting for results |
| `UserTurnEnded` | Model finished; no pending work |
| `Errored` | Something broke; waiting for retry or close |
Every state transition emits a `LoopStateChange` event so the outer Agent (and ultimately the TUI) can update in real time.
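One way to make the legal transitions explicit is a small predicate over the state enum. The transition set below is an assumption inferred from this chapter (the real loop encodes it implicitly in `main_loop`), sketched for illustration:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LoopState {
    Idle,
    SendingRequest,
    ConsumingResponse,
    PendingToolUseResults,
    UserTurnEnded,
    Errored,
}

// Assumed transition table: the happy path, the tool-use detour,
// and the error path. Any state may fall into Errored.
fn can_transition(from: LoopState, to: LoopState) -> bool {
    use LoopState::*;
    matches!(
        (from, to),
        (Idle, SendingRequest)
            | (SendingRequest, ConsumingResponse)
            | (ConsumingResponse, PendingToolUseResults)
            | (ConsumingResponse, UserTurnEnded)
            | (PendingToolUseResults, SendingRequest)
            | (UserTurnEnded, Idle)
            | (Errored, SendingRequest)
            | (_, Errored)
    )
}

fn main() {
    // A cycle with a tool call: response ends, tools pending, next request.
    assert!(can_transition(LoopState::ConsumingResponse, LoopState::PendingToolUseResults));
    assert!(can_transition(LoopState::PendingToolUseResults, LoopState::SendingRequest));
    // A prompt can't end a turn without a model response in between.
    assert!(!can_transition(LoopState::Idle, LoopState::UserTurnEnded));
}
```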
### 5. Cancellation
When you press Ctrl+C, a `CancellationToken` fires. The loop drains any remaining stream events, finalizes the parse state, and transitions to `UserTurnEnded` with `end_reason: Cancelled`. No zombie streams, no leaked futures.
```rust
// crates/agent/src/agent/agent_loop/mod.rs (line ~220, simplified)
AgentLoopRequest::Cancel => {
    self.cancel_token.cancel();
    // drain remaining stream events...
    self.set_execution_state(LoopState::UserTurnEnded);
}
```
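A runnable approximation of that shutdown path, with a hand-rolled `CancelToken` (an `AtomicBool` standing in for tokio_util's `CancellationToken`) and a plain vector standing in for the async stream — all invented for this sketch:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Stand-in for tokio_util's CancellationToken, built on an AtomicBool
// so the example runs without async dependencies.
#[derive(Clone)]
struct CancelToken(Arc<AtomicBool>);

impl CancelToken {
    fn new() -> Self { CancelToken(Arc::new(AtomicBool::new(false))) }
    fn cancel(&self) { self.0.store(true, Ordering::SeqCst); }
    fn is_cancelled(&self) -> bool { self.0.load(Ordering::SeqCst) }
}

fn main() {
    let token = CancelToken::new();
    let stream = vec!["Found ", "12 ", "TODOs ", "across ", "4 files"];
    let mut consumed = Vec::new();

    for (i, chunk) in stream.into_iter().enumerate() {
        consumed.push(chunk);
        if i == 1 {
            token.cancel(); // Ctrl+C arrives mid-stream
        }
        if token.is_cancelled() {
            break; // finalize parse state, transition to UserTurnEnded
        }
    }
    assert_eq!(consumed, vec!["Found ", "12 "]);
    println!("cancelled cleanly after {} chunks", consumed.len());
}
```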
### 6. Error Recovery
The loop handles several error classes from `StreamErrorKind`:

- `ContextWindowOverflow` — the conversation is too long; the outer Agent triggers compaction (summarization) and retries
- `Throttling` — the service is busy; retryable after backoff
- `StreamTimeout` — the model took too long; the Agent injects a "try smaller steps" message and retries
- `Interrupted` — user cancelled; clean shutdown
- `InvalidJson` — the model produced broken tool-call JSON; retry with a "split up the work" hint
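The recovery policies above can be summarized as a classification function. The variant names follow the chapter; the `Recovery` enum and `recovery_for` are illustrative inventions, not the real API:

```rust
// Error classes as described in this chapter.
#[derive(Debug, PartialEq)]
enum StreamErrorKind {
    ContextWindowOverflow,
    Throttling,
    StreamTimeout,
    Interrupted,
    InvalidJson,
}

// Hypothetical recovery actions the outer Agent might take.
#[derive(Debug, PartialEq)]
enum Recovery {
    CompactAndRetry,   // summarize history, then retry
    RetryAfterBackoff, // wait, then retry
    RetryWithHint,     // inject "try smaller steps" / "split up the work"
    Shutdown,          // clean stop, no retry
}

fn recovery_for(err: &StreamErrorKind) -> Recovery {
    match err {
        StreamErrorKind::ContextWindowOverflow => Recovery::CompactAndRetry,
        StreamErrorKind::Throttling => Recovery::RetryAfterBackoff,
        StreamErrorKind::StreamTimeout | StreamErrorKind::InvalidJson => Recovery::RetryWithHint,
        StreamErrorKind::Interrupted => Recovery::Shutdown,
    }
}

fn main() {
    assert_eq!(recovery_for(&StreamErrorKind::Throttling), Recovery::RetryAfterBackoff);
    assert_eq!(recovery_for(&StreamErrorKind::Interrupted), Recovery::Shutdown);
}
```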
## How It Fits Together (Sequence Diagram)
```mermaid
sequenceDiagram
    participant User
    participant Loop as Agent Loop
    participant LLM as Model (LLM)
    participant Tools as Tool System
    User->>Loop: prompt("find my TODOs")
    Loop->>LLM: stream(messages, tool_specs)
    LLM-->>Loop: ContentBlockDelta (tool_use: grep)
    Loop-->>Loop: detect tool call, state → PendingToolUseResults
    Loop->>Tools: execute grep {pattern: "TODO"}
    Tools-->>Loop: ToolResult (12 matches)
    Loop->>LLM: stream(messages + tool_result)
    LLM-->>Loop: ContentBlockDelta ("Found 12 TODOs…")
    Loop-->>User: UserTurnEnd (summary)
```
The loop sits at the center — it's the only component that talks to both the model and the tool system. The User and Tools never interact directly.
## Internal Implementation
The Agent Loop lives in `crates/agent/src/agent/agent_loop/` and is split across four files:

| File | Purpose |
|---|---|
| `mod.rs` | The `AgentLoop` struct, `main_loop`, stream parsing, `AgentLoopHandle` |
| `protocol.rs` | Request/response/event types (`AgentLoopRequest`, `AgentLoopEventKind`, `UserTurnMetadata`) |
| `types.rs` | Wire types (`Message`, `StreamEvent`, `ContentBlock`, `ToolUseBlock`, `StreamError`) |
| `model.rs` | The `Model` trait — the abstraction over any LLM backend |
### The Actor Pattern
The `AgentLoop` is a Tokio actor. It spawns onto its own task and communicates through channels:
```rust
// crates/agent/src/agent/agent_loop/mod.rs (line ~155)
pub fn spawn(mut self) -> AgentLoopHandle {
    let handle = tokio::spawn(async move {
        self.main_loop().await;
    });
    AgentLoopHandle::new(id, loop_req_tx, loop_event_rx, handle)
}
```
The caller gets an `AgentLoopHandle` with three capabilities: `send_request()`, `recv()` (events), and `cancel()`. The handle's `Drop` implementation aborts the task — no orphaned loops.
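The same requests-in/events-out shape can be sketched with plain threads and `std::sync::mpsc` channels (the real loop uses a Tokio task and async channels). All names here are illustrative:

```rust
use std::sync::mpsc;
use std::thread;

// Toy request/event protocol, standing in for AgentLoopRequest and
// AgentLoopEventKind.
enum Request { Prompt(String), Cancel }

#[derive(Debug, PartialEq)]
enum Event { TurnEnded(String) }

// Stand-in for AgentLoopHandle: one sender, one receiver, one join handle.
struct LoopHandle {
    req_tx: mpsc::Sender<Request>,
    event_rx: mpsc::Receiver<Event>,
    join: thread::JoinHandle<()>,
}

fn spawn_loop() -> LoopHandle {
    let (req_tx, req_rx) = mpsc::channel();
    let (event_tx, event_rx) = mpsc::channel();
    let join = thread::spawn(move || {
        // The actor: owns its state, reacts to one request at a time.
        while let Ok(req) = req_rx.recv() {
            match req {
                Request::Prompt(p) => {
                    // Stand-in for a full model round-trip.
                    let _ = event_tx.send(Event::TurnEnded(format!("answered: {p}")));
                }
                Request::Cancel => break,
            }
        }
    });
    LoopHandle { req_tx, event_rx, join }
}

fn main() {
    let handle = spawn_loop();
    handle.req_tx.send(Request::Prompt("find my TODOs".into())).unwrap();
    let evt = handle.event_rx.recv().unwrap();
    assert_eq!(evt, Event::TurnEnded("answered: find my TODOs".into()));
    handle.req_tx.send(Request::Cancel).unwrap();
    handle.join.join().unwrap();
}
```

The design payoff is the same as in the real code: the caller never touches the loop's state directly, only its channels, so there is no shared-mutable-state locking to reason about.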
### The Main Select Loop
Inside `main_loop()` (line ~170), a `tokio::select!` multiplexes two branches:

- Request branch — receives `AgentLoopRequest` messages (new prompt, cancel, get state)
- Stream branch — pulls the next `StreamResult` from the current model response stream
```rust
// crates/agent/src/agent/agent_loop/mod.rs (line ~172, simplified)
loop {
    tokio::select! {
        req = self.loop_req_rx.recv() => {
            self.handle_agent_loop_request(req.payload).await;
        },
        event = self.curr_stream.next() => {
            self.curr_stream_state.next(event, &mut loop_events);
            // transition state based on what the parser found
        }
    }
}
```
When the stream branch detects the response has ended, it checks: did the model ask for tools? If yes → `PendingToolUseResults`. If no → `UserTurnEnded`. If error → `Errored`.
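That end-of-stream decision is easy to state as a pure function. `ParseResult` and `next_state` are invented for this sketch; the real code makes the same choice inline:

```rust
#[derive(Debug, PartialEq)]
enum LoopState { PendingToolUseResults, UserTurnEnded, Errored }

// What the stream parser found by the time the response ended.
struct ParseResult {
    tool_uses: Vec<String>, // reassembled tool calls, if any
    errored: bool,
}

fn next_state(result: &ParseResult) -> LoopState {
    if result.errored {
        LoopState::Errored
    } else if !result.tool_uses.is_empty() {
        LoopState::PendingToolUseResults
    } else {
        LoopState::UserTurnEnded
    }
}

fn main() {
    let tool_call = ParseResult { tool_uses: vec!["grep".into()], errored: false };
    assert_eq!(next_state(&tool_call), LoopState::PendingToolUseResults);

    let done = ParseResult { tool_uses: vec![], errored: false };
    assert_eq!(next_state(&done), LoopState::UserTurnEnded);
}
```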
### The Model Trait
The `Model` trait (in `model.rs`) is the loop's only dependency on the outside world:
```rust
// crates/agent/src/agent/agent_loop/model.rs (line ~27)
pub trait Model: Debug + Send + Sync + 'static {
    fn stream(
        &self,
        messages: Vec<Message>,
        tool_specs: Option<Vec<ToolSpec>>,
        system_prompt: Option<String>,
        cancel_token: CancellationToken,
    ) -> Pin<Box<dyn Stream<Item = StreamResult> + Send>>;
}
```
This trait is implemented by the real backend client (which talks to the LLM service) and by `MockModel` (used in tests). The loop doesn't know or care which model it's talking to — it just consumes the stream.
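To show why the trait boundary makes testing easy, here's an iterator-based stand-in: the pinned async `Stream` is swapped for a plain `Iterator` so the sketch runs without Tokio, and the trait signature is simplified accordingly. The key idea survives: the consumer only sees a sequence of `StreamResult` items and never knows which backend produced them.

```rust
// Simplified wire types; the real StreamEvent has many more variants.
#[derive(Debug, Clone, PartialEq)]
enum StreamEvent { Delta(String), Stop }
type StreamResult = Result<StreamEvent, String>;

// Iterator-based stand-in for the real Model trait.
trait Model {
    fn stream(&self, prompt: &str) -> Box<dyn Iterator<Item = StreamResult>>;
}

// A mock backend that replays canned events, like MockModel in tests.
struct MockModel {
    script: Vec<StreamEvent>,
}

impl Model for MockModel {
    fn stream(&self, _prompt: &str) -> Box<dyn Iterator<Item = StreamResult>> {
        Box::new(self.script.clone().into_iter().map(Ok))
    }
}

// The "loop" side: consumes any Model without knowing which one.
fn collect_text(model: &dyn Model, prompt: &str) -> String {
    let mut out = String::new();
    for item in model.stream(prompt) {
        if let Ok(StreamEvent::Delta(chunk)) = item {
            out.push_str(&chunk);
        }
    }
    out
}

fn main() {
    let mock = MockModel {
        script: vec![
            StreamEvent::Delta("Found 12 ".into()),
            StreamEvent::Delta("TODOs".into()),
            StreamEvent::Stop,
        ],
    };
    assert_eq!(collect_text(&mock, "find my TODOs"), "Found 12 TODOs");
}
```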
### The Outer Agent Drives the Loop
The `AgentLoop` itself doesn't execute tools or manage conversation history. That's the outer `Agent` in `crates/agent/src/agent/mod.rs`. The Agent owns the loop handle and reacts to its events:
```rust
// crates/agent/src/agent/mod.rs (line ~835, simplified)
tokio::select! {
    evt = self.agent_loop.recv() => {
        self.handle_agent_loop_event(evt).await;
    },
    // ... other branches for tool results, MCP events, etc.
}
```
When the Agent receives `ResponseStreamEnd` with tool uses, it dispatches them to the Tool System. When tool results come back, it packages them into a new `SendRequestArgs` and calls `send_request()` on the loop handle — starting the next cycle.
This separation means the loop is pure protocol: stream in, events out. All the messy business logic (permissions, hooks, compaction, tool execution) lives in the Agent.
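Putting the division of labor together, here's a toy turn driver: a fake model round-trip that first requests a tool and then, once a tool result is in the history, produces the final answer. Every name and message format here is invented for the sketch:

```rust
// What one model round-trip can yield, from the agent's point of view.
enum LoopOutput {
    ToolUses(Vec<String>),
    FinalAnswer(String),
}

// Stand-in for one request/response cycle: the first request asks for
// grep; once a tool result is in the history, the "model" answers.
fn run_cycle(history: &[String]) -> LoopOutput {
    if history.iter().any(|m| m.starts_with("tool_result:")) {
        LoopOutput::FinalAnswer("Found 12 TODOs across 4 files".into())
    } else {
        LoopOutput::ToolUses(vec!["grep".into()])
    }
}

// The outer-Agent role: execute tools, append results, start the next
// cycle, until the turn ends.
fn drive_turn(prompt: &str) -> (String, u32) {
    let mut history = vec![format!("user: {prompt}")];
    let mut cycles = 0;
    loop {
        cycles += 1;
        match run_cycle(&history) {
            LoopOutput::ToolUses(tools) => {
                for tool in tools {
                    // The Tool System would really execute here.
                    history.push(format!("tool_result: {tool} -> 12 matches"));
                }
            }
            LoopOutput::FinalAnswer(text) => return (text, cycles),
        }
    }
}

fn main() {
    let (answer, cycles) = drive_turn("find my TODOs");
    assert_eq!(cycles, 2); // one tool call means two cycles, as in the use case
    println!("{answer}");
}
```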
## Key Takeaways
- The Agent Loop is the innermost execution loop shared by both V1 and V2 of kiro-cli
- It's a Tokio actor that multiplexes incoming requests with an outgoing model stream
- A single user turn may involve multiple cycles (prompt → tool → result → tool → result → answer)
- The loop is a state machine (`Idle` → `SendingRequest` → `ConsumingResponse` → `PendingToolUseResults` or `UserTurnEnded`)
- It handles cancellation, malformed JSON, and stream errors gracefully
- The loop doesn't execute tools itself — it signals `PendingToolUseResults` and the outer Agent takes over
- The `Model` trait abstracts the LLM backend, making the loop testable with `MockModel`
## What's Next
The loop dispatches tools — but what is a tool? How does grep become something the LLM can call? How are permissions checked, results formatted, and timeouts enforced?
Next up: Chapter 7: Tool System — the agent's hands.