Letta Streaming Analysis & Implementation Guide#
Real Streaming Behavior (From Testing)#
Based on actual testing with the message "create a memory block called cameron":
Message Flow Pattern#
[0-21] reasoning_message (ID: message-f7b4fa60-0195-4e50-98c9-dfb6a03b013f)
[22-36] tool_call_message (ID: message-f7b4fa60-0195-4e50-98c9-dfb6a03b013f) ← SAME ID
[37] tool_return_message (ID: message-e906b6cc-33a1-440c-8ff6-15b06ec287c8) ← NEW ID
[38-53] reasoning_message (ID: message-cc7aa672-7859-4e22-9ccd-2efbde068e6c) ← NEW ID
[54-88] assistant_message (ID: message-cc7aa672-7859-4e22-9ccd-2efbde068e6c) ← SAME ID
[89] stop_reason
[90] usage_statistics
Key Insights#
- Chunks are DELTAS - Each chunk contains only the new text, not the full accumulated text
- Message IDs group related chunks - All chunks with the same ID belong together
- Reasoning + Content share IDs - Reasoning and its paired content (tool_call OR assistant) have the same ID
- Tool returns have separate IDs - Tool return messages have different IDs from their calls
- New reasoning = New message group - When a new reasoning_message with a different ID arrives, the previous group is complete
Message Grouping Rules#
Rule 1: Message Boundary Detection#
A new message group starts when we receive a reasoning_message chunk with a different ID than the current one.
Rule 2: Message Structure#
Each message group contains:
reasoning(optional) - fromreasoning_messagechunkscontent- from eithertool_call_messageORassistant_messagechunksid- the shared message ID from Letta
Rule 3: Tool Returns#
Tool return messages (tool_return_message) have separate IDs and should be stored separately, then paired with their tool call using step_id.
Correct Implementation Strategy#
Data Structure#
interface AccumulatingMessage {
id: string;
reasoning: string; // Accumulate reasoning chunks here
content: string; // Accumulate tool_call OR assistant chunks here
type: 'tool_call' | 'assistant' | null; // What kind of content
toolCallName?: string; // For tool calls
}
Algorithm#
currentMessage = null
completedMessages = []
ON CHUNK RECEIVED:
chunkId = chunk.id
// NEW MESSAGE GROUP DETECTED
if chunk.message_type === 'reasoning_message' AND currentMessage AND currentMessage.id !== chunkId:
// Previous message is complete
completedMessages.push(currentMessage)
currentMessage = { id: chunkId, reasoning: '', content: '', type: null }
// INITIALIZE IF NEEDED
if NOT currentMessage:
currentMessage = { id: chunkId, reasoning: '', content: '', type: null }
// ACCUMULATE BASED ON TYPE
if chunk.message_type === 'reasoning_message':
currentMessage.reasoning += chunk.reasoning
else if chunk.message_type === 'tool_call_message':
currentMessage.type = 'tool_call'
currentMessage.toolCallName = chunk.tool_call.name
currentMessage.content += chunk.tool_call.arguments // Delta!
else if chunk.message_type === 'assistant_message':
currentMessage.type = 'assistant'
currentMessage.content += extractText(chunk.content) // Delta!
else if chunk.message_type === 'tool_return_message':
// Tool returns are separate - just store them
storeToolReturn(chunk)
ON STREAM COMPLETE:
// Finalize the last message
if currentMessage:
completedMessages.push(currentMessage)
// Convert to display format and add to messages array
convertAndStore(completedMessages)
Problems We've Had#
Problem 1: "One Slot" - Messages Replacing Each Other#
Symptom: Only one message visible during streaming, messages replace each other
Cause: Only showing currentStream, not showing completedStreamPhases alongside it
Fix: Display BOTH completed messages AND current accumulating message
Problem 2: Messages Disappearing After Completion#
Symptom: Messages visible during stream, gone after stream ends Cause: Clearing state before converting to permanent messages Fix: Convert to messages FIRST, THEN clear streaming state
Problem 3: Finalization Timing#
Symptom: Messages not moving to completed at the right time Cause: Finalizing on message TYPE change instead of ID change Fix: Only finalize when we see a new reasoning_message with a different ID
Problem 4: Fetching from Server#
Symptom: Flickering, delay, messages appearing twice Cause: Trying to fetch "official" messages from server after streaming Fix: Don't fetch from server - we have everything from the stream
Current Issues#
- Messages still disappearing during streaming (one slot problem persists)
- Messages not reappearing after completion
- Flickering between phases
Required Fixes#
- Simplify state structure - One accumulating message, one array of completed
- Fix display logic - Show BOTH accumulated + completed simultaneously
- Fix finalization - Trigger only on ID change
- Fix persistence - Convert to messages immediately, don't clear state prematurely