Message History#
KodeAgent manages conversation history by maintaining a single, unified view of the dialogue that is directly compatible with LLM APIs while still preserving rich, structured information for the agent’s internal reasoning.
Unified History (self.chat_history)#
Starting from v0.10.0, KodeAgent uses a single chat_history member (a list of dictionaries) as the master record. This approach offers several advantages:
API Compatibility: The history is already in the format expected by most LLM APIs (e.g., OpenAI, Gemini, Anthropic), containing
role,content, and optionaltool_calls.Consistency: There’s no longer a “dual storage” paradigm to keep in sync, reducing complexity and potential for bugs.
Provider Agnostic: The history handles multimodal content and tool calls in a standard way across different LLM and SLM providers through LiteLLM.
Structured Messaging & Formatters#
While messages are stored as dictionaries, the agent still reasons using structured Pydantic objects (like ChatMessage, ReActChatMessage, or CodeActChatMessage).
To bridge this gap, KodeAgent uses specialized History Formatters:
Bidirectional Mapping: Formatters convert structured agent objects into API-compliant dictionaries and vice versa.
Agent Specificity: Each agent type (ReAct, CodeAct, FCA) has a dedicated formatter that knows how to map its specific thoughts, actions, and observations into the standard
tool/assistantroles.Rich Content: Thoughts and intermediate reasoning are often stored as text in the assistant’s response or as metadata, ensuring they are preserved for context without breaking API schemas.
The Message Lifecycle#
Input: Your request is wrapped in a
ChatMessage, formatted into a dictionary, and added toself.chat_history.Inference: When it’s time to “think,” the agent sends
self.chat_history(excluding the system prompt) to the LLM.Parsing: The LLM responds. KodeAgent’s
parse_text_response(or native parsing) extracts thoughts, actions, or final answers, which are then added back to the history.Observation: Any tool output or code result is recorded as a
toolrole message in the unified history, and the cycle repeats.
Truncation and Cleanup#
To handle long conversations and prevent token overflow:
Automatic Truncation: Large tool results or long message contents are truncated when being added to history or when being formatted for display/observation.
History Management: Methods like
get_history()allow for easy inspection of the dialogue, whileclear_history()resets the agent’s state for a new task.
This unified system ensures that KodeAgent remains lightweight, scalable, and easy to debug while maintaining full compatibility with modern LLM capabilities.