Explanation
Understanding-oriented content for gomcptest architecture and concepts
Explanation documents discuss and clarify concepts to broaden the reader’s understanding of topics. They provide context and illuminate ideas.
This section provides deeper background on how gomcptest works, its architecture, and the concepts behind it. The explanations are organized from foundational concepts to specific implementations:
- Architecture - Overall system design and component relationships
- MCP Protocol - Core protocol concepts and communication patterns
- Event System - Real-time event architecture for transparency and monitoring
- MCP Tools - Tool implementation details and design patterns
- AgentFlow Implementation - Specific web interface implementation of the event system
1 - gomcptest Architecture
Deep dive into the system architecture and design decisions
This document explains the architecture of gomcptest, the design decisions behind it, and how the various components interact to create a custom Model Context Protocol (MCP) host.
The Big Picture
The gomcptest project implements a custom host for the Model Context Protocol (MCP). It's designed to enable testing and experimentation with agentic systems without requiring direct integration with commercial LLM platforms.
The system is built with these key principles in mind:
- Modularity: Components are designed to be interchangeable
- Compatibility: The API mimics the OpenAI API for easy integration
- Extensibility: New tools can be easily added to the system
- Testing: The architecture facilitates testing of agentic applications
Core Components
Host (OpenAI Server)
The host is the central component, located in /host/openaiserver. It presents an OpenAI-compatible API interface and connects to Google's Vertex AI for model inference. This compatibility layer makes it easy to integrate with existing tools and libraries designed for OpenAI.
The host has several key responsibilities:
- API Compatibility: Implementing the OpenAI chat completions API
- Session Management: Maintaining chat history and context
- Model Integration: Connecting to Vertex AI’s Gemini models
- Function Calling: Orchestrating function/tool calls based on model outputs
- Response Streaming: Supporting streaming responses to the client
- Artifact Storage: Managing file uploads and downloads through RESTful endpoints
Unlike commercial implementations, this host is designed for local development and testing, emphasizing flexibility and observability over production-ready features like authentication or rate limiting.
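Because the host speaks the standard chat completions API, any OpenAI-style client can exercise it. The following minimal Go sketch uses only the standard library; the listen address is an assumption and should be adjusted to your configuration:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Build a standard OpenAI-style chat completion request.
    body, _ := json.Marshal(map[string]any{
        "model": "gemini-2.0-flash",
        "messages": []map[string]string{
            {"role": "user", "content": "List the files in the current directory"},
        },
    })

    // The listen address is an assumption; adjust it to your configuration.
    resp, err := http.Post("http://localhost:8080/v1/chat/completions",
        "application/json", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Print the raw JSON response for inspection.
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}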
MCP Tools
The tools are standalone executables that implement the Model Context Protocol. Each tool is designed to perform a specific function, such as executing shell commands or manipulating files.
Tools follow a consistent pattern:
- They communicate via standard I/O using the MCP JSON-RPC protocol
- They expose a specific set of parameters
- They handle their own error conditions
- They return results in a standardized format
This approach allows tools to be:
- Developed independently
- Tested in isolation
- Used in different host environments
- Chained together in complex workflows
Artifact Storage
The artifact storage system provides a RESTful API for managing generic file uploads and downloads. This component is integrated directly into the OpenAI server host and offers several key features:
Core Capabilities:
- Universal File Support: Accepts any file type (text, binary, images, documents)
- UUID-based Identification: Each uploaded file receives a unique UUID identifier
- Metadata Management: Automatically tracks original filename, content type, size, and upload timestamp
- Configurable Storage: Supports custom storage directories and file size limits
Storage Architecture:
- Files are stored using UUID-based naming to prevent conflicts and ensure uniqueness
- Metadata is stored in companion .meta.json files for efficient retrieval
- The storage directory structure is flat but organized with clear separation between data and metadata
Integration Points:
- Exposed through RESTful endpoints (POST /artifact/ and GET /artifact/{id})
- Supports CORS for web-based integrations
- Uses the same middleware stack as the main API (CORS, logging, error handling)
This system enables AI agents and users to persistently store and share files across conversations and sessions, making it particularly useful for workflows involving document processing, image analysis, or data manipulation.
CLI
The CLI provides a user interface similar to tools like “Claude Code” or “OpenAI ChatGPT”. It connects to the OpenAI-compatible server and provides a way to interact with the LLM and tools through a conversational interface.
Data Flow
Chat Conversation Flow
- The user sends a request to the CLI
- The CLI forwards this request to the OpenAI-compatible server
- The server sends the request to Vertex AI’s Gemini model
- The model may identify function calls in its response
- The server executes these function calls by invoking the appropriate MCP tools
- The results are provided back to the model to continue its response
- The final response is streamed back to the CLI and presented to the user
Artifact Storage Flow
Upload Flow:
- Client sends a POST request to /artifact/ with file data and headers
- Server validates required headers (Content-Type, X-Original-Filename)
- Server generates a UUID for the artifact
- File data is streamed to disk using the UUID as filename
- Metadata is saved in a companion .meta.json file
- Server responds with the artifact ID
Retrieval Flow:
- Client sends a GET request to /artifact/{id}
- Server validates the UUID format
- Server reads the metadata file to get original file information
- Server streams the file back with appropriate headers (Content-Type, Content-Disposition, etc.)
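To make these flows concrete, here is an illustrative Go client for both operations. The base URL is an assumption, and the exact shape of the upload response (plain text or JSON) may differ from this sketch; the headers and routes are the ones described above:
package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    base := "http://localhost:8080" // assumed server address

    // Upload: POST /artifact/ with the required headers.
    data, err := os.ReadFile("report.pdf")
    if err != nil {
        panic(err)
    }
    req, _ := http.NewRequest("POST", base+"/artifact/", bytes.NewReader(data))
    req.Header.Set("Content-Type", "application/pdf")
    req.Header.Set("X-Original-Filename", "report.pdf")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    // The response carries the artifact ID; its exact encoding (plain text
    // or JSON) depends on the server, so it is simply printed here.
    reply, _ := io.ReadAll(resp.Body)
    resp.Body.Close()
    fmt.Println("upload response:", string(reply))

    // Retrieval: GET /artifact/{id} streams the file back with its original
    // Content-Type and Content-Disposition headers.
    artifactID := "00000000-0000-0000-0000-000000000000" // placeholder UUID
    download, err := http.Get(base + "/artifact/" + artifactID)
    if err != nil {
        panic(err)
    }
    defer download.Body.Close()
    fmt.Println("content type:", download.Header.Get("Content-Type"))
}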
This dual-flow architecture allows the system to handle both conversational AI interactions and persistent file storage independently, enabling richer workflows that combine real-time AI processing with persistent data management.
Design Decisions Explained
Why OpenAI API Compatibility?
The OpenAI API has become a de facto standard in the LLM space. By implementing this interface, gomcptest can work with a wide variety of existing tools, libraries, and frontends with minimal adaptation.
Why Google Vertex AI?
Vertex AI provides access to Google’s Gemini models, which have strong function calling capabilities. The implementation could be extended to support other model providers as needed.
Why Standalone Tool Executables?
By implementing tools as standalone executables rather than library functions, we gain several advantages:
- Security through isolation
- Language agnosticism (tools can be written in any language)
- Ability to distribute tools separately from the host
- Easier testing and development
Why MCP?
The Model Context Protocol provides a standardized way for LLMs to interact with external tools. By adopting this protocol, gomcptest ensures compatibility with tools developed for other MCP-compatible hosts.
Why Built-in Artifact Storage?
The artifact storage system is integrated directly into the host rather than implemented as a separate MCP tool for several strategic reasons:
Performance and Simplicity:
- Direct HTTP endpoints avoid the overhead of MCP protocol wrapping for file operations
- Streaming file uploads and downloads are more efficient without JSON-RPC encapsulation
- Reduces complexity for web-based clients that need direct file access
Integration Benefits:
- Shares the same middleware stack (CORS, logging, error handling) as the main API
- Uses consistent configuration patterns with other host components
- Simplifies deployment by reducing the number of separate services
API Design:
- RESTful endpoints align with standard web practices for file operations
- HTTP semantics (Content-Type, Content-Disposition) map naturally to file storage needs
- Range request support for large files comes naturally with http.ServeFile
This approach provides a clean separation between the conversational AI capabilities (handled via MCP tools) and persistent storage capabilities (handled via integrated HTTP endpoints).
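To illustrate the http.ServeFile point above, here is a rough handler sketch. The metadata field names, storage directory, and error handling are illustrative, not the actual gomcptest implementation:
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "path/filepath"
    "strings"
)

// Metadata mirrors the companion .meta.json file; field names here are
// illustrative, not necessarily the exact ones used by gomcptest.
type Metadata struct {
    OriginalFilename string `json:"original_filename"`
    ContentType      string `json:"content_type"`
}

var storageDir = "./artifacts" // assumed storage directory

func handleArtifactDownload(w http.ResponseWriter, r *http.Request) {
    id := strings.TrimPrefix(r.URL.Path, "/artifact/")

    // Load the companion metadata written at upload time.
    raw, err := os.ReadFile(filepath.Join(storageDir, id+".meta.json"))
    if err != nil {
        http.Error(w, "artifact not found", http.StatusNotFound)
        return
    }
    var meta Metadata
    if err := json.Unmarshal(raw, &meta); err != nil {
        http.Error(w, "corrupt metadata", http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Type", meta.ContentType)
    w.Header().Set("Content-Disposition",
        fmt.Sprintf("attachment; filename=%q", meta.OriginalFilename))

    // http.ServeFile brings Range request and conditional GET support for free.
    http.ServeFile(w, r, filepath.Join(storageDir, id))
}

func main() {
    http.HandleFunc("/artifact/", handleArtifactDownload)
    http.ListenAndServe(":8080", nil)
}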
Limitations and Future Directions
The current implementation has several limitations:
- Single chat session per instance
- Limited support for authentication and authorization
- No persistence of chat history between restarts
- No built-in support for rate limiting or quotas
Future enhancements could include:
- Support for multiple chat sessions
- Integration with additional model providers
- Enhanced security features
- Improved error handling and logging
- Performance optimizations for large-scale deployments
Conclusion
The gomcptest architecture represents a flexible and extensible approach to building custom MCP hosts. It prioritizes simplicity, modularity, and developer experience, making it an excellent platform for experimentation with agentic systems.
By understanding this architecture, developers can effectively utilize the system, extend it with new tools, and potentially adapt it for their specific needs.
2 - Understanding the Model Context Protocol (MCP)
Exploration of what MCP is, how it works, and design decisions behind it
This document explores the Model Context Protocol (MCP), how it works, the design decisions behind it, and how it compares to alternative approaches for LLM tool integration.
What is the Model Context Protocol?
The Model Context Protocol (MCP) is a standardized communication protocol that enables Large Language Models (LLMs) to interact with external tools and capabilities. It defines a structured way for models to request information or take actions in the real world, and for tools to provide responses back to the model.
MCP is designed to solve the problem of extending LLMs beyond their training data by giving them access to:
- Current information (e.g., via web search)
- Computational capabilities (e.g., calculators, code execution)
- External systems (e.g., databases, APIs)
- User environment (e.g., file system, terminal)
How MCP Works
At its core, MCP is a protocol based on JSON-RPC that enables bidirectional communication between LLMs and tools. The basic workflow is:
- The LLM generates a call to a tool with specific parameters
- The host intercepts this call and routes it to the appropriate tool
- The tool executes the requested action and returns the result
- The result is injected into the model’s context
- The model continues generating a response incorporating the new information
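A simplified sketch of this loop, with hypothetical types and helper functions standing in for the host's real internals, looks roughly like this:
package main

import (
    "context"
    "fmt"
)

// Hypothetical types standing in for the host's real message structures.
type FunctionCall struct {
    Name string
    Args map[string]any
}

type Reply struct {
    Text         string
    FunctionCall *FunctionCall // nil when the model produced a final answer
}

type Message struct {
    Role    string
    Content string
}

// Placeholders for the host's model client, MCP tool dispatcher and history
// management; each would be substantially more involved in practice.
func generate(ctx context.Context, history []Message) (Reply, error) { return Reply{Text: "done"}, nil }

func runTool(ctx context.Context, name string, args map[string]any) (string, error) {
    return "tool output", nil
}

func appendToolResult(h []Message, call *FunctionCall, result string, err error) []Message {
    return append(h, Message{Role: "tool", Content: result})
}

// runConversation shows the shape of the loop: ask the model, detect a
// function call, execute the tool, feed the result back, and repeat until
// the model answers without requesting a tool.
func runConversation(ctx context.Context, history []Message) (string, error) {
    for {
        reply, err := generate(ctx, history)
        if err != nil {
            return "", err
        }
        if reply.FunctionCall == nil {
            return reply.Text, nil // final answer
        }
        result, err := runTool(ctx, reply.FunctionCall.Name, reply.FunctionCall.Args)
        history = appendToolResult(history, reply.FunctionCall, result, err)
    }
}

func main() {
    answer, _ := runConversation(context.Background(), nil)
    fmt.Println(answer)
}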
The protocol specifies:
- How tools declare their capabilities and parameters
- How the model requests tool actions
- How tools return results or errors
- How multiple tools can be combined
MCP in gomcptest
In gomcptest, MCP is implemented using a set of independent executables that communicate over standard I/O. This approach has several advantages:
- Language-agnostic: Tools can be written in any programming language
- Process isolation: Each tool runs in its own process for security and stability
- Compatibility: The protocol works with various LLM providers
- Extensibility: New tools can be easily added to the system
Each tool in gomcptest follows a consistent pattern:
- It receives a JSON request on stdin
- It parses the parameters and performs its action
- It formats the result as JSON and returns it on stdout
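From the host's perspective, invoking such a tool amounts to starting the executable, writing the JSON request to its stdin, and decoding the JSON reply from its stdout. A minimal illustrative sketch, where the tool path and parameter names are placeholders:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "os/exec"
)

func main() {
    // Parameters for the tool, encoded as JSON (parameter names are illustrative).
    request, _ := json.Marshal(map[string]any{"path": "."})

    // Start the tool executable and wire up its standard streams.
    cmd := exec.Command("./bin/LS") // placeholder path to a tool binary
    cmd.Stdin = bytes.NewReader(request)

    var out bytes.Buffer
    cmd.Stdout = &out
    if err := cmd.Run(); err != nil {
        panic(err)
    }

    // The tool replies with a single JSON document on stdout.
    var result map[string]any
    if err := json.Unmarshal(out.Bytes(), &result); err != nil {
        panic(err)
    }
    fmt.Println(result)
}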
The Protocol Specification
The core MCP protocol in gomcptest follows this format:
Tool Registration
Tools register themselves with a schema that defines their capabilities:
{
"name": "ToolName",
"description": "Description of what the tool does",
"parameters": {
"type": "object",
"properties": {
"param1": {
"type": "string",
"description": "Description of parameter 1"
},
"param2": {
"type": "number",
"description": "Description of parameter 2"
}
},
"required": ["param1"]
}
}
Function Call Request
When a model wants to use a tool, it generates a function call like:
{
"name": "ToolName",
"params": {
"param1": "value1",
"param2": 42
}
}
Function Call Response
The tool executes the requested action and returns:
{
"result": "Output of the tool's execution"
}
Or, in case of an error:
{
"error": {
"message": "Error message",
"code": "ERROR_CODE"
}
}
Design Decisions in MCP
Several key design decisions shape the MCP implementation in gomcptest:
Standard I/O Communication
By using stdin/stdout for communication, tools can be written in any language that can read from stdin and write to stdout. This makes it easy to integrate existing utilities and libraries.
JSON Schema Tool Definitions
Using JSON Schema for tool definitions provides a clear contract between the model and the tools. It enables:
- Validation of parameters
- Documentation of capabilities
- Potential for automatic code generation
Stateless Design
Tools are designed to be stateless, with each invocation being independent. This simplifies the protocol and makes tools easier to reason about and test.
Pass-through Authentication
The protocol doesn’t handle authentication directly; instead, it relies on the host to manage permissions and authentication. This separation of concerns keeps the protocol simple.
Comparison with Alternatives
vs. OpenAI Function Calling
MCP is similar to OpenAI’s function calling feature but with these key differences:
- MCP is designed to be provider-agnostic
- MCP tools run as separate processes
- MCP provides more detailed error handling
vs. LangChain
Compared to LangChain:
- MCP is a lower-level protocol rather than a framework
- MCP focuses on interoperability rather than abstraction
- MCP allows for stronger process isolation
vs. Agent Protocols
Other agent protocols often focus on higher-level concepts like goals and planning, while MCP focuses specifically on the mechanics of tool invocation.
Future Directions
The MCP protocol in gomcptest could evolve in several ways:
- Enhanced security: More granular permissions and sandboxing
- Streaming responses: Support for tools that produce incremental results
- Bidirectional communication: Supporting tools that can request clarification
- Tool composition: First-class support for chaining tools together
- State management: Optional session state for tools that need to maintain context
Conclusion
The Model Context Protocol as implemented in gomcptest represents a pragmatic approach to extending LLM capabilities through external tools. Its simplicity, extensibility, and focus on interoperability make it a solid foundation for building and experimenting with agentic systems.
By understanding the protocol, developers can create new tools that seamlessly integrate with the system, unlocking new capabilities for LLM applications.
3 - Event System Architecture
Understanding the event-driven architecture that enables real-time tool interaction monitoring and streaming responses in gomcptest.
This document explains the foundational event system architecture in gomcptest that enables real-time monitoring of tool interactions, streaming responses, and transparent agentic workflows. This system is implemented across different components and interfaces, with AgentFlow being one specific implementation.
What is the Event System?
The event system in gomcptest provides real-time visibility into AI-tool interactions through a streaming event architecture. It captures and streams events that occur during tool execution, enabling transparency in how AI agents make decisions and use tools.
Important Implementation Detail: By default, the OpenAI-compatible server only streams standard chat completion responses to maintain API compatibility. Tool events (tool calls and responses) are only included in the stream when the withAllEvents flag is enabled in the server configuration. This design allows for:
- Standard Mode: OpenAI API compatibility with only chat completion chunks
- Enhanced Mode: Full event visibility including tool interactions when withAllEvents is true
Core Event Concepts
Event-Driven Transparency
Traditional AI interactions are often “black boxes” where users see only the final result. The gomcptest event system provides transparency by exposing:
- Tool Call Events: When the AI decides to use a tool, what tool it chooses, and what parameters it passes
- Tool Response Events: The results returned by tools, including success responses and error conditions
- Processing Events: Internal state changes and decision points during request processing
- Stream Events: Real-time updates as responses are generated
Event Types
The system defines several core event types:
Based on the actual implementation in /host/openaiserver/chatengine/vertexai/gemini/tool_events.go:
Chat Completion Events
- ChatCompletionStreamResponse: Standard OpenAI-compatible streaming chunks
- Always included in streams (default behavior)
- Contains incremental content as it’s generated
Event Availability
- Default Streaming: Only ChatCompletionStreamResponse events are sent
- Enhanced Streaming: When withAllEvents = true, all tool events are included as well
Event Architecture Patterns
Producer-Consumer Model
The event system follows a producer-consumer pattern:
- Event Producers: Components that generate events (chat engines, tool executors, stream processors)
- Event Channels: Transport mechanisms for event delivery (Go channels, HTTP streams)
- Event Consumers: Components that process and present events (web interfaces, logging systems, monitors)
Channel-Based Streaming
Events are delivered through channel-based streaming:
type StreamEvent interface {
IsStreamEvent() bool
}
// Event channel returned by streaming operations
func SendStreamingRequest() (<-chan StreamEvent, error) {
eventChan := make(chan StreamEvent, 100)
// Events are sent to the channel as they occur
go func() {
defer close(eventChan)
// Generate and send events
eventChan <- &ToolCallEvent{...}
eventChan <- &ToolResponseEvent{...}
eventChan <- &ContentEvent{...}
}()
return eventChan, nil
}
Each event carries standardized metadata:
- Timestamp: When the event occurred
- Event ID: Unique identifier for tracking
- Event Type: Category and specific type
- Context: Related session, request, or operation context
- Payload: Event-specific data
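A hypothetical Go envelope expressing this metadata might look as follows; the concrete event types in the codebase carry equivalent fields but are not structured exactly like this:
// Illustrative envelope only; the real event types (ToolCallEvent,
// ToolResponseEvent, ...) carry the same kind of information.
type EventEnvelope struct {
    ID        string         `json:"id"`         // unique identifier for tracking
    EventType string         `json:"event_type"` // category, e.g. "tool_call"
    Created   int64          `json:"created"`    // Unix timestamp of the event
    SessionID string         `json:"session_id"` // correlates events within a chat session
    RequestID string         `json:"request_id"` // correlates events with an API request
    Payload   map[string]any `json:"payload"`    // event-specific data
}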
Event Flow Patterns
Request-Response with Events
Traditional request-response patterns are enhanced with event streaming:
- Request Initiated: System generates start events
- Processing Events: Intermediate steps generate progress events
- Tool Interactions: Tool calls and responses generate events
- Content Generation: Streaming content generates incremental events
- Completion: Final response and end events
Event Correlation
Events are correlated through:
- Session IDs: Grouping events within a single chat session
- Request IDs: Linking events to specific API requests
- Tool Call IDs: Connecting tool call and response events
- Parent-Child Relationships: Hierarchical event relationships
Implementation Patterns
Server-Sent Events (SSE) Implementation
The actual implementation in /host/openaiserver/chatengine/chat_completion_stream.go delivers events via Server-Sent Events with specific formatting:
HTTP Headers Set:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Transfer-Encoding: chunked
Event Format:
data: {"event_type":"tool_call","id":"chatcmpl-abc123","object":"tool.call","created":1704067200,"tool_call":{"id":"call_xyz789","name":"sleep","arguments":{"seconds":3}}}
data: {"event_type":"tool_response","id":"chatcmpl-abc123","object":"tool.response","created":1704067201,"tool_response":{"id":"call_xyz789","name":"sleep","response":"Slept for 3 seconds"}}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067202,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"content":"I have completed the 3-second pause."},"finish_reason":"stop"}]}
data: [DONE]
Event Filtering Logic:
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Always sent - OpenAI compatible
jsonBytes, _ := json.Marshal(res)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
default:
// Tool events only sent when withAllEvents = true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
JSON-RPC Event Extensions
For programmatic interfaces, events extend the JSON-RPC protocol:
{
"jsonrpc": "2.0",
"method": "event",
"params": {
"event_type": "tool_call",
"event_data": {
"id": "call_123",
"name": "Edit",
"arguments": {...}
}
}
}
Event Processing Strategies
Real-Time Processing
Events are processed as they occur:
- Immediate Display: Critical events are shown immediately
- Progressive Enhancement: UI updates incrementally as events arrive
- Optimistic Updates: UI shows intended state before confirmation
Buffering and Batching
For performance optimization:
- Event Buffering: Collect multiple events before processing
- Batch Updates: Update UI with multiple events simultaneously
- Debouncing: Reduce update frequency for high-frequency events
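As an illustration of buffering and batching, the following generic sketch (not code from the repository) collects events from a channel and flushes them either when the batch is full or when a timer fires; the StreamEvent interface is redeclared locally to keep the example self-contained:
package consumer

import "time"

// StreamEvent is redeclared locally so the example stands alone; it matches
// the interface shown earlier in this document.
type StreamEvent interface {
    IsStreamEvent() bool
}

// batchEvents collects events from in and forwards them in slices, flushing
// either when the batch reaches size or when the interval elapses.
func batchEvents(in <-chan StreamEvent, out chan<- []StreamEvent, size int, interval time.Duration) {
    defer close(out)
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    batch := make([]StreamEvent, 0, size)
    flush := func() {
        if len(batch) > 0 {
            out <- batch
            batch = make([]StreamEvent, 0, size)
        }
    }

    for {
        select {
        case ev, ok := <-in:
            if !ok {
                flush() // deliver whatever is left, then stop
                return
            }
            batch = append(batch, ev)
            if len(batch) >= size {
                flush()
            }
        case <-ticker.C:
            flush() // periodic flush keeps latency bounded
        }
    }
}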
Error Handling
Robust error handling in event processing:
- Graceful Degradation: Continue operation when non-critical events fail
- Event Recovery: Attempt to recover from event processing errors
- Fallback Modes: Alternative processing when event system fails
Event System Benefits
Observability
The event system provides comprehensive observability:
- Real-Time Monitoring: See what’s happening as it happens
- Historical Analysis: Review past interactions and decisions
- Performance Insights: Understand timing and bottlenecks
- Error Tracking: Identify and diagnose issues
User Experience
Enhanced user experience through transparency:
- Progress Indication: Users see incremental progress
- Decision Transparency: Understand AI reasoning process
- Interactive Feedback: Respond to tool executions in real-time
- Learning Opportunity: Understand how AI approaches problems
Development and Debugging
Valuable for development:
- Debugging Aid: Trace execution flow and identify issues
- Testing Support: Verify expected event sequences
- Performance Analysis: Identify optimization opportunities
- Integration Testing: Validate event handling across components
Integration Points
Chat Engines Integration
The actual integration in /host/openaiserver/chatengine/vertexai/gemini/ shows specific implementation patterns:
Tool Call Event Generation:
// When AI decides to use a tool
toolCallEvent := NewToolCallEvent(completionID, toolCallID, toolName, args)
eventChannel <- toolCallEvent
Tool Response Event Generation:
// After tool execution completes
toolResponseEvent := NewToolResponseEvent(completionID, toolCallID, toolName, response, err)
eventChannel <- toolResponseEvent
Stream Channel Management:
func (s *ChatSession) SendStreamingChatRequest(ctx context.Context, req chatengine.ChatCompletionRequest) (<-chan chatengine.StreamEvent, error) {
eventChannel := make(chan chatengine.StreamEvent, 100)
go func() {
defer close(eventChannel)
// Process model responses and emit events
for chunk := range modelStream {
// Emit tool events when detected
// Emit content events for streaming text
}
}()
return eventChannel, nil
}
MCP Tools Integration
Tools integrate by:
- Emitting execution start events
- Providing progress updates for long-running operations
- Returning detailed response events
- Generating error events with diagnostic information
User Interfaces
Interfaces integrate by:
- Subscribing to event streams
- Processing events in real-time
- Updating UI based on event content
- Providing user controls for event display
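For example, a non-browser consumer could subscribe to the stream and dispatch on the event_type field. The sketch below uses only the standard library and assumes the server address; tool events will only appear if the server runs with withAllEvents enabled:
package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
)

func main() {
    // Open a streaming chat completion request (server address is an assumption).
    body := `{"model":"gemini-2.0-flash","stream":true,` +
        `"messages":[{"role":"user","content":"Sleep for 3 seconds"}]}`
    resp, err := http.Post("http://localhost:8080/v1/chat/completions",
        "application/json", strings.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if !strings.HasPrefix(line, "data: ") {
            continue
        }
        data := strings.TrimPrefix(line, "data: ")
        if data == "[DONE]" {
            return
        }

        // Dispatch on event_type: tool events are present only when the
        // server runs with withAllEvents enabled.
        var probe struct {
            EventType string `json:"event_type"`
        }
        _ = json.Unmarshal([]byte(data), &probe)
        switch probe.EventType {
        case "tool_call":
            fmt.Println("tool call:", data)
        case "tool_response":
            fmt.Println("tool response:", data)
        default:
            fmt.Println("chunk:", data)
        }
    }
}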
Event System Implementations
The event system is a general architecture that can be implemented in various ways:
AgentFlow Web Interface
AgentFlow implements the event system through:
- Browser-based SSE consumption
- Real-time popup notifications for tool calls
- Progressive content updates
- Interactive event display controls
CLI Interfaces
Command-line interfaces can implement through:
- Terminal-based event display
- Progress indicators and status updates
- Structured logging of events
- Interactive prompts based on events
API Gateways
API gateways can implement through:
- Event forwarding to multiple consumers
- Event filtering and transformation
- Event persistence and replay
- Event-based routing and load balancing
Future Event System Enhancements
Advanced Event Types
- Reasoning Events: Capture AI’s internal reasoning process
- Planning Events: Show multi-step planning and strategy
- Context Events: Track context usage and management
- Performance Events: Detailed timing and resource usage
Event Intelligence
- Event Pattern Recognition: Identify common patterns and anomalies
- Predictive Events: Anticipate likely next events
- Event Summarization: Aggregate events into higher-level insights
- Event Recommendations: Suggest optimizations based on event patterns
Enhanced Delivery
- Event Persistence: Store and replay event histories
- Event Filtering: Selective event delivery based on preferences
- Event Routing: Direct events to multiple consumers
- Event Transformation: Adapt events for different consumer types
Conclusion
The event system architecture in gomcptest provides a foundational layer for transparency, observability, and real-time interaction in agentic systems. By understanding these concepts, developers can effectively implement event-driven interfaces, create monitoring systems, and build tools that provide deep visibility into AI agent behavior.
This event system is implementation-agnostic and serves as the foundation for specific implementations like AgentFlow, while also enabling other interfaces and monitoring systems to provide similar transparency and real-time feedback capabilities.
4 - Understanding the MCP Tools
Detailed explanation of the MCP tools architecture and implementation
This document explains the architecture and implementation of the MCP tools in gomcptest, how they work, and the design principles behind them.
MCP (Model Context Protocol) tools are standalone executables that provide specific functions that can be invoked by AI models. They allow the AI to interact with its environment - performing tasks like reading and writing files, executing commands, or searching for information.
In gomcptest, tools are implemented as independent Go executables that follow a standard protocol for receiving requests and returning results through standard input/output streams. Tool interactions generate events that are captured by the event system, enabling real-time monitoring and transparency.
Each tool in gomcptest follows a consistent architecture:
- Standard I/O Interface: Tools communicate via stdin/stdout using JSON-formatted requests and responses
- Parameter Validation: Tools validate their input parameters according to a JSON schema
- Stateless Execution: Each tool invocation is independent and does not maintain state
- Controlled Access: Tools implement appropriate security measures and permission checks
- Structured Results: Results are returned in a standardized JSON format
Common Components
Most tools share these common components:
- Main Function: Parses JSON input, validates parameters, executes the core function, formats and returns the result
- Parameter Structure: Defines the expected input parameters for the tool
- Result Structure: Defines the format of the tool’s output
- Error Handling: Standardized error reporting and handling
- Security Checks: Validation to prevent dangerous operations
Tool Categories
The tools in gomcptest can be categorized into several functional groups:
Filesystem Navigation
- LS: Lists files and directories, providing metadata and structure
- GlobTool: Finds files matching specific patterns, making it easier to locate relevant files
- GrepTool: Searches file contents using regular expressions, helping find specific information in codebases
Content Management
- View: Reads and displays file contents, allowing the model to analyze existing code or documentation
- Edit: Makes targeted modifications to files, enabling precise changes without overwriting the entire file
- Replace: Completely overwrites file contents, useful for generating new files or making major changes
System Interaction
- Bash: Executes shell commands, allowing the model to run commands, scripts, and programs
- dispatch_agent: A meta-tool that can create specialized sub-agents for specific tasks
AI/ML Services
- imagen: Generates and manipulates images using Google’s Imagen API, enabling visual content creation
Data Processing
- duckdbserver: Provides SQL-based data processing capabilities using DuckDB, enabling complex data analysis and transformations
Design Principles
The tools in gomcptest were designed with several key principles in mind:
1. Modularity
Each tool is a standalone executable that can be developed, tested, and deployed independently. This modular approach allows for:
- Independent development cycles
- Targeted testing
- Simpler debugging
- Ability to add or replace tools without affecting the entire system
2. Security
Security is a major consideration in the tool design:
- Tools validate inputs to prevent injection attacks
- File operations are limited to appropriate directories
- Bash command execution is restricted with banned commands
- Timeouts prevent infinite operations
- Process isolation prevents one tool from affecting others
3. Simplicity
The tools are designed to be simple to understand and use:
- Clear, focused functionality for each tool
- Straightforward parameter structures
- Consistent result formats
- Well-documented behaviors and limitations
4. Extensibility
The system is designed to be easily extended:
- New tools can be added by following the standard protocol
- Existing tools can be enhanced with additional parameters
- Alternative implementations can replace existing tools
Tool Communication Protocol
The communication protocol for tools follows this pattern:
Tools receive JSON input on stdin in this format:
{
"param1": "value1",
"param2": "value2",
"param3": 123
}
Tools return JSON output on stdout in one of these formats:
Success:
{
"result": "text result"
}
or
{
"results": [
{"field1": "value1", "field2": "value2"},
{"field1": "value3", "field2": "value4"}
]
}
Error:
{
"error": "Error message",
"code": "ERROR_CODE"
}
Implementation Examples
Most tools follow this basic structure:
package main
import (
"encoding/json"
"os"
)
// Parameters defines the expected input structure
type Parameters struct {
Param1 string `json:"param1"`
Param2 int `json:"param2,omitempty"`
}
// Result defines the output structure
type Result struct {
Result string `json:"result,omitempty"`
Error string `json:"error,omitempty"`
Code string `json:"code,omitempty"`
}
func main() {
// Parse input
var params Parameters
decoder := json.NewDecoder(os.Stdin)
if err := decoder.Decode(&params); err != nil {
outputError("Failed to parse input", "INVALID_INPUT")
return
}
// Validate parameters
if params.Param1 == "" {
outputError("param1 is required", "MISSING_PARAMETER")
return
}
// Execute core functionality
result, err := executeTool(params)
if err != nil {
outputError(err.Error(), "EXECUTION_ERROR")
return
}
// Return result
output := Result{Result: result}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(output)
}
func executeTool(params Parameters) (string, error) {
// Tool-specific logic here
return "result", nil
}
func outputError(message, code string) {
result := Result{
Error: message,
Code: code,
}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(result)
}
Advanced Concepts
Tool Composition
The dispatch_agent tool demonstrates how tools can be composed to create more powerful capabilities. It:
- Accepts a high-level task description
- Plans a sequence of tool operations to accomplish the task
- Executes these operations using the available tools
- Synthesizes the results into a coherent response
Error Propagation
The tool error mechanism is designed to provide useful information back to the model:
- Error messages are human-readable and descriptive
- Error codes allow programmatic handling of specific error types
- Stacktraces and debugging information are not exposed to maintain security
Performance Considerations
Tools are designed with performance in mind:
- File operations use efficient libraries and patterns
- Search operations employ indexing and filtering when appropriate
- Large results can be paginated or truncated to prevent context overflows
- Resource-intensive operations have configurable timeouts
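Such timeouts are typically enforced with a context deadline around the external process. The following sketch is illustrative; the tool path and request payload are placeholders:
package main

import (
    "bytes"
    "context"
    "fmt"
    "os/exec"
    "time"
)

// runToolWithTimeout runs a tool binary with a JSON request on stdin and
// kills the process if it exceeds the given timeout.
func runToolWithTimeout(toolPath string, request []byte, timeout time.Duration) ([]byte, error) {
    ctx, cancel := context.WithTimeout(context.Background(), timeout)
    defer cancel()

    cmd := exec.CommandContext(ctx, toolPath)
    cmd.Stdin = bytes.NewReader(request)

    var out bytes.Buffer
    cmd.Stdout = &out
    err := cmd.Run()
    if ctx.Err() == context.DeadlineExceeded {
        return nil, fmt.Errorf("tool %s timed out after %s", toolPath, timeout)
    }
    return out.Bytes(), err
}

func main() {
    // Placeholder tool path and request payload.
    out, err := runToolWithTimeout("./bin/Bash", []byte(`{"command":"sleep 30"}`), 5*time.Second)
    fmt.Println(string(out), err)
}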
Future Directions
The tool architecture in gomcptest could evolve in several ways:
- Streaming Results: Supporting incremental results for long-running operations
- Tool Discovery: More sophisticated mechanisms for models to discover available tools
- Tool Chaining: First-class support for composing multiple tools in sequences or pipelines
- Interactive Tools: Tools that can engage in multi-step interactions with the model
- Persistent State: Optional state maintenance for tools that benefit from context
Conclusion
The MCP tools in gomcptest provide a flexible, secure, and extensible foundation for enabling AI agents to interact with their environment. By understanding the architecture and design principles of these tools, developers can effectively utilize the existing tools, extend them with new capabilities, or create entirely new tools that integrate seamlessly with the system.
5 - AgentFlow: Event-Driven Interface Implementation
Implementation details of AgentFlow’s event-driven web interface, demonstrating how the general event system concepts are applied in practice through real-time tool interactions and streaming responses.
This document explains how AgentFlow implements the general event system architecture in a web-based interface, providing a concrete example of the event-driven patterns described in the foundational concepts. AgentFlow is the embedded web interface for gomcptest’s OpenAI-compatible server.
What is AgentFlow?
AgentFlow is a specific implementation of the gomcptest event system in the form of a modern web-based chat interface. It demonstrates how the general event-driven architecture can be applied to create transparent, real-time agentic interactions through a browser-based UI.
Core Architecture Overview
ChatEngine Interface Design
The foundation of AgentFlow's functionality rests on the ChatServer interface defined in chatengine/chat_server.go:
type ChatServer interface {
AddMCPTool(client.MCPClient) error
ModelList(context.Context) ListModelsResponse
ModelDetail(ctx context.Context, modelID string) *Model
ListTools(ctx context.Context) []ListToolResponse
HandleCompletionRequest(context.Context, ChatCompletionRequest) (ChatCompletionResponse, error)
SendStreamingChatRequest(context.Context, ChatCompletionRequest) (<-chan StreamEvent, error)
}
This interface abstracts the underlying LLM provider (currently Vertex AI Gemini) and provides a consistent API for tool integration and streaming responses. The key innovation is the SendStreamingChatRequest method, which returns a channel of StreamEvent interfaces, enabling real-time event streaming.
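A caller of this interface drains the returned channel and type-switches on the concrete event types. The fragment below is illustrative (package qualifiers and imports are omitted, and whether events arrive as values or pointers depends on the producer):
// Illustrative consumer of the channel returned by SendStreamingChatRequest.
func consumeStream(events <-chan StreamEvent) {
    for ev := range events {
        switch e := ev.(type) {
        case ToolCallEvent:
            fmt.Printf("tool call: %s %v\n", e.ToolCall.Name, e.ToolCall.Arguments)
        case ToolResponseEvent:
            fmt.Printf("tool response from %s\n", e.ToolResponse.Name)
        case ChatCompletionStreamResponse:
            // Standard streaming chunk: append its delta to the reply text.
        }
    }
}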
OpenAI v1 API Compatibility Strategy
A fundamental design decision was to maintain full compatibility with the OpenAI v1 API while extending it with enhanced functionality. This is achieved through:
- Standard Endpoint Preservation: Uses /v1/chat/completions, /v1/models, and /v1/tools endpoints
- Parameter Encoding: Tool selection is encoded within the existing model parameter using a pipe-delimited format
- Event Extension: Additional events are streamed alongside standard chat completion responses
- Backward Compatibility: Existing OpenAI-compatible clients work unchanged
This approach avoids the need to modify standard API endpoints while providing enhanced capabilities through the AgentFlow interface.
Event System Architecture
StreamEvent Interface
The event system is built around the StreamEvent interface in chatengine/stream_event.go:
type StreamEvent interface {
IsStreamEvent() bool
}
This simple interface allows for polymorphic event handling, where different event types can be processed through the same streaming pipeline.
Event Types and Structure
Tool Call Events
Defined in chatengine/vertexai/gemini/tool_events.go, tool call events capture when the AI decides to use a tool:
type ToolCallEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolCall ToolCallDetails `json:"tool_call"`
}
type ToolCallDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Arguments map[string]interface{} `json:"arguments"`
}
Tool Response Events
Tool response events capture the results of tool execution:
type ToolResponseEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolResponse ToolResponseDetails `json:"tool_response"`
}
type ToolResponseDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Response interface{} `json:"response"`
Error string `json:"error,omitempty"`
}
Server-Sent Events Implementation
The streaming implementation in chatengine/chat_completion_stream.go provides the SSE infrastructure:
func (o *OpenAIV1WithToolHandler) streamResponse(w http.ResponseWriter, r *http.Request, req ChatCompletionRequest) {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("Transfer-Encoding", "chunked")
// Process events from the stream channel
for event := range stream {
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Handle standard chat completion chunks
default:
// Handle tool events if withAllEvents flag is true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
}
}
The withAllEvents flag controls whether tool events are included in the stream, allowing for backward compatibility with standard OpenAI clients.
Pipe-Delimited Encoding
The tool selection mechanism is implemented through a clever encoding scheme in the model parameter. The ParseModelAndTools method in chatengine/chat_structure.go parses this format:
func (req *ChatCompletionRequest) ParseModelAndTools() (string, []string) {
parts := strings.Split(req.Model, "|")
if len(parts) <= 1 {
return req.Model, nil
}
modelName := strings.TrimSpace(parts[0])
toolNames := make([]string, 0, len(parts)-1)
for i := 1; i < len(parts); i++ {
toolName := strings.TrimSpace(parts[i])
if toolName != "" {
toolNames = append(toolNames, toolName)
}
}
return modelName, toolNames
}
This allows formats like:
- gemini-2.0-flash (no tool filtering)
- gemini-2.0-flash|Edit|View|Bash (specific tools only)
- gemini-1.5-pro|VertexAI Code Execution (model with built-in tools)
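For instance, given the parsing logic above:
req := ChatCompletionRequest{Model: "gemini-2.0-flash|Edit|View|Bash"}
model, tools := req.ParseModelAndTools()
// model == "gemini-2.0-flash"
// tools == []string{"Edit", "View", "Bash"}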
Tool Filtering
The Vertex AI Gemini implementation includes sophisticated tool filtering in chatengine/vertexai/gemini/chatsession.go:
func (chatsession *ChatSession) FilterTools(requestedToolNames []string) []*genai.Tool {
if len(requestedToolNames) == 0 {
return chatsession.tools // Return all tools if none specified
}
// Build a lookup set of the requested tool names used below
requestedMap := make(map[string]bool, len(requestedToolNames))
for _, name := range requestedToolNames {
requestedMap[name] = true
}
var filteredTools []*genai.Tool
var filteredFunctions []*genai.FunctionDeclaration
for _, tool := range chatsession.tools {
// Handle Vertex AI built-in tools separately
switch {
case tool.CodeExecution != nil && requestedMap[VERTEXAI_CODE_EXECUTION]:
filteredTools = append(filteredTools, &genai.Tool{CodeExecution: tool.CodeExecution})
case tool.GoogleSearch != nil && requestedMap[VERTEXAI_GOOGLE_SEARCH]:
filteredTools = append(filteredTools, &genai.Tool{GoogleSearch: tool.GoogleSearch})
// ... handle other built-in tools
default:
// Handle MCP function declarations
for _, function := range tool.FunctionDeclarations {
if requestedMap[function.Name] {
filteredFunctions = append(filteredFunctions, function)
}
}
}
}
// Combine function declarations into a single tool
if len(filteredFunctions) > 0 {
filteredTools = append(filteredTools, &genai.Tool{
FunctionDeclarations: filteredFunctions,
})
}
return filteredTools
}
This implementation handles both Vertex AI built-in tools (CodeExecution, GoogleSearch, etc.) and MCP function declarations, ensuring they are properly separated to avoid proto validation errors.
Frontend Event Processing
Real-Time Event Handling
The JavaScript implementation in chat-ui.html.tmpl provides comprehensive event processing through the handleStreamingResponse method:
async handleStreamingResponse(response) {
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { value, done } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
// Handle different event types
if (parsed.event_type === 'tool_call') {
this.addToolNotification(parsed.tool_call.name, parsed);
this.showToolCallPopup(parsed);
} else if (parsed.event_type === 'tool_response') {
this.updateToolResponsePopup(parsed);
this.storeToolResponse(parsed);
} else if (parsed.choices && parsed.choices[0]) {
// Handle standard chat completion chunks
this.updateMessageContent(messageIndex, assistantMessage, true);
}
} catch (e) {
// Handle JSON parse errors gracefully
}
}
}
}
}
Tool Popup Management
AgentFlow implements a sophisticated popup management system to provide real-time feedback on tool execution:
showToolCallPopup(event) {
const popupId = event.tool_call.id;
// Create popup with loading state
const popup = document.createElement('div');
popup.className = 'tool-popup tool-call';
popup.innerHTML = `
<div class="tool-popup-header">
<div class="tool-popup-title">Tool Executing: ${event.tool_call.name}</div>
<button class="tool-popup-close" onclick="chatUI.closeToolPopup('${popupId}')">×</button>
</div>
<div class="tool-popup-content">
<div class="tool-popup-args">${JSON.stringify(event.tool_call.arguments, null, 2)}</div>
<div class="tool-popup-spinner"></div>
</div>
`;
// Store reference and set auto-close timer
this.toolPopups.set(popupId, popup);
this.popupAutoCloseTimers.set(popupId, setTimeout(() => {
this.closeToolPopup(popupId);
}, 30000));
}
updateToolResponsePopup(event) {
const popup = this.toolPopups.get(event.tool_response.id);
if (!popup) return;
// Update popup with response data
popup.className = `tool-popup ${event.tool_response.error ? 'tool-error' : 'tool-response'}`;
// Update content with response...
// Auto-close after showing result
setTimeout(() => {
this.closeToolPopup(event.tool_response.id);
}, 5500);
}
The frontend buildModelWithTools() function implements the pipe-delimited encoding:
buildModelWithTools() {
let modelString = this.selectedModel;
if (this.selectedTools.size > 0 && this.selectedTools.size < this.tools.length) {
// Only add tools if not all are selected (all selected means use all tools)
const toolNames = Array.from(this.selectedTools);
modelString += '|' + toolNames.join('|');
}
return modelString;
}
This ensures tool selection is properly encoded in the API request while maintaining OpenAI compatibility.
Technical Design Benefits
Event-Driven Transparency
The event system provides unprecedented visibility into AI decision-making:
- Real-Time Feedback: Users see tool calls as they happen
- Detailed Information: Full argument and response data available
- Error Visibility: Tool failures are clearly communicated
- Learning Opportunity: Users understand how AI approaches problems
Scalable Architecture
The channel-based streaming architecture scales well:
- Non-Blocking: Event processing doesn’t block the main request thread
- Backpressure Handling: Go channels provide natural backpressure
- Resource Management: Proper cleanup prevents memory leaks
- Error Isolation: Tool failures don’t crash the entire system
OpenAI Compatibility
The design maintains full OpenAI v1 API compatibility:
- Standard Endpoints: No custom API modifications required
- Parameter Encoding: Tool selection uses existing model parameter
- Event Extensions: Additional events don’t interfere with standard responses
- Client Compatibility: Existing OpenAI clients work unchanged
Integration Points
MCP Protocol Integration
AgentFlow seamlessly integrates with the Model Context Protocol:
- Tool Discovery: Automatic detection of MCP server capabilities
- Dynamic Loading: Tools can be added/removed without restart
- Protocol Abstraction: MCP details are hidden from the UI
- Error Handling: MCP errors are gracefully handled and displayed
Vertex AI Integration
The Vertex AI backend provides:
- Built-in Tools: Code execution, Google Search, etc.
- Model Selection: Multiple Gemini model variants
- Streaming Support: Native streaming for real-time responses
- Tool Mixing: Combines MCP tools with Vertex AI capabilities
This comprehensive architecture enables AgentFlow to provide an intuitive, powerful interface for agentic interactions while maintaining compatibility with existing OpenAI tooling and providing deep visibility into the AI’s decision-making process.