Explanation
Understanding-oriented content for gomcptest architecture and concepts
Explanation documents discuss and clarify concepts to broaden the reader’s understanding of topics. They provide context and illuminate ideas.
This section provides deeper background on how gomcptest works, its architecture, and the concepts behind it. The explanations are organized from foundational concepts to specific implementations:
- Architecture - Overall system design and component relationships
- MCP Protocol - Core protocol concepts and communication patterns
- Event System - Real-time event architecture for transparency and monitoring
- MCP Tools - Tool implementation details and design patterns
- AgentFlow Implementation - Specific web interface implementation of the event system
1 - gomcptest Architecture
Deep dive into the system architecture and design decisions
This document explains the architecture of gomcptest, the design decisions behind it, and how the various components interact to create a custom Model Context Protocol (MCP) host.
The Big Picture
The gomcptest project implements a custom host for the Model Context Protocol (MCP). It's designed to enable testing and experimentation with agentic systems without requiring direct integration with commercial LLM platforms.
The system is built with these key principles in mind:
- Modularity: Components are designed to be interchangeable
- Compatibility: The API mimics the OpenAI API for easy integration
- Extensibility: New tools can be easily added to the system
- Testing: The architecture facilitates testing of agentic applications
Core Components
Host (OpenAI Server)
The host is the central component, located in /host/openaiserver. It presents an OpenAI-compatible API interface and connects to Google's Vertex AI for model inference. This compatibility layer makes it easy to integrate with existing tools and libraries designed for OpenAI.
The host has several key responsibilities:
- API Compatibility: Implementing the OpenAI chat completions API
- Session Management: Maintaining chat history and context
- Model Integration: Connecting to Vertex AI’s Gemini models
- Function Calling: Orchestrating function/tool calls based on model outputs
- Response Streaming: Supporting streaming responses to the client
- Artifact Storage: Managing file uploads and downloads through RESTful endpoints
Unlike commercial implementations, this host is designed for local development and testing, emphasizing flexibility and observability over production-ready features like authentication or rate limiting.
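Because the host speaks the standard chat completions API, any OpenAI-style client can exercise it. The following minimal Go sketch uses only the standard library; the listen address is an assumption and should be adjusted to your configuration:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Build a standard OpenAI-style chat completion request.
    body, _ := json.Marshal(map[string]any{
        "model": "gemini-2.0-flash",
        "messages": []map[string]string{
            {"role": "user", "content": "List the files in the current directory"},
        },
    })

    // The listen address is an assumption; adjust it to your configuration.
    resp, err := http.Post("http://localhost:8080/v1/chat/completions",
        "application/json", bytes.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Print the raw JSON response for inspection.
    out, _ := io.ReadAll(resp.Body)
    fmt.Println(string(out))
}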
MCP Tools
The tools are standalone executables that implement the Model Context Protocol. Each tool is designed to perform a specific function, such as executing shell commands or manipulating files.
Tools follow a consistent pattern:
- They communicate via standard I/O using the MCP JSON-RPC protocol
- They expose a specific set of parameters
- They handle their own error conditions
- They return results in a standardized format
This approach allows tools to be:
- Developed independently
- Tested in isolation
- Used in different host environments
- Chained together in complex workflows
Artifact Storage
The artifact storage system provides a RESTful API for managing generic file uploads and downloads. This component is integrated directly into the OpenAI server host and offers several key features:
Core Capabilities:
- Universal File Support: Accepts any file type (text, binary, images, documents)
- UUID-based Identification: Each uploaded file receives a unique UUID identifier
- Metadata Management: Automatically tracks original filename, content type, size, and upload timestamp
- Configurable Storage: Supports custom storage directories and file size limits
Storage Architecture:
- Files are stored using UUID-based naming to prevent conflicts and ensure uniqueness
- Metadata is stored in companion .meta.json files for efficient retrieval
- The storage directory structure is flat but organized with clear separation between data and metadata
Integration Points:
- Exposed through RESTful endpoints (POST /artifact/ and GET /artifact/{id})
- Supports CORS for web-based integrations
- Uses the same middleware stack as the main API (CORS, logging, error handling)
This system enables AI agents and users to persistently store and share files across conversations and sessions, making it particularly useful for workflows involving document processing, image analysis, or data manipulation.
CLI
The CLI provides a user interface similar to tools like “Claude Code” or “OpenAI ChatGPT”. It connects to the OpenAI-compatible server and provides a way to interact with the LLM and tools through a conversational interface.
Data Flow
Chat Conversation Flow
- The user sends a request to the CLI
- The CLI forwards this request to the OpenAI-compatible server
- The server sends the request to Vertex AI’s Gemini model
- The model may identify function calls in its response
- The server executes these function calls by invoking the appropriate MCP tools
- The results are provided back to the model to continue its response
- The final response is streamed back to the CLI and presented to the user
Artifact Storage Flow
Upload Flow:
- Client sends a POST request to /artifact/ with file data and headers
- Server validates required headers (Content-Type, X-Original-Filename)
- Server generates a UUID for the artifact
- File data is streamed to disk using the UUID as filename
- Metadata is saved in a companion .meta.json file
- Server responds with the artifact ID
Retrieval Flow:
- Client sends a GET request to /artifact/{id}
- Server validates the UUID format
- Server reads the metadata file to get original file information
- Server streams the file back with appropriate headers (Content-Type, Content-Disposition, etc.)
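To make these flows concrete, here is an illustrative Go client for both operations. The base URL is an assumption, and the exact shape of the upload response (plain text or JSON) may differ from this sketch; the headers and routes are the ones described above:
package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
    "os"
)

func main() {
    base := "http://localhost:8080" // assumed server address

    // Upload: POST /artifact/ with the required headers.
    data, err := os.ReadFile("report.pdf")
    if err != nil {
        panic(err)
    }
    req, _ := http.NewRequest("POST", base+"/artifact/", bytes.NewReader(data))
    req.Header.Set("Content-Type", "application/pdf")
    req.Header.Set("X-Original-Filename", "report.pdf")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    // The response carries the artifact ID; its exact encoding (plain text
    // or JSON) depends on the server, so it is simply printed here.
    reply, _ := io.ReadAll(resp.Body)
    resp.Body.Close()
    fmt.Println("upload response:", string(reply))

    // Retrieval: GET /artifact/{id} streams the file back with its original
    // Content-Type and Content-Disposition headers.
    artifactID := "00000000-0000-0000-0000-000000000000" // placeholder UUID
    download, err := http.Get(base + "/artifact/" + artifactID)
    if err != nil {
        panic(err)
    }
    defer download.Body.Close()
    fmt.Println("content type:", download.Header.Get("Content-Type"))
}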
This dual-flow architecture allows the system to handle both conversational AI interactions and persistent file storage independently, enabling richer workflows that combine real-time AI processing with persistent data management.
Design Decisions Explained
Why OpenAI API Compatibility?
The OpenAI API has become a de facto standard in the LLM space. By implementing this interface, gomcptest can work with a wide variety of existing tools, libraries, and frontends with minimal adaptation.
Why Google Vertex AI?
Vertex AI provides access to Google’s Gemini models, which have strong function calling capabilities. The implementation could be extended to support other model providers as needed.
Why Standalone Tool Executables?
By implementing tools as standalone executables rather than library functions, we gain several advantages:
- Security through isolation
- Language agnosticism (tools can be written in any language)
- Ability to distribute tools separately from the host
- Easier testing and development
Why MCP?
The Model Context Protocol provides a standardized way for LLMs to interact with external tools. By adopting this protocol, gomcptest ensures compatibility with tools developed for other MCP-compatible hosts.
Why Built-in Artifact Storage?
The artifact storage system is integrated directly into the host rather than implemented as a separate MCP tool for several strategic reasons:
Performance and Simplicity:
- Direct HTTP endpoints avoid the overhead of MCP protocol wrapping for file operations
- Streaming file uploads and downloads are more efficient without JSON-RPC encapsulation
- Reduces complexity for web-based clients that need direct file access
Integration Benefits:
- Shares the same middleware stack (CORS, logging, error handling) as the main API
- Uses consistent configuration patterns with other host components
- Simplifies deployment by reducing the number of separate services
API Design:
- RESTful endpoints align with standard web practices for file operations
- HTTP semantics (Content-Type, Content-Disposition) map naturally to file storage needs
- Range request support for large files comes naturally with http.ServeFile
This approach provides a clean separation between the conversational AI capabilities (handled via MCP tools) and persistent storage capabilities (handled via integrated HTTP endpoints).
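To illustrate the http.ServeFile point above, here is a rough handler sketch. The metadata field names, storage directory, and error handling are illustrative, not the actual gomcptest implementation:
package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "path/filepath"
    "strings"
)

// Metadata mirrors the companion .meta.json file; field names here are
// illustrative, not necessarily the exact ones used by gomcptest.
type Metadata struct {
    OriginalFilename string `json:"original_filename"`
    ContentType      string `json:"content_type"`
}

var storageDir = "./artifacts" // assumed storage directory

func handleArtifactDownload(w http.ResponseWriter, r *http.Request) {
    id := strings.TrimPrefix(r.URL.Path, "/artifact/")

    // Load the companion metadata written at upload time.
    raw, err := os.ReadFile(filepath.Join(storageDir, id+".meta.json"))
    if err != nil {
        http.Error(w, "artifact not found", http.StatusNotFound)
        return
    }
    var meta Metadata
    if err := json.Unmarshal(raw, &meta); err != nil {
        http.Error(w, "corrupt metadata", http.StatusInternalServerError)
        return
    }

    w.Header().Set("Content-Type", meta.ContentType)
    w.Header().Set("Content-Disposition",
        fmt.Sprintf("attachment; filename=%q", meta.OriginalFilename))

    // http.ServeFile brings Range request and conditional GET support for free.
    http.ServeFile(w, r, filepath.Join(storageDir, id))
}

func main() {
    http.HandleFunc("/artifact/", handleArtifactDownload)
    http.ListenAndServe(":8080", nil)
}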
Limitations and Future Directions
The current implementation has several limitations:
- Single chat session per instance
- Limited support for authentication and authorization
- No persistence of chat history between restarts
- No built-in support for rate limiting or quotas
Future enhancements could include:
- Support for multiple chat sessions
- Integration with additional model providers
- Enhanced security features
- Improved error handling and logging
- Performance optimizations for large-scale deployments
Conclusion
The gomcptest architecture represents a flexible and extensible approach to building custom MCP hosts. It prioritizes simplicity, modularity, and developer experience, making it an excellent platform for experimentation with agentic systems.
By understanding this architecture, developers can effectively utilize the system, extend it with new tools, and potentially adapt it for their specific needs.
2 - Understanding the Model Context Protocol (MCP)
Exploration of what MCP is, how it works, and design decisions behind it
This document explores the Model Context Protocol (MCP), how it works, the design decisions behind it, and how it compares to alternative approaches for LLM tool integration.
What is the Model Context Protocol?
The Model Context Protocol (MCP) is a standardized communication protocol that enables Large Language Models (LLMs) to interact with external tools and capabilities. It defines a structured way for models to request information or take actions in the real world, and for tools to provide responses back to the model.
MCP is designed to solve the problem of extending LLMs beyond their training data by giving them access to:
- Current information (e.g., via web search)
- Computational capabilities (e.g., calculators, code execution)
- External systems (e.g., databases, APIs)
- User environment (e.g., file system, terminal)
How MCP Works
At its core, MCP is a protocol based on JSON-RPC that enables bidirectional communication between LLMs and tools. The basic workflow is:
- The LLM generates a call to a tool with specific parameters
- The host intercepts this call and routes it to the appropriate tool
- The tool executes the requested action and returns the result
- The result is injected into the model’s context
- The model continues generating a response incorporating the new information
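A simplified sketch of this loop, with hypothetical types and helper functions standing in for the host's real internals, looks roughly like this:
package main

import (
    "context"
    "fmt"
)

// Hypothetical types standing in for the host's real message structures.
type FunctionCall struct {
    Name string
    Args map[string]any
}

type Reply struct {
    Text         string
    FunctionCall *FunctionCall // nil when the model produced a final answer
}

type Message struct {
    Role    string
    Content string
}

// Placeholders for the host's model client, MCP tool dispatcher and history
// management; each would be substantially more involved in practice.
func generate(ctx context.Context, history []Message) (Reply, error) { return Reply{Text: "done"}, nil }

func runTool(ctx context.Context, name string, args map[string]any) (string, error) {
    return "tool output", nil
}

func appendToolResult(h []Message, call *FunctionCall, result string, err error) []Message {
    return append(h, Message{Role: "tool", Content: result})
}

// runConversation shows the shape of the loop: ask the model, detect a
// function call, execute the tool, feed the result back, and repeat until
// the model answers without requesting a tool.
func runConversation(ctx context.Context, history []Message) (string, error) {
    for {
        reply, err := generate(ctx, history)
        if err != nil {
            return "", err
        }
        if reply.FunctionCall == nil {
            return reply.Text, nil // final answer
        }
        result, err := runTool(ctx, reply.FunctionCall.Name, reply.FunctionCall.Args)
        history = appendToolResult(history, reply.FunctionCall, result, err)
    }
}

func main() {
    answer, _ := runConversation(context.Background(), nil)
    fmt.Println(answer)
}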
The protocol specifies:
- How tools declare their capabilities and parameters
- How the model requests tool actions
- How tools return results or errors
- How multiple tools can be combined
MCP in gomcptest
In gomcptest, MCP is implemented using a set of independent executables that communicate over standard I/O. This approach has several advantages:
- Language-agnostic: Tools can be written in any programming language
- Process isolation: Each tool runs in its own process for security and stability
- Compatibility: The protocol works with various LLM providers
- Extensibility: New tools can be easily added to the system
Each tool in gomcptest follows a consistent pattern:
- It receives a JSON request on stdin
- It parses the parameters and performs its action
- It formats the result as JSON and returns it on stdout
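From the host's perspective, invoking such a tool amounts to starting the executable, writing the JSON request to its stdin, and decoding the JSON reply from its stdout. A minimal illustrative sketch, where the tool path and parameter names are placeholders:
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "os/exec"
)

func main() {
    // Parameters for the tool, encoded as JSON (parameter names are illustrative).
    request, _ := json.Marshal(map[string]any{"path": "."})

    // Start the tool executable and wire up its standard streams.
    cmd := exec.Command("./bin/LS") // placeholder path to a tool binary
    cmd.Stdin = bytes.NewReader(request)

    var out bytes.Buffer
    cmd.Stdout = &out
    if err := cmd.Run(); err != nil {
        panic(err)
    }

    // The tool replies with a single JSON document on stdout.
    var result map[string]any
    if err := json.Unmarshal(out.Bytes(), &result); err != nil {
        panic(err)
    }
    fmt.Println(result)
}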
The Protocol Specification
The core MCP protocol in gomcptest follows this format:
Tool Registration
Tools register themselves with a schema that defines their capabilities:
{
"name": "ToolName",
"description": "Description of what the tool does",
"parameters": {
"type": "object",
"properties": {
"param1": {
"type": "string",
"description": "Description of parameter 1"
},
"param2": {
"type": "number",
"description": "Description of parameter 2"
}
},
"required": ["param1"]
}
}
Function Call Request
When a model wants to use a tool, it generates a function call like:
{
"name": "ToolName",
"params": {
"param1": "value1",
"param2": 42
}
}
Function Call Response
The tool executes the requested action and returns:
{
"result": "Output of the tool's execution"
}
Or, in case of an error:
{
"error": {
"message": "Error message",
"code": "ERROR_CODE"
}
}
Design Decisions in MCP
Several key design decisions shape the MCP implementation in gomcptest:
Standard I/O Communication
By using stdin/stdout for communication, tools can be written in any language that can read from stdin and write to stdout. This makes it easy to integrate existing utilities and libraries.
JSON Schema Tool Definitions
Using JSON Schema for tool definitions provides a clear contract between the model and the tools. It enables:
- Validation of parameters
- Documentation of capabilities
- Potential for automatic code generation
Stateless Design
Tools are designed to be stateless, with each invocation being independent. This simplifies the protocol and makes tools easier to reason about and test.
Pass-through Authentication
The protocol doesn’t handle authentication directly; instead, it relies on the host to manage permissions and authentication. This separation of concerns keeps the protocol simple.
Comparison with Alternatives
vs. OpenAI Function Calling
MCP is similar to OpenAI’s function calling feature but with these key differences:
- MCP is designed to be provider-agnostic
- MCP tools run as separate processes
- MCP provides more detailed error handling
vs. LangChain
Compared to LangChain:
- MCP is a lower-level protocol rather than a framework
- MCP focuses on interoperability rather than abstraction
- MCP allows for stronger process isolation
vs. Agent Protocols
Other agent protocols often focus on higher-level concepts like goals and planning, while MCP focuses specifically on the mechanics of tool invocation.
Future Directions
The MCP protocol in gomcptest could evolve in several ways:
- Enhanced security: More granular permissions and sandboxing
- Streaming responses: Support for tools that produce incremental results
- Bidirectional communication: Supporting tools that can request clarification
- Tool composition: First-class support for chaining tools together
- State management: Optional session state for tools that need to maintain context
Conclusion
The Model Context Protocol as implemented in gomcptest represents a pragmatic approach to extending LLM capabilities through external tools. Its simplicity, extensibility, and focus on interoperability make it a solid foundation for building and experimenting with agentic systems.
By understanding the protocol, developers can create new tools that seamlessly integrate with the system, unlocking new capabilities for LLM applications.
3 - Event System Architecture
Understanding the event-driven architecture that enables real-time tool interaction monitoring and streaming responses in gomcptest.
This document explains the foundational event system architecture in gomcptest that enables real-time monitoring of tool interactions, streaming responses, and transparent agentic workflows. This system is implemented across different components and interfaces, with AgentFlow being one specific implementation.
What is the Event System?
The event system in gomcptest provides real-time visibility into AI-tool interactions through a streaming event architecture. It captures and streams events that occur during tool execution, enabling transparency in how AI agents make decisions and use tools.
Important Implementation Detail: By default, the OpenAI-compatible server only streams standard chat completion responses to maintain API compatibility. Tool events (tool calls and responses) are only included in the stream when the withAllEvents flag is enabled in the server configuration. This design allows for:
- Standard Mode: OpenAI API compatibility with only chat completion chunks
- Enhanced Mode: Full event visibility including tool interactions when withAllEvents is true
Core Event Concepts
Event-Driven Transparency
Traditional AI interactions are often “black boxes” where users see only the final result. The gomcptest event system provides transparency by exposing:
- Tool Call Events: When the AI decides to use a tool, what tool it chooses, and what parameters it passes
- Tool Response Events: The results returned by tools, including success responses and error conditions
- Processing Events: Internal state changes and decision points during request processing
- Stream Events: Real-time updates as responses are generated
Event Types
The system defines several core event types:
Based on the actual implementation in /host/openaiserver/chatengine/vertexai/gemini/tool_events.go:
Chat Completion Events
- ChatCompletionStreamResponse: Standard OpenAI-compatible streaming chunks
- Always included in streams (default behavior)
- Contains incremental content as it’s generated
Event Availability
- Default Streaming: Only ChatCompletionStreamResponse events are sent
- Enhanced Streaming: When withAllEvents = true, all tool events are included as well
Event Architecture Patterns
Producer-Consumer Model
The event system follows a producer-consumer pattern:
- Event Producers: Components that generate events (chat engines, tool executors, stream processors)
- Event Channels: Transport mechanisms for event delivery (Go channels, HTTP streams)
- Event Consumers: Components that process and present events (web interfaces, logging systems, monitors)
Channel-Based Streaming
Events are delivered through channel-based streaming:
type StreamEvent interface {
IsStreamEvent() bool
}
// Event channel returned by streaming operations
func SendStreamingRequest() (<-chan StreamEvent, error) {
eventChan := make(chan StreamEvent, 100)
// Events are sent to the channel as they occur
go func() {
defer close(eventChan)
// Generate and send events
eventChan <- &ToolCallEvent{...}
eventChan <- &ToolResponseEvent{...}
eventChan <- &ContentEvent{...}
}()
return eventChan, nil
}
Each event carries standardized metadata:
- Timestamp: When the event occurred
- Event ID: Unique identifier for tracking
- Event Type: Category and specific type
- Context: Related session, request, or operation context
- Payload: Event-specific data
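A hypothetical Go envelope expressing this metadata might look as follows; the concrete event types in the codebase carry equivalent fields but are not structured exactly like this:
// Illustrative envelope only; the real event types (ToolCallEvent,
// ToolResponseEvent, ...) carry the same kind of information.
type EventEnvelope struct {
    ID        string         `json:"id"`         // unique identifier for tracking
    EventType string         `json:"event_type"` // category, e.g. "tool_call"
    Created   int64          `json:"created"`    // Unix timestamp of the event
    SessionID string         `json:"session_id"` // correlates events within a chat session
    RequestID string         `json:"request_id"` // correlates events with an API request
    Payload   map[string]any `json:"payload"`    // event-specific data
}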
Event Flow Patterns
Request-Response with Events
Traditional request-response patterns are enhanced with event streaming:
- Request Initiated: System generates start events
- Processing Events: Intermediate steps generate progress events
- Tool Interactions: Tool calls and responses generate events
- Content Generation: Streaming content generates incremental events
- Completion: Final response and end events
Event Correlation
Events are correlated through:
- Session IDs: Grouping events within a single chat session
- Request IDs: Linking events to specific API requests
- Tool Call IDs: Connecting tool call and response events
- Parent-Child Relationships: Hierarchical event relationships
Implementation Patterns
Server-Sent Events (SSE) Implementation
The actual implementation in /host/openaiserver/chatengine/chat_completion_stream.go delivers events via Server-Sent Events with specific formatting:
HTTP Headers Set:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Transfer-Encoding: chunked
Event Format:
data: {"event_type":"tool_call","id":"chatcmpl-abc123","object":"tool.call","created":1704067200,"tool_call":{"id":"call_xyz789","name":"sleep","arguments":{"seconds":3}}}
data: {"event_type":"tool_response","id":"chatcmpl-abc123","object":"tool.response","created":1704067201,"tool_response":{"id":"call_xyz789","name":"sleep","response":"Slept for 3 seconds"}}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067202,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"content":"I have completed the 3-second pause."},"finish_reason":"stop"}]}
data: [DONE]
Event Filtering Logic:
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Always sent - OpenAI compatible
jsonBytes, _ := json.Marshal(res)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
default:
// Tool events only sent when withAllEvents = true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
JSON-RPC Event Extensions
For programmatic interfaces, events extend the JSON-RPC protocol:
{
"jsonrpc": "2.0",
"method": "event",
"params": {
"event_type": "tool_call",
"event_data": {
"id": "call_123",
"name": "Edit",
"arguments": {...}
}
}
}
Event Processing Strategies
Real-Time Processing
Events are processed as they occur:
- Immediate Display: Critical events are shown immediately
- Progressive Enhancement: UI updates incrementally as events arrive
- Optimistic Updates: UI shows intended state before confirmation
Buffering and Batching
For performance optimization:
- Event Buffering: Collect multiple events before processing
- Batch Updates: Update UI with multiple events simultaneously
- Debouncing: Reduce update frequency for high-frequency events
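As an illustration of buffering and batching, the following generic sketch (not code from the repository) collects events from a channel and flushes them either when the batch is full or when a timer fires; the StreamEvent interface is redeclared locally to keep the example self-contained:
package consumer

import "time"

// StreamEvent is redeclared locally so the example stands alone; it matches
// the interface shown earlier in this document.
type StreamEvent interface {
    IsStreamEvent() bool
}

// batchEvents collects events from in and forwards them in slices, flushing
// either when the batch reaches size or when the interval elapses.
func batchEvents(in <-chan StreamEvent, out chan<- []StreamEvent, size int, interval time.Duration) {
    defer close(out)
    ticker := time.NewTicker(interval)
    defer ticker.Stop()

    batch := make([]StreamEvent, 0, size)
    flush := func() {
        if len(batch) > 0 {
            out <- batch
            batch = make([]StreamEvent, 0, size)
        }
    }

    for {
        select {
        case ev, ok := <-in:
            if !ok {
                flush() // deliver whatever is left, then stop
                return
            }
            batch = append(batch, ev)
            if len(batch) >= size {
                flush()
            }
        case <-ticker.C:
            flush() // periodic flush keeps latency bounded
        }
    }
}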
Error Handling
Robust error handling in event processing:
- Graceful Degradation: Continue operation when non-critical events fail
- Event Recovery: Attempt to recover from event processing errors
- Fallback Modes: Alternative processing when event system fails
Event System Benefits
Observability
The event system provides comprehensive observability:
- Real-Time Monitoring: See what’s happening as it happens
- Historical Analysis: Review past interactions and decisions
- Performance Insights: Understand timing and bottlenecks
- Error Tracking: Identify and diagnose issues
User Experience
Enhanced user experience through transparency:
- Progress Indication: Users see incremental progress
- Decision Transparency: Understand AI reasoning process
- Interactive Feedback: Respond to tool executions in real-time
- Learning Opportunity: Understand how AI approaches problems
Development and Debugging
Valuable for development:
- Debugging Aid: Trace execution flow and identify issues
- Testing Support: Verify expected event sequences
- Performance Analysis: Identify optimization opportunities
- Integration Testing: Validate event handling across components
Integration Points
Chat Engines Integration
The actual integration in /host/openaiserver/chatengine/vertexai/gemini/ shows specific implementation patterns:
Tool Call Event Generation:
// When AI decides to use a tool
toolCallEvent := NewToolCallEvent(completionID, toolCallID, toolName, args)
eventChannel <- toolCallEvent
Tool Response Event Generation:
// After tool execution completes
toolResponseEvent := NewToolResponseEvent(completionID, toolCallID, toolName, response, err)
eventChannel <- toolResponseEvent
Stream Channel Management:
func (s *ChatSession) SendStreamingChatRequest(ctx context.Context, req chatengine.ChatCompletionRequest) (<-chan chatengine.StreamEvent, error) {
eventChannel := make(chan chatengine.StreamEvent, 100)
go func() {
defer close(eventChannel)
// Process model responses and emit events
for chunk := range modelStream {
// Emit tool events when detected
// Emit content events for streaming text
}
}()
return eventChannel, nil
}
MCP Tools Integration
Tools integrate by:
- Emitting execution start events
- Providing progress updates for long-running operations
- Returning detailed response events
- Generating error events with diagnostic information
User Interfaces
Interfaces integrate by:
- Subscribing to event streams
- Processing events in real-time
- Updating UI based on event content
- Providing user controls for event display
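For example, a non-browser consumer could subscribe to the stream and dispatch on the event_type field. The sketch below uses only the standard library and assumes the server address; tool events will only appear if the server runs with withAllEvents enabled:
package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
)

func main() {
    // Open a streaming chat completion request (server address is an assumption).
    body := `{"model":"gemini-2.0-flash","stream":true,` +
        `"messages":[{"role":"user","content":"Sleep for 3 seconds"}]}`
    resp, err := http.Post("http://localhost:8080/v1/chat/completions",
        "application/json", strings.NewReader(body))
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if !strings.HasPrefix(line, "data: ") {
            continue
        }
        data := strings.TrimPrefix(line, "data: ")
        if data == "[DONE]" {
            return
        }

        // Dispatch on event_type: tool events are present only when the
        // server runs with withAllEvents enabled.
        var probe struct {
            EventType string `json:"event_type"`
        }
        _ = json.Unmarshal([]byte(data), &probe)
        switch probe.EventType {
        case "tool_call":
            fmt.Println("tool call:", data)
        case "tool_response":
            fmt.Println("tool response:", data)
        default:
            fmt.Println("chunk:", data)
        }
    }
}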
Event System Implementations
The event system is a general architecture that can be implemented in various ways:
AgentFlow Web Interface
AgentFlow implements the event system through:
- Browser-based SSE consumption
- Real-time popup notifications for tool calls
- Progressive content updates
- Interactive event display controls
CLI Interfaces
Command-line interfaces can implement through:
- Terminal-based event display
- Progress indicators and status updates
- Structured logging of events
- Interactive prompts based on events
API Gateways
API gateways can implement through:
- Event forwarding to multiple consumers
- Event filtering and transformation
- Event persistence and replay
- Event-based routing and load balancing
Future Event System Enhancements
Advanced Event Types
- Reasoning Events: Capture AI’s internal reasoning process
- Planning Events: Show multi-step planning and strategy
- Context Events: Track context usage and management
- Performance Events: Detailed timing and resource usage
Event Intelligence
- Event Pattern Recognition: Identify common patterns and anomalies
- Predictive Events: Anticipate likely next events
- Event Summarization: Aggregate events into higher-level insights
- Event Recommendations: Suggest optimizations based on event patterns
Enhanced Delivery
- Event Persistence: Store and replay event histories
- Event Filtering: Selective event delivery based on preferences
- Event Routing: Direct events to multiple consumers
- Event Transformation: Adapt events for different consumer types
Conclusion
The event system architecture in gomcptest provides a foundational layer for transparency, observability, and real-time interaction in agentic systems. By understanding these concepts, developers can effectively implement event-driven interfaces, create monitoring systems, and build tools that provide deep visibility into AI agent behavior.
This event system is implementation-agnostic and serves as the foundation for specific implementations like AgentFlow, while also enabling other interfaces and monitoring systems to provide similar transparency and real-time feedback capabilities.
4 - Understanding the MCP Tools
Detailed explanation of the MCP tools architecture and implementation
This document explains the architecture and implementation of the MCP tools in gomcptest, how they work, and the design principles behind them.
MCP (Model Context Protocol) tools are standalone executables that provide specific functions that can be invoked by AI models. They allow the AI to interact with its environment - performing tasks like reading and writing files, executing commands, or searching for information.
In gomcptest, tools are implemented as independent Go executables that follow a standard protocol for receiving requests and returning results through standard input/output streams. Tool interactions generate events that are captured by the event system, enabling real-time monitoring and transparency.
Each tool in gomcptest follows a consistent architecture:
- Standard I/O Interface: Tools communicate via stdin/stdout using JSON-formatted requests and responses
- Parameter Validation: Tools validate their input parameters according to a JSON schema
- Stateless Execution: Each tool invocation is independent and does not maintain state
- Controlled Access: Tools implement appropriate security measures and permission checks
- Structured Results: Results are returned in a standardized JSON format
Common Components
Most tools share these common components:
- Main Function: Parses JSON input, validates parameters, executes the core function, formats and returns the result
- Parameter Structure: Defines the expected input parameters for the tool
- Result Structure: Defines the format of the tool’s output
- Error Handling: Standardized error reporting and handling
- Security Checks: Validation to prevent dangerous operations
Tool Categories
The tools in gomcptest can be categorized into several functional groups:
Filesystem Navigation
- LS: Lists files and directories, providing metadata and structure
- GlobTool: Finds files matching specific patterns, making it easier to locate relevant files
- GrepTool: Searches file contents using regular expressions, helping find specific information in codebases
Content Management
- View: Reads and displays file contents, allowing the model to analyze existing code or documentation
- Edit: Makes targeted modifications to files, enabling precise changes without overwriting the entire file
- Replace: Completely overwrites file contents, useful for generating new files or making major changes
System Interaction
- Bash: Executes shell commands, allowing the model to run commands, scripts, and programs
- dispatch_agent: A meta-tool that can create specialized sub-agents for specific tasks
AI/ML Services
- imagen: Generates and manipulates images using Google’s Imagen API, enabling visual content creation
Data Processing
- duckdbserver: Provides SQL-based data processing capabilities using DuckDB, enabling complex data analysis and transformations
Design Principles
The tools in gomcptest were designed with several key principles in mind:
1. Modularity
Each tool is a standalone executable that can be developed, tested, and deployed independently. This modular approach allows for:
- Independent development cycles
- Targeted testing
- Simpler debugging
- Ability to add or replace tools without affecting the entire system
2. Security
Security is a major consideration in the tool design:
- Tools validate inputs to prevent injection attacks
- File operations are limited to appropriate directories
- Bash command execution is restricted with banned commands
- Timeouts prevent infinite operations
- Process isolation prevents one tool from affecting others
3. Simplicity
The tools are designed to be simple to understand and use:
- Clear, focused functionality for each tool
- Straightforward parameter structures
- Consistent result formats
- Well-documented behaviors and limitations
4. Extensibility
The system is designed to be easily extended:
- New tools can be added by following the standard protocol
- Existing tools can be enhanced with additional parameters
- Alternative implementations can replace existing tools
Tool Communication Protocol
The communication protocol for tools follows this pattern:
Tools receive JSON input on stdin in this format:
{
"param1": "value1",
"param2": "value2",
"param3": 123
}
Tools return JSON output on stdout in one of these formats:
Success:
{
"result": "text result"
}
or
{
"results": [
{"field1": "value1", "field2": "value2"},
{"field1": "value3", "field2": "value4"}
]
}
Error:
{
"error": "Error message",
"code": "ERROR_CODE"
}
Implementation Examples
Most tools follow this basic structure:
package main
import (
"encoding/json"
"os"
)
// Parameters defines the expected input structure
type Parameters struct {
Param1 string `json:"param1"`
Param2 int `json:"param2,omitempty"`
}
// Result defines the output structure
type Result struct {
Result string `json:"result,omitempty"`
Error string `json:"error,omitempty"`
Code string `json:"code,omitempty"`
}
func main() {
// Parse input
var params Parameters
decoder := json.NewDecoder(os.Stdin)
if err := decoder.Decode(&params); err != nil {
outputError("Failed to parse input", "INVALID_INPUT")
return
}
// Validate parameters
if params.Param1 == "" {
outputError("param1 is required", "MISSING_PARAMETER")
return
}
// Execute core functionality
result, err := executeTool(params)
if err != nil {
outputError(err.Error(), "EXECUTION_ERROR")
return
}
// Return result
output := Result{Result: result}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(output)
}
func executeTool(params Parameters) (string, error) {
// Tool-specific logic here
return "result", nil
}
func outputError(message, code string) {
result := Result{
Error: message,
Code: code,
}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(result)
}
Advanced Concepts
Tool Composition
The dispatch_agent tool demonstrates how tools can be composed to create more powerful capabilities. It:
- Accepts a high-level task description
- Plans a sequence of tool operations to accomplish the task
- Executes these operations using the available tools
- Synthesizes the results into a coherent response
Error Propagation
The tool error mechanism is designed to provide useful information back to the model:
- Error messages are human-readable and descriptive
- Error codes allow programmatic handling of specific error types
- Stacktraces and debugging information are not exposed to maintain security
Performance Considerations
Tools are designed with performance in mind:
- File operations use efficient libraries and patterns
- Search operations employ indexing and filtering when appropriate
- Large results can be paginated or truncated to prevent context overflows
- Resource-intensive operations have configurable timeouts
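Such timeouts are typically enforced with a context deadline around the external process. The following sketch is illustrative; the tool path and request payload are placeholders:
package main

import (
    "bytes"
    "context"
    "fmt"
    "os/exec"
    "time"
)

// runToolWithTimeout runs a tool binary with a JSON request on stdin and
// kills the process if it exceeds the given timeout.
func runToolWithTimeout(toolPath string, request []byte, timeout time.Duration) ([]byte, error) {
    ctx, cancel := context.WithTimeout(context.Background(), timeout)
    defer cancel()

    cmd := exec.CommandContext(ctx, toolPath)
    cmd.Stdin = bytes.NewReader(request)

    var out bytes.Buffer
    cmd.Stdout = &out
    err := cmd.Run()
    if ctx.Err() == context.DeadlineExceeded {
        return nil, fmt.Errorf("tool %s timed out after %s", toolPath, timeout)
    }
    return out.Bytes(), err
}

func main() {
    // Placeholder tool path and request payload.
    out, err := runToolWithTimeout("./bin/Bash", []byte(`{"command":"sleep 30"}`), 5*time.Second)
    fmt.Println(string(out), err)
}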
Future Directions
The tool architecture in gomcptest could evolve in several ways:
- Streaming Results: Supporting incremental results for long-running operations
- Tool Discovery: More sophisticated mechanisms for models to discover available tools
- Tool Chaining: First-class support for composing multiple tools in sequences or pipelines
- Interactive Tools: Tools that can engage in multi-step interactions with the model
- Persistent State: Optional state maintenance for tools that benefit from context
Conclusion
The MCP tools in gomcptest provide a flexible, secure, and extensible foundation for enabling AI agents to interact with their environment. By understanding the architecture and design principles of these tools, developers can effectively utilize the existing tools, extend them with new capabilities, or create entirely new tools that integrate seamlessly with the system.
5 - AgentFlow: Event-Driven Interface Implementation
Implementation details of AgentFlow’s event-driven web interface, demonstrating how the general event system concepts are applied in practice through real-time tool interactions and streaming responses.
This document explains how AgentFlow implements the general event system architecture in a web-based interface, providing a concrete example of the event-driven patterns described in the foundational concepts. AgentFlow is the embedded web interface for gomcptest’s OpenAI-compatible server.
What is AgentFlow?
AgentFlow is a specific implementation of the gomcptest event system in the form of a modern web-based chat interface. It demonstrates how the general event-driven architecture can be applied to create transparent, real-time agentic interactions through a browser-based UI.
Core Architecture Overview
ChatEngine Interface Design
The foundation of AgentFlow's functionality rests on the ChatServer interface defined in chatengine/chat_server.go:
type ChatServer interface {
AddMCPTool(client.MCPClient) error
ModelList(context.Context) ListModelsResponse
ModelDetail(ctx context.Context, modelID string) *Model
ListTools(ctx context.Context) []ListToolResponse
HandleCompletionRequest(context.Context, ChatCompletionRequest) (ChatCompletionResponse, error)
SendStreamingChatRequest(context.Context, ChatCompletionRequest) (<-chan StreamEvent, error)
}
This interface abstracts the underlying LLM provider (currently Vertex AI Gemini) and provides a consistent API for tool integration and streaming responses. The key innovation is the SendStreamingChatRequest method, which returns a channel of StreamEvent interfaces, enabling real-time event streaming.
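A caller of this interface drains the returned channel and type-switches on the concrete event types. The fragment below is illustrative (package qualifiers and imports are omitted, and whether events arrive as values or pointers depends on the producer):
// Illustrative consumer of the channel returned by SendStreamingChatRequest.
func consumeStream(events <-chan StreamEvent) {
    for ev := range events {
        switch e := ev.(type) {
        case ToolCallEvent:
            fmt.Printf("tool call: %s %v\n", e.ToolCall.Name, e.ToolCall.Arguments)
        case ToolResponseEvent:
            fmt.Printf("tool response from %s\n", e.ToolResponse.Name)
        case ChatCompletionStreamResponse:
            // Standard streaming chunk: append its delta to the reply text.
        }
    }
}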
OpenAI v1 API Compatibility Strategy
A fundamental design decision was to maintain full compatibility with the OpenAI v1 API while extending it with enhanced functionality. This is achieved through:
- Standard Endpoint Preservation: Uses /v1/chat/completions, /v1/models, and /v1/tools endpoints
- Parameter Encoding: Tool selection is encoded within the existing model parameter using a pipe-delimited format
- Event Extension: Additional events are streamed alongside standard chat completion responses
- Backward Compatibility: Existing OpenAI-compatible clients work unchanged
This approach avoids the need to modify standard API endpoints while providing enhanced capabilities through the AgentFlow interface.
Event System Architecture
StreamEvent Interface
The event system is built around the StreamEvent interface in chatengine/stream_event.go:
type StreamEvent interface {
IsStreamEvent() bool
}
This simple interface allows for polymorphic event handling, where different event types can be processed through the same streaming pipeline.
Event Types and Structure
Tool Call Events
Defined in chatengine/vertexai/gemini/tool_events.go, tool call events capture when the AI decides to use a tool:
type ToolCallEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolCall ToolCallDetails `json:"tool_call"`
}
type ToolCallDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Arguments map[string]interface{} `json:"arguments"`
}
Tool Response Events
Tool response events capture the results of tool execution:
type ToolResponseEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolResponse ToolResponseDetails `json:"tool_response"`
}
type ToolResponseDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Response interface{} `json:"response"`
Error string `json:"error,omitempty"`
}
Server-Sent Events Implementation
The streaming implementation in chatengine/chat_completion_stream.go provides the SSE infrastructure:
func (o *OpenAIV1WithToolHandler) streamResponse(w http.ResponseWriter, r *http.Request, req ChatCompletionRequest) {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("Transfer-Encoding", "chunked")
// Process events from the stream channel
for event := range stream {
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Handle standard chat completion chunks
default:
// Handle tool events if withAllEvents flag is true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
}
}
The withAllEvents flag controls whether tool events are included in the stream, allowing for backward compatibility with standard OpenAI clients.
Pipe-Delimited Encoding
The tool selection mechanism is implemented through a clever encoding scheme in the model parameter. The ParseModelAndTools method in chatengine/chat_structure.go parses this format:
func (req *ChatCompletionRequest) ParseModelAndTools() (string, []string) {
parts := strings.Split(req.Model, "|")
if len(parts) <= 1 {
return req.Model, nil
}
modelName := strings.TrimSpace(parts[0])
toolNames := make([]string, 0, len(parts)-1)
for i := 1; i < len(parts); i++ {
toolName := strings.TrimSpace(parts[i])
if toolName != "" {
toolNames = append(toolNames, toolName)
}
}
return modelName, toolNames
}
This allows formats like:
- gemini-2.0-flash (no tool filtering)
- gemini-2.0-flash|Edit|View|Bash (specific tools only)
- gemini-1.5-pro|VertexAI Code Execution (model with built-in tools)
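For instance, given the parsing logic above:
req := ChatCompletionRequest{Model: "gemini-2.0-flash|Edit|View|Bash"}
model, tools := req.ParseModelAndTools()
// model == "gemini-2.0-flash"
// tools == []string{"Edit", "View", "Bash"}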
Tool Filtering
The Vertex AI Gemini implementation includes sophisticated tool filtering in chatengine/vertexai/gemini/chatsession.go:
func (chatsession *ChatSession) FilterTools(requestedToolNames []string) []*genai.Tool {
if len(requestedToolNames) == 0 {
return chatsession.tools // Return all tools if none specified
}
// Build a lookup set of the requested tool names used below
requestedMap := make(map[string]bool, len(requestedToolNames))
for _, name := range requestedToolNames {
requestedMap[name] = true
}
var filteredTools []*genai.Tool
var filteredFunctions []*genai.FunctionDeclaration
for _, tool := range chatsession.tools {
// Handle Vertex AI built-in tools separately
switch {
case tool.CodeExecution != nil && requestedMap[VERTEXAI_CODE_EXECUTION]:
filteredTools = append(filteredTools, &genai.Tool{CodeExecution: tool.CodeExecution})
case tool.GoogleSearch != nil && requestedMap[VERTEXAI_GOOGLE_SEARCH]:
filteredTools = append(filteredTools, &genai.Tool{GoogleSearch: tool.GoogleSearch})
// ... handle other built-in tools
default:
// Handle MCP function declarations
for _, function := range tool.FunctionDeclarations {
if requestedMap[function.Name] {
filteredFunctions = append(filteredFunctions, function)
}
}
}
}
// Combine function declarations into a single tool
if len(filteredFunctions) > 0 {
filteredTools = append(filteredTools, &genai.Tool{
FunctionDeclarations: filteredFunctions,
})
}
return filteredTools
}
This implementation handles both Vertex AI built-in tools (CodeExecution, GoogleSearch, etc.) and MCP function declarations, ensuring they are properly separated to avoid proto validation errors.
Frontend Event Processing
Real-Time Event Handling
The JavaScript implementation in chat-ui.html.tmpl provides comprehensive event processing through the handleStreamingResponse method:
async handleStreamingResponse(response) {
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { value, done } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
// Handle different event types
if (parsed.event_type === 'tool_call') {
this.addToolNotification(parsed.tool_call.name, parsed);
this.showToolCallPopup(parsed);
} else if (parsed.event_type === 'tool_response') {
this.updateToolResponsePopup(parsed);
this.storeToolResponse(parsed);
} else if (parsed.choices && parsed.choices[0]) {
// Handle standard chat completion chunks
this.updateMessageContent(messageIndex, assistantMessage, true);
}
} catch (e) {
// Handle JSON parse errors gracefully
}
}
}
}
}
Tool Popup Management
AgentFlow implements a sophisticated popup management system to provide real-time feedback on tool execution:
showToolCallPopup(event) {
const popupId = event.tool_call.id;
// Create popup with loading state
const popup = document.createElement('div');
popup.className = 'tool-popup tool-call';
popup.innerHTML = `
<div class="tool-popup-header">
<div class="tool-popup-title">Tool Executing: ${event.tool_call.name}</div>
<button class="tool-popup-close" onclick="chatUI.closeToolPopup('${popupId}')">×</button>
</div>
<div class="tool-popup-content">
<div class="tool-popup-args">${JSON.stringify(event.tool_call.arguments, null, 2)}</div>
<div class="tool-popup-spinner"></div>
</div>
`;
// Store reference and set auto-close timer
this.toolPopups.set(popupId, popup);
this.popupAutoCloseTimers.set(popupId, setTimeout(() => {
this.closeToolPopup(popupId);
}, 30000));
}
updateToolResponsePopup(event) {
const popup = this.toolPopups.get(event.tool_response.id);
if (!popup) return;
// Update popup with response data
popup.className = `tool-popup ${event.tool_response.error ? 'tool-error' : 'tool-response'}`;
// Update content with response...
// Auto-close after showing result
setTimeout(() => {
this.closeToolPopup(event.tool_response.id);
}, 5500);
}
The frontend buildModelWithTools() function implements the pipe-delimited encoding:
buildModelWithTools() {
let modelString = this.selectedModel;
if (this.selectedTools.size > 0 && this.selectedTools.size < this.tools.length) {
// Only add tools if not all are selected (all selected means use all tools)
const toolNames = Array.from(this.selectedTools);
modelString += '|' + toolNames.join('|');
}
return modelString;
}
This ensures tool selection is properly encoded in the API request while maintaining OpenAI compatibility.
Technical Design Benefits
Event-Driven Transparency
The event system provides unprecedented visibility into AI decision-making:
- Real-Time Feedback: Users see tool calls as they happen
- Detailed Information: Full argument and response data available
- Error Visibility: Tool failures are clearly communicated
- Learning Opportunity: Users understand how AI approaches problems
Scalable Architecture
The channel-based streaming architecture scales well:
- Non-Blocking: Event processing doesn’t block the main request thread
- Backpressure Handling: Go channels provide natural backpressure
- Resource Management: Proper cleanup prevents memory leaks
- Error Isolation: Tool failures don’t crash the entire system
OpenAI Compatibility
The design maintains full OpenAI v1 API compatibility:
- Standard Endpoints: No custom API modifications required
- Parameter Encoding: Tool selection uses existing model parameter
- Event Extensions: Additional events don’t interfere with standard responses
- Client Compatibility: Existing OpenAI clients work unchanged
Integration Points
MCP Protocol Integration
AgentFlow seamlessly integrates with the Model Context Protocol:
- Tool Discovery: Automatic detection of MCP server capabilities
- Dynamic Loading: Tools can be added/removed without restart
- Protocol Abstraction: MCP details are hidden from the UI
- Error Handling: MCP errors are gracefully handled and displayed
Vertex AI Integration
The Vertex AI backend provides:
- Built-in Tools: Code execution, Google Search, etc.
- Model Selection: Multiple Gemini model variants
- Streaming Support: Native streaming for real-time responses
- Tool Mixing: Combines MCP tools with Vertex AI capabilities
This comprehensive architecture enables AgentFlow to provide an intuitive, powerful interface for agentic interactions while maintaining compatibility with existing OpenAI tooling and providing deep visibility into the AI’s decision-making process.