Documentation
gomcptest Documentation
Welcome to the gomcptest documentation. This project is a proof of concept (POC) demonstrating how to implement a Model Context Protocol (MCP) with a custom-built host to play with agentic systems.
Documentation Structure
Our documentation follows the Divio Documentation Framework, which organizes content into four distinct types: tutorials, how-to guides, reference, and explanation. This approach ensures that different learning needs are addressed with the appropriate content format.
Tutorials: Learning-oriented content
Tutorials are lessons that take you by the hand through a series of steps to complete a project. They focus on learning by doing, and help beginners get started with the system.
How-to Guides: Problem-oriented content
How-to guides are recipes that guide you through the steps involved in addressing key problems and use cases. They are practical and goal-oriented.
Reference: Information-oriented content
Reference guides are technical descriptions of the machinery and how to operate it. They describe how things work in detail and are accurate and complete.
| Reference | Description |
|---|---|
| Tools Reference | Comprehensive reference of all available MCP-compatible tools, their parameters, response formats, and error handling. |
| OpenAI-Compatible Server Reference | Technical documentation of the server's architecture, AgentFlow UI, API endpoints, configuration options, and Vertex AI integration. |
| cliGCP Reference | ⚠️ DEPRECATED: Legacy CLI reference. Use AgentFlow UI instead. |
Explanation: Understanding-oriented content
Explanation documents discuss and clarify concepts to broaden the reader’s understanding of topics. They provide context and illuminate ideas.
| Explanation | Description |
|---|---|
| gomcptest Architecture | Deep dive into the system architecture, design decisions, and how the various components interact to create a custom MCP host. |
| Understanding the Model Context Protocol (MCP) | Exploration of what MCP is, how it works, design decisions behind it, and how it compares to alternative approaches for LLM tool integration. |
| AgentFlow: Modern Web Interface | Comprehensive guide to AgentFlow's features including tool selection, real-time event notifications, mobile optimization, and conversation management. |
Project Components
gomcptest consists of several key components that work together:
Host Components
- OpenAI-compatible server (host/openaiserver): A server that implements the OpenAI API interface and connects to Google's Vertex AI for model inference. Includes the modern AgentFlow web UI for interactive chat.
- cliGCP (host/cliGCP): ⚠️ DEPRECATED - Legacy command-line interface. Use the AgentFlow web UI instead.
AgentFlow Web UI
The modern web-based interface is embedded in the openaiserver binary and provides:
- Mobile-optimized design with Apple touch icon support
- Real-time streaming responses via Server-Sent Events
- Professional styling with accessibility features
- Conversation management with persistent history
- File upload support including PDFs
- Embedded architecture for easy deployment via the /ui endpoint
Access AgentFlow by running ./bin/openaiserver and visiting http://localhost:8080/ui.
Tools
The tools directory contains various MCP-compatible tools:
File System Operations
- Bash: Executes bash commands in a persistent shell session
- Edit: Modifies file content by replacing specified text
- GlobTool: Finds files matching glob patterns
- GrepTool: Searches file contents using regular expressions
- LS: Lists files and directories
- Replace: Completely replaces a file’s contents
- View: Reads file contents
- imagen: Generates images using Google’s Imagen API
- imagen_edit: Edits images using natural language instructions
- plantuml: Generates PlantUML diagram URLs with syntax validation
- plantuml_check: Validates PlantUML file syntax
- duckdbserver: Provides SQL-based data processing with DuckDB
- dispatch_agent: Launches specialized sub-agents for specific tasks
- sleep: Pauses execution for testing and demonstrations
1 - Tutorials
Step-by-step guides to get you started with gomcptest
Tutorials are learning-oriented guides that take you through a series of steps to complete a project. They focus on learning by doing, and helping beginners get started with the system.
These tutorials will help you get familiar with gomcptest and its components.
1.1 - Getting Started with gomcptest
Get gomcptest up and running quickly with this beginner’s guide
This tutorial will take you through building and running your first AI agent system with gomcptest. By the end, you’ll have a working agent that can help you manage files and execute commands on your system.
What you’ll accomplish: Set up gomcptest, build the tools, and have your first conversation with an AI agent that can actually help you with real tasks.
For background on what gomcptest is and how it works, see the Architecture explanation.
Prerequisites
- Go >= 1.21 installed on your system
- Google Cloud account with access to Vertex AI API
- Google Cloud CLI installed
- Basic familiarity with terminal/command line
Setting up Google Cloud Authentication
Before using gomcptest with Google Cloud Platform services like Vertex AI, you need to set up your authentication.
1. Initialize the Google Cloud CLI
If you haven’t already configured the Google Cloud CLI, run:
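gcloud init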
This interactive command will guide you through:
- Logging into your Google account
- Selecting a Google Cloud project
- Setting default configurations
2. Log in to Google Cloud
Authenticate your gcloud CLI with your Google account:
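gcloud auth login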
This will open a browser window where you can sign in to your Google account.
3. Set up Application Default Credentials (ADC)
Application Default Credentials are used by client libraries to automatically find credentials when connecting to Google Cloud services:
gcloud auth application-default login
This command will:
- Open a browser window for authentication
- Store your credentials locally (typically in ~/.config/gcloud/application_default_credentials.json)
- Configure your environment to use these credentials when accessing Google Cloud APIs
These credentials will be used by gomcptest when interacting with Google Cloud services.
Project Setup
Clone the repository:
git clone https://github.com/owulveryck/gomcptest.git
cd gomcptest
Build All Components: Compile tools and servers using the root Makefile
# Build all tools and servers
make all
# Or build only tools
make tools
# Or build only servers
make servers
Set up your environment: Configure Google Cloud Project
# Set your project ID (replace with your actual project ID)
export GCP_PROJECT="your-project-id"
export GCP_REGION="us-central1"
export GEMINI_MODELS="gemini-2.0-flash"
export PORT=8080
Step 4: Start Your First AI Agent
Now let’s start the OpenAI-compatible server with the AgentFlow web interface:
cd host/openaiserver
go run . -withAllEvents -mcpservers "../../bin/LS;../../bin/View;../../bin/Bash;../../bin/GlobTool"
⚠️ Note: We're using the -withAllEvents flag to enable full tool event streaming, which is essential for seeing the real-time tool execution notifications in the AgentFlow UI.
You should see output like:
2024/01/15 10:30:00 Starting OpenAI-compatible server on port 8080
2024/01/15 10:30:00 Registered MCP tool: LS
2024/01/15 10:30:00 Registered MCP tool: View
2024/01/15 10:30:00 Registered MCP tool: Bash
2024/01/15 10:30:00 Registered MCP tool: GlobTool
2024/01/15 10:30:00 AgentFlow UI available at: http://localhost:8080/ui
Step 5: Have Your First Agent Conversation
Open the AgentFlow UI: Navigate to http://localhost:8080/ui in your browser
Test basic interaction: Type this message in the chat:
Hello! Can you help me understand what files are in the current directory?
Watch the magic happen: You’ll see:
- The AI agent decides to use the LS tool
- A blue notification appears showing “Calling tool: LS”
- The tool executes and shows your directory contents
- The AI explains what it found
Try a more advanced task: Ask the agent:
Find all .go files in this project and tell me about the project structure
Watch as the agent:
- Uses GlobTool to find .go files
- Uses View to examine some files
- Gives you an analysis of the project structure
Congratulations! 🎉
You’ve just built and run your first AI agent system! Your agent can now:
- ✅ Navigate your file system
- ✅ Read file contents
- ✅ Execute commands
- ✅ Find files matching patterns
- ✅ Provide intelligent analysis of what it discovers
What You’ve Learned
Through this hands-on experience, you’ve:
- Set up authentication with Google Cloud
- Built MCP-compatible tools from source
- Started an OpenAI-compatible server
- Used the AgentFlow web interface
- Watched an AI agent use tools to accomplish real tasks
Next Steps
Now that your agent is working, explore what else it can do:
1.2 - Building Your First OpenAI-Compatible Server
Set up and run an OpenAI-compatible server with AgentFlow UI for interactive tool management and real-time event monitoring
This tutorial will guide you step-by-step through running and configuring the OpenAI-compatible server in gomcptest with the AgentFlow web interface. By the end, you’ll have a working server with a modern UI that provides tool selection, real-time event monitoring, and interactive chat capabilities.
Prerequisites
- Go >= 1.21 installed
- Access to Google Cloud Platform with Vertex AI API enabled
- GCP authentication set up via gcloud auth login
- Basic familiarity with terminal commands
- The gomcptest repository cloned and tools built (see the Getting Started guide)
Step 1: Set Up Environment Variables
The OpenAI server requires several environment variables. Create a .envrc file in the host/openaiserver directory:
cd host/openaiserver
touch .envrc
Add the following content to the .envrc file, adjusting the values according to your setup:
# Server configuration
export PORT=8080
export LOG_LEVEL=INFO
# GCP configuration
export GCP_PROJECT=your-gcp-project-id
export GCP_REGION=us-central1
export GEMINI_MODELS=gemini-2.0-flash
Note: The IMAGE_DIR and IMAGEN_MODELS environment variables are no longer needed for the openaiserver host. Image generation is now handled by the independent tools/imagen MCP server.
Load the environment variables:
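source .envrc
If you use direnv, run direnv allow instead.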
Step 2: Start the OpenAI Server with AgentFlow UI
Now you can start the OpenAI-compatible server with the embedded AgentFlow interface:
cd host/openaiserver
go run . -withAllEvents -mcpservers "../bin/GlobTool;../bin/GrepTool;../bin/LS;../bin/View;../bin/Bash;../bin/Replace"
⚠️ Important: We're using the -withAllEvents flag to enable streaming of all tool execution events. This is essential for the real-time tool monitoring features in AgentFlow.
You should see output indicating that the server has started and registered the MCP tools.
Step 3: Access the AgentFlow Web Interface
Open your web browser and navigate to:
http://localhost:8080/ui
You’ll see the AgentFlow interface with:
- Modern chat interface with mobile-optimized design
- Tool selection dropdown showing all available MCP tools (GlobTool, GrepTool, LS, View, Bash, Replace)
- Model selection with Vertex AI tools support
- Real-time event monitoring for tool calls and responses
- View Available Tools: Click the “Tools: All” button to see the tool dropdown
- Select Specific Tools: Uncheck tools you don’t want to use for focused interactions
- Tool Information: Each tool shows its name and description
- Apply Selection: Your selection is automatically applied to new conversations
As you interact with the AI agent, you’ll see real-time notifications when:
- Tool Calls: Blue notifications appear when the AI decides to use a tool
- Tool Responses: Results are displayed as they complete
- Event Details: Click notifications to see detailed tool arguments and responses
Step 4: Test AgentFlow with Interactive Chat
In the AgentFlow interface, try these interactive examples:
Basic Chat Test
- Type in the chat input: “Hello, what can you do?”
- Send the message and observe the response
- Notice how the AI explains its capabilities and available tools
- Ask: “List the files in the current directory”
- Watch as AgentFlow shows:
- Tool Call Notification: “Calling tool: LS” appears immediately
- Tool Call Popup: Shows the LS tool being called with its arguments
- Tool Response: Displays the directory listing result
- AI Response: The model interprets and explains the results
- Click “Tools: All” to open the tool selector
- Uncheck all tools except “View” and “LS”
- Ask: “Show me the contents of README.md”
- Notice how the AI can only use the selected tools (View for reading, LS for listing)
Event Monitoring
Throughout your interactions, observe:
- Real-time Events: Tool calls appear instantly as blue notifications
- Event History: All tool interactions are preserved in the chat
- Detailed Information: Click on tool notifications to see arguments and responses
Step 5: Alternative API Testing (Optional)
You can also test the server programmatically using curl:
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.0-flash",
"messages": [
{
"role": "user",
"content": "List the files in the current directory"
}
]
}'
However, the AgentFlow UI provides much richer feedback and interaction capabilities.
What You’ve Learned
In this tutorial, you’ve:
- Set up the environment for the OpenAI-compatible server with AgentFlow UI
- Built and registered MCP tools
- Started the server with embedded web interface
- Accessed the modern AgentFlow web interface
- Used interactive tool selection and monitoring
- Experienced real-time tool event notifications
- Tested both UI and API interactions
Key AgentFlow Features Demonstrated
- Tool Selection: Granular control over which tools are available to the AI
- Real-time Events: Live monitoring of tool calls and responses
- Event Notifications: Visual feedback for tool interactions
- Mobile Optimization: Responsive design that works on all devices
- Interactive Chat: Modern conversation interface with rich formatting
Next Steps
Now that you have a working AgentFlow-enabled server, you can:
For detailed information about tool selection, event monitoring, and other AgentFlow features, see the comprehensive AgentFlow Documentation.
1.3 - Using the cliGCP Command Line Interface
Set up and use the cliGCP command line interface to interact with LLMs and MCP tools
This tutorial guides you through setting up and using the cliGCP command line interface to interact with LLMs and MCP tools. By the end, you’ll be able to run the CLI and perform basic tasks with it.
Prerequisites
- Go >= 1.21 installed on your system
- Access to Google Cloud Platform with Vertex AI API enabled
- GCP authentication set up via gcloud auth login
- The gomcptest repository cloned and tools built (see the Getting Started guide)
The cliGCP tool is a command-line interface similar to tools like Claude Code. It connects directly to the Google Cloud Platform’s Vertex AI API to access Gemini models and can use local MCP tools to perform actions on your system.
First, build the cliGCP tool if you haven’t already:
cd gomcptest
make all # This builds all tools including cliGCP
If you only want to build cliGCP, you can run:
cd host/cliGCP/cmd
go build -o ../../../bin/cliGCP
Step 3: Set Up Environment Variables
The cliGCP tool requires environment variables for GCP configuration. You can set these directly or create an .envrc file:
Add the following content to the .envrc file:
export GCP_PROJECT=your-gcp-project-id
export GCP_REGION=us-central1
export GEMINI_MODELS=gemini-2.0-flash
Note: IMAGEN_MODELS and IMAGE_DIR are no longer needed for the cliGCP host. Image generation is available through the independent tools/imagen MCP server.
Load the environment variables:
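source .envrc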
Now you can run the cliGCP tool with MCP tools:
cd bin
./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./dispatch_agent -glob-path ./GlobTool -grep-path ./GrepTool -ls-path ./LS -view-path ./View;./Bash;./Replace"
You should see a welcome message and a prompt where you can start interacting with the CLI.
Step 5: Simple Queries
Let’s try a few simple interactions:
> Hello, who are you?
You should get a response introducing the agent.
Now let’s try using some of the MCP tools:
> List the files in the current directory
The CLI should call the LS tool and show you the files in the current directory.
> Search for files with "go" in the name
The CLI will use the GlobTool to find files matching that pattern.
> Read the README.md file
The CLI will use the View tool to show you the contents of the README.md file.
Step 7: Creating a Simple Task
Let’s create a simple task that combines multiple tools:
> Create a new file called test.txt with the text "Hello, world!" and then verify it exists
The CLI should:
- Use the Replace tool to create the file
- Use the LS tool to verify the file exists
- Use the View tool to show you the contents of the file
What You’ve Learned
In this tutorial, you’ve:
- Set up the cliGCP environment
- Run the CLI with MCP tools
- Performed basic interactions with the CLI
- Used various tools through the CLI to manipulate files
- Created a simple workflow combining multiple tools
Next Steps
Now that you’re familiar with the cliGCP tool, you can:
- Explore more complex tasks that use multiple tools
- Try using the dispatch_agent for more complex operations
- Create custom tools and use them with the CLI
- Experiment with different Gemini models
Check out the How to Configure the cliGCP Tool guide for advanced configuration options.
2 - How-To Guides
Practical guides for solving specific problems with gomcptest
How-to guides are problem-oriented recipes that guide you through the steps involved in addressing key problems and use cases. They are practical and goal-oriented.
These guides will help you solve specific tasks and customize gomcptest for your needs.
Available How-To Guides
2.1 - How to Create a Custom MCP Tool
Build your own Model Context Protocol (MCP) compatible tools
This guide shows you how to create a new custom tool that’s compatible with the Model Context Protocol (MCP) in gomcptest.
Prerequisites
- A working installation of gomcptest
- Go programming knowledge
- Understanding of the MCP protocol basics
1. Create the tool directory structure
mkdir -p tools/YourToolName/cmd
2. Create the README.md file
Create a README.md in the tool directory with documentation:
touch tools/YourToolName/README.md
Include the following sections:
- Tool description
- Parameters
- Usage notes
- Example
3. Create the main.go file
Create a main.go file in the cmd directory:
touch tools/YourToolName/cmd/main.go
Here’s a template to start with:
package main
import (
"encoding/json"
"fmt"
"log"
"os"
"github.com/mark3labs/mcp-go"
)
// Define your tool's parameters structure
type Params struct {
// Add your parameters here
// Example:
InputParam string `json:"input_param"`
}
func main() {
server := mcp.NewServer()
// Register your tool function
server.RegisterFunction("YourToolName", func(params json.RawMessage) (any, error) {
var p Params
if err := json.Unmarshal(params, &p); err != nil {
return nil, fmt.Errorf("failed to parse parameters: %w", err)
}
// Implement your tool's logic here
result := doSomethingWithParams(p)
return result, nil
})
if err := server.Run(os.Stdin, os.Stdout); err != nil {
log.Fatalf("Server error: %v", err)
}
}
func doSomethingWithParams(p Params) interface{} {
// Your tool's core functionality
// ...
// Return the result
return map[string]interface{}{
"result": "Your processed result",
}
}
Open the Makefile in the root directory and add your tool:
YourToolName:
go build -o bin/YourToolName tools/YourToolName/cmd/main.go
Also add it to the all target.
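For example, assuming the all target simply lists each tool target (a sketch; match the structure of the existing Makefile in your checkout), it would become something like:
all: GlobTool GrepTool LS View Bash Replace YourToolName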
Test the tool directly:
echo '{"name":"YourToolName","params":{"input_param":"test"}}' | ./bin/YourToolName
8. Use with the CLI
Add your tool to the CLI command:
./bin/cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./YourToolName;./dispatch_agent;./Bash;./Replace"
Best Practices
- Focus on a single, well-defined purpose
- Provide clear error messages
- Include meaningful response formatting
- Implement proper parameter validation
- Handle edge cases gracefully
- Consider adding unit tests in a _test.go file
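For example, a minimal test sketch for the template above (it assumes the Params struct and doSomethingWithParams function shown earlier) could live in tools/YourToolName/cmd/main_test.go:
package main

import "testing"

func TestDoSomethingWithParams(t *testing.T) {
	// Call the tool's core logic with a sample parameter set.
	raw := doSomethingWithParams(Params{InputParam: "test"})

	// The template returns a map with a "result" key; verify that shape.
	m, ok := raw.(map[string]interface{})
	if !ok {
		t.Fatalf("expected a map result, got %T", raw)
	}
	if _, found := m["result"]; !found {
		t.Error(`expected a "result" key in the response`)
	}
}
Run it with go test ./tools/YourToolName/... from the repository root.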
2.2 - How to Configure the OpenAI-Compatible Server
Customize the OpenAI-compatible server with AgentFlow UI for different use cases, including tool selection, event monitoring, and production deployment
This guide shows you how to configure and customize the OpenAI-compatible server in gomcptest with the AgentFlow web interface for different use cases.
Prerequisites
- A working installation of gomcptest
- Basic familiarity with the OpenAI server from the tutorial
- Understanding of environment variables and configuration
Environment Variables Configuration
Basic Server Configuration
The OpenAI server can be configured using the following environment variables:
# Server port (default: 8080)
export PORT=8080
# Log level: DEBUG, INFO, WARN, ERROR (default: INFO)
export LOG_LEVEL=INFO
GCP Configuration
Configure the Google Cloud Platform integration:
# GCP Project ID (required)
export GCP_PROJECT=your-gcp-project-id
# GCP Region (default: us-central1)
export GCP_REGION=us-central1
# Comma-separated list of Gemini models (default: gemini-1.5-pro,gemini-2.0-flash)
export GEMINI_MODELS=gemini-1.5-pro,gemini-2.0-flash
Note: The IMAGEN_MODELS and IMAGE_DIR environment variables are no longer needed for the openaiserver. Image generation is now handled by the independent tools/imagen MCP server.
Enable additional Vertex AI built-in tools:
# Enable code execution capabilities
export VERTEX_AI_CODE_EXECUTION=true
# Enable Google Search integration
export VERTEX_AI_GOOGLE_SEARCH=true
# Enable Google Search with retrieval and grounding
export VERTEX_AI_GOOGLE_SEARCH_RETRIEVAL=true
AgentFlow UI Configuration
The AgentFlow web interface provides several configuration options for enhanced user experience.
Accessing AgentFlow
Once your server is running, access AgentFlow at:
http://localhost:8080/ui
AgentFlow allows granular control over tool availability:
- Default Behavior: All tools are available by default
- Tool Filtering: Use the tool selection dropdown to choose specific tools
- Model String Format: Selected tools are encoded as model|tool1|tool2|tool3 (see the request example after this list)
Example tool selection scenarios:
- Development: Enable only Edit, View, GlobTool, and GrepTool for code editing tasks
- File Management: Enable only LS, View, Bash for system administration
- Content Creation: Enable View, Replace, Edit for document editing
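For example, assuming the encoded string is sent as the model field of a chat completion request, restricting a session to the LS and View tools would look roughly like this:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.0-flash|LS|View",
    "messages": [{"role": "user", "content": "List the files in the current directory"}]
  }'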
Event Monitoring Configuration
AgentFlow provides real-time tool event monitoring:
- Tool Call Events: See when AI decides to use tools
- Tool Response Events: Monitor tool execution results
- Event Persistence: All events are saved in conversation history
- Event Details: Click notifications for detailed argument/response information
Mobile and PWA Configuration
For mobile deployment, AgentFlow supports Progressive Web App features:
- Apple Touch Icons: Pre-configured for iOS web app installation
- Responsive Design: Optimized for mobile devices
- Web App Manifest: Supports “Add to Home Screen” functionality
- Offline Capability: Conversations persist offline
UI Customization
You can customize AgentFlow by modifying the template file:
host/openaiserver/simpleui/chat-ui.html.tmpl
Key customization areas:
- Color Scheme: Modify CSS gradient backgrounds
- Tool Notification Styling: Customize event notification appearance
- Mobile Behavior: Adjust responsive breakpoints
- Branding: Update titles, icons, and metadata
Setting Up a Production Environment
For a production environment, create a proper systemd service file:
sudo nano /etc/systemd/system/gomcptest-openai.service
Add the following content:
[Unit]
Description=gomcptest OpenAI Server
After=network.target
[Service]
User=yourusername
WorkingDirectory=/path/to/gomcptest/host/openaiserver
ExecStart=/path/to/gomcptest/host/openaiserver/openaiserver -mcpservers "/path/to/gomcptest/bin/GlobTool;/path/to/gomcptest/bin/GrepTool;/path/to/gomcptest/bin/LS;/path/to/gomcptest/bin/View;/path/to/gomcptest/bin/Bash;/path/to/gomcptest/bin/Replace"
Environment=PORT=8080
Environment=LOG_LEVEL=INFO
Environment=GCP_PROJECT=your-gcp-project-id
Environment=GCP_REGION=us-central1
Environment=GEMINI_MODELS=gemini-1.5-pro,gemini-2.0-flash
Restart=on-failure
[Install]
WantedBy=multi-user.target
Then enable and start the service:
sudo systemctl enable gomcptest-openai
sudo systemctl start gomcptest-openai
To add custom MCP tools to the server, include them in the -mcpservers parameter when starting the server:
go run . -mcpservers "../bin/GlobTool;../bin/GrepTool;../bin/LS;../bin/View;../bin/YourCustomTool;../bin/Bash;../bin/Replace"
Some tools require additional parameters. You can specify these after the tool path:
go run . -mcpservers "../bin/GlobTool;../bin/dispatch_agent -glob-path ../bin/GlobTool -grep-path ../bin/GrepTool -ls-path ../bin/LS -view-path ../bin/View"
API Usage Configuration
Enabling CORS
For web applications, you may need to enable CORS. Add a middleware to the main.go file:
package main
import (
"net/http"
// other imports
)
// CORS middleware
func corsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Access-Control-Allow-Origin", "*")
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
if r.Method == "OPTIONS" {
w.WriteHeader(http.StatusOK)
return
}
next.ServeHTTP(w, r)
})
}
func main() {
// existing code...
http.Handle("/", corsMiddleware(openAIHandler))
// existing code...
}
Setting Rate Limits
Add a simple rate limiting middleware:
package main
import (
"net/http"
"sync"
"time"
// other imports
)
type RateLimiter struct {
requests map[string][]time.Time
maxRequests int
timeWindow time.Duration
mu sync.Mutex
}
func NewRateLimiter(maxRequests int, timeWindow time.Duration) *RateLimiter {
return &RateLimiter{
requests: make(map[string][]time.Time),
maxRequests: maxRequests,
timeWindow: timeWindow,
}
}
func (rl *RateLimiter) Middleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
ip := r.RemoteAddr
rl.mu.Lock()
// Clean up old requests
now := time.Now()
if reqs, exists := rl.requests[ip]; exists {
var validReqs []time.Time
for _, req := range reqs {
if now.Sub(req) <= rl.timeWindow {
validReqs = append(validReqs, req)
}
}
rl.requests[ip] = validReqs
}
// Check if rate limit is exceeded
if len(rl.requests[ip]) >= rl.maxRequests {
rl.mu.Unlock()
http.Error(w, "Rate limit exceeded", http.StatusTooManyRequests)
return
}
// Add current request
rl.requests[ip] = append(rl.requests[ip], now)
rl.mu.Unlock()
next.ServeHTTP(w, r)
})
}
func main() {
// existing code...
rateLimiter := NewRateLimiter(10, time.Minute) // 10 requests per minute
http.Handle("/", rateLimiter.Middleware(corsMiddleware(openAIHandler)))
// existing code...
}
Adjusting Memory Usage
For high-load scenarios, adjust Go’s garbage collector:
export GOGC=100 # Default is 100, lower values lead to more frequent GC
Increasing Concurrency
If handling many concurrent requests, adjust the server’s concurrency limits:
package main

import (
	"log"
	"net/http"
	"strconv"
	"time"
	// other imports
)

func main() {
	// existing code...
	server := &http.Server{
		Addr:           ":" + strconv.Itoa(cfg.Port),
		Handler:        openAIHandler,
		ReadTimeout:    30 * time.Second,
		WriteTimeout:   120 * time.Second,
		IdleTimeout:    120 * time.Second,
		MaxHeaderBytes: 1 << 20,
	}
	if err := server.ListenAndServe(); err != nil {
		log.Fatal(err)
	}
}
Troubleshooting Common Issues
Debugging Connection Problems
If you’re experiencing connection issues, set the log level to DEBUG:
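export LOG_LEVEL=DEBUG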
Common Error Messages
- Failed to create MCP client: Ensure the tool path is correct and the tool is executable
- Failed to load GCP config: Check your GCP environment variables
- Error in LLM request: Verify your GCP credentials and project access
To verify tools are registered correctly, look for log messages like:
INFO server0 Registering command=../bin/GlobTool
INFO server1 Registering command=../bin/GrepTool
2.3 - How to Configure the cliGCP Command Line Interface
Customize the cliGCP tool with environment variables and command-line options
This guide shows you how to configure and customize the cliGCP command line interface for various use cases.
Prerequisites
- A working installation of gomcptest
- Basic familiarity with the cliGCP tool from the tutorial
- Understanding of environment variables and configuration
Command Line Arguments
The cliGCP tool accepts the following command line arguments:
# Specify the MCP servers to use (required)
-mcpservers "tool1;tool2;tool3"
# Example with tool arguments
./cliGCP -mcpservers "./GlobTool;./GrepTool;./dispatch_agent -glob-path ./GlobTool -grep-path ./GrepTool -ls-path ./LS -view-path ./View;./Bash"
Environment Variables Configuration
GCP Configuration
Configure the Google Cloud Platform integration with these environment variables:
# GCP Project ID (required)
export GCP_PROJECT=your-gcp-project-id
# GCP Region (default: us-central1)
export GCP_REGION=us-central1
# Comma-separated list of Gemini models (required)
export GEMINI_MODELS=gemini-1.5-pro,gemini-2.0-flash
# Directory to store images (required for image generation)
export IMAGE_DIR=/path/to/image/directory
Advanced Configuration
You can customize the behavior of the cliGCP tool with these additional environment variables:
# Set a custom system instruction for the model
export SYSTEM_INSTRUCTION="You are a helpful assistant specialized in Go programming."
# Adjust the model's temperature (0.0-1.0, default is 0.2)
# Lower values make output more deterministic, higher values more creative
export MODEL_TEMPERATURE=0.3
# Set a maximum token limit for responses
export MAX_OUTPUT_TOKENS=2048
Creating Shell Aliases
To simplify usage, create shell aliases in your .bashrc
or .zshrc
:
# Add to ~/.bashrc or ~/.zshrc
alias gpt='cd /path/to/gomcptest/bin && ./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./Bash;./Replace"'
# Create specialized aliases for different tasks
alias code-assistant='cd /path/to/gomcptest/bin && GCP_PROJECT=your-project GEMINI_MODELS=gemini-2.0-flash ./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./Bash;./Replace"'
alias security-scanner='cd /path/to/gomcptest/bin && SYSTEM_INSTRUCTION="You are a security expert focused on finding vulnerabilities in code" ./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./Bash"'
Customizing the System Instruction
To modify the default system instruction, edit the agent.go file:
// In host/cliGCP/cmd/agent.go
genaimodels[model].SystemInstruction = &genai.Content{
Role: "user",
Parts: []genai.Part{
genai.Text("You are a helpful agent with access to tools. " +
"Your job is to help the user by performing tasks using these tools. " +
"You should not make up information. " +
"If you don't know something, say so and explain what you would need to know to help. " +
"If not indication, use the current working directory which is " + cwd),
},
}
Creating Task-Specific Configurations
For different use cases, you can create specialized configuration scripts:
Code Review Helper
Create a file called code-reviewer.sh
:
#!/bin/bash
export GCP_PROJECT=your-gcp-project-id
export GCP_REGION=us-central1
export GEMINI_MODELS=gemini-2.0-flash
export IMAGE_DIR=/tmp/images
export SYSTEM_INSTRUCTION="You are a code review expert. Analyze code for bugs, security issues, and areas for improvement. Focus on providing constructive feedback and detailed explanations."
cd /path/to/gomcptest/bin
./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./Bash"
Make it executable:
chmod +x code-reviewer.sh
Documentation Generator
Create a file called doc-generator.sh
:
#!/bin/bash
export GCP_PROJECT=your-gcp-project-id
export GCP_REGION=us-central1
export GEMINI_MODELS=gemini-2.0-flash
export IMAGE_DIR=/tmp/images
export SYSTEM_INSTRUCTION="You are a documentation specialist. Your task is to help create clear, comprehensive documentation for code. Analyze code structure and create appropriate documentation following best practices."
cd /path/to/gomcptest/bin
./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./Bash;./Replace"
Configuring dispatch_agent
When using the dispatch_agent tool, you can configure its behavior with additional arguments:
./cliGCP -mcpservers "./GlobTool;./GrepTool;./LS;./View;./dispatch_agent -glob-path ./GlobTool -grep-path ./GrepTool -ls-path ./LS -view-path ./View -timeout 30s;./Bash;./Replace"
You can create specialized tool combinations for different tasks:
# Web development toolset
./cliGCP -mcpservers "./GlobTool -include '*.{html,css,js}';./GrepTool;./LS;./View;./Bash;./Replace"
# Go development toolset
./cliGCP -mcpservers "./GlobTool -include '*.go';./GrepTool;./LS;./View;./Bash;./Replace"
Troubleshooting Common Issues
Model Connection Issues
If you’re having trouble connecting to the Gemini model:
- Verify your GCP credentials:
gcloud auth application-default print-access-token
- Check that the Vertex AI API is enabled:
gcloud services list --enabled | grep aiplatform
- Verify your project has access to the models you’re requesting
If tools are failing to execute:
- Ensure the tool paths are correct
- Verify the tools are executable
- Check for permission issues in the directories you’re accessing
For better performance:
- Use more specific tool patterns to reduce search scope
- Consider creating specialized agents for different tasks
- Set a lower temperature for more deterministic responses
2.4 - How to Query OpenAI Server with Tool Events
Learn how to programmatically query the OpenAI-compatible server and monitor tool execution events using curl, Python, or shell commands
This guide shows you how to programmatically interact with the gomcptest OpenAI-compatible server to execute tools and monitor their execution events in real-time.
Prerequisites
- A running gomcptest OpenAI server with tools registered
- Basic familiarity with HTTP requests and Server-Sent Events
- curl command-line tool or Python with the requests library
The gomcptest server supports two streaming modes:
- Standard Mode (default): Only chat completion chunks (OpenAI compatible)
- Enhanced Mode: Includes tool execution events (tool_call and tool_response)
Tool events are only visible when the server is configured with enhanced streaming or when using the AgentFlow web UI.
Method 1: Using curl with Streaming
First, let's execute a simple tool like sleep with streaming enabled:
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-N \
-d '{
"model": "gemini-2.0-flash",
"messages": [
{
"role": "user",
"content": "Please use the sleep tool to pause for 2 seconds, then tell me you are done"
}
],
"stream": true
}'
Expected Output (Standard Mode)
In standard mode, you’ll see only chat completion chunks:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067200,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"role":"assistant","content":"I'll"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067200,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"content":" use the sleep tool to pause for 2 seconds."},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067200,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"content":" Done! I have completed the 2-second pause."},"finish_reason":"stop"}]}
data: [DONE]
Here’s a Python script that demonstrates how to capture and parse tool events:
import requests
import json
import time

def stream_with_tool_monitoring(prompt, model="gemini-2.0-flash"):
    """
    Send a streaming request and monitor tool events
    """
    url = "http://localhost:8080/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "text/event-stream"
    }

    print(f"🚀 Sending request: {prompt}")
    print("=" * 60)

    with requests.post(url, json=payload, headers=headers, stream=True) as response:
        if response.status_code != 200:
            print(f"❌ Error: {response.status_code} - {response.text}")
            return

        content_buffer = ""
        tool_calls = {}

        for line in response.iter_lines(decode_unicode=True):
            if line.startswith("data: "):
                data = line[6:]  # Remove "data: " prefix

                if data == "[DONE]":
                    print("\n✅ Stream completed")
                    break

                try:
                    event = json.loads(data)

                    # Handle different event types
                    if event.get("event_type") == "tool_call":
                        tool_info = event.get("tool_call", {})
                        tool_id = tool_info.get("id")
                        tool_name = tool_info.get("name")
                        tool_args = tool_info.get("arguments", {})

                        print(f"🔧 Tool Call: {tool_name}")
                        print(f"   ID: {tool_id}")
                        print(f"   Arguments: {json.dumps(tool_args, indent=2)}")

                        tool_calls[tool_id] = {
                            "name": tool_name,
                            "args": tool_args,
                            "start_time": time.time()
                        }

                    elif event.get("event_type") == "tool_response":
                        tool_info = event.get("tool_response", {})
                        tool_id = tool_info.get("id")
                        tool_name = tool_info.get("name")
                        response_data = tool_info.get("response")
                        error = tool_info.get("error")

                        if tool_id in tool_calls:
                            duration = time.time() - tool_calls[tool_id]["start_time"]
                            print(f"📥 Tool Response: {tool_name} (took {duration:.2f}s)")
                        else:
                            print(f"📥 Tool Response: {tool_name}")
                        print(f"   ID: {tool_id}")

                        if error:
                            print(f"   ❌ Error: {error}")
                        else:
                            print(f"   ✅ Response: {response_data}")

                    elif event.get("choices") and event["choices"][0].get("delta"):
                        # Handle chat completion chunks
                        delta = event["choices"][0]["delta"]
                        if delta.get("content"):
                            content_buffer += delta["content"]
                            print(delta["content"], end="", flush=True)

                except json.JSONDecodeError:
                    continue

        if content_buffer:
            print(f"\n💬 Complete Response: {content_buffer}")

# Example usage
if __name__ == "__main__":
    # Test with sleep tool
    stream_with_tool_monitoring(
        "Use the sleep tool to pause for 3 seconds, then tell me the current time"
    )

    print("\n" + "=" * 60)

    # Test with file operations
    stream_with_tool_monitoring(
        "List the files in the current directory using the LS tool"
    )
The sleep tool is perfect for demonstrating tool events because it has a measurable duration:
# Test sleep tool with timing
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-N \
-d '{
"model": "gemini-2.0-flash",
"messages": [
{
"role": "user",
"content": "Please use the sleep tool to pause for exactly 5 seconds and confirm when complete"
}
],
"stream": true
}' | while IFS= read -r line; do
echo "$(date '+%H:%M:%S') - $line"
done
This will show timestamps for each event, helping you verify the tool execution timing.
Method 4: Advanced Event Filtering with jq
If you have jq installed, you can filter and format the events:
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-N \
-d '{
"model": "gemini-2.0-flash",
"messages": [{"role": "user", "content": "Use the sleep tool for 3 seconds"}],
"stream": true
}' | grep "^data: " | sed 's/^data: //' | while read line; do
if [ "$line" != "[DONE]" ]; then
echo "$line" | jq -r '
if .event_type == "tool_call" then
"🔧 TOOL CALL: " + .tool_call.name + " with args: " + (.tool_call.arguments | tostring)
elif .event_type == "tool_response" then
"📥 TOOL RESPONSE: " + .tool_response.name + " -> " + (.tool_response.response | tostring)
elif .choices and .choices[0].delta.content then
"💬 CONTENT: " + .choices[0].delta.content
else
"📊 OTHER: " + (.object // "unknown")
end'
fi
done
When you execute a tool, you should see this event sequence:
Tool Call Event: AI decides to use a tool
{
"event_type": "tool_call",
"object": "tool.call",
"tool_call": {
"id": "call_abc123",
"name": "sleep",
"arguments": {"seconds": 3}
}
}
Tool Response Event: Tool execution completes
{
"event_type": "tool_response",
"object": "tool.response",
"tool_response": {
"id": "call_abc123",
"name": "sleep",
"response": "Slept for 3 seconds"
}
}
Chat Completion Chunks: AI generates response based on tool result
Troubleshooting
If you’re not seeing tool events in your stream:
- Check Server Configuration: Tool events require the -withAllEvents flag to be enabled
- Verify Tool Registration: Ensure tools are properly registered with the server
- Test with AgentFlow UI: The web UI at http://localhost:8080/ui always shows tool events
If the AI isn’t using your requested tool:
- Be Explicit: Clearly request the specific tool by name
- Check Tool Availability: Use the /v1/tools endpoint to verify tool registration
- Use Simple Examples: Start with basic tools like sleep or LS
# Check which tools are registered
curl http://localhost:8080/v1/tools | jq '.tools[] | .name'
Next Steps
This approach gives you programmatic access to the same tool execution visibility that the AgentFlow web interface provides, enabling automation, monitoring, and integration with other systems.
2.5 - How to Use the OpenAI Server with big-AGI
Configure the gomcptest OpenAI-compatible server as a backend for big-AGI
This guide shows you how to set up and configure the gomcptest OpenAI-compatible server to work with big-AGI, a popular open-source web client for AI assistants.
Prerequisites
- A running gomcptest OpenAI-compatible server (see the server tutorial)
- Node.js and npm installed to run big-AGI
Why Use big-AGI with gomcptest?
big-AGI provides a polished, feature-rich web interface for interacting with AI models. By connecting it to the gomcptest OpenAI-compatible server, you get:
- A professional web interface for your AI interactions
- Support for tools/function calling
- Conversation history management
- Persona management
- Image generation capabilities
- Multiple user support
Setting Up big-AGI
Clone the big-AGI repository:
git clone https://github.com/enricoros/big-agi.git
cd big-agi
Install dependencies:
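npm install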
Create a .env.local file for configuration:
cp .env.example .env.local
Edit the .env.local file to configure your gomcptest server connection:
# big-AGI configuration
# Your gomcptest OpenAI-compatible server URL
OPENAI_API_HOST=http://localhost:8080
# This can be any string since the gomcptest server doesn't use API keys
OPENAI_API_KEY=gomcptest-local-server
# Set this to true to enable the custom server
OPENAI_API_ENABLE_CUSTOM_PROVIDER=true
Start big-AGI:
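npm run dev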
Open your browser and navigate to http://localhost:3000 to access the big-AGI interface.
Configuring big-AGI to Use Your Models
The gomcptest OpenAI-compatible server exposes Google Cloud models through an OpenAI-compatible API. In big-AGI, you’ll need to configure the models:
- Open big-AGI in your browser
- Click on the Settings icon (gear) in the top right
- Go to the Models tab
- Under “OpenAI Models”:
  - Click “Add Models”
  - Add your models by ID (e.g., gemini-1.5-pro, gemini-2.0-flash)
  - Set context length appropriately (8K-32K depending on the model)
  - Set function calling capability to true for models that support it
To use the MCP tools through big-AGI’s function calling interface:
- In big-AGI, click on the Settings icon
- Go to the Advanced tab
- Enable “Function Calling” under the “Experimental Features” section
- In a new chat, click on the “Functions” tab (plugin icon) in the chat interface
- The available tools from your gomcptest server should be listed
Configuring CORS for big-AGI
If you’re running big-AGI on a different domain or port than your gomcptest server, you’ll need to enable CORS on the server side. Edit the OpenAI server configuration:
Create or edit a CORS middleware for the OpenAI server:
// CORS middleware with specific origin allowance
func corsMiddleware(next http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Allow requests from big-AGI origin
w.Header().Set("Access-Control-Allow-Origin", "http://localhost:3000")
w.Header().Set("Access-Control-Allow-Methods", "GET, POST, OPTIONS")
w.Header().Set("Access-Control-Allow-Headers", "Content-Type, Authorization")
if r.Method == "OPTIONS" {
w.WriteHeader(http.StatusOK)
return
}
next.ServeHTTP(w, r)
})
}
Apply this middleware to your server routes
Troubleshooting Common Issues
Model Not Found
If big-AGI reports that models cannot be found:
- Verify your gomcptest server is running and accessible
- Check the server logs to ensure models are properly registered
- Make sure the model IDs in big-AGI match exactly the ones provided by your gomcptest server
Function Calling Not Working
If tools aren’t working properly:
- Ensure the tools are properly registered in your gomcptest server
- Check that function calling is enabled in big-AGI settings
- Verify the model you’re using supports function calling
Connection Issues
If big-AGI can’t connect to your server:
- Verify the OPENAI_API_HOST value in your .env.local file
- Check for CORS issues in your browser's developer console
- Ensure your server is running and accessible from the browser
Production Deployment
For production use, consider:
Securing your API:
- Add proper authentication to your gomcptest OpenAI server
- Update the OPENAI_API_KEY in big-AGI accordingly
Deploying big-AGI:
Setting up HTTPS:
- For production, both big-AGI and your gomcptest server should use HTTPS
- Consider using a reverse proxy like Nginx with Let’s Encrypt certificates
Example: Basic Chat Interface
Once everything is set up, you can use big-AGI’s interface to interact with your AI models:
- Start a new chat
- Select your model from the model dropdown (e.g., gemini-1.5-pro)
- Enable function calling if you want to use tools
- Begin chatting with your AI assistant, powered by gomcptest
The big-AGI interface provides a much richer experience than a command-line interface, with features like conversation history, markdown rendering, code highlighting, and more.
3 - Reference
Technical reference documentation for gomcptest components and tools
Reference guides are technical descriptions of the machinery and how to operate it. They describe how things work in detail and are accurate and complete.
This section provides detailed technical documentation on gomcptest’s components, APIs, parameters, and tools.
3.1 - Tools Reference
Comprehensive reference of all available MCP-compatible tools
This reference guide documents all available MCP-compatible tools in the gomcptest project, their parameters, and response formats.
Bash
Executes bash commands in a persistent shell session.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| command | string | Yes | The command to execute |
| timeout | number | No | Timeout in milliseconds (max 600000) |
Response
The tool returns the command output as a string.
Banned Commands
For security reasons, the following commands are banned:
alias, curl, curlie, wget, axel, aria2c, nc, telnet, lynx, w3m, links, httpie, xh, http-prompt, chrome, firefox, safari
Edit
Modifies file content by replacing specified text.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Absolute path to the file to modify |
| old_string | string | Yes | Text to replace |
| new_string | string | Yes | Replacement text |
Response
Confirmation message with the updated content.
GlobTool
Finds files matching glob patterns with metadata.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| pattern | string | Yes | Glob pattern to match files against |
| path | string | No | Directory to search in (default: current directory) |
| exclude | string | No | Glob pattern to exclude from results |
| limit | number | No | Maximum number of results to return |
| absolute | boolean | No | Return absolute paths instead of relative |
Response
A list of matching files with metadata including path, size, modification time, and permissions.
GrepTool
Searches file contents using regular expressions.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| pattern | string | Yes | Regular expression pattern to search for |
| path | string | No | Directory to search in (default: current directory) |
| include | string | No | File pattern to include in the search |
Response
A list of matches with file paths, line numbers, and matched content.
LS
Lists files and directories in a given path.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| path | string | Yes | Absolute path to the directory to list |
| ignore | array | No | List of glob patterns to ignore |
Response
A list of files and directories with metadata.
Replace
Completely replaces a file’s contents.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Absolute path to the file to write |
| content | string | Yes | Content to write to the file |
Response
Confirmation message with the content written.
View
Reads file contents with optional line range.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Absolute path to the file to read |
| offset | number | No | Line number to start reading from |
| limit | number | No | Number of lines to read |
Response
The file content with line numbers in cat -n format.
dispatch_agent
Launches a new agent with access to specific tools.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The task for the agent to perform |
Response
The result of the agent’s task execution.
imagen
Generates and manipulates images using Google’s Imagen API.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | Description of the image to generate |
| aspectRatio | string | No | Aspect ratio for the image (default: “1:1”) |
| safetyFilterLevel | string | No | Safety filter level (default: “block_some”) |
| personGeneration | string | No | Person generation policy (default: “dont_allow”) |
Response
Returns a JSON object with the generated image path and metadata.
duckdbserver
Provides data processing capabilities using DuckDB.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | SQL query to execute |
| database | string | No | Database file path (default: in-memory) |
Response
Query results in JSON format.
imagen_edit
Edits images using Google’s Gemini 2.0 Flash model with natural language instructions.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| base64_image | string | Yes | Base64 encoded image data (without data:image/… prefix) |
| mime_type | string | Yes | MIME type of the image (e.g., “image/jpeg”, “image/png”) |
| edit_instruction | string | Yes | Text describing the edit to perform |
| temperature | number | No | Randomness in generation (0.0-2.0, default: 1.0) |
| top_p | number | No | Nucleus sampling parameter (0.0-1.0, default: 0.95) |
Response
Returns edited image information including file path and HTTP URL.
plantuml
Generates PlantUML diagram URLs from plain text diagrams with syntax validation and error correction.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| plantuml_code | string | Yes | PlantUML diagram code in plain text format |
| output_format | string | No | Output format: “svg” (default) or “png” |
Response
Returns URL pointing to PlantUML server for SVG/PNG rendering.
plantuml_check
Validates PlantUML file syntax using the official PlantUML processor.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| file_path | string | Yes | Path to PlantUML file (.puml, .plantuml, .pu) |
Response
Returns validation result with detailed error messages if syntax issues are found.
sleep
Pauses execution for a specified number of seconds (useful for testing and demonstrations).
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| seconds | number | Yes | Number of seconds to sleep |
Response
Confirmation message after sleep completion.
Most tools return JSON responses with the following structure:
{
"result": "...", // String result or
"results": [...], // Array of results
"error": "..." // Error message if applicable
}
Error Handling
All tools follow a consistent error reporting format:
{
"error": "Error message",
"code": "ERROR_CODE"
}
Common error codes include:
- INVALID_PARAMS: Parameters are missing or invalid
- EXECUTION_ERROR: Error executing the requested operation
- PERMISSION_DENIED: Permission issues
- TIMEOUT: Operation timed out
3.2 - OpenAI-Compatible Server Reference
Technical documentation of the server’s architecture, API endpoints, and configuration
This reference guide provides detailed technical documentation on the OpenAI-compatible server’s architecture, API endpoints, configuration options, and integration details with Vertex AI.
Overview
The OpenAI-compatible server is a core component of the gomcptest system. It implements an API surface compatible with the OpenAI Chat Completions API while connecting to Google’s Vertex AI for model inference. The server acts as a bridge between clients (like the modern AgentFlow web UI) and the underlying LLM models, handling session management, function calling, and tool execution.
AgentFlow Web UI
The server includes AgentFlow, a modern web-based interface that is embedded directly in the openaiserver binary. It provides:
- Mobile-First Design: Optimized for iPhone and mobile devices
- Real-time Streaming: Server-sent events for immediate response display
- Professional Styling: Clean, modern interface with accessibility features
- Conversation Management: Persistent conversation history
- Attachment Support: File uploads including PDF support
- Embedded Architecture: Built into the main server binary for easy deployment
UI Access
Access AgentFlow by starting the openaiserver and navigating to the /ui endpoint:
./bin/openaiserver
# AgentFlow available at: http://localhost:8080/ui
Development Note
The host/openaiserver/simpleui directory contains a standalone UI server used exclusively for development and testing. Production users should use the embedded UI via the /ui endpoint.
API Endpoints
POST /v1/chat/completions
The primary endpoint that mimics the OpenAI Chat Completions API.
Request
{
"model": "gemini-pro",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Hello, world!"}
],
"stream": true,
"max_tokens": 1024,
"temperature": 0.7,
"functions": [
{
"name": "get_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
}
},
"required": ["location"]
}
}
]
}
Response (non-streamed)
{
"id": "chatcmpl-123456789",
"object": "chat.completion",
"created": 1677858242,
"model": "gemini-pro",
"choices": [
{
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop",
"index": 0
}
],
"usage": {
"prompt_tokens": 13,
"completion_tokens": 7,
"total_tokens": 20
}
}
Response (streamed)
When stream is set to true, the server returns a stream of SSE (Server-Sent Events) with partial responses:
data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1677858242,"model":"gemini-pro","choices":[{"delta":{"role":"assistant"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1677858242,"model":"gemini-pro","choices":[{"delta":{"content":"Hello"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1677858242,"model":"gemini-pro","choices":[{"delta":{"content":"!"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1677858242,"model":"gemini-pro","choices":[{"delta":{"content":" How"},"index":0,"finish_reason":null}]}
data: {"id":"chatcmpl-123456789","object":"chat.completion.chunk","created":1677858242,"model":"gemini-pro","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}
data: [DONE]
Supported Features
Models
The server supports the following Vertex AI models:
- gemini-1.5-pro
- gemini-2.0-flash
- gemini-pro-vision (legacy)
The server supports Google’s native Vertex AI tools:
- Code Execution: Enables the model to execute code as part of generation
- Google Search: Specialized search tool powered by Google
- Google Search Retrieval: Advanced retrieval tool with Google search backend
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | gemini-pro | The model to use for generating completions |
| messages | array | Required | An array of messages in the conversation |
| stream | boolean | false | Whether to stream the response or not |
| max_tokens | integer | 1024 | Maximum number of tokens to generate |
| temperature | number | 0.7 | Sampling temperature (0-1) |
| functions | array | [] | Function definitions the model can call |
| function_call | string or object | auto | Controls function calling behavior |
Function Calling
The server supports function calling similar to the OpenAI API. When the model identifies that a function should be called, the server:
- Parses the function call parameters
- Locates the appropriate MCP tool
- Executes the tool with the provided parameters
- Returns the result to the model for further processing
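For illustration, if the model decides to call the get_weather function defined in the request example above, it emits an assistant message carrying a function_call payload (the same shape used elsewhere in this documentation); the tool result is then fed back to the model before the final answer is produced. The values below are hypothetical:
{
  "role": "assistant",
  "content": null,
  "function_call": {
    "name": "get_weather",
    "arguments": "{\"location\": \"San Francisco, CA\"}"
  }
}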
Architecture
The server consists of these key components:
HTTP Server
A standard Go HTTP server that handles incoming requests and routes them to the appropriate handlers.
Session Manager
Maintains chat history and context for ongoing conversations. Ensures that the model has necessary context when generating responses.
Vertex AI Client
Communicates with Google’s Vertex AI API to:
- Send prompt templates to the model
- Receive completions from the model
- Stream partial responses back to the client
Tool Manager
Manages the available MCP tools and handles:
- Tool registration and discovery
- Parameter validation
- Tool execution
- Response processing
Response Streamer
Handles streaming responses to clients in SSE format, ensuring low latency and progressive rendering.
Configuration
The server can be configured using environment variables and command-line flags:
Command-Line Options
Flag | Description | Default |
---|---|---|
-mcpservers | Input string of MCP servers | - |
-withAllEvents | Include all events (tool calls, tool responses) in stream output, not just content chunks | false |
⚠️ Important for Testing: The -withAllEvents flag is mandatory for testing tool event flows in development. It enables streaming of all tool execution events including tool calls and responses, which is essential for debugging and development. Without this flag, only standard chat completion responses are streamed.
Environment Variables
The server can be configured using environment variables:
Core Configuration
Variable | Description | Default |
---|---|---|
GCP_PROJECT | Google Cloud project ID | - |
GCP_REGION | Google Cloud region | us-central1 |
GEMINI_MODELS | Comma-separated list of available models | gemini-1.5-pro,gemini-2.0-flash |
PORT | HTTP server port | 8080 |
LOG_LEVEL | Logging level (DEBUG, INFO, WARN, ERROR) | INFO |
Vertex AI Tools Configuration
Variable | Description | Default |
---|---|---|
VERTEX_AI_CODE_EXECUTION | Enable Code Execution tool | false |
VERTEX_AI_GOOGLE_SEARCH | Enable Google Search tool | false |
VERTEX_AI_GOOGLE_SEARCH_RETRIEVAL | Enable Google Search Retrieval tool | false |
Legacy Configuration
Variable | Description | Default |
---|---|---|
GOOGLE_APPLICATION_CREDENTIALS | Path to Google Cloud credentials file | - |
GOOGLE_CLOUD_PROJECT | Legacy alias for GCP_PROJECT | - |
GOOGLE_CLOUD_LOCATION | Legacy alias for GCP_REGION | us-central1 |
Error Handling
The server implements consistent error handling with HTTP status codes:
Status Code | Description |
---|---|
400 | Bad Request - Invalid parameters or request format |
401 | Unauthorized - Missing or invalid authentication |
404 | Not Found - Model or endpoint not found |
429 | Too Many Requests - Rate limit exceeded |
500 | Internal Server Error - Server-side error |
503 | Service Unavailable - Vertex AI service unavailable |
Error responses follow this format:
{
"error": {
"message": "Detailed error message",
"type": "error_type",
"param": "parameter_name",
"code": "error_code"
}
}
Security Considerations
The server does not implement authentication or authorization by default. In production deployments, consider:
- Running behind a reverse proxy with authentication
- Using API keys or OAuth2
- Implementing rate limiting
- Setting up proper firewall rules
Examples
Basic Usage
export GCP_PROJECT="your-project-id"
export GCP_REGION="us-central1"
./bin/openaiserver
# Access AgentFlow UI at: http://localhost:8080/ui
Development with Full Event Streaming
export GCP_PROJECT="your-project-id"
export GCP_REGION="us-central1"
./bin/openaiserver -withAllEvents
# Access AgentFlow UI with full tool events at: http://localhost:8080/ui
With Vertex AI Native Tools
export GCP_PROJECT="your-project-id"
export VERTEX_AI_CODE_EXECUTION=true
export VERTEX_AI_GOOGLE_SEARCH=true
./bin/openaiserver
# AgentFlow UI with Vertex AI tools at: http://localhost:8080/ui
Development UI Server (For Developers Only)
# Terminal 1: Start API server
export GCP_PROJECT="your-project-id"
./bin/openaiserver -port=4000
# Terminal 2: Start development UI server
cd host/openaiserver/simpleui
go run . -ui-port=8081 -api-url=http://localhost:4000
# Development UI at: http://localhost:8081
Note: The standalone UI server is for development purposes only. Production users should use the embedded UI via /ui
.
Client Connection
curl -X POST http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-2.0-flash",
"messages": [{"role": "user", "content": "Hello, world!"}]
}'
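To receive the reply as an SSE stream instead, set stream to true and disable curl’s output buffering with -N:
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.0-flash",
    "stream": true,
    "messages": [{"role": "user", "content": "Hello, world!"}]
  }'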
Limitations
- Single chat session support only
- No persistent storage of conversations
- Limited authentication options
- Basic rate limiting
- Limited model parameter controls
Advanced Usage
Tools are automatically registered when the server starts. To register custom tools:
- Place executable files in the MCP_TOOLS_PATH directory
- Ensure they follow the MCP protocol
- Restart the server
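For example, assuming a custom tool binary named MyCustomTool (hypothetical) and the default tools directory:
export MCP_TOOLS_PATH="./tools"
cp ./bin/MyCustomTool ./tools/
./bin/openaiserver    # restart so the new tool is discovered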
Streaming with Function Calls
When using function calling with streaming, the stream will pause during tool execution and resume with the tool results included in the context.
3.3 - cliGCP Reference (Deprecated)
Detailed reference of the cliGCP command-line interface (deprecated in favor of AgentFlow UI)
⚠️ DEPRECATED: The cliGCP command-line interface is deprecated in favor of the modern AgentFlow web UI. New users should use the AgentFlow UI instead. This documentation is maintained for legacy users.
Overview
The cliGCP (Command Line Interface for Google Cloud Platform) is a legacy command-line tool that provides a chat interface similar to tools like “Claude Code” or “ChatGPT”. It connects to an OpenAI-compatible server and allows users to interact with LLMs and MCP tools through a conversational interface.
For new projects, we recommend using the AgentFlow web UI which provides a modern, mobile-optimized interface with better features and user experience.
Command Structure
Basic Usage
Flags
Flag | Description | Default |
---|---|---|
-mcpservers | Comma-separated list of MCP tool paths | "" |
-server | URL of the OpenAI-compatible server | "http://localhost:8080" |
-model | LLM model to use | "gemini-pro" |
-prompt | Initial system prompt | "You are a helpful assistant." |
-temp | Temperature setting for model responses | 0.7 |
-maxtokens | Maximum number of tokens in responses | 1024 |
-history | File path to store/load chat history | "" |
-verbose | Enable verbose logging | false |
Example
./bin/cliGCP -mcpservers "./bin/Bash;./bin/View;./bin/GlobTool;./bin/GrepTool;./bin/LS;./bin/Edit;./bin/Replace;./bin/dispatch_agent" -server "http://localhost:8080" -model "gemini-pro" -prompt "You are a helpful command-line assistant."
Components
Chat Interface
The chat interface provides:
- Text-based input for user messages
- Markdown rendering of AI responses
- Real-time streaming of responses
- Input history and navigation
- Multi-line input support
Tool Manager
The tool manager:
- Loads and initializes MCP tools
- Registers tools with the OpenAI-compatible server
- Routes function calls to appropriate tools
- Processes tool results
Session Manager
The session manager:
- Maintains chat history within the session
- Handles context windowing for long conversations
- Optionally persists conversations to disk
- Provides conversation resume functionality
Interaction Patterns
Basic Chat
The most common interaction pattern is a simple turn-based chat:
- User enters a message
- Model generates and streams a response
- Chat history is updated
- User enters the next message
Function Calling
When the model determines a function should be called:
- User enters a message requesting an action (e.g., “List files in /tmp”)
- Model analyzes the request and generates a function call
- cliGCP intercepts the function call and routes it to the appropriate tool
- Tool executes and returns results
- Results are injected back into the model’s context
- Model continues generating a response that incorporates the tool results
- The complete response is shown to the user
Multi-turn Function Calling
For complex tasks, the model may make multiple function calls:
- User requests a complex task (e.g., “Find all Python files containing ’error’”)
- Model makes a function call to list directories
- Tool returns directory listing
- Model makes additional function calls to search file contents
- Each tool result is returned to the model
- Model synthesizes the information and responds to the user
Technical Details
Messages between cliGCP and the server follow the OpenAI Chat API format:
{
"role": "user"|"assistant"|"system",
"content": "Message text"
}
Function calls use this format:
{
"role": "assistant",
"content": null,
"function_call": {
"name": "function_name",
"arguments": "{\"arg1\":\"value1\",\"arg2\":\"value2\"}"
}
}
Tools are registered with the server using JSON Schema:
{
"name": "tool_name",
"description": "Tool description",
"parameters": {
"type": "object",
"properties": {
"param1": {
"type": "string",
"description": "Parameter description"
}
},
"required": ["param1"]
}
}
Error Handling
The CLI implements robust error handling for:
- Connection issues with the server
- Tool execution failures
- Model errors
- Input validation
Error messages are displayed to the user with context and possible solutions.
Configuration
Environment Variables
Variable | Description | Default |
---|---|---|
OPENAI_API_URL | URL of the OpenAI-compatible server | http://localhost:8080 |
OPENAI_API_KEY | API key for authentication (if required) | "" |
MCP_TOOLS_PATH | Path to MCP tools (overridden by -mcpservers) | "./tools" |
DEFAULT_MODEL | Default model to use | "gemini-pro" |
SYSTEM_PROMPT | Default system prompt | "You are a helpful assistant." |
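For example, the same settings can be supplied through the environment instead of flags (the values shown are illustrative):
export OPENAI_API_URL="http://localhost:8080"
export DEFAULT_MODEL="gemini-pro"
export MCP_TOOLS_PATH="./tools"
./bin/cliGCP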
Configuration File
You can create a ~/.cligcp.json
configuration file with these settings:
{
"server": "http://localhost:8080",
"model": "gemini-pro",
"prompt": "You are a helpful assistant.",
"temperature": 0.7,
"max_tokens": 1024,
"tools": [
"./bin/Bash",
"./bin/View",
"./bin/GlobTool"
]
}
Advanced Usage
Persistent History
To save and load chat history:
./bin/cliGCP -history ./chat_history.json
Custom System Prompt
To set a specific system prompt:
./bin/cliGCP -prompt "You are a Linux command-line expert that helps users with shell commands and filesystem operations."
Combining with Shell Scripts
You can use cliGCP in shell scripts by piping input and capturing output:
echo "Explain how to find large files in Linux" | ./bin/cliGCP -noninteractive
Limitations
- Single conversation per instance
- Limited rendering capabilities for complex markdown
- No built-in authentication management
- Limited offline functionality
- No multi-modal input support (e.g., images)
Troubleshooting
Common Issues
Issue | Possible Solution |
---|---|
Connection refused | Ensure the OpenAI server is running |
Tool not found | Check tool paths and permissions |
Out of memory | Reduce history size or split conversation |
Slow responses | Check network connection and server load |
Diagnostic Mode
Run with the -verbose flag to enable detailed logging:
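./bin/cliGCP -verbose    # add your usual flags, e.g. -mcpservers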
This will show all API requests, responses, and tool interactions, which can be helpful for debugging.
4 - Explanation
Understanding-oriented content for gomcptest architecture and concepts
Explanation documents discuss and clarify concepts to broaden the reader’s understanding of topics. They provide context and illuminate ideas.
This section provides deeper background on how gomcptest works, its architecture, and the concepts behind it. The explanations are organized from foundational concepts to specific implementations:
- Architecture - Overall system design and component relationships
- MCP Protocol - Core protocol concepts and communication patterns
- Event System - Real-time event architecture for transparency and monitoring
- MCP Tools - Tool implementation details and design patterns
- AgentFlow Implementation - Specific web interface implementation of the event system
4.1 - gomcptest Architecture
Deep dive into the system architecture and design decisions
This document explains the architecture of gomcptest, the design decisions behind it, and how the various components interact to create a custom Model Context Protocol (MCP) host.
The Big Picture
The gomcptest project implements a custom host that provides a Model Context Protocol (MCP) implementation. It’s designed to enable testing and experimentation with agentic systems without requiring direct integration with commercial LLM platforms.
The system is built with these key principles in mind:
- Modularity: Components are designed to be interchangeable
- Compatibility: The API mimics the OpenAI API for easy integration
- Extensibility: New tools can be easily added to the system
- Testing: The architecture facilitates testing of agentic applications
Core Components
Host (OpenAI Server)
The host is the central component, located in /host/openaiserver
. It presents an OpenAI-compatible API interface and connects to Google’s Vertex AI for model inference. This compatibility layer makes it easy to integrate with existing tools and libraries designed for OpenAI.
The host has several key responsibilities:
- API Compatibility: Implementing the OpenAI chat completions API
- Session Management: Maintaining chat history and context
- Model Integration: Connecting to Vertex AI’s Gemini models
- Function Calling: Orchestrating function/tool calls based on model outputs
- Response Streaming: Supporting streaming responses to the client
Unlike commercial implementations, this host is designed for local development and testing, emphasizing flexibility and observability over production-ready features like authentication or rate limiting.
Tools
The tools are standalone executables that implement the Model Context Protocol. Each tool is designed to perform a specific function, such as executing shell commands or manipulating files.
Tools follow a consistent pattern:
- They communicate via standard I/O using the MCP JSON-RPC protocol
- They expose a specific set of parameters
- They handle their own error conditions
- They return results in a standardized format
This approach allows tools to be:
- Developed independently
- Tested in isolation
- Used in different host environments
- Chained together in complex workflows
CLI
The CLI provides a user interface similar to tools like “Claude Code” or “OpenAI ChatGPT”. It connects to the OpenAI-compatible server and provides a way to interact with the LLM and tools through a conversational interface.
Data Flow
- The user sends a request to the CLI
- The CLI forwards this request to the OpenAI-compatible server
- The server sends the request to Vertex AI’s Gemini model
- The model may identify function calls in its response
- The server executes these function calls by invoking the appropriate MCP tools
- The results are provided back to the model to continue its response
- The final response is streamed back to the CLI and presented to the user
Design Decisions Explained
Why OpenAI API Compatibility?
The OpenAI API has become a de facto standard in the LLM space. By implementing this interface, gomcptest can work with a wide variety of existing tools, libraries, and frontends with minimal adaptation.
Why Google Vertex AI?
Vertex AI provides access to Google’s Gemini models, which have strong function calling capabilities. The implementation could be extended to support other model providers as needed.
Why Standalone Tool Executables?
By implementing tools as standalone executables rather than library functions, we gain several advantages:
- Security through isolation
- Language agnosticism (tools can be written in any language)
- Ability to distribute tools separately from the host
- Easier testing and development
Why MCP?
The Model Context Protocol provides a standardized way for LLMs to interact with external tools. By adopting this protocol, gomcptest ensures compatibility with tools developed for other MCP-compatible hosts.
Limitations and Future Directions
The current implementation has several limitations:
- Single chat session per instance
- Limited support for authentication and authorization
- No persistence of chat history between restarts
- No built-in support for rate limiting or quotas
Future enhancements could include:
- Support for multiple chat sessions
- Integration with additional model providers
- Enhanced security features
- Improved error handling and logging
- Performance optimizations for large-scale deployments
Conclusion
The gomcptest architecture represents a flexible and extensible approach to building custom MCP hosts. It prioritizes simplicity, modularity, and developer experience, making it an excellent platform for experimentation with agentic systems.
By understanding this architecture, developers can effectively utilize the system, extend it with new tools, and potentially adapt it for their specific needs.
4.2 - Understanding the Model Context Protocol (MCP)
Exploration of what MCP is, how it works, and design decisions behind it
This document explores the Model Context Protocol (MCP), how it works, the design decisions behind it, and how it compares to alternative approaches for LLM tool integration.
What is the Model Context Protocol?
The Model Context Protocol (MCP) is a standardized communication protocol that enables Large Language Models (LLMs) to interact with external tools and capabilities. It defines a structured way for models to request information or take actions in the real world, and for tools to provide responses back to the model.
MCP is designed to solve the problem of extending LLMs beyond their training data by giving them access to:
- Current information (e.g., via web search)
- Computational capabilities (e.g., calculators, code execution)
- External systems (e.g., databases, APIs)
- User environment (e.g., file system, terminal)
How MCP Works
At its core, MCP is a protocol based on JSON-RPC that enables bidirectional communication between LLMs and tools. The basic workflow is:
- The LLM generates a call to a tool with specific parameters
- The host intercepts this call and routes it to the appropriate tool
- The tool executes the requested action and returns the result
- The result is injected into the model’s context
- The model continues generating a response incorporating the new information
The protocol specifies:
- How tools declare their capabilities and parameters
- How the model requests tool actions
- How tools return results or errors
- How multiple tools can be combined
MCP in gomcptest
In gomcptest, MCP is implemented using a set of independent executables that communicate over standard I/O. This approach has several advantages:
- Language-agnostic: Tools can be written in any programming language
- Process isolation: Each tool runs in its own process for security and stability
- Compatibility: The protocol works with various LLM providers
- Extensibility: New tools can be easily added to the system
Each tool in gomcptest follows a consistent pattern:
- It receives a JSON request on stdin
- It parses the parameters and performs its action
- It formats the result as JSON and returns it on stdout
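As an illustration, a tool can be exercised by hand in the same way the host drives it (the tool name and parameter below are hypothetical; the request and response formats are specified in the next section):
echo '{"param1": "value1"}' | ./bin/SomeTool
# {"result": "..."}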
The Protocol Specification
The core MCP protocol in gomcptest follows this format:
Tool Registration
Tools register themselves with a schema that defines their capabilities:
{
"name": "ToolName",
"description": "Description of what the tool does",
"parameters": {
"type": "object",
"properties": {
"param1": {
"type": "string",
"description": "Description of parameter 1"
},
"param2": {
"type": "number",
"description": "Description of parameter 2"
}
},
"required": ["param1"]
}
}
Function Call Request
When a model wants to use a tool, it generates a function call like:
{
"name": "ToolName",
"params": {
"param1": "value1",
"param2": 42
}
}
Function Call Response
The tool executes the requested action and returns:
{
"result": "Output of the tool's execution"
}
Or, in case of an error:
{
"error": {
"message": "Error message",
"code": "ERROR_CODE"
}
}
Design Decisions in MCP
Several key design decisions shape the MCP implementation in gomcptest:
Standard I/O Communication
By using stdin/stdout for communication, tools can be written in any language that can read from stdin and write to stdout. This makes it easy to integrate existing utilities and libraries.
JSON Schema Tool Definitions
Using JSON Schema for tool definitions provides a clear contract between the model and the tools. It enables:
- Validation of parameters
- Documentation of capabilities
- Potential for automatic code generation
Stateless Design
Tools are designed to be stateless, with each invocation being independent. This simplifies the protocol and makes tools easier to reason about and test.
Pass-through Authentication
The protocol doesn’t handle authentication directly; instead, it relies on the host to manage permissions and authentication. This separation of concerns keeps the protocol simple.
Comparison with Alternatives
vs. OpenAI Function Calling
MCP is similar to OpenAI’s function calling feature but with these key differences:
- MCP is designed to be provider-agnostic
- MCP tools run as separate processes
- MCP provides more detailed error handling
vs. LangChain
Compared to LangChain:
- MCP is a lower-level protocol rather than a framework
- MCP focuses on interoperability rather than abstraction
- MCP allows for stronger process isolation
vs. Agent Protocols
Other agent protocols often focus on higher-level concepts like goals and planning, while MCP focuses specifically on the mechanics of tool invocation.
Future Directions
The MCP protocol in gomcptest could evolve in several ways:
- Enhanced security: More granular permissions and sandboxing
- Streaming responses: Support for tools that produce incremental results
- Bidirectional communication: Supporting tools that can request clarification
- Tool composition: First-class support for chaining tools together
- State management: Optional session state for tools that need to maintain context
Conclusion
The Model Context Protocol as implemented in gomcptest represents a pragmatic approach to extending LLM capabilities through external tools. Its simplicity, extensibility, and focus on interoperability make it a solid foundation for building and experimenting with agentic systems.
By understanding the protocol, developers can create new tools that seamlessly integrate with the system, unlocking new capabilities for LLM applications.
4.3 - Event System Architecture
Understanding the event-driven architecture that enables real-time tool interaction monitoring and streaming responses in gomcptest.
This document explains the foundational event system architecture in gomcptest that enables real-time monitoring of tool interactions, streaming responses, and transparent agentic workflows. This system is implemented across different components and interfaces, with AgentFlow being one specific implementation.
What is the Event System?
The event system in gomcptest provides real-time visibility into AI-tool interactions through a streaming event architecture. It captures and streams events that occur during tool execution, enabling transparency in how AI agents make decisions and use tools.
Important Implementation Detail: By default, the OpenAI-compatible server only streams standard chat completion responses to maintain API compatibility. Tool events (tool calls and responses) are only included in the stream when the withAllEvents flag is enabled in the server configuration. This design allows for:
- Standard Mode: OpenAI API compatibility with only chat completion chunks
- Enhanced Mode: Full event visibility including tool interactions when withAllEvents is true
Core Event Concepts
Event-Driven Transparency
Traditional AI interactions are often “black boxes” where users see only the final result. The gomcptest event system provides transparency by exposing:
- Tool Call Events: When the AI decides to use a tool, what tool it chooses, and what parameters it passes
- Tool Response Events: The results returned by tools, including success responses and error conditions
- Processing Events: Internal state changes and decision points during request processing
- Stream Events: Real-time updates as responses are generated
Event Types
The system defines several core event types:
Based on the actual implementation in /host/openaiserver/chatengine/vertexai/gemini/tool_events.go:
Chat Completion Events
- ChatCompletionStreamResponse: Standard OpenAI-compatible streaming chunks
- Always included in streams (default behavior)
- Contains incremental content as it’s generated
Event Availability
- Default Streaming: Only ChatCompletionStreamResponse events are sent
- Enhanced Streaming: When withAllEvents = true, includes all tool events
Event Architecture Patterns
Producer-Consumer Model
The event system follows a producer-consumer pattern:
- Event Producers: Components that generate events (chat engines, tool executors, stream processors)
- Event Channels: Transport mechanisms for event delivery (Go channels, HTTP streams)
- Event Consumers: Components that process and present events (web interfaces, logging systems, monitors)
Channel-Based Streaming
Events are delivered through channel-based streaming:
type StreamEvent interface {
IsStreamEvent() bool
}
// Event channel returned by streaming operations
func SendStreamingRequest() (<-chan StreamEvent, error) {
eventChan := make(chan StreamEvent, 100)
// Events are sent to the channel as they occur
go func() {
defer close(eventChan)
// Generate and send events
eventChan <- &ToolCallEvent{...}
eventChan <- &ToolResponseEvent{...}
eventChan <- &ContentEvent{...}
}()
return eventChan, nil
}
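A minimal consumer sketch for such a channel might look like the following (it assumes the event types shown elsewhere in this document and uses logging as a stand-in for real handling):
events, err := SendStreamingRequest()
if err != nil {
    log.Fatal(err)
}
for event := range events {
    switch e := event.(type) {
    case *ToolCallEvent:
        log.Printf("tool call: %+v", e)
    case *ToolResponseEvent:
        log.Printf("tool response: %+v", e)
    default:
        // content chunks and any other stream events
        log.Printf("stream event: %+v", e)
    }
}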
Each event carries standardized metadata:
- Timestamp: When the event occurred
- Event ID: Unique identifier for tracking
- Event Type: Category and specific type
- Context: Related session, request, or operation context
- Payload: Event-specific data
Event Flow Patterns
Request-Response with Events
Traditional request-response patterns are enhanced with event streaming:
- Request Initiated: System generates start events
- Processing Events: Intermediate steps generate progress events
- Tool Interactions: Tool calls and responses generate events
- Content Generation: Streaming content generates incremental events
- Completion: Final response and end events
Event Correlation
Events are correlated through:
- Session IDs: Grouping events within a single chat session
- Request IDs: Linking events to specific API requests
- Tool Call IDs: Connecting tool call and response events
- Parent-Child Relationships: Hierarchical event relationships
Implementation Patterns
Server-Sent Events (SSE) Implementation
The actual implementation in /host/openaiserver/chatengine/chat_completion_stream.go delivers events via Server-Sent Events with specific formatting:
HTTP Headers Set:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Transfer-Encoding: chunked
Event Format:
data: {"event_type":"tool_call","id":"chatcmpl-abc123","object":"tool.call","created":1704067200,"tool_call":{"id":"call_xyz789","name":"sleep","arguments":{"seconds":3}}}
data: {"event_type":"tool_response","id":"chatcmpl-abc123","object":"tool.response","created":1704067201,"tool_response":{"id":"call_xyz789","name":"sleep","response":"Slept for 3 seconds"}}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1704067202,"model":"gemini-2.0-flash","choices":[{"index":0,"delta":{"content":"I have completed the 3-second pause."},"finish_reason":"stop"}]}
data: [DONE]
Event Filtering Logic:
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Always sent - OpenAI compatible
jsonBytes, _ := json.Marshal(res)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
default:
// Tool events only sent when withAllEvents = true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
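When the server is started with the -withAllEvents flag, these tool events are delivered over the same /v1/chat/completions endpoint to any SSE-capable client; for example (the request body is illustrative):
curl -N -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemini-2.0-flash|Bash", "stream": true, "messages": [{"role": "user", "content": "List the files in /tmp"}]}'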
JSON-RPC Event Extensions
For programmatic interfaces, events extend the JSON-RPC protocol:
{
"jsonrpc": "2.0",
"method": "event",
"params": {
"event_type": "tool_call",
"event_data": {
"id": "call_123",
"name": "Edit",
"arguments": {...}
}
}
}
Event Processing Strategies
Real-Time Processing
Events are processed as they occur:
- Immediate Display: Critical events are shown immediately
- Progressive Enhancement: UI updates incrementally as events arrive
- Optimistic Updates: UI shows intended state before confirmation
Buffering and Batching
For performance optimization:
- Event Buffering: Collect multiple events before processing
- Batch Updates: Update UI with multiple events simultaneously
- Debouncing: Reduce update frequency for high-frequency events
Error Handling
Robust error handling in event processing:
- Graceful Degradation: Continue operation when non-critical events fail
- Event Recovery: Attempt to recover from event processing errors
- Fallback Modes: Alternative processing when event system fails
Event System Benefits
Observability
The event system provides comprehensive observability:
- Real-Time Monitoring: See what’s happening as it happens
- Historical Analysis: Review past interactions and decisions
- Performance Insights: Understand timing and bottlenecks
- Error Tracking: Identify and diagnose issues
User Experience
Enhanced user experience through transparency:
- Progress Indication: Users see incremental progress
- Decision Transparency: Understand AI reasoning process
- Interactive Feedback: Respond to tool executions in real-time
- Learning Opportunity: Understand how AI approaches problems
Development and Debugging
Valuable for development:
- Debugging Aid: Trace execution flow and identify issues
- Testing Support: Verify expected event sequences
- Performance Analysis: Identify optimization opportunities
- Integration Testing: Validate event handling across components
Integration Points
Chat Engines Integration
The actual integration in /host/openaiserver/chatengine/vertexai/gemini/ shows specific implementation patterns:
Tool Call Event Generation:
// When AI decides to use a tool
toolCallEvent := NewToolCallEvent(completionID, toolCallID, toolName, args)
eventChannel <- toolCallEvent
Tool Response Event Generation:
// After tool execution completes
toolResponseEvent := NewToolResponseEvent(completionID, toolCallID, toolName, response, err)
eventChannel <- toolResponseEvent
Stream Channel Management:
func (s *ChatSession) SendStreamingChatRequest(ctx context.Context, req chatengine.ChatCompletionRequest) (<-chan chatengine.StreamEvent, error) {
eventChannel := make(chan chatengine.StreamEvent, 100)
go func() {
defer close(eventChannel)
// Process model responses and emit events
for chunk := range modelStream {
// Emit tool events when detected
// Emit content events for streaming text
}
}()
return eventChannel, nil
}
Tool Integration
Tools integrate by:
- Emitting execution start events
- Providing progress updates for long-running operations
- Returning detailed response events
- Generating error events with diagnostic information
User Interfaces
Interfaces integrate by:
- Subscribing to event streams
- Processing events in real-time
- Updating UI based on event content
- Providing user controls for event display
Event System Implementations
The event system is a general architecture that can be implemented in various ways:
AgentFlow Web Interface
AgentFlow implements the event system through:
- Browser-based SSE consumption
- Real-time popup notifications for tool calls
- Progressive content updates
- Interactive event display controls
CLI Interfaces
Command-line interfaces can implement through:
- Terminal-based event display
- Progress indicators and status updates
- Structured logging of events
- Interactive prompts based on events
API Gateways
API gateways can implement through:
- Event forwarding to multiple consumers
- Event filtering and transformation
- Event persistence and replay
- Event-based routing and load balancing
Future Event System Enhancements
Advanced Event Types
- Reasoning Events: Capture AI’s internal reasoning process
- Planning Events: Show multi-step planning and strategy
- Context Events: Track context usage and management
- Performance Events: Detailed timing and resource usage
Event Intelligence
- Event Pattern Recognition: Identify common patterns and anomalies
- Predictive Events: Anticipate likely next events
- Event Summarization: Aggregate events into higher-level insights
- Event Recommendations: Suggest optimizations based on event patterns
Enhanced Delivery
- Event Persistence: Store and replay event histories
- Event Filtering: Selective event delivery based on preferences
- Event Routing: Direct events to multiple consumers
- Event Transformation: Adapt events for different consumer types
Conclusion
The event system architecture in gomcptest provides a foundational layer for transparency, observability, and real-time interaction in agentic systems. By understanding these concepts, developers can effectively implement event-driven interfaces, create monitoring systems, and build tools that provide deep visibility into AI agent behavior.
This event system is implementation-agnostic and serves as the foundation for specific implementations like AgentFlow, while also enabling other interfaces and monitoring systems to provide similar transparency and real-time feedback capabilities.
4.4 - Understanding the MCP Tools
Detailed explanation of the MCP tools architecture and implementation
This document explains the architecture and implementation of the MCP tools in gomcptest, how they work, and the design principles behind them.
MCP (Model Context Protocol) tools are standalone executables that provide specific functions that can be invoked by AI models. They allow the AI to interact with its environment - performing tasks like reading and writing files, executing commands, or searching for information.
In gomcptest, tools are implemented as independent Go executables that follow a standard protocol for receiving requests and returning results through standard input/output streams. Tool interactions generate events that are captured by the event system, enabling real-time monitoring and transparency.
Each tool in gomcptest follows a consistent architecture:
- Standard I/O Interface: Tools communicate via stdin/stdout using JSON-formatted requests and responses
- Parameter Validation: Tools validate their input parameters according to a JSON schema
- Stateless Execution: Each tool invocation is independent and does not maintain state
- Controlled Access: Tools implement appropriate security measures and permission checks
- Structured Results: Results are returned in a standardized JSON format
Common Components
Most tools share these common components:
- Main Function: Parses JSON input, validates parameters, executes the core function, formats and returns the result
- Parameter Structure: Defines the expected input parameters for the tool
- Result Structure: Defines the format of the tool’s output
- Error Handling: Standardized error reporting and handling
- Security Checks: Validation to prevent dangerous operations
The tools in gomcptest can be categorized into several functional groups:
Filesystem Navigation
- LS: Lists files and directories, providing metadata and structure
- GlobTool: Finds files matching specific patterns, making it easier to locate relevant files
- GrepTool: Searches file contents using regular expressions, helping find specific information in codebases
Content Management
- View: Reads and displays file contents, allowing the model to analyze existing code or documentation
- Edit: Makes targeted modifications to files, enabling precise changes without overwriting the entire file
- Replace: Completely overwrites file contents, useful for generating new files or making major changes
System Interaction
- Bash: Executes shell commands, allowing the model to run commands, scripts, and programs
- dispatch_agent: A meta-tool that can create specialized sub-agents for specific tasks
AI/ML Services
- imagen: Generates and manipulates images using Google’s Imagen API, enabling visual content creation
Data Processing
- duckdbserver: Provides SQL-based data processing capabilities using DuckDB, enabling complex data analysis and transformations
Design Principles
The tools in gomcptest were designed with several key principles in mind:
1. Modularity
Each tool is a standalone executable that can be developed, tested, and deployed independently. This modular approach allows for:
- Independent development cycles
- Targeted testing
- Simpler debugging
- Ability to add or replace tools without affecting the entire system
2. Security
Security is a major consideration in the tool design:
- Tools validate inputs to prevent injection attacks
- File operations are limited to appropriate directories
- Bash command execution is restricted with banned commands
- Timeouts prevent infinite operations
- Process isolation prevents one tool from affecting others
3. Simplicity
The tools are designed to be simple to understand and use:
- Clear, focused functionality for each tool
- Straightforward parameter structures
- Consistent result formats
- Well-documented behaviors and limitations
4. Extensibility
The system is designed to be easily extended:
- New tools can be added by following the standard protocol
- Existing tools can be enhanced with additional parameters
- Alternative implementations can replace existing tools
The communication protocol for tools follows this pattern:
Tools receive JSON input on stdin in this format:
{
"param1": "value1",
"param2": "value2",
"param3": 123
}
Tools return JSON output on stdout in one of these formats:
Success:
{
"result": "text result"
}
or
{
"results": [
{"field1": "value1", "field2": "value2"},
{"field1": "value3", "field2": "value4"}
]
}
Error:
{
"error": "Error message",
"code": "ERROR_CODE"
}
Implementation Examples
Most tools follow this basic structure:
package main
import (
"encoding/json"
"fmt"
"os"
)
// Parameters defines the expected input structure
type Parameters struct {
Param1 string `json:"param1"`
Param2 int `json:"param2,omitempty"`
}
// Result defines the output structure
type Result struct {
Result string `json:"result,omitempty"`
Error string `json:"error,omitempty"`
Code string `json:"code,omitempty"`
}
func main() {
// Parse input
var params Parameters
decoder := json.NewDecoder(os.Stdin)
if err := decoder.Decode(&params); err != nil {
outputError("Failed to parse input", "INVALID_INPUT")
return
}
// Validate parameters
if params.Param1 == "" {
outputError("param1 is required", "MISSING_PARAMETER")
return
}
// Execute core functionality
result, err := executeTool(params)
if err != nil {
outputError(err.Error(), "EXECUTION_ERROR")
return
}
// Return result
output := Result{Result: result}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(output)
}
func executeTool(params Parameters) (string, error) {
// Tool-specific logic here
return "result", nil
}
func outputError(message, code string) {
result := Result{
Error: message,
Code: code,
}
encoder := json.NewEncoder(os.Stdout)
encoder.Encode(result)
}
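Assuming the example above is compiled to ./bin/ExampleTool (the name is illustrative), it can be exercised directly from the shell:
echo '{"param1": "hello"}' | ./bin/ExampleTool
# {"result":"result"}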
Advanced Concepts
The dispatch_agent tool demonstrates how tools can be composed to create more powerful capabilities. It:
- Accepts a high-level task description
- Plans a sequence of tool operations to accomplish the task
- Executes these operations using the available tools
- Synthesizes the results into a coherent response
Error Propagation
The tool error mechanism is designed to provide useful information back to the model:
- Error messages are human-readable and descriptive
- Error codes allow programmatic handling of specific error types
- Stacktraces and debugging information are not exposed to maintain security
Performance Considerations
Tools are designed with performance in mind:
- File operations use efficient libraries and patterns
- Search operations employ indexing and filtering when appropriate
- Large results can be paginated or truncated to prevent context overflows
- Resource-intensive operations have configurable timeouts
Future Directions
The tool architecture in gomcptest could evolve in several ways:
- Streaming Results: Supporting incremental results for long-running operations
- Tool Discovery: More sophisticated mechanisms for models to discover available tools
- Tool Chaining: First-class support for composing multiple tools in sequences or pipelines
- Interactive Tools: Tools that can engage in multi-step interactions with the model
- Persistent State: Optional state maintenance for tools that benefit from context
Conclusion
The MCP tools in gomcptest provide a flexible, secure, and extensible foundation for enabling AI agents to interact with their environment. By understanding the architecture and design principles of these tools, developers can effectively utilize the existing tools, extend them with new capabilities, or create entirely new tools that integrate seamlessly with the system.
4.5 - AgentFlow: Event-Driven Interface Implementation
Implementation details of AgentFlow’s event-driven web interface, demonstrating how the general event system concepts are applied in practice through real-time tool interactions and streaming responses.
This document explains how AgentFlow implements the general event system architecture in a web-based interface, providing a concrete example of the event-driven patterns described in the foundational concepts. AgentFlow is the embedded web interface for gomcptest’s OpenAI-compatible server.
What is AgentFlow?
AgentFlow is a specific implementation of the gomcptest event system in the form of a modern web-based chat interface. It demonstrates how the general event-driven architecture can be applied to create transparent, real-time agentic interactions through a browser-based UI.
Core Architecture Overview
ChatEngine Interface Design
The foundation of AgentFlow’s functionality rests on the ChatServer interface defined in chatengine/chat_server.go:
type ChatServer interface {
AddMCPTool(client.MCPClient) error
ModelList(context.Context) ListModelsResponse
ModelDetail(ctx context.Context, modelID string) *Model
ListTools(ctx context.Context) []ListToolResponse
HandleCompletionRequest(context.Context, ChatCompletionRequest) (ChatCompletionResponse, error)
SendStreamingChatRequest(context.Context, ChatCompletionRequest) (<-chan StreamEvent, error)
}
This interface abstracts the underlying LLM provider (currently Vertex AI Gemini) and provides a consistent API for tool integration and streaming responses. The key innovation is the SendStreamingChatRequest method that returns a channel of StreamEvent interfaces, enabling real-time event streaming.
OpenAI v1 API Compatibility Strategy
A fundamental design decision was to maintain full compatibility with the OpenAI v1 API while extending it with enhanced functionality. This is achieved through:
- Standard Endpoint Preservation: Uses /v1/chat/completions, /v1/models, and /v1/tools endpoints
- Parameter Encoding: Tool selection is encoded within the existing model parameter using a pipe-delimited format
- Event Extension: Additional events are streamed alongside standard chat completion responses
- Backward Compatibility: Existing OpenAI-compatible clients work unchanged
This approach avoids the need to modify standard API endpoints while providing enhanced capabilities through the AgentFlow interface.
Event System Architecture
StreamEvent Interface
The event system is built around the StreamEvent interface in chatengine/stream_event.go:
type StreamEvent interface {
IsStreamEvent() bool
}
This simple interface allows for polymorphic event handling, where different event types can be processed through the same streaming pipeline.
Event Types and Structure
Defined in chatengine/vertexai/gemini/tool_events.go, tool call events capture when the AI decides to use a tool:
type ToolCallEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolCall ToolCallDetails `json:"tool_call"`
}
type ToolCallDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Arguments map[string]interface{} `json:"arguments"`
}
Tool response events capture the results of tool execution:
type ToolResponseEvent struct {
ID string `json:"id"`
Object string `json:"object"`
Created int64 `json:"created"`
EventType string `json:"event_type"`
ToolResponse ToolResponseDetails `json:"tool_response"`
}
type ToolResponseDetails struct {
ID string `json:"id"`
Name string `json:"name"`
Response interface{} `json:"response"`
Error string `json:"error,omitempty"`
}
Server-Sent Events Implementation
The streaming implementation in chatengine/chat_completion_stream.go provides the SSE infrastructure:
func (o *OpenAIV1WithToolHandler) streamResponse(w http.ResponseWriter, r *http.Request, req ChatCompletionRequest) {
w.Header().Set("Content-Type", "text/event-stream")
w.Header().Set("Cache-Control", "no-cache")
w.Header().Set("Connection", "keep-alive")
w.Header().Set("Transfer-Encoding", "chunked")
// Process events from the stream channel
for event := range stream {
switch res := event.(type) {
case ChatCompletionStreamResponse:
// Handle standard chat completion chunks
default:
// Handle tool events if withAllEvents flag is true
if o.withAllEvents {
jsonBytes, _ := json.Marshal(event)
w.Write([]byte("data: " + string(jsonBytes) + "\n\n"))
}
}
}
}
The withAllEvents flag controls whether tool events are included in the stream, allowing for backward compatibility with standard OpenAI clients.
Pipe-Delimited Encoding
The tool selection mechanism is implemented through a clever encoding scheme in the model parameter. The ParseModelAndTools method in chatengine/chat_structure.go parses this format:
func (req *ChatCompletionRequest) ParseModelAndTools() (string, []string) {
parts := strings.Split(req.Model, "|")
if len(parts) <= 1 {
return req.Model, nil
}
modelName := strings.TrimSpace(parts[0])
toolNames := make([]string, 0, len(parts)-1)
for i := 1; i < len(parts); i++ {
toolName := strings.TrimSpace(parts[i])
if toolName != "" {
toolNames = append(toolNames, toolName)
}
}
return modelName, toolNames
}
This allows formats like:
- gemini-2.0-flash (no tool filtering)
- gemini-2.0-flash|Edit|View|Bash (specific tools only)
- gemini-1.5-pro|VertexAI Code Execution (model with built-in tools)
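For example, a client can restrict the tool set for a single request simply by changing the model string in an otherwise standard call:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.0-flash|Edit|View|Bash",
    "messages": [{"role": "user", "content": "Show me the contents of README.md"}]
  }'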
Tool Filtering
The Vertex AI Gemini implementation includes sophisticated tool filtering in chatengine/vertexai/gemini/chatsession.go:
func (chatsession *ChatSession) FilterTools(requestedToolNames []string) []*genai.Tool {
if len(requestedToolNames) == 0 {
return chatsession.tools // Return all tools if none specified
}
// Build a lookup set from the requested tool names
// (reconstructed here for clarity; this construction is elided in the quoted excerpt)
requestedMap := make(map[string]bool, len(requestedToolNames))
for _, name := range requestedToolNames {
requestedMap[name] = true
}
var filteredTools []*genai.Tool
var filteredFunctions []*genai.FunctionDeclaration
for _, tool := range chatsession.tools {
// Handle Vertex AI built-in tools separately
switch {
case tool.CodeExecution != nil && requestedMap[VERTEXAI_CODE_EXECUTION]:
filteredTools = append(filteredTools, &genai.Tool{CodeExecution: tool.CodeExecution})
case tool.GoogleSearch != nil && requestedMap[VERTEXAI_GOOGLE_SEARCH]:
filteredTools = append(filteredTools, &genai.Tool{GoogleSearch: tool.GoogleSearch})
// ... handle other built-in tools
default:
// Handle MCP function declarations
for _, function := range tool.FunctionDeclarations {
if requestedMap[function.Name] {
filteredFunctions = append(filteredFunctions, function)
}
}
}
}
// Combine function declarations into a single tool
if len(filteredFunctions) > 0 {
filteredTools = append(filteredTools, &genai.Tool{
FunctionDeclarations: filteredFunctions,
})
}
return filteredTools
}
This implementation handles both Vertex AI built-in tools (CodeExecution, GoogleSearch, etc.) and MCP function declarations, ensuring they are properly separated to avoid proto validation errors.
Frontend Event Processing
Real-Time Event Handling
The JavaScript implementation in chat-ui.html.tmpl
provides comprehensive event processing through the handleStreamingResponse
method:
async handleStreamingResponse(response) {
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { value, done } = await reader.read();
if (done) break;
const chunk = decoder.decode(value, { stream: true });
const lines = chunk.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
// Handle different event types
if (parsed.event_type === 'tool_call') {
this.addToolNotification(parsed.tool_call.name, parsed);
this.showToolCallPopup(parsed);
} else if (parsed.event_type === 'tool_response') {
this.updateToolResponsePopup(parsed);
this.storeToolResponse(parsed);
} else if (parsed.choices && parsed.choices[0]) {
// Handle standard chat completion chunks
this.updateMessageContent(messageIndex, assistantMessage, true);
}
} catch (e) {
// Handle JSON parse errors gracefully
}
}
}
}
}
Tool Popup Management
AgentFlow implements a sophisticated popup management system to provide real-time feedback on tool execution:
showToolCallPopup(event) {
const popupId = event.tool_call.id;
// Create popup with loading state
const popup = document.createElement('div');
popup.className = 'tool-popup tool-call';
popup.innerHTML = `
<div class="tool-popup-header">
<div class="tool-popup-title">Tool Executing: ${event.tool_call.name}</div>
<button class="tool-popup-close" onclick="chatUI.closeToolPopup('${popupId}')">×</button>
</div>
<div class="tool-popup-content">
<div class="tool-popup-args">${JSON.stringify(event.tool_call.arguments, null, 2)}</div>
<div class="tool-popup-spinner"></div>
</div>
`;
// Store reference and set auto-close timer
this.toolPopups.set(popupId, popup);
this.popupAutoCloseTimers.set(popupId, setTimeout(() => {
this.closeToolPopup(popupId);
}, 30000));
}
updateToolResponsePopup(event) {
const popup = this.toolPopups.get(event.tool_response.id);
if (!popup) return;
// Update popup with response data
popup.className = `tool-popup ${event.tool_response.error ? 'tool-error' : 'tool-response'}`;
// Update content with response...
// Auto-close after showing result
setTimeout(() => {
this.closeToolPopup(event.tool_response.id);
}, 5500);
}
The frontend buildModelWithTools() function implements the pipe-delimited encoding:
buildModelWithTools() {
let modelString = this.selectedModel;
if (this.selectedTools.size > 0 && this.selectedTools.size < this.tools.length) {
// Only add tools if not all are selected (all selected means use all tools)
const toolNames = Array.from(this.selectedTools);
modelString += '|' + toolNames.join('|');
}
return modelString;
}
This ensures tool selection is properly encoded in the API request while maintaining OpenAI compatibility.
Technical Design Benefits
Event-Driven Transparency
The event system provides unprecedented visibility into AI decision-making:
- Real-Time Feedback: Users see tool calls as they happen
- Detailed Information: Full argument and response data available
- Error Visibility: Tool failures are clearly communicated
- Learning Opportunity: Users understand how AI approaches problems
Scalable Architecture
The channel-based streaming architecture scales well:
- Non-Blocking: Event processing doesn’t block the main request thread
- Backpressure Handling: Go channels provide natural backpressure
- Resource Management: Proper cleanup prevents memory leaks
- Error Isolation: Tool failures don’t crash the entire system
OpenAI Compatibility
The design maintains full OpenAI v1 API compatibility:
- Standard Endpoints: No custom API modifications required
- Parameter Encoding: Tool selection uses existing model parameter
- Event Extensions: Additional events don’t interfere with standard responses
- Client Compatibility: Existing OpenAI clients work unchanged
Integration Points
MCP Protocol Integration
AgentFlow seamlessly integrates with the Model Context Protocol:
- Tool Discovery: Automatic detection of MCP server capabilities
- Dynamic Loading: Tools can be added/removed without restart
- Protocol Abstraction: MCP details are hidden from the UI
- Error Handling: MCP errors are gracefully handled and displayed
Vertex AI Integration
The Vertex AI backend provides:
- Built-in Tools: Code execution, Google Search, etc.
- Model Selection: Multiple Gemini model variants
- Streaming Support: Native streaming for real-time responses
- Tool Mixing: Combines MCP tools with Vertex AI capabilities
This comprehensive architecture enables AgentFlow to provide an intuitive, powerful interface for agentic interactions while maintaining compatibility with existing OpenAI tooling and providing deep visibility into the AI’s decision-making process.