The Agent2Agent Protocol and AgentHub Implementation
This document explores the core principles of Google’s Agent2Agent protocol and how AgentHub implements a communication broker based on these concepts. We distinguish between the Agent2Agent protocol specification (task structures and communication patterns) and our custom AgentHub broker implementation.
Agent2Agent vs AgentHub: What’s What
Agent2Agent Protocol (Google)
The Agent2Agent protocol defines:
- Task Message Structures:
TaskMessage,TaskResult,TaskProgresswith their fields and semantics - Task Status and Priority Enums: Standardized task lifecycle and priority levels
- Communication Patterns: Asynchronous task delegation and result reporting concepts
AgentHub Implementation (This Project)
AgentHub provides:
- Event Bus Broker: Centralized gRPC service that routes tasks between agents
- Pub/Sub Architecture: Publisher-subscriber pattern for task distribution
- Subscription Mechanisms:
SubscribeToTasks,SubscribeToTaskResults,SubscribeToTaskProgressmethods - Agent Implementations: Sample publisher and subscriber agents demonstrating the protocol
Philosophy and Core Concepts
Beyond Simple Request-Response
Traditional software architectures rely heavily on synchronous request-response patterns where a client requests a service and waits for an immediate response. While effective for simple operations, this pattern has limitations when dealing with:
- Complex, multi-step processes that require coordination between multiple specialized services
- Long-running operations that may take minutes or hours to complete
- Dynamic workload distribution where the best processor for a task may vary over time
- Autonomous decision-making where agents need to collaborate without central coordination
The Agent2Agent protocol addresses these limitations by defining task structures and communication patterns for autonomous agents. AgentHub implements a broker-based system that enables agents to communicate using Agent2Agent-inspired task structures:
- Delegating work to other agents based on their capabilities
- Accepting and processing tasks according to their specializations
- Reporting progress during long-running operations
- Making collaborative decisions about task distribution and execution
Autonomous Collaboration
In an Agent2Agent system, each agent operates with a degree of autonomy, making decisions about:
- Which tasks to accept based on current capacity and capabilities
- How to prioritize work when multiple tasks are pending
- When to delegate subtasks to other specialized agents
- How to report progress and handle failures
This autonomy enables the system to be more resilient, scalable, and adaptive compared to centrally-controlled architectures.
Key Design Principles
1. Asynchronous Communication
Agent2Agent communication is fundamentally asynchronous. When Agent A requests work from Agent B:
- Agent A doesn’t block waiting for completion
- Agent B can process the task when resources are available
- Progress updates provide visibility into long-running operations
- Results are delivered when the work is complete
This asynchronicity enables:
- Better resource utilization as agents aren’t blocked waiting
- Improved scalability as systems can handle more concurrent operations
- Enhanced resilience as temporary agent unavailability doesn’t block the entire system
2. Rich Task Semantics (Agent2Agent Protocol)
The Agent2Agent protocol defines rich task message structures that AgentHub implements:
message TaskMessage {
string task_id = 1; // Unique identifier for tracking
string task_type = 2; // Semantic type (e.g., "data_analysis")
google.protobuf.Struct parameters = 3; // Flexible parameters
string requester_agent_id = 4; // Who requested the work
string responder_agent_id = 5; // Who should do the work (optional)
google.protobuf.Timestamp deadline = 6; // When it needs to be done
Priority priority = 7; // How urgent it is
google.protobuf.Struct metadata = 8; // Additional context
}
This rich structure enables:
- Intelligent routing based on task type and agent capabilities
- Priority-based scheduling to ensure urgent tasks are handled first
- Deadline awareness for time-sensitive operations
- Context preservation for better decision-making
3. Explicit Progress Tracking
Long-running tasks benefit from explicit progress reporting:
message TaskProgress {
string task_id = 1; // Which task this refers to
TaskStatus status = 2; // Current status
string progress_message = 3; // Human-readable description
int32 progress_percentage = 4; // Quantitative progress (0-100)
google.protobuf.Struct progress_data = 5; // Structured progress information
}
This enables:
- Visibility into system operations for monitoring and debugging
- User experience improvements with real-time progress indicators
- Resource planning by understanding how long operations typically take
- Early failure detection when progress stalls unexpectedly
4. Flexible Agent Addressing
The protocol supports multiple addressing patterns:
- Direct addressing: Tasks sent to specific agents by ID
- Broadcast addressing: Tasks sent to all capable agents
- Capability-based routing: Tasks routed based on agent capabilities
- Load-balanced routing: Tasks distributed among agents with similar capabilities
This flexibility enables different architectural patterns within the same system.
Architectural Patterns
Microservices Enhancement
In a microservices architecture, Agent2Agent can enhance service communication by:
- Replacing synchronous HTTP calls with asynchronous task delegation
- Adding progress visibility to long-running service operations
- Enabling service composition through task chaining
- Improving resilience through task retry and timeout mechanisms
Event-Driven Architecture Integration
Agent2Agent complements event-driven architectures by:
- Adding structure to event processing with explicit task semantics
- Enabling bidirectional communication where events can trigger tasks that produce responses
- Providing progress tracking for complex event processing workflows
- Supporting task-based coordination alongside pure event broadcasting
Workflow Orchestration
Complex business processes can be modeled as Agent2Agent workflows:
- Process Initiation: A workflow agent receives a high-level business request
- Task Decomposition: The request is broken down into specific tasks
- Agent Coordination: Tasks are distributed to specialized agents
- Progress Aggregation: Individual task progress is combined into overall workflow status
- Result Assembly: Task results are combined into a final business outcome
Benefits and Trade-offs
Benefits
Scalability: Asynchronous operation and agent autonomy enable horizontal scaling without central bottlenecks.
Resilience: Agent failures don’t cascade as easily since tasks can be retried or redistributed.
Flexibility: New agent types can be added without modifying existing agents.
Observability: Rich task semantics and progress reporting provide excellent visibility into system operations.
Modularity: Agents can be developed, deployed, and scaled independently.
Trade-offs
Complexity: The system requires more sophisticated error handling and state management compared to simple request-response patterns.
Latency: For simple operations, the overhead of task creation and routing may add latency compared to direct calls.
Debugging: Distributed, asynchronous operations can be more challenging to debug than synchronous call chains.
Consistency: Managing data consistency across asynchronous agent operations requires careful design.
When to Use Agent2Agent
Agent2Agent is particularly well-suited for:
Complex Processing Pipelines
When work involves multiple steps that can be performed by different specialized agents:
- Data ingestion → validation → transformation → analysis → reporting
- Image upload → virus scan → thumbnail generation → metadata extraction
- Order processing → inventory check → payment processing → fulfillment
Long-Running Operations
When operations take significant time and users need progress feedback:
- Large file processing
- Machine learning model training
- Complex data analysis
- Batch job processing
Dynamic Load Distribution
When workload characteristics vary and different agents may be better suited for different tasks:
- Multi-tenant systems with varying customer requirements
- Resource-intensive operations that need specialized hardware
- Geographic distribution where local processing is preferred
System Integration
When connecting heterogeneous systems that need to coordinate:
- Third-party service coordination
- Cross-platform workflows
Comparison with Other Patterns
vs. Message Queues
Traditional message queues provide asynchronous communication but lack:
- Rich task semantics
- Progress tracking
- Bidirectional result delivery
- Priority and deadline awareness
vs. RPC/HTTP APIs
RPC and HTTP APIs provide structured communication but are typically:
- Synchronous (blocking)
- Lacking progress visibility
- Point-to-point rather than flexible routing
- Without built-in retry and timeout semantics
vs. Event Sourcing
Event sourcing provides audit trails and state reconstruction but:
- Focuses on state changes rather than work coordination
- Lacks explicit progress tracking
- Doesn’t provide direct task completion feedback
- Requires more complex query patterns for current state
The SubAgent Library: Simplifying Agent Development
While the Agent2Agent protocol and AgentHub broker provide powerful capabilities for building distributed agent systems, implementing agents from scratch requires significant boilerplate code. The SubAgent library addresses this by providing a high-level abstraction that handles infrastructure concerns, letting developers focus on business logic.
The Problem: Too Much Boilerplate
Traditional agent implementation requires:
- ~200+ lines of setup code: gRPC client configuration, connection management, health checks
- A2A protocol compliance: Correct AgentCard structure with all required fields
- Subscription management: Setting up task streams and handling lifecycle
- Observability integration: Manual tracing span creation, logging, metrics
- Error handling: Graceful shutdown, signal handling, resource cleanup
This creates several issues:
- High barrier to entry: New agents require deep knowledge of the infrastructure
- Code duplication: Every agent reimplements the same patterns
- Maintenance burden: Infrastructure changes require updates across all agents
- Inconsistent quality: Some agents may have better observability or error handling than others
The Solution: Infrastructure as a Library
The SubAgent library encapsulates all infrastructure concerns into a simple, composable API:
// 1. Configure your agent
config := &subagent.Config{
AgentID: "my_agent",
Name: "My Agent",
Description: "Does something useful",
}
// 2. Create and register skills
agent, _ := subagent.New(config)
agent.MustAddSkill("Skill Name", "Description", handlerFunc)
// 3. Run (everything else is automatic)
agent.Run(ctx)
This reduces agent implementation from ~200 lines to ~50 lines (75% reduction), letting developers focus entirely on their domain logic.
Architecture
The SubAgent library implements a layered architecture:
┌─────────────────────────────────────────┐
│ Your Business Logic │
│ (Handler Functions: ~30 lines) │
├─────────────────────────────────────────┤
│ SubAgent Library │
│ - Config & Validation │
│ - AgentCard Creation (A2A compliant) │
│ - Task Subscription & Routing │
│ - Automatic Observability │
│ - Lifecycle Management │
├─────────────────────────────────────────┤
│ AgentHub Client Library │
│ - gRPC Connection │
│ - Message Publishing/Subscription │
│ - TraceManager, Metrics, Logging │
├─────────────────────────────────────────┤
│ AgentHub Broker │
│ - Event Routing │
│ - Agent Registry │
│ - Task Distribution │
└─────────────────────────────────────────┘
Key Features
1. Declarative Configuration
Instead of imperative setup code, agents use declarative configuration:
config := &subagent.Config{
AgentID: "agent_translator", // Required
Name: "Translation Agent", // Required
Description: "Translates text", // Required
Version: "1.0.0", // Optional, defaults
HealthPort: "8087", // Optional, defaults
}
The library:
- Validates all required fields
- Applies sensible defaults for optional fields
- Returns clear error messages for configuration issues
2. Skill-Based Programming Model
Agents define capabilities as “skills” - discrete units of functionality:
agent.MustAddSkill(
"Language Translation", // Name (shown to LLM)
"Translates text between languages", // Description
translateHandler, // Implementation
)
Each skill maps to a handler function with a clear signature:
func (ctx, task, message) -> (artifact, state, errorMessage)
This model:
- Encourages single-responsibility design
- Makes capabilities explicit and discoverable
- Simplifies testing (handlers are pure functions)
- Enables skill-based task routing
3. Automatic A2A Compliance
The library generates complete, A2A-compliant AgentCards:
// Developer writes:
agent.MustAddSkill("Translate", "Translates text", handler)
// Library generates:
&pb.AgentCard{
ProtocolVersion: "0.2.9",
Name: "agent_translator",
Description: "Translation Agent",
Version: "1.0.0",
Skills: []*pb.AgentSkill{
{
Id: "skill_0",
Name: "Translate",
Description: "Translates text",
Tags: []string{"Translate"},
InputModes: []string{"text/plain"},
OutputModes: []string{"text/plain"},
},
},
Capabilities: &pb.AgentCapabilities{
Streaming: false,
PushNotifications: false,
},
}
This ensures all agents follow protocol standards without manual effort.
4. Built-In Observability
Every task execution is automatically wrapped with observability:
Tracing:
// Automatic span creation for each task
taskSpan := traceManager.StartSpan(ctx, "agent.{agentID}.handle_task")
traceManager.AddA2ATaskAttributes(taskSpan, taskID, skillName, contextID, ...)
traceManager.SetSpanSuccess(taskSpan) // or RecordError()
Logging:
// Automatic structured logging
logger.InfoContext(ctx, "Processing task", "task_id", taskID, "skill", skillName)
logger.ErrorContext(ctx, "Task failed", "error", err)
Metrics:
- Task processing duration
- Success/failure counts
- Active task count
- (via AgentHubClient metrics)
Developers get full distributed tracing and logging without writing any observability code.
5. Lifecycle Management
The library handles the complete agent lifecycle:
Startup:
- Validate configuration
- Connect to broker (with retries)
- Register AgentCard
- Subscribe to tasks
- Start health check server
- Signal “ready”
Runtime:
- Receive tasks from broker
- Route to appropriate handler
- Execute with tracing/logging
- Publish results
- Handle errors gracefully
Shutdown:
- Catch SIGINT/SIGTERM signals
- Stop accepting new tasks
- Wait for in-flight tasks (with timeout)
- Close broker connection
- Cleanup resources
- Exit cleanly
All automatically - developers never write lifecycle code.
Design Patterns
The Handler Pattern
Handlers are pure functions that transform inputs to outputs:
func myHandler(ctx context.Context, task *pb.Task, message *pb.Message)
(*pb.Artifact, pb.TaskState, string) {
// Extract input
input := extractInput(message)
// Validate
if err := validate(input); err != nil {
return nil, TASK_STATE_FAILED, err.Error()
}
// Process
result := process(ctx, input)
// Create artifact
artifact := createArtifact(result)
return artifact, TASK_STATE_COMPLETED, ""
}
This pattern:
- Testable: Pure functions are easy to unit test
- Composable: Handlers can call other functions
- Error handling: Explicit return of state and error message
- Context-aware: Receives context for cancellation and tracing
The Configuration Pattern
Configuration is separated from code:
// Development
config := &subagent.Config{
AgentID: "my_agent",
HealthPort: "8080",
}
// Production (from environment)
config := &subagent.Config{
AgentID: os.Getenv("AGENT_ID"),
BrokerAddr: os.Getenv("BROKER_ADDR"),
HealthPort: os.Getenv("HEALTH_PORT"),
}
This enables:
- Different configs for dev/staging/prod
- Easy testing with mock configs
- Container-friendly (12-factor app)
Benefits
For Developers:
- Faster development: 75% less code to write
- Lower complexity: Focus on business logic, not infrastructure
- Better quality: Automatic best practices (observability, error handling)
- Easier testing: Handler functions are pure and testable
- Clearer structure: Skill-based organization is intuitive
For Operations:
- Consistent observability: All agents have same tracing/logging
- Standard health checks: Uniform health endpoints
- Predictable behavior: Lifecycle management is consistent
- Easy monitoring: Metrics are built-in
- Reliable shutdown: Graceful handling is automatic
For the System:
- Better integration: All agents follow same patterns
- Easier debugging: Consistent trace structure across agents
- Simplified maintenance: Library updates improve all agents
- Reduced errors: Less custom code means fewer bugs
Evolution Path
The SubAgent library provides a clear evolution path for agent development:
Phase 1: Simple Agents (Current)
- Single skills, synchronous processing
- Text input/output
- Uses library defaults
Phase 2: Advanced Agents
- Multiple skills per agent
- Streaming responses
- Custom capabilities
- Extended AgentCard fields
Phase 3: Specialized Agents
- Custom observability (additional traces/metrics)
- Advanced error handling
- Multi-modal input/output
- Stateful processing
The library supports all phases through its extensibility points (GetClient(), GetLogger(), custom configs).
Comparison with Manual Implementation
| Aspect | Manual Implementation | SubAgent Library |
|---|---|---|
| Lines of Code | ~200 lines setup | ~50 lines total |
| Configuration | 50+ lines imperative | 10 lines declarative |
| AgentCard | Manual struct creation | Automatic generation |
| Observability | Manual span/log calls | Automatic wrapping |
| Lifecycle | Custom signal handling | Built-in management |
| Error Handling | Scattered throughout | Centralized in library |
| Testing | Must mock infrastructure | Test handlers directly |
| Maintenance | Per-agent updates needed | Library update benefits all |
| Learning Curve | High (need infrastructure knowledge) | Low (focus on logic) |
| Time to First Agent | Several hours | Under 30 minutes |
Real-World Impact
The Echo Agent demonstrates the library’s impact:
Before SubAgent Library (211 lines):
- Manual client setup: 45 lines
- AgentCard creation: 30 lines
- Task subscription: 60 lines
- Handler implementation: 50 lines
- Lifecycle management: 26 lines
With SubAgent Library (82 lines):
- Configuration: 10 lines
- Skill registration: 5 lines
- Handler implementation: 50 lines
- Run: 2 lines
- Everything else: automatic
The business logic (50 lines) stays the same, but infrastructure code (161 lines) is eliminated.
When to Use SubAgent Library
Use SubAgent Library when:
- Building new agents from scratch
- Agent has 1-10 skills with clear boundaries
- Standard A2A protocol is sufficient
- You want consistent observability across agents
- Quick development time is important
Consider Manual Implementation when:
- Highly custom protocol requirements
- Need very specific lifecycle control
- Existing agent migration (may not be worth refactoring)
- Experimental/research agents with non-standard patterns
For 99% of agent development, the SubAgent library is the right choice.
Future Evolution
The Agent2Agent principle opens possibilities for:
Intelligent Agent Networks
Agents that learn about each other’s capabilities and performance characteristics to make better delegation decisions.
Self-Organizing Systems
Agent networks that automatically reconfigure based on workload patterns and agent availability.
Cross-Organization Collaboration
Extending Agent2Agent protocols across organizational boundaries for B2B workflow automation.
AI Agent Integration
Natural integration points for AI agents that can understand task semantics and make autonomous decisions about task acceptance and delegation.
The Agent2Agent principle represents a foundational shift toward more intelligent, autonomous, and collaborative software systems that can handle the complexity of modern distributed applications while providing the visibility and control that operators need.
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.