Brief Introduction About Google Agent2Agent Protocol (A2A)
agent to agent scenarios with a new open protocol
Abstract
This blog introduces Agent2Agent (A2A), a new open protocol led by Google Cloud Platform, which aims to achieve interoperability between artificial intelligence agents.
The A2A protocol aims to bridge the gap between opaque agent systems built by different vendors and frameworks, enabling them to collaborate across isolated data systems and applications, thereby improving autonomy, productivity and reducing long-term costs.
The protocol has been supported by more than 50 technology and consulting partners and defines a set of key principles, working methods and technical specifications to promote secure communication, information exchange and operational coordination between agents.
Main themes and key points:
1. Necessity of agent interoperability
As autonomous AI agents are widely deployed in enterprises to automate a variety of tasks, it is critical to enable these agents to collaborate across different systems and applications in a dynamic multi-agent ecosystem.
“To maximize the benefits of agent-based AI, it is critical that these agents can collaborate across siloed data systems and applications in dynamic, multi-agent ecosystems.”
Even if agents are built by different vendors or frameworks, enabling interoperability “can increase autonomy and multiply productivity while reducing long-term costs.”
2. A2A Protocol’s Vision:
A2A is a new open protocol designed to solve agent interoperability issues and has been supported by more than 50 technology partners including Atlassian, Box, Cohere, Salesforce, SAP and leading service providers such as Accenture, BCG and Deloitte.
A2A will allow AI agents to "communicate with each other, securely exchange information, and coordinate actions across various enterprise platforms or applications."
The core vision is to achieve “AI agents that work seamlessly together, regardless of the underlying technology, to automate complex enterprise workflows and drive unprecedented levels of efficiency and innovation.”
3. Positioning and foundation of A2A protocol
A2A is an open protocol that complements the Anthropic Model Context Protocol (MCP).
The protocol draws on Google’s internal expertise in scaling agent systems and is designed to address the challenges of deploying large-scale multi-agent systems.
A2A enables developers to build proxies that can connect with any other proxy built using the protocol, and gives users “the flexibility to combine proxies from different providers.”
Emphasizes the importance of a standardized approach to managing agents across different platforms and cloud environments, arguing that “this universal interoperability is critical to realizing the full potential of collaborative AI agents.”
4. A2A design principles:
Embrace agent capabilities: Allow agents to collaborate in natural, unstructured patterns, even if they do not share memory, tools, and context. multi-agent scenarios, without limiting agents to a single “tool.”
Based on existing standards: Built on popular standards such as HTTP, SSE, JSON-RPC, etc., it is easy to integrate with the existing IT stack.
Secure by default: Supports enterprise-grade authentication and authorization, aligned with OpenAPI’s authentication scheme.
Support for long-running tasks: Flexibly support a variety of tasks, from quick tasks to in-depth studies that take hours or even days, with real-time feedback, notifications, and status updates.
Modality-agnostic: Supports a variety of modalities, including text, audio, and video streaming.
5. How A2A works
Client-server architecture: A2A facilitates communication between a “client” agent and a “remote” agent. The client agent initiates tasks and the remote agent executes them.
Capability discovery: Agents advertise their capabilities through “agent cards” in JSON format for client agents to identify and select the appropriate agent for communication.
Task management: Communication is task-completion oriented. Task objects have a lifecycle and can be completed immediately or long-running. Agents can synchronize task status. The output of a task is called an artifact.
Collaboration: Agents can send messages to each other to pass context, replies, artifacts, or user instructions.
User Experience Negotiation: Messages contain “parts” with specified content types, allowing the client and remote agent to negotiate desired formats and user interface features (e.g., iframes, video, web forms, etc.).
Agent2Agent (A2A) Protocol Learning Guide
Test questions (short answer questions, 2-3 sentences each):
What is the main goal of the Agent2Agent (A2A) protocol?
What mode of collaboration between agents does the A2A protocol emphasize? How does it differ from the way agents share memory or tools?
List at least three key principles followed in the design of the A2A protocol.
In the A2A protocol, what are the roles of the “client” agent and the “remote” agent? How do they interact?
What is a "Proxy Card"? What key information does it contain, and how does the client use it?
Briefly describe the concept of "task" in the A2A protocol. What states may a task go through during its life cycle?
In the A2A protocol, what is the difference between a "message" and an "artifact"? What type of information is each used to convey?
How does the A2A protocol handle long-running tasks? What mechanisms does it provide to keep the client up to date on the status of the task?
How does the A2A protocol consider security? Does it exchange identity information directly within the protocol?
Describe the A2A protocol’s support for non-text media. List at least two of the modalities it supports.
Quiz answers:
The main goal of the A2A protocol is to support interoperability between AI agents built by different vendors or frameworks, bridging the gap between opaque agent systems, thereby enhancing autonomy, improving productivity and reducing long-term costs.
The A2A protocol emphasizes collaboration between agents in a natural, unstructured pattern, even though they do not share memory, tools, and context. Instead, agents exchange context, state, instructions, and data.
At least three key principles include: embrace proxy capabilities, build on existing standards, secure by default, support long-running tasks, and be modality-agnostic.
The "client" agent initiates and communicates task requests on behalf of the user, and the "remote" agent is responsible for executing these tasks and providing results. The "client" agent uses A2A to communicate with the "remote" agent to complete the user request.
An agent card is a JSON-formatted file that describes the capabilities, skills, and authentication mechanisms of a remote agent. Clients use agent cards to identify the best agent that can perform a specific task and understand how to communicate with it.
"Task" is a stateful entity in the A2A protocol that is oriented to completing end-user requests. A task may go through the states of "submitted", "working", "input-required", "completed", "canceled", "failed" and "unknown" during its life cycle.
"Messages" contain any non-artifact content, such as the agent's thoughts, user context, instructions, or status updates. "Artifacts" are the final results generated by the agent after performing a task, such as text, files, or data.
The A2A protocol supports clients polling the broker for updates, streaming updates via Server Sent Events (SSE), and receiving status updates via push notifications when disconnected.
The A2A protocol is designed to support enterprise-level authentication and authorization, following the OpenAPI authentication specification. However, it does not exchange identity information directly within the protocol, but expects to obtain it out-of-band and transmit it in the HTTP header.
The A2A protocol is designed to be modality-independent and supports a variety of media types, including text, audio, and video streams, to accommodate a wider range of agent interaction scenarios.
key terms
Agent2Agent (A2A) Protocol: An open protocol designed to enable interoperability between different AI agents, allowing them to communicate and collaborate across different systems and platforms.
Agent: An intelligent system that can autonomously process tasks and interact with the environment, usually based on artificial intelligence technology.
Interoperability: ·Interoperability: The ability of different systems or agents to seamlessly exchange information and work together.
Opaque Agent: A "black box" proxy whose inner workings, thinking processes, and tools are not visible to other agents or clients.
Agent Card: A JSON-formatted metadata file that a remote agent uses to advertise its capabilities, skills, supported modalities, and authentication requirements.
Client: In the A2A protocol, the entity that initiates a task request to the remote agent on behalf of the user (it can be a service, another agent, or an application).
Remote Agent: In the A2A protocol, an opaque agent that receives and executes task requests from the client.
Task: A stateful entity in the A2A protocol that represents an end-user request. The client and the remote agent interact to complete the task and produce results.
Artifact: The final result or output generated by the remote agent after executing the task, which can be text, file, data or other forms of content.
Message: In the A2A protocol, non-artifact content exchanged between the client and the remote agent to convey information such as context, instructions, status updates, or metadata.
Part: A complete content fragment that constitutes a message or artifact. Each part has a specified content type (such as text, file, data).
JSON-RPC 2.0: The JSON-based remote procedure call protocol used by the A2A protocol for communication between the client and the remote agent.
Server-Sent Events (SSE): A web technology that allows servers to push real-time updates to clients, which the A2A protocol leverages to support streaming status updates of long-running tasks.
Push Notification: A mechanism that allows a remote agent to send task status updates to a client through an external service after the remote agent is disconnected from the client.
Capability Discovery: The process by which a client identifies and finds suitable remote agents that can perform a specific task, usually by querying and analyzing agent card information.
Modality: refers to the form of information or interaction, such as text, audio, video, etc. The A2A protocol is designed to support multiple modalities.
Authentication and Authorization: The security process of verifying the identity of the client and granting it access to specific functions of the remote agent. A2A follows the OpenAPI authentication specification.
Reference
https://github.com/google/A2A/tree/main
https://google.github.io/A2A/#/
https://developers.googleblog.com/zh-hans/a2a-a-new-era-of-agent-interoperability/