In the context of generative artificial intelligence, AI agents (also referred to as compound AI systems or agentic AI) are a class of intelligent agents that can pursue goals, use tools, and take actions with varying degrees of autonomy. In practice, they usually operate within human-defined objectives, constraints, and available tools.[1][2]
Overview
AI agents possess several key attributes, including goal-directed behavior, natural language interfaces, the capacity to use external tools, and the ability to perform multi-step tasks. Their control flow is frequently driven by large language models (LLMs). Agent systems may also include memory components, planning logic, tool interfaces, and orchestration software for coordinating agent components.[2][3]
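The components described above can be sketched as a minimal control loop. The following is an illustrative sketch only: `call_llm` is a stub standing in for a real language model API, and the calculator tool and prompt format are invented for demonstration.

```python
# Minimal sketch of an agent loop: an LLM-driven controller that either
# calls a registered tool or returns a final answer. call_llm is a
# placeholder for a real language model query.

def call_llm(prompt: str) -> str:
    # Stub: a real system would query an LLM here. This stand-in
    # requests the calculator tool once, then finishes.
    if "Observation" in prompt:
        return "FINAL: 4"
    return "TOOL: calculator 2+2"

def calculator(expr: str) -> str:
    return str(eval(expr))  # toy tool; never eval untrusted input in practice

TOOLS = {"calculator": calculator}  # tool registry

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory = [f"Goal: {goal}"]  # short-term memory of the episode
    for _ in range(max_steps):
        reply = call_llm("\n".join(memory))
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        _, name, arg = reply.split(maxsplit=2)  # "TOOL: <name> <arg>"
        observation = TOOLS[name](arg)          # take an action via a tool
        memory.append(f"Observation: {observation}")
    return "step budget exhausted"

print(run_agent("What is 2+2?"))  # → 4
```

Real agent frameworks add structured tool schemas, persistent memory, and error handling around this same basic loop.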
AI agents do not have a standard definition.[4][5][6] NIST has described agentic AI as an emerging area requiring standards for secure operation, interoperability, and reliable interaction with external systems.[1]
A common application of AI agents is the automation of tasks, for example booking travel plans based on a user's prompted request.[7][8][9]
Companies such as Google, Microsoft, and Amazon Web Services have offered platforms for deploying pre-built AI agents.[10] Several protocols have been proposed to standardize inter-agent communication, including the Model Context Protocol and Gibberlink,[11] among others. Some of these protocols are also used to connect agents with external applications.[12]
In December 2025, the Linux Foundation announced the formation of the Agentic AI Foundation (AAIF), with the goal of ensuring that agentic AI evolves transparently and collaboratively.[13][14]
History
The concept of AI agents has been traced to research from the 1990s; Harvard professor Milind Tambe has noted that the definition of an AI agent was not clear at that time either. Researcher Andrew Ng has been credited with spreading the term "agentic" to a wider audience in 2024.[15]
Training and testing
Researchers have attempted to build world models[16][17] and reinforcement learning environments[18] to train or evaluate AI agents. Video games such as Minecraft[19] and No Man's Sky,[20] as well as replicas of company websites,[21] have been used as training environments for AI agents.
Autonomous capabilities
The Financial Times has compared the autonomy of AI agents to the SAE classification of self-driving cars, placing most applications at level 2 or level 3, with some achieving level 4 in highly specialized circumstances, and level 5 remaining theoretical.[22]
Cognitive architecture
The following are some possible internal design options for reasoning within an agent:[23]
- Retrieval-augmented generation
- The ReAct (Reason + Act) pattern, an iterative process in which an AI agent alternates between reasoning and taking actions, receives observations from the environment or external tools, and integrates those observations into subsequent reasoning steps.[24]
- Reflexion, which uses an LLM to create feedback on the agent's plan of action and stores that feedback in a memory cache.
- A tool/agent registry, for organizing software functions or other agents that the agent can use.
- One-shot model querying, which queries the model once to create the plan of action.
Reference architecture
Ken Huang proposed an AI agent reference architecture consisting of seven interconnected layers, with each layer building on the functionality of the layers beneath it:[25]
- Layer 1: Foundation models - provide the core AI engines to power agent capabilities.
- Layer 2: Data operations - manage the data infrastructure required for AI agent operations, including vector databases, data loaders, and retrieval-augmented generation (RAG).
- Layer 3: Agent frameworks - software and tools that simplify the development and management of AI agents.
- Layer 4: Deployment and infrastructure - provide the robust technical foundation for running AI agents.
- Layer 5: Evaluation and observability - focus on assessing the safety and performance of AI agents.
- Layer 6: Security and compliance - a protective framework ensuring AI agents operate safely, securely, and within regulatory boundaries. At this layer, the security and compliance features embedded in each layer of the stack are integrated together.
- Layer 7: Agent ecosystem - represents the AI agents' interface with real-world applications and users.
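The layered structure above can be represented as an ordered stack, lowest layer first. The layer names below paraphrase Huang's proposal; the helper function is an illustrative convenience, not part of any published API.

```python
# The seven layers of the reference architecture as an ordered stack,
# lowest (foundation) first. Each layer builds on all layers below it.
REFERENCE_STACK = [
    "foundation models",
    "data operations",
    "agent frameworks",
    "deployment and infrastructure",
    "evaluation and observability",
    "security and compliance",
    "agent ecosystem",
]

def layers_beneath(layer: str) -> list[str]:
    # Return every layer that the given layer depends on.
    return REFERENCE_STACK[:REFERENCE_STACK.index(layer)]

print(layers_beneath("agent frameworks"))
# → ['foundation models', 'data operations']
```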
Orchestration patterns
To execute complex tasks, autonomous agents are often integrated with other agents or specialized tools. These configurations, known as orchestration patterns or workflows, include the following:[26][27]
- Prompt chaining: A sequence where the output of one step serves as the input for the next.
- Routing: The classification of an input to direct it to a specialized downstream task or tool.
- Parallelization: The simultaneous execution of multiple tasks.
- Sequential processing: A fixed, linear progression of tasks through a predefined pipeline.
- Planner-critic: An iterative pattern where one agent generates a proposal and another evaluates it to provide feedback for refinement.
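Two of the patterns above, prompt chaining and routing, can be sketched with stand-in functions. All functions here are illustrative stubs invented for the example, not real model calls.

```python
# Sketch of two orchestration patterns: prompt chaining (each step's
# output feeds the next step) and routing (classify an input, then
# dispatch it to a specialist).

def summarize(text: str) -> str:
    return text.split(".")[0] + "."  # toy "model": keep the first sentence

def translate(text: str) -> str:
    return f"[fr] {text}"            # toy "model": pretend translation

def chain(text: str) -> str:
    # Prompt chaining: summarize first, then translate the summary.
    return translate(summarize(text))

def route(request: str) -> str:
    # Routing: a classifier picks a specialized downstream handler.
    specialists = {
        "math": lambda r: str(eval(r.removeprefix("math:"))),
        "echo": lambda r: r,
    }
    topic = "math" if request.startswith("math:") else "echo"
    return specialists[topic](request)

print(chain("Agents can plan. They can also act."))  # → [fr] Agents can plan.
print(route("math:6*7"))                             # → 42
```

Parallelization and planner-critic patterns layer onto the same idea: the former runs several such calls concurrently, while the latter feeds one function's output to a second function that critiques it.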
Multimodal AI agents
In addition to large language models (LLMs), vision-language models (VLMs) and multimodal foundation models can be used as the basis for agents. In September 2024, the Allen Institute for AI released an open-source vision-language model.[28] Nvidia released a fram