AI Agent Reading List: Curated by Kubert

In our opinion, these are some of the most impactful papers in the rapidly evolving field of AI agents. The list spans foundational theory, cutting-edge advancements, and practical applications. Whether you’re a seasoned researcher or a curious learner, these papers offer valuable insights into the world of AI agents.

Reading List

Foundational

Attention Is All You Need

This seminal paper introduces the Transformer model, which has become the foundation for many modern AI agents.

EUREKA: Human-Level Reward Design via Coding Large Language Models

The EUREKA algorithm leverages LLMs to generate and iteratively refine reward functions that outperform human-engineered rewards in complex reinforcement learning environments, including high-dexterity tasks like pen spinning. It is a critical read for understanding how AI can autonomously generate and improve reward systems, a foundation for agents that learn complex skills without extensive human intervention.
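
As a rough illustration of the idea, here is a minimal sketch of an EUREKA-style generate-evaluate-refine loop. It is not the paper’s implementation: generate_with_llm and train_and_evaluate are hypothetical stand-ins for prompting a coding LLM with the environment source and for a full RL training run, respectively.

```python
import random

def generate_with_llm(context: str, feedback: str) -> str:
    """Hypothetical LLM call: returns Python source for a reward function."""
    # Stub: a real system would prompt a coding LLM with `context` + `feedback`.
    scale = random.uniform(0.1, 2.0)
    return f"def reward(state): return -{scale:.2f} * abs(state['error'])"

def train_and_evaluate(reward_src: str) -> float:
    """Hypothetical stand-in for training an RL policy; returns a fitness score."""
    namespace = {}
    exec(reward_src, namespace)        # compile the candidate reward function
    reward_fn = namespace["reward"]
    # Stub evaluation: probe the reward at a fixed state instead of full training.
    return reward_fn({"error": 0.5})

def eureka_loop(context: str, iterations: int = 5, samples: int = 4) -> str:
    best_src, best_score, feedback = "", float("-inf"), ""
    for _ in range(iterations):
        # Sample several candidate reward functions per iteration...
        candidates = [generate_with_llm(context, feedback) for _ in range(samples)]
        scored = [(train_and_evaluate(src), src) for src in candidates]
        score, src = max(scored)
        if score > best_score:
            best_score, best_src = score, src
        # ...and reflect the result back to the LLM as textual feedback.
        feedback = f"Best score so far: {best_score:.3f}. Improve on:\n{best_src}"
    return best_src

print(eureka_loop("Environment: dexterous hand, state exposes 'error'."))
```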

An Interactive Agent Foundation Model

This paper proposes a training paradigm and foundation-model framework for developing generalist AI agents that can operate across multiple domains, such as robotics, gaming, and healthcare.

Genie: Generative Interactive Environments

This paper introduces Genie, a scalable and flexible foundation world model that generates interactive environments from unlabelled Internet videos. By enabling the creation of diverse, controllable virtual experiences, Genie provides a powerful tool for training generalist AI agents and pushes the boundary of what those agents can achieve.

VOYAGER: An Open-Ended Embodied Agent with Large Language Models

VOYAGER is an LLM-powered agent that explores Minecraft autonomously, continually adding new behaviors to an ever-growing skill library without human intervention. The research pushes the boundaries of adaptability, skill accumulation, and open-ended exploration, making it a foundational study for anyone interested in AI agents operating in dynamic environments.

More Agents Is All You Need

This paper is a critical read for those interested in scaling AI performance. It shows that simply increasing the number of instantiated agents, via a sampling-and-voting procedure, can significantly improve the performance of large language models (LLMs) across various tasks. The paper provides a systematic study of the scaling properties of LLM agents, making it particularly valuable for researchers focused on enhancing AI capabilities through multi-agent collaboration and ensemble methods.
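
The core procedure is simple enough to sketch in a few lines. In this minimal illustration, ask_llm is a hypothetical stand-in for sampling the same model at nonzero temperature, not an API from the paper.

```python
from collections import Counter
import random

def ask_llm(question: str) -> str:
    """Hypothetical LLM sample; stubbed with noisy canned answers so this runs."""
    return random.choice(["42", "42", "41"])

def sample_and_vote(question: str, n_agents: int = 15) -> str:
    # "Instantiate" n agents by sampling the same model n times...
    answers = [ask_llm(question) for _ in range(n_agents)]
    # ...then take the majority answer as the ensemble's output.
    return Counter(answers).most_common(1)[0][0]

print(sample_and_vote("What is 6 * 7?"))
```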

Additional

AppAgent: Multimodal Agents as Smartphone Users

This paper introduces the concept of multimodal agents designed to interact with smartphones as users. It is valuable for understanding how AI agents can be designed to mimic human-like interactions with everyday technology, which is especially relevant for those exploring AI in mobile and consumer applications.

Chain-of-Thought Reasoning Without Prompting

This paper investigates the inherent reasoning capabilities of large language models (LLMs) by altering the decoding process rather than relying on explicit prompts. It is crucial for those studying AI agents because it demonstrates how LLMs can exhibit reasoning abilities without external guidance, challenging the conventional reliance on prompting techniques. Understanding this unsupervised approach to reasoning could lead to more autonomous and capable AI agents.
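
The paper’s CoT-decoding method branches on the top-k candidate first tokens instead of decoding greedily, then prefers the completion whose answer tokens the model is most confident about. Below is a toy sketch of that selection logic; the TOY_MODEL table fabricates the branches and confidence margins that a real model would produce.

```python
# Each candidate first token maps to (greedy completion, answer confidence),
# where confidence approximates the probability margin on the answer tokens.
TOY_MODEL = {
    "5":   ("5", 0.40),                               # direct answer, low confidence
    "I":   ("I have 3 + 2 = 5 apples, so 5.", 0.92),  # CoT path, high confidence
    "The": ("The answer is 6.", 0.35),
}

def cot_decode(prompt: str, k: int = 3) -> str:
    # Branch on the top-k candidate first tokens (the toy table ignores `prompt`)...
    branches = list(TOY_MODEL.items())[:k]
    # ...then pick the branch whose answer tokens the model is most confident in.
    token, (completion, confidence) = max(branches, key=lambda b: b[1][1])
    return completion

print(cot_decode("I have 3 apples and buy 2 more. How many do I have?"))
```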

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

This paper introduces the Tree of Thoughts framework, which generalizes chain-of-thought prompting into deliberate search: the model proposes multiple intermediate “thoughts”, evaluates how promising each one is, and backtracks from unpromising branches. This makes it a powerful tool for developing advanced AI agents capable of handling complex and diverse tasks.
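
A minimal breadth-first sketch of the idea follows. propose_thoughts and score_thought are hypothetical stand-ins for the LLM calls that generate and evaluate intermediate reasoning steps; the paper also explores depth-first variants.

```python
import random

def propose_thoughts(state: str, n: int = 3) -> list[str]:
    """Hypothetical LLM call: propose n candidate next steps from `state`."""
    return [f"{state} -> step{random.randint(0, 9)}" for _ in range(n)]

def score_thought(state: str) -> float:
    """Hypothetical LLM call: rate how promising a partial solution looks."""
    return random.random()

def tree_of_thoughts(problem: str, depth: int = 3, beam: int = 2) -> str:
    frontier = [problem]
    for _ in range(depth):
        # Expand every state on the frontier...
        candidates = [t for s in frontier for t in propose_thoughts(s)]
        # ...score each thought, and keep only the `beam` best (pruning).
        candidates.sort(key=score_thought, reverse=True)
        frontier = candidates[:beam]
    return frontier[0]

print(tree_of_thoughts("Make 24 from the numbers 4, 9, 10, 13"))
```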

Octopus v2: On-device language model for super agent

This paper presents a cutting-edge method for deploying language models on edge devices, addressing crucial challenges like latency, accuracy, and privacy. It introduces an optimized 2-billion-parameter model that the authors report outperforms GPT-4 on function-calling tasks, making it highly relevant for AI agents that must operate efficiently on mobile and other edge devices. It also discusses innovative techniques like fine-tuning with functional tokens, which could inspire new approaches to building on-device agents.

Orca-Math: Unlocking the Potential of SLMs in Grade School Math

This paper presents a breakthrough in enhancing small language models’ (SLMs) mathematical reasoning, showing that a 7B model trained on carefully generated synthetic problems with iterative preference learning can rival far larger models on grade-school math benchmarks such as GSM8K.

Scaling Instructable Agents Across Many Simulated Worlds

This paper describes SIMA, an agent trained to follow natural-language instructions across a broad range of 3D virtual environments. The generality and scalability of the approach make it an essential study for understanding how to develop AI agents capable of operating in a wide range of dynamic, real-time environments.

Improving Factuality and Reasoning in Language Models through Multiagent Debate

This paper develops a “society of minds” approach in which multiple LLM instances propose answers, critique one another over several rounds, and converge on a final response. The resulting gains in factuality and reasoning carry over directly to the agents built on these models.
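
A minimal sketch of the debate loop follows; chat is a hypothetical wrapper around an LLM call, stubbed here so the example runs.

```python
def chat(prompt: str, agent_id: int) -> str:
    """Hypothetical LLM call; stubbed so the example runs end to end."""
    return f"agent{agent_id} answer (saw: {prompt[:40]!r}...)"

def debate(question: str, n_agents: int = 3, rounds: int = 2) -> list[str]:
    answers = [chat(question, i) for i in range(n_agents)]
    for _ in range(rounds):
        revised = []
        for i in range(n_agents):
            # Show each agent the other agents' current answers and ask it
            # to critique them and update its own response.
            others = "\n".join(a for j, a in enumerate(answers) if j != i)
            revised.append(chat(f"{question}\nOthers said:\n{others}\nRevise.", i))
        answers = revised
    return answers  # in practice: majority vote or a judge model at the end

print(debate("Is 17077 prime?"))
```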

Multi-Agent

AGENTVERSE: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

AGENTVERSE is a framework that assembles a team of expert agents and adjusts its composition as a task unfolds. The paper provides insights into how this kind of multi-agent collaboration, and the emergent behaviors it produces, can lead to more effective and efficient task completion in complex real-world scenarios.

AutoAgents: A Framework for Automatic Agent Generation

This paper introduces a groundbreaking framework for dynamically generating and coordinating specialized agents to tackle complex tasks. Unlike traditional multi-agent systems that rely on predefined roles, AutoAgents adaptively generates task-specific agents and refines their collaboration through self-refinement and collaborative refinement mechanisms.
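
As a rough sketch of dynamic agent generation (not the paper’s implementation), the snippet below drafts a task-specific roster with an LLM and then runs a self-refinement pass over each agent’s duty. The llm function is a hypothetical stub.

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM call; canned responses so the example runs."""
    if prompt.startswith("Propose"):
        return "Researcher: gather facts; Writer: draft report; Critic: review"
    return prompt.split("'")[1] + " (refined)"

def generate_agents(task: str) -> list[dict]:
    # Ask the LLM for a task-specific roster instead of using predefined roles.
    roster = llm(f"Propose specialist agents (name: duty) for: {task}")
    return [{"name": n.strip(), "duty": d.strip()}
            for n, d in (role.split(":") for role in roster.split(";"))]

def run_team(task: str, rounds: int = 2) -> list[dict]:
    agents = generate_agents(task)
    for _ in range(rounds):
        # Self-refinement: each agent revisits and sharpens its own duty;
        # collaborative refinement would additionally cross-check between agents.
        for agent in agents:
            agent["duty"] = llm(f"Refine duty '{agent['duty']}' for: {task}")
    return agents

print(run_team("Write a market analysis for electric bikes."))
```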

AgentCoder: Multiagent-Code Generation with Iterative Testing and Optimization

This paper introduces a multi-agent framework that significantly enhances code generation by combining specialized agents: a programmer agent writes the code, a test designer agent generates test cases, and a test executor runs the tests and feeds failures back for another iteration.
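
A minimal sketch of that loop, with the two LLM-backed roles stubbed by hypothetical functions so the example runs:

```python
def programmer_agent(task: str, feedback: str) -> str:
    """Hypothetical LLM call: returns a candidate implementation."""
    return "def add(a, b):\n    return a + b"   # stub ignores `feedback`

def test_designer_agent(task: str) -> str:
    """Hypothetical LLM call: returns test code for the task."""
    return "assert add(1, 2) == 3\nassert add(-1, 1) == 0"

def test_executor(code: str, tests: str) -> tuple[bool, str]:
    """Run the generated tests against the generated code."""
    namespace = {}
    try:
        exec(code, namespace)
        exec(tests, namespace)
        return True, "all tests passed"
    except Exception as err:
        return False, repr(err)

def agentcoder(task: str, max_rounds: int = 3) -> str:
    tests, feedback, code = test_designer_agent(task), "", ""
    for _ in range(max_rounds):
        code = programmer_agent(task, feedback)
        ok, report = test_executor(code, tests)
        if ok:
            return code
        feedback = report   # failures flow back to the programmer agent
    return code

print(agentcoder("Write add(a, b) returning the sum."))
```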