TradingAgents: Multi-Agents LLM Financial Trading Framework

Yijia Xiao1, Edward Sun1, Di Luo2, Wei Wang1
1University of California, Los Angeles, 2Massachusetts Institute of Technology

Abstract

Societies of LLM-powered agents have advanced automated problem-solving, particularly in finance. Yet most frameworks do not replicate the collaborative workflows of real trading firms. TradingAgents addresses this gap by assigning specialized LLM-powered agents (analysts, researchers, traders, and risk managers) to simulate a dynamic, team-based environment. These agents collaborate through debates, structured outputs, and risk checks. Experiments show that TradingAgents improves cumulative return and Sharpe ratio over baseline models while keeping maximum drawdown low, highlighting the promise of multi-agent LLM frameworks in financial trading.

Introduction

Autonomous agents equipped with Large Language Models (LLMs) can mimic human problem-solving in finance—an intricate domain shaped by fundamentals, market sentiment, and macro factors. While deep learning models have long struggled with explainability, LLM-based systems show promise by pairing structured reasoning with interpretability. However, current solutions often lack organizational realism and rely on purely conversational interfaces susceptible to context loss.

TradingAgents fills these gaps by emulating the multi-agent decision-making processes of trading firms. The framework includes fundamental, sentiment, news, and technical analysts, along with bullish and bearish researchers, traders, and a risk management team. They coordinate using structured documents and concise dialogues. Our architecture leverages specialized LLM roles, combining clarity with deeper debates. Through extensive evaluations, TradingAgents delivers robust performance across multiple assets, validating the importance of multi-agent collaboration for real-world trading systems.

Related Work

LLMs as Financial Assistants

Specialized LLMs in finance have improved domain understanding via fine-tuning or from-scratch training on financial corpora (e.g., FinGPT, BloombergGPT). These models often excel at classification tasks but face challenges in generative quality compared to powerful general-purpose models like GPT-4.

Fine-Tuned LLMs for Finance

Fine-tuning boosts performance on tasks such as financial sentiment analysis. Examples include PIXIU (FinMA) and Instruct-FinGPT. They outperform generic open-source LLMs but still lag behind top-tier proprietary models in some generative tasks.

Finance LLMs Trained from Scratch

Models like BloombergGPT and XuanYuan 2.0 blend general corpora with specialized financial data, delivering strong domain-specific results. While they may not match larger closed-source models, they remain competitive among open-source counterparts.

Figure 1: TradingAgents Overall Framework Organization. I. Analysts Team: Four analysts concurrently gather relevant market information. II. Research Team: The team discusses data. III. Trader: Makes final decisions using debates and history. IV. Risk Management Team: Monitors risk. V. Fund Manager: Approves and executes trades.

LLMs as Traders

LLMs directly executing trades often rely on news-driven or reasoning-driven prompts, sometimes enhanced by reinforcement learning. Debate and reflection modules help overcome hallucinations and bolster factual accuracy.

News-Driven Agents

These agents use market news to gauge sentiment. Both closed-source (GPT-4) and open-source (Qwen) models show promising gains via simple sentiment-driven strategies.

Reasoning-Driven Agents

Frameworks like FinMem and TradingGPT integrate multi-round reasoning, reflection, and debates between agents with different stances, enabling more robust trading signals.

Reinforcement Learning-Driven Agents

RL aligns LLM outputs with backtest rewards, often leveraging memorized states and technical signals to refine decision-making.

LLMs as Alpha Miners

Some frameworks focus on generating alpha factors rather than final trades. Systems like QuantAgent and AlphaGPT iteratively refine alpha scripts through feedback from an LLM-based judge and real-market performance, accelerating systematic strategy development.

TradingAgents: Role Specialization

TradingAgents assigns each LLM agent a clear role. This mirrors how real trading firms split responsibilities—e.g., fundamental, sentiment, news, and technical analysts gather data, while researchers balance bullish and bearish arguments. A trader synthesizes these inputs, and risk managers ensure exposures stay within safe limits. This structured approach fosters comprehensive coverage of market signals.

Analyst Team

The analyst team (Figure 2) covers fundamental, sentiment, news, and technical aspects. Each member focuses on different market signals, providing the basis for research and trading decisions.

Figure 2: TradingAgents Analyst Team
  • Fundamental Analysts: Evaluate intrinsic value via earnings, balance sheets, etc.
  • Sentiment Analysts: Analyze social media and public sentiment data.
  • News Analysts: Track macro events, economic indicators, and other critical news.
  • Technical Analysts: Calculate metrics like MACD/RSI to identify trends and patterns.
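
To make the technical analyst's indicators concrete, the following is a minimal sketch of MACD and RSI in plain Python. The 12/26/9 and 14-period defaults are conventional choices, not parameters confirmed by the paper, and the function names are hypothetical. The RSI variant shown uses simple averages over the last window rather than Wilder smoothing.

```python
from typing import List, Sequence, Tuple


def ema(values: Sequence[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(closes: Sequence[float], fast: int = 12, slow: int = 26,
         signal: int = 9) -> Tuple[List[float], List[float]]:
    """Return (macd_line, signal_line) for a series of closing prices."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return macd_line, ema(macd_line, signal)


def rsi(closes: Sequence[float], period: int = 14) -> float:
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    window = closes[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)
```

A positive MACD line indicates that the fast average sits above the slow one, and RSI values near 100 or 0 indicate persistent gains or losses over the window.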

Researcher Team

(Figure 3) Bullish and bearish researchers debate the analysts’ findings, challenging each other’s viewpoints to produce a balanced outcome.

Figure 3: TradingAgents Researcher Team
  • Bullish Researchers: Highlight favorable signals and positive growth opportunities.
  • Bearish Researchers: Emphasize caution, identifying risks or negative signals.

Figure 4: TradingAgents Trader Decision-Making Process
Figure 5: TradingAgents Risk Management and Fund Manager Workflow

Trader Agents

(Figure 4) Trader agents synthesize all insights to form buy/sell decisions, weighing returns against potential downside.

  • Review data from analysts and researchers.
  • Determine optimal trade timing and size.
  • Execute orders and manage portfolios.

Risk Management Team

(Figure 5) Risk managers ensure safety by evaluating volatility, liquidity, and other exposures. They enforce stop-loss measures and signal portfolio rebalancing when necessary.

  • Monitor market risk factors.
  • Adjust trading strategies to stay within risk limits.
  • Collaborate with traders to manage drawdowns.
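
As a rough illustration of the kind of check a risk manager agent might run, the sketch below flags a stop-loss breach and excessive volatility for a long position. The thresholds, the `RiskLimits` type, and `review_position` are hypothetical; the paper does not specify concrete limits.

```python
import statistics
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class RiskLimits:
    max_loss_pct: float = 5.0       # hypothetical stop-loss threshold
    max_daily_vol_pct: float = 3.0  # hypothetical cap on daily return volatility


def review_position(entry_price: float, closes: Sequence[float],
                    limits: RiskLimits = RiskLimits()) -> List[str]:
    """Return the limits breached by a long position (empty list: within limits)."""
    breaches = []
    # Stop-loss: percentage loss of the latest close relative to the entry price.
    loss_pct = (entry_price - closes[-1]) / entry_price * 100
    if loss_pct >= limits.max_loss_pct:
        breaches.append("stop_loss")
    # Volatility: sample standard deviation of simple daily returns.
    returns = [b / a - 1 for a, b in zip(closes, closes[1:])]
    if len(returns) >= 2 and statistics.stdev(returns) * 100 > limits.max_daily_vol_pct:
        breaches.append("volatility")
    return breaches
```

A non-empty result would be the signal for the trader to cut or rebalance the position.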

All agents follow a ReAct-style prompting framework. Their actions—like research, debate, or trade execution—are tracked in a shared environment, creating a cohesive multi-agent ecosystem reminiscent of real trading firms.

TradingAgents: Agent Workflow

Communication Protocol

Relying solely on natural language can lead to “telephone effect” issues for complex, long-horizon tasks. TradingAgents introduces structured reports to preserve key details and reduce message distortion, drawing inspiration from frameworks like MetaGPT. Each agent produces or queries structured entries—concise and focused—to streamline interactions.
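
One way to picture a structured report is as a typed record that agents publish and query by key, instead of re-reading a long conversation. The sketch below is an assumption about what such an entry could look like; the field names, `AnalystReport`, and `ReportStore` are hypothetical and not taken from the paper.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple


@dataclass
class AnalystReport:
    role: str      # e.g. "fundamental", "sentiment", "news", "technical"
    ticker: str
    as_of: str     # ISO date the report refers to
    summary: str
    key_points: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, payload: str) -> "AnalystReport":
        return cls(**json.loads(payload))


class ReportStore:
    """Shared store: agents publish reports and query them by (role, ticker, date)."""

    def __init__(self) -> None:
        self._reports: Dict[Tuple[str, str, str], str] = {}

    def publish(self, report: AnalystReport) -> None:
        key = (report.role, report.ticker, report.as_of)
        self._reports[key] = report.to_json()

    def query(self, role: str, ticker: str, as_of: str) -> AnalystReport:
        return AnalystReport.from_json(self._reports[(role, ticker, as_of)])
```

Because each report is serialized once and read back verbatim, details are not paraphrased away as they pass between agents.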

Types of Agent Interactions

Instead of lengthy dialogues, TradingAgents agents exchange structured documents containing critical data. Short natural language debates occur when merging contrasting opinions (e.g., bullish vs. bearish). Key communication types include:

  • Analyst Team: Each analyst produces specialized reports (fundamentals, sentiment, etc.).
  • Traders: Combine analyst reports into a decision signal with accompanying rationale.

Debates among researchers or risk managers occur in natural language but are recorded as structured entries. This approach maintains clarity while enabling multi-round reasoning.
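
The debate-then-record pattern can be sketched as a loop in which each side speaks in turn and every argument is appended to a structured transcript. This is a minimal sketch: `DebateEntry`, `run_debate`, and the stub speakers are hypothetical, and in a real system each speaker would wrap an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DebateEntry:
    round_no: int
    stance: str    # "bullish" or "bearish"
    argument: str  # natural-language text produced by the speaker


# A speaker receives the topic and the transcript so far and returns an argument.
Speaker = Callable[[str, List[DebateEntry]], str]


def run_debate(topic: str, bull: Speaker, bear: Speaker,
               rounds: int = 2) -> List[DebateEntry]:
    """Alternate bullish and bearish turns, recording each as a structured entry."""
    transcript: List[DebateEntry] = []
    for round_no in range(1, rounds + 1):
        for stance, speak in (("bullish", bull), ("bearish", bear)):
            argument = speak(topic, transcript)
            transcript.append(DebateEntry(round_no, stance, argument))
    return transcript


# Stub speakers standing in for LLM-backed researchers (assumption).
def stub_bull(topic: str, history: List[DebateEntry]) -> str:
    return f"Upside case for {topic} after {len(history)} prior entries"


def stub_bear(topic: str, history: List[DebateEntry]) -> str:
    return f"Downside case for {topic} after {len(history)} prior entries"
```

The returned transcript is the structured artifact a downstream agent, such as the trader, would read.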

Backbone LLMs

We employ both “quick-thinking” and “deep-thinking” LLMs, choosing models based on complexity and speed requirements. Analysts and traders use robust reasoning models for decision-making, while simpler tasks (e.g., data retrieval) rely on faster LLMs. This modular design, requiring no GPUs, allows easy swapping of different local or API-based models and ensures future scalability.
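
The quick-thinking versus deep-thinking split can be expressed as a small routing table. The sketch below is illustrative only: the task labels, `BackboneConfig`, and the placeholder model names are assumptions, not the models or categories used in the paper.

```python
from dataclasses import dataclass

# Hypothetical task labels that warrant a deep-thinking model.
DEEP_TASKS = frozenset({"analysis", "debate", "trade_decision"})


@dataclass(frozen=True)
class BackboneConfig:
    quick_model: str  # fast model for retrieval, summarization, formatting
    deep_model: str   # reasoning-heavy model for decisions


def pick_model(task: str, config: BackboneConfig) -> str:
    """Route a task to the deep model if it is reasoning-heavy, else the quick one."""
    return config.deep_model if task in DEEP_TASKS else config.quick_model


# Placeholder names; swap in any local or API-based model identifiers.
config = BackboneConfig(quick_model="quick-model-placeholder",
                        deep_model="deep-model-placeholder")
```

Keeping the choice in one function is what makes swapping backbones a configuration change rather than a code change.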

Experiments

We evaluate our framework on a multi-asset dataset that combines historical prices, news, social sentiment, insider transactions, and more. For comparison, we use a market baseline (Buy-and-Hold) and traditional rule-based strategies such as MACD and SMA.

Back Trading

Our dataset includes stocks like Apple and Google, daily news, social media sentiment, and technical indicators. Agents process only the data available up to each trading day, avoiding look-ahead bias.
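
Avoiding look-ahead bias amounts to truncating every data source at the current trading day before an agent sees it. A minimal sketch, assuming records are sorted by date and that the trading day itself is visible (use `bisect_left` for a strictly-before cut); `visible_history` is a hypothetical helper:

```python
from bisect import bisect_right
from datetime import date
from typing import List, Tuple

Record = Tuple[date, float]  # (observation date, value), sorted by date


def visible_history(records: List[Record], trading_day: date) -> List[Record]:
    """Return only the records dated on or before `trading_day`."""
    dates = [d for d, _ in records]
    return records[:bisect_right(dates, trading_day)]
```

The same cut would apply to news, sentiment, and insider-transaction records, each keyed by its publication date.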

Simulation Setup

The simulation runs from June 19, 2024, to November 19, 2024. TradingAgents autonomously generates buy, sell, or hold signals, then records performance metrics. This daily cycle repeats for each asset under study.
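
The daily cycle can be sketched as a loop that asks a decision function for a signal using only past data and then marks the portfolio to market. This is a simplified, long-only, all-in/all-out sketch with no transaction costs; those simplifications and the `run_backtest` name are assumptions, not the paper's simulator.

```python
from typing import Callable, List, Sequence

# A decision function maps the price history so far to "buy", "sell", or "hold".
Decide = Callable[[Sequence[float]], str]


def run_backtest(closes: Sequence[float], decide: Decide,
                 initial_cash: float = 1.0) -> List[float]:
    """Return the portfolio value at each day's close."""
    cash, shares = initial_cash, 0.0
    values: List[float] = []
    for t, price in enumerate(closes):
        signal = decide(closes[: t + 1])  # only data up to and including day t
        if signal == "buy" and cash > 0:
            shares, cash = cash / price, 0.0
        elif signal == "sell" and shares > 0:
            cash, shares = shares * price, 0.0
        values.append(cash + shares * price)
    return values
```

The resulting value series is the input from which per-asset performance metrics are computed.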

Baseline Models

We benchmark against several baselines:

  • Buy and Hold
  • MACD
  • KDJ and RSI
  • ZMR
  • SMA
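
For intuition about the rule-based baselines, here is a minimal sketch of an SMA crossover rule. The 5-day and 20-day windows and the exact signal rule are assumptions chosen for illustration; the paper's baseline parameters are not stated here.

```python
from typing import Sequence


def sma(values: Sequence[float], window: int) -> float:
    """Simple moving average of the last `window` values."""
    return sum(values[-window:]) / window


def sma_signal(closes: Sequence[float], short: int = 5, long: int = 20) -> str:
    """Return "buy" when the short SMA is above the long SMA, "sell" when below."""
    if len(closes) < long:
        return "hold"  # not enough history to compute the long average
    short_avg = sma(closes, short)
    long_avg = sma(closes, long)
    if short_avg > long_avg:
        return "buy"
    if short_avg < long_avg:
        return "sell"
    return "hold"
```

Rules of this kind react only to price history, which is why they serve as a useful contrast to agents that also read news and sentiment.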

Evaluation Metrics

(a) Cumulative Returns on AAPL
(b) TradingAgents Transactions for AAPL. Green / Red Arrows for Long / Short Positions.

                                   AAPL                          GOOGL                         AMZN
Categories       Models            CR%↑  ARR%↑    SR↑  MDD%↓     CR%↑  ARR%↑    SR↑  MDD%↓     CR%↑  ARR%↑    SR↑  MDD%↓
Market           B&H              -5.23  -5.09  -1.29  11.90     7.78   8.09   1.35  13.04     17.1   17.6   3.53   3.80
Rule-based       MACD             -1.49  -1.48  -0.81   4.53     6.20   6.26   2.31   1.22        -      -      -      -
Rule-based       KDJ&RSI           2.05   2.07   1.64   1.09      0.4    0.4   0.02   1.58    -0.77  -0.76  -2.25   1.08
Rule-based       ZMR               0.57   0.57   0.17   0.86    -0.58   0.58   2.12   2.34    -0.77  -0.77  -2.45   0.82
Rule-based       SMA               -3.2  -2.97  -1.72   3.67     6.23   6.43   2.12   2.34    11.01   11.6   2.22   3.97
Ours             TradingAgents    26.62   30.5   8.21   0.91    24.36  27.58   6.39   1.69    23.21  24.90   5.60   2.11
Improvement (%)                   24.57  28.43   6.57      -    16.58  19.49   4.26      -     6.10   7.30   2.07      -

Table 1: TradingAgents: Comparison of Performance Metrics across AAPL, GOOGL, and AMZN.
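
The column abbreviations in Table 1 stand for cumulative return (CR), annualized return (ARR), Sharpe ratio (SR), and maximum drawdown (MDD). The sketch below shows one common way to compute them from a daily portfolio value series. The 252-day year, the zero default risk-free rate, and the annualized form of the Sharpe ratio are assumptions; the paper's exact conventions may differ.

```python
import math
from typing import Sequence


def cumulative_return(values: Sequence[float]) -> float:
    """CR%: total percentage change from the first to the last value."""
    return (values[-1] / values[0] - 1) * 100


def annualized_return(values: Sequence[float], periods_per_year: int = 252) -> float:
    """ARR%: cumulative growth compounded to a one-year horizon."""
    n_periods = len(values) - 1
    return ((values[-1] / values[0]) ** (periods_per_year / n_periods) - 1) * 100


def sharpe_ratio(values: Sequence[float], risk_free_rate: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """SR: annualized mean excess return divided by its sample standard deviation."""
    returns = [b / a - 1 for a, b in zip(values, values[1:])]
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    variance = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    if variance == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(variance) * math.sqrt(periods_per_year)


def max_drawdown(values: Sequence[float]) -> float:
    """MDD%: largest peak-to-trough decline, as a percentage of the peak."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst * 100
```

Higher is better for CR, ARR, and SR, while lower is better for MDD, matching the arrows in the table header.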

Sharpe Ratio

TradingAgents beats all baselines in risk-adjusted returns, with Sharpe Ratios of at least 5.60 across the three stocks, surpassing the best baseline on each stock by at least 2.07 points. Its adaptability and robust debate mechanism enable high returns with controlled risk.

Maximum Drawdown

Rule-based baselines limit downside but sacrifice overall returns. TradingAgents balances both, keeping maximum drawdown at or below 2.11% on all three stocks while generating superior returns, aided by dedicated risk-control agents.

Explainability

Unlike dense deep-learning models, TradingAgents provides transparent logs of its ReAct-style reasoning for every trade decision. This approach greatly enhances human interpretability, facilitating debugging and fine-tuning in real markets.

Conclusion

We introduced TradingAgents, a multi-agent LLM trading framework inspired by professional trading firms. Its specialized analysts, researcher debates, and risk management teams create a rich decision-making ecosystem. By combining structured reports and targeted dialogues, TradingAgents exceeds baseline performance on cumulative return, annualized return, and Sharpe ratio while keeping maximum drawdown low. Future work will explore live trading, expanded agent roles, and real-time data integration for even more refined trading outcomes.