diff --git a/README.md b/README.md index 2d94454..d449d62 100644 --- a/README.md +++ b/README.md @@ -2,4 +2,4 @@ > MARW Workshop, AAAI 2025 > -> Homepage: https://TradingAgents.github.io/ +> Homepage: https://TradingAgents-AI.github.io/ diff --git a/index.html b/index.html index e52ee51..08f5daf 100644 --- a/index.html +++ b/index.html @@ -114,7 +114,7 @@ Lack of Realistic Organizational Modeling: Many frameworks fail to capture the complex interactions between agents that mimic the structure of real-world trading firms Li et al., 2023, Wang et al., 2024, Yu et al., 2024. Instead, they focus narrowly on specific task performance, often disconnected from the organizational workflows and established human operating procedures proven effective in trading. This limits their ability to fully replicate and benefit from real-world trading practices. - Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data. + Inefficient Communication Interfaces: Most existing systems use natural language as the primary communication medium, typically relying on message histories or an unstructured pool of information for decision-making Park et al., 2023, Qian et al., 2024. This approach often results in a "telephone effect", where details are lost, and states become corrupted as conversations lengthen. Agents struggle to maintain context and track extended histories while filtering out irrelevant information from previous decision steps, diminishing their effectiveness in handling complex, dynamic tasks. Additionally, the unstructured pool-of-information approach lacks clear instructions, forcing logical communication and information exchange between agents to depend solely on retrieval, which disrupts the relational integrity of the data.
In this work, we address these key limitations of existing models by introducing a system that overcomes these challenges. First, our framework bridges the gap by simulating the multi-agent decision-making processes typical of professional trading teams. It incorporates specialized agents tailored to distinct aspects of trading, inspired by the organizational structure of real-world trading firms. These agents include fundamental analysts, sentiment/news analysts, technical analysts, and traders with diverse risk profiles. Bullish and bearish debaters evaluate market conditions to provide balanced recommendations, while a risk management team ensures that exposures remain within acceptable limits. Second, to enhance communication, our framework combines structured outputs for control, clarity, and reasoning with natural language dialogue to facilitate effective debate and collaboration among agents. This hybrid approach ensures both precision and flexibility in decision-making.
@@ -316,11 +316,6 @@In this section, we describe the experimental setup used to evaluate our proposed framework. We also provide detailed descriptions of the evaluation metrics employed to assess performance comprehensively.
-
- To simulate a realistic trading environment, we utilize a multi-asset and multi-modal financial dataset comprising of various stocks such as Apple, Nvidia, Microsoft, Meta, Google, and more. The dataset includes:
@@ -351,44 +346,74 @@
-
+ To thoroughly evaluate the performance of our TradingAgents framework, we use widely recognized metrics to assess the risk management, profitability, and safety of the TradingAgents strategy in comparison to baseline approaches. Here we describe these metrics:
- -The cumulative return measures the total return generated over the simulation period. It is calculated as:
-- CR = ((Vend - Vstart) / Vstart) × 100% -
-where Vend is the portfolio value at the end of the simulation, and Vstart is the initial portfolio value.
- -The annualized return normalizes the cumulative return over the number of years:
-- AR = (((Vend / Vstart)^(1/N)) - 1) × 100% -
-where N is the number of years in the simulation.
- -The Sharpe ratio measures risk-adjusted return by comparing a portfolio's excess return over the risk-free rate to its volatility:
-- SR = (R̄ - Rf) / σ -
-where R̄ is the average portfolio return, Rf is the risk-free rate (e.g., yield of 3-month Treasury bills), and σ is the standard deviation of the portfolio returns.
- -Maximum drawdown measures the largest peak-to-trough decline in the portfolio value:
-- MDD = maxt ∈ [0, T] ((Peakt - Trought) / Peakt) × 100% -
-
-
+
+ | Metric | +RNA Sequence | +Modality Fusion | +RNA-GPT | +||||||
|---|---|---|---|---|---|---|---|---|---|
| + | SBERT | +SPub | +SGPT | +SBERT | +SPub | +SGPT | +SBERT | +SPub | +SGPT | +
| Precision | +0.7372 | 0.5528 | 0.5219 | +0.6929 | 0.6507 | 0.6655 | +0.8602 | 0.7384 | 0.7848 | +
| Recall | +0.7496 | 0.5270 | 0.5474 | +0.8028 | 0.6082 | 0.6603 | +0.8404 | 0.7208 | 0.7561 | +
| F1 Score | +0.7424 | 0.5387 | 0.5339 | +0.7403 | 0.6283 | 0.6627 | +0.8494 | 0.7293 | 0.7700 | +
Table 1: TradingAgents (AIS): Comparison of RNA Sequence (left), Modality Fusion (middle), and TradingAgents (right). Embedding base models are BERT, PubMedBERT, and OpenAI's GPT text-embedding-3-large.
+ +The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.
+ +While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.
+ +A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.
+ +In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.
Table 1 and Figures 6, 7, and 8 highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.
+Table 1 and Figures (a) and (b) highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.
The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.
@@ -426,9 +451,23 @@In this paper, we introduced TradingAgents, an LLM-agent-powered stock trading framework that simulates a realistic trading firm environment with multiple specialized agents engaging in agentic debates and conversations. Leveraging the capabilities of LLMs to process and analyze diverse data sources, the framework enables informed trading decisions while utilizing multi-agent interactions to enhance performance through comprehensive reasoning and debate before acting. By integrating agents with distinct roles and risk profiles, along with a reflective agent and a dedicated risk management team, TradingAgents significantly improves trading outcomes and risk management compared to baseline models. Additionally, the collaborative nature of these agents ensures adaptability to varying market conditions. Extensive experiments demonstrate that TradingAgents outperforms traditional trading strategies and baselines in cumulative return, Sharpe ratio, and other critical metrics. Future work will focus on deploying the framework in a live trading environment, expanding agent roles, and incorporating real-time data processing to enhance performance further.
+Table 1 and Figures (a) and (b) highlight that our method significantly outperforms existing rule-based trading baselines, particularly in profitability, as measured by returns. TradingAgents achieves at least a 23.21% cumulative return and 24.90% annual return on the three sampled stocks, outperforming the best-performing baselines by a margin of at least 6.1%. Notably, on the AAPL stock—a particularly challenging case due to market volatility during the testing period—traditional methods struggled, as their patterns failed to generalize to this situation. In contrast, TradingAgents excelled even under these adverse conditions, achieving returns exceeding 26% within less than three months.
+ +The Sharpe Ratio performance highlights TradingAgents' exceptional ability to deliver superior risk-adjusted returns, consistently outperforming all baseline models across AAPL, GOOGL, and AMZN with Sharpe Ratios of at least 5.60—surpassing the next best models by a significant margin of at least 2.07 points. This result underscores TradingAgents' effectiveness in balancing returns against risk, a critical metric for sustainable and predictable investment growth. By excelling over market benchmarks like Buy-and-Hold and advanced strategies such as KDJRSI, SMA, MACD, and ZMR, TradingAgents demonstrates its adaptability and robustness in diverse market conditions. Its ability to maximize returns while maintaining controlled risk exposure establishes a solid foundation for multi-agent and debate-based automated trading algorithms.
+ +While rule-based baselines demonstrated superior performance in controlling risk, as reflected by their maximum drawdown scores, they fell short in capturing high returns. This trade-off between risk and reward underscores TradingAgents' strength as a balanced approach. Despite higher returns being typically associated with higher risks, TradingAgents maintained a relatively low maximum drawdown compared to many baselines. Its effective risk-control mechanisms, facilitated by the debates among risk-control agents, ensured that the maximum drawdown remained within a manageable limit, not exceeding 2%. This demonstrates TradingAgents' capability to strike a robust balance between maximizing returns and managing risk effectively.
+ +A significant drawback of current deep learning methods for trading is their dense and complex architectures, which often render the decisions made by trading agents indecipherable to humans. This challenge, rooted in the broader issue of AI explainability, is particularly critical for trading agents, as they operate in real-world financial markets, often involving substantial sums of money where incorrect decisions can lead to severe consequences and losses.
+ +In contrast, an LLM-based agentic framework for trading offers a transformative advantage: its operations and decisions are communicated in natural language, making them highly interpretable to humans. To illustrate this, we provide the full trading log of TradingAgents for a single day in the Appendix, showcasing its use of the ReAct-style prompting framework Yao et al., 2023. Each decision made by the agents is accompanied by detailed reasoning, tool usage, and thought processes, enabling traders to easily understand and debug the system. This transparency empowers traders to fine-tune and adjust the framework to account for factors influencing decisions, offering a significant edge in explainability over traditional deep-learning trading algorithms.