A company composed of 7 agents completed the development of a game in 7 minutes

The market's expectations for AI Agents have always been high. Beyond agents that handle individual one-way tasks, an earlier experiment by Stanford University and Google demonstrated a virtual town populated by 25 AI Agents that cooperate on their own: they make daily schedules, set up appointments, and plan events and parties in the town.

However, a recent research experiment showed that a virtual company formed by **7 AI Agents completed a full development process in 7 minutes, at a cost of about US$1.** The experiment comes from the latest paper released by researchers from Tsinghua University, Beijing University of Posts and Telecommunications, Dalian University of Technology, Brown University, and the Chinese AI startup Wall-Facing Intelligence (ModelBest).

They created a virtual company called ChatDev, composed of 7 AI Agents, whose roles are CEO, CTO, CPO, programmer, designer, tester and code reviewer. These Agents are supported by the ChatGPT 3.5 model.

Following the waterfall model of software development, the entire process is divided into 4 sequential stages: design, programming, testing and documentation. Through prompts, the researchers defined each Agent's role, its assigned tasks, the communication protocols it must follow, and the termination criteria and constraints.
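As a rough illustration of how those prompt ingredients could be bundled together, here is a minimal sketch. The `AgentSpec` class, its field names and the prompt wording are all hypothetical and are not taken from the paper or the ChatDev code; they only mirror the five items named above (role, task, communication protocol, termination criteria, constraints).

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    """Hypothetical container for the prompt ingredients named in the article."""

    role: str          # e.g. "CTO"
    task: str          # what the agent is asked to do
    protocol: str      # how it should communicate
    termination: str   # when the conversation should end
    constraints: list[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the spec as a plain-text system prompt."""
        lines = [
            f"You are the {self.role} of a virtual software company.",
            f"Task: {self.task}",
            f"Communication protocol: {self.protocol}",
            f"Termination criterion: {self.termination}",
        ]
        lines += [f"Constraint: {c}" for c in self.constraints]
        return "\n".join(lines)


# Example values are invented for illustration only.
cto = AgentSpec(
    role="CTO",
    task="Choose a programming language for the product.",
    protocol="Reply to one colleague at a time, in short messages.",
    termination="Both sides agree on a language.",
    constraints=["Do not write code in this stage."],
)
prompt = cto.to_prompt()
print(prompt)
```

Keeping the ingredients as separate fields, rather than one hand-written prompt per agent, would make it easy to reuse the same protocol and termination text across all 7 roles.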

After that, every Agent in the ChatDev company will participate in the development work at different stages. For example, the CEO, CPO, and CTO will work together in the design stage, and the programming stage will mainly involve the CTO, programmers, and designers.

The role allocation at each stage is roughly as shown in the figure below. The upper part shows how roles are assigned to tasks along the software development process, and the lower part, the Chat Chain, shows the decision-making communication and feedback process.
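The chat-chain idea can be sketched in a few lines of code: stages run in waterfall order, and inside each stage the participating roles exchange messages until a termination condition is met. This is a simplified sketch under stated assumptions, not the paper's implementation. The round-robin turn order, the `<DONE>` marker, `run_chain` and `scripted_reply` are hypothetical, and the testing line-up is an assumption because the article does not list it. The model call is replaced by a scripted stand-in so the example runs offline.

```python
from typing import Callable

# Stage -> participating roles. Design, programming and documentation follow
# the article; the testing line-up is an ASSUMPTION (the article does not list it).
CHAIN: list[tuple[str, list[str]]] = [
    ("design", ["CEO", "CPO", "CTO"]),
    ("programming", ["CTO", "programmer", "designer", "code reviewer"]),
    ("testing", ["programmer", "tester"]),  # assumption
    ("documentation", ["CTO", "programmer", "CEO", "CPO"]),
]

DONE = "<DONE>"  # hypothetical termination marker

ReplyFn = Callable[[str, str, list[str]], str]


def run_chain(reply: ReplyFn, max_turns: int = 6) -> dict[str, list[str]]:
    """Run every stage in order. Within a stage, roles speak in round-robin
    until a message contains DONE or max_turns is reached."""
    transcript: dict[str, list[str]] = {}
    for stage, roles in CHAIN:
        messages: list[str] = []
        for turn in range(max_turns):
            speaker = roles[turn % len(roles)]
            text = reply(stage, speaker, messages)
            messages.append(f"{speaker}: {text}")
            if DONE in text:
                break
        transcript[stage] = messages
    return transcript


def scripted_reply(stage: str, speaker: str, history: list[str]) -> str:
    """Stand-in for a model call: two proposals, then the third message agrees."""
    if len(history) >= 2:
        return f"agreed on {stage} {DONE}"
    return f"proposal for {stage}"


log = run_chain(scripted_reply)
for stage, messages in log.items():
    print(stage, len(messages))
```

The `max_turns` cap plays the role of a safety net: even if no agent ever signals completion, each stage still ends and the chain moves on.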

After the roles were assigned, the ChatDev company received an order to develop a "Gomoku" game, and everyone immediately entered the first stage, design. The CEO, CPO and CTO held a brief conversation about the product type and programming language and reached a decision. The decision-making process was very similar to what happens in reality:

  • CEO: In order to meet the customer's needs, let's discuss the product form and the programming language used to implement it.
  • CPO: I recommend developing this game as a desktop version, because then it can be played on a computer, and it will have a very friendly user interface and interactive experience. In view of our experience, I recommend using the Python language.
  • CEO: I agree with your proposal. @CTO, would you like to give me some specific advice on the choice of programming language?
  • CTO: Python is a good fit and a good choice.
  • CEO: Well, the simplicity and readability of the Python language have made it the first choice for both beginners and experienced developers.
  • CTO: Let's get started!

Afterwards, the CEO and CPO started brainstorming together to further determine the specific details of the product:

Once the details were finalized, the programming stage began. Its core is the development work and dialogue among the CTO, programmers, designers and code reviewers, who completed the game through collaboration.

After product development and testing were completed, the CTO, programmers, CEO and CPO worked together to write detailed documentation on how to interact with the software, intended for customers.

The whole process was split into more than 70 subtasks. Once the tasks were split, product development was completed in just under 7 minutes (a normal development cycle might take 2-4 weeks), and through memory and self-reflection mechanisms the agents identified and fixed some potential vulnerabilities.
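To put those totals in perspective, the reported figures can be turned into rough per-subtask upper bounds. This is back-of-the-envelope arithmetic only: it uses the rounded totals quoted in this article (about 7 minutes, about US$1, at least 70 subtasks) and assumes, purely for illustration, that time and cost are spread evenly across subtasks.

```python
TOTAL_SECONDS = 7 * 60   # "just under 7 minutes", taken as an upper bound
TOTAL_COST_USD = 1.00    # "about US$1", taken as an upper bound
SUBTASKS = 70            # "more than 70 subtasks", taken as a lower bound

# Dividing upper-bound totals by a lower-bound count gives upper bounds on the averages.
seconds_per_subtask = TOTAL_SECONDS / SUBTASKS
cents_per_subtask = 100 * TOTAL_COST_USD / SUBTASKS

print(f"at most about {seconds_per_subtask:.1f} s "
      f"and {cents_per_subtask:.2f} cents per subtask")
```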

The researchers compiled statistics on the cost of each stage of the development process. Besides taking less than 7 minutes on average, the entire process cost less than 1 US dollar, demonstrating very high software development efficiency. The following is a simple implementation process of this game product:

According to the paper, 86.66% of the generated software systems executed flawlessly. Of the failed cases, 50% were caused by the API's token length limit, which prevents the complete source code from fitting within the permitted length for code generation. This challenge is particularly evident when dealing with complex software systems or tasks that require extensive code generation.

The other 50% of failures were mainly caused by external dependency issues: problems occur when a dependency cannot be found in the cloud or has the wrong version. Overall, though, the experiment was relatively successful. In the near future, many of our jobs may be able to rely on AI Agents.
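The failure figures above are shares of the failed cases, not of all runs. A short calculation converts them into percentages of all runs, using only the numbers quoted in this article; the helper name `failure_breakdown` is hypothetical.

```python
def failure_breakdown(success_pct: float, shares: dict[str, float]) -> dict[str, float]:
    """Convert each cause's share of failures into a percentage of all runs."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("failure shares must sum to 1")
    failed_pct = 100.0 - success_pct
    return {cause: round(failed_pct * share, 2) for cause, share in shares.items()}


breakdown = failure_breakdown(
    86.66,  # success rate quoted from the paper
    {"token length limit": 0.5, "external dependencies": 0.5},
)
for cause, pct in breakdown.items():
    print(f"{cause}: {pct}% of all runs")
```

In other words, with 13.34% of runs failing overall, each of the two causes accounts for roughly 6.67% of all runs.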

The full paper can be viewed here:

The public code can be viewed on GitHub:
