10 Representative AI-Agent Projects: How They Will Change the Internet and Reshape Web3
SOURCE: VION WILLIAMS
Explore the innovative possibilities of AI-Agents
Consensus and non-consensus of AI-Agents
AI-Agents have attracted more and more attention largely because LLMs finally provide a feasible technical route for implementing them, and because a wave of Agent-related projects has followed.
Lilian Weng has defined LLM-driven AI-Agents in her article, while DeepMind is also trying to define the concept of a unified agent. I believe the concept of AI-Agents will keep differentiating as different AI companies develop their own understanding of it.
The clearer consensus is this: LLM-driven Agents that automatically handle general problems are what we mean by AI-Agents in this cycle of the large language model explosion, and that understanding is now broadly shared.
Finding possibilities through the relevance of Agents
At the application level, at this early stage we should look at AI-Agents from the perspective of "relevance" as much as possible: be tolerant of trial and error and open to the possible forms AI-Agents might take. Do not, like some critics, look for a standardized answer from a narrow position; that is not advisable.
For example, Auto-GPT, as one such possibility, has actually inspired many Agent projects, while narrow criticism loses the chance to capture new opportunities, a common phenomenon among Chinese developers. For a developer without creativity, how far will traditional competitiveness carry you in the era of natural language programming?
Although there are already many introductions to AI-Agent projects, most of them are homogeneous listings. They tell us which projects belong to the AI-Agents direction, but they do not start from relevance to show the potential of AI-Agents in different application fields, or the ecological position that certain types of AI-Agent projects occupy.
For example, in my introduction I classify Auto-GPT, BabyAGI and MetaGPT into one ecological type, because they share the continuity of a single path.
Constructing a holistic picture from the puzzle pieces of Agents
In short, I introduce representative AI-Agent projects through the perspectives of "relevance", "ecological position" and "continuity", so that we can begin to make out the future development trend of AI-Agents.
Below, 10 representative projects appear, along with some related reference projects. I treat each case as a puzzle piece and assemble them into a relatively complete map, enough to let more people clearly realize how the potential of Agents can change everything on the Internet, including reshaping the Web3 landscape.
Two major future directions of AI-Agents
AI-Agents can be roughly divided into two directions: **Autonomous Agents** and **Generative Agents**.
Autonomous Agents, with Auto-GPT as the example, represent the ability to automatically perform tasks and reach target results from a natural-language description of the requirement. In this collaborative relationship, Autonomous Agents serve people and have clear tool attributes.
Generative Agents, with Stanford's virtual town of 25 agents as the example, are AI-Agents with personality-like characteristics, autonomous decision-making ability, and long-term memory, and lean more toward the concept of "nativeness". In this collaborative relationship, Agents have digitally native social relationships; they are not just tools serving people.
Auto-GPT
Auto-GPT is one of the best-known open source projects in this space. Its introduction on GitHub is very simple: "An experimental open-source attempt to make GPT-4 fully autonomous."
In brief, Auto-GPT can fully automate its way to a final result from a one-sentence task requirement. The core of its ability to complete tasks independently is the language model's task planning: it decomposes and analyzes the task step by step and keeps refining the execution steps. Along the way, Internet search results are fed back into the language model, and the task is further decomposed and executed.
Put plainly, **Auto-GPT completes the task by "asking itself questions and answering them", without humans having to supply prompts along the way.**
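To make the loop concrete, here is a minimal sketch of that "ask, act, feed back" cycle; `llm` and `web_search` are hypothetical stand-ins for the model call and a search tool, not Auto-GPT's actual code.

```python
# Minimal sketch of an Auto-GPT-style loop: the model plans, acts, and feeds
# observations back into the next planning step.
import json

def llm(prompt: str) -> str:
    """Call a language model and return its raw text reply (stub)."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Return search-result snippets for a query (stub)."""
    raise NotImplementedError

def run_autonomous_task(goal: str, max_steps: int = 10) -> str:
    history = []
    for _ in range(max_steps):
        plan = llm(
            f"Goal: {goal}\n"
            f"Progress so far: {json.dumps(history)}\n"
            'Reply with JSON: {"thought": ..., "action": "search" or "finish", "input": ...}'
        )
        step = json.loads(plan)
        if step["action"] == "finish":
            return step["input"]                    # the model's final answer
        observation = web_search(step["input"])     # act, then feed the result back
        history.append({"thought": step["thought"], "observation": observation})
    return llm(f"Goal: {goal}\nSummarize the best answer from: {json.dumps(history)}")
```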
project address:
BabyAGI
**BabyAGI can automatically create, prioritize and execute new tasks based on the results of previous tasks and our preset goal.** It uses natural language processing to create new tasks from the goal and stores task results in a database so that relevant information can be retrieved when needed.
BabyAGI is essentially a Python script that runs an infinite loop through the following steps: pull the first task from the task list; send it to an execution agent, which completes it with the language model; enrich the result and store it in the database; then create new tasks and reprioritize the task list based on the objective and the result of the previous task.
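A minimal sketch of such a loop, with `llm` as a hypothetical stand-in for the model call and a plain Python list standing in for the vector database:

```python
# Sketch of a BabyAGI-style loop: pull a task, execute it, store the result,
# then let the model create and reprioritize new tasks toward the objective.
from collections import deque

def llm(prompt: str) -> str:               # stand-in for a language-model call
    raise NotImplementedError

def babyagi(objective: str, first_task: str, max_iterations: int = 5):
    tasks = deque([first_task])
    results = []                            # stands in for the vector database
    for _ in range(max_iterations):
        if not tasks:
            break
        task = tasks.popleft()                                          # 1. pull the top task
        result = llm(f"Objective: {objective}\nTask: {task}\nDo it.")   # 2. execute it
        results.append({"task": task, "result": result})                # 3. store the result
        new_tasks = llm(                                                # 4. create and reprioritize
            f"Objective: {objective}\nLast result: {result}\n"
            f"Open tasks: {list(tasks)}\n"
            "Return one task per line, highest priority first."
        )
        tasks = deque(t.strip() for t in new_tasks.splitlines() if t.strip())
    return results
```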
Auto-GPT and BabyAGI both represent the early period of the current LLM explosion: our exploration of LLM-based AGI. A general-purpose, LLM-driven task-solving processor is, I think, the future holy grail of the AI-Agents field.
Generative Agents
The paper "Generative Agents: Interactive Simulacra of Human Behavior", published by Stanford and Google researchers, is already a very well-known AI-Agent project. In short, the study placed 25 AI agents in a pixel-style virtual town, where the agents simulate the interactions of everyday human life, interact with the town environment, and can also interact with humans outside the virtual world.
The paper has two key designs most worthy of our attention:
1. The generative agent architecture
2. The memory stream
Built on these two components, the overall behavior of a generative agent divides into three parts: memory and retrieval, reflection, and planning and reaction. For details, please refer to the original paper.
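As a rough illustration of the memory-and-retrieval part: the paper ranks memories by recency, importance, and relevance. The sketch below assumes equal weights and an exponential recency decay; `Memory` and `retrieve` are hypothetical containers, not the paper's code.

```python
# Hedged sketch of memory-stream retrieval in the Generative Agents style:
# each memory is scored by recency (exponential decay), importance (an
# LLM-assigned 1-10 rating), and relevance (embedding similarity), and the
# top-ranked memories are handed back to the agent's prompt.
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    hours_since_access: float
    importance: float        # 1-10, assigned once by the language model
    relevance: float         # similarity between memory and current query embeddings, in [0, 1]

def retrieval_score(m: Memory, decay: float = 0.995) -> float:
    recency = decay ** m.hours_since_access
    importance = m.importance / 10.0            # normalize to [0, 1]
    return recency + importance + m.relevance   # equal weights assumed here

def retrieve(memories: list[Memory], k: int = 5) -> list[Memory]:
    return sorted(memories, key=retrieval_score, reverse=True)[:k]
```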
The paper and its experiment verify that LLM-based agents can generate interactive behavior believable enough to simulate human interaction in a digital environment. Generative agents can play a role in many digital environments, especially in the human-computer interaction relationships that form between generative agents and human beings.
Most intuitively, a generative agent can be created as a native digital resident of the metaverse, interacting in various ways with the human metaverse environment. In fact, we could simulate a highly developed digital world populated by AI-Agents, and humans could extract the results of the Agents' digital labor from that world.
How Agents Become Working Partners
Because "Agent" is, in many contexts, translated with a word that means "intermediary", it is easily associated with the role of a middleman, which makes it hard for many people to intuitively picture the application scenarios of Agents. The three cases below show, respectively, how Agents can become hireable "human experts", how an automated marketing company can run with no human participation at all, and how Agents can form a team and collaborate with each other.
In the examples below, NexusGPT can create multiple expert employees, GPTeam can organize them into a team hired by humans, and such an AI team can work inside a fully automated company like AutoCorp. When we put these puzzle pieces together, we can feel intuitively that the future has arrived.
NexusGPT
NexusGPT bills itself as the world's first AI freelancer platform, built by the independent developer Assem. It is based on the LangChain framework, using the GPT-3.5 API and Chroma (an AI-native open source embedding database), and the platform hosts over eight hundred AI agents with specific skills.
All of this, however, relies on OpenAI's function calling and on LangChain.
NexusGPT represents a future business model in which humans hire Agents. The project still has plenty of room to grow, for example combining Agents with expert modules (expert systems and expert models), or pricing the hire of an Agent by its token consumption. Such changes will reshape the traditional labor market for hiring, and will also change how DAOs collaborate.
AutoCorp
Created in 5 hours by Mina Fahmi and their team during the New York GPT/LLM Hackathon, AutoCorp is a fully autonomous brand-marketing company. It automatically creates brand advertisements and product designs for a direct-to-consumer T-shirt company; when customers raise new needs, AutoCorp updates its theme, generates new design assets, and keeps iterating toward a better business direction.
The description above is drawn from Mina Fahmi's Twitter; their purpose in building AutoCorp was to push the concept of "autonomy" to the extreme.
In fact, the purpose of AutoCorp is highly consistent with that of a DAO. **If the ultimate goal of a decentralized organization is to remove the "human" factor, then fully automated production and operations is a reasonable evolution of the DAO concept.** AutoCorp, in effect, represents the future business direction of DAOs.
GPTeam
GPTeam is an open source multi-agent simulation system. GPTeam leverages GPT-4 to create multiple agents that cooperate to achieve predefined goals. The main goal of this project is to explore the potential of GPT models in improving multi-agent productivity and effective communication.
project address:
There are many open source projects like GPTeam. Dev-GPT, for instance, is an automated development team that builds customized microservices for users, made up of three virtual roles: product manager, developer, and DevOps. Dev-GPT's core technical idea is to identify and test effective task strategies; if a strategy fails 10 times in a row, it switches to the next one, as sketched below.
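The retry-then-switch idea is simple enough to sketch; everything here except the ten-attempt threshold is a hypothetical stand-in rather than Dev-GPT's actual implementation.

```python
# Sketch of "try a strategy, switch after repeated failure". `strategies` are
# callables standing in for alternative prompting or tool-use approaches, each
# returning a result dict; the 10-attempt threshold mirrors the description above.
def solve_with_fallback(task, strategies, max_attempts_per_strategy: int = 10):
    for strategy in strategies:
        for _ in range(max_attempts_per_strategy):
            result = strategy(task)
            if result.get("success"):
                return result
    return {"success": False, "reason": "all strategies exhausted"}
```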
All of this keeps bringing me back to the DAO: an automated task-collaboration organization built on automated governance logic.
How Agents replace repetitive work
Long before AI replaces our jobs entirely, having Agents take over most of today's repetitive labor is the next business direction for Agents. Before LLM-based Agents appeared, RPA (Robotic Process Automation) was the industry's first attempt at a solution, but traditional RPA has a high barrier to entry and never reached the general public. RPA is a patch for the lack of automation in traditional IT interaction logic, whereas today's Agents can deliver what RPA was needed for through plain natural-language communication.
The following two projects show how LLM-based Agents will free us from repetitive labor in daily work and academic research (and in fact their potential goes well beyond that).
Cheat Layer
"Automatee your business Using Natural Language", using natural language to automate your business, this is the brand slogan of Cheat Layere. Cheat layer solves impossible business automation problems through custom-trained GPT-4 machine learning models, serving as AI software engineers for each user.
Running as a Google Chrome extension, Cheat Layer automates operations across entire webpages through natural language; most of our routine actions on a webpage can, in fact, be performed automatically. Cheat Layer naturally brings RPA (robotic process automation) to mind, and there has been plenty of discussion about the relationship between Agents and RPA; that traditional RPA is being superseded by Agents is an indisputable fact.
With Cheat Layer you set up business-process automation in natural language and use Project Atlas Agents to manage the different automation flows. In general, we can create an Agent in natural language to manage the automated execution of a given business, and iteratively improve that Agent as the business grows more complex.
GPT Researcher
GPT Researcher is a GPT-based autonomous agent capable of conducting comprehensive online research on any given topic. The project's introduction on GitHub reads:
"The agent is capable of generating detailed, objective, and unbiased research reports with customization options to focus on relevant resources, outlines, and lessons. Inspired by AutoGPT and a recent Plan-and-Solve paper, GPT Researcher solves the speed and deterministic problems, by parallelizing agent work rather than synchronous operations, providing more stable performance and faster speed."
GPT Researcher's architecture runs on two kinds of agents: **a "planner" and "executors".** The planner generates research questions; the executors find information relevant to each generated question; finally, the planner filters and summarizes all of the relevant information and produces a research report. The flow, sketched in code after the list below, is:
Generate a set of research questions that together form an objective opinion about any given task.
For each research question, trigger a crawler agent to scrape information relevant to the given task from online resources.
For each crawled resource, summarize the relevant information and keep track of its source.
Finally, all the summarized resources are screened and aggregated, and the final research report is generated.
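To make the split concrete, here is a hedged sketch of the planner/executor pattern, using asyncio to stand in for the parallelized agent work; `llm` and `search_and_summarize` are hypothetical stubs, not GPT Researcher's actual functions.

```python
# Sketch of the planner/executor split: a planner drafts research questions,
# executor agents research them concurrently, and the planner aggregates the
# sourced summaries into a report.
import asyncio

async def llm(prompt: str) -> str:
    raise NotImplementedError

async def search_and_summarize(question: str) -> str:
    """Crawl online sources for one question and return a sourced summary (stub)."""
    raise NotImplementedError

async def research(topic: str) -> str:
    questions_text = await llm(
        f"List research questions that together give an objective view of: {topic}"
    )
    questions = [q.strip() for q in questions_text.splitlines() if q.strip()]
    # Executors run concurrently rather than one after another.
    summaries = await asyncio.gather(*(search_and_summarize(q) for q in questions))
    return await llm(
        "Write a research report from these sourced summaries:\n\n" + "\n\n".join(summaries)
    )

# asyncio.run(research("impact of LLM-based agents on Web3"))
```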
Features of this project
Generate research, outlines, resources and lessons learned reports
Each study aggregates more than 20 internet sources to form objective and factual conclusions
Includes an easy-to-use web interface (HTML/CSS/JS)
JavaScript-enabled web scraping
Log and track contextual information about web sources visited and used
Export research reports to formats such as PDF...
Although GPT Researcher is a GPT-based academic research tool released for academic purposes under the MIT license, from a content-creation perspective the project has high commercial value. Applied to business analysis reports, for example, it can save a great deal of time; turned into an AI-Agent for in-depth content writing, it could also completely change the landscape of the content and media industry.
project address:
AI-Agents' infrastructure ecology
The obvious future is that collaboration will no longer be only human-to-human but human-to-AI-Agent, and everyone will have as many AI-Agents as possible helping them handle as many tasks as possible, forming a vast and complex intelligent structure of social collaboration.
The collaboration between humans and Agents differs from the human-tool collaboration described in earlier social science theory. The key is that Agents, as a kind of human-like intelligence, have a degree of independent decision-making, so human trust in Agents becomes a central issue, not because of any self-awareness on the Agents' part, but because of the influence Agents exert on social interaction when they make decisions on humans' behalf.
Considering these two propositions, we have to recognize that letting people create their own AI-Agents efficiently and conveniently, giving those Agents more powerful capabilities, and keeping them reliable and trustworthy all depend on good infrastructure. The three projects below, I think, represent the direction in which future AI-Agents infrastructure will be built.
LangChain
LangChain is a framework for developing applications powered by language models. It enables the following:
Data-aware: Connect language models to other data sources
Agent: Allows a language model to interact with its environment.
The main value of LangChain lies in:
Component: Provides abstractions for working with language models, and provides a series of implementations for each abstraction. These components are modular and easy to use, whether you use the rest of the LangChain framework or not.
Ready-made chains: A structured set of components for implementing specific high-level tasks.
Ready-made chains make it easy to get started quickly. For more complex applications and granular use cases, components make it easy to customize existing chains or build new ones.
LangChain provides standard, extensible interfaces and external integrations through the following modules:
Model I/O: interfaces for interacting with language models
Data connection: interfaces for application-specific data sources
Chains: constructing sequences of calls
Agents: letting a chain choose which tools to use given high-level directives
Memory: persisting application state between runs of a chain
Callbacks: logging and streaming the intermediate steps of any chain
Thanks to LangChain's relatively active developer ecosystem in the English-speaking community, there are already many examples of Agent applications built with LangChain. Defining an Agents framework and offering zero-code development on top of it is a future trend.
Within such a framework, building Agents becomes like assembling Lego bricks. Unlike the modularization of Web3, Agent modules do not have to be off-the-shelf: ordinary people can develop specific components through natural language programming and add them into the Agents framework.
For example, many people use LangChain to build chatbots; by writing a tone-conversion component through natural language programming and adding it to the chatbot, the default conversational tone can be changed into one that matches the user's own preferences.
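As a rough illustration of such a component, the sketch below wires a tone-conversion prompt into a chain using an early (0.0.x-era) LangChain API; newer releases have reorganized these imports, so treat it as a sketch under that assumption rather than current usage.

```python
# Hedged sketch of a tone-conversion "component" built with an older LangChain API.
# Requires OPENAI_API_KEY in the environment.
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

tone_prompt = PromptTemplate(
    input_variables=["reply", "tone"],
    template="Rewrite the following chatbot reply in a {tone} tone, keeping the meaning:\n\n{reply}",
)

# A small component that can be chained after the main chatbot chain.
tone_chain = LLMChain(llm=ChatOpenAI(temperature=0.7), prompt=tone_prompt)

# rewritten = tone_chain.run(reply="Your order has shipped.", tone="warm and casual")
```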
The lesson LangChain offers is that a no-code Agent development framework, plus component modules written through natural language programming, may be the essential toolchain for Agents to reach the mainstream.
Transformers Agents
Transformers Agents is an AI-Agent system launched by Hugging Face. Its current functionality is still limited, but the key reason to keep an eye on it is that Hugging Face is an open source community with an enormous model library.
Transformers Agents is built on top of the Transformers library, adding a natural-language API: Hugging Face defines a curated set of tools and designs an agent that interprets natural language and uses those tools. Most importantly, the system is designed to be extensible.
In other words, Transformers Agents starts with a small number of carefully prepared tools to prove the system's feasibility, and its extensibility means it can eventually draw freely on Hugging Face's enormous library of models as tools.
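For a feel of the interface, the sketch below assumes the `HfAgent` entry point that shipped with the initial Transformers Agents release; the inference endpoint and API surface have since evolved, so treat it as illustrative only.

```python
# Sketch assuming the early Transformers Agents interface (HfAgent).
from transformers import HfAgent

# The agent asks a code-generating LLM to interpret the request and call
# curated tools (captioning, translation, summarization, speech, ...).
agent = HfAgent("https://api-inference.huggingface.co/models/bigcode/starcoder")

# Extra inputs are passed as keyword arguments alongside the natural-language task.
summary = agent.run(
    "Summarize the following text.",
    text="LLM-driven agents interpret natural language and call tools to act.",
)
```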
Realizing that vision would be exciting, but at the current stage I am still waiting for Transformers Agents to put forward an impressive agent framework that can absorb the influx of developers into this gold-mine of an ecosystem; Hugging Face may well have adjusted its development strategy accordingly.
WebArena
**WebArena is a self-contained, self-hosted web environment for building autonomous agents.** It recreates websites from four popular categories, with functionality and data that mimic their real-world counterparts.
To simulate human problem solving, WebArena also embeds tools and knowledge resources as stand-alone websites. WebArena introduces a benchmark for interpreting high-level real-world natural language commands into concrete web-based interactions. The researchers provided annotated programs for programmatically verifying the functional correctness of each task.
Overview of the cited paper:
"Current agents are primarily created and tested in simplified synthetic environments, which largely limit the representation of real-world scenarios. In this paper, we build an agent command-and-control environment that is highly Realistic and reproducible. Specifically, we focused on agents performing tasks on the web and created an environment that includes fully functional websites in four common areas: e-commerce, social forum discussions, collaborative software development, and content management .Our environment is rich and diverse, including some tools (such as maps) and external knowledge bases (such as user manuals) to encourage human-like task solving.
Based on our environment, we publish a set of benchmark tasks that focus on evaluating the functional correctness of task completion. The tasks in our benchmark are diverse and span a long time, and are designed to simulate tasks frequently performed by humans on the Internet. We design and implement several autonomous agents, integrating state-of-the-art techniques such as think before acting.
The results show that solving complex tasks is challenging: our best GPT-4-based agent only achieves an end-to-end task success rate of 10.59%. These results highlight the need for further development of powerful agents, current state-of-the-art language models are far from perfect on these real-world tasks, and WebArena can be used to measure such progress. "
Paper address:
This is academic work from AI researchers at Carnegie Mellon. WebArena in fact complements the well-known LangChain development stack and the various Agent-team projects: we need an Agent simulation and testing platform to ensure that Agents are robust and effective.
The platform's main function is to test the feasibility of various Agent projects. One scenario I can imagine is that when I hire an Agent on some marketplace in the future, we will first put it through a platform like WebArena to test its real working ability, which also means humans retain a say in how AI-Agents are priced.
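To make the idea concrete, here is a hedged sketch of the kind of evaluation harness such a platform implies: run an agent over a benchmark of tasks with programmatic correctness checks and report its end-to-end success rate. All names here are hypothetical, not WebArena's actual API.

```python
# Sketch of a benchmark-style evaluation harness for agents.
def evaluate_agent(agent, tasks):
    """`tasks` is a list of (instruction, checker) pairs, where `checker`
    programmatically verifies the functional correctness of the outcome."""
    passed = 0
    for instruction, checker in tasks:
        outcome = agent.run(instruction)   # agent interacts with the hosted sites
        if checker(outcome):
            passed += 1
    return passed / len(tasks)             # end-to-end task success rate

# success_rate = evaluate_agent(my_agent, benchmark_tasks)
```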
**How will AI-Agents affect everything?**
Agents-based automated collaboration network
Through the introductions and analysis above, the dozen-plus projects fit together like pieces of a jigsaw puzzle into a reasonably complete understanding of Agents. Agents are the direction that truly unlocks the potential of LLMs: the LLM is the center, and Agents give it hands and feet. Given the functional diversity of LLM-driven Agents, Agents will proliferate like a biological explosion of new species, and humans and Agents will develop a digital companionship, a symbiotic relationship.
As Agents are applied at scale, the collaborative network of human society will evolve into an automated collaboration network between humans and Agents. The production structure of human society will be upgraded, and every aspect of society will be affected and changed.
Changing Everything on the Internet
AI-Agents are changing the way we obtain, process, produce, and use information on the Internet, and changing the business models that currently depend on it. The Internet becomes an intelligent network that can communicate and execute tasks autonomously or automatically, with Agents as the intelligent medium we converse with and act through.
Reshaping the Narrative for Web3
Cryptocurrency networks will become the natural monetary network of Agents, and the computing resources consumed by the whole AI-Agent collaboration network will make tokens an important economic resource for AI. The personal data ownership that Web3 stands for will also confront a new human-computer relationship: a brand-new proposition in which humans and AI-Agents share data property rights. We may see Agents with independent property rights (a radical movement to liberate AI), DAOs fully automated by AI-Agents, and super-individuals monopolizing most of the network's data property rights and effective computing resources.
The data-rights movement under the Web3 wave has returned data ownership to individuals, yet most people do not actually hold high-value data. The return of data ownership has become a political appeal of the Web3 narrative, while ignoring the unequal production structure of an AGI society. What AI-Agents represent is that AI is not only super-productivity but also a new production relationship of human-computer interaction and automated collaboration, which forces us to reshape the narrative logic of Web3.
Accelerating the construction of the metaverse
The development of Generative Agents, the exploration of digitally native residents, and the construction of social activities for native digital humans (AI-Agents with personality traits and autonomous awareness) in a metaverse environment are, in effect, accelerating the metaverse's evolution from a digital space into a digital territory with social functions and forms. The concept of computational space will also give Agents a digital, multimodal space in which to develop, accelerating the emergence of Agents' embodied intelligence in digital environments.
Building the metaverse is then no longer solely a human task, but a task of continuous self-expansion of the living space of AI-Agents.
Beware of being held hostage by a single technology narrative
In recent years, technological hotspots have arrived one after another, and humanity seems to have entered a period of frequent technological revolutions. The three narratives of the Metaverse, Web3, and AGI have emerged in succession, which has genuinely complicated people's career choices. Because most people in the market think in terms of projects, a project's positioning is easily forced into a single category, either Web3 or AI. That is letting where you sit decide what you think, and it ignores the objective laws of how technology actually develops.
**The development of science and technology has never been fragmented; it moves toward interdisciplinary integration in dialectical unity.** For example, Web3's NFT narrative naturally aligns with the Metaverse narrative, yet in Web3's early days some people deliberately set the two against each other, a very narrow view. The same is true of today's AGI narrative: Web3 practitioners who know AI only as a tool, without thinking deeply about AGI's narrative logic, deliberately build cognitive resistance between AI and Web3. Many Web3 people's understanding of DAOs remains stuck where it started, and few have the courage to stop and rethink how AGI will influence the DAO.
Web3, the Metaverse, and AGI are three highly related directions, yet traditional mainstream technology media and investment institutions have not established a new paradigm for future technology narratives and keep influencing the market with old ones. Practitioners' resources in these directions are scattered, and their thinking is not open enough. New technology narratives will no doubt keep emerging, but if the old narrative paradigm persists, the resources of technical talent will only be split and dispersed again and again; the old paradigm of technological cognition is an invisible waste of resources.
A key question facing the entire Chinese technology industry right now is: what is technology? We lack new narrative paradigms and new narrative concepts to guide us through the next wave of technology. We are always buried in projects, yet we lack narratives that can concentrate the power of science and technology; none of the three major narratives of Web3, the Metaverse, and AGI originated in China.
I truly look forward to an era in which a hundred flowers bloom and a hundred schools of thought contend in technology narratives. We urgently need a new understanding of them, so that we can find the right path for development and secure a sustainable position in the overall technology ecosystem.
Of course, appeals alone are useless; someone has to actually do the work, so I will go first. I have put up with this single-technology narrative thinking for long enough!