AI Application Roadmap: Controllability is the Strongest Roadmap

Source: Semi-Light

Author: Wang Yonggang

  • Wang Yonggang: Founder and CEO of SeedV Lab, Executive Dean of Innovation Works AI Engineering Institute
  • Tong Chao: Co-founder and Chief Product Officer of SeedV Lab

Where are the application scenarios for generative AI?

Generative AI technologies such as Stable Diffusion and ChatGPT attracted more attention from the IT venture capital community than anything else in the first half of 2023. But once the excitement of the new-technology wave subsides and we start thinking seriously about which application scenarios are the best directions for deploying generative AI, many people still find the relationship between technology and market tangled, and the best path to deployment hard to sort out:

  • An investor: Over the past few months, every front-line venture capital firm has been mobilized and very busy. Yet it turns out that, apart from four or five leading large-model startups that everyone is chasing, other AIGC projects are too unclear to invest in. I don't know where AI applications will go from here.
  • An analyst: The leading large-model projects require heavy investment and carry high risk. Business-facing (B-side) and government-facing applications are constrained by private deployment and private data, so they have long cycles and are hard to roll out. Most consumer-facing (C-side) applications are too shallow, and text and image generation projects are badly homogeneous. It is common for a team to start a company on the strength of one or two good papers without having worked out a specific application direction...

The crux of the problem in this thinking is:

  • Most people still subconsciously think of generative AI as a set of tools for generating dialogue, articles, and pictures. By that stereotype, it can only help copywriters and designers work more efficiently, so how could it be called a disruptive change?
  • Although there are many signs that generative AI is showing the first light of artificial general intelligence (AGI), people bound by short-term value judgments will always say: So what? Seeing is believing. Isn't today's AI still just chatting, writing articles, and drawing pictures?

Clearly, the application prospects of generative AI should not be analyzed from a single perspective or a single point in time. Is there a simple, easy-to-use thinking model that ties together the whole development of generative AI?

Build a thinking model around controllability

We believe generative AI is a revolution in the information industry comparable to desktop computing and mobile computing, and possibly even more disruptive. Disruptive change never happens overnight; it is realized gradually as generative AI keeps developing. To see clearly what new products, platforms, markets, and opportunities generative AI will bring, we think there is a simple line of thinking that can guide product and project selection:

**The more controllable generative AI is, the more disruptive it will be to markets and industries!**

This path can be simply represented by a graph:

As generative AI gains more control over the content it generates, the scenarios it can serve keep expanding and deepening. Quantitative change leads to qualitative change: once a domain's threshold is crossed, generative AI can completely transform the existing product ecosystem and give products genuinely intelligent elements.

As it evolves, the controllability of generative AI will pass through roughly six phases. Take the most basic case, text generation, as an example:

Phase 1: Uncontrollable

More than 20 years ago, statistical language models based on N-grams could already generate continuous text. However, the output was largely uncontrollable. This early form of "generative AI" had almost no chance of being turned into a product, let alone of disrupting an existing market.
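
To make this lack of control concrete, here is a minimal sketch of a bigram (N = 2) text generator in Python. It is illustrative only and does not reconstruct any particular historical system; the toy corpus and the names `train_bigram` and `generate` are assumptions made for this example. Each word is chosen by looking only at the word before it, so the caller can pick the first word but has no way to steer topic, structure, or logic.

```python
import random
from collections import defaultdict


def train_bigram(tokens):
    """Map each token to the list of tokens observed right after it."""
    followers = defaultdict(list)
    for current, nxt in zip(tokens, tokens[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=None):
    """Sample up to `length` tokens, each conditioned only on the previous one."""
    rng = random.Random(seed)
    output = [start]
    while len(output) < length:
        candidates = followers.get(output[-1])
        if not candidates:  # dead end: this token was never followed by anything
            break
        output.append(rng.choice(candidates))
    return output


# Toy corpus (hypothetical); a real model would be trained on far more text.
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "the", 8, seed=0)))
```

Every adjacent word pair in the output was seen in the corpus, which is why the text looks locally plausible, but nothing in the model represents what the passage as a whole should say.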

Phase 2: The general direction is controllable

From text generation based on RNNs or LSTMs to early GPT models (such as GPT-2), generative AI gradually acquired the ability to produce human-like passages of language. At this stage the output is basically fluent, and the content roughly follows the prompt given by a human. However, because details, structure, and logic remain uncontrollable, it is still hard to turn into a truly useful product.

Phase 3: Controllable structure or local logic

From GPT-3 to ChatGPT (GPT-3.5), generative AI gained control over the structure and local logic of generated content for the first time. Text creation and multi-turn conversation are the two typical application ecosystems of this period. The former supports practical scenarios such as automatic article summarization, legal document generation, and marketing copy generation; the latter can meet some of the needs of conversational search, language learning, intelligent customer service, virtual humans, and intelligent game characters.

Phase 4: Preliminary chain of thought is controllable

From GPT-3.5 to GPT-4, the logical reasoning ability of generative AI improved significantly. For the first time, generative AI has strong analytical capabilities (such as extracting data from news reports and summarizing trends), control capabilities (such as converting human language into complex system control instructions), and preliminary logical reasoning capabilities (such as answering simple math and logic questions). The text it can generate now extends to structured and semi-structured content such as data, tables, code, instruction sequences, workflows, and tool chains. This has led directly to today's large crop of new tools and systems typified by the "Copilot" label.

Phase 5: Complex logical reasoning is controllable

Of course, when today's GPT-4 generates text, the chain of logical thinking that can be controlled is still rudimentary. If all goes well, we can expect a next generation of generative AI, in the not-too-distant future, that can precisely control complex logical reasoning. Such AI would have advanced capabilities such as memory, learning, planning, and decision-making. These capabilities are enough to overturn the human-computer interaction of past decades and redefine the relationship between humans and computers in scenarios such as productivity tools, content platforms, business process automation, robots, operating systems, and smart devices.

Phase 6: Controllable rules or principles

Looking further ahead, the highest-level forms of human thinking are: 1. discovering principles and formulating rules through inductive thinking; 2. applying principles or rules to specific scenarios through deductive thinking. The ideal evolutionary form of generative AI is to approach the human way of thinking: to generate rules or principles comparable to those humans produce, and to apply them. Once it reaches this "realm of freedom" where rules or principles are controllable, generative AI will have a strong capacity to iterate on and improve itself, will be able to design system rules and world rules as humans do, and may even carry out scientific research alongside human scientists.

Controllability and Typical Application Direction

Improved controllability substantially expands the fields where generative AI is applicable. The following figure summarizes the relationship between controllability and the best application directions for generative AI at each stage of development:

Based on controllability, the application directions that generative AI supports keep expanding and deepening at each stage: from satisfying simple, local needs, to meeting domain-level and platform-level needs, and finally accumulating into disruptive change in products and business models. Whether the chain of thought and logical reasoning are controllable, and how precisely, is the most critical factor in the shift from quantitative to qualitative change.

Controllability and Specific Application Cases

Based on the controllability of generative AI, we divide its most suitable application directions, today and in the near future, into four categories. The following figure links typical application cases in each category to the different stages of generative AI's development:

Content Creation Tool/Content Platform

Content creation tools are the most direct and fastest scenario in which generative AI can be deployed. As controllability improves, content creation tasks will move from simple text and image creation to the automatic creation of complex videos, 3D content, animations, games, movies, and virtual worlds. With the help of AI, every ordinary person will have abilities that once belonged only to professional teams with professional tools. Once ordinary people's desire to create is unleashed, higher-level needs to share, view, purchase, and socialize around new content forms will surely drive the birth and growth of a new generation of content platforms.

Business Automation/Enterprise Services

For reasons such as data security, private deployment, content accuracy, and compliance, business processes place very high demands on the controllability of generative AI. The business areas best suited to generative AI today may include content creation in marketing and user interfaces in e-commerce. In addition, generative AI can greatly improve business efficiency by automatically generating intermediate code such as SQL, automatically collecting and analyzing data, automatically generating reports, and automatically connecting business processes. In the future, as controllability improves, key steps in business processes such as planning, decision-making, and optimization will absorb more cutting-edge AI technology.

Personal Assistant/Professional Assistant

In personal life and office scenarios, generative AI will gradually take on various "assistant" roles and, within a few years, establish a new ecosystem of human-AI collaboration. How controllable generative AI is fundamentally determines how smart the AI assistants in our lives and work are, and what problems they can help us solve. Once generative AI reaches, in some jobs, a level equivalent to human secretaries, drivers, translators, lawyers, and so on, AI assistants will become a new generation of mass-market electronic products that replace computers and mobile phones.

Infrastructure/Development Tools/OS/Search Engines

The programming, data processing, system design, and knowledge processing capabilities of generative AI will bring new design concepts and epoch-making new functions to development tools, databases, search engines, and operating systems. Whether an AI-centered operating system and an AI-centered intelligent computing platform can emerge in the future depends entirely on how far generative AI's capacity for complex logical reasoning can advance.

Application Capability Evolution of Multimodal AI

Compared with simple text and image generation, multimodal systems that include sound, video, 3D scenes, animation, and complex storylines fit human common sense and basic needs better, and clearly have broader and more far-reaching application prospects. For the technical status and prospects of multimodal AI, please refer to another article by the same author:

In the post-GPT era, multimodality is the biggest opportunity

In the field of multimodality, we believe that generative AI, today and in the future, will evolve and accumulate roughly along the lines shown in the figure below, and will continue to give birth to revolutionary new applications, new platforms, and even disruptive new business models:

Permission to use

The pictures and text of all the application roadmaps above are released by SeedV Lab under the CC BY 4.0 license. As long as the original source (SeedV Lab) is credited, everyone is free to use, modify, and republish them.

The pictures of the above application roadmaps are also open source at the following location:

github.com/SeedV/generative-ai-roadmap
