The Application Craze of Large Models: The Sweetest Cake and the Hardest Pass
Author丨Bai Jiajia
Art Editor丨Fisherman
Source: Silicon-Based Research Lab
Editor's note:
Feverish on one side, cold on the other: this is the current state of China's large model industry. "Age of chaos" may be the most fitting label for it. Over the past six months, the relationships between technology and people, technology and industry, and human civilization and technological civilization have entered a new stage of reconstruction. Behind this change is not only technological progress but also the push of key people and key companies.
As a frontline observer of the intelligent era, the Silicon-Based Research Lab follows every technology-related story. Today we launch the series "The Chaotic Age of Large Models". Starting from a deconstruction of large models, it turns the lens on the companies and people at the forefront of this wave, sharing and interpreting their insights for readers.
This is the second article in the series: "The Application Craze of Large Models: The Sweetest Cake and the Hardest Pass". Part 1: "The Chaotic Age of Large Models: Contradictions, Differentiation and the Future".
"I've seen things you people wouldn't believe. Attack ships on fire off the shoulder of Orion. I watched C-beams glitter in the dark near the Tannhäuser Gate. All those moments will be lost in time, like tears in rain."
This is the final monologue in the movie "Blade Runner", uttered by replicant Roy Batty.
This line was later selected by the British "Observer" as the sixth of the top ten classic moments in film history, and is often cited as a representative of science fiction works.
In a way, this passage is becoming reality. The large models that have sent the world into a frenzy are absorbing global knowledge at an unimaginable speed, and the torrent of people and events behind those characters may be no less dramatic than a war among the stars.
And the scholars, engineers, and businesspeople who made all of this are still waiting, though perhaps they cannot say exactly what for. More sophisticated technological probes? More efficient production tools? A more lucrative super app?
Or a replicant like Roy Batty, alleviating the almost desperate loneliness of human beings looking up at the galaxy.
Is it coming?
What does it bring?
How did it come about?
Where did it first sprout?
What is the difference between China and foreign countries?
……
Faced with today's endless stream of large models, it is too late for either worry or anticipation. For a partner that will be with us for a long time to come, the best welcome is to look at it closely.
Competition shifts from the C-end to the B-end, and open source challenges closed source
The emergence of ChatGPT is like artificial intelligence knocking on your door.
It was released to the public on November 30, 2022. In just two months, ChatGPT surpassed 100 million monthly active users, making it the fastest-growing consumer application in history.
In early March, Codeway Dijital launched Chat with Ask AI, built on the ChatGPT API. Thanks to its powerful question-and-answer function, its revenue in the first half of the year exceeded 16 million US dollars (about 112 million yuan), making it the application with the highest downloads and revenue in the AI chatbot category.
**ChatGPT's success seemed to show that the business logic of AI applications for C-end users holds: hand the product to users and they will explore the infinite possibilities AI brings, and an attractive revenue curve will emerge along the way.**
Unfortunately, things are not that simple.
On July 4, the web analysis company Similarweb released data saying that ChatGPT’s global visits in June fell by 9.7% month-on-month, and the number of unique visitors fell by 5.7% month-on-month.
Character.AI can imitate the personalities of entertainment celebrities, historical figures, and fictional characters for conversations, ranking second among all similar AI tools.
In this regard, Similarweb analyst David Carr said, "From now on, chatbots must prove their worth and not take everything for granted."
To some extent, the decline in visits to ChatGPT and Character.AI signals that AI applications for C-end users are gradually approaching their ceiling, which suggests the following.
**Users are not that interested in exploring the possibilities of AI on their own. Deeply integrating AI with application scenarios, "putting the hammer next to the nail", is a necessary condition for large models to land.**
Microsoft took the lead in becoming a "hammer porter".
Using technology from OpenAI, the developer of ChatGPT, Microsoft launched Microsoft 365 Copilot and, at its annual Inspire conference on July 18, announced a price of $30 per user per month for commercial customers.
**By building on its own scenarios, Microsoft has taken a more solid path to monetization, and this is also how many large companies are testing the waters with AI today. The whole process is a closed loop, with data and models flowing only among the giants.**
It is like "Blade Runner", where the most advanced robot manufacturing technology is firmly controlled by the Tyrell Corporation.
However, reality is often more dramatic than the movies. Recently Meta, another Internet giant, open-sourced its Llama2 foundation model, which is distributed through Microsoft's cloud and free for enterprises to use commercially, firing the first shot against "technology monopoly".
Judging from evaluation results, Llama2 still lags GPT-4 by some distance and trades strengths and weaknesses with GPT-3.5. It is currently the best open source model on the market.
But what exactly does this mean?
Take Huawei's tiering of large models as an example. It divides them into basic large models (which simulate human faculties such as language and vision), industry large models (divided by industry, possibly coordinating the functions of several basic large models), and scenario models (which target specific scenarios within an industry, such as outlet assistants, supply chain logistics, and small-molecule optimization).
Meta's open source Llama2 is a large language model, one kind of basic large model. Entrepreneurs building on it do not need vast amounts of data to train a model from scratch, only a sizable corpus: through fine-tuning, they can develop AI applications suited to their industries or scenarios.
The "Miaoya Camera", which has recently taken off in China, is a beneficiary of open source.
Stable Diffusion (SD) is one of the hottest AI image-generation tools today. It is a free, open source project that anyone can deploy and use. According to market speculation, "Miaoya Camera" works by using LoRA model plug-ins to tame the randomness of SD's image output.
LoRA is a model fine-tuning technique that is free and open to the public. On July 25, Alibaba Cloud launched a training and deployment solution in China for the full Llama2 series, including LoRA fine-tuning.
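To make the idea of LoRA fine-tuning more concrete, here is a minimal sketch in plain Python. It follows the commonly described formulation, in which the base weights stay frozen and only a small low-rank update is trained. The class name `LoRALinear`, the toy 6x8 layer, and the rank of 2 are illustrative assumptions, not details of how Miaoya Camera or Alibaba Cloud's solution is implemented.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Hypothetical toy layer: y = W x + (alpha / r) * B (A x), with W frozen."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen base weights, shape (d_out, d_in)
        d_out, d_in = len(weight), len(weight[0])
        self.rank = rank
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the adapter
        # changes nothing until B has been trained.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        d_out, d_in = len(self.weight), len(self.weight[0])
        return self.rank * (d_in + d_out)

    def frozen_params(self):
        return len(self.weight) * len(self.weight[0])


# Toy 6x8 base layer (assumed sizes, for illustration only).
weight = [[0.1 * (i + j) for j in range(8)] for i in range(6)]
layer = LoRALinear(weight, rank=2)
x = [1.0] * 8

print(layer.forward(x) == matvec(weight, x))  # True: untrained adapter is a no-op
print(layer.trainable_params(), layer.frozen_params())  # 28 48
```

Even in this toy case the adapter has fewer trainable values than the frozen layer (28 versus 48). The gap grows quickly with layer size, which is one reason the technique suits smaller teams fine-tuning an open source base model.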
From ChatGPT facing C-end consumers directly, to Microsoft combining large models with its own scenarios before pushing them to consumers or enterprises, to Meta open-sourcing Llama2 and providing basic large models and fine-tuning services for enterprises, **behind these multiple paths to landing large models are players trying to close the commercial loop as soon as possible. Their aim is partly to recoup funds to support technical iteration, and partly to make a long-term investment in building an AI industrial ecosystem and competing for a voice in it.**
It is worth noting that although this article uses Microsoft, OpenAI, and Meta as examples, they are not limited to the corresponding paths. With technology and funding behind them, leading companies have the strength to pursue several paths at once, which makes the contest even more intense.
For example, according to a July 24 report by The Information, OpenAI is developing an open source large language model code-named G3PO, with no release schedule yet decided internally.
Interestingly, Zuckerberg had only just stated on Facebook that "open source promotes innovation because it allows more developers to use new technologies... I believe that if the ecosystem is more open, more progress will be released", yet on a conference call a few days later he proposed charging large cloud computing companies such as Microsoft, Amazon, and Google a share of the revenue they earn from reselling the service.
Data may become the strongest moat for Chinese companies
Dai Yusen, managing partner of ZhenFund, has an apt analogy for this round of large model entrepreneurship: the emergence of GPT-3 was like discovering a new continent, and ChatGPT-4 was like discovering gold on it.
For Chinese companies, catching up is like knowing that the new continent exists and where the gold is, knowing that OpenAI got there by boat, and knowing roughly what the boat looks like, but having no detailed map.
Therefore, for Chinese large model players, docking for supplies during this long voyage is a matter of life and death.
Docking means finding application scenarios where large models can land and forming a sustainable business model.
It is undeniable that a gap remains between domestic artificial intelligence chips and world-class standards. Constrained by chip sanctions, Chinese companies find it difficult to scale up their computing power.
On the algorithm side, which spans large models for natural language processing, computer vision, audio, and multimodality, China has certain advantages. But algorithms and computing power go hand in hand, so for now China also remains some distance behind OpenAI and other industry leaders.
In this situation, domestic enterprises must work hard on data if they do not want to fall behind in this wave of AI.
**In other words, one of the core barriers for Chinese large model players in this round of competition is the data generated in the Chinese market.**
However, high-quality data often contain a large number of corporate secrets, which are not even allowed to be uploaded to external networks, let alone submitted to other companies for large-scale model development.
In March of this year, the Korean outlet Economist reported three cases of ChatGPT misuse within Samsung, in which semiconductor equipment measurement data, product yield figures, and other content ended up in ChatGPT's learning database, causing major losses to the company.
It is precisely because of the risk of information leakage and the high cost of trust between enterprises that foreign large model companies tend to scale up their models and build an industrial ecosystem first, and only then follow up with applications.
To some extent, the voluntary commitments that companies such as Microsoft, OpenAI, and Amazon recently signed at the White House are not only a response to public concerns about the rapid development of AI but also a signal to the market, aimed at winning the trust of more institutions and businesses.
**Back in China, under the guidance of the state, state-owned enterprises and local governments are relatively open to large models, and applications, ecosystems, and model building are developing in parallel.**
For example, Huawei's Pangu large model has been deployed at the Lilou Coal Mine, a modern, large-scale shaft mine with the largest recoverable reserves and the longest service life in Shandong Province.
In October 2022, Huawei signed a cooperation framework with Yunding Technology, a subsidiary of Shanneng Group, to launch cooperation across mining, artificial intelligence, ICT infrastructure, smart parks, talent training, smart wearables, mining terminals, and ICT solutions for industry scenarios.
Since then, experts from both sides have worked on the front line of the mine and taken a deep part in applying large models. In actual production they identified 21 application scenarios across 9 specialties (coal mining, tunneling, main transportation, auxiliary transportation, hoisting, safety supervision, rock-burst prevention, coal washing, and coking), kept upgrading the model, and officially released the first AI large model for the mining field on July 18 this year.
Similar stories have played out at companies such as Baidu and iFLYTEK.
On June 27 this year, Beijing released its first batch of 10 typical application cases for industry large models, most of them in "hard core" fields such as urban governance, smart finance, health care, and industrial modernization.
These include the "Equipment Operation Inspection Knowledge Assistant Equipped with Power Industry NLP Large Model", jointly developed by Baidu and the State Grid Smart Grid Research Institute, which raised F1 scores for electric power domain word segmentation and for sensitive entity recognition in power marketing by 9.27% and 13.28%, to 92.376% and 94.947% respectively.
They also include the "Urban Brain Large Model", jointly developed by iFLYTEK and Zhongguancun Science City City Brain, which addresses problems such as restricted access to and use of urban governance data resources, the weak generalization ability of urban governance service models, and information security in the era of artificial intelligence.
**Chinese companies have thus embarked on a distinctive path: from industry large models to general large models, and then working out what kind of large model technology is needed to deploy applications at scale.**
**This process is also in line with the industry's general consensus on producing high-quality data: lower the threshold by popularizing AI while putting AI to work in industry, accumulate and collect more high-quality data as a result, and finally drive rapid iteration of the model.**
The reason domestic large model application scenarios differ from those abroad is, in essence, that with neither computing power nor algorithms giving China the upper hand, the state and enterprises are joining forces to accelerate the "data-model-data" flywheel.
**What will really determine the direction taken after this round of docking is whether China can build a data market that is high-quality, liquid, and secure.**
A few days ago, the China Communications Standards Association and the China Academy of Information and Communications Technology released the "Database Development Research Report (2023)". The report put the global database market in 2022 at 83.3 billion U.S. dollars, of which the Chinese database market was 5.97 billion U.S. dollars (about 40.36 billion yuan), or 7.2% of the global total.
**Where is the sweetest cake?**
To sum up, there are two trends in the large model track as a whole.
**One is that leading companies are moving from C-end applications to the B-end. Some players choose to integrate their own resources and build a full-chain service system from the data foundation up to industrial applications. Others choose to build large model platforms and join forces with small and medium-sized enterprises to challenge the leaders.**
**The other is that overseas companies are first deploying large models in their own scenarios, while domestic companies are integrating deeply with real-economy industries to form a data flywheel.**
As the tide ebbs and flows, the "sweetest piece of cake" in the application layer of the AI industry chain is gradually surfacing.
**For now, large language models and large vision models have the clearest commercialization paths and are where the market is most concentrated.** Beyond direct-to-consumer applications such as ChatGPT and Miaoya Camera, they are also making steady progress in areas such as collaborative office work, image editing, and intelligent customer service.
However, such applications are highly homogeneous. Unless a company's technology leads the way as OpenAI's does, the results will not differ much. What's more, even OpenAI needs to keep introducing new features to retain customers.
On July 20 and July 21, ChatGPT raised the number of messages that can be sent through GPT-4 and launched its custom instructions feature.
With game publishing licenses now being issued at a steady pace, the game industry is expected to be the sweetest piece of cake for large model applications in the short term.
**In the long run, a large model is essentially a tool for improving an industry's quality and efficiency, and customers' willingness to pay for services or products is directly tied to the benefits the large model can unlock. So to find the application scenarios with the most potential, the key indicators to examine are the scale of the industry itself and the height of its moat.**
The Silicon-Based Research Lab believes that new energy vehicles are the field with the most potential for large models in the future.
In terms of development prospects, new energy vehicles fit the global consumption trend toward low-carbon, environmentally friendly products and help reduce fossil energy consumption.
For example, in June 2022, the environment ministers of the 27 EU countries reached an agreement on new climate protection legislation. From 2035, the EU will allow only new cars with zero carbon dioxide emissions to be sold.
From the standpoint of reducing CO2 emissions alone, large models can find a role.
Beyond driving, the automobile industry chain itself is a major carbon emitter, and raw metal smelting, transnational transportation, manufacturing, and other links are the focus of emission reduction. However, because the industrial chain is complex, the data fragmented, and the application scenarios wide-ranging, it is difficult for car companies to collect and evaluate the carbon footprint of a car's entire life cycle.
As the automobile industry chain becomes more intelligent and all kinds of data flow to the cloud, it is gradually becoming possible to map out a clear carbon reduction path. In this process, the data flywheel of large models is expected to become the car's "fifth wheel", breaking down the data barriers between links and forming an intelligent pathway through the industrial chain.
**On the other hand, combining large models with new energy vehicles is a win-win.**
The high inference cost of large models is what keeps many enterprises away. As the technology develops, large models are moving from the cloud onto devices, and the car itself can perform a certain amount of inference on its on-board chip and feed the results back to the cloud. For car owners, this means a new energy vehicle stays reasonably "smart" even without an Internet connection, which is a plus for the user experience.
**However, several difficulties must still be overcome before large models can truly empower the new energy vehicle industry.**
For example, data storage issues.
As early as 2017, China saw a boom in industrial big data, a typical scenario of which was early warning and maintenance for key equipment. In plain terms, this means using sensor data to predict when equipment may shut down and to indicate which part should be replaced.
However, once implemented, it turned out that at least 2 to 3 cycles of data are required to form a complete data model, and the storage cost alone runs to tens of millions, which is too risky for enterprises.
The same is true today, because developing and iterating large models also requires massive data. So today's car companies are more inclined to build platforms first, connect data with business, and only then use large models to do some fitting on top.
**Second, compared with typical uses of generative large models, the industrial field pays far more attention to stability.**
To give a simple example: when we ask ChatGPT to write poems, we expect it to be creative and for each piece to be different. In industry, however, if the same situation yields a different instruction every time, it will cause big problems.
Therefore, deep integration of large models with production lines must, for now, resemble code generation: producing industrial instructions or proposing optimizations for specific steps, rather than directly intervening in production.
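The contrast between creative variety and industrial stability can be illustrated with a toy sketch of two ways to pick an output from the same model scores. The article does not describe how any particular model chooses its output, so the option list, the scores, and the function names below are all hypothetical.

```python
import math
import random


def softmax(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(options, logits):
    """Always return the highest-scoring option: same input, same output."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return options[best]


def sampled_pick(options, logits, temperature, rng):
    """Draw an option at random in proportion to its softmax probability."""
    probs = softmax(logits, temperature)
    return rng.choices(options, weights=probs, k=1)[0]


# Hypothetical instructions and model scores, for illustration only.
options = ["hold speed", "slow conveyor", "stop line"]
logits = [2.0, 1.5, 0.5]

greedy_runs = {greedy_pick(options, logits) for _ in range(100)}
rng = random.Random(42)
sampled_runs = {sampled_pick(options, logits, 1.0, rng) for _ in range(100)}

print(greedy_runs)   # a single instruction, repeated every time
print(sampled_runs)  # several different instructions across runs
```

Sampling is what makes each poem different. A production line would want something closer to the greedy path, or a human or rule-based check between the model's suggestion and the machine.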
As the saying goes, fortune and misfortune give rise to each other. The two difficulties large models face in entering the new energy vehicle industry are in fact the moats of the companies that will succeed in this field. With the continued development of storage technology and the emergence of new digital factories such as "lights-out factories", the resistance to connecting large models with the new energy industry is also falling.
In some more cutting-edge areas, the two have already begun to react with each other.
At present, the application of large models in new energy vehicles is concentrated mainly on autonomous driving, where Baidu, Tesla, Huawei, and Google have all made deployments. Demonstration area on the road.