OpenAI has applied for the GPT-5 trademark: when will it be released, and what new capabilities will it bring?

Original source: AGI Innovation Lab

Image source: Generated by Unbounded AI

On August 1st, OpenAI officially submitted a trademark application for "GPT-5", covering the following:

  • Software that artificially generates human speech and text
  • Conversion of audio data files to text
  • Voice and speech recognition
  • Machine-learning-based language and speech processing

In other words, the trademark covers AI that generates speech and text, converts audio files to text, performs voice and speech recognition, and applies machine learning to language and speech processing.

This may mean that GPT-5 will support voice capabilities, bringing users a more advanced and efficient voice and text processing experience and further enhancing its multimodal capabilities.

When is GPT-5 coming?

When GPT-4 was released in March 2023, OpenAI was expected to release its next-generation model by December 2023. Runway co-founder Siqi Chen previously stated: "I was told that GPT-5 is scheduled to complete training this December, and OpenAI expects it to achieve artificial general intelligence (AGI). That means we'll all be arguing fiercely about whether it's truly AGI."

However, when asked at an MIT event in April whether OpenAI was training GPT-5, OpenAI CEO Sam Altman said, "We are not, and won't for some time." In an interview in June this year, when asked when GPT-5 would launch, Altman said: "I am also curious. We have no answer; we will not have GPT-5 soon. We must make safety a big part of it."

Still, some believe OpenAI may launch GPT-4.5, an intermediate version between GPT-4 and GPT-5 (much like GPT-3.5), by October 2023. GPT-4.5 is said to eventually bring multimodal capabilities: the ability to analyze both images and text. OpenAI announced and demonstrated GPT-4's multimodal capabilities as early as March 2023 during the GPT-4 developer livestream, and Microsoft has since released GPT-4's multimodal capabilities in Bing Chat. It looks like the next major update to GPT-4 is just around the corner.

In addition, OpenAI has plenty of work to do on GPT-4 before starting on GPT-5. Currently, GPT-4's inference time is long, the model is quite expensive to run, and GPT-4 API access is still hard to come by. OpenAI also only recently opened up access to ChatGPT plugins and the code interpreter, which are still in beta, and internet browsing was removed from GPT-4 because it displayed content from paywalled sites.

While GPT-4 is very powerful, OpenAI likely recognizes that computational efficiency is one of the keys to running the model sustainably: only then can it add new features and capabilities while scaling up its infrastructure and keeping every checkpoint running reliably. So, as a rough guess, GPT-5 will likely be released in 2024, assuming no regulatory hurdles from government agencies.

Predictions: GPT-5 features and functions

Reducing hallucinations

The hot topic in the industry is whether GPT-5 will achieve AGI (artificial general intelligence). Among other things, GPT-5 is expected to reduce inference time, improve efficiency, and reduce hallucinations. Let's start with hallucinations, one of the key reasons most users don't fully trust AI models.

According to OpenAI, GPT-4 scores 40% higher than GPT-3.5 on internal adversarially designed factual evaluations across all nine categories. GPT-4 is also 82% less likely to respond to requests for inaccurate or disallowed content, and it comes very close to an 80% accuracy score across categories. This is a giant leap against hallucination.

OpenAI is now expected to reduce hallucinations to less than 10% in GPT-5, which would be huge for making LLMs trustworthy.

A More Computationally Efficient Model

We already know that GPT-4 is expensive to run ($0.03 per 1K tokens) and takes longer to infer. The older GPT-3.5-turbo model is 15 times cheaper than GPT-4 ($0.002 per 1K tokens). According to a recent report by SemiAnalysis, GPT-4 is not a dense model but is based on a "mixture of experts" architecture. This means GPT-4 uses 16 different expert models for different tasks, with around 1.8 trillion parameters in total.
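The price gap above can be made concrete with a little arithmetic. A minimal sketch, using the mid-2023 per-1K-token figures quoted above (note this simplifies real billing, which distinguishes prompt and completion tokens):

```python
# Illustrative mid-2023 prices per 1K tokens, as quoted in the text.
# Actual pricing varies by model variant and changes over time.
PRICE_PER_1K = {
    "gpt-4": 0.03,
    "gpt-3.5-turbo": 0.002,
}

def cost(model: str, tokens: int) -> float:
    """Estimated cost in USD of processing `tokens` tokens with `model`."""
    return PRICE_PER_1K[model] / 1000 * tokens

# Processing one million tokens with each model:
print(round(cost("gpt-4", 1_000_000), 2))         # 30.0
print(round(cost("gpt-3.5-turbo", 1_000_000), 2))  # 2.0
# Ratio, matching the "15 times cheaper" claim:
print(round(cost("gpt-4", 1_000_000) / cost("gpt-3.5-turbo", 1_000_000), 2))  # 15.0
```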

With such a large infrastructure, the cost of running and maintaining the GPT-4 model becomes very expensive.
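In a mixture-of-experts layer, a router sends each input to only a few experts, so most parameters stay idle on any given token, which is how the design saves compute relative to a dense model. A toy sketch of that routing idea follows; the expert count of 16 comes from the SemiAnalysis report, while the top-k value, dimensions, and all code here are illustrative assumptions, not OpenAI's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # expert count claimed in the SemiAnalysis report
TOP_K = 2          # experts activated per input (assumption for illustration)
DIM = 8            # toy feature dimension

router_w = rng.normal(size=(DIM, NUM_EXPERTS))
experts = [rng.normal(size=(DIM, DIM)) for _ in range(NUM_EXPERTS)]

def moe_forward(x: np.ndarray) -> np.ndarray:
    scores = x @ router_w              # router score for each expert
    top = np.argsort(scores)[-TOP_K:]  # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the chosen experts only
    # Only the selected experts compute; the remaining 14 stay idle.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

out = moe_forward(rng.normal(size=DIM))
print(out.shape)  # (8,)
```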

In fact, many new large models have begun to pursue being "small and refined", keeping parameter counts as low as possible rather than ever larger.

A recent analysis of Google's PaLM 2 model suggests its parameter count is comparatively small, yet it performs fast.

A Multisensory AI Model

Although GPT-4 has been described as a multimodal AI model, it only handles two types of data: images and text. With GPT-5, OpenAI may take a giant step toward true multimodality, handling text, audio, images, video, depth data, and temperature, and interconnecting data streams from different modalities to create shared embedding spaces.
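A shared embedding space across modalities can be sketched minimally as follows. This illustrates the general technique (as popularized by joint-embedding models such as ImageBind), not anything known about GPT-5; every dimension and projection here is a made-up placeholder:

```python
import numpy as np

rng = np.random.default_rng(1)
EMBED_DIM = 16  # size of the shared embedding space (arbitrary for this sketch)

# Each modality gets its own projection into the common space.
proj = {
    "text": rng.normal(size=(32, EMBED_DIM)),   # from 32-dim text features
    "audio": rng.normal(size=(64, EMBED_DIM)),  # from 64-dim audio features
}

def embed(modality: str, features: np.ndarray) -> np.ndarray:
    """Project modality-specific features into the shared space, unit-normalized."""
    v = features @ proj[modality]
    return v / np.linalg.norm(v)

text_vec = embed("text", rng.normal(size=32))
audio_vec = embed("audio", rng.normal(size=64))
# Because both live in one space, cross-modal similarity is just a dot product:
print(float(text_vec @ audio_vec))
```

In real systems the projections are learned so that matching pairs (say, a sound and its description) land close together; here they are random and only demonstrate the plumbing.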

Long-Term Memory

With the release of GPT-4, OpenAI brought a maximum context length of 32K tokens at a cost of $0.06 per 1K tokens. We quickly saw a shift from the standard 4K context to 32K within a matter of months. Recently, Anthropic increased the context window of its Claude AI chatbot from 9K tokens to 100K tokens. GPT-5 may bring long-term memory support through an even greater context length.

This would help AI characters and companions remember your persona and history for years to come. Beyond that, you could load whole books and document libraries into a single context window. A variety of new AI applications may emerge thanks to long-term memory support, and GPT-5 could make that possible.
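To see why a larger window matters, here is a back-of-envelope calculation. It assumes a rough rule of thumb of about 0.75 words per token and the $0.06 per 1K tokens quoted above; both figures are approximations:

```python
PRICE_PER_1K_TOKENS = 0.06  # 32K GPT-4 price quoted in the text
WORDS_PER_TOKEN = 0.75      # common rough rule of thumb for English text

def tokens_for_words(words: int) -> int:
    """Approximate token count for a given English word count."""
    return round(words / WORDS_PER_TOKEN)

# A ~90,000-word novel:
novel_tokens = tokens_for_words(90_000)
print(novel_tokens)                                          # 120000
print(round(novel_tokens / 1000 * PRICE_PER_1K_TOKENS, 2))   # 7.2 (USD)
```

At roughly 120K tokens, a full novel still overflows the 32K window by almost 4x but would fit comfortably in a 100K-plus context of the kind Claude already offers, which is exactly the gap a longer-context GPT-5 could close.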

When do you think GPT-5 will be released and what disruptive innovations will it bring?
