After months of rumours and speculation surrounding the wildly popular ChatGPT, OpenAI has finally announced GPT-4, the latest in its series of AI language models powering apps like ChatGPT and the new Bing.
According to the company, the model is “more creative and collaborative than ever” and “can solve difficult problems with greater precision.”
It can analyze both text and images, although it can only respond via text.
OpenAI also warns that the system retains many of the problems seen in previous language models, including a tendency to fabricate information and the ability to generate violent and harmful text.
Nevertheless, OpenAI says it has already been working with several companies to integrate GPT-4 into their products, including Duolingo, Stripe, and Khan Academy.
The new model is available to the general public through ChatGPT Plus, OpenAI’s $20 monthly ChatGPT subscription, and today we learned that it’s what’s been powering Microsoft’s Bing chatbot since day one.
It will also be accessible as an API for developers to build on, with a waiting list available here.
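For illustration, here is a minimal sketch of what a request to GPT-4 might look like through OpenAI’s Python library, assuming API access has been granted via the waitlist; the API key and prompt are placeholders:

```python
import openai

# Placeholder key; assumes access has been granted via the waitlist
openai.api_key = "YOUR_API_KEY"

# Send a prompt to GPT-4 through the Chat Completions endpoint
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what is new in GPT-4 in one paragraph."},
    ],
)

print(response["choices"][0]["message"]["content"])
```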
In its detailed blog post, OpenAI states that the distinction between GPT-4 and its predecessor, GPT-3.5, is “subtle” in casual conversation (GPT-3.5 is the model that powers ChatGPT).
OpenAI CEO Sam Altman said GPT-4 is “still flawed, still limited,” but that it also “still looks more impressive on first use than when you spend more time with it.”
The company reports that GPT-4’s improvements are evident in the system’s performance on a number of tests and benchmarks, including the Uniform Bar Exam, LSAT, SAT Math, and SAT Evidence-Based Reading & Writing.
In the tests reported, GPT-4 scored in the 88th percentile and above; a full list of the system’s tests and scores can be seen here.
Speculation about GPT-4 and its capabilities has been rife over the past year, with many suggesting that it would represent a huge leap over previous systems.
Judging by OpenAI’s announcement, however, that leap has not materialized, something the company itself had warned about.
“People want to be disappointed and they will be disappointed,” Altman said in a January interview about GPT-4.
The rumours were reignited last week when a Microsoft executive, in an interview with the German press, let slip that the system would launch this week.
The executive also suggested that the system would be multimodal, meaning it would be able to work with media beyond text.
Many AI researchers believe that multimodal systems integrating text, audio, and video offer the best path toward creating more capable AI systems.
GPT-4 is indeed multimodal, though it supports fewer media than some predicted.
OpenAI reports that the system can accept text and images while producing only text.
The company also says that the model’s ability to analyze text and images simultaneously allows it to interpret more complex queries.
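Image inputs were not exposed through the public API at launch, so the following is only a hypothetical sketch of how such a mixed text-and-image query might look, using a content list that carries both the question and an image in a single user turn; the model name and image URL are assumptions:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Hypothetical multimodal request: image input was not publicly
# available via the API at announcement, so treat this as a sketch.
response = openai.ChatCompletion.create(
    model="gpt-4",  # assumes a GPT-4 variant with image input enabled
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response["choices"][0]["message"]["content"])
```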
In the samples below, the system can be seen explaining memes and unusual images.