How powerful is OpenAI’s GPT-4?


The world of artificial intelligence has been buzzing with the announcement of GPT-4, the latest and most powerful language model developed by OpenAI. GPT-4 is a giant leap forward from its predecessor, GPT-3.5, which was already capable of generating coherent and diverse texts on almost any topic. What makes GPT-4 so remarkable is that it can not only produce natural language, but also understand it at a deeper level. 


GPT-4 is not just a bigger and faster version of GPT-3.5. It is also a multimodal model, which means it can handle more than one type of input: it accepts both text and images, opening up new possibilities for creating and interacting with content across different domains and platforms.
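To make the multimodal idea concrete, here is a minimal sketch of how a request pairing text with an image can be composed. The payload follows the content-parts shape of OpenAI's chat-completions format for vision inputs; the model name, question, and URL are placeholders, not values from this article.

```python
# Sketch: composing a single chat message that combines text and an image.
# The "content" field becomes a list of typed parts instead of a plain string.

def build_vision_request(question: str, image_url: str, model: str = "gpt-4") -> dict:
    """Return a request body asking the model to reason about an image."""
    return {
        "model": model,  # placeholder model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

request = build_vision_request(
    "What is unusual about this diagram?",
    "https://example.com/diagram.png",  # placeholder URL
)
```

The same structure extends naturally: a message can carry several image parts alongside the text, which is how documents mixing photographs, diagrams, or screenshots with prose are handled.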


In case you are still wondering how powerful the language model is, here are some key features and improvements that distinguish GPT-4 from its predecessors:


Enhanced performance
  • The difference between GPT-3.5 and GPT-4 becomes apparent when handling complex tasks. GPT-4 is more reliable, creative, and capable of processing nuanced instructions than GPT-3.5.
  • GPT-4 was tested on a variety of benchmarks, including exams originally designed for humans, without any exam-specific training. It outperformed existing large language models, including state-of-the-art systems, on traditional machine learning benchmarks.
  • GPT-4's language capabilities extend beyond English. It outperformed GPT-3.5 and other large language models (like Chinchilla and PaLM) in 24 of 26 tested languages, including low-resource languages such as Latvian, Welsh, and Swahili.
  • GPT-4 has been used internally with great impact on functions such as support, sales, content moderation, and programming. It is also assisting humans in evaluating AI outputs, the second phase of OpenAI's alignment strategy.
Seeing beyond text
  • GPT-4 can process both text and images as inputs, generating natural language, code, and other text outputs. It exhibits similar capabilities on a range of domains, including documents with text and photographs, diagrams, or screenshots, as it does on text-only inputs.
  • GPT-4 can also be enhanced with test-time techniques like few-shot and chain-of-thought prompting, originally developed for text-only language models. However, image inputs are still in the research preview stage and not yet publicly available.
  • Despite its potential, there are limitations to GPT-4's image input capabilities, such as its inability to process images with high visual complexity or ambiguity.
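The few-shot and chain-of-thought prompting mentioned above can be sketched without any model at all: the technique is simply to precede the real question with worked examples whose answers show their intermediate reasoning. A minimal sketch, with an assumed message-list format and illustrative arithmetic examples of my own:

```python
# Sketch: building a few-shot, chain-of-thought prompt as a chat message list.
# Each worked example shows its reasoning steps before the final answer, so
# the model is nudged to reason step by step on the new question too.

def build_cot_messages(examples: list[tuple[str, str]], question: str) -> list[dict]:
    """Interleave (question, worked-solution) pairs, then append the new question."""
    messages = [{
        "role": "system",
        "content": "Solve each problem step by step, then state the answer.",
    }]
    for q, worked_solution in examples:
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": worked_solution})
    messages.append({"role": "user", "content": question})
    return messages

examples = [
    ("If a pen costs $2 and a notebook costs $3, what do 2 pens and 1 notebook cost?",
     "2 pens cost 2 * $2 = $4. Adding the $3 notebook gives $4 + $3 = $7. Answer: $7."),
]
messages = build_cot_messages(examples, "What do 3 pens and 2 notebooks cost?")
```

The point of the article's observation is that this text-era trick carries over unchanged when one of the user messages also contains an image.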
Safer, but not fail-safe
  • Mitigations and improvements have been made based on feedback and data from experts, such as incorporating an additional safety reward signal to reduce harmful outputs.
  • Compared to GPT-3.5, GPT-4 has significantly improved safety properties: it responds to sensitive requests in accordance with policies 29% more often.
  • Despite these model-level interventions, bad behaviors are still possible, which is why deployment-time safety techniques such as monitoring for abuse remain crucial.
  • In addition, similar risks to previous models still exist in GPT-4, such as generating harmful advice, buggy code, or inaccurate information.

Before you go, you might also be interested in how ChatGPT crafted ten limericks for our Valentine's specials.