This year, the field of artificial intelligence (A.I.) has continued to advance at a staggering rate. A.I. systems took on complex mathematical operations such as matrix multiplication single-handedly, and regular folks got a taste of what it feels like to be Leonardo da Vinci with text-to-image A.I. generators. Meanwhile, newer labs such as Stability AI and Midjourney are making their models available to the public, marking a paradigm shift away from the dynamic in which groundbreaking innovations were readily available only to tech giants or established labs with bountiful resources.
As 2022 draws to a close, we take a look at this year’s notable breakthroughs and events.
In October, an A.I.-powered humanoid robot named Ai-Da became the first of her kind to speak in the UK Parliament, addressing the impact of technology on the creative industries in front of the Communications and Digital Committee. Her appearance raised crucial questions about the future of art and technology, most notably reflected in a question from the committee: “How do you produce art and how is this different to what human artists produce?”
Developed by DeepMind, the creator of the renowned AlphaGo that defeated professional Go player Lee Sedol in 2016, the machine-learning model AlphaFold has served as more than a solution to the seemingly impossible 50-year-old protein folding problem since its launch in November 2020. With the massive update to its open-source protein structure database this July, which expanded the number of 3D structures available from one million to 200 million, more than 500,000 researchers from 190 countries have used the database to aid their research. AlphaFold has also been cited in 295 pieces of AI-related research literature this year, compared to 69 in 2020.
Although deep learning architectures capable of creating realistic images, such as generative adversarial networks (GANs), date back as early as 2014, diffusion models have certainly marked 2022 as the year of AI art, with the likes of DALL-E 2, Imagen, Stable Diffusion, and Midjourney receiving mainstream attention.
Earlier this year, technology company Nvidia unveiled GET3D (the acronym comes from “a Generative model that directly generates Explicit Textured 3D meshes [...]”), which can generate 3D shapes with high-fidelity textures and complex geometric details using only 2D images. As VR and AR technologies become increasingly popular, GET3D could drastically improve the efficiency of creating immersive interfaces. Since the generated 3D models use the same format as mainstream graphics software, users can export them directly into 3D renderers and game engines for further editing.
A group of researchers from the University of California, Berkeley, has developed an algorithm called “DayDreamer” that enables a quadruped robot to teach itself to walk within an hour. In contrast to conventional robot training methods, the DayDreamer algorithm uses artificial neural networks to learn a world model by interacting with the physical environment rather than a simulated one. The learned world model allows the AI to better predict the results of a series of actions.
Following the Blake Lemoine controversy, in which the computer scientist was put on leave by Google after claiming the company’s chatbot had feelings, Meta unveiled in July a project codenamed “No Language Left Behind” (NLLB) that aims to offer high-quality translations for more than 200 languages, including low-resource ones like Asturian, Luganda, and Urdu, through open-source models. In late October, the company also introduced the first AI-powered speech-to-speech translator that works for Hokkien, a Chinese dialect that is predominantly spoken rather than written.
In late November, OpenAI successfully trained a neural network to play the video game Minecraft by having it study 70,000 hours of gameplay video. Trained with a semi-supervised imitation learning method dubbed “Video PreTraining,” the bot can perform tasks of varied difficulty, from rudimentary crafting and hunting to the more advanced “pillar jumping,” a common technique for gaining elevation by repeatedly jumping while placing a block underneath oneself. It is also the first bot that can craft diamond tools, a task that normally takes a competent human player around 20 minutes of high-speed mouse clicks, or 24,000 actions.