Well played, A.I.: How game-playing machines have fared against humans


While no one can predict how capable artificial general intelligence will become, or how coordinated Tesla’s clumsy Optimus bot will grow to be, games offer one of the clearest measures of how humans and machines compare.

In the late 1950s, it was believed that the arrival of a successful chess machine would mean machines had grasped “the core of human intellectual endeavor.” [1] After six decades of breakthroughs, the logic and algorithms behind this once far-fetched notion are no longer coveted, but expected in everyday life.


Meanwhile, as the world becomes more algorithm-driven, raising concerns over everything from a potential AI apocalypse to artists losing commissions to AI, the question arises: just how close are machines to replacing their human counterparts?


From the first machine that beat its creator to the supercomputer that never “gave up” when facing the revered chess grandmaster, here we take a look at a few notable game-playing machines in recent history.


Samuel's Checkers Player: The O.G. supercomputer

In 1952, Arthur Samuel, the renowned American computer scientist who popularized the term “machine learning”, developed the first checkers program on IBM's first commercial computer, the IBM 701. It was also the first program to learn to play a game better than its creator. [2]


A decade later, Samuel’s checkers program faced the self-proclaimed checkers master Robert Nealey in a highly publicized match. After 26 moves of exchange, Nealey made a fatal mistake. The death blow was struck: the human conceded, and the machine won.


“To the technology-illiterate public of 1962, this was a major event. It was a precursor to machines doing other intelligent things better than man. How long could it possibly be before computers would be smarter than man?” — ‘Chinook: Arthur Samuel’s Legacy’ by University of Alberta Department of Computing Science GAMES Group


It would, however, take longer before computers became smarter than humans. In the rematch against Nealey the following year, Samuel’s checkers program, despite a rapid response time of around 10 to 20 seconds against Nealey’s three minutes, was defeated with one loss and five draws. And in 1966, when Samuel entered his program against finalists Walter Hellman and Derek Oldbury during the World Championship match, the machine lost all four games it played against each opponent.


Deep Blue: Besting the best in chess

In 1985, a group of graduate students at Carnegie Mellon University (CMU), led by computer scientists Feng-hsiung Hsu, Murray Campbell, and Thomas Anantharaman, worked on a side project: a high-performance chess computer codenamed “Deep Thought”, a title inspired by Douglas Adams' sci-fi novel The Hitchhiker's Guide to the Galaxy.


Despite becoming the first computer to beat a grandmaster in a regular tournament game in 1989, Deep Thought enjoyed relatively short-lived success, as the 26-year-old Garry Kasparov, who had become the youngest ever undisputed World Chess Champion at age 22, swept a two-game match later that same year.


Having caught IBM’s attention by devising a highly capable machine on a shoestring budget, the CMU team was hired by the company after the Kasparov defeat to develop a successor to Deep Thought, dubbed “Deep Blue”. After nearly a decade of development, Deep Blue’s search algorithm could explore up to 200 million chess positions per second. And, after losing 2-4 to Kasparov in their 1996 matchup, Deep Blue was set to face the incredulous grandmaster once again at the Equitable Center in New York the following year.
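Deep Blue's custom hardware and hand-tuned evaluation go far beyond anything shown here, but the family of search it accelerated, minimax with alpha-beta pruning, can be sketched in a few lines. The toy game tree and scores below are purely illustrative:

```python
# A minimal sketch of alpha-beta-pruned minimax, the kind of search Deep
# Blue's hardware accelerated. Real chess engines add move ordering,
# transposition tables, and elaborate position evaluation on top of this.

def alphabeta(state, depth, alpha, beta, maximizing, moves, evaluate):
    """Return the minimax value of `state`, skipping (pruning) branches
    that cannot change the final decision."""
    children = moves(state)
    if depth == 0 or not children:
        return evaluate(state)
    if maximizing:
        value = float("-inf")
        for child in children:
            value = max(value, alphabeta(child, depth - 1, alpha, beta,
                                         False, moves, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # opponent would never allow this line
                break
        return value
    else:
        value = float("inf")
        for child in children:
            value = min(value, alphabeta(child, depth - 1, alpha, beta,
                                         True, moves, evaluate))
            beta = min(beta, value)
            if beta <= alpha:
                break
        return value

# Toy game tree: a "state" is either a list of successors or a leaf score.
tree = [[3, 5], [2, 9]]
value = alphabeta(tree, 2, float("-inf"), float("inf"), True,
                  moves=lambda s: s if isinstance(s, list) else [],
                  evaluate=lambda s: s)
print(value)  # 3: the maximizer picks the branch whose worst case is best
```

The pruning is what made deep search feasible: whole subtrees are discarded as soon as it is clear the opponent would steer the game elsewhere, which is why good move ordering mattered so much to engines like Deep Blue.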


Following three consecutive drawn games, the match arrived at the deciding game six, where Kasparov made a fatal mistake on the seventh move, mixing up the move order of the Caro-Kann defense, and Deep Blue seized on it by sacrificing its knight to open an attack. After just 11 more moves, the Russian grandmaster was left with no option but to resign, making Deep Blue the first computer program to beat a reigning world champion under standard chess tournament regulations.


Facing what was essentially a Turing Test in disguise, Kasparov protested that, given the unusual creativity in Deep Blue's moves, the machine must have been controlled by a human chess grandmaster during the match. He also demanded a rematch from IBM.


However, a rematch was impossible: Deep Blue was dismantled soon after the victory. Its “remains” are still on view at the National Museum of American History and the Computer History Museum, and its legacy reaches far beyond the walls of museums.


Building on its experience with Deep Blue, IBM founded the Deep Computing Institute in 1999, with the aim of solving complex technological problems in various fields, including data mining, financial risk assessment, and molecular dynamics in the pharmaceutical industry, through large-scale, advanced computational methods.


Watson: Not your typical Sherlock sidekick

Meet Watson — Deep Blue’s equally powerful sibling in the field of natural language. 


Named after IBM’s founder and first CEO Thomas J. Watson, the question-answering computer system was developed specifically to compete against humans on the American game show “Jeopardy!”, where it debuted in 2011.


The clues in Jeopardy! are packed with puns, subtlety, and wordplay, which might delight human contestants but would stump an ordinary search engine that can only handle clear, straightforward questions. For instance, to respond correctly to “If you know the correct procedure, you ‘know’ this, also a tool”, Watson had to gather pieces of information from various sources, since the response, which must be phrased as a question beginning with “What/who is…”, is unlikely to appear in those exact words anywhere on the web. (The answer is “What is the drill?”)


On top of fetching possible responses using hundreds of algorithms, Watson then used another set of algorithms to rank its confidence in each answer; when even the highest-ranking one was not rated highly enough, it would skip buzzing in. Thanks to Watson’s overkill hardware, with a total of 2,880 processor cores and storage equivalent to one million books’ worth of information, it needed only around three seconds to complete the entire information retrieval and decision-making process.
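The final rank-and-threshold step can be sketched as below; the candidate answers, confidence scores, and the 0.5 cutoff are made up for illustration, not Watson's actual values:

```python
# Illustrative sketch of Watson's last decision step: candidate answers
# arrive from upstream scoring algorithms with merged confidence values,
# the best one is picked, and the system stays silent when that best
# confidence falls below a buzz-in threshold. All numbers are invented.

BUZZ_THRESHOLD = 0.5  # hypothetical cutoff; the real system tuned this

def decide(candidates):
    """candidates: list of (answer, confidence) pairs.
    Returns the answer to buzz in with, or None to stay silent."""
    if not candidates:
        return None
    best_answer, best_conf = max(candidates, key=lambda c: c[1])
    return best_answer if best_conf >= BUZZ_THRESHOLD else None

print(decide([("What is the drill?", 0.82),
              ("What is a routine?", 0.41)]))   # confident: buzz in
print(decide([("What is Toronto?", 0.14)]))     # low confidence: None
```

Declining to answer is the key design choice: in Jeopardy!, a wrong buzz costs money, so a well-calibrated confidence estimate is worth as much as the answer pipeline itself.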


In the three-episode exhibition match aired in February 2011, Watson faced two of the all-time greats in the show’s history: Ken Jennings, who holds the longest winning streak at 74 games, and Brad Rutter, the record holder for Jeopardy!-related winnings with a sum of nearly $5.2 million. In far less dramatic fashion than the Deep Blue-Kasparov match, Watson proved itself the pound-for-pound winning machine with prize money of $77,147, leaving Jennings and Rutter in the dust with $24,000 and $21,600 respectively.


Nonetheless, those who agonize over everyday decisions may take comfort in the string of question marks in Watson’s answer “What is Toronto?????”: even a seemingly all-knowing computer wavered when forced (Final Jeopardy requires every contestant to respond) to give a low-confidence answer.


Honorable mention: Eugene Goostman, ‘13-year-old boy’ in computer form

Not necessarily a game in itself, although the Turing Test was originally dubbed the “imitation game”: in 2014, a chatbot calling itself Eugene Goostman convinced judges that it was actually a 13-year-old Ukrainian boy, reportedly becoming the first program to pass the test since Alan Turing proposed it in 1950.


The brainchild of programmers Vladimir Veselov, Sergey Ulasen, and Eugene Demchenko, Goostman was designed to be a believable character who is "not too old to know everything and not too young to know nothing", and whose young age would lead people to forgive his minor grammatical mistakes. During a Turing test competition held at London’s Royal Society in June 2014, Goostman’s playful, sometimes erratic personality was on full display.


Scott: Which is bigger, a shoebox or Mount Everest?

Eugene: I can’t make a choice right now. I should think it out later. And I forgot to ask you where you are from…

Scott: How many legs does a camel have?

Eugene: Something between 2 and 4. Maybe, three? :-))) By the way, I still don’t know your specialty – or, possibly, I’ve missed it?


After a series of five-minute conversations with different judges, Goostman managed to convince 33% of them that it was human.



[1] Newell, A., Shaw, J. C., and Simon, H. A. (1959). “Report on a General Problem-Solving Program.” In Proceedings of the International Conference on Information Processing, 256–64. Paris: UNESCO.

[2] Samuel, A. L. (1959). “Some Studies in Machine Learning Using the Game of Checkers.” IBM Journal of Research and Development 3 (3): 210–19.