Could This New Competition Replace The Turing Test?
Levesque's Winograd Schema Challenge (WSC) takes a different approach. Instead of basing the test on short free-form conversations, the WSC poses a set of multiple-choice questions that look like this:
I. The trophy would not fit in the brown suitcase because it was too big. What was too big?
Answer 0: the trophy
Answer 1: the suitcase
II. The town councilors refused to give the demonstrators a permit because they feared violence. Who feared violence?
Answer 0: the town councilors
Answer 1: the demonstrators
The answers to these multiple-choice questions should be fairly obvious to the average person, but they remain ambiguous to machines that lack human-like reasoning or intelligence. Indeed, humans draw on a surprising number of cognitive skills when answering them, including spatial and interpersonal reasoning, knowledge of the typical sizes of objects, a sense of how political demonstrations unfold, and many other kinds of commonsense reasoning.
What's more, because of the way the questions are constructed, a machine can't simply search the Internet or look up the answer in a database; it has to reason the answer out. So an expert system like IBM's Watson would have a hard time with this test, since it was designed for different tasks altogether, namely natural language processing and data acquisition and analysis (which it handles by taking a probabilistic approach).
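To make the format concrete, here is a minimal sketch (not part of the challenge itself) of what a test harness for these questions might look like. The data layout and function names are illustrative assumptions, not the official WSC format; the point is simply that each question offers two candidate answers, so a system with no reasoning at all can only hope for roughly 50 percent accuracy.

```python
import random

# Hypothetical representation of two Winograd schema items from the article.
# The field names are illustrative, not the official challenge format.
SCHEMAS = [
    {
        "sentence": ("The trophy would not fit in the brown suitcase "
                     "because it was too big."),
        "question": "What was too big?",
        "candidates": ["the trophy", "the suitcase"],
        "correct": 0,
    },
    {
        "sentence": ("The town councilors refused to give the demonstrators "
                     "a permit because they feared violence."),
        "question": "Who feared violence?",
        "candidates": ["the town councilors", "the demonstrators"],
        "correct": 0,
    },
]

def guess_randomly(item):
    """A baseline with no reasoning at all: pick one of the two answers at random."""
    return random.randrange(len(item["candidates"]))

def score(solver, items, trials=10_000):
    """Average accuracy of a solver over many repeated runs."""
    correct = sum(
        solver(item) == item["correct"]
        for _ in range(trials)
        for item in items
    )
    return correct / (trials * len(items))

if __name__ == "__main__":
    # Hovers around 0.50: with two answers per question, blind guessing gets
    # about half of them, far below what an average person achieves.
    print(f"Random-guess accuracy: {score(guess_randomly, SCHEMAS):.2f}")
```

Closing the gap between that 50 percent chance level and near-perfect human performance is exactly what the competition is meant to measure.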
The WSC is being proposed as a way to measure and track progress in the development of automated human-like reasoning skills. To that end, a test will be administered on a yearly basis by the Commonsense Reasoning Symposium, the first to be held at the 2015 AAAI Spring Symposia at Stanford University, March 23 to 25. Each year, the test will feature an entirely new set of questions. Researchers and students will be invited to design computer programs that simulate human intelligence, and a winning entry that meets the baseline for human performance will receive a grand prize of $25,000.
George Dvorsky