Sunday, November 30, 2014

How Linguistic Tricks Can Solve Different Problems

This quarter, I have been able to study existing technologies through several new lenses. In a communications class, we explored the Turing Test and what it signifies for artificial intelligence. With the knowledge I have accumulated in our linguistics class, I am able to criticize the Turing Test as misguided and to suggest solutions. To build on the work done in my communications class, let me first describe the Turing Test and the problems I have diagnosed. Then, I will look at linguistic approaches to the problems highlighted.

For much of the early and mid-20th century in the US, behaviorism - led by the psychologists Watson and Skinner - was the dominant approach to psychology. A behaviorist defines mental states purely in terms of behavior: if you respond in the way appropriate to someone in a given state, then you are in that state. For example, if being happy (the mental state) were defined as smiling, then a smiling robot would be ‘happy’. In this way, behaviorism bypasses any reference to inner mental events.

In this post, I will be discussing intelligence - and so it is important to think about a possible definition of the term. Turing’s definition is strongly behaviorist: if a machine behaves as intelligently as a human, then it is as intelligent as a human. On this view, it does not matter whether a machine can 'think' – it must simply act as if it is thinking. Stemming from this line of thought, Alan Turing designed “The Turing Test,” a behavioral test for the presence of mind, thought, or intelligence, framed as an imitation game. In broad terms, a human judge must determine whether the entity they are communicating with through a computer is a human or a machine by asking the wannabe-human questions.[1] With this understanding, let us criticize the test.

It makes me uncomfortable to think that a machine with an extremely basic mind (nonexistent, one might argue) could pass the Turing Test and be deemed intelligent. Because the Turing Test recognizes intelligence in programs that sustain a human-like conversation, programs try to trick humans. For example, a tester could ask a difficult arithmetic question such as “Calculate 2231*347” and the program would deliberately delay its answer to mimic human computational behavior. This, to me, shows that the Turing Test has lost its way. Here, we are testing for intelligence against a human standard, not exploring what it really means to be intelligent. Imagine a robot that had an output for every possible input, and that this robot looked like a human - it even behaved like one too. Is it intelligent, or is it just a simulation? This discussion suggests that we must try to determine how the program’s internal structure – its mind – behaves, and see whether that matches our existing notions of intelligence. The problem, though: how can we do that without looking at its internal states? Like Turing, we have to settle for judging behavior, because we cannot inspect internal structure.
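To make the stalling trick concrete, here is a minimal sketch in Python. Everything in it - the function name, the pattern matching, the delay values - is my own illustration, not taken from any actual Turing Test entrant:

import random
import re
import time

def answer_arithmetic(question: str) -> str:
    """Answer simple 'Calculate A*B' questions with a human-like delay."""
    match = re.search(r"(\d+)\s*\*\s*(\d+)", question)
    if match is None:
        return "Sorry, I was never much good at maths."
    a, b = int(match.group(1)), int(match.group(2))
    # Stall for a few seconds, as a person working it out on paper might,
    # rather than replying in the microseconds the CPU actually needs.
    time.sleep(random.uniform(5.0, 15.0))
    return f"I make it {a * b}."

# 2231 * 347 = 774157, but delivered only after a plausibly human pause.
print(answer_arithmetic("Calculate 2231*347"))

The delay adds nothing to the machine’s competence - it only makes the output look less machine-like, which is exactly the sense in which the test rewards imitation over intelligence.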

Instead of Turing’s imitation game, I think there should be a series of questions (constantly refreshed, so programs cannot cheat by memorizing outputs) that test a program’s ability to reason. It turns out that certain linguistic tools test reasoning very well. Programs developed for the Turing Test perform incredibly well across many areas, especially syntax. However, certain semantic problems that require extra-linguistic context cause these programs difficulty. Consider the example question:
The large ball crashed right through the table because it was made of Styrofoam. What was made of Styrofoam?[2]
a. The large ball
b. The table
This question is an example of anaphora, where the interpretation of one expression depends upon another expression in context. In order to answer it, you need some domain-specific knowledge about materials and how things break. Humans have no trouble with such questions because they have the relevant world knowledge stored away. Computers, however, find them very difficult, because they do not know where to retrieve the information from – or even what information needs to be retrieved. Let us consider another example:

a. Only a few of the children ate their ice-cream. They ate the strawberry flavor first.
b. Only a few of the children ate their ice-cream. They threw it around the room instead.

Here, the anaphora can pick out what is called a complement set. In (a), the pronoun ‘they’ refers to the children who ate their ice cream, whereas in (b) it refers to the children who did not. Using common sense, we quickly decipher whom the pronoun refers to. For a computer, however, this is once again a difficult process, because there are multiple possible referents. Crucially, to solve these questions, the computer must have a richer understanding of the state of the world.
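To see what such world knowledge might look like in practice, here is a toy Python sketch of a resolver for the Styrofoam question. The hand-written ‘knowledge base’ is entirely my own invention - it stands in for the open-ended common sense a real system would need:

# Hypothetical one-word 'knowledge base': which materials give way and
# which smash through. A real system would need open-ended world knowledge.
FRAGILE = {"styrofoam", "glass"}
STURDY = {"steel", "oak"}

def resolve_pronoun(material: str) -> str:
    """Resolve 'it' in: 'The large ball crashed right through the
    table because it was made of <material>.'"""
    if material in FRAGILE:
        # A fragile material explains why something gave way,
        # so 'it' must be the thing that broke: the table.
        return "the table"
    if material in STURDY:
        # A sturdy material explains why something smashed through,
        # so 'it' must be the thing doing the smashing: the ball.
        return "the large ball"
    return "unresolved - no relevant world knowledge"

print(resolve_pronoun("styrofoam"))  # the table
print(resolve_pronoun("steel"))      # the large ball

The telling point is that swapping a single word (‘steel’ for ‘Styrofoam’) flips the referent even though the syntax is identical; nothing but the encoded facts about materials decides the answer, and acquiring such facts for every possible question is precisely what these programs lack.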

By designing questions like these, which probe a computer’s internal structure, we can better determine whether a machine is intelligent. Such linguistic tricks, then, should be used in the Turing Test. But are such questions sufficient to diagnose intelligence? Is this kind of semantic knowledge synonymous with understanding? Is this version of the test still too close to a human standard of intelligence? Overall, I hope I have shown that Turing Test questions should try to peer into the internal structure of a program by asking semantically challenging questions and observing its output.

[1] Oppy, Graham. "The Turing Test." Stanford Encyclopedia of Philosophy, 9 Apr. 2003. Web. 25 Nov. 2014. <http://plato.stanford.edu/entries/turing-test/>.
[2] Lohr, Steve. "Looking to the Future of Data Science." Bits Blog, The New York Times, 27 Aug. 2014. Web. 10 Nov. 2014. <http://bits.blogs.nytimes.com/2014/08/27/looking-to-the-future-of-data-science/?_r=0>.

1 comment:

  1. This is an incredible post, addressing important questions in artificial intelligence, computer science, philosophy, and linguistics. Even though we discussed your thoughts in section, I thought it would be beneficial to mention a few of the things we covered. One major topic was John Searle’s Chinese Room thought experiment. It asks us to imagine a native English speaker who knows no Chinese, shut in a “black box” of a room with a rule book and all the tools necessary to manipulate Chinese symbols correctly. Imagine that people outside the room send in strings of Chinese symbols which, unknown to the person in the room, are questions in Chinese (the input), and that by following the instructions in the rule book the man is able to pass out Chinese symbols that are correct answers to those questions (the output). The rule book enables the person in the room to pass the Turing Test for understanding Chinese even though he does not understand a word of Chinese. Searle argues that a computer running a program does essentially the same thing, and so, when it comes to language, possesses nothing more than the man in the room does.
