Wednesday, October 15, 2014

Artificial Intelligence and Language: Distinguishing Children from Machines

Nathan Tindall

In what follows, I will discuss common arguments for linguistic nativism and use them to argue that a machine could acquire an accurate grammar. These thoughts reflect a synthesis of ideas emphasized in other classes I am taking (PSYCH 141: Cognitive Development, CS 221: Artificial Intelligence) with linguistics.

There are two dominant viewpoints in developmental psychology concerning language acquisition: nativism and empiricism. The former argues that humans are equipped at birth with a unique language-learning module, while the latter argues that language is attained through a more general learning mechanism based on statistical learning. There is a continuum between these viewpoints; for example, some researchers propose that a small set of primitive elements (such as syllables) sits at the base of the system, and that we have a biological sensitivity to these primitives while the rest is learned. Understanding how humans acquire language has critical implications for improving natural language processing. In particular, there is an argument to be made that, if properly equipped, a machine may be able to acquire parts of the language system that humans are biologically equipped with.
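To make the empiricist picture concrete, here is a minimal sketch of one kind of statistical learning often discussed in this literature: segmenting a continuous syllable stream by tracking transitional probabilities between adjacent syllables. The syllable stream, the made-up "words," and the boundary threshold below are all invented purely for illustration, not taken from any real study.

```python
# Toy illustration of statistical learning over syllables: low transitional
# probability between adjacent syllables suggests a word boundary.
from collections import Counter

# Hypothetical continuous input: three made-up "words" repeated back to back.
stream = "bidaku" "padoti" "golabu" "bidaku" "golabu" "padoti" "bidaku"
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

pair_counts = Counter(zip(syllables, syllables[1:]))
first_counts = Counter(syllables[:-1])

def transitional_probability(a, b):
    """Estimate P(b follows a) from the stream."""
    return pair_counts[(a, b)] / first_counts[a]

# Mark likely word boundaries wherever the transitional probability dips.
for a, b in zip(syllables, syllables[1:]):
    tp = transitional_probability(a, b)
    boundary = "  <- likely boundary" if tp < 0.75 else ""
    print(f"{a}->{b}: {tp:.2f}{boundary}")
```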

Nativist interpretations of language acquisition argue that the ability to learn a grammar is hard-wired into a specialized part of the human brain, separate from other learning capacities (i.e., a Universal Grammar). Noam Chomsky’s “Poverty of the Stimulus” argument, presented in “Rules and Representations” (1980), supports linguistic nativism by asserting that there are patterns in natural language that cannot be learned from positive evidence alone; linguistic recursion, as discussed by Olek in class today, is one frequently cited example. Positive evidence, however, is the primary stimulus to which children are exposed; they are very rarely exposed to examples of non-language. This argument for Universal Grammar (i.e., linguistic nativism) hinges on the claim that some elements of language cannot be acquired from experience alone.
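As a rough illustration of why recursion comes up here, the toy rule below embeds one noun phrase inside another with no fixed limit, so no finite sample of sentences exhausts the pattern. The vocabulary and the rule itself are my own invention for illustration and are not meant to model Chomsky’s formal argument.

```python
# A minimal recursive noun-phrase rule: NP -> NP "that" NP V
import random

random.seed(0)  # fixed seed so the example output is reproducible

def noun_phrase(depth):
    """Build a noun phrase, optionally embedding a relative clause."""
    np = random.choice(["the cat", "the dog", "the linguist"])
    if depth > 0:
        # Recursive step: embed another noun phrase inside this one.
        np += f" that {noun_phrase(depth - 1)} {random.choice(['saw', 'chased'])}"
    return np

for d in range(3):
    print(f"{noun_phrase(d)} slept.")
# Deeper values of d yield center-embedded clauses along the lines of
# "the cat that the dog that the linguist saw chased slept."
```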

The basic paradigm for artificial intelligence is to present the machine with training data, from which it “learns” to distinguish elements, make optimal decisions, and give reliable output through statistical pattern recognition. Machine learning is used throughout modern technology and the web to emulate “intelligent” behavior. Although Chomsky likely considers most statistical models meaningless regardless of their accuracy, the distinction between positive and negative stimuli has implications for how natural language processing algorithms should be developed. Unlike a child, a machine can be fed both positive and negative language stimuli, which supports the idea that a machine with adequate training could acquire properties of language that humans cannot attain from real-world stimulus alone. For the prospects of natural language processing, this is good news: it means that a sufficiently powerful algorithm may be able to match the level of computation that humans are biologically equipped to perform.
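As a deliberately toy sketch of this supervised setup, the snippet below trains a classifier on explicitly labeled positive and negative strings, i.e., exactly the kind of negative evidence children rarely receive. The sentences, the bigram features, and the choice of model are assumptions made purely for illustration, not a serious grammaticality system.

```python
# Toy "grammaticality" classifier trained on both positive and negative evidence.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

sentences = [
    "the dog chased the cat",   # positive evidence (grammatical)
    "a child saw the dog",
    "the cat slept on the mat",
    "dog the cat chased the",   # negative evidence (ungrammatical)
    "saw child a dog the",
    "mat the on slept cat",
]
labels = [1, 1, 1, 0, 0, 0]     # 1 = grammatical, 0 = ungrammatical

# Represent each string by its word bigrams and fit a simple classifier.
vectorizer = CountVectorizer(ngram_range=(2, 2))
X = vectorizer.fit_transform(sentences)
model = LogisticRegression().fit(X, labels)

test = ["the child chased the dog", "chased dog the the child"]
print(model.predict(vectorizer.transform(test)))  # with luck, [1, 0]
```

The point of the sketch is not the accuracy of such a tiny model, but the shape of the training signal: the learner is told outright which strings are non-language, something a child almost never is.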

Opening the question of positive and negative stimuli to broader linguistic discussion: to what extent do you think we rely on each to make inferences about language? How do you feel about the idea that machines could acquire linguistic abilities that humans may be hard-wired to execute? What broader implications might this have for the relationship between humans and machines? Personally, I find the prospect of machines outperforming humans an overwhelming concept; however, the process of teaching computers to be better at language may, in fact, shed light on the way that language is learned and used.

Word Count: 579

References 

Chomsky, Noam (1980). "Rules and Representations." Behavioral and Brain Sciences, 3(1).

Newport, Elissa (2011). "The Modularity Issue in Language Acquisition: A Rapprochement? Comments on Gallistel and Chomsky." Language Learning and Development, 7(4), 279–286.

Valian, Virginia (2009). "Innateness and Learnability." In E. L. Bavin (Ed.), Cambridge Handbook of Child Language (pp. 15–34). Cambridge: Cambridge University Press.

2 comments:

  1. I think what Chomsky means by saying most statistical models are meaningless is that they are meaningless in terms of helping machines acquire true "intelligence," which can only be achieved through people's understanding of language via their language faculty. Both within a language and cross-linguistically, there are syntactic rules that appear unrelated but can be elegantly unified through linguistic theories of generative grammar, which is why many linguists regard the current theories as effective in explaining the mechanism of first language acquisition.

    Indeed, as the author points out, generative linguists believe that "some elements of (one's native) language" are not acquired through exposure to outside experience, yet they are nevertheless acquirable: a typical second language acquisition process often includes learning such elements formally. Therefore, if we compare machine learning to a machine learning a foreign language, it is not a problem, at least methodologically, to train machines with both positive and negative stimuli; in fact, it may be a more effective approach. Whether machines can be "intelligent" is mostly a philosophical issue, but linguists and computer scientists alike need to better understand the elements of language in order to come up with good "textbooks" for computers to learn from.

  2. Hi Nathan,

    I think natural language processing algorithms will be capable of producing language, since they are already able to “understand” language, although not at a deep or emotional level. I have seen a Stanford research project in which a student developed an algorithm that breaks sentences down into components and then uses an online language system to further understand them; it functions at a level comparable to IBM’s Watson. If you want more details about the online database she used, I can look it up for you. However, I think that when any machine-learning algorithm produces language, it cannot really know how to say something colloquially, even if it is good at producing correct syntax. Chomsky has many examples of sentences that have good syntax but are nonsensical, such as the cliché sentence “colorless green ideas sleep furiously,” which I am in the process of writing a blog post about. Because of this, I think children will always have better "colloquial" syntax, while machines will be better at pure syntax, at least until the child grows older.
