Nathan Tindall
In what follows, I will discuss common arguments for
linguistic nativism and use them to argue that a machine could acquire
an accurate grammar. These thoughts reflect a synthesis of ideas emphasized in
other classes I am taking (PSYCH 141: Cognitive Development, CS 221: Artificial
Intelligence) with linguistics.
There are two dominant viewpoints in developmental psychology
as to the theory of language acquisition: nativism and empiricism. While the
former argues that humans have some unique language-learning module with which
we are equipped at birth, the latter argues that language is attained through a
more generalized learning paradigm based upon statistical learning. There is a
continuum between these viewpoints; for example, some assert that there may be
primitive elements (such as syllables) at the base of the system, to which we
have some biologically given sensitivity. Understanding how humans
acquire language has critical implications for improving natural language
processing. In particular, there is an argument to be made that, if properly
equipped, machines may be able to acquire parts of the language system with
which humans are equipped.
Nativist interpretations of language acquisition argue that
the ability to learn a language grammar is hard wired into some specialized
part of the human brain separate from other learning capacities (i.e. a
Universal Grammar). Noam Chomsky’s “Poverty of the Stimulus” argument,
presented in his essay “Rules and Representations” (1980), argues for linguistic
nativism by asserting that there are patterns in natural language that cannot
be learned from positive evidence alone. Linguistic recursion, as discussed by
Olek in class today, is one frequently cited example. Positive
evidence, however, is the primary stimulus to which children are exposed; they
are only very rarely exposed to examples of non-language. This argument for
the Universal Grammar (i.e., linguistic nativism) hinges on the claim that some
elements of language cannot be acquired from experience alone.
The basic paradigm for artificial intelligence is to present
the machine with training data, on which it “learns” how to distinguish
elements, make optimal decisions, and give reliable output through statistical
pattern recognition. Machine learning is used throughout technology and the
web to emulate “intelligent” behavior. Despite the fact that Chomsky
likely considers most statistical models to be meaningless, regardless of their
accuracy, the distinction between positive and negative stimuli has
implications for the way that natural language processing algorithms should be
developed. Unlike a child, machines can be fed both positive and negative
language stimuli, supporting the idea that a machine with adequate training
could acquire properties of language that humans cannot attain through
real-world stimuli alone. For the prospects of natural language processing, this is
good news, as it means that a sufficient algorithm may be able to match the
level of computation that humans are biologically equipped to perform.
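To make the positive/negative distinction concrete, here is a minimal sketch of the kind of statistical learner described above. The vocabulary, the example sentences, and the bigram-counting scheme are all my own invented toy illustration, not a real NLP system: the learner is trained on both well-formed sentences (positive stimuli) and scrambled, ill-formed strings (negative stimuli), the kind of evidence children rarely receive.

```python
# Toy sketch (assumption: invented vocabulary and sentences) of a learner
# that uses BOTH positive and negative evidence, unlike a child.
from collections import Counter

def bigrams(sentence):
    """Break a sentence into adjacent word pairs."""
    words = sentence.split()
    return list(zip(words, words[1:]))

# Positive evidence: well-formed toy sentences.
positive = [
    "the dog chased the cat",
    "the cat saw the dog",
    "a dog saw a cat",
]
# Negative evidence: scrambled, ill-formed strings.
negative = [
    "dog the the chased cat",
    "cat dog the saw the",
    "saw a a cat dog",
]

# Count how often each bigram appears in each kind of evidence.
good = Counter(b for s in positive for b in bigrams(s))
bad = Counter(b for s in negative for b in bigrams(s))

def is_grammatical(sentence):
    """Score a new sentence by the net weight of its bigrams."""
    score = sum(good[b] - bad[b] for b in bigrams(sentence))
    return score > 0

print(is_grammatical("the dog saw the cat"))   # True
print(is_grammatical("cat dog the the saw"))   # False
```

The point of the sketch is only this: the negative examples let the learner penalize word orders it would otherwise have no evidence against, which is exactly the kind of evidence the Poverty of the Stimulus argument says children lack.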
Opening the question of positive and negative stimuli for
broader linguistic discussion, to what extent do you think we rely on each in order
to make inferences about language? How do you feel about this theory that machines
can acquire linguistic abilities that humans may be hard-wired to execute? What
broader implications might this have for the relationship between humans and
machines? Personally, I find the idea of machines outperforming humans
overwhelming; however, the process of teaching computers to be better
at language may, in fact, shed light on the way that language is learned and
used.
Chomsky, Noam (1980). "Rules and Representations." Behavioral and Brain Sciences 3(1).
Newport, Elissa (2011). "The Modularity Issue in Language Acquisition: A Rapprochement? Comments on Gallistel and Chomsky." Language Learning and Development 7(4), 279-286.
I think what Chomsky means by saying most statistical models are meaningless is that they are meaningless in terms of helping machines acquire true "intelligence," which can only be achieved through people's understanding of language via the language faculty. Both within a language and cross-linguistically, there are syntactic rules that appear unrelated but can be elegantly unified by linguistic theories of generative grammar, which is why many linguists regard the current theories as effective in explaining the mechanism of first language acquisition.
Indeed, as the author points out, generative linguists believe that "some elements of (one's native) language" are not acquired through exposure to outside experience, but they are nevertheless acquirable, as a typical second language acquisition process often includes formal learning of such elements. Therefore, if we compare machine learning to a machine learning a foreign language, it won't be a problem, at least methodologically, to train machines with both positive and negative stimuli, and it may in fact be a more effective approach. Whether machines can be "intelligent" is mostly a philosophical issue, but for linguists and computer scientists alike, we need to better understand the elements of language in order to come up with good "textbooks" for computers to learn from.
Hi Nathan,
I think natural language processing algorithms will be capable of producing language, since they are already able to “understand” language, although not at a deep or emotional level. I have seen a Stanford research project in which the student developed an algorithm that breaks down sentences into components and then uses an online language system to further understand the sentence. It functions at a level equivalent to IBM’s Watson. If you want more details about the online database she used, I can look it up for you. However, I think that when any machine-learning algorithm produces language, it cannot really know how to say something colloquially, even if it is efficient at correct syntax. For example, Chomsky has many examples of sentences that have good syntax but are nonsensical, such as the cliché sentence “colorless green ideas sleep furiously,” which I am in the process of writing a blog post about. Because of this, I think children will always have better "colloquial" syntax, while machines will be better at pure syntax, until the child grows older.