Schedule Spring 2019

Place: MIT 46-5165, Time: Thursdays, 5pm


2/14 Reading Group
Structures, not strings: linguistics as part of the cognitive sciences by Everaert et al. (2015)

2/21 Sherry Yong Chen (MIT Linguistics)

The linguistics of one, two, three

The interpretation of number words in context is of interest not only to linguists, but also to logicians, psychologists, and computer scientists. In this talk, we will discuss some theoretical and developmental questions related to the linguistics of English numerals.
Bare numerals (e.g. two, three) present an interesting puzzle to semantic and pragmatic theories, as they seem to vary between several different interpretations: ‘at least n’, ‘exactly n’, and sometimes even ‘at most n’. We will examine how the availability of a particular interpretation seems to depend on the interaction between linguistic structure and contextual factors, and discuss three approaches that try to capture the relationship between these interpretations. 
Turning to the acquisition of bare numerals, developmental research suggests that preschoolers by the age of 5 are able to access ‘non-exactly’ interpretations of a bare numeral in contexts where these interpretations are licensed, just like adult speakers. A natural hypothesis is that knowledge of the full range of interpretations may come through a prior understanding of the meaning of explicit expressions such as ‘at least’ and ‘at most’ in English. This turns out to be questionable, however, since it has also been shown that 5-year-olds have not yet acquired the meaning of the expressions ‘at least’ and ‘at most’. Time permitting, we will end with a discussion of what all this means for the development of numerical concepts and/or language development in general.

3/14 Reading Group: Danfeng Wu (MIT Linguistics), Syntactic Theory: A Formal Introduction Chapters 9.3-9.9

What is the role of psycholinguistic evidence (specifically evidence from language processing) in the study of language? What is the relation between knowledge of language and use of language? We hope to explore these questions through a discussion of an HPSG textbook chapter. HPSG (Head-driven Phrase Structure Grammar) is a different syntactic framework from generative transformational grammar, and is surface-oriented, constraint-based and strongly lexicalist. This textbook chapter argues that HPSG is more compatible than transformational grammar with observed facts about language processing. For instance, language processing is incremental and rapid (e.g. Tanenhaus et al. 1995 & 1996, Arnold et al. 2002). The order of presentation of the words largely determines the order of the listener’s mental operations in comprehending them. And lexical choices have a substantial influence on processing (MacDonald et al. 1994). For these reasons, the chapter argues, such psycholinguistic evidence supports an HPSG type of grammar and poses difficulties for transformational grammar.

3/21 Paola Merlo (University of Geneva)

In the computational study of intelligent behaviour, the domain of language is distinguished by the complexity of the representations and the sophistication of the domain theory that is available. It also has a large amount of observational data available for many languages. The main scientific challenge for computational approaches to language is the creation of theories and methods that fruitfully combine large-scale, corpus-based approaches with the linguistic depth of more theoretical methods. I report here on some recent and current work on word order universals and argument structure that exemplifies the quantitative computational syntax approach. First, we demonstrate that typological frequencies of noun phrase orderings (Universal 20) are systematically correlated with abstract syntactic principles at work in structure building and movement. Then, we investigate higher level structural principles of efficiency and complexity. In a large-scale, computational study, we confirm a trend towards minimization of the distance between words, over time and across languages. In the third case study, much like the comparative method in linguistics, cross-lingual corpus investigations take advantage of any corresponding annotation or linguistic knowledge across languages. We show that corpus data and typological data involving the causative alternation exhibit interesting correlations explained by the notion of spontaneity of an event. Finally, time permitting, I will discuss current work investigating whether the notion of similarity in the intervention theory of locality is related to current notions of similarity in word embedding space.

4/4 Candace Ross (MIT CSAIL)


4/11 Reading Group
Integration of visual and linguistic information in spoken language comprehension by Tanenhaus et al. (1995)

4/18 Tal Linzen (JHU Cognitive Science)


5/2 Rachel Ryskin (MIT BCS)


5/9 Joshua Hartshorne (Boston College)

In popular culture, robots or other theory-of-mind-impaired fictional characters have difficulties with metaphors or indirect speech because they lack "common sense". In fact, common sense plays an even more central role in language processing than these depictions suggest. Compare:

(1) The city council denied the protesters a permit because they feared violence.

(2) The city council denied the protesters a permit because they advocated violence.

Most people interpret "they" as referring to the city council in (1) and the protesters in (2). It is hard to explain this without invoking common sense. Similarly, compare:

(3) The hat fit in the box because it was big.

(4) The hat didn't fit in the box because it was big.

Such examples are ubiquitous in natural language. In this talk, I will describe recent computational and experimental work that tries to make sense of this commonplace phenomenon.

5/16 Ethan Wilcox (Harvard)

Recurrent Neural Networks (RNNs) are one type of neural model that has been able to achieve state-of-the-art scores on a variety of natural language tasks, including translation and language modeling (which is used in, for example, text prediction). However, the nature of the representations that these 'black boxes' learn is poorly understood, raising issues of accountability and controllability of NLP systems. In this talk, I will argue that one way to assess what these networks are learning is to treat them like subjects in a psycholinguistic experiment. By feeding them hand-crafted sentences that reveal the model's underlying knowledge of language, I will demonstrate that they are able to learn the filler-gap dependency, and are even sensitive to the hierarchical constraints implicated in the dependency. Next, I turn to "island effects", or structural configurations that block the filler-gap dependency, which have been theorized to be unlearnable. I demonstrate that RNNs are able to learn some of the "island" constraints and even recover some of their pre-island gap expectation. These experiments demonstrate that linear statistical models are able to learn some fine-grained syntactic rules; however, their behavior remains un-humanlike in many cases.