Jacques Durand (Centre National de la Recherche Scientifique & Université de Toulouse II)

Doing linguistics with large corpora: two case studies in phonology

Abstract

In this paper, I try to establish why corpora play a significant role within modern linguistics and yet occupy a controversial place within the field. After some preliminary remarks, I examine the place of data within Chomskyan linguistics. I then proceed to a partial criticism of the classical Chomskyan position and argue that corpora can throw important light on various phenomena. I explore two areas from the phonology of English and French. First, I illustrate the study of segmental variation in English phonology through data collected within the PAC project (Phonologie de l'anglais contemporain). Second, I explore a well-known area of French phonology studied within the PFC project (Phonologie du français contemporain), namely liaison, a linking strategy at the interface between phonology, morphology, syntax and meaning. On the basis of these discussions, I show how major changes have affected the field in relation to data and mental representations. The emergence of usage-based grammars, connectionism and other models forces us to rethink our attitude, even if we do not have to share all the theoretical presuppositions of such approaches. Recent work by Chomsky and others within evolutionary theory paradoxically reinforces the place of data within linguistics. In the conclusion, I suggest why intuition is likely to remain an indispensable tool for theory construction in linguistics. But this should be no excuse for not strengthening our databases, as bad or insufficient data rarely lead to good theories.

Relevant recent contributions

Durand, Jacques (2004). English in early 21st century Scotland: a phonological perspective. La tribune internationale des langues vivantes 36: 87-105.

Durand, Jacques, Carr, Philip & Pukli, Monika (2004). The PAC project: principles and methods. La tribune internationale des langues vivantes 36: 24-35.

Durand, Jacques & Lyche, Chantal (2003). Le projet 'Phonologie du Français Contemporain' (PFC) et sa méthodologie. In: E. Delais & J. Durand (eds.), Corpus et variation en phonologie du français : méthodes et analyses. Toulouse: Presses Universitaires du Mirail. 212-276.

Durand, Jacques & Lyche, Chantal (2008). French liaison in the light of corpus data. Journal of French Language Studies 18/1: 33-66.

Durand, Jacques (2006). Mapping French Pronunciation. The PFC project. In: J.-P. Montreuil & C. Nishida (eds.), New Perspectives on Romance Linguistics. Vol. 2: Phonetics, Phonology and Dialectology. Selected Papers from the 35th Linguistic Symposium on Romance Languages (LSRL), Austin, Texas, February 2005. Amsterdam: John Benjamins. 65-82.