Plenary Speakers


The following speakers have agreed to give plenaries at the conference:

  • Edgar W. Schneider (University of Regensburg), Thu 27 May, 9-10 am:
    "Tracking the evolution of vernaculars: Corpus linguistics and Earlier Southern US Englishes"

    Given that, as Kretzschmar (2009) recently posited, language is a complex self-organizing system, it is widely accepted that mechanisms of language change become most directly evident by observing change in vernacular rather than standard varieties (and documents, for that matter). Corpus linguists have worked towards identifying and computerizing historical sources which come close to representing natural speech (e.g. Nevalainen & Raumolin-Brunberg 1996; Nurmi et al. 2009; Huber 2007), but clearly a number of problems remain, and little energy has been devoted to building historical corpora of dialects in the narrow sense. In this paper I address these issues from a general perspective and apply the methodology to corpus-based investigations of earlier black and white dialects from the Southern United States.
    In the first part I address some theoretical and methodological issues involved in tracking vernacular speech forms in electronic corpora, placing special emphasis on semi-literate writings as evidence. These involve data identification and assessment in terms of validity and reliability, steps and decisions in corpus compilation, and factors which appear to restrict the value of the corpus-driven approach, such as missing data, orthographic variability, the problem of zero forms, and so on. I then introduce the object domain investigated here: debates on the evolution of both African-American and white dialects in the southern United States. Widely discussed issues in this context include the British or African / creole origins of African American Vernacular English (AAVE), the question of whether black and white dialects have been diverging from each other over the last century or so, and Guy Bailey's controversial claim that white southern speech, the best known and most highly stigmatized non-ethnic dialect of American English, originated only after the Civil War, in the Reconstruction period (i.e., much later than originally suspected).
    My evidence builds upon three pertinent corpora whose compilation I have directed at the University of Regensburg over the last decade, and each of these will be introduced and briefly characterized in turn: SPOC, the "Southern Plantation Overseers' Corpus", which consists of 537 letters written by 55 white plantation overseers between 1794 and 1876 (Schneider & Montgomery 2001); the BLUR ("Blues Lyrics compiled at the University of Regensburg") corpus, which consists of 7431 transcribed lyrics of blues songs recorded early in the 20th century, a total of roughly 1.5 million words; and COAAL, a "Corpus of Older African American Letters", which is currently being compiled and completed and consists of ca. 1300 letters written by semi-literate African American writers in the 18th and 19th centuries. The three types of sources will be compared to each other and assessed with respect to their methodological potential and limitations.
    Finally, I will employ the three corpora in a comparative analysis of select interesting phenomena of the southern vernaculars, especially from the domain of verbal morphology.

    Huber, Magnus. 2007. "The Old Bailey Proceedings, 1674-1834. Evaluating and annotating a corpus of 18th- and 19th-century spoken English." In Anneli Meurman-Solin and Arja Nurmi, eds., Annotating Variation and Change. e-series "varieng – studies in variation, contact and change in english".
    Kretzschmar, William R., Jr. 2009. The Linguistics of Speech. Cambridge: Cambridge University Press.
    Nevalainen, Terttu, and Helena Raumolin-Brunberg, eds. 1996. Sociolinguistics and Language History. Studies Based on the Corpus of Early English Correspondence. Amsterdam: Rodopi.
    Nurmi, Arja, Minna Nevala, and Minna Palander-Collin, eds. 2009. The Language of Daily Life in England (1400-1800). Amsterdam, Philadelphia: Benjamins.
    Schneider, Edgar W., and Michael Montgomery. 2001. "On the trail of early nonstandard grammar: An electronic corpus of Southern U.S. antebellum overseers' letters." American Speech 76, 388-410.

  • Miriam Meyerhoff (University of Edinburgh), Thu 27 May, 5.30-6.30 pm:
    "Finding your mark: Uncovering hidden constraints in corpora"

    The study of post-colonial Englishes presents some empirical challenges for corpus linguists. It is well known that English takes on local colour wherever it comes into contact with other languages. But to study this properly, two kinds of corpora are needed: solid corpora of English in its post-colonial context, and usable corpora of the languages it has come into contact with. Both kinds of corpora are often in short supply, hampering our understanding of how linguistic features are mapped between languages. The consequences of this are unfortunate: researchers may generalise from limited samples or from the intuitions of a few informants. With creole varieties of English, this problem is perhaps most acute: English is generally in contact with lesser known (possibly now endangered) varieties that are not thoroughly documented.

    But all is not lost. In this talk, I will suggest that it is possible to use relatively modest corpora of lesser known languages to explore the complexity and dynamics of contact-induced variation and change. I will draw on data from a corpus of spoken, conversational Bislama (the English-lexified creole spoken in Vanuatu, SW Pacific) and a corpus of narratives recorded in Tamambo (the Oceanic language spoken on Malo island, NW Vanuatu). Even small corpora of spontaneous speech provide considerable data on variable patterns that lie below the surface and that cannot be directly perceived.

    I use some standard variationist tools to explore the patterns underlying variable presence of pronominal subjects and objects in Bislama. While most work on substrate transfer in creoles focuses on features that are obligatory in the substrate language, an interesting property of subject and object presence in Bislama is that both the input (in this case, the Tamambo norms) and the output (the Bislama norms) are variable. How do speakers deal with variable input? Do they replicate the patterns of one language in the other? Do they regularise the variable input? Or do they innovate and create wholly new patterns?

    The answers from the Tamambo and Bislama corpora suggest a surprising mix of persistence and innovation. To the extent that the answers converge with other research on variation in dialect contact – and with our on-going work on language contact in the UK – I will suggest that they point to some fundamental cross-linguistic constraints on the replication of variation.

  • Stefan Th. Gries (University of California, Santa Barbara), Fri 28 May, 9-10 am:
    "Corpus Linguistics, linguistic theory, and (psycho)linguistic models: Some comments plus implications for the study of variation in corpora"

    Recent discussions on the CORPORA list suggest that the field of corpus linguistics is more divided than a superficial glance at the assumptions shared by corpus linguists might suggest. In the first part of this talk, I will critically discuss some views on corpus linguistics and its relation to linguistic theory in general and one linguistic theory in particular. Specifically, I will express some concerns about (aspects of) corpus-driven linguistics, the rule-governedness and psychological generalizability of corpus linguistic findings, as well as some corpus linguists' views of adjacent fields.
    In the second part of the talk, I will discuss a particular psycholinguistic model of language acquisition, representation, and processing, which has not only proven extremely versatile and powerful in a wide variety of linguistic subdisciplines, but is also highly compatible with corpus-linguistic approaches, in fact even relies on corpus approaches. I will introduce and exemplify the main assumptions of this model as well as highlight its characteristics and benefits especially with regard to how it handles variation data such as morphological and syntactic alternations as well as changes over time (in developmental and diachronic studies).

  • Michaela Mahlberg (University of Nottingham), Sat 29 May, 9-10 am:
    "The corpus stylistic analysis of fiction – or the fiction of corpus stylistics?"

    The interest in corpus approaches to the analysis of literature seems to be growing and the term ‘corpus stylistics’ is becoming more and more popular. In this paper I want to look at challenges and opportunities for the field of corpus stylistics. Corpus stylistics draws on the potential that comparative data provides for the analysis of individual texts. With a focus on discourse-level analysis, corpus stylistics needs to link quantitative findings with interpretations and literary criticism. With examples mainly drawn from works by Charles Dickens and other 19th-century fiction, I will be looking at linguistic elements of textual worlds. The examples for textual building blocks of fictional worlds refer to the description of characters as well as the narrator’s involvement in the story. A corpus approach to textual worlds can be seen as complementing cognitive approaches such as text world theory (cf. Gavins 2007). The underlying corpus methodology deals with the retrieval of ‘clusters’, i.e. repeated sequences of words such as ‘I am glad to hear’, ‘with a shriek and a’ or ‘all that sort of thing’. The main literary critical approach that will be discussed in view of corpus findings focuses on the ‘externalisation’ of characters in Dickens (cf. John 2001). The paper also argues that the questions we ask in corpus stylistics should not be dictated by the functionality of the computer tools that are available to us, but tools need to be developed that help us investigate the questions suggested by features of a text. With the help of an XML corpus of texts annotated with information relating to quoted speech, examples of fictional speech will be discussed. The paper aims to show the value of the combination of corpus-driven methods and literary stylistics.

    Gavins, J. 2007. Text World Theory. An Introduction. Edinburgh: EUP.
    John, J. 2001. Dickens’s Villains. Melodrama, Character, Popular Culture. Oxford: OUP.

  • Elizabeth C. Traugott (Stanford University), Sun 30 May, 9-10 am:
    "The persistence of linguistic contexts over time: Implications for corpus research"

    Most instances of grammaticalization have been shown to arise in restrictive contexts (cf. Bybee, Perkins, and Pagliuca 1994). The persistence over time of linguistic contexts ("co-texts" broadly defined to include prior discourse) raises theoretical and methodological issues for historical corpus research. What is the appropriate unit of linguistic context? How long do contexts remain relevant in the history of specific constructions? In quantitative work, should "bridging contexts" (Heine 2002) and "critical contexts" (Diewald 2002) for grammaticalization be counted after grammaticalization has set in? I argue that bridging contexts should be counted (contra Eckardt 2006) because they persist over time, but critical contexts cannot be, as they do not persist. The persistence of linguistic contexts for morphosyntactic developments suggests that prior context should be considered an integral component of corpus research. Examples will be drawn from a variety of corpora, most especially the Proceedings of the Old Bailey 1674-1834 (see Huber 2007).

    Bybee, Joan, Revere Perkins & William Pagliuca (1994): The Evolution of Grammar: Tense, Aspect, and Modality in the Languages of the World. Chicago: University of Chicago Press.
    Diewald, Gabriele (2002): "A model for relevant types of contexts in grammaticalization", New Reflections on Grammaticalization, eds. Ilse Wischer & Gabriele Diewald. Amsterdam: Benjamins. 103-120.
    Eckardt, Regine (2006): Meaning Change in Grammaticalization: An Enquiry into Semantic Reanalysis. Oxford: Oxford University Press.
    Heine, Bernd (2002): "On the role of context in grammaticalization", New Reflections on Grammaticalization, eds. Ilse Wischer & Gabriele Diewald. Amsterdam: Benjamins. 83-101.
    Huber, Magnus (2007): "The Old Bailey Proceedings, 1674-1834: Evaluating and annotating a corpus of eighteenth- and nineteenth-century spoken English", Annotating Variation and Change. eVARIENG: Studies in Variation, Contact and Change in English, Volume 1, eds. Anneli Meurman-Solin & Arja Nurmi.
    The Proceedings of the Old Bailey 1674-1913.