
   Sources for varieties of English


Historical records

Information about varieties in previous centuries can be gleaned from a number of sources. These can be classified by type. Each type has its own advantages and disadvantages. The more types of historical record available, the better. Frequently, one has to make do with a limited set of sources and reconstruct features of varieties on the basis of fragmentary material.

Emigrant letters: People who emigrated in previous centuries wrote back home, usually to maintain contact with friends and relatives, and many of these letters survive in archives today. Such material is usually non-prescriptive, i.e. written in a colloquial style without undue consideration of normative grammar. Hence it is a good source of information on varieties and, when used judiciously, can be useful for linguistic analyses.
Personal accounts: Apart from letters, there are also documents of various kinds in which speakers offer personal accounts of their lives and experiences. Some of these have been recorded deliberately, e.g. the accounts of life under slavery or in other adverse conditions. Such texts are not normally written using the variety in question, unless verbatim transcripts of what was said by informants are used. In this context one could also mention court records in which the statements of accused persons and/or defendants were written down by court clerks.
Dialect glossaries: From the 17th century onwards, a certain antiquarian interest in dialect vocabulary can be observed. Collections of words from diverse regions of the British Isles are available and are often a good source of material on the varieties spoken there. Such material is almost entirely lexical, i.e. information about pronunciation and grammar is not normally included.
Literary satires: As early as Chaucer (in the 14th century) one finds dialect material used to characterise figures in literary works. Shakespeare and Ben Jonson are prominent Elizabethan writers who kept up this tradition. Many satires contain figures from the Celtic regions, i.e. Irish, Scottish or Welsh characters, especially in drama from the 17th century onwards. The accuracy of such portrayals is often doubtful because many of the authors were English and did not have first-hand knowledge of the speech they were satirising. In addition, there are limits on the linguistic features which can be represented using so-called ‘eye dialect’, i.e. changes in spelling to indicate dialect traits in writing.
Rhyming material: End rhyme, in poetry and sometimes in drama, can be a source of information on the pronunciation of vowels. For instance, one could check whether ‘eat’ and ‘great’ or ‘past’ and ‘waste’ rhyme for a particular author. This could indicate whether the first word in each pair still had the vowel /e:/ or /a:/ respectively.
Prescriptive comments: From the 18th century onwards, there are many works in which authors complain about regional pronunciation and grammar. This is connected with the rise of prescriptivism, i.e. strict notions of what is ‘correct’ in language and what variety was taken to be socially acceptable, and by implication what other forms were not. Authors often cite supposedly ‘incorrect’ usage and thus inadvertently supply present-day linguists with information about regional varieties of English in previous centuries.
Early audio records: Audio recordings have now existed for over a century. The earliest of these are usually of well-known cultural or political figures. The speech found in these recordings can nonetheless be examined in order to reach a greater time depth when documenting many varieties of English. The earliest audio records frequently attest features, or the lack thereof, which appeared or disappeared in certain varieties, and in many instances they allow one to construct a relative chronology of language change.
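
The check described under Rhyming material can be sketched in code. The following is only an illustration under stated assumptions: the verse is assumed to be in rhyming couplets (lines 1+2, 3+4, and so on), the function names are hypothetical, and the four sample lines are invented for the example rather than taken from any author. A pair of words in rhyme position is at best a hint about pronunciation, since authors also use eye rhymes and conventional rhymes.

```python
import re
from collections import Counter


def line_end_word(line):
    """Return the last alphabetic word of a verse line in lower case, or None."""
    words = re.findall(r"[A-Za-z']+", line)
    return words[-1].lower() if words else None


def couplet_rhyme_pairs(lines):
    """Pair the end words of consecutive lines, assuming a couplet scheme.

    Blank lines are skipped. Each pair is sorted alphabetically so that
    (eat, great) and (great, eat) count as the same pair.
    """
    ends = [w for w in (line_end_word(l) for l in lines) if w]
    return [tuple(sorted(pair)) for pair in zip(ends[0::2], ends[1::2])]


def count_pair(lines, word_a, word_b):
    """Count how often two given words occupy the rhyme positions of a couplet."""
    pairs = Counter(couplet_rhyme_pairs(lines))
    return pairs[tuple(sorted((word_a.lower(), word_b.lower())))]


# Invented sample lines, for illustration only (not from a real text).
sample = [
    "At noon the weary travellers sat to eat",
    "And thought the simple fare was truly great",
    "The summer days had long since hurried past",
    "And nothing now was left to go to waste",
]
```

Calling count_pair(sample, "eat", "great") on a real author's couplets would give the number of times the two words are rhymed; a consistently high count across a body of verse would support, but not prove, the assumption that the author pronounced them with the same vowel.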

Selected references for different types of historical records

Emigrant letters

Hickey, Raymond (ed.) 2019. Keeping in Touch. Familiar Letters across the English-speaking World. Amsterdam: John Benjamins.

Montgomery, Michael 1995. ‘The linguistic value of Ulster emigrant letters’, Ulster Folklife 41: 1-15.

Personal accounts

Fitzpatrick, David 1994. Oceans of Consolation. Personal Accounts of Irish Migration to Australia. Cork: University Press.

Rickford, John R. and Jerome S. Handler 1994. ‘Textual evidence on the nature of early Barbadian speech, 1676-1835’, Journal of Pidgin and Creole Languages 9.2: 221-55.

Stanihurst, Richard 1965 [1577]. ‘The description of Ireland’, in Chronicles of England, Scotlande and Irelande, edited by R. Holinshed. London. Reprinted by AMS Press.

Dialect glossaries

Barnes, William (ed.) 1867. A Glossary, with Some Pieces of Verse, of the Old Dialect of the English Colony in the Baronies of Forth and Bargy, County of Wexford, Ireland Formerly Collected by Jacob Poole. London: J. R. Smith.

Ray, John 1674. A collection of English words not generally used. London.

Vallancey, Charles 1788. ‘Memoir of the language, manners, and customs of an Anglo-Saxon colony settled in the baronies of Forth and Bargie, in the County of Wexford, Ireland, in 1167, 1168, 1169’, Transactions of the Royal Irish Academy 2: 19-41.

Literary satires

Bartley, J. O. 1954. Teague, Shenkin and Sawney: Being an Historical Study of the Earliest Irish, Welsh and Scottish Characters in English Plays. Cork: University Press.

Bliss, Alan J. 1979. Spoken English in Ireland 1600-1740. Twenty-seven Representative Texts Assembled and Analysed. Dublin: Cadenus Press.

Duggan, G. C. 1969 [1937]. The Stage Irishman: A History of the Irish Play and Stage Characters from Earliest Times. Dublin and Cork/London: Talbot Press.

Jonson, Ben 1969. The Complete Masques. Edited by Stephen Orgel. New Haven, London: Yale University Press.

Sullivan, James 1980. ‘The validity of literary dialect: evidence from the theatrical portrayal of Hiberno-English’, Language in Society 9: 195-219.

Rhyming material

Kniezsa, Veronika 1985. ‘Jonathan Swift’s English’, in Siegmund-Schultze (ed.), pp. 116-24.

Siegmund-Schultze, Dorothea (ed.) 1985. Irland. Gesellschaft und Kultur. [Ireland. Society and culture] Vol. 4. Halle: University Press.

Prescriptive comments

Patterson, David 1860. The Provincialisms of Belfast and the Surrounding Districts Pointed Out and Corrected; to which is Added an Essay on Mutual Improvement Societies. Belfast: Alexander Mayne.

Sheridan, Thomas 1781. A Rhetorical Grammar of the English Language Calculated Solely for the Purpose of Teaching Propriety of Pronunciation and Justness of Delivery, in that Tongue. Dublin: Price.

Sheridan, Thomas 1967 [1780]. A general dictionary of the English language. 2 vols. Menston: The Scolar Press.

Sheridan, Thomas 1970 [1762]. A Course of Lectures on Elocution. Hildesheim: Georg Olms.

Early audio records

Hickey, Raymond (ed.) 2017. Listening to the Past. Audio Records of Accents of English. Cambridge: Cambridge University Press.

Available corpora

Since the early 1990s a large number of corpora have become available. Some of these corpora are specific to certain varieties of English. Below, a selection of such sources is given. These corpora are in the main concerned with documenting the standard variety of the country where they are compiled. This is particularly true of the ICE corpora (compiled as part of the large and ongoing project, International Corpus of English, coordinated by the Department of English, University College London). The sub-corpora of this project are labelled by using the acronym and then the region/country in question, e.g. ICE-East Africa or ICE-Ireland. A full list of the currently available corpora can be found on the main website for the entire project (see relevant entry in the following table).

Most of the universities involved in the compilation of such corpora have websites with additional information. The field of variety corpora is expanding, and with each passing year new corpora appear, some of which are put in the public domain by their compilers. As can be seen from the following list, many corpora are in fact dedicated to forms of English in the early modern period (from the 17th century to the present day). This time span is important as it covers the period during which English was transported overseas.

Name | Compiling institution / individuals
ARCHER, a corpus of British and American English from 1650-1990 | Douglas Biber and associates at Northern Arizona University in collaboration with colleagues at the University of Freiburg, Germany
Australian Corpus of English | Department of Linguistics, Macquarie University, NSW, Australia
Bank of English | University of Birmingham, sponsored by the publisher HarperCollins
British National Corpus | Consortium under the aegis of Oxford University Press
Brown Corpus of Standard American English | W. Nelson Francis and Henry Kucera, Brown University, Providence, Rhode Island
Corpus of 19th Century English | Merja Kytö and associates, Uppsala University, Sweden
Corpus of English Dialogues | Merja Kytö, Uppsala University, Sweden and Jonathan Culpeper, Lancaster University, England
Corpus of Early English Correspondence | Terttu Nevalainen and Helena Raumolin-Brunberg, University of Helsinki, Finland
A Corpus of Irish English | Raymond Hickey, Essen University, Germany (packaged with Corpus Presenter, Software for Language Analysis, Amsterdam: John Benjamins, 2003)
Corpus of London Teenage Language (COLT) | Anna-Britta Stenström and associates, Department of English, University of Bergen
Freiburg-Brown Corpus of American English (FROWN) | Christian Mair and associates, University of Freiburg, Germany
Freiburg-LOB Corpus of British English (FLOB) | Christian Mair and associates, University of Freiburg, Germany
Freiburg Corpus of English Dialects (FRED) | Bernd Kortmann and associates, University of Freiburg, Germany
The Helsinki Corpus of Older Scots | Anneli Meurman-Solin, Department of English, University of Helsinki, Finland
International Corpus of English (ICE), collection of corpora from various anglophone countries, now (2005) partially completed | Co-ordinated by the Department of English, University College London, England
Kolhapur Corpus of Indian English | Shivaji University, Kolhapur
The Newcastle Electronic Corpus of Tyneside English (NECTE) | Karen Corrigan, School of English Literature, Language, and Linguistics, University of Newcastle upon Tyne
Northern Ireland Transcribed Corpus of Speech (NITCS) | John Kirk, Department of English, Queen’s University, Belfast, Northern Ireland
Old Bailey Court Depositions | Department of History, University of Sheffield
Santa Barbara Corpus of Spoken American English | University of California, Santa Barbara

Dedicated journals

American Speech
English Language and Linguistics
English Today
English World-Wide
Language in Society
Journal of English Linguistics
Journal of Pidgin and Creole Languages
Journal of Sociolinguistics
Language Variation and Change


For a comprehensive list of relevant books, see the branch References towards the bottom of the tree on the left.