
Corpora and data sources



A number of corpora are available for the study of Irish English. These vary in range, size and stage of completeness. The following list is intended to convey an impression of the current situation. Only those corpora which are in the public domain or to which the academic community has some kind of access are mentioned here. Many authors who work on Irish English have collections of data which they have used and still use for their linguistic analyses. The nature of the data is normally discussed in their studies but access is not provided through printed material or online sources.


A Corpus of Irish English Correspondence (CORIECOR)


This is a new corpus currently being compiled by Carolina P. Amador Moreno (University of Extremadura, Cáceres, Spain) and Kevin McCafferty (University of Bergen, Norway) and consisting of emigrant letters from Ireland over the past two to three centuries. The data is unique in providing a window on vernacular Irish English usage (mostly of a Northern Irish origin) and has already been used as a basis for two PhD theses at the University of Bergen, Norway. For more details, see the following:

McCafferty, Kevin and Carolina P. Amador Moreno 2012. ‘A Corpus of Irish English Correspondence (CORIECOR): A tool for studying the history and evolution of Irish English’, in: Bettina Migge and Máire Ní Chiosáin (eds), New Perspectives in Irish English. Amsterdam: John Benjamins, pp. 265-288.


The Tape-Recorded Survey of Hiberno-English Speech


This survey was initiated in the 1970s and carried out at the Department of English, Queen’s University, Belfast under the direction of Dr. Michael Barry. It was discontinued in the early 1980s. What material is publicly available can be found on the DVD accompanying the book A Sound Atlas of Irish English by the present author (see relevant node in the tree on the left).


The Northern Ireland Transcribed Corpus of Speech


This corpus was compiled during the 1990s under the supervision of Dr. John Kirk, Department of English, Queen’s University, Belfast. It consists of transcriptions of a section of the tape recordings made for the Tape-Recorded Survey of Hiberno-English Speech. For information, please contact Dr. Kirk at info@johnmkirk.co.uk. The corpus has been used in a number of investigations such as that by Simone Zwickl (see Zwickl 2002 in the references section).

For further information on this corpus, see:

Kirk, John M. 1992. ‘The Northern Ireland Transcribed Corpus of Speech’, in Leitner, Gerhard (ed.) New Directions in English Language Corpora. Berlin: Mouton de Gruyter, pp. 65-73.


ICE - Ireland


As part of the International Corpus of English project, this corpus has been compiled over a period of more than 10 years. It is a collection of texts which represent fairly standard forms of written Irish English. It has been used for studies by Jeffrey Kallen (Trinity College Dublin) and John Kirk in recent years. The material is not yet available to the general public, though this is the intention, as set out in the outlines for the entire ICE project. For further information, please contact either of the authors just mentioned.

There is a general website for the entire project on the server of University College London: International Corpus of English

Website for ICE-Ireland (Queen’s University Belfast)

Jeffrey Kallen can be contacted at jkallen@tcd.ie (Trinity College, Dublin, Ireland). Studies arising from ICE-Ireland:

Kirk, John M., Jeffrey L. Kallen, Orla Lowry and A. Rooney 2004. ‘Issues arising from the compilation of ICE-Ireland’, in Belfast Papers in Language and Linguistics 16: 23-41.

Kirk, John M., Jeffrey L. Kallen, Orla Lowry and A. Rooney in press. ‘The compilation of ICE-Ireland: unity versus diversity’, in Antoinette Renouf and A. Kehoe (eds) The changing face of corpus linguistics (Amsterdam: Rodopi).


Limerick Corpus of Irish English


This is a project which is currently in progress at the Department of European Languages, University of Limerick, Ireland. Some results of data analyses have been published in the following volume:

Barron, Anne and Klaus Schneider (eds) 2005. The pragmatics of Irish English. Berlin: Mouton de Gruyter.

The corpus is synchronically oriented and one of its primary aims is to document pragmatic features of present-day Irish English.

Website: http://www.ul.ie/~lcie/homepage.htm
(this website seems to be defunct now – February 2012 – Raymond Hickey)


CELT Corpus of Electronic Texts


Based at Cork (National University of Ireland, Cork), this corpus consists in the main of historical texts in the Irish language with some in English as well. The amount of English material is slight and does not appear to have been gathered with a view to linguistic analysis.

Website: www.ucc.ie/celt/index.html


The vocabulary of Irish English


For anyone interested in the vocabulary of Irish English there is an informative website concerning all aspects of the lexicon at the following address:

Website: Terence Dolan’s Hiberno-English page
(this website seems to be defunct now – February 2012 – Raymond Hickey)

The author is the compiler of the main dictionary of Irish English, see Dolan (2012 [1998]) in the references section.


Online sources for other varieties of English



Website: The Newcastle Electronic Corpus of Tyneside English

Website: American Dialect Society

Website: Linguistic Atlas Projects at the University of Georgia

Website: Information on language in Newfoundland

Website: Dictionary of Newfoundland English

Website: The Origins of New Zealand English project


Other corpus projects of relevance to English studies



Website: The Bank of English

Website: The British National Corpus

Website: ICAME (International Computer Archive of Modern English)

Website: Research Unit for Variation, Contacts and Change in English (Helsinki University)