   Download software

Corpus Presenter, Version 2022   (c. 50 MB Exe file). Build: January 2022.

Version 2022 Corpus Presenter is a major upgrade of the program compared to previous versions and contains many significant enhancements. It is entirely free of charge and does not require registration or a password.

Installation process greatly simplified and improved

All you need to do is download the EXE file from the link in the above box (the file is called Corpus_Presenter_2022_Setup.exe). Open Windows Explorer and locate the downloaded file (probably in your Downloads folder). Then right-click the file name and choose the option “Run as administrator” (VERY IMPORTANT, otherwise the files will not be registered correctly). The installation process will then begin.

Take note: It is essential to start the Corpus_Presenter_2022_Setup.exe file with administrator rights, otherwise it will not install correctly. It will install Corpus Presenter 2022 in about a minute on any Windows-based computer system. It can be uninstalled just as easily from the Windows Control Panel (go to Programs and Features) or with various items of third-party software.



Apart from Corpus Presenter, I program adaptations of this corpus software for university colleagues who have prepared or are preparing a corpus of their own and wish to have sophisticated retrieval software to go with it. So far such adaptations have been published by the Dutch linguistic publishers John Benjamins, who also published the original of Corpus Presenter (version 7) in 2003.

Corpus Presenter works best with Windows 7 / Windows 8 or 8.1 / Windows 10. All the programs will also run on Windows XP, in case you are still using this older operating system.

You are advised not to use versions of the operating system older than Windows XP (i.e. not Windows 2000 and certainly not Windows 98). For legal reasons, I must stress that you use Corpus Presenter at your own risk. The program can be removed easily from your computer via the Programs and Features module in the Control Panel of Windows (called Add or Remove Programs in Windows XP). In Windows 10, programs are called apps.

The current version of Corpus Presenter is Version 2022, build: January 2022 (see download link above). It follows Version 15 (April 2019), Version 14 (February 2016), Version 13 (July 2013), Version 12 (September 2012), Version 11 (February 2009) and previous versions, including Version 7 supplied with the book. You do not need previous versions to run Version 2022. You can update from the CD with the book (Version 7) directly to Version 2022. Please uninstall all previous versions and restart your computer before installing Version 2022.

The book Corpus Presenter can be purchased from John Benjamins (see the relevant section of their website). If either you or your library purchases the book, this entitles you to support from the author.


   Helsinki Corpus Files (size: 91 KB)

The Helsinki Corpus of English Texts consists of 242 text files which are located in a single directory. I have constructed a data set file – Helsinki_Corpus.cpd – which will display the Helsinki Corpus as a hierarchical tree divided into layers according to period (Old, Middle and Early Modern English) and sub-period and then by genre as can be seen in the following screen shot.

The second file is intended for use with the supplied utility Corpus Presenter Find Text. The file is Helsinki_Codes.lst and it will replace the sequences of "+" and a letter with the actual Old and Middle English symbols ash, thorn and eth in all the texts of the corpus which contain these. This makes the Old and Middle English texts much more readable. Bear in mind that the symbols ash, thorn and eth can be accessed in Corpus Presenter modules by clicking on the button OE/ME, e.g. in the search options window of the Quick search or the parameters window on the Advanced search level.

To carry out the replacements, do the following. Unzip the download file Helsinki.zip from the above link to the directory in which you keep the files of the Helsinki Corpus. Start Corpus Presenter Find Text and enter this directory. Choose Helsinki_Codes.lst as the file with the input forms for the Find / Replace operation. Select all the forms and click on the Proceed button. When the files have been processed, all replacements will have been made, some 202,550 in all. The procedure may take a few minutes; that is normal.
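
For readers who want to see what such a replacement pass amounts to, here is a minimal Python sketch. It is not part of Corpus Presenter and does not read Helsinki_Codes.lst. The code table below is an assumption for illustration only (it supposes that +a, +t and +d, in both cases, stand for ash, thorn and eth); the function name replace_codes is likewise hypothetical. The actual list of forms is the one in Helsinki_Codes.lst.

```python
import re

# ASSUMPTION: illustrative code table only; the real forms are
# those listed in Helsinki_Codes.lst.
CODES = {
    "+a": "æ", "+A": "Æ",   # ash
    "+t": "þ", "+T": "Þ",   # thorn
    "+d": "ð", "+D": "Ð",   # eth
}


def replace_codes(text, codes=CODES):
    """Replace each coded sequence in text in a single pass.

    Returns a tuple (new_text, number_of_replacements).
    """
    # Build one alternation of the escaped codes, e.g. \+a|\+A|\+t ...
    pattern = re.compile("|".join(re.escape(code) for code in codes))
    # subn returns both the new string and the replacement count.
    return pattern.subn(lambda match: codes[match.group(0)], text)


new_text, count = replace_codes("+t+at w+as +Da")
print(new_text, count)
```

Applied to every text file in the corpus directory, a pass of this kind yields both the converted texts and a running total of replacements, comparable to the count which Find Text reports.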

The problem of yogh


In the ZIP file Helsinki.zip there is another file for doing replacements in Helsinki Corpus texts, namely Helsinki_Codes_with_Yogh.lst. The following additional lines can be found in this file:

+g   3   
+G   3   

These replace all instances of +g and +G, the representation of yogh in the Helsinki Corpus texts, with the number 3 (there are no separate uppercase and lowercase forms for Arabic numerals, hence the same replacement in both cases). The only problem here is that earlier English yogh is not really a 3 (the number ‘three’). If you do carry out this replacement in the Helsinki Corpus texts, then you will have to remember to enter 3 every time you search for a string in Corpus Presenter which has yogh (= 3) in it. You can do that. It is messy, I admit, but it is a solution, because 3 instead of +g definitely makes the texts more readable.
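
If you prepare search strings outside Corpus Presenter, for instance in a word list, a small helper can take care of the yogh convention for you. The following Python sketch is only an illustration under stated assumptions: normalise_query is a hypothetical function, not a Corpus Presenter feature, and it assumes that your texts have already been converted with Helsinki_Codes_with_Yogh.lst so that yogh appears as 3.

```python
# ASSUMPTION: hypothetical helper, not part of Corpus Presenter.
# It rewrites the ways yogh might be typed into the digit 3 used
# in texts converted with Helsinki_Codes_with_Yogh.lst.
YOGH_FORMS = ("+g", "+G", "\u021d", "\u021c")  # +g, +G, ȝ, Ȝ


def normalise_query(query):
    """Return query with every yogh form replaced by '3'."""
    for form in YOGH_FORMS:
        query = query.replace(form, "3")
    return query


print(normalise_query("kni+gt"))
```

The uppercase and lowercase forms both map to 3, mirroring the two lines in the .lst file, so a search string prepared this way matches the converted texts regardless of the original case of the yogh.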

There are two further data set files in the ZIP file Helsinki.zip: (1) CEECS.cpd which is designed to work with the Corpus of Early English Correspondence Sampler by Terttu Nevalainen and Helena Raumolin-Brunberg, (2) Old_Scots.cpd which can be used with the Helsinki Corpus of Older Scots by Anneli Meurman-Solin.