
  Making a data set by copying files from a corpus

Another way of making a data set of your own is to extract files from a corpus you have access to. For instance, if you had a copy of the Helsinki Corpus of English Texts at your disposal, you could create a subcorpus consisting of, say, just legal texts. To do this, first switch to the Checked files display mode by pressing F11. Then select the files you need by ticking the check box for each one. You can tick a box either by clicking on it with the mouse pointer or by pressing the space bar while the file you wish to select is highlighted.

Now press Ctrl-F11, and the window shown in the following screen shot appears. Here you specify the directory into which the checked files from the current corpus are to be copied. You also give the new data set a name and then click on the Conclude button. If any of the files already exist in the target directory, you may have to confirm that they should be overwritten. Once the files have been extracted from the current corpus and the new data set has been created, you are asked whether you wish to load the new data set to view it.
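
For readers who want to see what the copying step amounts to at the file level, the following is a minimal Python sketch of the same idea carried out outside the program. It is not how the program itself is implemented. The function name copy_selected_files, its parameters and the overwrite flag are all assumptions made for illustration; the flag merely stands in for the overwrite confirmation described above.

```python
import shutil
from pathlib import Path


def copy_selected_files(corpus_dir, selected_names, target_dir, overwrite=False):
    """Copy the chosen corpus files into target_dir and return the copied paths.

    Hypothetical helper: the names and the overwrite flag are illustrative
    assumptions, not part of the program described in this section.
    """
    source = Path(corpus_dir)
    target = Path(target_dir)
    # Create the directory for the new data set if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for name in selected_names:
        src = source / name
        dst = target / name
        if dst.exists() and not overwrite:
            # Stands in for the confirmation step: without explicit
            # permission an existing file is left untouched.
            continue
        shutil.copy2(src, dst)  # copies the content and the file timestamps
        copied.append(dst)
    return copied
```

Under these assumptions, calling the function with the corpus directory, a list of the ticked file names and a new directory would produce a folder holding only the selected texts, for example just the legal texts of a larger corpus. Files already present in the target directory are skipped unless overwrite=True is passed.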