Lancaster University Department of Linguistics and Modern English Language

Browsing Corpus Files

What is a Corpus?

 

Computer corpora are made up of text files. On this page, we will show you how to browse corpus files in Windows.

There are a number of corpora stored on the network in the Department of Linguistics at Lancaster. We will start by looking at some files in the ICAME Collection (a set of corpora distributed by the ICAME organisation).

First, let's find the files on the network.

  1. Log on to the network.
  2. Start Windows Explorer (hold down the Windows key and press "E").
  3. Go to the network drive Y: and double-click on icame collection
    -- or enter \\Bowland_back\corpus\icame collection in the Address bar.
  4. Each corpus is in its own folder. Browse the corpus files in the following folders by opening them in Windows Notepad, WordPad or Microsoft Word:
    • Brown
    • LOB
    • LLC
    • Helsinki
    • SEC
  5. Look at some of the corpus manual pages in the "manual" directory.

You can find out what each corpus consists of at any time by consulting this manual directory.
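
If you would rather browse from a script than from Windows Explorer, the steps above can be sketched in Python. This is only a minimal sketch and not part of the course materials: the function names list_corpus_folders and preview_file are hypothetical, the share path is the one given in step 3, and the latin-1 encoding is an assumption about how the files are stored.

```python
from pathlib import Path

# Path taken from step 3 above; change it if your drive mapping differs.
CORPUS_ROOT = Path(r"\\Bowland_back\corpus\icame collection")


def list_corpus_folders(root):
    """Return the sorted names of the sub-folders (one per corpus) under root."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


def preview_file(path, n_lines=10):
    """Return the first n_lines lines of a text file, without line endings."""
    lines = []
    # Assumption: latin-1 maps every byte to a character, so reading never
    # fails on decoding, although some characters may display incorrectly.
    with open(path, encoding="latin-1") as handle:
        for i, line in enumerate(handle):
            if i >= n_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines


if __name__ == "__main__":
    # Only attempt to browse if the network share is actually reachable.
    if CORPUS_ROOT.is_dir():
        for name in list_corpus_folders(CORPUS_ROOT):
            print(name)
    else:
        print("Corpus share not found; are you logged on to the network?")
```

Calling preview_file on any file inside one of the corpus folders shows its opening lines, which is roughly what you see when you open the file in Notepad.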