Lancaster University Department of Linguistics and Modern English Language

Comparing frequencies for corpora of different sizes

 

We cannot easily compare the raw frequencies from the previous exercise, because the corpus sections they come from are of different sizes.

A common solution to this problem is to convert each frequency into a value per million words, or per thousand words. This is called normalizing the frequency scores.

Frequency per million words = ( frequency ÷ number of words in the text ) × 1,000,000
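
If you would rather script the calculation than use a calculator, the formula above can be sketched in Python as follows. This is only an illustrative sketch: the function name `per_million` and the example figures are hypothetical and are not taken from the exercise table.

```python
def per_million(frequency: int, total_words: int) -> float:
    """Normalize a raw frequency to occurrences per million words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    # Multiply first, then divide, so whole-number results stay exact.
    return frequency * 1_000_000 / total_words


# Hypothetical figures (assumptions for illustration only):
# a word occurring 50 times in a 200,000-word section,
# and 150 times in a 1,000,000-word section.
small_section = per_million(50, 200_000)
large_section = per_million(150, 1_000_000)

print(small_section)  # 250.0
print(large_section)  # 150.0
```

In this made-up case the word has the higher raw count in the larger section, but it is relatively more frequent in the smaller one once both scores are normalized.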

Now try filling in the "per million" column of the table, and think about the patterns.

Use the computer's calculator if you don't have your own pocket calculator:

[ Start - Programs - Accessories - Calculator ]