coca help

Please de-select the [MIN FREQ] checkbox for Section 2 in the form if the [MIN FREQ] checkbox for Section 1 is not selected as well.



Sort by raw frequency (e.g. hard *) or by "relevance" (hard *). Relevance uses the Mutual Information score.

It is often useful to specify the minimum frequency when you are sorting by "relevance", to eliminate very low frequency strings. For example, compare the collocates of green where minimum frequency = 1 (strange one-off strings) and where minimum frequency = 20.

Note also that when you do a collocates search and you don't specify anything for the collocates field, it will automatically set MINIMUM to MUT INFO = 3 (Mutual Information score). It does this to remove high frequency noise words like the, to, with, etc. If you want to see more of these words, lower the MI score; to see fewer, increase it.
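To make the interaction between minimum frequency and the MI threshold concrete, here is a minimal sketch in Python. It assumes a common pointwise formulation of Mutual Information (log2 of observed over expected co-occurrence within a span); the exact formula the interface uses is not stated in these notes, and the function names and all counts below are hypothetical toy values.

```python
import math

def mutual_information(pair_freq, node_freq, collocate_freq, corpus_size, span=1):
    """Pointwise MI in bits (assumed formulation, not confirmed by the help text)."""
    # Co-occurrences expected by chance if the two words were independent.
    expected = node_freq * collocate_freq * span / corpus_size
    return math.log2(pair_freq / expected)

def filter_collocates(rows, min_freq=1, min_mi=3.0):
    """Keep rows that meet both thresholds, sorted by MI, highest first."""
    kept = [r for r in rows if r["freq"] >= min_freq and r["mi"] >= min_mi]
    return sorted(kept, key=lambda r: r["mi"], reverse=True)

# Toy counts for collocates of "green" (invented for illustration only).
CORPUS_SIZE = 1_000_000
NODE_FREQ = 1_000
toy = [
    # (collocate, co-occurrence freq, collocate's overall freq)
    ("the", 100, 60_000),      # frequent noise word, low MI
    ("chartreuse", 1, 2),      # one-off string, very high MI
    ("light", 40, 2_000),      # frequent and strongly associated
]
rows = [
    {"word": w, "freq": f, "mi": mutual_information(f, NODE_FREQ, cf, CORPUS_SIZE)}
    for w, f, cf in toy
]

loose = [r["word"] for r in filter_collocates(rows, min_freq=1)]
strict = [r["word"] for r in filter_collocates(rows, min_freq=20)]
print(loose)   # one-off string survives when minimum frequency = 1
print(strict)  # only the frequent, strongly associated collocate remains
```

In this toy data the MI threshold of 3 drops "the", while raising the minimum frequency to 20 additionally drops the one-off "chartreuse".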



Limiting results by MINIMUM / MUT INFO does not seem to have any effect; not sure why.


# HITS is the number of results.

# KWIC is the number of results for a KWIC (concordances) search.

GROUP BY determines whether words are grouped by word form (e.g. decide and decided separately) or by lemma (e.g. all forms of decide together), and whether you see the part of speech for a word (e.g. beat as a noun and as a verb displayed separately).
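The effect of the GROUP BY options can be sketched in Python as follows. This is only an illustration of the idea, not the interface's own implementation: the token tuples, the `group_counts` function, and its parameters are hypothetical.

```python
from collections import Counter

# Hypothetical tagged tokens: (word form, lemma, part of speech).
tokens = [
    ("decide", "decide", "VERB"),
    ("decided", "decide", "VERB"),
    ("decided", "decide", "VERB"),
    ("beat", "beat", "NOUN"),
    ("beat", "beat", "VERB"),
]

def group_counts(tokens, by="word", with_pos=False):
    """Count tokens grouped by word form or lemma, optionally split by part of speech."""
    idx = {"word": 0, "lemma": 1}[by]
    counts = Counter()
    for tok in tokens:
        key = (tok[idx], tok[2]) if with_pos else tok[idx]
        counts[key] += 1
    return counts

by_word = group_counts(tokens, by="word")                  # decide and decided counted separately
by_lemma = group_counts(tokens, by="lemma")                # all forms of decide counted together
by_word_pos = group_counts(tokens, by="word", with_pos=True)  # beat (noun) and beat (verb) separate
print(by_word, by_lemma, by_word_pos, sep="\n")
```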

SHOW # TEXTS determines whether you see the number of texts in which a word or phrase occurs, in addition to its frequency. This can be useful in finding words and phrases that are limited just to a few texts in the corpus.

You can also sort the results by the number of texts in which they occur (use [SORTING] in the search

CASE SENSITIVE determines whether capitalization matters, i.e. whether She thought and she thought (or The Office, the Office, and the office) are treated as different searches.

DISPLAY shows raw frequency, occurrences per million words, or a combination of these.
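For reference, a per-million figure is just the raw frequency scaled by the corpus size. A minimal sketch, where the function name and the numbers are illustrative assumptions rather than values taken from the corpus:

```python
def per_million(raw_freq, corpus_size):
    """Normalize a raw frequency to occurrences per million words."""
    return raw_freq * 1_000_000 / corpus_size

# Hypothetical example: 250 hits in a corpus (or section) of 500 million words.
rate = per_million(250, 500_000_000)
print(rate)
```

Normalized rates make frequencies comparable across sections or corpora of different sizes, which raw counts do not.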

SAVE LISTS allows you to create a wordlist from the results and then re-use it later in your searches.


