coca flex (variable length) queries
LIST display: flex (variable length) queries
You can now run searches that contain a variable number of "slots". For example, the search PUT (NOUN){3} away finds strings with PUT at the beginning and away at the end, with zero to three words in between; if there are any words in between, at least one of them has to be a NOUN. In other words, it runs seven searches, one right after another, and then displays the results for all of them on one page.
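The expansion into seven searches can be sketched in a few lines of Python. This is only an illustration, not the site's actual implementation: it assumes that for each length from 1 to n the tagged word may sit in any one position, with the remaining positions filled by the `*` wildcard. The function name `expand_flex` is hypothetical.

```python
def expand_flex(before, tag, n, after):
    """Enumerate the fixed-length searches implied by `before (tag){n} after`.

    Assumption (not confirmed by the documentation): for each length k from
    1 to n, the tag occupies exactly one of the k positions and every other
    position is the `*` wildcard.
    """
    searches = [before + after]  # length 0: the flex slot is left empty
    for k in range(1, n + 1):
        for pos in range(k):
            slot = ["*"] * k
            slot[pos] = tag
            searches.append(before + slot + after)
    return [" ".join(words) for words in searches]


# PUT (NOUN){3} away
for search in expand_flex(["PUT"], "NOUN", 3, ["away"]):
    print(search)
```

Under that assumption the count is 1 + 1 + 2 + 3 = 7, which matches the number of searches mentioned in the text.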
In terms of search syntax, note that:
1. {n} indicates the number of words (0 to n) that can be in this "variable length" string. Valid values for n are 1, 2, and 3 (in other words, the longest variable length string is three words).
2. If you don't indicate {n}, for example (NOUN), then the string is at most one word long: it will be either that one word or nothing.
3. Any "slot" without parentheses around it is obligatory. For example, put * away would not match put away, since * doesn't have parentheses around it.
4. You can't include multiple "flex" operators in a search. For example, they (VERB+){2} notice (NOUN){3} would not be possible.

The following are some additional searches. They produce interesting results in the one billion word COCA corpus, but the results in other corpora may not be as good. In each case, we show a few sample matching strings, and some strings that would not be generated by the search (and why not).
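The four syntax rules listed earlier in this section can be checked mechanically. The following is a minimal sketch, assuming that slots are separated by spaces and that a flex slot is written exactly as `(TAG)` or `(TAG){n}`; the names `parse_flex_query` and `length_range` are hypothetical and are not part of the search interface.

```python
import re

# A flex slot: parentheses around one item, optionally followed by {n}.
FLEX = re.compile(r"^\(([^()\s]+)\)(?:\{(\d+)\})?$")


def parse_flex_query(query):
    """Split a query into slots and check it against the four rules.

    Returns (slots, flex_index, n); flex_index is None and n is 0 when the
    query contains no flex slot.
    """
    slots, flex_index, n = [], None, 0
    for i, token in enumerate(query.split()):
        m = FLEX.match(token)
        if m is None:
            slots.append(token)  # rule 3: no parentheses means obligatory
            continue
        if flex_index is not None:
            # rule 4: only one flex operator per search
            raise ValueError("only one flex operator is allowed per search")
        n = int(m.group(2)) if m.group(2) else 1  # rule 2: (NOUN) means {1}
        if n not in (1, 2, 3):
            raise ValueError("{n} must be 1, 2, or 3")  # rule 1
        slots.append(m.group(1))
        flex_index = i
    return slots, flex_index, n


def length_range(query):
    """Return the minimum and maximum number of words a match can contain."""
    slots, flex_index, n = parse_flex_query(query)
    fixed = len(slots) - (0 if flex_index is None else 1)
    return fixed, fixed + n


print(length_range("PUT (NOUN){3} away"))  # two obligatory words plus 0 to 3
print(length_range("put * away"))          # always three words
```

Because `put * away` always has a length of exactly three words, the two-word string put away falls outside its range, which is the point of rule 3.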
Some additional notes:
1. Because a "flex search" can involve up to seven different searches (see above), there are limits on the number of flex searches in a given 24 hour period. Those who do not have a premium or academic license are limited to five flex searches in 24 hours. Those who do have a license can do up to 50 flex searches in a 24 hour period.
2. Again, because of the number of searches that are done in a flex search, these searches take a long time if all of the "slots" are high frequency. This can be a real limitation in very large corpora like NOW (19+ billion words) or iWeb (14 billion words). So a search like HAVE (ADJ){3} time probably won't work in those corpora, because HAVE and time are both too frequent. In a case like this, you will probably need to do these as a series of separate searches: HAVE time, HAVE * time, HAVE * ADJ time, etc. But this should not be a problem with a small corpus like the BNC.
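If you script your own work around these limits, a small client-side tally can help you avoid running out of flex searches partway through a session. The sketch below is an assumption in several respects: the class name `FlexQuota` is hypothetical, and it treats the "24 hour period" as a rolling window, which the text above does not specify.

```python
import time
from collections import deque


class FlexQuota:
    """Client-side tally of flex searches.

    Assumption: the limit applies to a rolling 24-hour window. The site may
    count the period differently.
    """

    WINDOW_SECONDS = 24 * 60 * 60

    def __init__(self, limit):
        self.limit = limit      # 5 without a license, 50 with one
        self._stamps = deque()  # times of the searches still in the window

    def remaining(self, now=None):
        """Return how many flex searches are still available."""
        now = time.time() if now is None else now
        # Drop searches that are at least 24 hours old.
        while self._stamps and now - self._stamps[0] >= self.WINDOW_SECONDS:
            self._stamps.popleft()
        return self.limit - len(self._stamps)

    def record(self, now=None):
        """Register one flex search, or raise if the limit is used up."""
        now = time.time() if now is None else now
        if self.remaining(now) <= 0:
            raise RuntimeError("flex search limit reached for this 24 hour period")
        self._stamps.append(now)


quota = FlexQuota(limit=5)
quota.record()
print(quota.remaining())
```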