WN(1WN) | WordNet™ User Commands | WN(1WN) |
wn - command line interface to WordNet lexical database
wn [ searchstr ] [ -h ] [ -g ] [ -a ] [ -l ] [ -o ] [ -s ] [ -n# ] [ search_option... ]
wn() provides a command line interface to the WordNet database, allowing synsets and relations to be displayed as formatted text. For each word, different searches are provided, based on syntactic category and pointer types. Although only base forms of words are usually stored in WordNet, users may search for inflected forms. A morphological process is applied to the search string to generate a form that is present in WordNet.
The command line interface is often useful when writing scripts to extract information from the WordNet database. Post-processing of the output with various scripting tools can reformat the results as desired.
Note that the last letter of search_option generally denotes the part of speech that the search applies to: n for nouns, v for verbs, a for adjectives, and r for adverbs. Multiple searches may be done for searchstr with a single command by specifying all the appropriate search options.
The results of a search are written to the standard output. For each search, the output consists of a one-line description of the search, followed by the search results.
All searches other than -over list all senses matching the search results in the following general format. Items enclosed in italicized square brackets ([ ... ]) may not be present.
Each sense matching the search requested is displayed as follows:
Sense n [{synset_offset}] [<lex_filename>] word1[#sense_number][, word2...]
Where n is the sense number of the search word, synset_offset is the byte offset of the synset in the data.pos file corresponding to the syntactic category, lex_filename is the name of the lexicographer file that the synset comes from, word1 is the first word in the synset (note that this is not necessarily the search word) and sense_number is the WordNet sense number assigned to the preceding word. synset_offset, lex_filename, and sense_number are generated when the -o, -a, and -s options, respectively, are specified.
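The sense line format above lends itself to regular-expression parsing in a post-processing script. The following is a minimal sketch in Python, not part of WordNet: the function name parse_sense_line is hypothetical, the sample line in the usage note is illustrative rather than captured wn output, and the sketch assumes the -g option is not in use (so no gloss follows the words). The "Sense n" prefix is treated as optional so that the same function also accepts a synset printed without it.

```python
import re

# Optional "Sense n", optional {synset_offset}, optional <lex_filename>,
# then the comma-separated words of the synset.
SENSE_LINE = re.compile(
    r"^(?:Sense\s+(?P<n>\d+)\s*)?"
    r"(?:\{(?P<offset>\d+)\}\s*)?"
    r"(?:<(?P<lex>[^>]+)>\s*)?"
    r"(?P<words>.+)$"
)
# A word optionally followed by "#sense_number".
WORD = re.compile(r"^(?P<word>.*?)(?:#(?P<sense>\d+))?$")


def parse_sense_line(line):
    """Parse one sense line; return a dict, or None if the line is blank."""
    line = line.strip()
    if not line:
        return None
    m = SENSE_LINE.match(line)
    words = []
    for item in m.group("words").split(","):
        item = item.strip()
        if not item:
            continue
        w = WORD.match(item)
        sense = int(w.group("sense")) if w.group("sense") else None
        words.append((w.group("word"), sense))
    return {
        "sense": int(m.group("n")) if m.group("n") else None,
        "offset": m.group("offset"),  # kept as a string to preserve leading zeros
        "lex_filename": m.group("lex"),
        "words": words,
    }
```

For an illustrative line such as "Sense 2 {01234567} <noun.animal> dog#1, domestic dog#1", the function returns sense 2, offset "01234567", lexicographer file "noun.animal", and the words ("dog", 1) and ("domestic dog", 1). Fields that depend on the -o, -a, or -s options come back as None when those options were not given.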
The synsets matching the search requested are printed below each sense's synset output described above. Each line of output is preceded by a marker (usually =>), then a synset, formatted as described above. If a search traverses more than one level of the tree, then successive lines are indented by spaces corresponding to their level in the hierarchy. When the -g option is specified, synset glosses are displayed in parentheses at the end of each synset. Each synset is printed on one line.
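Because the hierarchy level is conveyed only by indentation, a script can recover the depth of each => line by comparing leading spaces. The sketch below, in Python, is one possible approach and not part of WordNet: the function name relation_depths is hypothetical, and the number of spaces per level is deliberately not assumed. Depth is derived from the order of indentation widths seen, and any line without the marker resets the tracking.

```python
def relation_depths(lines, marker="=>"):
    """Return a list of (depth, synset_text) for lines that start with marker.

    Depth 1 is the least-indented marker line under the current sense.
    """
    results = []
    indents = []  # indentation widths on the path from depth 1 to the current line
    for line in lines:
        stripped = line.lstrip(" ")
        if not stripped.startswith(marker):
            indents.clear()  # a sense header or other text starts a new tree
            continue
        indent = len(line) - len(stripped)
        # Drop deeper or equal levels until this line's parent is on top.
        while indents and indents[-1] >= indent:
            indents.pop()
        indents.append(indent)
        results.append((len(indents), stripped[len(marker):].strip()))
    return results
```

Given illustrative lines indented by 7, 11, 15, and then 7 spaces, the function reports depths 1, 2, 3, and 1, which is enough to rebuild the tree as nested lists or parent/child pairs.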
Senses are generally ordered from most to least frequently used, with the most common sense numbered 1. Frequency of use is determined by the number of times a sense is tagged in the various semantic concordance texts. Senses that are not semantically tagged follow the ordered senses. Note that this ordering is only an estimate based on usage in a small corpus.
Verb senses can be grouped by similarity of meaning, rather than ordered by frequency of use. The -simsv search prints all senses that are close in meaning together, with a line of dashes indicating the end of a group. See wngroups(7WN) for a discussion of how senses are grouped.
The -over search displays an overview of all the senses of the search word in all syntactic categories. The results of this search are similar to the -syns search; however, no additional (e.g., hypernym) synsets are displayed, and synset glosses are always printed. The senses are grouped by syntactic category, and each synset is annotated as described above with synset_offset, lex_filename, and sense_number as dictated by the -o, -a, and -s options. The overview search also indicates how many of the senses in each syntactic category are represented in the tagged texts. This is a way for the user to determine whether a sense's sense number is based on semantic tagging data, or was arbitrarily assigned. For each sense that has appeared in such texts, the number of semantic tags to that sense is indicated in parentheses after the sense number.
If a search cannot be performed on some senses of searchstr, the search results are headed by a string of the form:
X of Y senses of searchstr
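A script that needs to know whether results cover every sense can look for this header. The following Python sketch, with the hypothetical function name parse_partial_header, matches only the exact wording shown above; any other phrasing wn might print is not handled.

```python
import re

# Matches the header form "X of Y senses of searchstr".
PARTIAL_HEADER = re.compile(r"^\s*(\d+) of (\d+) senses of (.+?)\s*$")


def parse_partial_header(line):
    """Return (matched, total, searchstr) for a header line, else None."""
    m = PARTIAL_HEADER.match(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), m.group(3)
```

For the illustrative line "2 of 5 senses of bank", the function returns (2, 5, "bank"); for any line that is not such a header, it returns None.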
The output of the -deri search shows word forms that are morphologically related to searchstr. Each word form pointed to from searchstr is displayed, preceded by RELATED TO-> and the syntactic category of the link, followed, on the next line, by its synset. Printed after the word form is #n where n indicates the WordNet sense number of the term pointed to.
The -domn and -domt searches show the domain that a synset has been classified in and, conversely, all of the terms that have been assigned to a specific domain. A domain is either a TOPIC, REGION or USAGE, as reflected in the specific pointer character stored in the database, and displayed in the output. A -domn search on a term shows the domain, if any, that each synset containing searchstr has been classified in. The output display shows the domain type (TOPIC, REGION or USAGE), followed by the syntactic category of the domain synset and the terms in the synset. Each term is followed by #n where n indicates the WordNet sense number of the term. The converse search, -domt, shows all of the synsets that have been placed into the domain searchstr, with analogous markers.
When -framv is specified, sample illustrative sentences and generic sentence frames are displayed. If a sample sentence is found, the base form of searchstr is substituted into the sentence, and it is printed below the synset, preceded by the EX: marker. When no sample sentences are found, the generic sentence frames are displayed. Sentence frames that are acceptable for all words in a synset are preceded by the marker *>. If a frame is acceptable for the search word only, it is preceded by the marker =>.
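The three markers make -framv output straightforward to classify line by line. The Python sketch below is a hypothetical helper, not part of WordNet; the category labels are invented for illustration, and the sample lines in the usage note are illustrative rather than captured wn output.

```python
# Marker prefixes described for the -framv search, mapped to hypothetical labels.
FRAME_MARKERS = (
    ("EX:", "example_sentence"),
    ("*>", "frame_all_words"),    # acceptable for all words in the synset
    ("=>", "frame_search_word"),  # acceptable for the search word only
)


def classify_framv_line(line):
    """Return (label, text) if the line starts with a known marker, else None."""
    stripped = line.strip()
    for marker, label in FRAME_MARKERS:
        if stripped.startswith(marker):
            return label, stripped[len(marker):].strip()
    return None
```

A line such as "*> Somebody ----s" is classified as ("frame_all_words", "Somebody ----s"), while a line beginning with EX: is reported as an example sentence. Lines without a marker, such as the synset itself, yield None.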
Search results for adjectives are slightly different from those for other parts of speech. When an adjective is printed, its direct antonym, if it has one, is also printed in parentheses. When searchstr is in a head synset, all of the head synset's satellites are also displayed. The position of an adjective in relation to the noun may be restricted to the prenominal, postnominal or predicative position. Where present, these restrictions are noted in parentheses.
When an adjective is a participle of a verb, the output indicates the verb and displays its synset.
When an adverb is derived from an adjective, the specific adjectival sense on which it is based is indicated.
The morphological transformations performed by the search code may result in more than one word to search for. WordNet automatically performs the requested search on all of the strings and returns the results grouped by word. For example, the verb saw is both the present tense of saw and the past tense of see. When passed searchstr saw, WordNet performs the desired search first on saw and next on see, returning the list of saw senses and search results, followed by those for see.
wn() normally exits with the number of senses displayed. If searchstr is not found in WordNet, it exits with 0.
If the WordNet database cannot be opened, an error message is displayed and wn() exits with -1.
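Scripts that call wn can branch on these exit values. On POSIX systems only the low 8 bits of an exit status reach the parent process, so the -1 described above is typically seen as 255, and a sense count above 255 would not be reported exactly. The Python sketch below assumes a POSIX platform and a wn executable on the PATH; the function names and the classification labels are hypothetical.

```python
import subprocess


def interpret_exit_status(status):
    """Classify a wn exit status as seen by the calling process."""
    status &= 0xFF  # normalize: -1 and 255 are the same 8-bit value
    if status == 0:
        return ("not_found", 0)
    if status == 255:
        # Ambiguous in principle with a sense count of 255; treated as an error here.
        return ("database_error", None)
    return ("senses", status)


def run_wn(searchstr, *options):
    """Run wn and return (classification, stdout). Requires wn on the PATH."""
    proc = subprocess.run(
        ["wn", searchstr, *options],
        capture_output=True,
        text=True,
    )
    return interpret_exit_status(proc.returncode), proc.stdout
```

For example, run_wn("saw", "-over") would return the classification together with the overview text, so a caller can skip parsing when the classification is not_found or database_error.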
wnintro(1WN), wnb(1WN), wnintro(3WN), lexnames(5WN), senseidx(5WN), wndb(5WN), wninput(5WN), morphy(7WN), wngloss(7WN), wngroups(7WN).
Please report bugs to wordnet@princeton.edu.
Dec 2006 | WordNet 3.0 |