8.1.17. cltk.tag package
8.1.17.1. Submodules
8.1.17.2. cltk.tag.ner module
Named entity recognition (NER).
8.1.17.3. cltk.tag.pos module
Tag part of speech (POS) using CLTK taggers.
- class cltk.tag.pos.POSTag(language)
Bases: object
Tag words’ parts of speech.
- _setup_language_variables(lang)
Check for language availability and presence of tagger files.
- Parameters:
lang – str: the language argument given to the class
- Return type:
dict
- tag_unigram(untagged_string)
Tag POS with unigram tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
- tag_bigram(untagged_string)
Tag POS with bigram tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
- tag_trigram(untagged_string)
Tag POS with trigram tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
- tag_ngram_123_backoff(untagged_string)
Tag POS with a 1-, 2-, 3-gram backoff tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
- tag_ngram_12_backoff(untagged_string)
Tag POS with a 1-, 2-gram backoff tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
- tag_tnt(untagged_string)
Tag POS with TnT tagger.
- Parameters:
untagged_string – str: an untagged, untokenized string of text
- Return type:
str (tagged_text)
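The backoff methods above combine taggers of different n-gram orders. As a rough illustration of that general idea, the following pure-Python sketch tries the most specific n-gram context first and falls back to shorter contexts when a context was never seen. It is not CLTK's implementation: the names `train_ngram_table`, `tag_with_backoff`, and `TOY_CORPUS`, as well as the tags `N` and `V`, are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict


def train_ngram_table(tagged_sents, n):
    """Map (previous n-1 tags, word) to the tag seen most often in that context."""
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        tags = [tag for _, tag in sent]
        for i, (word, tag) in enumerate(sent):
            # Up to n-1 preceding tags; shorter at the start of a sentence.
            history = tuple(tags[max(0, i - n + 1):i])
            counts[(history, word)][tag] += 1
    return {ctx: c.most_common(1)[0][0] for ctx, c in counts.items()}


def tag_with_backoff(tokens, tables):
    """Tag tokens, trying each (n, table) pair in order until one matches."""
    tags = []
    for i, word in enumerate(tokens):
        tag = None
        for n, table in tables:
            history = tuple(tags[max(0, i - n + 1):i])
            tag = table.get((history, word))
            if tag is not None:
                break  # a more specific table answered; stop backing off
        tags.append(tag)  # None when no table knows the word
    return list(zip(tokens, tags))


# Hypothetical toy training data, invented for this sketch.
TOY_CORPUS = [
    [("arma", "N"), ("cano", "V")],
    [("virum", "N"), ("cano", "V")],
]

# Most specific table first: trigram, then bigram, then unigram.
tables = [(n, train_ngram_table(TOY_CORPUS, n)) for n in (3, 2, 1)]

print(tag_with_backoff(["arma", "virum", "cano", "que"], tables))
# [('arma', 'N'), ('virum', 'N'), ('cano', 'V'), ('que', None)]
```

In this toy run, "virum" is resolved only by the unigram table, "cano" by the bigram table, and the unseen word "que" is left untagged (`None`).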
8.1.17.4. cltk.tag.treebanks module
Generate a Python dict from treebank tags supplied as a str. As of this version, only treebanks following the Penn notation are supported.
- cltk.tag.treebanks.set_path(dicts, keys, v)
Helper function for modifying nested dictionaries.
- Parameters:
dicts – dict: the given dictionary
keys – list of str: path to the added value
v – str: value to be added
>>> d = dict()
>>> set_path(d, ['a', 'b', 'c'], 'd')
>>> d
{'a': {'b': {'c': ['d']}}}
If the path already exists, the new value is appended to the list at the leaf node rather than replacing it:
>>> set_path(d, ['a', 'b', 'c'], 'e')
>>> d
{'a': {'b': {'c': ['d', 'e']}}}
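The behavior documented above can be reproduced with `dict.setdefault`. The following is a minimal sketch, not the library's source: `set_path_sketch` is a hypothetical reimplementation, and `penn_to_dict` is a hypothetical helper showing one way such a function could turn a Penn-style bracketed string into a nested dict. The resulting dict shape is an assumption made for illustration.

```python
import re


def set_path_sketch(dicts, keys, v):
    """Append v to the list stored at the nested path given by keys."""
    node = dicts
    for key in keys[:-1]:
        # Walk down, creating intermediate dicts as needed.
        node = node.setdefault(key, {})
    # The leaf holds a list, so repeated paths accumulate values.
    node.setdefault(keys[-1], []).append(v)


def penn_to_dict(penn):
    """Hypothetical helper: store each word under its path of tags."""
    tokens = re.findall(r"\(|\)|[^\s()]+", penn)
    result, path = {}, []
    expect_label = False
    for tok in tokens:
        if tok == "(":
            expect_label = True  # the next token names a node
        elif tok == ")":
            path.pop()  # leave the current node
        elif expect_label:
            path.append(tok)
            expect_label = False
        else:
            set_path_sketch(result, path, tok)  # a word at a leaf
    return result


d = {}
set_path_sketch(d, ["a", "b", "c"], "d")
set_path_sketch(d, ["a", "b", "c"], "e")
print(d)
# {'a': {'b': {'c': ['d', 'e']}}}

print(penn_to_dict("(S (NP (DT the) (NN dog)) (VP (VBZ barks)))"))
# {'S': {'NP': {'DT': ['the'], 'NN': ['dog']}, 'VP': {'VBZ': ['barks']}}}
```

The sketch assumes well-formed brackets and that a tag path never serves as both a leaf and an inner node; handling those cases would need extra checks.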