8.1.13.1.1. cltk.prosody.lat package¶
8.1.13.1.1.1. Submodules¶
8.1.13.1.1.2. cltk.prosody.lat.clausulae_analysis module¶
Return dictionary of clausulae found in the prosody of Latin prose.
The clausulae analysis function returns a dictionary in which the key is the type of clausula and the value is the number of times it occurs in the text. The list of clausulae used in the method is derived from the 2019 Journal of Roman Studies paper “Auceps syllabarum: A Digital Analysis of Latin Prose Rhythm”. The list of clausulae is mutually exclusive, so no rhythm will be counted in multiple categories.
- class cltk.prosody.lat.clausulae_analysis.Clausula(rhythm_name, rhythm)¶
Bases:
tuple
- rhythm¶
Alias for field number 1
- rhythm_name¶
Alias for field number 0
- class cltk.prosody.lat.clausulae_analysis.Clausulae(rhythms=[Clausula(rhythm_name='cretic_trochee', rhythm='-u--x'), Clausula(rhythm_name='cretic_trochee_resolved_a', rhythm='uuu--x'), Clausula(rhythm_name='cretic_trochee_resolved_b', rhythm='-uuu-x'), Clausula(rhythm_name='cretic_trochee_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_cretic', rhythm='-u--ux'), Clausula(rhythm_name='molossus_cretic', rhythm='----ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_a', rhythm='uuu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_b', rhythm='-uuu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_c', rhythm='-u-uux'), Clausula(rhythm_name='double_molossus_cretic_resolved_d', rhythm='uu---ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_e', rhythm='-uu--ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_f', rhythm='--uu-ux'), Clausula(rhythm_name='double_molossus_cretic_resolved_g', rhythm='---uuux'), Clausula(rhythm_name='double_molossus_cretic_resolved_h', rhythm='-u---ux'), Clausula(rhythm_name='double_trochee', rhythm='-u-x'), Clausula(rhythm_name='double_trochee_resolved_a', rhythm='uuu-x'), Clausula(rhythm_name='double_trochee_resolved_b', rhythm='-uuux'), Clausula(rhythm_name='hypodochmiac', rhythm='-u-ux'), Clausula(rhythm_name='hypodochmiac_resolved_a', rhythm='uuu-ux'), Clausula(rhythm_name='hypodochmiac_resolved_b', rhythm='-uuuux'), Clausula(rhythm_name='spondaic', rhythm='---x'), Clausula(rhythm_name='heroic', rhythm='-uu-x')])[source]¶
Bases:
object
- clausulae_analysis(prosody)[source]¶
Return a dictionary in which the key is a type of clausula and the value is its frequency.
- Parameters:
prosody (
List
) – the prosody of a prose text (must be in the format of the scansion produced by the scanner classes)
- Return type:
List
[Dict
[str
,int
]]
- Returns:
dictionary of prosody
>>> Clausulae().clausulae_analysis(['-uuu-uuu-u--x', 'uu-uu-uu---x']) [{'cretic_trochee': 1}, {'cretic_trochee_resolved_a': 0}, {'cretic_trochee_resolved_b': 0}, {'cretic_trochee_resolved_c': 0}, {'double_cretic': 0}, {'molossus_cretic': 0}, {'double_molossus_cretic_resolved_a': 0}, {'double_molossus_cretic_resolved_b': 0}, {'double_molossus_cretic_resolved_c': 0}, {'double_molossus_cretic_resolved_d': 0}, {'double_molossus_cretic_resolved_e': 0}, {'double_molossus_cretic_resolved_f': 0}, {'double_molossus_cretic_resolved_g': 0}, {'double_molossus_cretic_resolved_h': 0}, {'double_trochee': 0}, {'double_trochee_resolved_a': 0}, {'double_trochee_resolved_b': 0}, {'hypodochmiac': 0}, {'hypodochmiac_resolved_a': 0}, {'hypodochmiac_resolved_b': 0}, {'spondaic': 1}, {'heroic': 0}]
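The mutually exclusive matching described above can be pictured with a minimal sketch in plain Python (an illustration, not the CLTK implementation; only four of the twenty-two rhythms are included): the end of each scansion string is matched against known rhythms, longest first, and counting stops at the first hit so no line lands in two categories.

```python
# Minimal sketch of mutually exclusive clausula counting.
RHYTHMS = [
    ("cretic_trochee", "-u--x"),
    ("molossus_cretic", "----ux"),
    ("spondaic", "---x"),
    ("heroic", "-uu-x"),
]

def clausulae_counts(prosody):
    counts = {name: 0 for name, _ in RHYTHMS}
    # Longest rhythms first, so a longer pattern shadows its suffixes.
    ordered = sorted(RHYTHMS, key=lambda pair: -len(pair[1]))
    for line in prosody:
        for name, rhythm in ordered:
            if line.endswith(rhythm):
                counts[name] += 1
                break  # first (longest) match only -> mutually exclusive
    return counts
```

With the prosody from the doctest above, `clausulae_counts(['-uuu-uuu-u--x', 'uu-uu-uu---x'])` counts one cretic_trochee and one spondaic ending.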
8.1.13.1.1.3. cltk.prosody.lat.hendecasyllable_scanner module¶
Utility class for producing a scansion pattern for Latin hendecasyllables.
Given a line of hendecasyllables, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
- class cltk.prosody.lat.hendecasyllable_scanner.HendecasyllableScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_tranform=False, *args, **kwargs)[source]¶
Bases:
VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
- scan(original_line, optional_transform=False)[source]¶
Scan a line of Latin hendecasyllables and produce a scansion pattern, and other data.
- Parameters:
original_line (
str
) – the original line of Latin verseoptional_transform (
bool
) – whether or not to perform i to j transform for syllabification
- Return type:
- Returns:
a Verse object
>>> scanner = HendecasyllableScanner() >>> print(scanner.scan("Cui dono lepidum novum libellum")) Verse(original='Cui dono lepidum novum libellum', scansion=' - U - U U - U - U - U ', meter='hendecasyllable', valid=True, syllable_count=11, accented='Cui donō lepidūm novūm libēllum', scansion_notes=['Corrected invalid start.'], syllables = ['Cui', 'do', 'no', 'le', 'pi', 'dūm', 'no', 'vūm', 'li', 'bēl', 'lum']) >>> print(scanner.scan( ... "ārida modo pumice expolitum?").scansion) - U - U U - U - U - U
- correct_invalid_start(scansion)[source]¶
The third syllable of a hendecasyllabic line is long, so we will convert it.
- Parameters:
scansion (
str
) – scansion string- Return type:
str
- Returns:
scansion string with corrected start
>>> print(HendecasyllableScanner().correct_invalid_start( ... "- U U U U - U - U - U").strip()) - U - U U - U - U - U
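The rule behind this correction can be sketched in a few lines (an illustrative reimplementation, not the CLTK code): split the scansion into marks and force the third mark to long.

```python
# Sketch: the third syllable of a hendecasyllable is long by rule,
# so coerce the mark at index 2 to "-" regardless of its current value.
def correct_invalid_start(scansion):
    marks = scansion.split()
    if len(marks) > 2:
        marks[2] = "-"
    return " ".join(marks)
```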
- correct_antepenult_chain(scansion)[source]¶
For hendecasyllables the last three feet of the verse are predictable and do not regularly allow substitutions.
- Parameters:
scansion (
str
) – scansion line thus far- Return type:
str
- Returns:
corrected line of scansion
>>> print(HendecasyllableScanner().correct_antepenult_chain( ... "-U -UU UU UU UX").strip()) -U -UU -U -U -X
8.1.13.1.1.4. cltk.prosody.lat.hexameter_scanner module¶
Utility class for producing a scansion pattern for a Latin hexameter.
Given a line of hexameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
Because hexameters have strict rules on the position and quantity of stressed and unstressed syllables, we can often infer many of the stress qualities of the syllables, given a valid hexameter. If the Latin hexameter provided is not accented with macrons, then a best guess is made. For the scansion produced, the stress of a diphthong is indicated in the second of the two vowel positions; for the accented line produced, the diphthong stress is not indicated with any macronized vowels.
- class cltk.prosody.lat.hexameter_scanner.HexameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶
Bases:
VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
- scan(original_line, optional_transform=False, dactyl_smoothing=False)[source]¶
Scan a line of Latin hexameter and produce a scansion pattern, and other data.
- Parameters:
original_line (
str
) – the original line of Latin verseoptional_transform (
bool
) – whether or not to perform i to j transform for syllabificationdactyl_smoothing (
bool
) – whether or not to perform dactyl smoothing
- Return type:
- Returns:
a Verse object
>>> scanner = HexameterScanner()
>>> print(HexameterScanner().scan( ... "ēxiguām sedēm pariturae tērra negavit").scansion) - - - - - U U - - - U U - U >>> print(scanner.scan("impulerit. Tantaene animis caelestibus irae?")) Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae']) >>> print(scanner.scan( ... "Arma virumque cano, Troiae qui prīmus ab ōrīs").scansion) - U U - U U - - - - - U U - - >>> # some hexameters need the optional transformations: >>> optional_transform_scanner = HexameterScanner(optional_transform=True) >>> print(optional_transform_scanner.scan( ... "Ītaliam, fāto profugus, Lāvīniaque vēnit").scansion) - - - - - U U - - - U U - U >>> print(HexameterScanner().scan( ... "lītora, multum ille et terrīs iactātus et alto").scansion) - U U - - - - - - - U U - U >>> print(HexameterScanner().scan( ... "vī superum saevae memorem Iūnōnis ob īram;").scansion) - U U - - - U U - - - U U - U >>> # handle multiple elisions >>> print(scanner.scan("monstrum horrendum, informe, ingens, cui lumen ademptum").scansion) - - - - - - - - - U U - U >>> # if we have 17 syllables, create a chain of all dactyls >>> print(scanner.scan("quadrupedante putrem sonitu quatit ungula campum" ... ).scansion) - U U - U U - U U - U U - U U - U >>> # if we have 13 syllables exactly, we'll create a spondaic hexameter >>> print(HexameterScanner().scan( ... "illi inter sese multa vi bracchia tollunt").scansion) - - - - - - - - - UU - - >>> print(HexameterScanner().scan( ... "dat latus; insequitur cumulo praeruptus aquae mons").scansion) - U U - U U - U U - - - U U - - >>> print(optional_transform_scanner.scan( ... 
"Non quivis videt inmodulata poëmata iudex").scansion) - - - U U - U U - U U- U U - - >>> print(HexameterScanner().scan( ... "certabant urbem Romam Remoramne vocarent").scansion) - - - - - - - U U - U U - - >>> # advanced smoothing is available via keyword flags: dactyl_smoothing >>> # print(HexameterScanner().scan( #... "his verbis: 'o gnata, tibi sunt ante ferendae", #... dactyl_smoothing=True).scansion) #- - - - - U U - - - U U - -
- correct_invalid_fifth_foot(scansion)[source]¶
The ‘inverted amphibrach’: stressed_unstressed_stressed syllable pattern is invalid in hexameters, so here we coerce it to stressed when it occurs at the end of a line
- Parameters:
scansion (
str
) – the scansion pattern- Return corrected scansion:
the corrected scansion pattern
>>> print(HexameterScanner().correct_invalid_fifth_foot( ... " - - - U U - U U U - - U U U - x")) - - - U U - U U U - - - U U - x
- Return type:
str
- invalid_foot_to_spondee(feet, foot, idx)[source]¶
In hexameters, a single foot that is an unstressed_stressed syllable pattern is often just a double spondee, so here we coerce it to stressed.
- Parameters:
feet (
list
) – list of string representations of metrical feetfoot (
str
) – the bad foot to correctidx (
int
) – the index of the foot to correct
- Return type:
str
- Returns:
corrected scansion
>>> print(HexameterScanner().invalid_foot_to_spondee( ... ['-UU', '--', '-U', 'U-', '--', '-UU'],'-U', 2)) -UU----U----UU
- correct_dactyl_chain(scansion)[source]¶
Three or more unstressed accents in a row is a broken dactyl chain, best detected and processed backwards.
Since this method takes a Procrustean approach to modifying the scansion pattern, it is not used by default in the scan method; however, it is available as an optional keyword parameter, and users looking to further automate the generation of scansion candidates should consider using this as a fall back.
- Parameters:
scansion (
str
) – scansion with broken dactyl chain; inverted amphibrachs not allowed- Return type:
str
- Returns:
corrected line of scansion
>>> print(HexameterScanner().correct_dactyl_chain( ... "- U U - - U U - - - U U - x")) - - - - - U U - - - U U - x >>> print(HexameterScanner().correct_dactyl_chain( ... "- U U U U - - - - - U U - U")) - - - U U - - - - - U U - U
- correct_inverted_amphibrachs(scansion)[source]¶
The ‘inverted amphibrach’: stressed_unstressed_stressed syllable pattern is invalid in hexameters, so here we coerce it to stressed: - U - -> - - -
- Parameters:
scansion (
str
) – the scansion stress pattern- Return type:
str
- Returns:
a string with the corrected scansion pattern
>>> print(HexameterScanner().correct_inverted_amphibrachs( ... " - U - - U - U U U U - U - x")) - - - - - - U U U U - - - x >>> print(HexameterScanner().correct_inverted_amphibrachs( ... " - - - U - - U U U U U- - U - x")) - - - - - - U U U U U- - - - x >>> print(HexameterScanner().correct_inverted_amphibrachs( ... "- - - - - U - U U - U U - -")) - - - - - - - U U - U U - - >>> print(HexameterScanner().correct_inverted_amphibrachs( ... "- UU- U - U - - U U U U- U")) - UU- - - - - - U U U U- U
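A minimal way to express this coercion (a sketch, not the CLTK implementation) is a repeated regex substitution over the space-separated marks; the loop repeats because one substitution can expose another inverted amphibrach.

```python
import re

# Sketch: rewrite every inverted amphibrach (- U -) as three longs
# (- - -), repeating until the pattern no longer occurs.
def correct_inverted_amphibrachs(scansion):
    previous = None
    while previous != scansion:
        previous = scansion
        scansion = re.sub(r"- U -", "- - -", scansion)
    return scansion
```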
8.1.13.1.1.5. cltk.prosody.lat.macronizer module¶
Delineate the length of Latin vowels.
The Macronizer class places a macron over naturally long Latin vowels. To discern whether a vowel is long, a word is first matched with its Morpheus entry by way of its POS tag. The Morpheus entry includes the macronized form of the matched word.
Since the accuracy of the macronizer largely derives from the accuracy of the POS tagger used to match words to their Morpheus entries, the Macronizer class allows for multiple POS taggers to be used.
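The word-to-entry matching can be pictured as a dictionary lookup keyed on (word, POS tag). The table below is a tiny hypothetical stand-in for the Morpheus database, not its real schema, and the entries are illustrative only:

```python
# Hypothetical miniature "Morpheus" table: (word, POS tag) -> macronized form.
MORPHEUS_ENTRIES = {
    ("divisa", "t-prppnn-"): "dīvīsa",
    ("partes", "n-p---fa-"): "partēs",
}

def macronize_word(word, tag):
    # Fall back to the unmacronized form when no entry matches.
    return (word, tag, MORPHEUS_ENTRIES.get((word, tag), word))
```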
Todo
Determine how to disambiguate tags (see logger)
- class cltk.prosody.lat.macronizer.Macronizer(tagger)[source]¶
Bases:
object
Macronize Latin words.
Macronize text by using the POS tag to find the macronized form within the Morpheus database.
- _retrieve_tag(text)[source]¶
Tag text with chosen tagger and clean tags.
Tag format:
[('word', 'tag')]
- Parameters:
text (
str
) – string- Return type:
List
[Tuple
[str
,str
]]- Returns:
list of tuples, with each tuple containing the word and its pos tag
- _retrieve_morpheus_entry(word)[source]¶
Return Morpheus entry for word
Entry format:
[(head word, tag, macronized form)]
- Parameters:
word (
str
) – unmacronized, lowercased word
- Return type:
Tuple
[str
,str
,str
]- Returns:
Morpheus entry as a tuple
- _macronize_word(word)[source]¶
Return macronized word.
- Parameters:
word (
Tuple
[str
,str
]) – (word, tag)- Return type:
Tuple
[str
,str
,str
]- Returns:
(word, tag, macronized_form)
- macronize_tags(text)[source]¶
Return macronized form along with POS tags.
E.g. "Gallia est omnis divisa in partes tres," -> [('gallia', 'n-s---fb-', 'galliā'), ('est', 'v3spia---', 'est'), ('omnis', 'a-s---mn-', 'omnis'), ('divisa', 't-prppnn-', 'dīvīsa'), ('in', 'r--------', 'in'), ('partes', 'n-p---fa-', 'partēs'), ('tres', 'm--------', 'trēs')]
- Parameters:
text (
str
) – raw text- Return type:
List
[Tuple
[str
,str
,str
]]- Returns:
tuples of head word, tag, macronized form
8.1.13.1.1.6. cltk.prosody.lat.metrical_validator module¶
Utility class for validating scansion patterns: hexameter, hendecasyllables, pentameter. Allows users to configure the scansion symbols internally via a constructor argument; a suitable default is provided.
- class cltk.prosody.lat.metrical_validator.MetricalValidator(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶
Bases:
object
Currently supports validation for: hexameter, hendecasyllables, pentameter.
- is_valid_hexameter(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid hexameter metrical patterns.
- Parameters:
scanned_line (
str
) – a line containing a sequence of stressed and unstressed syllables
>>> print(MetricalValidator().is_valid_hexameter("-UU---UU---UU-U")) True
- Return type:
bool
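One way to picture this validation is to enumerate the candidate patterns up front. The sketch below assumes each of the five variable feet is a dactyl or a spondee and the final syllable is anceps; it is an illustration consistent with the doctest above, not the CLTK implementation.

```python
from itertools import product

# All candidate hexameter patterns: five feet of dactyl (-UU) or
# spondee (--), closed by a long plus an anceps syllable (-X).
VALID_HEXAMETERS = {
    "".join(feet) + "-X" for feet in product(("-UU", "--"), repeat=5)
}

def is_valid_hexameter(scanned_line):
    # Normalize the final (anceps) syllable to X before set membership.
    pattern = scanned_line.replace(" ", "")
    return pattern[:-1] + "X" in VALID_HEXAMETERS
```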
- is_valid_hendecasyllables(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid Hendecasyllables metrical patterns
- Parameters:
scanned_line (
str
) – a line containing a sequence of stressed and unstressed syllables
>>> print(MetricalValidator().is_valid_hendecasyllables("-U-UU-U-U-U")) True
- Return type:
bool
- is_valid_pentameter(scanned_line)[source]¶
Determine if a scansion pattern is one of the valid Pentameter metrical patterns
- Parameters:
scanned_line (
str
) – a line containing a sequence of stressed and unstressed syllables- Return bool:
whether or not the scansion is a valid pentameter
>>> print(MetricalValidator().is_valid_pentameter('-UU-UU--UU-UUX')) True
- Return type:
bool
- hexameter_feet(scansion)[source]¶
Produces a list of hexameter feet, stressed and unstressed syllables with spaces intact. If the scansion line is not entirely correct, it will attempt to corral one or more improper patterns into one or more feet.
- Parameters:
scansion – the scanned line
- Returns:
list of strings representing the feet of the hexameter; if the scansion is wildly incorrect, an empty list.
>>> print("|".join(MetricalValidator().hexameter_feet( ... "- U U - - - - - - - U U - U")).strip() ) - U U |- - |- - |- - |- U U |- U >>> print("|".join(MetricalValidator().hexameter_feet( ... "- U U - - U - - - - U U - U")).strip()) - U U |- - |U - |- - |- U U |- U
- Return type:
List
[str
]
- static hexameter_known_stresses()[source]¶
Provide a list of known stress positions for a hexameter.
- Return type:
List
[int
]- Returns:
a zero based list enumerating which syllables are known to be stressed.
- static hexameter_possible_unstresses()[source]¶
Provide a list of possible positions which may be unstressed syllables in a hexameter.
- Return type:
List
[int
]- Returns:
a zero based list enumerating which syllables are known to be unstressed.
- closest_hexameter_patterns(scansion)[source]¶
Find the closest group of matching valid hexameter patterns.
- Return type:
List
[str
]- Returns:
list of the closest valid hexameter patterns; only candidates with a matching length/number of syllables are considered.
>>> print(MetricalValidator().closest_hexameter_patterns('-UUUUU-----UU--')) ['-UU-UU-----UU--']
- static pentameter_possible_stresses()[source]¶
Provide a list of possible stress positions for a pentameter.
- Return type:
List
[int
]- Returns:
a zero based list enumerating which syllables are known to be stressed.
- closest_pentameter_patterns(scansion)[source]¶
Find the closest group of matching valid pentameter patterns.
- Return type:
List
[str
]- Returns:
list of the closest valid pentameter patterns; only candidates with a matching length/number of syllables are considered.
>>> print(MetricalValidator().closest_pentameter_patterns('--UUU--UU-UUX')) ['---UU--UU-UUX']
- closest_hendecasyllable_patterns(scansion)[source]¶
Find the closest group of matching valid hendecasyllable patterns.
- Return type:
List
[str
]- Returns:
list of the closest valid hendecasyllable patterns; only candidates with a matching length/number of syllables are considered.
>>> print(MetricalValidator().closest_hendecasyllable_patterns('UU-UU-U-U-X')) ['-U-UU-U-U-X', 'U--UU-U-U-X']
- _closest_patterns(patterns, scansion)[source]¶
Find the closest group of matching valid patterns.
- Patterns:
a list of patterns
- Scansion:
the scansion pattern thus far
- Return type:
List
[str
]- Returns:
list of the closest valid patterns; only candidates with a matching length/number of syllables are considered.
- _build_hexameter_template(stress_positions)[source]¶
Build a hexameter scansion template from a string of 5 binary numbers; NOTE: traditionally the fifth foot is a dactyl and spondee substitution there is rare; however, since it is a possible combination, we include it here.
- Parameters:
stress_positions (
str
) – 5 binary integers, indicating whether foot is dactyl or spondee- Return type:
str
- Returns:
a valid hexameter scansion template, a string representing stressed and unstressed syllables with the optional terminal ending.
>>> print(MetricalValidator()._build_hexameter_template("01010")) -UU---UU---UU-X
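The template construction reduces to a string join (an illustrative sketch consistent with the doctest above, not the CLTK code itself):

```python
# '0' selects a dactyl (-UU), '1' a spondee (--); every template ends
# with a long syllable plus an anceps (-X).
def build_hexameter_template(stress_positions):
    feet = {"0": "-UU", "1": "--"}
    return "".join(feet[c] for c in stress_positions) + "-X"
```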
8.1.13.1.1.7. cltk.prosody.lat.pentameter_scanner module¶
Utility class for producing a scansion pattern for a Latin pentameter.
Given a line of pentameter, the scan method performs a series of transformations and checks; for each one performed successfully, a note is added to the scansion_notes list so that end users may view the provenance of a scansion.
- class cltk.prosody.lat.pentameter_scanner.PentameterScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, optional_transform=False, *args, **kwargs)[source]¶
Bases:
VerseScanner
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
- scan(original_line, optional_transform=False)[source]¶
Scan a line of Latin pentameter and produce a scansion pattern, and other data.
- Parameters:
original_line (
str
) – the original line of Latin verseoptional_transform (
bool
) – whether or not to perform i to j transform for syllabification
- Return type:
- Returns:
a Verse object
>>> scanner = PentameterScanner() >>> print(scanner.scan('ex hoc ingrato gaudia amore tibi.')) Verse(original='ex hoc ingrato gaudia amore tibi.', scansion='- - - - - - U U - U U U ', meter='pentameter', valid=True, syllable_count=12, accented='ēx hōc īngrātō gaudia amōre tibi.', scansion_notes=['Spondaic pentameter'], syllables = ['ēx', 'hoc', 'īn', 'gra', 'to', 'gau', 'di', 'a', 'mo', 're', 'ti', 'bi']) >>> print(scanner.scan( ... "in vento et rapida scribere oportet aqua.").scansion) - - - U U - - U U - U U U
- make_spondaic(scansion)[source]¶
If a pentameter line has 12 syllables, then it must start with double spondees.
- Parameters:
scansion (
str
) – a string of scansion patterns- Return type:
str
- Returns:
a scansion pattern string starting with two spondees
>>> print(PentameterScanner().make_spondaic("U U U U U U U U U U U U")) - - - - - - U U - U U U
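The 12-syllable rule amounts to overwriting the first eleven marks with the spondaic template while preserving the final anceps mark (a sketch of the rule as shown in the doctest above, not the CLTK code):

```python
# Spondaic pentameter: "-- -- -" then "-UU -UU" then an anceps final.
def make_spondaic(scansion):
    marks = scansion.split()
    if len(marks) == 12:
        marks = list("------UU-UU") + [marks[-1]]
    return " ".join(marks)
```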
- make_dactyls(scansion)[source]¶
If a pentameter line has 14 syllables, it starts and ends with double dactyls.
- Parameters:
scansion (
str
) – a string of scansion patterns- Return type:
str
- Returns:
a scansion pattern string starting and ending with double dactyls
>>> print(PentameterScanner().make_dactyls("U U U U U U U U U U U U U U")) - U U - U U - - U U - U U U
- correct_penultimate_dactyl_chain(scansion)[source]¶
For pentameter, the last two feet of the verse are predictable dactyls and do not regularly allow substitutions.
- Parameters:
scansion (
str
) – scansion line thus far- Return type:
str
- Returns:
corrected line of scansion
>>> print(PentameterScanner().correct_penultimate_dactyl_chain( ... "U U U U U U U U U U U U U U")) U U U U U U U - U U - U U U
8.1.13.1.1.8. cltk.prosody.lat.scanner module¶
Scansion module for scanning Latin prose rhythms.
- class cltk.prosody.lat.scanner.Scansion(punctuation=None, clausula_length=13, elide=True)[source]¶
Bases:
object
Preprocesses Latin text for prose rhythm analysis.
- SHORT_VOWELS = ['a', 'e', 'i', 'o', 'u', 'y']¶
- LONG_VOWELS = ['ā', 'ē', 'ī', 'ō', 'ū']¶
- VOWELS = ['a', 'e', 'i', 'o', 'u', 'y', 'ā', 'ē', 'ī', 'ō', 'ū']¶
- DIPHTHONGS = ['ae', 'au', 'ei', 'oe', 'ui']¶
- SINGLE_CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j']¶
- DOUBLE_CONSONANTS = ['x', 'z']¶
- CONSONANTS = ['b', 'c', 'd', 'g', 'k', 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'f', 'j', 'x', 'z']¶
- DIGRAPHS = ['ch', 'ph', 'th', 'qu']¶
- LIQUIDS = ['r', 'l']¶
- MUTES = ['b', 'p', 'd', 't', 'c', 'g']¶
- MUTE_LIQUID_EXCEPTIONS = ['gl', 'bl']¶
- NASALS = ['m', 'n']¶
- SESTS = ['sc', 'sm', 'sp', 'st', 'z']¶
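These constants feed the long-by-position test seen in the doctests below. The sketch here is a hypothetical reading of that rule, not the CLTK implementation: two following consonants lengthen a syllable, a mute + liquid pair leaves the quantity undecided, and the handling of the listed MUTE_LIQUID_EXCEPTIONS ('gl', 'bl') is an assumption.

```python
MUTES = set("bpdtcg")
LIQUIDS = set("rl")
MUTE_LIQUID_EXCEPTIONS = {"gl", "bl"}  # assumed: these pairs do lengthen

def long_by_position(cluster):
    # cluster: the two consonants following the syllable's vowel.
    if cluster in MUTE_LIQUID_EXCEPTIONS:
        return (True, None)
    if cluster[0] in MUTES and cluster[1] in LIQUIDS:
        return (False, "mute+liquid")  # quantity left undecided
    return (True, None)
```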
- _tokenize_syllables(word)[source]¶
Tokenize syllables for word. "mihi" -> [{"syllable": "mi", index: 0, … } … ] Syllable properties: syllable: string -> syllable index: int -> position in word long_by_nature: bool -> is syllable long by nature accented: bool -> does receive accent long_by_position: bool -> is syllable long by position :type word:
str
:param word: string :rtype:List
[Dict
] :return: list>>> Scansion()._tokenize_syllables("mihi") [{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'hi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("ivi") [{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'vi', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("audītū") [{'syllable': 'au', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dī', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tū', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("ā") [{'syllable': 'ā', 'index': 0, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}] >>> Scansion()._tokenize_syllables("conjiciō") [{'syllable': 'con', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'ji', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ci', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'ō', 'index': 3, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("lingua") [{'syllable': 'lin', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, 
None), 'accented': True}, {'syllable': 'gua', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("abrante") [{'syllable': 'ab', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'ran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("redemptor") [{'syllable': 'red', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'em', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'ptor', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}] >>> Scansion()._tokenize_syllables("nagrante") [{'syllable': 'na', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': False}, {'syllable': 'gran', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'te', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}]
- _tokenize_words(sentence)[source]¶
Tokenize words for sentence. "Puella bona est" -> [{word: puella, index: 0, … }, … ] Word properties: word: string -> word index: int -> position in sentence syllables: list -> list of syllable objects syllables_count: int -> number of syllables in word :type sentence:
str
:param sentence: string :rtype:List
[Dict
] :return: list>>> Scansion()._tokenize_words('dedērunt te miror antōnī quorum.') [{'word': 'dedērunt', 'index': 0, 'syllables': [{'syllable': 'de', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'dē', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'runt', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 3}, {'word': 'te', 'index': 1, 'syllables': [{'syllable': 'te', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'miror', 'index': 2, 'syllables': [{'syllable': 'mi', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'ror', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'antōnī', 'index': 3, 'syllables': [{'syllable': 'an', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}, {'syllable': 'tō', 'index': 1, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'nī', 'index': 2, 'elide': (False, None), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'quorum.', 'index': 4, 'syllables': [{'syllable': 'quo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'rum', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}] >>> Scansion()._tokenize_words('a spes co i no xe cta.') [{'word': 'a', 'index': 0, 
'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'sest'), 'accented': True}], 'syllables_count': 1}, {'word': 'spes', 'index': 1, 'syllables': [{'syllable': 'spes', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'co', 'index': 2, 'syllables': [{'syllable': 'co', 'index': 0, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'i', 'index': 3, 'syllables': [{'syllable': 'i', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}, {'word': 'no', 'index': 4, 'syllables': [{'syllable': 'no', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'xe', 'index': 5, 'syllables': [{'syllable': 'xe', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'cta.', 'index': 6, 'syllables': [{'syllable': 'cta', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}] >>> Scansion()._tokenize_words('x') [] >>> Scansion()._tokenize_words('atae amo.') [{'word': 'atae', 'index': 0, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'tae', 'index': 1, 'elide': (True, 'strong'), 'long_by_nature': True, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'amo.', 'index': 1, 'syllables': [{'syllable': 'a', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'mo', 'index': 1, 
'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}] >>> Scansion()._tokenize_words('bar rid.') [{'word': 'bar', 'index': 0, 'syllables': [{'syllable': 'bar', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}, {'word': 'rid.', 'index': 1, 'syllables': [{'syllable': 'rid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}] >>> Scansion()._tokenize_words('ba brid.') [{'word': 'ba', 'index': 0, 'syllables': [{'syllable': 'ba', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, 'mute+liquid'), 'accented': True}], 'syllables_count': 1}, {'word': 'brid.', 'index': 1, 'syllables': [{'syllable': 'brid', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}], 'syllables_count': 1}]
- tokenize(text)[source]¶
Tokenize text on supplied characters. “Puella bona est. Puer malus est.” -> [ [{word: puella, syllables: […], index: 0}, … ], … ]
- Return type:
List[Dict]
- Returns:
list
>>> Scansion().tokenize('puella bona est. puer malus est.') [{'plain_text_sentence': 'puella bona est', 'structured_sentence': [{'word': 'puella', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}, {'syllable': 'el', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}, {'syllable': 'la', 'index': 2, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 3}, {'word': 'bona', 'index': 1, 'syllables': [{'syllable': 'bo', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'na', 'index': 1, 'elide': (True, 'weak'), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': ' puer malus est', 'structured_sentence': [{'word': 'puer', 'index': 0, 'syllables': [{'syllable': 'pu', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'er', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (True, None), 'accented': False}], 'syllables_count': 2}, {'word': 'malus', 'index': 1, 'syllables': [{'syllable': 'ma', 'index': 0, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': True}, {'syllable': 'lus', 'index': 1, 'elide': (False, None), 'long_by_nature': False, 'long_by_position': (False, None), 'accented': False}], 'syllables_count': 2}, {'word': 'est', 'index': 2, 'syllables': [{'syllable': 'est', 'index': 0, 'elide': (False, None),
'long_by_nature': False, 'long_by_position': (True, None), 'accented': True}], 'syllables_count': 1}]}, {'plain_text_sentence': '', 'structured_sentence': []}]
- scan_text(text)[source]¶
Return a flat list of rhythms. The desired clausula length is passed as a parameter; clausulae shorter than the specified length can be excluded.
- Return type:
List[str]
- Returns:
>>> Scansion().scan_text('dedērunt te miror antōnī quorum. sī quid est in mē ingenī jūdicēs quod sentiō.') ['u--uuu---ux', 'u---u--u---ux']
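As a sketch of how the clausulae analysis described at the top of this package consumes these rhythm strings, the following standalone example counts clausulae by suffix matching. The pattern names come from the Clausulae defaults listed earlier, but `clausulae_frequency` here is a hypothetical illustration using only a subset of the patterns, not the cltk implementation:

```python
# Illustrative subset of the default clausula patterns from
# cltk.prosody.lat.clausulae_analysis (rhythm_name, rhythm).
CLAUSULAE = [
    ("cretic_trochee", "-u--x"),
    ("double_molossus_cretic_resolved_d", "uu---ux"),
    ("double_molossus_cretic_resolved_h", "-u---ux"),
    ("spondaic", "---x"),
    ("heroic", "-uu-x"),
]

def clausulae_frequency(rhythms):
    """Count each rhythm string in at most one clausula category."""
    counts = {name: 0 for name, _ in CLAUSULAE}
    for rhythm in rhythms:
        for name, pattern in CLAUSULAE:
            if rhythm.endswith(pattern):
                counts[name] += 1
                break  # categories are mutually exclusive: count once only
    return counts

# The two rhythm strings produced by scan_text in the doctest above:
print(clausulae_frequency(['u--uuu---ux', 'u---u--u---ux']))
```

The `break` enforces the mutual exclusivity noted in the module description: no rhythm is counted in multiple categories.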
8.1.13.1.1.9. cltk.prosody.lat.scansion_constants module¶
Configuration class for specifying scansion constants.
- class cltk.prosody.lat.scansion_constants.ScansionConstants(unstressed='U', stressed='-', optional_terminal_ending='X', separator='|')[source]¶
Bases:
object
Constants containing strings have characters in upper and lower case, since they will often be used in regular expressions and used to preserve a verse's original case.
This class also allows users to customize scansion constants and scanner behavior.
>>> constants = ScansionConstants(unstressed="U",stressed= "-", optional_terminal_ending="X") >>> print(constants.DACTYL) -UU
>>> smaller_constants = ScansionConstants( ... unstressed="˘",stressed= "¯", optional_terminal_ending="x") >>> print(smaller_constants.DACTYL) ¯˘˘
- HEXAMETER_ENDING¶
The following two constants are not official scansion terms, but mark patterns that are invalid in hexameters
- DOUBLED_CONSONANTS¶
Prefix order is not arbitrary; one will want to match on extra before ex
8.1.13.1.1.10. cltk.prosody.lat.scansion_formatter module¶
Utility class for formatting scansion patterns
- class cltk.prosody.lat.scansion_formatter.ScansionFormatter(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶
Bases:
object
Users can specify which scansion symbols to use in the formatting.
>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--")) -UU|-UU|-UU|--|-UU|-- >>> constants = ScansionConstants(unstressed="˘", stressed= "¯", optional_terminal_ending="x") >>> formatter = ScansionFormatter(constants) >>> print(formatter.hexameter( "¯˘˘¯˘˘¯˘˘¯¯¯˘˘¯¯")) ¯˘˘|¯˘˘|¯˘˘|¯¯|¯˘˘|¯¯
- hexameter(line)[source]¶
Format a string of hexameter metrical stress patterns into foot divisions
- Parameters:
line (str) – the scansion pattern
- Return type:
str
- Returns:
the scansion string formatted with foot breaks
>>> print(ScansionFormatter().hexameter( "-UU-UU-UU---UU--")) -UU|-UU|-UU|--|-UU|--
- merge_line_scansion(line, scansion)[source]¶
Merge a line of verse with its scansion string. Do not accent diphthongs.
- Parameters:
line (str) – the original Latin verse line
scansion (str) – the scansion pattern
- Return type:
str
- Returns:
the original line with the scansion pattern applied via macrons
>>> print(ScansionFormatter().merge_line_scansion( ... "Arma virumque cano, Troiae qui prīmus ab ōrīs", ... "- U U - U U - UU- - - U U - -")) Ārma virūmque canō, Troiae quī prīmus ab ōrīs
>>> print(ScansionFormatter().merge_line_scansion( ... "lītora, multum ille et terrīs iactātus et alto", ... " - U U - - - - - - - U U - U")) lītora, mūltum īlle ēt tērrīs iāctātus et ālto
>>> print(ScansionFormatter().merge_line_scansion( ... 'aut facere, haec a te dictaque factaque sunt', ... ' - U U - - - - U U - U U - ')) aut facere, haec ā tē dīctaque fāctaque sūnt
8.1.13.1.1.11. cltk.prosody.lat.string_utils module¶
Utility class for processing scansion and text.
- cltk.prosody.lat.string_utils.remove_punctuation_dict()[source]¶
Provide a dictionary for removing punctuation, swallowing spaces.
- Returns:
dict with punctuation from the Unicode table
>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate( ... remove_punctuation_dict()).lstrip()) Im ok Oh Fine
- Return type:
Dict[int, None]
- cltk.prosody.lat.string_utils.punctuation_for_spaces_dict()[source]¶
Provide a dictionary for removing punctuation, keeping spaces. Essential for scansion to keep stress patterns in alignment with original vowel positions in the verse.
- Returns:
dict with punctuation from the Unicode table
>>> print("I'm ok! Oh #%&*()[]{}!? Fine!".translate( ... punctuation_for_spaces_dict()).strip()) I m ok Oh Fine
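The translation table documented above can be sketched as follows, assuming it maps every codepoint in the Unicode punctuation categories (those whose category begins with "P") to a space. This is an illustration of the documented behavior, not the cltk source:

```python
import sys
import unicodedata

def punctuation_for_spaces_dict() -> dict:
    """Build a str.translate table sending each Unicode punctuation
    codepoint (category 'P*') to a single space."""
    return {
        codepoint: " "
        for codepoint in range(sys.maxunicode + 1)
        if unicodedata.category(chr(codepoint)).startswith("P")
    }

table = punctuation_for_spaces_dict()
cleaned = "I'm ok! Oh #%&*()[]{}!? Fine!".translate(table)
print(cleaned.split())  # → ['I', 'm', 'ok', 'Oh', 'Fine']
```

Replacing punctuation with spaces (rather than deleting it) is what keeps stress positions aligned with the original vowel positions, as the docstring notes.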
- Return type:
Dict[int, str]
- cltk.prosody.lat.string_utils.differences(scansion, candidate)[source]¶
Given two strings, return a list of index positions where the contents differ.
- Parameters:
scansion (str) –
candidate (str) –
- Return type:
List[int]
- Returns:
>>> differences("abc", "abz") [2]
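Matching the doctest above, a minimal standalone sketch of differences might look like (an illustration, not the cltk source):

```python
from typing import List

def differences(scansion: str, candidate: str) -> List[int]:
    """Index positions where two equal-length strings disagree."""
    return [i for i, (a, b) in enumerate(zip(scansion, candidate)) if a != b]

print(differences("abc", "abz"))  # → [2]
```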
- cltk.prosody.lat.string_utils.mark_list(line)[source]¶
Given a string, return a list of index positions where a non-blank character exists.
- Parameters:
line (str) –
- Return type:
List[int]
- Returns:
>>> mark_list(" a b c") [1, 3, 5]
- cltk.prosody.lat.string_utils.space_list(line)[source]¶
Given a string, return a list of index positions where a blank space occurs.
- Parameters:
line (str) –
- Return type:
List[int]
- Returns:
>>> space_list(" abc ") [0, 1, 2, 3, 7]
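Minimal standalone sketches of mark_list and space_list consistent with the doctests (illustrations, not the cltk source; the space_list input below assumes four leading spaces and one trailing space, which is what the documented output implies):

```python
from typing import List

def mark_list(line: str) -> List[int]:
    """Index positions holding a non-blank character."""
    return [i for i, ch in enumerate(line) if not ch.isspace()]

def space_list(line: str) -> List[int]:
    """Index positions holding a blank space."""
    return [i for i, ch in enumerate(line) if ch == " "]

print(mark_list(" a b c"))     # → [1, 3, 5]
print(space_list("    abc "))  # → [0, 1, 2, 3, 7]
```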
- cltk.prosody.lat.string_utils.flatten(list_of_lists)[source]¶
Given a list of lists, flatten all the items into one list.
- Parameters:
list_of_lists –
- Returns:
>>> flatten([ [1, 2, 3], [4, 5, 6]]) [1, 2, 3, 4, 5, 6]
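A one-line sketch of flatten, matching the doctest above (single-level flattening only):

```python
from typing import Any, List

def flatten(list_of_lists: List[List[Any]]) -> List[Any]:
    """Merge one level of nesting into a single flat list."""
    return [item for sublist in list_of_lists for item in sublist]

print(flatten([[1, 2, 3], [4, 5, 6]]))  # → [1, 2, 3, 4, 5, 6]
```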
- cltk.prosody.lat.string_utils.to_syllables_with_trailing_spaces(line, syllables)[source]¶
Given a line of syllables and spaces, and a list of syllables, produce a list of the syllables with trailing spaces attached as appropriate.
- Parameters:
line (str) –
syllables (List[str]) –
- Return type:
List[str]
- Returns:
>>> to_syllables_with_trailing_spaces(' arma virumque cano ', ... ['ar', 'ma', 'vi', 'rum', 'que', 'ca', 'no' ]) [' ar', 'ma ', 'vi', 'rum', 'que ', 'ca', 'no ']
- cltk.prosody.lat.string_utils.join_syllables_spaces(syllables, spaces)[source]¶
Given a list of syllables, and a list of integers indicating the position of spaces, return a string that has a space inserted at the designated points.
- Parameters:
syllables (List[str]) –
spaces (List[int]) –
- Return type:
str
- Returns:
>>> join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11]) 'won to tree dun'
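A standalone sketch consistent with the doctest above, assuming the space positions are given in ascending order and refer to indices in the progressively built result string (an illustration, not the cltk source):

```python
from typing import List

def join_syllables_spaces(syllables: List[str], spaces: List[int]) -> str:
    """Concatenate the syllables, then insert a space at each
    designated index of the growing result."""
    letters = list("".join(syllables))
    for position in spaces:
        letters.insert(position, " ")
    return "".join(letters)

print(join_syllables_spaces(["won", "to", "tree", "dun"], [3, 6, 11]))
# → 'won to tree dun'
```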
- cltk.prosody.lat.string_utils.starts_with_qu(word)[source]¶
Determine whether or not a word starts with the letters Q and U.
- Parameters:
word –
- Return type:
bool
- Returns:
>>> starts_with_qu("qui") True >>> starts_with_qu("Quirites") True
- cltk.prosody.lat.string_utils.stress_positions(stress, scansion)[source]¶
Given a stress value and a scansion line, return the index positions of the stresses.
- Parameters:
stress (str) –
scansion (str) –
- Return type:
List[int]
- Returns:
>>> stress_positions("-", " - U U - UU - U U") [0, 3, 6]
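The doctest suggests that positions are counted over the non-space scansion marks only (so the result is a list of syllable indices, not character offsets). A minimal sketch under that assumption:

```python
from typing import List

def stress_positions(stress: str, scansion: str) -> List[int]:
    """Syllable indices (counting non-space marks only) carrying the stress."""
    marks = [ch for ch in scansion if not ch.isspace()]
    return [i for i, ch in enumerate(marks) if ch == stress]

print(stress_positions("-", " - U U - UU - U U"))  # → [0, 3, 6]
```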
- cltk.prosody.lat.string_utils.merge_elisions(elided)[source]¶
Given a list of strings with different space-swapping elisions applied, merge the elisions, taking the most elisions without compounding the omissions.
- Parameters:
elided (List[str]) –
- Return type:
str
- Returns:
>>> merge_elisions([ ... "ignavae agua multum hiatus", "ignav agua multum hiatus" ,"ignavae agua mult hiatus"]) 'ignav agua mult hiatus'
- cltk.prosody.lat.string_utils.move_consonant_right(letters, positions)[source]¶
Given a list of letters, and a list of consonant positions, move the consonant positions to the right, merging strings as necessary.
- Parameters:
letters (List[str]) –
positions (List[int]) –
- Return type:
List[str]
- Returns:
>>> move_consonant_right(list("abbra"), [ 2, 3]) ['a', 'b', '', '', 'bra']
- cltk.prosody.lat.string_utils.move_consonant_left(letters, positions)[source]¶
Given a list of letters, and a list of consonant positions, move the consonant positions to the left, merging strings as necessary.
- Parameters:
letters (List[str]) –
positions (List[int]) –
- Return type:
List[str]
- Returns:
>>> move_consonant_left(['a', 'b', '', '', 'bra'], [1]) ['ab', '', '', '', 'bra']
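Minimal sketches of the two consonant-moving helpers, consistent with the doctests above (illustrations, not the cltk source): each letter is appended onto its neighbor and an empty string is left behind so that list indices stay stable.

```python
from typing import List

def move_consonant_right(letters: List[str], positions: List[int]) -> List[str]:
    """Merge the letter at each position into its right neighbor."""
    for pos in positions:
        letters[pos + 1] = letters[pos] + letters[pos + 1]
        letters[pos] = ""
    return letters

def move_consonant_left(letters: List[str], positions: List[int]) -> List[str]:
    """Merge the letter at each position into its left neighbor."""
    for pos in positions:
        letters[pos - 1] = letters[pos - 1] + letters[pos]
        letters[pos] = ""
    return letters

print(move_consonant_right(list("abbra"), [2, 3]))
# → ['a', 'b', '', '', 'bra']
print(move_consonant_left(['a', 'b', '', '', 'bra'], [1]))
# → ['ab', '', '', '', 'bra']
```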
- cltk.prosody.lat.string_utils.merge_next(letters, positions)[source]¶
Given a list of letters and a list of positions, merge the letter at each position with its next neighbor.
- Parameters:
letters (List[str]) –
positions (List[int]) –
- Return type:
List[str]
- Returns:
>>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2]) ['ab', '', 'ov', '', 'o'] >>> # Note: because it operates on the original list passed in, the effect is not cumulative: >>> merge_next(['a', 'b', 'o', 'v', 'o' ], [0, 2, 3]) ['ab', '', 'ov', 'o', '']
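A minimal sketch of merge_next consistent with both doctests above (an illustration, not the cltk source):

```python
from typing import List

def merge_next(letters: List[str], positions: List[int]) -> List[str]:
    """Merge the letter at each position with its immediate right
    neighbor, leaving an empty string behind. Mutates and returns
    the list that was passed in."""
    for pos in positions:
        letters[pos] = letters[pos] + letters[pos + 1]
        letters[pos + 1] = ""
    return letters

print(merge_next(['a', 'b', 'o', 'v', 'o'], [0, 2]))
# → ['ab', '', 'ov', '', 'o']
```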
- cltk.prosody.lat.string_utils.remove_blanks(letters)[source]¶
Given a list of letters, remove any empty strings.
- Parameters:
letters (List[str]) –
- Returns:
>>> remove_blanks(['a', '', 'b', '', 'c']) ['a', 'b', 'c']
- cltk.prosody.lat.string_utils.split_on(word, section)[source]¶
Given a string, split on a section, and return the two sections as a tuple.
- Parameters:
word (str) –
section (str) –
- Return type:
Tuple[str, str]
- Returns:
>>> split_on('hamrye', 'ham') ('ham', 'rye')
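A minimal sketch of split_on matching the doctest, assuming the split happens after the first occurrence of the section and the section stays in the head of the tuple:

```python
from typing import Tuple

def split_on(word: str, section: str) -> Tuple[str, str]:
    """Split after the first occurrence of section, keeping it in the head."""
    idx = word.find(section) + len(section)
    return word[:idx], word[idx:]

print(split_on('hamrye', 'ham'))  # → ('ham', 'rye')
```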
- cltk.prosody.lat.string_utils.remove_blank_spaces(syllables)[source]¶
Given a list of letters, remove any blank spaces or empty strings.
- Parameters:
syllables (List[str]) –
- Return type:
List[str]
- Returns:
>>> remove_blank_spaces(['', 'a', ' ', 'b', ' ', 'c', '']) ['a', 'b', 'c']
- cltk.prosody.lat.string_utils.overwrite(char_list, regexp, quality, offset=0)[source]¶
Given a list of characters and spaces, a matching regular expression, and a quality or character, replace the matching character with the given quality character, applying an offset and a multiplier if provided.
- Parameters:
char_list (List[str]) –
regexp (str) –
quality (str) –
offset (int) –
- Return type:
List[str]
- Returns:
>>> overwrite(list("multe igne"), r"e\s[aeiou]", " ") ['m', 'u', 'l', 't', ' ', ' ', 'i', 'g', 'n', 'e']
- cltk.prosody.lat.string_utils.overwrite_dipthong(char_list, regexp, quality)[source]¶
Given a list of characters and spaces, a matching regular expression, and a quality or character, replace both characters of the matching diphthong with the given quality character.
- Parameters:
char_list (List[str]) – a list of characters
regexp (str) – a matching regular expression
quality (str) – a quality or character to replace
- Return type:
List[str]
- Returns:
a list of characters with the diphthong overwritten
>>> overwrite_dipthong(list("multae aguae"), r"ae\s[aeou]", " ") ['m', 'u', 'l', 't', ' ', ' ', ' ', 'a', 'g', 'u', 'a', 'e']
- cltk.prosody.lat.string_utils.get_unstresses(stresses, count)[source]¶
Given a list of stressed positions, and count of possible positions, return a list of the unstressed positions.
- Parameters:
stresses (List[int]) – a list of stressed positions
count (int) – the number of possible positions
- Return type:
List[int]
- Returns:
a list of unstressed positions
>>> get_unstresses([0, 3, 6, 9, 12, 15], 17) [1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
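A minimal sketch of get_unstresses matching the doctest above: the unstressed positions are simply the complement of the stressed ones over range(count).

```python
from typing import List

def get_unstresses(stresses: List[int], count: int) -> List[int]:
    """Positions in range(count) that are not in the stressed set."""
    stressed = set(stresses)
    return [pos for pos in range(count) if pos not in stressed]

print(get_unstresses([0, 3, 6, 9, 12, 15], 17))
# → [1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
```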
8.1.13.1.1.12. cltk.prosody.lat.syllabifier module¶
Latin language syllabifier. Parses a Latin word or a space-separated list of words into a list of syllables. Consonantal I is transformed into a J at the start of a word as necessary. Tuned for poetry and verse, this class is tolerant of isolated single-character consonants that may appear due to elision.
- class cltk.prosody.lat.syllabifier.Syllabifier(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>)[source]¶
Bases:
object
Scansion constants can be modified and passed into the constructor if desired.
- syllabify(words)[source]¶
Parse a Latin word into a list of syllable strings.
- Parameters:
words (str) – a string containing one Latin word or many words separated by spaces.
- Return type:
List[str]
- Returns:
list of strings, each representing a syllable.
>>> syllabifier = Syllabifier() >>> print(syllabifier.syllabify("fuit")) ['fu', 'it'] >>> print(syllabifier.syllabify("libri")) ['li', 'bri'] >>> print(syllabifier.syllabify("contra")) ['con', 'tra'] >>> print(syllabifier.syllabify("iaculum")) ['ja', 'cu', 'lum'] >>> print(syllabifier.syllabify("amo")) ['a', 'mo'] >>> print(syllabifier.syllabify("bracchia")) ['brac', 'chi', 'a'] >>> print(syllabifier.syllabify("deinde")) ['dein', 'de'] >>> print(syllabifier.syllabify("certabant")) ['cer', 'ta', 'bant'] >>> print(syllabifier.syllabify("aere")) ['ae', 're'] >>> print(syllabifier.syllabify("adiungere")) ['ad', 'jun', 'ge', 're'] >>> print(syllabifier.syllabify("mōns")) ['mōns'] >>> print(syllabifier.syllabify("domus")) ['do', 'mus'] >>> print(syllabifier.syllabify("lixa")) ['li', 'xa'] >>> print(syllabifier.syllabify("asper")) ['as', 'per'] >>> # handle doubles >>> print(syllabifier.syllabify("siccus")) ['sic', 'cus'] >>> # handle liquid + liquid >>> print(syllabifier.syllabify("almus")) ['al', 'mus'] >>> # handle liquid + mute >>> print(syllabifier.syllabify("ambo")) ['am', 'bo'] >>> print(syllabifier.syllabify("anguis")) ['an', 'guis'] >>> print(syllabifier.syllabify("arbor")) ['ar', 'bor'] >>> print(syllabifier.syllabify("pulcher")) ['pul', 'cher'] >>> print(syllabifier.syllabify("ruptus")) ['ru', 'ptus'] >>> print(syllabifier.syllabify("Bīthÿnus")) ['Bī', 'thÿ', 'nus'] >>> print(syllabifier.syllabify("sanguen")) ['san', 'guen'] >>> print(syllabifier.syllabify("unguentum")) ['un', 'guen', 'tum'] >>> print(syllabifier.syllabify("lingua")) ['lin', 'gua'] >>> print(syllabifier.syllabify("linguā")) ['lin', 'guā'] >>> print(syllabifier.syllabify("languidus")) ['lan', 'gui', 'dus'] >>> print(syllabifier.syllabify("suis")) ['su', 'is'] >>> print(syllabifier.syllabify("habui")) ['ha', 'bu', 'i'] >>> print(syllabifier.syllabify("habuit")) ['ha', 'bu', 'it'] >>> print(syllabifier.syllabify("qui")) ['qui'] >>> print(syllabifier.syllabify("quibus")) ['qui', 'bus'] >>> 
print(syllabifier.syllabify("hui")) ['hui'] >>> print(syllabifier.syllabify("cui")) ['cui'] >>> print(syllabifier.syllabify("huic")) ['huic']
- _setup(word)[source]¶
Prepares a word for syllable processing.
If the word starts with a prefix, process it separately.
- Parameters:
word –
- Return type:
List[str]
- Returns:
- _process(word)[source]¶
Process a word into a list of strings representing the syllables of the word. This method describes rules for consonant grouping behaviors and then iteratively applies those rules to the list of letters that comprise the word, until all the letters are grouped into appropriate syllable groups.
- Parameters:
word (str) –
- Return type:
List[str]
- Returns:
- _contains_consonants(letter_group)[source]¶
Check if a string contains consonants.
- Return type:
bool
- _starting_consonants_only(letters)[source]¶
Return a list of starting consonant positions.
- Return type:
list
- _ending_consonants_only(letters)[source]¶
Return a list of positions for ending consonants.
- Return type:
List[int]
- _find_solo_consonant(letters)[source]¶
Find the positions of any solo consonants that are not yet paired with a vowel.
- Return type:
List[int]
- _find_consonant_cluster(letters)[source]¶
Find clusters of consonants that do not contain a vowel.
- Parameters:
letters (List[str]) –
- Return type:
List[int]
- Returns:
- _move_consonant(letters, positions)[source]¶
Given a list of consonant positions, move the consonants according to certain consonant syllable behavioral rules for gathering and grouping.
- Parameters:
letters (list) –
positions (List[int]) –
- Return type:
List[str]
- Returns:
- get_syllable_count(syllables)[source]¶
Counts the number of syllable groups that would occur after elision.
Often we will want to preserve the position and separation of syllables so that they can be used to reconstitute a line and apply stresses to the original word positions. However, we also want to be able to count the number of syllables accurately.
- Parameters:
syllables (List[str]) –
- Return type:
int
- Returns:
>>> syllabifier = Syllabifier() >>> print(syllabifier.get_syllable_count([ ... 'Jām', 'tūm', 'c', 'au', 'sus', 'es', 'u', 'nus', 'I', 'ta', 'lo', 'rum'])) 11
8.1.13.1.1.13. cltk.prosody.lat.verse module¶
Data structure class for a line of metrical verse.
- class cltk.prosody.lat.verse.Verse(original, scansion='', meter=None, valid=False, syllable_count=0, accented='', scansion_notes=None, syllables=None)[source]¶
Bases:
object
Class representing a line of metrical verse.
This class is round-trippable; the __repr__ call can be used for construction.
>>> positional_hex = Verse(original='impulerit. Tantaene animis caelestibus irae?', ... scansion='- U U - - - U U - - - U U - - ', meter='hexameter', ... valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', ... scansion_notes=['Valid by positional stresses.'], ... syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae']) >>> dupe = eval(positional_hex.__repr__()) >>> dupe Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae']) >>> positional_hex Verse(original='impulerit. Tantaene animis caelestibus irae?', scansion='- U U - - - U U - - - U U - - ', meter='hexameter', valid=True, syllable_count=15, accented='īmpulerīt. Tāntaene animīs caelēstibus īrae?', scansion_notes=['Valid by positional stresses.'], syllables = ['īm', 'pu', 'le', 'rīt', 'Tān', 'taen', 'a', 'ni', 'mīs', 'cae', 'lēs', 'ti', 'bus', 'i', 'rae'])
- working_line¶
placeholder for data transformations
8.1.13.1.1.14. cltk.prosody.lat.verse_scanner module¶
Parent class and utility class for producing a scansion pattern for a line of Latin verse.
Some useful methods:
- Performs a conservative i to j transformation
- Performs elisions
- Accents vowels by position
- Breaks the line into a list of syllables by calling a Syllabifier class, which may be injected into this class's constructor.
- class cltk.prosody.lat.verse_scanner.VerseScanner(constants=<cltk.prosody.lat.scansion_constants.ScansionConstants object>, syllabifier=<cltk.prosody.lat.syllabifier.Syllabifier object>, **kwargs)[source]¶
Bases:
object
The scansion symbols used can be configured by passing a suitable constants class to the constructor.
- transform_i_to_j(line)[source]¶
Transform instances of consonantal i to j.
- Parameters:
line (str) –
- Return type:
str
- Returns:
>>> print(VerseScanner().transform_i_to_j("iactātus")) jactātus >>> print(VerseScanner().transform_i_to_j("bracchia")) bracchia
- transform_i_to_j_optional(line)[source]¶
Sometimes for the demands of meter a more permissive i to j transformation is warranted.
- Parameters:
line (str) –
- Return type:
str
- Returns:
>>> print(VerseScanner().transform_i_to_j_optional("Italiam")) Italjam >>> print(VerseScanner().transform_i_to_j_optional("Lāvīniaque")) Lāvīnjaque >>> print(VerseScanner().transform_i_to_j_optional("omnium")) omnjum
- accent_by_position(verse_line)[source]¶
Accent vowels according to the rules of scansion.
- Parameters:
verse_line (str) – a line of unaccented verse
- Return type:
str
- Returns:
the same line with vowels accented by position
>>> print(VerseScanner().accent_by_position( ... "Arma virumque cano, Troiae qui primus ab oris").lstrip()) Ārma virūmque canō Trojae qui primus ab oris
- elide_all(line)[source]¶
Given a string of space separated syllables, erase with spaces the syllable portions that would disappear according to the rules of elision.
- Parameters:
line (str) –
- Return type:
str
- Returns:
- calc_offset(syllables_spaces)[source]¶
Calculate a dictionary of accent positions from a list of syllables with spaces.
- Parameters:
syllables_spaces (List[str]) –
- Return type:
Dict[int, int]
- Returns:
- produce_scansion(stresses, syllables_wspaces, offset_map)[source]¶
Create a scansion string that has stressed and unstressed syllable positions in locations that correspond with the original text's syllable vowels.
- Parameters:
stresses – list of syllable positions
syllables_wspaces – list of syllables with spaces escaped for punctuation or elision
offset_map – dictionary of syllable positions and an offset amount, which is the number of spaces to skip in the original line before inserting the accent
- Return type:
str
- flag_dipthongs(syllables)[source]¶
Return a list of syllables that contain a diphthong
- Parameters:
syllables (List[str]) –
- Return type:
List[int]
- Returns:
- elide(line, regexp, quantity=1, offset=0)[source]¶
Erase a section of a line, matching on a regex, pushing in a quantity of blank spaces, and jumping forward with an offset if necessary. If the elided vowel was strong, the vowel it merges with takes on the stress.
- Parameters:
line (str) –
regexp (str) –
quantity (int) –
offset (int) –
- Return type:
str
- Returns:
>>> print(VerseScanner().elide("uvae avaritia", r"[e]\s*[a]")) uv āvaritia >>> print(VerseScanner().elide("mare avaritia", r"[e]\s*[a]")) mar avaritia
- correct_invalid_start(scansion)[source]¶
If a hexameter, hendecasyllable, or pentameter scansion starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | -
- Parameters:
scansion (str) –
- Return type:
str
- Returns:
>>> print(VerseScanner().correct_invalid_start( ... " - - U U - - U U U U U U - -").strip()) - - - - - - U U U U U U - -
- correct_first_two_dactyls(scansion)[source]¶
If a hexameter or pentameter starts with a spondee, an unstressed syllable in the third position must actually be stressed, so we will convert it: - - | U -> - - | - And/or if the starting pattern is spondee + trochee + stressed, then the unstressed trochee can be corrected: - - | - U | - -> - - | - - | -
- Parameters:
scansion (str) –
- Return type:
str
- Returns:
>>> print(VerseScanner().correct_first_two_dactyls( ... " - - U U - - U U U U U U - -")) - - - - - - U U U U U U - -