Text::Shellwords::Cursor(3pm) | User Contributed Perl Documentation | Text::Shellwords::Cursor(3pm) |
NAME
Text::Shellwords::Cursor - Parse a string into tokens
SYNOPSIS
    use Text::Shellwords::Cursor;
    my $parser = Text::Shellwords::Cursor->new();
    my $str = 'ab cdef "ghi" j"k\"l "';
    my ($tok1) = $parser->parse_line($str);
    # $tok1 = ['ab', 'cdef', 'ghi', 'j', 'k"l ']
    my ($tok2, $tokno, $tokoff) = $parser->parse_line($str, cursorpos => 6);
    # as above, but $tokno = 1, $tokoff = 3 (under the 'f')
DESCRIPTION
This module is very similar to Text::Shellwords and Text::ParseWords. However, it has one very significant difference: it keeps track of a character position in the line it's parsing. For instance, if you pass it ("zq fmgb", cursorpos=>6), it would return (['zq', 'fmgb'], 1, 3). The cursorpos parameter tells where in the input string the cursor resides (just before the 'b'), and the result tells you that the cursor was on token 1 ('fmgb'), character 3 ('b'). This is very useful when computing command-line completions involving quoting, escaping, and tokenizing characters (like '(' or '=').
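The example in the paragraph above can be written out as a short script. This is a minimal sketch that assumes the module is installed; the variable names ($current, $prefix) are illustrative only and not part of the module.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my $parser = Text::Shellwords::Cursor->new();

# Cursor index 6 sits just before the 'b' in "zq fmgb".
my ($tokens, $tokno, $tokoff) = $parser->parse_line("zq fmgb", cursorpos => 6);

# The token under the cursor, and the part of it to the left of the
# cursor, which is what a completion routine would typically work with.
my $current = $tokens->[$tokno];
my $prefix  = substr($current, 0, $tokoff);

print "tokens: @$tokens\n";
print "cursor is in token $tokno ('$current') at offset $tokoff\n";
print "text left of the cursor: '$prefix'\n";
```

Going by the description above, this should report token 1 and offset 3, so the text left of the cursor would be 'fmg'.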
A few helper utilities are included as well. You can escape a string to ensure that parsing it will produce the original string (parse_escape). You can also reassemble the tokens with a visually pleasing amount of whitespace between them (join_line).
This module started out as an integral part of Term::GDBUI using code loosely based on Text::ParseWords. However, it is now basically a ground-up reimplementation. It was split out of Term::GDBUI for version 0.8.
NOTE: you cannot change token_chars after the constructor has been called! The regexps that use it are compiled once (m//o). Also, until the GNU Readline library can accept "=[]," without diving into an endless loop, we will not tell history expansion to use token_chars (it uses " \t\n()<>;&|" by default).
The parse_line routine originally bore some resemblance to Text::ParseWords. It has since been rewritten almost completely to support keeping track of the cursor position. It also has nicer failure modes, modular quoting, token characters (see token_chars in "new"), etc.
Arguments:
This routine also accepts the following named parameters:
Note that passing undef for cursorpos is not the same as passing some arbitrary number and ignoring the result! For instance, if you pass 0 and the line begins with whitespace, you'll get a 0-length token at the beginning of the line to represent the cursor in the middle of the whitespace. This allows command completion to work even when the cursor is not near any tokens. If you pass undef, all whitespace at the beginning and end of the line will be trimmed as you would expect.
If it is ambiguous whether the cursor should belong to the previous token or to the following one (for example, if it's between two quoted strings, say "a""b", or at a token_char), it always gravitates to the previous token. This makes more sense when completing.
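The difference between omitting the cursor position and passing 0 can be seen by parsing the same line both ways. This is a rough sketch that assumes the module is installed; the input line is made up for illustration.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my $parser = Text::Shellwords::Cursor->new();
my $line   = "  ls -l";    # note the two leading spaces

# No cursorpos given (undef): only the tokens are of interest.
my ($plain) = $parser->parse_line($line);

# cursorpos => 0: the cursor sits inside the leading whitespace.
my ($tracked, $tokno, $tokoff) = $parser->parse_line($line, cursorpos => 0);

printf "without cursor: %d tokens\n", scalar @$plain;
printf "with cursor:    %d tokens, cursor in token %d at offset %d\n",
    scalar @$tracked, $tokno, $tokoff;
printf "length of the token under the cursor: %d\n",
    length $tracked->[$tokno];
```

Per the note above, the second call should contain an extra zero-length token that stands in for the cursor, while the first call should return only the trimmed tokens.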
This function returns a reference to an array containing three items:
If the cursor is at the end of the token, tokoff will point to 1 character past the last character in tokno, a nonexistent character. If the cursor is between tokens (surrounded by whitespace), a zero-length token will be created for it.
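Both edge cases can be checked by comparing tokoff against the length of the token it refers to. This is a small sketch that assumes the module is installed; the input strings and variable names are chosen only for illustration.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my $parser = Text::Shellwords::Cursor->new();

# Case 1: cursor at the very end of the line, right after 'fmgb'.
my ($toks1, $tokno1, $tokoff1) = $parser->parse_line("zq fmgb", cursorpos => 7);
my $at_end = ($tokoff1 == length $toks1->[$tokno1]) ? 1 : 0;
print "cursor is one past the end of '$toks1->[$tokno1]'\n" if $at_end;

# Case 2: cursor between two spaces, so it touches no token at all.
my ($toks2, $tokno2, $tokoff2) = $parser->parse_line("zq  fmgb", cursorpos => 3);
my $on_empty = (length($toks2->[$tokno2]) == 0) ? 1 : 0;
print "cursor is on a zero-length token (index $tokno2)\n" if $on_empty;
```

A completion routine can treat the second case as "start a new word here", since there is no existing text to the left of the cursor in that token.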
If token_chars is nonempty, join_line tries to insert a visually pleasing amount of space between the tokens. For instance, rather than 'a ( b , c )', it tries to produce 'a (b, c)'. It won't reformat any tokens that aren't found in $self->{token_chars}, of course.
To change the formatting, you can redefine the variables $self->{space_none}, $self->{space_before}, and $self->{space_after}. Each variable is a string containing all characters that should not be surrounded by whitespace, should have whitespace before, and should have whitespace after, respectively. Any character found in token_chars, but not in any of these space_ variables, will have space placed both before and after.
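The spacing variables can be tried out as follows. This is a hedged sketch: it assumes the module is installed, that token_chars can be passed to the constructor as a named parameter, and that join_line takes a reference to an array of tokens. The replacement values for the space_ variables are arbitrary choices for illustration, and the exact output is not shown because it depends on the module's defaults.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

# token_chars must be set in the constructor; it cannot be changed later.
my $parser = Text::Shellwords::Cursor->new(token_chars => '(),');

my @tokens = ('a', '(', 'b', ',', 'c', ')');

# Join with whatever spacing rules the module uses by default.
my $default = $parser->join_line([@tokens]);
print "default spacing: $default\n";

# Illustrative override: no space around '(' or ',', space after ')'.
$parser->{space_none}   = '(,';
$parser->{space_before} = '';
$parser->{space_after}  = ')';

my $compact = $parser->join_line([@tokens]);
print "custom spacing:  $compact\n";
```

Each call is given a fresh copy of the token list, so the two results can be compared without worrying about whether the input array is modified.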
BUGS
None known.
LICENSE
Copyright (c) 2003-2011 Scott Bronson, all rights reserved. This program is covered by the MIT license.
AUTHOR
Scott Bronson <bronson@rinspin.com>
2022-11-20 | perl v5.36.0 |