Sympa::Tools::Text - Text-related functions
This package provides some text-related functions.
  - addrencode (
    $addr, [ $phrase, [ $charset, [ $comment ] ] ] )
- Returns formatted (and encoded) name-addr as defined in RFC 5322, section 3.4.
- canonic_email
    ( $email )
- Function. Returns the canonical form of an e-mail address.
    Leading and trailing whitespace is removed. Latin letters
        without accents are lower-cased. Returns "undef" for
        malformed input.
- canonic_message_id
    ( $message_id )
- Returns the canonical form of a message ID, without leading or trailing
      whitespace and without the enclosing "<" and ">".
- canonic_text (
    $text )
- Canonicalizes text. $text should be either a binary
      string encoded in UTF-8 or a Unicode string. Forbidden
      sequences in a binary string are replaced by U+FFFD REPLACEMENT
      CHARACTERs, and Normalization Form C (NFC) is applied.
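
The two steps described above (replacing malformed sequences, then applying NFC) can be sketched with the core Encode and Unicode::Normalize modules. This is a minimal illustration, not Sympa's implementation: the name canonic_text_sketch is hypothetical, and returning a Unicode string rather than bytes is an assumption.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize ();

# Hypothetical stand-in for canonic_text(); not Sympa's actual code.
sub canonic_text_sketch {
    my ($text) = @_;
    return undef unless defined $text;

    # Treat strings without the "utf8" flag as UTF-8 bytes. With the
    # default CHECK value, Encode::decode() substitutes U+FFFD for each
    # malformed sequence instead of dying.
    $text = Encode::decode('UTF-8', $text) unless Encode::is_utf8($text);

    # Compose the result into Normalization Form C.
    return Unicode::Normalize::NFC($text);
}
```

For instance, the bytes "e\xCC\x81" (the letter "e" followed by a combining acute accent) would come back as the single precomposed character U+00E9.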
- clip ( $string, $length
    )
- Function. Clips $string to at most $length bytes, without
      splitting a grapheme cluster. $string is assumed to be a
      UTF-8 byte string.
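
To illustrate what clipping by bytes on grapheme cluster boundaries means, here is a minimal sketch that relies on Perl's \X regex escape. It is not Sympa's implementation; the name clip_sketch is hypothetical, and dropping a cluster entirely when it does not fit is an assumption.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for clip(); not Sympa's actual code.
sub clip_sketch {
    my ($bytes, $max_bytes) = @_;

    # Work on characters so that \X can find cluster boundaries.
    my $text = Encode::decode('UTF-8', $bytes);

    my $out = '';
    # \X matches one extended grapheme cluster at a time.
    while ($text =~ /\G(\X)/g) {
        my $cluster = $1;
        my $encoded = Encode::encode_utf8($cluster);
        # Stop before the first cluster that would exceed the byte budget.
        last if length($out) + length($encoded) > $max_bytes;
        $out .= $encoded;
    }
    return $out;
}
```

With the 4-byte input "e\xCC\x81x", a limit of 2 yields an empty string in this sketch, because the first cluster ("e" plus combining accent) already takes 3 bytes.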
- decode_filesystem_safe
    ( $str )
- Function. Decodes a string encoded by
      encode_filesystem_safe().
    Parameter: 
  - $str
- String to be decoded.
 
Returns:
Decoded string, with the "utf8"
    flag stripped if it was set.
 
  - decode_html (
    $str )
- Function. Decodes HTML entities in a string encoded by UTF-8 or a
      Unicode string.
    Parameter: 
  - $str
- String to be decoded.
 
Returns:
Decoded string, with the "utf8"
    flag stripped if it was set.
 
  - encode_filesystem_safe
    ( $str )
- Function. Encodes a string $str so that it is
      safe to use as a name on the filesystem.
    Parameter: 
  - $str
- String to be encoded.
 
Returns:
Encoded string, with the "utf8"
    flag stripped if it was set. All bytes except '-',
    '+', '.',
    '@' and alphanumeric characters are encoded as
    '_' followed by two hexadecimal digits.
Note that '/' is also encoded.
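
The encoding rule above, together with its inverse, can be sketched in a few lines of Perl. This is an illustration rather than Sympa's implementation: the names fs_encode_sketch and fs_decode_sketch are hypothetical, and the use of lower-case hexadecimal digits is an assumption, since the text only says "two hexdigits".

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for encode_filesystem_safe().
sub fs_encode_sketch {
    my ($str) = @_;
    return '' unless defined $str and length $str;

    # Work on bytes: character strings are first encoded as UTF-8.
    $str = Encode::encode_utf8($str) if Encode::is_utf8($str);

    # Every byte outside '-', '+', '.', '@' and ASCII alphanumerics
    # becomes '_' plus two hex digits. '_' itself is encoded too,
    # which keeps the mapping reversible.
    $str =~ s/([^-+.\@0-9A-Za-z])/sprintf '_%02x', ord $1/eg;
    return $str;
}

# Hypothetical stand-in for decode_filesystem_safe().
sub fs_decode_sketch {
    my ($str) = @_;
    return '' unless defined $str and length $str;
    $str = Encode::encode_utf8($str) if Encode::is_utf8($str);

    # Turn each '_' + two hex digits back into the original byte.
    $str =~ s/_([0-9A-Fa-f]{2})/chr hex $1/eg;
    return $str;
}
```

Under these assumptions, 'a/b_c@d.e' would be encoded as 'a_2fb_5fc@d.e', and decoding that result gives the original bytes back.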
 
  - encode_html (
    $str, [ $additional_unsafe ] )
- Function. Encodes characters in a string
      $str to HTML entities. By default
      '<', '>',
      '&' and '"' are
      encoded.
    Parameter: 
  - $str
- String to be encoded.
- $additional_unsafe
- Character or range of characters to be additionally encoded as entity
      references.
    This optional parameter was introduced in Sympa 6.2.37b.3. 
 
Returns:
Encoded string. The "utf8" flag, if set, is not stripped.
 
  - encode_uri ( $str,
    [ omit => $chars ] )
- Function. Encodes potentially unsafe characters in the string using
      "percent" encoding suitable for URIs.
    Parameters: 
  - $str
- String to be encoded.
- omit => $chars
- By default, all characters except those defined as "unreserved"
      in RFC 3986 are encoded, that is, those matching
      "[^-A-Za-z0-9._~]". If this parameter is
      given, the characters in $chars are also left unencoded.
 
Returns:
Encoded string, with the "utf8"
    flag stripped if it was set.
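
The effect of the omit option can be illustrated with a small percent-encoder. This is a sketch under stated assumptions, not Sympa's implementation: the name encode_uri_sketch is hypothetical, upper-case hexadecimal digits are assumed, and $chars is assumed to be a plain list of ASCII characters.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for encode_uri(); not Sympa's actual code.
sub encode_uri_sketch {
    my ($str, %opts) = @_;
    return '' unless defined $str and length $str;

    # Percent-encoding operates on bytes, so encode characters as UTF-8.
    $str = Encode::encode_utf8($str) if Encode::is_utf8($str);

    # Characters listed in "omit" are added to the set left untouched.
    my $omit = defined($opts{omit}) ? quotemeta($opts{omit}) : '';

    # Encode everything that is neither "unreserved" nor omitted.
    $str =~ s/([^-A-Za-z0-9._~$omit])/sprintf '%%%02X', ord $1/eg;
    return $str;
}
```

In this sketch, encoding 'a b/c' gives 'a%20b%2Fc', while passing omit => '/' keeps the slash and gives 'a%20b/c', which is the kind of result wanted when encoding a whole path.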
 
  - escape_chars (
    $str )
- Deprecated. Use "encode_filesystem_safe".
    Escape weird characters. 
- escape_url ( $str
    )
- Deprecated. Use "encode_uri" or
      "mailtourl" instead.
- foldcase ( $str
    )
- Function. Returns a "fold-case" string suitable for
      case-insensitive matching. For example, the code below looks for a needle
      in a haystack regardless of case, even if they are non-ASCII UTF-8 strings.
    
  $haystack = Sympa::Tools::Text::foldcase($HayStack);
  $needle   = Sympa::Tools::Text::foldcase($NeedLe);
  # Parenthesize index() so that ">= 0" applies to its result.
  if (index($haystack, $needle) >= 0) {
      ...
  }
    Parameter: 
  - guessed_to_utf8(
    $text, [ lang, ... ] )
- Function. Guesses the charset of a text, taking language context into
      account, and returns the text re-encoded in UTF-8.
    Parameters: 
  - $text
- Text to be reencoded.
- lang, ...
- Language tag(s) which may be given by "implicated_langs" in
      Sympa::Language.
 
Returns:
Re-encoded text. If no charset could be guessed,
    "iso-8859-1" is used as a last
    resort, because it covers the full 8-bit range.
 
  - mailtourl ( $email,
    [ decode_html => 1 ], [ query => {key => val, ...} ] )
- Function. Constructs a "mailto:"
      URL for the given e-mail address.
    Parameters: 
Returns:
Constructed URL.
 
  - pad ( $str, $width )
- Pads a string with spaces so that the result is no narrower than the given width.
    Parameters: 
  - $str
- A string.
- $width
- If $width is a false value, or the width of
      $str is not less than
      $width, does nothing. If
      $width is less than 0,
      pads right. Otherwise, pads left.
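
The sign convention for $width can be illustrated as follows. This is a minimal sketch, not Sympa's implementation: the name pad_sketch is hypothetical, the width of a string is approximated by its character count (which is only correct for single-column characters), a negative $width is compared by its absolute value, and "pads right" is taken to mean that spaces are appended on the right.

```perl
use strict;
use warnings;

# Hypothetical stand-in for pad(); not Sympa's actual code.
sub pad_sketch {
    my ($str, $width) = @_;

    # A false width (0, '', undef) means "do nothing".
    return $str unless $width and defined $str;

    # Assumption: one display column per character.
    my $cols = length $str;

    # Already wide enough: return the string unchanged.
    return $str unless $cols < abs($width);

    return $width < 0
        ? $str . (' ' x (-$width - $cols))    # negative: spaces on the right
        : (' ' x ($width - $cols)) . $str;    # positive: spaces on the left
}
```

Under these assumptions, pad_sketch('ab', 5) gives '   ab' and pad_sketch('ab', -5) gives 'ab   ', much like the "%5s" and "%-5s" formats of sprintf.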
 
  - qdecode_filename
    ( $filename )
- Q-Decodes web file name.
    ToDo: This function should be made obsolete in a future release.
        Prefer "decode_filesystem_safe". 
- qencode_filename
    ( $filename )
- Q-Encodes web file name.
    ToDo: This function should be made obsolete in a future release.
        Prefer "encode_filesystem_safe". 
- slurp ( $file )
- Gets the entire content of a file. Normalization by canonic_text() is
      applied. $file is the path to a text file.
- unescape_chars
    ( $str )
- Deprecated. Use "decode_filesystem_safe".
    Unescape weird characters. 
- valid_email (
    $string )
- Basic check of an email address.
- weburl ( $base, \@paths, [
    decode_html => 1 ], [ fragment => $fragment ], [ query => \%query ]
    )
- Constructs an "http:" or
      "https:" URL under the given base URI.
    Parameters: 
  - wrap_text ( $text, [
    $init_tab, [ $subsequent_tab, [ $cols ] ] ] )
- Function. Returns line-wrapped text.
    Parameters: 
  - $text
- The text to be folded.
- $init_tab
- Indentation prepended to the first line of paragraph. Default is
      '', no indentation.
- $subsequent_tab
- Indentation prepended to each subsequent line of folded paragraph. Default
      is '', no indentation.
- $cols
- Max number of columns of folded text. Default is
      78.
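
To show how the three optional parameters interact, the sketch below maps them onto the core Text::Wrap module. This is a swapped-in technique for illustration only and is not how Sympa implements wrap_text(); the name wrap_text_sketch is hypothetical, and line breaking details may differ from Sympa's output.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Hypothetical stand-in for wrap_text(), built on Text::Wrap.
sub wrap_text_sketch {
    my ($text, $init_tab, $subsequent_tab, $cols) = @_;

    # Defaults as documented above: no indentation, 78 columns.
    $init_tab       = '' unless defined $init_tab;
    $subsequent_tab = '' unless defined $subsequent_tab;
    $cols           = 78 unless defined $cols;

    # Text::Wrap keeps each line strictly shorter than $columns,
    # so allow one extra column to permit lines of exactly $cols.
    local $Text::Wrap::columns = $cols + 1;
    # Keep indentation as spaces instead of converting it to tabs.
    local $Text::Wrap::unexpand = 0;

    return Text::Wrap::wrap($init_tab, $subsequent_tab, $text);
}
```

A call such as wrap_text_sketch($paragraph, '  ', '    ', 30) would indent the first line by two spaces and each continuation line by four, with no line longer than 30 columns.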
 
Sympa::Tools::Text appeared in Sympa 6.2a.41.
decode_filesystem_safe() and
    encode_filesystem_safe() were added in Sympa 6.2.10.
decode_html(), encode_html(), encode_uri()
    and mailtourl() were added in Sympa 6.2.14, and escape_url()
    was deprecated.
guessed_to_utf8() and pad() were added in Sympa
    6.2.17.
canonic_text() and slurp() were added in Sympa
    6.2.53b.
clip() was added in Sympa 6.2.61b.