HTML::Quoted(3pm) User Contributed Perl Documentation HTML::Quoted(3pm)

HTML::Quoted - extract structure of quoted HTML mail message

    use HTML::Quoted;
    my $html = '...';
    my $struct = HTML::Quoted->extract( $html );

Parses and extracts the quotation structure of an HTML message. The purpose and the returned structures are very similar to those of Text::Quoted.

Various MUAs use quite different approaches to quoting in mail.

Some use the blockquote tag, which is quite easy to parse.

Some wrap text in p tags and add '>' at the beginning of each paragraph.

Things get messier when it's an HTML reply in a plain-text mail thread.
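
As a rough illustration of the first two quoting styles, the sketch below feeds one hand-written sample of each to extract and dumps whatever comes back. The sample strings are invented for this example and are not taken from any particular mail client; no expected output is shown because the exact structure depends on the parser.

```perl
use strict;
use warnings;
use HTML::Quoted;
use Data::Dumper;

# Hypothetical samples, written by hand for illustration only.
my %samples = (
    blockquote => 'Hi,<blockquote>Hello,<div>How are you?</div></blockquote>',
    gt_prefix  => '<p>Hi,</p><p>&gt; Hello,</p><p>&gt; How are you?</p>',
);

my %structs;
for my $style ( sort keys %samples ) {
    # extract() takes a string of HTML and returns an array reference.
    $structs{$style} = HTML::Quoted->extract( $samples{$style} );
    print "== $style ==\n", Dumper( $structs{$style} );
}
```

Comparing the two dumps is a quick way to see how a given format is handled before deciding whether a bug report is warranted.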

If you find a format that is not supported, file a bug report via rt.cpan.org with an example that is as short as possible. A test file is even better. A test file with a patch is best. Non-obvious patches without tests suck.

    my $struct = HTML::Quoted->extract( $html );

Takes a string with HTML and returns an array reference. Each element in the array is either an array reference or a hash reference. For example:

    [
        { 'raw' => 'Hi,' },
        { 'raw' => '<div><br><div>On date X wrote:<br>' },
        [
             { 'raw' => '<blockquote>' },
             { 'raw' => 'Hello,' },
             { 'raw' => '<div>How are you?</div>' },
             { 'raw' => '</blockquote>' }
        ],
        ...
    ]

Hashes represent a part of the HTML. The following keys are meaningful at the moment:

  • raw - raw HTML
  • quoter_raw, quoter - the raw and decoded (entities are converted) quoter, if the block is prefixed with quoting characters
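
The nested structure can be walked recursively. The following is a minimal sketch that assumes, as the example above suggests, that each nested array corresponds to one more level of quoting. flatten_hunks is a hypothetical helper and not part of HTML::Quoted; the data is the example structure with the elided trailing elements left out.

```perl
use strict;
use warnings;

# The example structure from above, without the elided trailing elements.
my $struct = [
    { raw => 'Hi,' },
    { raw => '<div><br><div>On date X wrote:<br>' },
    [
        { raw => '<blockquote>' },
        { raw => 'Hello,' },
        { raw => '<div>How are you?</div>' },
        { raw => '</blockquote>' },
    ],
];

# Hypothetical helper: returns a list of [depth, raw] pairs, where depth
# counts how many nested array references enclose the hash.
sub flatten_hunks {
    my ( $hunks, $depth ) = @_;
    $depth //= 0;
    my @out;
    for my $hunk (@$hunks) {
        if ( ref $hunk eq 'ARRAY' ) {
            # A nested array: descend one level deeper.
            push @out, flatten_hunks( $hunk, $depth + 1 );
        }
        elsif ( ref $hunk eq 'HASH' ) {
            push @out, [ $depth, $hunk->{raw} ];
        }
    }
    return @out;
}

# Print each raw chunk, prefixed with one '> ' per nesting level.
for my $pair ( flatten_hunks($struct) ) {
    my ( $depth, $raw ) = @$pair;
    printf "%s%s\n", '> ' x $depth, $raw;
}
```

The first two chunks print without a prefix; the four chunks inside the nested array print with a single '> '.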

    my $html = HTML::Quoted->combine_hunks( $arrayref_of_hunks );

Takes the output of "extract" and turns it back into HTML.
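
One possible use, sketched here under the assumption that nested arrays are the quoted blocks, is to drop the quoted parts before turning the rest back into HTML. The structure below is built by hand in the shape of the extract example rather than produced by a real call.

```perl
use strict;
use warnings;
use HTML::Quoted;

# Hand-built structure in the shape shown in the extract() example.
my $struct = [
    { raw => 'Hi,' },
    { raw => '<div><br><div>On date X wrote:<br>' },
    [
        { raw => '<blockquote>' },
        { raw => 'Hello,' },
        { raw => '<div>How are you?</div>' },
        { raw => '</blockquote>' },
    ],
];

# Keep only the top-level hashes; nested array references (assumed here
# to be quoted blocks) are dropped.
my @unquoted = grep { ref $_ eq 'HASH' } @$struct;

# combine_hunks() takes an array reference of hunks and returns HTML.
my $html = HTML::Quoted->combine_hunks( \@unquoted );
print "$html\n";
```

Note that filtering this way can leave unbalanced markup behind: in this sample the div tags opened before the quote are never closed, so real code would likely need to tidy the result.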

Ruslan.Zakirov <ruz@bestpractical.com>

Under the same terms as perl itself.

2018-04-02 perl v5.26.1