RSSLite(3pm) | User Contributed Perl Documentation | RSSLite(3pm) |
XML::RSSLite - lightweight, "relaxed" RSS (and XML-ish) parser
    use XML::RSSLite;

    . . .

    parseRSS(\%result, \$content);

    print "=== Channel ===\n",
          "Title: $result{'title'}\n",
          "Desc: $result{'description'}\n",
          "Link: $result{'link'}\n\n";

    foreach $item (@{$result{'item'}}) {
        print " --- Item ---\n",
              " Title: $item->{'title'}\n",
              " Desc: $item->{'description'}\n",
              " Link: $item->{'link'}\n\n";
    }
This module attempts to extract the maximum amount of content from available documents, and is less concerned with XML compliance than alternatives. Rather than rely on XML::Parser, it uses heuristics and good old-fashioned Perl regular expressions. It stores the data in a simple hash structure, and "aliases" certain tags so that when done, you can count on having the minimal data necessary for re-constructing a valid RSS file. This means you get the basic title, description, and link for a channel and its items.
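The regex-and-hash approach described above can be illustrated with a minimal sketch. This is not XML::RSSLite's actual implementation: the helper first_tag, the sample feed, and the fallback from <link> to <url> are all assumptions made for illustration. It only shows how plain regular expressions can fill a hash of the same general shape as the one parseRSS populates.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::RSSLite: return the text of the
# first <$tag>...</$tag> pair found in $text, or undef if there is none.
sub first_tag {
    my ($tag, $text) = @_;
    return $text =~ m{<\Q$tag\E(?:\s[^>]*)?>(.*?)</\Q$tag\E\s*>}is ? $1 : undef;
}

# Made-up sample feed; the first item uses <url> instead of <link>.
my $content = <<'XML';
<rss><channel>
<title>Example Feed</title>
<link>http://example.com/</link>
<item><title>First post</title><url>http://example.com/1</url></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>
XML

my %result;

# Collect each <item> into its own hash. As an assumed example of
# "aliasing", fall back to <url> when <link> is absent.
while ($content =~ m{<item(?:\s[^>]*)?>(.*?)</item\s*>}gis) {
    my $body = $1;
    my %item = (title => first_tag('title', $body));
    $item{link} = first_tag('link', $body) // first_tag('url', $body);
    push @{ $result{item} }, \%item;
}

# Remove the items before reading channel-level fields, so that an item's
# <title> cannot be mistaken for the channel's.
(my $channel = $content) =~ s{<item(?:\s[^>]*)?>.*?</item\s*>}{}gis;
$result{$_} = first_tag($_, $channel) for qw(title link);

print "$result{title}: ", scalar @{ $result{item} }, " items\n";
```

Because the matching is pattern-based rather than grammar-based, a sketch like this tolerates feeds that a strict XML parser would reject, at the cost of the limitations a non-conforming parser carries.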
This module extracts more usable links by parsing "scriptingNews" and "weblog" formats in addition to RDF & RSS. It also "sanitizes" the output for best results. The munging includes:
This is not a conforming parser. It does not handle the following:
<foo bar=">">
<foo><bar> <bar></bar> <bar></bar> </bar></foo>
<![CDATA[ ]]>
Processing instructions (PI)
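To see why a construct such as <foo bar=">"> is troublesome for regex-based parsing, consider the following sketch. The pattern is a deliberately naive one, assumed for illustration; it is not the pattern XML::RSSLite uses.

```perl
use strict;
use warnings;

# A naive "opening tag" pattern of the kind a regex-based parser might use
# (an assumption for illustration, not XML::RSSLite's actual pattern).
my $tag = qr{<(\w+)([^>]*)>};

my $input = '<foo bar=">">text</foo>';

if ($input =~ $tag) {
    my ($name, $attrs) = ($1, $2);
    # [^>]* stops at the ">" inside the quoted attribute value, so the
    # attribute string is cut short and the remainder leaks into the content.
    print "name:  $name\n";
    print "attrs: $attrs\n";
    print "rest:  ", substr($input, $+[0]), "\n";
}
```

Here the attribute capture ends at the opening quote, and the text after the match begins with the stray `">` that should have been part of the tag. A conforming parser tracks quoting state and would not split the tag at that point.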
It is non-validating; without a DTD, the following cannot be properly addressed:
perl(1), "XML::RSS", "XML::SAX::PurePerl", "XML::Parser::Lite", "XML::Parser"
Jerrad Pierce <jpierce@cpan.org>.
Scott Thomason <scott@thomasons.org>
Portions Copyright (c) 2002,2003,2009 Jerrad Pierce, (c) 2000 Scott Thomason. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
2021-01-05 | perl v5.32.0 |