YAWriter(3pm) | User Contributed Perl Documentation | YAWriter(3pm) |
XML::Handler::YAWriter - Yet another Perl SAX XML Writer
use XML::Handler::YAWriter;
my $ya = new XML::Handler::YAWriter( %options );
my $perlsax = new XML::Parser::PerlSAX( 'Handler' => $ya );
YAWriter implements Yet Another XML::Handler::Writer. I wrote it because I needed a flexible escaping technique and wanted some kind of pretty printing. If an instance of YAWriter is created without any options, the default behavior is to produce an array of strings containing the XML in:
@{$ya->{Strings}}
Options are given in the usual 'key' => 'value' idiom.
The default value for Escape is
$XML::Handler::YAWriter::escape = {
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    '--' => '&#45;&#45;'
};
YAWriter will use an evaluated sub to make the recoding based on a given Escape hash reasonably fast. Future versions may use XS to improve this performance bottleneck.
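The idea of compiling an Escape hash into an evaluated sub can be sketched as follows. This is only an illustration of the technique, not YAWriter's actual code: make_escaper is a hypothetical helper name, and the real module may build its sub differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of YAWriter): compile an Escape hash
# into one substitution sub using a string eval.
sub make_escaper {
    my ($escape) = @_;

    # Longest keys first, so that '--' is tried before shorter keys.
    my $alternation = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) || $a cmp $b }
        keys %$escape;

    # The generated sub closes over $escape and replaces every key
    # in a single pass, so inserted '&' characters are not re-escaped.
    my $code = 'sub { my ($text) = @_; '
             . '$text =~ s/(' . $alternation . ')/$escape->{$1}/g; '
             . 'return $text; }';

    my $sub = eval $code or die "cannot compile escaper: $@";
    return $sub;
}

my $escaper = make_escaper({
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    '--' => '&#45;&#45;',
});

my $escaped = $escaper->('a < b && "c" -- d');
print "$escaped\n";
```

Because the pattern is compiled once, each call costs a single regex pass over the text, no matter how many keys the hash holds.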
Correct handling of start_document and end_document is required!
The YAWriter object initialises its structures during start_document and cleans up during end_document. If you forget to call start_document, any other method will break at run time, most likely in the encode method, which will try to eval undef as a subroutine. If you forget to call end_document, do not use the same YAWriter instance a second time.
For small documents, AsArray may be the fastest way and AsString the easiest way to receive the output of YAWriter. But AsString and AsArray may run out of memory with infinite SAX streams. The only method XML::Handler::Writer calls on a given Output object is the print method, so it is easy to use a self-written Output object to improve streaming.
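A self-written Output object only needs a print method. The sketch below is an assumption-laden example: ChunkCounter is a hypothetical class name, and it forwards each chunk to an underlying filehandle while counting the bytes written. It is exercised by hand here. Passing it to YAWriter is shown only as a comment.

```perl
use strict;
use warnings;
use IO::Handle;

# Hypothetical Output class (not part of YAWriter): forwards each
# chunk to a filehandle right away and keeps a running byte count.
package ChunkCounter;

sub new {
    my ($class, $fh) = @_;
    return bless { fh => $fh, bytes => 0 }, $class;
}

# The one method an Output object has to provide.
sub print {
    my ($self, @chunks) = @_;
    for my $chunk (@chunks) {
        $self->{bytes} += length $chunk;
        $self->{fh}->print($chunk);
    }
    return 1;
}

package main;

# An in-memory filehandle stands in for a real file or socket.
my $buffer = '';
open my $fh, '>', \$buffer or die "cannot open in-memory handle: $!";

my $out = ChunkCounter->new($fh);

# With the module installed, the object would be handed over like this:
#   my $ya = new XML::Handler::YAWriter( 'Output' => $out );
$out->print('<doc>', 'text', '</doc>');
close $fh;
```

Since every chunk is written as soon as it arrives, memory use stays flat however long the SAX stream runs.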
A single instance of XML::Handler::YAWriter is able to produce more than one file in a single run. Be sure to provide a fresh IO::File as Output before you call start_document, and close this file after calling end_document. Alternatively, provide a filename in AsFile, so that start_document and end_document can open and close their own filehandle.
Automatic recoding between 8bit and 16bit does not work correctly in any Perl!
I have Perl-5.00563 at home and here I can specify "use utf8;" in the right places to make recoding work. But I dislike saying "use 5.00555;" because many systems run 5.00503.
If you use some 8bit character set internally and want to use national characters, either state your character set as Encoding to be ISO-8859-1, or provide an Escape hash similar to the following:
$ya->{'Escape'} = {
    '&'  => '&amp;',
    '<'  => '&lt;',
    '>'  => '&gt;',
    '"'  => '&quot;',
    '--' => '&#45;&#45;',
    'ö'  => '&ouml;',
    'ä'  => '&auml;',
    'ü'  => '&uuml;',
    'Ö'  => '&Ouml;',
    'Ä'  => '&Auml;',
    'Ü'  => '&Uuml;',
    'ß'  => '&szlig;'
};
You may abuse YAWriter to clean whitespace from XML documents. Take a look at test.pl, doing just that with an XML::Edifact message, without querying the DTD. This may work in 99% of the cases where you want to get rid of ignorable whitespace caused by the various forms of pretty printing.
my $ya = new XML::Handler::YAWriter(
    'Output' => new IO::File ( ">-" ),
    'Pretty' => {
        'NoWhiteSpace'     => 1,
        'NoComments'       => 1,
        'AddHiddenNewline' => 1,
        'AddHiddenAttrTab' => 1,
    }
);
XML::Handler::Writer implements any method XML::Parser::PerlSAX wants. This extends the Java SAX1.0 specification. I have in mind using Pretty=>SAX1=>1 to disable this feature, if abusing YAWriter for a SAX proxy.
Michael Koehne, Kraehe@Copyleft.De
"Derksen, Eduard (Enno), CSCIO" <enno@att.com> helped me with the Escape hash and gave quite a lot of useful comments.
perl and XML::Parser::PerlSAX
2022-12-11 | perl v5.36.0 |