Statistics::R::IO::REXPFactory - Functions for parsing R data
files
use feature 'say';
use Statistics::R::IO::REXPFactory qw( unserialize );
# Assume $data was created by reading, say, an RDS file
my ($rexp, $state) = @{unserialize($data)}
or die "couldn't parse";
# If we're reading an RDS file, there should be no data left
# unparsed
die 'Unread data remaining in the RDS file' unless $state->eof;
# the result of the unserialization is a REXP
say $rexp;
# REXPs can be converted to the closest native Perl data type
print $rexp->to_pl;
This module implements the actual reading of serialized R objects
and their conversion to Statistics::R::REXP objects. You are not expected
to use it directly, as it's normally wrapped by "readRDS" and
"readRData" in Statistics::R::IO.
- unserialize
$data
- Constructs a Statistics::R::REXP object from its serialization in
$data. Returns a reference to a two-element array: the object and the
Statistics::R::IO::ParserState positioned at the end of the serialized data.
- intsxp, langsxp, lglsxp,
listsxp, rawsxp, realsxp, refsxp, strsxp, symsxp, vecsxp, envsxp, charsxp,
cplxsxp, closxp, expsxp, s4sxp
- Parsers for the corresponding R SEXP-types.
- object_content
- Parses object info and its data by sequencing
"unpack_object_info" and "object_data".
- unpack_object_info
- Parser for serialized object info structure. Returns a hash with keys
"is_object", "has_attributes", "has_tag",
"object_type", and "levels", each corresponding to the
field in R serialization described in
<http://cran.r-project.org/doc/manuals/r-release/R-ints.html#Serialization-Formats>.
An additional key "flags" contains the full 32-bit value as
stored in the file.
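The field layout can be illustrated with a minimal sketch. It assumes the bit positions given in the R Internals manual (type code in bits 0-7, then single-bit flags at bits 8, 9 and 10, and the levels field from bit 12 up). The helper name "decode_flags" is hypothetical and is not the module's own code, which works on a parser state rather than a plain integer.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's code): split the 32-bit flags word
# into the fields listed above, assuming the bit layout from R-ints.
sub decode_flags {
    my ($flags) = @_;
    return {
        object_type    => $flags & 0xFF,        # bits 0-7: SEXP type code
        is_object      => ($flags >> 8)  & 1,   # bit 8
        has_attributes => ($flags >> 9)  & 1,   # bit 9
        has_tag        => ($flags >> 10) & 1,   # bit 10
        levels         => $flags >> 12,         # remaining high bits
        flags          => $flags,               # untouched original value
    };
}

# In XDR format the flags word is a big-endian 32-bit integer.
my $info = decode_flags(unpack 'N', "\x00\x00\x02\x0d");
printf "type=%d attributes=%d tag=%d\n",
    @{$info}{qw(object_type has_attributes has_tag)};
```

Here the sample word 0x0000020D decodes to type code 13 (an integer vector) with the attributes bit set.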
- object_data
$obj_info
- Parser for a serialized R object. It dispatches on the
"object_type" key of the $obj_info hash to
select the correct parser for the particular type.
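Dispatching on the type code might look roughly like the sketch below. It is an illustration only: the table "%decoder_for" and the function "decode_object_data" are hypothetical names, the decoders operate on plain byte strings instead of a Statistics::R::IO::ParserState, and only two SEXP types (13 for integer vectors, 14 for real vectors) are covered, without attributes or NA handling.

```perl
use strict;
use warnings;

# Hypothetical dispatch table keyed by SEXP type code (13 = INTSXP,
# 14 = REALSXP). Each entry decodes an XDR payload made of a 32-bit
# length followed by that many big-endian elements.
my %decoder_for = (
    13 => sub { my $n = unpack 'N', $_[0]; [ unpack "x4 l>$n", $_[0] ] },
    14 => sub { my $n = unpack 'N', $_[0]; [ unpack "x4 d>$n", $_[0] ] },
);

# Pick the decoder named by the object info and apply it to the payload.
sub decode_object_data {
    my ($obj_info, $payload) = @_;
    my $decoder = $decoder_for{ $obj_info->{object_type} }
        or die "unsupported SEXP type $obj_info->{object_type}";
    return $decoder->($payload);
}

my $ints  = decode_object_data({ object_type => 13 },
                               pack('N l>3', 3, 10, -20, 30));
my $reals = decode_object_data({ object_type => 14 },
                               pack('N d>2', 2, 1.5, -0.25));
print "@$ints\n";
print "@$reals\n";
```

A hash of code references keeps the type-to-parser mapping in one place, so supporting another type in this sketch only means adding another entry.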
- vector_and_attributes
$object_info, $element_parser, $rexp_class
- Convenience parser for vectors, which are serialized first with a SEXP for
the vector elements, followed by attributes stored as a tagged pairlist.
Attributes are parsed only if $object_info
indicates their presence, while vector elements are parsed using
$element_parser. Finally, the parsed attributes
and elements are used as arguments to the constructor of the
$rexp_class, which should be a subclass of
Statistics::R::REXP::Vector.
- header
- Parser for the header of an R serialization: the serialization format (XDR,
binary, etc.), the version number of the serialization (currently 2), and
two 32-bit integers indicating the version of R which wrote the file
followed by the minimal version of R needed to read the format.
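The header layout described above can be sketched with plain "unpack". The sketch assumes an XDR stream that has already been decompressed (RDS files are usually gzip-compressed on disk), a two-byte format marker of "X" followed by a newline, and R's convention of packing a version as major * 65536 + minor * 256 + patch. The names "parse_xdr_header" and "r_version_string" are hypothetical.

```perl
use strict;
use warnings;

# Render R's packed version integer (major * 65536 + minor * 256 + patch)
# as a dotted string.
sub r_version_string {
    my ($v) = @_;
    return join '.', $v >> 16, ($v >> 8) & 0xFF, $v & 0xFF;
}

# Hypothetical helper: decode an XDR-format header from raw bytes.
sub parse_xdr_header {
    my ($bytes) = @_;
    my ($format, $version, $writer, $min_reader) = unpack 'a2 N N N', $bytes;
    die "not an XDR serialization" unless $format eq "X\n";
    return {
        format     => 'xdr',
        version    => $version,
        writer     => r_version_string($writer),
        min_reader => r_version_string($min_reader),
    };
}

# Sample header bytes built by hand: format marker, serialization version 2,
# written by R 3.4.1, readable by R 2.3.0 or later.
my $header = parse_xdr_header(pack 'a2 N N N', "X\n", 2, 0x030401, 0x020300);
print "$header->{writer} (needs >= $header->{min_reader})\n";
```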
- xdr, bin
- Parsers for RDS header indicating files in XDR or native-binary
format.
- maybe_long_length
- Parser for vector length, allowing for the encoding of 64-bit long vectors
introduced in R 3.0.
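As a rough illustration of the long-vector encoding, the sketch below assumes the scheme used by R's serializer: an ordinary length is a single signed 32-bit integer, while a value of -1 signals that the real length follows as two unsigned 32-bit halves, upper half first. The helper "read_length" is hypothetical and reads from a byte string rather than a parser state.

```perl
use strict;
use warnings;

# Hypothetical helper: read a vector length from XDR bytes and return the
# length together with the number of bytes consumed.
sub read_length {
    my ($bytes) = @_;
    my $len = unpack 'l>', $bytes;         # signed big-endian 32-bit
    return ($len, 4) unless $len == -1;    # ordinary short vector
    # Marker -1: the real length follows as two unsigned 32-bit halves,
    # upper half first.
    my ($hi, $lo) = unpack 'x4 N N', $bytes;
    return ($hi * 2**32 + $lo, 12);
}

my ($short) = read_length(pack 'l>', 1000);
my ($long)  = read_length(pack 'l> N N', -1, 1, 5);
print "$short $long\n";
```

Returning the number of bytes consumed lets a caller advance past the length field regardless of which encoding was used.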
- tagged_pairlist_to_rexp_hash
- Converts a pairlist to a REXP hash whose keys are the pairlist's element
tags and whose values are the pairlist elements themselves.
- tagged_pairlist_to_attribute_hash
- Converts object attributes, which are serialized as a pairlist with
attribute name in the element's tag, to a hash that can be used as the
"attributes" argument to
Statistics::R::REXP constructors.
Some attributes are serialized using a compact encoding (for
instance, when a table's row names are just integers 1:nrows), and this
function will decode them to a complete REXP.
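The row-names case can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: the pairlist is represented by plain hashes with "tag" and "value" keys, undef stands in for R's NA, and the compact form is taken to be a two-element integer vector whose first element is NA and whose second holds the row count, possibly negated. The function name "pairlist_to_attributes" is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a parsed pairlist: plain hashes with "tag" and
# "value" keys, where undef plays the role of R's NA.
sub pairlist_to_attributes {
    my (@pairlist) = @_;
    my %attributes;
    for my $item (@pairlist) {
        my ($tag, $value) = @{$item}{qw(tag value)};
        # Compact row names: expand the (NA, n) or (NA, -n) pair to 1..n.
        if ($tag eq 'row.names' && @$value == 2 && !defined $value->[0]) {
            $value = [ 1 .. abs $value->[1] ];
        }
        $attributes{$tag} = $value;
    }
    return \%attributes;
}

my $attributes = pairlist_to_attributes(
    { tag => 'names',     value => [ 'x', 'y' ] },
    { tag => 'row.names', value => [ undef, -3 ] },
);
print "@{ $attributes->{'row.names'} }\n";
```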
There are no known bugs in this module. Please see
Statistics::R::IO for bug reporting.
See Statistics::R::IO for support and contact information.
Davor Cubranic <cubranic@stat.ubc.ca>
This software is Copyright (c) 2017 by University of British
Columbia.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007