PERLXSTYPEMAP(1) | Perl Programmers Reference Guide | PERLXSTYPEMAP(1) |
perlxstypemap - Perl XS C/Perl type mapping
The more you think about interfacing between two languages, the more you'll realize that the majority of programmer effort has to go into converting between the data structures that are native to either of the languages involved. This trumps other matters such as differing calling conventions because the problem space is so much greater. There are simply more ways to shove data into memory than there are ways to implement a function call.
Perl XS' attempt at a solution to this is the concept of typemaps. At an abstract level, a Perl XS typemap is nothing but a recipe for converting from a certain Perl data structure to a certain C data structure and vice versa. Since there can be C types that are sufficiently similar to one another to warrant converting with the same logic, XS typemaps are represented by a unique identifier, henceforth called an XS type in this document. You can then tell the XS compiler that multiple C types are to be mapped with the same XS typemap.
In your XS code, when you define an argument with a C type or when you are using a "CODE:" and an "OUTPUT:" section together with a C return type of your XSUB, it'll be the typemapping mechanism that makes this easy.
In more practical terms, the typemap is a collection of code fragments which are used by the xsubpp compiler to map C function parameters and values to Perl values. The typemap file may consist of three sections labelled "TYPEMAP", "INPUT", and "OUTPUT". An unlabelled initial section is assumed to be a "TYPEMAP" section. The INPUT section tells the compiler how to translate Perl values into variables of certain C types. The OUTPUT section tells the compiler how to translate the values from certain C types into values Perl can understand. The TYPEMAP section tells the compiler which of the INPUT and OUTPUT code fragments should be used to map a given C type to a Perl value. The section labels "TYPEMAP", "INPUT", or "OUTPUT" must begin in the first column on a line by themselves, and must be in uppercase.
Each type of section can appear an arbitrary number of times and does not have to appear at all. For example, a typemap may commonly lack "INPUT" and "OUTPUT" sections if all it needs to do is associate additional C types with core XS types like T_PTROBJ. Lines that start with a hash "#" are considered comments and ignored in the "TYPEMAP" section, but are considered significant in "INPUT" and "OUTPUT". Blank lines are generally ignored.
Traditionally, typemaps needed to be written to a separate file, conventionally called "typemap" in a CPAN distribution. With ExtUtils::ParseXS (the XS compiler) version 3.12 or better which comes with perl 5.16, typemaps can also be embedded directly into XS code using a HERE-doc like syntax:
    TYPEMAP: <<HERE
    ...
    HERE
where "HERE" can be replaced by other identifiers, as with normal Perl HERE-docs. All details below about the typemap textual format remain valid.
The "TYPEMAP" section should contain one pair of C type and XS type per line as follows. An example from the core typemap file:
    TYPEMAP
    # all variants of char* is handled by the T_PV typemap
    char *          T_PV
    const char *    T_PV
    unsigned char * T_PV
    ...
The "INPUT" and "OUTPUT" sections have identical formats, that is, each unindented line starts a new in- or output map respectively. A new in- or output map must start with the name of the XS type to map on a line by itself, followed by the code that implements it indented on the following lines. Example:
    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)
    T_PTR
        $var = INT2PTR($type,SvIV($arg))
We'll get to the meaning of those Perlish-looking variables in a little bit.
Finally, here's an example of the full typemap file for mapping C strings of the "char *" type to Perl scalars/strings:
    TYPEMAP
    char *  T_PV

    INPUT
    T_PV
        $var = ($type)SvPV_nolen($arg)

    OUTPUT
    T_PV
        sv_setpv((SV*)$arg, $var);
Here's a more complicated example: suppose that you wanted "struct netconfig" to be blessed into the class "Net::Config". One way to do this is to use underscores (_) to separate package names, as follows:
typedef struct netconfig * Net_Config;
And then provide a typemap entry "T_PTROBJ_SPECIAL" that maps underscores to double-colons (::), and declare "Net_Config" to be of that type:
    TYPEMAP
    Net_Config      T_PTROBJ_SPECIAL

    INPUT
    T_PTROBJ_SPECIAL
      if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
        IV tmp = SvIV((SV*)SvRV($arg));
        $var = INT2PTR($type, tmp);
      }
      else
        croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

    OUTPUT
    T_PTROBJ_SPECIAL
      sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                   (void*)$var);
The INPUT and OUTPUT sections replace underscores with double-colons on the fly, giving the desired effect. This example demonstrates some of the power and versatility of the typemap facility.
The "INT2PTR" macro (defined in perl.h) casts an integer to a pointer of a given type, taking care of the possibly different sizes of integers and pointers. There are also "PTR2IV", "PTR2UV", and "PTR2NV" macros to map the other way, which may be useful in OUTPUT sections.
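The round trip that these macros make possible can be sketched in plain C. This is only an illustration of the idea: the "SKETCH_" macros and the "widget_t" type are hypothetical stand-ins, and perl's own definitions in perl.h may well differ.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; perl's own INT2PTR and
 * PTR2IV are defined in perl.h and may be implemented differently. */
#define SKETCH_PTR2IV(p)         ((intptr_t)(p))
#define SKETCH_INT2PTR(type, iv) ((type)(intptr_t)(iv))

typedef struct { int id; } widget_t;   /* hypothetical example type */

/* Store a pointer in an integer slot, as an OUTPUT map might do. */
static intptr_t widget_to_iv(widget_t *w)
{
    return SKETCH_PTR2IV(w);
}

/* Recover the typed pointer from the integer, as an INPUT map might do. */
static widget_t *iv_to_widget(intptr_t iv)
{
    return SKETCH_INT2PTR(widget_t *, iv);
}
```

Going through an integer type that is wide enough to hold a pointer is what keeps the conversion lossless in this sketch.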
The default typemap in the lib/ExtUtils directory of the Perl source contains many useful types which can be used by Perl extensions. Some extensions define additional typemaps which they keep in their own directory. These additional typemaps may reference INPUT and OUTPUT maps in the main typemap. The xsubpp compiler will allow the extension's own typemap to override any mappings which are in the default typemap. Instead of using an additional typemap file, typemaps may be embedded verbatim in XS with a heredoc-like syntax. See the documentation on the "TYPEMAP:" XS keyword.
For CPAN distributions, you can assume that the XS types defined by the perl core are already available. Additionally, the core typemap has default XS types for a large number of C types. For example, if you simply return a "char *" from your XSUB, the core typemap will have this C type associated with the T_PV XS type. That means your C string will be copied into the PV (pointer value) slot of a new scalar that will be returned from your XSUB to Perl.
If you're developing a CPAN distribution using XS, you may add your own file called typemap to the distribution. That file may contain typemaps that either map types that are specific to your code or that override the core typemap file's mappings for common C types.
Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16 and better), it is rather easy to share typemap code between multiple CPAN distributions. The general idea is to share it as a module that offers a certain API and have the dependent modules declare that as a build-time requirement and import the typemap into the XS. An example of such a typemap-sharing module on CPAN is "ExtUtils::Typemaps::Basic". Two steps make that module's typemaps available in your code: declare "ExtUtils::Typemaps::Basic" as a build-time requirement of your distribution, then include the following line in your XS code:
INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd -e "print embeddable_typemap(q{Basic})"
Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that will be evaluated in the presence of certain variables to get the final C code for mapping a certain C type.
This means that you can embed Perl code in your typemap (C) code using constructs such as "${ perl code that evaluates to scalar reference here }". A common use case is to generate error messages that refer to the true function name even when using the ALIAS XS feature:
${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }
For many typemap examples, refer to the core typemap file that can be found in the perl source tree at lib/ExtUtils/typemap.
The Perl variables that are available for interpolation into typemaps are the following:
Each C type is represented by an entry in the typemap file that is responsible for converting perl variables (SV, AV, HV, CV, etc.) to and from that type. The following sections list all XS types that come with perl by default.
Note that this typemap does not decrement the reference count when returning the reference to an SV*. See also: T_SVREF_REFCOUNT_FIXED
Note that this typemap does not decrement the reference count when returning an AV*. See also: T_AVREF_REFCOUNT_FIXED
Note that this typemap does not decrement the reference count when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED
Note that this typemap does not decrement the reference count when returning a CV*. See also: T_CVREF_REFCOUNT_FIXED
This is a fixed variant of T_CVREF that decrements the refcount appropriately when returning a CV*. Introduced in perl 5.15.4.
System calls return -1 on error (setting "errno" with the reason) and (usually) 0 on success. If the return value is -1 this typemap returns "undef". If the return value is not -1, this typemap translates a 0 (perl false) to "0 but true" (which is perl true) or returns the value itself, to indicate that the command succeeded.
The POSIX module makes extensive use of this type.
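The decision logic described for system call return values can be sketched in plain C, without the Perl API. The helper name "sysret_as_perl_value" is hypothetical, and a NULL return stands in for "undef"; the real typemap sets a Perl scalar instead of returning a string.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: returns NULL where the typemap would return undef,
 * otherwise a string holding the value Perl would see. */
static const char *sysret_as_perl_value(int ret, char *buf, size_t buflen)
{
    if (ret == -1)
        return NULL;               /* error: undef */
    if (ret == 0)
        return "0 but true";       /* success, and true in Perl */
    snprintf(buf, buflen, "%d", ret);
    return buf;                    /* any other value is passed through */
}
```

The "0 but true" string is what lets Perl code test the result for truth while still treating it as the number zero.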
Its behaviour is identical to using an "int" type in XS with T_IV.
T_U_SHORT is used for type "U16" in the standard typemap.
T_U_LONG is used for type "U32" in the standard typemap.
The typemap checks that a scalar reference is passed from perl to XS.
The pointer is blessed into a class whose name is derived from the name of the pointer's type, with every '*' in the name replaced with 'Ptr'.
For "DESTROY" XSUBs only, a T_PTROBJ is optimized to a T_PTRREF. This means the class check is skipped.
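The class naming rule can be illustrated with a small C helper. This is only a sketch: xsubpp performs the substitution itself when it processes the XS file, the function name "ctype_to_ntype" is hypothetical, and dropping the whitespace in front of each '*' is an assumption made here so that "foo_t *" yields "foo_tPtr".

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the class-style name for a C type into out.
 * Returns 0 on success, -1 if out is too small. Dropping the whitespace in
 * front of each '*' is an assumption made for this sketch. */
static int ctype_to_ntype(const char *ctype, char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0)
        return -1;
    for (const char *p = ctype; *p; p++) {
        if (*p == '*') {
            /* discard whitespace already copied before the '*' */
            while (n > 0 && isspace((unsigned char)out[n - 1]))
                n--;
            if (n + 3 >= outlen)
                return -1;
            memcpy(out + n, "Ptr", 3);
            n += 3;
        } else {
            if (n + 1 >= outlen)
                return -1;
            out[n++] = *p;
        }
    }
    out[n] = '\0';
    return 0;
}
```

Under these assumptions "foo_t **" becomes "foo_tPtrPtr", since each '*' contributes one 'Ptr'.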
The pointer is blessed into a class whose name is derived from the name of the pointer's type, with every '*' in the name replaced with 'Ptr'.
For "DESTROY" XSUBs only, a T_REF_IV_PTR is optimized to a T_PTRREF. This means the class check is skipped.
Only the INPUT part of this is implemented (Perl to XSUB) and there are no known users in core or on CPAN.
For "DESTROY" XSUBs only, a T_REFOBJ is optimized to a T_REFREF. This means the class check is skipped.
In principle the "unpack" function can be used to convert the bytes back to a number (if the underlying type is known to be a number).
This entry can be used to store a C structure (the number of bytes to be copied is calculated using the C "sizeof" function) and can be used as an alternative to T_PTRREF without having to worry about a memory leak (since Perl will clean up the SV).
The data may be retrieved using the "unpack" function if the underlying type of the byte stream is known.
T_OPAQUE supports input and output of simple types. T_OPAQUEPTR can be used to pass these bytes back into C if a pointer is acceptable.
array(type, nelem)
xsubpp will copy the contents of "nelem * sizeof(type)" bytes from RETVAL to an SV and push it onto the stack. This is only really useful if the number of items to be returned is known at compile time and you don't mind having a string of bytes in your SV. Use T_ARRAY to push a variable number of arguments onto the return stack (they won't be packed as a single string though).
This is similar to using T_OPAQUEPTR but can be used to process more than one element.
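What "a string of bytes" means for a packed array can be sketched in plain C. The helpers "pack_ints" and "unpack_int_at" are hypothetical and do not use the Perl API; they only show that the buffer holds "nelem * sizeof(type)" raw bytes and that the reader must know the element type to get values back.

```c
#include <stddef.h>
#include <string.h>

/* Copy nelem ints into a byte buffer, as a packed string would hold them.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t pack_ints(const int *src, size_t nelem,
                        unsigned char *buf, size_t buflen)
{
    size_t nbytes = nelem * sizeof(int);

    if (nbytes > buflen)
        return 0;
    memcpy(buf, src, nbytes);
    return nbytes;
}

/* Reverse step: read element i back out of the packed bytes. */
static int unpack_int_at(const unsigned char *buf, size_t i)
{
    int value;

    memcpy(&value, buf + i * sizeof(int), sizeof(int));
    return value;
}
```

The bytes are in the platform's native layout, so the result is only meaningful to code that assumes the same element size and byte order.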
Conversely for "INPUT" (Perl to XSUB) mapping, the function named "XS_unpack_$ntype" is called with the input Perl scalar as argument and the return value is cast to the mapped C type and assigned to the output C variable.
An example conversion function for a typemapped struct "foo_t *" might be:
    static void
    XS_pack_foo_tPtr(SV *out, foo_t *in)
    {
      dTHX; /* alas, signature does not include pTHX_ */
      HV* hash = newHV();
      hv_stores(hash, "int_member", newSViv(in->int_member));
      hv_stores(hash, "float_member", newSVnv(in->float_member));
      /* ... */

      /* mortalize as thy stack is not refcounted */
      sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
    }
The conversion from Perl to C is left as an exercise to the reader, but the prototype would be:
static foo_t * XS_unpack_foo_tPtr(SV *in);
Instead of an actual C function that has to fetch the thread context using "dTHX", you can define macros of the same name and avoid the overhead. Also, remember that the memory allocated by "XS_unpack_foo_tPtr" may need to be freed once it is no longer in use.
static void XS_pack_foo_tPtrPtr(SV *out, foo_t *in, UV count_foo_tPtrPtr);
The type of the third parameter is arbitrary as far as the typemap is concerned. It just has to be in line with the declared variable.
Of course, unless you know the number of elements in the "sometype **" C array, within your XSUB, the return value from "foo_t ** XS_unpack_foo_tPtrPtr(...)" will be hard to decipher. Since the details are all up to the XS author (the typemap user), there are several solutions, none of which is particularly elegant. The most commonly seen solution has been to allocate memory for N+1 pointers and assign "NULL" to the (N+1)th to facilitate iteration.
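A minimal sketch of the N+1 layout in plain C, assuming a hypothetical "foo_t" with a single member and hypothetical helper names; real XS code would more likely use perl's own memory allocation macros than "calloc".

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int int_member; } foo_t;   /* hypothetical struct */

/* Allocate room for n pointers plus a terminating NULL slot.
 * Returns NULL on allocation failure; the caller frees the result. */
static foo_t **alloc_foo_list(size_t n)
{
    foo_t **list = calloc(n + 1, sizeof *list);

    /* set the terminator explicitly rather than relying on zero-fill */
    if (list != NULL)
        list[n] = NULL;
    return list;
}

/* Count elements by walking until the NULL terminator. */
static size_t foo_list_length(foo_t *const *list)
{
    size_t n = 0;

    while (list[n] != NULL)
        n++;
    return n;
}
```

The caller fills slots 0 to N-1 with valid pointers; the consumer then needs no separate count, at the cost of one extra pointer per array.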
Alternatively, using a customized typemap for your purposes in the first place is probably preferable.
The usual calling signature is
@out = array_func( @in );
Any number of arguments can occur in the list before the array but the input and output arrays must be the last elements in the list.
When used to pass a perl list to C the XS writer must provide a function (named after the array type but with 'Ptr' substituted for '*') to allocate the memory required to hold the list. A pointer should be returned. It is up to the XS writer to free the memory on exit from the function. The variable "ix_$var" is set to the number of elements in the new array.
When returning a C array to Perl the XS writer must provide an integer variable called "size_$var" containing the number of elements in the array. This is used to determine how many elements should be pushed onto the return argument stack. This is not required on input since Perl knows how many arguments are on the stack when the routine is called. Ordinarily this variable would be called "size_RETVAL".
Additionally, the type of each element is determined from the type of the array. If the array uses type "intArray *" xsubpp will automatically work out that it contains variables of type "int" and use that typemap entry to perform the copy of each element. All pointer '*' and 'Array' tags are removed from the name to determine the subtype.
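For the "intArray *" case, the allocation function that the XS writer has to supply might look like the following sketch. The use of "malloc" and the helper "sum_int_array" are assumptions for illustration; real XS code may prefer perl's allocation macros such as "Newx".

```c
#include <stdlib.h>

typedef int intArray;   /* alias so that the subtype resolves to int */

/* Allocator named after "intArray *" with '*' replaced by 'Ptr'.
 * Returns uninitialised storage for nelem elements, or NULL if nelem is
 * not positive or the allocation fails. The caller must free it. */
static intArray *intArrayPtr(int nelem)
{
    if (nelem <= 0)
        return NULL;
    return malloc((size_t)nelem * sizeof(intArray));
}

/* Hypothetical consumer: sums the first nelem entries of the array. */
static int sum_int_array(const intArray *array, int nelem)
{
    int total = 0;

    for (int i = 0; i < nelem; i++)
        total += array[i];
    return total;
}
```

Because the typedef is named "intArray", stripping 'Array' and '*' from the type name leaves "int", which is how the element typemap is chosen.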
See perliol for more information on the Perl IO abstraction layer. Perl must have been built with "-Duseperlio".
There is no check to assert that the filehandle passed from Perl to C was created with the right "open()" mode.
Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT XS types nicely.
2023-11-25 | perl v5.32.1 |