Locale::XGettext(3pm) | User Contributed Perl Documentation | Locale::XGettext(3pm) |
Locale::XGettext - Extract Strings To PO Files
use base 'Locale::XGettext';
Locale::XGettext is the base class for various string extractors. These string extractors can be used as standalone programs on the command-line or as a module as a part of other software.
See <https://github.com/gflohr/Locale-XGettext> for an overall picture of the software.
This section describes the usage of extractors based on this library. See "SUBCLASSING" and the sections following it for the API documentation!
xgettext-LANG [OPTIONS] [INPUTFILE]...
LANG will be replaced by an identifier for the language that a specific extractor was written for, for example "xgettext-txt" for plain text files or "xgettext-tt2" for templates for the Template Toolkit version 2 (see Template).
By default, string extractors based on this module extract strings from one or more INPUTFILES and write the output to a file "messages.po" if any strings were found.
The command line options are mostly compatible with xgettext from GNU Gettext <https://www.gnu.org/software/gettext/manual/html_node/xgettext-Invocation.html>.
Note! Unlike xgettext from GNU Gettext, extractors based on Locale::XGettext accept this option multiple times, so that you can read the list of input files from multiple files.
If the output file is - or /dev/stdout, the output is written to standard output.
By default the input files are assumed to be in ASCII.
Note! Some extractors have a fixed input set, UTF-8 most of the time.
For obvious reasons, you cannot use this option if output is written to standard output.
Not all extractors support this option!
Not all extractors support this option!
Not all extractors support this option!
The meaning of --flag=function:arg:lang-format is that in language lang, the specified function expects as argth argument a format string. (For those of you familiar with GCC function attributes, --flag=function:arg:c-format is roughly equivalent to the declaration "__attribute__ ((__format__ (__printf__, arg, ...)))" attached to function in a C source file.) For example, if you use the "error" function from GNU libc, you can specify its behaviour through --flag=error:3:c-format. The effect of this specification is that xgettext will mark as format strings all gettext invocations that occur as argth argument of function. This is useful when such strings contain no format string directives: together with the checks done by "msgfmt -c" it will ensure that translators cannot accidentally use format string directives that would lead to a crash at runtime.
The meaning of --flag=function:arg:pass-lang-format is that in language lang, if the function call occurs in a position that must yield a format string, then its argth argument must yield a format string of the same type as well. (If you know GCC function attributes, the --flag=function:arg:pass-c-format option is roughly equivalent to the declaration "__attribute__ ((__format_arg__ (arg)))" attached to function in a C source file.) For example, if you use the "_" shortcut for the gettext function, you should use --flag=_:1:pass-c-format. The effect of this specification is that xgettext will propagate a format string requirement for a _("string") call to its first argument, the literal "string", and thus mark it as a format string. This is useful when such strings contain no format string directives: together with the checks done by "msgfmt -c" it will ensure that translators cannot accidentally use format string directives that would lead to a crash at runtime.
Note that Locale::XGettext ignores the prefix pass- and therefore most extractors based on Locale::XGettext will also ignore it.
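The structure of a "--flag" value can be illustrated with a few lines of Perl. The following is only a sketch: the helper name parse_flag_spec() is made up for this example and is not part of Locale::XGettext, and the library's own parsing may differ in detail. It splits a specification into function, argument number and format, and drops the "pass-" prefix in the way described above.

```perl
use strict;
use warnings;

# Split a --flag value of the form FUNCTION:ARG:[pass-]LANG-format.
# Hypothetical helper for illustration, not part of Locale::XGettext.
sub parse_flag_spec {
    my ($spec) = @_;

    my ($function, $arg, $flag) = split /:/, $spec, 3;

    # Reject specifications with missing parts or a non-numeric argument.
    return unless defined $flag
        && length $function
        && $arg =~ /^[1-9][0-9]*$/;

    # Strip the "pass-" prefix, remembering that it was present.
    my $pass = $flag =~ s/^pass-// ? 1 : 0;

    return {
        function => $function,
        arg      => $arg,
        flag     => $flag,
        pass     => $pass,
    };
}

foreach my $spec ('error:3:c-format', '_:1:pass-c-format') {
    my $parsed = parse_flag_spec($spec) or next;
    printf "%s: argument %d gets flag %s%s\n",
        $parsed->{function}, $parsed->{arg}, $parsed->{flag},
        $parsed->{pass} ? ' (pass- prefix dropped)' : '';
}
```

Both example values are the ones used in the text above; after dropping the prefix, the second one ends up with the same flag "c-format" as the first.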
Individual extractors may define more language-specific options.
If you want to have a translation for an empty string you should also consider using message contexts.
Writing a complete extractor script in Perl with Locale::XGettext is as simple as:
    #! /usr/bin/env perl

    use Locale::Messages qw(setlocale LC_MESSAGES);
    use Locale::TextDomain qw(YOURTEXTDOMAIN);

    use Locale::XGettext::YOURSUBCLASS;

    Locale::Messages::setlocale(LC_MESSAGES, "");
    Locale::XGettext::YOURSUBCLASS->newFromArgv(\@ARGV)->run->output;
Writing the extractor class is also trivial:
    package Locale::XGettext::YOURSUBCLASS;

    use base 'Locale::XGettext';

    sub readFile {
        my ($self, $filename) = @_;

        foreach my $found (search_for_strings_in $filename) {
            $self->addEntry({
                msgid => $found->{string},
                # More possible fields following, see
                # addEntry() below!
            }, $found->{possible_comment});
        }

        # The return value is actually ignored.
        return $self;
    }
All the heavy lifting happens in the method readFile() that you have to implement yourself. All other methods are optional.
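The function search_for_strings_in() in the class above is a placeholder for the language-specific part. As a rough sketch of what it could look like, the following stand-alone script implements it for an invented line-based format: every non-empty line is a message, and a line starting with "#" directly before a message is kept as a possible comment. The input format and the sample file are assumptions made for this example only; a real extractor would pass each result to addEntry() as shown above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for search_for_strings_in().  Returns a list of
# hash references with the keys "string" and "possible_comment".
sub search_for_strings_in {
    my ($filename) = @_;

    open my $fh, '<', $filename
        or die "Cannot open '$filename': $!\n";

    my (@found, $comment);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^#\s*(.*)/) {
            # Remember the comment for the message that follows.
            $comment = $1;
        } elsif ($line =~ /\S/) {
            push @found, {
                string           => $line,
                possible_comment => $comment,
            };
            undef $comment;
        } else {
            # A blank line detaches a pending comment.
            undef $comment;
        }
    }
    close $fh;

    return @found;
}

# Write a small sample file and list what would be extracted from it.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "# Greeting on the start page\nHello, world!\n\nGoodbye!\n";
close $out;

foreach my $found (search_for_strings_in($path)) {
    printf "%s [%s]\n",
        $found->{string}, $found->{possible_comment} // 'no comment';
}
```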
See the section "METHODS" below for information on how to additionally modify the behavior of your extractor.
This is the constructor that you should normally use in custom extractors that you write.
Locale::XGettext is an abstract base class. All public methods may be overridden by subclassed extractors.
The method is not invoked for filenames ending in ".po" or ".pot"! For those files, readPO() is invoked instead.
This method is the only one that you have to implement!
COMMENT is an optional comment that you may have extracted along with the message. Note that addEntry() checks whether this comment should make it into the output. Therefore, just pass any comment that you have found preceding the keyword.
ENTRY should be a reference to a hash with these possible keys:
xgettext-my.pl --keyword=greet:1,'"Hello, world!"'
If you set keyword to "greet", the comment "Hello, world!" will be added. Note that the "double quotes" are part of the command-line argument!
Likewise, if "--flag" was specified on the command-line or the extractor ships with default flags, entries matching the flag definition will automatically have this flag.
You can try this out with:
xgettext-my.pl --keyword="greet:1" --flag=greet:1:hello-format
Now all PO entries for the keyword "greet" will have the flag "hello-format".
Instead of a hash you can currently also pass a Locale::PO object. This may no longer be supported in the future. Do not use!
Your own implementation can return a reference to an array of arrays, each of them containing one option specification consisting of four items:
Subclasses may return a reference to an array with default keyword definitions for the specific language. The default keywords (actually just a subset of them) for the language C would look like this (expressed in JSON):
[ "gettext:1", "ngettext:1,2", "pgettext:1c,2", "npgettext:1c,2,3" ]
See above the description of the command-line option "--keyword" for more information about the meaning of these strings.
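To make the shape of these keyword strings more tangible, here is a small Perl sketch that takes them apart. The function parse_keyword_spec() is a made-up name and not the parser used by Locale::XGettext. It only understands the forms that appear in this document: plain argument numbers (singular, then plural), a number followed by "c" for the message context, and a double-quoted automatic comment. Treating a bare identifier like "gettext:1" follows the behavior of GNU xgettext.

```perl
use strict;
use warnings;

# Illustrative parser for keyword specifications such as "npgettext:1c,2,3".
# Hypothetical helper, not the parser used by Locale::XGettext.
sub parse_keyword_spec {
    my ($spec) = @_;

    my ($function, $rest) = split /:/, $spec, 2;
    return unless defined $function && length $function;

    # A bare identifier means "first argument", as with GNU xgettext.
    $rest = '1' unless defined $rest && length $rest;

    my %keyword = (function => $function, forms => []);
    while (1) {
        if ($rest =~ s/^"([^"]*)"//) {
            $keyword{comment} = $1;             # automatic comment
        } elsif ($rest =~ s/^([1-9][0-9]*)c//) {
            $keyword{context} = $1;             # message context argument
        } elsif ($rest =~ s/^([1-9][0-9]*)//) {
            push @{ $keyword{forms} }, $1;      # singular, then plural
        } else {
            return;                             # malformed specification
        }
        last unless length $rest;
        $rest =~ s/^,// or return;              # items are comma-separated
    }

    return \%keyword;
}

foreach my $spec ('gettext:1', 'ngettext:1,2', 'pgettext:1c,2',
                  'npgettext:1c,2,3', 'greet:1,"Hello, world!"') {
    my $kw = parse_keyword_spec($spec) or die "Malformed keyword: $spec\n";
    print "$kw->{function}: forms @{ $kw->{forms} }",
        defined $kw->{context} ? ", context $kw->{context}" : '',
        defined $kw->{comment} ? ", comment '$kw->{comment}'" : '',
        "\n";
}
```

Because the automatic comment is matched as one quoted item, the comma inside "Hello, world!" is not mistaken for a separator.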
Subclasses may return a reference to an array with default flag specifications for the specific language. An example may look like this (expressed in JSON):
[ "gettextx:1:perl-brace-format", "ngettextx:1:perl-brace-format", "ngettextx:2:perl-brace-format" ]
We assume that "gettextx()" and "ngettextx()" are keywords for the language in question. The above default flag definition would mean that in all invocations of the function "gettextx()", the 1st argument would get the flag "perl-brace-format". In all invocations of "ngettextx()", the 1st and 2nd argument would get the flag "perl-brace-format".
You can prefix the format with "no-", which tells the GNU gettext tools that the particular string never uses that format.
You can additionally prefix the format with "pass-" but this is ignored by Locale::XGettext. If you want to implement the GNU xgettext behavior for the "pass-" prefix, you have to implement it yourself in your extractor.
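In Perl, the default flag list from the JSON example above is an ordinary array reference. The sketch below shows one way to write it and to check which arguments of which function receive a flag. The package name My::XGettext::Sketch and the helper flags_by_function() are invented for this example, and the method name defaultFlags() is an assumption, so check it against the method documentation. The package deliberately does not inherit from Locale::XGettext so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Hypothetical subclass; a real one would "use base 'Locale::XGettext'".
package My::XGettext::Sketch;

# Assumed method name.  Returns the default flags as an array reference.
sub defaultFlags {
    return [
        'gettextx:1:perl-brace-format',
        'ngettextx:1:perl-brace-format',
        'ngettextx:2:perl-brace-format',
    ];
}

package main;

# Build a lookup table: function name => { argument number => format }.
sub flags_by_function {
    my ($specs) = @_;

    my %table;
    foreach my $spec (@$specs) {
        my ($function, $arg, $format) = split /:/, $spec, 3;
        $table{$function}{$arg} = $format;
    }

    return \%table;
}

my $table = flags_by_function(My::XGettext::Sketch->defaultFlags);
foreach my $function (sort keys %$table) {
    my @args = sort { $a <=> $b } keys %{ $table->{$function} };
    print "$function: flagged arguments @args\n";
}
```

The resulting table has one flagged argument for "gettextx" and two for "ngettextx", which matches the description above.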
Do not use!
Copyright (C) 2016-2017 Guido Flohr <guido.flohr@cantanea.com>, all rights reserved.
Getopt::Long, xgettext(1), perl
2023-02-05 | perl v5.36.0 |