Time::OlsonTZ::Download - Olson timezone database from source
use Time::OlsonTZ::Download;
$version = Time::OlsonTZ::Download->latest_version;
$download = Time::OlsonTZ::Download->new;
$version = $download->version;
$version = $download->code_version;
$version = $download->data_version;
$dir = $download->dir;
$dir = $download->unpacked_dir;
$names = $download->canonical_names;
$names = $download->link_names;
$names = $download->all_names;
$links = $download->raw_links;
$links = $download->threaded_links;
$countries = $download->country_selection;
$files = $download->source_data_files;
$files = $download->zic_input_files;
$zic = $download->zic_exe;
$dir = $download->zoneinfo_dir;
An object of this class represents a local copy of the source of
the Olson timezone database, possibly used to build binary tzfiles. The
source copy always begins by being downloaded from the canonical repository
of the Olson database. This class provides methods to help with extracting
useful information from the source.
- Time::OlsonTZ::Download->latest_version
- Returns the version number of the latest available version of the Olson
timezone database. This requires consulting the repository, but is much
cheaper than actually downloading the database.
- Time::OlsonTZ::Download->new([VERSION])
- Downloads a copy of the source of the Olson database, and returns an
object representing that copy.
VERSION, if supplied, is a version number specifying
which version of the database is to be downloaded. If not supplied, the
latest available version will be downloaded. Version numbers for the
Olson database currently consist of a year number and a lowercase
letter, such as "2010k". The letter advances with each release in a year.
Historical versions make the version numbers a bit more
complicated. Prior to late 1996 the century portion of the year number
was omitted, giving version numbers such as "96g". Prior to 1994 the
first release of each year omitted the letter "a", giving version
numbers such as "93" (with the second release of the year being "93b").
From 1993 to late 2012 the database was split into "code"
and "data" parts that could each be released without releasing a new
version of the other part. Each part had its own version number,
sometimes advancing independently of the other, and sometimes skipping
sequence letters in order to catch up with the other part. Where the two
parts of some version of the database have different version numbers,
the version number of the database as a whole is whichever part's
version number is higher. If this would give two database versions the
same number, due to multiple releases of one part happening while the
other part has a higher version number, a digit "2" or "3" is appended
after the letter to distinguish the second and third such versions.
This module does not currently support downloading database
versions earlier than version 93. One can expect to successfully
download most versions from then on, but a handful are missing from the
public archive. The public archive is complete from version 2006f
onwards. Details of historical version availability may change in
future.
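The version number formats described above can be put into release
order with a small amount of normalisation. The following is a minimal
sketch, not part of this module: the helper name
"olson_version_sort_key" is hypothetical, and it assumes that every
two-digit year belongs to the 1900s and that a missing letter means "a".

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by Time::OlsonTZ::Download): turn a
# version number into a key that sorts correctly as a plain string.
sub olson_version_sort_key {
    my ($version) = @_;
    my ($year, $letter, $digit) =
        $version =~ /\A([0-9]{2}|[0-9]{4})([a-z]?)([23]?)\z/
        or die "unrecognised Olson version number: $version\n";
    $year += 1900 if length($year) == 2;   # assumption: "96g" means 1996
    $letter = "a" if $letter eq "";        # "93" is the first release of 1993
    $digit = 1 if $digit eq "";            # no suffix: first such version
    return sprintf("%04d%s%d", $year, $letter, $digit);
}

my @sorted =
    sort { olson_version_sort_key($a) cmp olson_version_sort_key($b) }
    qw(2010k 96g 93 2006f 93b);
print "@sorted\n";   # prints "93 93b 96g 2006f 2010k"
```

The key is only meant for ordering; the original string should still be
used when passing a version number to "new".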
- Time::OlsonTZ::Download->new_from_local_source(ATTR => VALUE, ...)
- Acquires Olson database source locally, without downloading, and returns
an object representing a copy of it ready to use like a download. This can
be used to work with locally-modified versions of the database. The
following attributes may be given:
- source_dir
- Local directory containing Olson source files. Must be supplied. The
entire directory will be copied into a temporary location to be worked
on.
- version
- Olson version number to attribute to the source files. Must be
supplied.
- code_version
- data_version
- Olson version numbers to attribute to the code and data parts of the
source files, respectively. Both default to the main version number.
- $download->version
- Returns the version number of the database of which a copy is represented
by this object.
The database consists of code and data parts which are updated
semi-independently. The latest version of the database as a whole
consists of the latest version of the code and the latest version of the
data. If both parts are updated at once then they will both get the same
version number, and that will be the version number of the database as a
whole. However, in general they may be updated at different times, and a
single version of the database may be made up of code and data parts
that have different version numbers. The version number of the database
as a whole will then be the version number of the most recently updated
part.
- $download->code_version
- Returns the version number of the code part of the database of which a
copy is represented by this object.
- $download->data_version
- Returns the version number of the data part of the database of which a
copy is represented by this object.
- $download->dir
- Returns the pathname of the directory in which the files of this download
are located. With this method, there is no guarantee of particular files
being available in the directory; see other directory-related methods
below that establish particular directory contents.
The directory does not move during the lifetime of the
download object: this method will always return the same pathname. The
directory and all of its contents, including subdirectories, will be
automatically deleted when this object is destroyed. This will be when
the main program terminates, if it is not otherwise destroyed. Any files
that it is desired to keep must be copied to a permanent location.
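Since the directory is deleted along with the object, anything worth
keeping has to be copied out first. The following is a minimal sketch
using only core Perl modules; the helper name "copy_tree" and the
destination path in the usage comment are hypothetical, and only
directories and plain files are handled.

```perl
use strict;
use warnings;
use File::Copy qw(copy);
use File::Find qw(find);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical helper: recursively copy the directory $from to $to.
sub copy_tree {
    my ($from, $to) = @_;
    make_path($to);
    find({ no_chdir => 1, wanted => sub {
        my $source = $File::Find::name;
        my $rel = File::Spec->abs2rel($source, $from);
        return if $rel eq ".";             # the top directory itself
        my $dest = File::Spec->catfile($to, $rel);
        if (-d $source) {
            make_path($dest);              # parents are visited first
        } elsif (-f $source) {
            copy($source, $dest)
                or die "can't copy $source to $dest: $!\n";
        }
    } }, $from);
    return;
}

# Usage with a download object (destination path is illustrative):
#   copy_tree($download->zoneinfo_dir, "/some/permanent/zoneinfo");
```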
- $download->unpacked_dir
- Returns the pathname of the directory in which the downloaded source files
have been unpacked. This is the local temporary directory used by this
download. This method will unpack the files there if they have not already
been unpacked.
- $download->canonical_names
- Returns the set of timezone names that this version of the database
defines as canonical. These are the timezone names that are directly
associated with a set of observance data. The return value is a reference
to a hash, in which the keys are the canonical timezone names and the
values are all "undef".
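Because the values are all "undef", membership of such a set has to be
tested with "exists" rather than by the truth of the value. The same
convention is used by "link_names" and "all_names". The following is a
minimal sketch: the helper name "classify_name" is hypothetical and the
timezone names are made-up sample data, not taken from the database.

```perl
use strict;
use warnings;

# Hypothetical helper: say which kind of name $name is, given hashes
# shaped like the return values of canonical_names and link_names.
sub classify_name {
    my ($name, $canonical_names, $link_names) = @_;
    return "canonical" if exists $canonical_names->{$name};
    return "link"      if exists $link_names->{$name};
    return "unknown";
}

# Made-up sample sets, in the same shape as the real return values.
my $canonical = { "Example/Zone"  => undef };
my $link      = { "Example/Alias" => undef };

foreach my $name ("Example/Zone", "Example/Alias", "Example/Missing") {
    printf "%s is %s\n", $name, classify_name($name, $canonical, $link);
}
```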
- $download->link_names
- Returns the set of timezone names that this version of the database
defines as links. These are the timezone names that are aliases for other
names. The return value is a reference to a hash, in which the keys are
the link timezone names and the values are all
"undef".
- $download->all_names
- Returns the set of timezone names that this version of the database
defines. These are the "canonical_names" and the
"link_names". The return value is a reference to a hash, in
which the keys are the timezone names and the values are all
"undef".
- $download->raw_links
- Returns details of the timezone name links in this version of the
database. Each link defines one timezone name as an alias for some other
timezone name. The return value is a reference to a hash, in which the
keys are the aliases and each value is the preferred timezone name to
which that alias directly refers. It is possible for an alias to point to
another alias, or to point to a non-existent name. For a more processed
view of links, see "threaded_links".
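The relationship between the raw and the threaded views can be
illustrated by following each alias until a canonical name is reached.
This is a sketch only, not the module's own implementation: the helper
name "thread_links" is hypothetical, the names are made-up sample data,
and silently omitting aliases that lead nowhere is a choice made here
for brevity.

```perl
use strict;
use warnings;

# Hypothetical helper: resolve every alias in $raw_links to a canonical
# name. Aliases that lead to a non-existent name, or that loop, are
# omitted from the result (an assumption of this sketch).
sub thread_links {
    my ($raw_links, $canonical_names) = @_;
    my %threaded;
    ALIAS: foreach my $alias (keys %$raw_links) {
        my $target = $raw_links->{$alias};
        my %seen = ($alias => undef);
        until (exists $canonical_names->{$target}) {
            next ALIAS if exists $seen{$target};             # loop
            next ALIAS unless exists $raw_links->{$target};  # dangling
            $seen{$target} = undef;
            $target = $raw_links->{$target};
        }
        $threaded{$alias} = $target;
    }
    return \%threaded;
}

# Made-up sample data in the shape returned by the real methods.
my $canonical_names = { "Example/Zone" => undef };
my $raw_links = {
    "Example/Alias"    => "Example/Zone",
    "Example/OldAlias" => "Example/Alias",     # alias of an alias
    "Example/Broken"   => "Example/Missing",   # points nowhere
};
my $threaded_links = thread_links($raw_links, $canonical_names);
print "$_ => $threaded_links->{$_}\n" foreach sort keys %$threaded_links;
```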
- $download->threaded_links
- Returns details of the timezone name links in this version of the
database. Each link defines one timezone name as an alias for some other
timezone name. The return value is a reference to a hash, in which the
keys are the aliases and each value is the canonical name of the timezone
to which that alias refers. All such canonical names can be found in the
"canonical_names" hash.
- $download->country_selection
- Returns information about how timezones relate to countries, intended to
aid humans in selecting a geographical timezone. This information is
derived from the "zone.tab" and
"iso3166.tab" files in the database
source.
The return value is a reference to a hash, keyed by (ISO 3166
alpha-2 uppercase) country code. The value for each country is a hash
containing these values:
- alpha2_code
- The ISO 3166 alpha-2 uppercase country code.
- olson_name
- An English name for the country, possibly in a modified form, optimised to
help humans find the right entry in alphabetical lists. This is not
necessarily identical to the country's standard short or long name. (For
other forms of the name, consult a database of countries, keying by the
country code.)
- regions
- Information about the regions of the country that use distinct timezones.
This is a hash, keyed by English description of the region. The
description is empty if there is only one region. The value for each
region is a hash containing these values:
- olson_description
- Brief English description of the region, used to distinguish between the
regions of a single country. Empty string if the country has only one
region for timezone purposes. (This is the same string used as the key in
the regions hash.)
- timezone_name
- Name of the Olson timezone used in this region. This is not necessarily a
canonical name (it may be a link). Typically, where there are aliases or
identical canonical zones, a name is chosen that refers to a location in
the country of interest. It is not guaranteed that the named timezone
exists in the database (though it always should).
- location_coords
- Geographical coordinates of some point within the location referred to in
the timezone name. This is a latitude and longitude, in ISO 6709
format.
This data structure is intended to help a human select the
appropriate timezone based on political geography, specifically working from
a selection of country. It is of essentially no use for any other purpose.
It is not strictly guaranteed that every geographical timezone in the
database is listed somewhere in this structure, so it is of limited use in
providing information about an already-selected timezone. It does not
include non-geographic timezones at all. It also does not claim to be a
comprehensive list of countries, and does not make any claims regarding the
political status of any entity listed: the "country"
classification is loose, and used only for identification purposes.
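As an illustration of walking this structure, the following sketch
lists the timezone choices to offer for one country. The helper name
"timezone_choices" is hypothetical, and the country "XX" with its
regions is made-up sample data in the documented shape.

```perl
use strict;
use warnings;

# Hypothetical helper: return "label: timezone" strings for a country,
# using the country name as the label when there is only one region.
sub timezone_choices {
    my ($countries, $alpha2_code) = @_;
    my $country = $countries->{$alpha2_code} or return;
    my $regions = $country->{regions};
    my @choices;
    foreach my $description (sort keys %$regions) {
        my $label =
            $description eq "" ? $country->{olson_name} : $description;
        push @choices, "$label: $regions->{$description}{timezone_name}";
    }
    return @choices;
}

# Made-up sample data in the shape returned by country_selection.
my $sample_countries = {
    XX => {
        alpha2_code => "XX",
        olson_name  => "Exampleland",
        regions     => {
            "east" => {
                olson_description => "east",
                timezone_name     => "Example/East",
                location_coords   => "+1000+02000",
            },
            "west" => {
                olson_description => "west",
                timezone_name     => "Example/West",
                location_coords   => "+1000+01000",
            },
        },
    },
};
print "$_\n" foreach timezone_choices($sample_countries, "XX");
```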
- $download->source_data_files
- Returns a reference to an array containing the pathnames of all the source
data files. These express the database's data (i.e., a description of
known civil timezones) in a textual format, and are intended for human
editing. They are located in the local temporary directory used by this
download.
There is normally approximately one source data file per
continent, though this arrangement could change in the future. The
textual format is machine parseable, the same format intended for input
to "zic", but when interpreted this way the files do not necessarily
correspond to the official content of the database. There may be
transformations that the database code would normally apply between the
source data files and the actual input to "zic".
If you intend to parse the source, taking the place of "zic", then you
should prefer to use the "zic_input_files" method, which provides the
input that "zic" would actually see.
- $download->zic_input_files
- Returns a reference to an array containing the pathnames of all the data
files that would normally be fed to
"zic". These express the database's data
(i.e., a description of known civil timezones) in the format expected by
"zic", and are suitable for machine
parsing. They are located in the local temporary directory used by this
download. This method will build the files if they do not already exist.
The "zic" input files are
not necessarily source files intended for human editing. In older
versions of the database they are such source files, but from database
version "2017c" onwards there is a
single "zic" input file, which is
generated from the source files and omits the niceties of the source
files. From database version "2018d"
onwards there is some transformation between the source files and the
"zic" input, such that they do not
necessarily express the same data when parsed by
"zic". These arrangements could change
again in the future.
The textual format of "zic"
input is not standardised, and is peculiar to the Olson database.
Parsing it directly is in principle a dubious proposition, but in
practice it is very stable.
If you want the human-editable source form of the data, use
the "source_data_files" method instead.
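For a sense of what machine parsing of "zic" input involves, the
following sketch pulls zone and link names out of lines in that format.
It is an illustration under stated assumptions, not a complete parser:
the helper name "zone_and_link_names" is hypothetical, the input lines
are made-up sample data, keywords are assumed to be "Zone" and "Link" or
a prefix of them, and quoted fields are not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: given lines of zic input, return a set of zone
# names and a hash mapping each link name to its target.
sub zone_and_link_names {
    my (@lines) = @_;
    my (%zones, %links);
    foreach my $line (@lines) {
        $line =~ s/#.*//s;                 # strip trailing comment
        my @fields = split ' ', $line;
        next unless @fields;
        my $keyword = lc $fields[0];
        if (index("zone", $keyword) == 0 && @fields >= 2) {
            $zones{$fields[1]} = undef;    # Zone NAME ...
        } elsif (index("link", $keyword) == 0 && @fields >= 3) {
            $links{$fields[2]} = $fields[1];   # Link TARGET LINK-NAME
        }
    }
    return (\%zones, \%links);
}

# Made-up sample input, not taken from the database.
my ($zone_set, $link_map) = zone_and_link_names(
    "# a comment line",
    "Zone Example/Zone 1:00 - EXT",
    "Link Example/Zone Example/Alias",
    "L Example/Zone Example/Short",
);
print "zone $_\n" foreach sort keys %$zone_set;
print "link $_ -> $link_map->{$_}\n" foreach sort keys %$link_map;
```

With a download object, the lines would come from reading each of the
files named by "zic_input_files".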
- $download->data_files
- Returns a reference to an array containing the pathnames of all the source
data files, provided that the database code would feed the same data to
"zic". This method is deprecated: you
should use either "source_data_files" or
"zic_input_files" depending on which aspect of the data files
you are interested in. In older versions of the database the same files
were both human-editable and used as
"zic" input, so this single method
served both roles. From database version
"2018d" onwards there is some
transformation between the source files and the
"zic" input, so the two roles of the
files need to be distinguished.
- $download->zic_exe
- Returns the pathname of the "zic"
executable that has been built from the downloaded source. This is located
in the local temporary directory used by this download. This method will
build "zic" if it has not already been
built.
- $download->zoneinfo_dir([OPTIONS])
- Returns the pathname of the directory containing binary tzfiles (in
tzfile(5) format) that have been generated from the downloaded
source. This is located in the local temporary directory used by this
download, and the files within it have names that match the timezone names
(as returned by "all_names"). This method will generate the
tzfiles if they have not already been generated.
The optional parameter OPTIONS controls which kind of
tzfiles are desired. If supplied, it must be a reference to a hash, in
which these keys are permitted:
- leaps
- Truth value, controls whether the tzfiles incorporate offsets that account
for the known leap seconds. If
false (which is the default), the tzfiles have no knowledge of leap
seconds, and are intended to be used on a system where
"time_t" is some flavour of UT (as is
conventional on Unix and is the POSIX standard). If true, the tzfiles know
about leap seconds that have occurred between 1972 and the date of the
database, and are intended to be used on a system where
"time_t" is (from 1972 onwards) a linear
count of TAI seconds (which is a non-standard arrangement).
Most of what this class does will only work on Unix platforms.
This is largely because the Olson database source is heavily
Unix-oriented.
This class also depends on the availability of some tools beyond
baseline Unix. Specifically, it requires GNU
"gpgv", GNU
"tar",
"lzip",
"sha512sum", and GNU
"make".
It also won't be much good if you're not connected to the
Internet.
This class is liable to break if the format of the Olson database
source ever changes substantially. If that happens, an update of this class
will be required. It should at least recognise that it can't perform, rather
than do the wrong thing.
DateTime::TimeZone::Tzfile, Time::OlsonTZ::Data,
tzfile(5)
Andrew Main (Zefram) <zefram@fysh.org>
Copyright (C) 2010, 2011, 2012, 2017, 2018 Andrew Main (Zefram)
<zefram@fysh.org>
This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.